Commit Graph

160072 Commits

Author SHA1 Message Date
Michel Dänzer c65e1ae016 lavapipe: Fix float32_atomic_min_max spelling
Fixes build with LLVM >= 15.

Fixes: 31695f81c9 ("lavapipe: export VK_KHR_shader_atomic_float")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18701>
2022-09-20 16:26:37 +00:00
Michel Dänzer 43c8064b1e gallivm: Fix LLVMAtomicRMWBinOpFMax spelling
Fixes build with LLVM >= 15.

Fixes: 203920d4c6 ("gallivm: add atomic 32-bit float support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18701>
2022-09-20 16:26:37 +00:00
Emma Anholt 2a1a8ce472 ci/nouveau: Update gm20b xfails.
Similar set of skips as gk20a, so we can find any remaining flakes given
the firehose of SSBOs and geom/tess flakes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18674>
2022-09-20 16:15:16 +00:00
Emma Anholt c8207158b5 ci/nouveau: fix up Jetson Nano
The updated board has a stabilized GPU and now I just need to decide if
I'm building a farm of them or not.  The new firmware flash needs a
reminder to the kernel of how to do NFS (no v2, thanks).  Also, the full
run is long and we need the TEST_PHASE_TIMEOUT variable to go past 20
minutes now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18674>
2022-09-20 16:15:16 +00:00
Emma Anholt 0fa857de28 ci/nouveau: Rearrange job setup variables.
Now there's "generic stuff for nouveau with bare metal", "the two board
types and how to use them", and "the specific jobs for those boards."

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18674>
2022-09-20 16:15:16 +00:00
Emma Anholt e8f708dbba ci/nouveau: Drop BM_POE_TIMEOUT.
Unused since 5f09b1ebe9 ("ci/bare-metal: Add test phase timeouts to all
boards.").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18674>
2022-09-20 16:15:16 +00:00
Michael Skorokhodov 5e76850cff egl: Return EGL_BAD_MATCH for invalid share_list
From the eglspec.1.5: "An EGL_BAD_MATCH is generated if [...]
share context was created on a different display than
the one referenced by config."

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6414
Signed-off-by: Mykhailo Skorokhodov<mykhailo.skorokhodov@globallogic.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16368>
2022-09-20 15:47:22 +00:00
Friedrich Vock a418ab6654 radv: Correct accel struct header size
The size was changed when adding metadata but not updated here.

Fixes: 07eceb4f ("radv: Add metadata to acceleration structures")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18680>
2022-09-20 14:20:00 +00:00
Pavel Ondračka c0074b22cd r300: reduce CPU overhead in IF transformation pass
Right now there is a call to rc_get_variables, which performs a global
analysis of the whole shader, for every IF encountered. As a result,
shaders with a lot of IFs are compiled very slowly. The patological
cases are shaders using relative adressing, where the lowered array
access can result in tens of IFs.

This patch restructures the pass to call the rc_get_variables just once
at the beginning and later reuse the gathered info. We can do this,
because even though we transform the shader in the meantime (like for
example adding extra MOVs) the transformations are not siginificant
enough to influence the relevant variable info we are using.

This reduces CPU time for my shader-db by more than a half. I also
checked that the generated code for all shaders in shader-db is
identical.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18678>
2022-09-20 13:43:00 +00:00
Samuel Pitoiset 19eec024d2 radv,aco: do not compact MRTs if the pipeline uses a PS epilog
We can't detect color attachment without exports when compiling a PS
epilog, so we can't compact MRTs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18514>
2022-09-20 13:12:49 +00:00
Iago Toral Quiroga 09a0fd6925 v3dv: fix VK_EXT_texel_buffer_alignment
This extension was promoted to Vulkan 1.3 so we should be setting its
properties directly in the VkPhysicalDeviceVulkan13Properties struct
which the common mesa code will use to populate outgoing properties.

Apparently, only the properties struct was promoted and not the features
struct.

Reviewed-by: Eric Engestrom <eric@igalia.com>
Tested-by: Eric Engestrom <eric@igalia.com>
Fixes: ee62a4c751 ('v3dv: implement VK_EXT_texel_buffer_alignment')
Fixes: dEQP-VK.api.info.get_physical_device_properties2.properties.basic
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18697>
2022-09-20 12:52:31 +00:00
Rhys Perry 6df5ff7f19 aco: DCE ra_ctx::defs_done
This was used to distinguish definitions fixed before and during RA, but
it seems it isn't used anymore.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18547>
2022-09-20 12:24:03 +00:00
Samuel Pitoiset 0f88f57223 radv: allow to build the main FS in a graphics pipeline library
Corner cases like implicit gl_PrimitiveID are currently broken and
will be fixed later, but the general case should work.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18516>
2022-09-20 11:53:38 +00:00
Samuel Pitoiset e529745be3 radv: do not link shaders when the next stage is unknown
With GPL, it's possible to build the pre-rasterization stages separately
from the fragment stage. Implicit IO (like gl_PrimitiveID) between the
last pre-rast stage and the FS will be addressed later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18516>
2022-09-20 11:53:38 +00:00
Marcin Ślusarz 037404b441 nir, anv, hasvk, radv: pull uses_wide_subgroup_intrinsics into shader_info
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Marcin Ślusarz de5b137a2d anv: small cleanup of anv_graphics_pipeline_compile
Extract variables for things that are computed multiple times.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Marcin Ślusarz 06e0342a0d anv: add support for anv_assume_full_subgroups to task & mesh stages
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Marcin Ślusarz fa437f87ca nir: add uses_wide_subgroup_intrinsics to task/mesh shader_info
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18504>
2022-09-20 10:19:21 +00:00
Samuel Pitoiset 704ef1fd3b radv,aco: lower barycentric_at_sample in NIR
fossils-db (NAVI21):
Totals from 158 (0.12% of 134913) affected shaders:
CodeSize: 569456 -> 568824 (-0.11%)

Only Control seems affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Samuel Pitoiset 9f0b4da875 radv: run nir_opt_cse before lowering FS intrinsics
Otherwise, there might be redundant barycentric_at_sample intrinsics
that will be lowered and this will increase code size.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Samuel Pitoiset 7e433e25c8 radv: add nir_intrinsic_load_sample_positions_amd in the ABI
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Samuel Pitoiset 7f444fc72c nir: add nir_intrinsic_load_sample_positions_amd
This will be used to lower barycentric_at_sample in NIR for RADV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18615>
2022-09-20 09:52:37 +00:00
Bas Nieuwenhuizen 266fe31666 ac/surface: Fix some warnings.
../mesa/src/amd/common/ac_surface.c:2324:48: warning: implicit conversion from enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') to different enumeration type 'enum gfx9_resource_type' [-Wenum-conversion]
   surf->u.gfx9.resource_type = AddrSurfInfoIn.resourceType;
                              ~ ~~~~~~~~~~~~~~~^~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3046:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;
                      ~ ~~~~~~~~~~~~~^~~~~~~~~~~~~
../mesa/src/amd/common/ac_surface.c:3069:38: warning: implicit conversion from enumeration type 'const enum gfx9_resource_type' to different enumeration type 'AddrResourceType' (aka 'enum _AddrResourceType') [-Wenum-conversion]
   input.resourceType = surf->u.gfx9.resource_type;

The enums are compatible so lets just add some casts.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18694>
2022-09-20 09:25:09 +00:00
Yonggang Luo 196d29a506 clover: Fixes use of designated initializers requires in c++ that doesn't support by MSVC
../src/gallium/frontends/clover/nir/invocation.cpp(400): error C7555: use of designated initializers requires at least '/std:c++20'

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18683>
2022-09-20 07:19:21 +00:00
Iago Toral Quiroga 15cdf5bb48 v3dv: optimize ldunif load into unifa write
If we emit a ldunif to load the ubo/ssbo base address and
then we are immediately moving it to the unifa register we
can have the ldunif write directly to unifa and avoid the mov
in between, which won't be done by copy propagation because that
only works with temp registers.

Also, since we can't read from unifa we must be careful to disallow
reuse of the ldunif result for a future ldunif of the same base address.
We do that by only reusing ldunif results from temp registers.

total instructions in shared programs: 12468943 -> 12455139 (-0.11%)
instructions in affected programs: 1661233 -> 1647429 (-0.83%)
helped: 8307
HURT: 3994

total uniforms in shared programs: 3704532 -> 3704522 (<.01%)
uniforms in affected programs: 339 -> 329 (-2.95%)
helped: 7
HURT: 0

total max-temps in shared programs: 2148158 -> 2148290 (<.01%)
max-temps in affected programs: 9320 -> 9452 (1.42%)
helped: 175
HURT: 295

total spills in shared programs: 2202 -> 2202 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 3059 -> 3057 (-0.07%)
fills in affected programs: 27 -> 25 (-7.41%)
helped: 1
HURT: 0

total sfu-stalls in shared programs: 21167 -> 21056 (-0.52%)
sfu-stalls in affected programs: 497 -> 386 (-22.33%)
helped: 209
HURT: 127

total inst-and-stalls in shared programs: 12490110 -> 12476195 (-0.11%)
inst-and-stalls in affected programs: 1662875 -> 1648960 (-0.84%)
helped: 8312
HURT: 3987

total nops in shared programs: 316563 -> 313553 (-0.95%)
nops in affected programs: 24269 -> 21259 (-12.40%)
helped: 2158
HURT: 1006

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18667>
2022-09-20 06:56:28 +00:00
Iago Toral Quiroga cbc5169ef9 broadcom/compiler: check signal writes to magic regs when updating scoreboard
We have only been checking magic writes from ADD and MUL ports, but signals
can potentially write to magic registers too.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18667>
2022-09-20 06:56:28 +00:00
Iago Toral Quiroga 90857262da broadcom/compiler: detect unifa write from signal
It is possible for some signals to write to unifa directly. We will
enable this from ldunif shortly so we should check for it here.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18667>
2022-09-20 06:56:28 +00:00
Vinson Lee 97d307406b radv: Use count_tes_user_sgprs return value.
Fix defect reported by Coverity Scan.

Useless call (USELESS_CALL)
side_effect_free: Calling count_tes_user_sgprs(key) is only useful for
its return value, which is ignored.

Fixes: 8253ec3855 ("radv: add shader arguments for dynamic patch control points")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18659>
2022-09-20 06:14:47 +00:00
Qiang Yu f4179f203d radeonsi: print out remove_streamout shader key
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17456>
2022-09-20 05:41:50 +00:00
Qiang Yu 4d15a06dee radeonsi: implement nir_intrinsic_load_streamout_buffer_amd
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17456>
2022-09-20 05:41:50 +00:00
Qiang Yu 8049edb653 radeonsi: implement nir_intrinsic_load_num_vertices_per_primitive_amd
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17456>
2022-09-20 05:41:50 +00:00
Marek Olšák 540e695b29 radeonsi: set VS_OUT_MISC_SIDE_BUS_ENA=1 for clip distance exports on gfx10.3
This should improve performance of clip distances.

Reviewed-by: Pierre-eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18639>
2022-09-20 04:26:46 +00:00
Tapani Pälli c184b49cf3 anv: remove vk_sample_locations_state from emit_multisample
State for sample locations is not used within this function.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18669>
2022-09-20 03:59:04 +00:00
Ruijing Dong 7829403809 frontends/va: enable sao in hevc encoding
enable sao feature from config attribute. by default
from vcn2, hevc encoding enables sao.

Signed-off-by: Ruijing Dong <ruijing.dong@amd.com>
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18626>
2022-09-20 02:54:48 +00:00
Rob Clark 0bf25a5313 freedreno/a6xx: Simplify fd6_build_user_consts()
Get rid of the table indirects.  Cuts in half the time spent in this fxn
in drawoverhead test 31 ("many uniforms / 1 changemany uniforms / 1
change")

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 2f3b980caa freedreno/a6xx: Move user const upload to bind
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark f8204018fd freedreno: Drop unused arg
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 075218f756 freedreno/a6xx: Pre-calculate user const state size
We can do this when we construct the program state object, rather than
at draw time.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark a81c6d7439 freedreno/a6xx: Skip IBO state when unused
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark e960431621 freedreno/drm: Simplify emit_reloc_common
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 8609d62e4d freedreno/a6xx: Drop "hardpin" support
The upstream kernel supported everything needed to stop doing
kernel-side relocs before the first things with a6xx were fully
supported in upstream kernel.  Take advantage to drop some extra
overhead in OUT_RELOC() and equiv in the pack macros.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 3fb60e9cef freedreno/drm: Add fd_ringbuffer_attach_bo()
Which does only the bo bookkeeping and skips all the extra legacy reloc
related overhead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 07d9df0ce2 freedreno/drm: Inline fd_bo_get_iova()
The struct body was originally hidden to avoid it being part of the ABI
between libdrm_freedreno and mesa.  But that is no longer a problem.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark 76953ca4bb freedreno/ir3: GC unused macro
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Rob Clark c00491d4ab freedreno: Update github wiki links
The github wiki isn't really maintained anymore.  Update references to
point to the gitlab wiki instead.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18646>
2022-09-20 02:22:19 +00:00
Bas Nieuwenhuizen b2972cf410 radv: Add scratch stack to reduce LDS stack in RT traversal.
The current stack size is a significant limiter for occupancy, and
hence we need smaller stacks in LDS.

Rhys earlier had a patch that just put the N entries closest to the
root in LDS and the rest in scratch. However, this is not ideal for
performance as most of the activity is happening away from the root,
near the leaves. Of course we can't just switch it around, as the
leaf activity likely isn't happening all the way at the end of the
stack.

So what we do is make the LDS stack kinda a ringbuffer by always
accessing it using the stack index modulo the buffer size (always
a power of two so we can efficiently mask). If we then do not have
free space in this buffer we evict the entries closest to the root
to scratch and if we hit the "bottom" of the LDS space we load from
scratch.

Some rough perf numbers for indication with Q2RTX:

| evicting | LDS entries | perf |
|----------|-------------|------|
|       no |          76 |  55% |
|       no |          32 | 100% |
|       no |          24 | 105% |
|      yes |          32 |  95% |
|      yes |          16 | 100% |
|      yes |           8 |  90% |
|      yes |           4 |  75% |

(For the case with 4 entries we need to do some extra accounting as
 a full batch may not be available to evict)

So an obvious choice is to use a stack of 16 entries.

One might wonder if Q2RTX perf is mainly good due to BVHs with very
little geometry and hence low depth, so I also did some profiling
with control. This is done with RGP instruction timing, so this is
instructions executed not weighted for enabled masks, i.e. divergence
effects included.

| game    | LDS entries | scratch action | fraction of iterations |
|---------|-------------|----------------|------------------------|
| Control |           8 |          store |                  10.3% |
| Control |           8 |          load  |                  34.8% |
| Control |          16 |          store |                  0.58% |
| Control |          16 |          load  |                  2.62% |
| Q2RTX   |          16 |          store |                  1.00% |
| Q2RTX   |          16 |          load  |                  3.07% |

So Q2RTX doesn't seem like an unreasonably good case for this
algorithm.

On the implementation side, we can always place the scratch stack at
address 0 by just reserving the scratch space, and in the case of fixed
callstack size moving that up. In the dynamic case the dynamic stack
base already takes any reserved scratch space into account.

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
Rhys Perry 7d26fafacf radv: fix dynamic RT stack size with VGPR spilling
VGPR spilling might cause VGPRs to be spilled at scratch offset 0, so we
can't use that.

fossil-db (Sienna Cichlid, Q2RTX and Control):
Totals from 4 (0.26% of 1524) affected shaders:
Instrs: 8734 -> 8737 (+0.03%)
CodeSize: 48492 -> 48504 (+0.02%)
Latency: 384375 -> 384369 (-0.00%)
InvThroughput: 256250 -> 256246 (-0.00%)
Copies: 1312 -> 1313 (+0.08%)
Branches: 256 -> 258 (+0.78%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18541>
2022-09-20 01:39:20 +00:00
Dave Airlie b983fcb585 docs: add new llvmpipe/lavapipe atomic float extensions
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18636>
2022-09-20 01:10:36 +00:00
Dave Airlie 31695f81c9 lavapipe: export VK_KHR_shader_atomic_float
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18636>
2022-09-20 01:10:36 +00:00
Dave Airlie 64845cdfed llvmpipe: export GL_NV_shader_atomic_float
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18636>
2022-09-20 01:10:36 +00:00