AlexIndustrial/mesa

Author	SHA1	Message	Date
Dmitry Osipenko	7b40d32187	util/mesa-db: Open DB files during access time Open DB files when DB is accessed and close them afterwards to reduce number of FDs used by multi-part DB cache. Fixes: `fd9f7b748e` ("util/mesa-db: Introduce multipart mesa-db cache") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11776 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11810 Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Dmitry Osipenko	2a9378a0f9	util/mesa-db-multipart: Open one cache part at a time Open one cache DB part at a time for a multi-part cache to reduce number of FDs used by the cache. Previously multi-part DB cache instance was consuming 100 FDs, now it's 2 and cache files are opened when cache is read or written instead of opening them at the init time. Fixes: `fd9f7b748e` ("util/mesa-db: Introduce multipart mesa-db cache") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11776 Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Dmitry Osipenko	6a2f5cb556	util/mesa-db: Fix missing O_CLOEXEC Use O_CLOEXEC flag for opened cache DB files to not leak cache FDs when process forks. Fixes: `32211788d0` ("util/disk_cache: Add new mesa-db cache type") Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11810 Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	92893309bc	util/mesa-db: Further simplify mesa_db_compact Taking advantage of the persistent array of index entries. In particular, it's no longer necessary to read from the index file during compaction. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	031f2c2a69	util: Use persistent array of index entries Instead of allocating separate memory for each index entry in the hash table, use a single array (backed by a mapping of anonymous memory pages, which allows efficient array resizes) which holds a copy of the index file contents. The hash table now references each entry via its offset in the index file, so that the array address can change on resize. This eliminates some index file reads and reduces memory management overhead for the hash table entries. It should be more efficient in general. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	feef4bf828	util/mesa-db: Use single read for whole index Instead of separate reads per index entry. Should be more efficient. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	1ba3996fd5	util/mesa-db: Reserve hash table for total number of index entries Without this, the hash table needed to be rehashed about log2(<total number of entries>) times as it grew. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	e596882dd1	util/mesa-db: Recreate files if header load or index update fails The previous behaviour had these issues: 1. It meant that this part of the cache couldn't be used this time. 2. It left the corrupted index/cache files unchanged, so the same failure might happen again next time. Recreating the index & cache files for this part means it can be used, it just loses any previously cached contents. Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Michel Dänzer	13c44abaac	util/mesa-db: Make mesa_db_lock robust against signals flock may be interrupted by a signal, in which case it returns with EINTR error. In this case we need to retry until it returns success or another error. Fixes: `32211788d0` ("util/disk_cache: Add new mesa-db cache type") Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30988>	2024-10-25 18:06:14 +00:00
Georg Lehmann	d01c1ba939	aco: move exec copy out of waterfall loops Foz-DB Navi21: Totals from 348 (0.44% of 79395) affected shaders: CodeSize: 17944800 -> 17946268 (+0.01%); split: -0.02%, +0.03% Latency: 29775973 -> 29774369 (-0.01%); split: -0.01%, +0.00% InvThroughput: 10233380 -> 10232801 (-0.01%); split: -0.01%, +0.00% Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>	2024-10-25 16:47:32 +00:00
Georg Lehmann	6c73a8a7f2	aco: optimize conditional divergent breaks at the end of loops Removes one branch and one s_mov. Foz-DB Navi21: Totals from 1483 (1.87% of 79395) affected shaders: Instrs: 6424114 -> 6373084 (-0.79%) CodeSize: 35309320 -> 35091084 (-0.62%); split: -0.63%, +0.01% Latency: 87950935 -> 88030841 (+0.09%); split: -0.03%, +0.12% InvThroughput: 24784756 -> 24799536 (+0.06%); split: -0.02%, +0.08% Copies: 588743 -> 561805 (-4.58%) Branches: 242521 -> 215578 (-11.11%) SALU: 877856 -> 850918 (-3.07%) Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>	2024-10-25 16:47:32 +00:00
Georg Lehmann	075c5818cb	aco/ssa_elimination: don't assume exec writes can be removed based on block kind Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>	2024-10-25 16:47:32 +00:00
Georg Lehmann	61ab33c883	aco/ssa_elimination: add instr_accesses helper Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19070>	2024-10-25 16:47:32 +00:00
Valentine Burley	7ec0b62341	ir3: Don't lower to LCSSA before calling nir_divergence_analysis() NIR can now calculate divergence without converting to LCSSA beforehand. However, removing this particular instance of nir_convert_to_lcssa was missed in commit `87cb42f953` ("treewide: don't lower to LCSSA before calling nir_divergence_analysis()") Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31821>	2024-10-25 16:12:51 +00:00
Valentine Burley	5bb0296e08	freedreno/devices: Establish a7xx sub-generations We can differentiate three distinctive sub-generations on a7xx. This reduces the number of copy-pasted quirks. Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31821>	2024-10-25 16:12:51 +00:00
Valentine Burley	0981f983ee	freedreno/devices: Enable 64-bit atomics on a735 and a740v3 The blob exposes VK_KHR_shader_atomic_int64 on these devices too, but this was missed during initial enablement. Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31821>	2024-10-25 16:12:51 +00:00
Valentine Burley	da989edde8	freedreno/devices: Document common name for a635 speedbins Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31821>	2024-10-25 16:12:51 +00:00
Valentine Burley	45bb8002df	freedreno/devices: Inline a690 quirk Similarly as on FD621, we only have one GPU-specific quirk, no need to use a separate dictionary for it. Signed-off-by: Valentine Burley <valentine.burley@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31821>	2024-10-25 16:12:51 +00:00
Rob Clark	7f63fa34da	nir/lower_amul: Fix ASAN error We shouldn't assume the bindings are sparse when we allocate an array indexed on the binding. See, for example: dEQP-GLES31.functional.program_interface_query.buffer_variable.random.55 Fixes: `2e833b16bc` ("nir/lower_amul: Use num_ubos/ssbos instead of recomputing it.") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>	2024-10-25 15:38:51 +00:00
Rob Clark	e548f90edb	freedreno/ir3: Create UBO variables for driver-UBOs Some nir passes, like lower_amul, expect to have variables declared for things that are accessed via load_ubo(). Fixes: `76e417ca59` ("turnip,ir3/a750: Implement consts loading via preamble") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31611>	2024-10-25 15:38:51 +00:00
Jocelyn Falempe	b24d4f0c86	gbm/dri: Fix color format for big endian. Using wayland on s390x has all the colors wrong. Mesa reports using GBM_FORMAT_XRGB8888 but inside the buffer, the colors are in GBM_FORMAT_BGRX8888 order. This patch fixes it for common formats, and also introduced BGRX8888 which is the default on big endian. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31707>	2024-10-25 14:18:24 +00:00
Jocelyn Falempe	3814dee11a	gbm/dri: Use PIPE_FORMAT_* instead of using __DRI_IMAGE_* __DRI_IMAGE formats are not well defined for big endian. This patch has no functional change and prepares the work to better support big endian. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31707>	2024-10-25 14:18:24 +00:00
Jocelyn Falempe	c6d7ab7c1f	loader: Fix typo in __DRI_IMAGE_FORMAT_XBGR16161616 definition The X and A format are inverted by mistake. Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31707>	2024-10-25 14:18:24 +00:00
Pierre-Eric Pelloux-Prayer	60f7b2fc9f	radeonsi/ci: mark *.tessellation_shader_tessellation.max_in_out_attributes as fixed Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>	2024-10-25 13:36:54 +00:00
Pierre-Eric Pelloux-Prayer	9434ac65f4	glsl: use nir_io_add_const_offset_to_base in gl_nir_opts This fixes: KHR-GLES32.core.tessellation_shader.tessellation_shader_tessellation.max_in_out_attributes Without this change the assert in gather_output is hit: assert(!nir_src_is_const(offset) || nir_src_as_uint(offset) == 0) Because nir_opt_algebraic determines that some ssa values are constant, but the nir_io_add_const_offset_to_base wasn't run afterwards. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>	2024-10-25 13:36:54 +00:00
Pierre-Eric Pelloux-Prayer	60578df33a	nir: skip offset=0 in nir_io_add_const_offset_to_base When offset=0, the pass was a no-op but was setting the progress flag which could cause infinite loops when this pass is going to be added to gl_nir_opts. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31684>	2024-10-25 13:36:54 +00:00
David Rosca	f24c799c67	radeonsi/vcn: Only enable skip mode with matching references Skip mode frames must match the reference frames otherwise skip mode needs to be disabled. Fixes: `1e1f078099` ("radeonsi/vcn: Add support for VCN5 AV1 compound") Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31805>	2024-10-25 13:09:15 +00:00
Samuel Pitoiset	38d7492391	ci: uprev VKCTS to 1.3.10.0 This tag contains tests for DGC EXT. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31789>	2024-10-25 14:03:37 +02:00
Joshua Ashton	c66fd95d92	radv: Fix sample locations at 0 for X/Y We cannot set the {X,Y}MAX_RIGHT_EXCLUSION bits if we have a sample location at a pixel boundary. CTS does not seem to be catching this. Signed-off-by: Joshua Ashton <joshua@froggi.es> Co-authored-by: Vitaliy Triang3l Kuzmin <triang3l@yandex.ru> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31839>	2024-10-25 11:24:12 +00:00
Joshua Ashton	130a423118	radv: Enable variableSampleLocations This should come for free now we are dynamic rendering based. This passes CTS on RX 7900XTX. Signed-off-by: Joshua Ashton <joshua@froggi.es> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31839>	2024-10-25 11:24:12 +00:00
Rhys Perry	8efc765a3d	nir/algebraic: fix shfr optimization with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Rhys Perry	b2abd3bdba	nir: fix shfr constant folding with zero src2 No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `08903bbe89` ("nir: add mqsad_4x8, shfr and nir_opt_mqsad") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31808>	2024-10-25 09:59:40 +00:00
Eric Engestrom	03f056ea71	ci: skip slow tests on all non-"full" jobs Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31828>	2024-10-25 08:26:31 +00:00
Eric Engestrom	bedb2f8a86	ci: rename "merge-skips" to "slow-skips" as they're about to be used outside of merge pipelines Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31828>	2024-10-25 08:26:31 +00:00
Samuel Pitoiset	927a17f30a	amd: do not emit PA_SU_PRIM_FILTER_CNTL in the common GFX preamble RADV needs to adjust this register for user sample locations because it seems possible to have a sample on the -8 coordinate. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815>	2024-10-25 07:41:22 +00:00
Samuel Pitoiset	3d172d08b0	radv: do not emit PA_SC_CONSERVATIVE_RASTERIZATION_CNTL in the preamble on GFX12 It's already emitted as part of the cmdbuf. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31815>	2024-10-25 07:41:22 +00:00
Samuel Pitoiset	56cffd4b9b	radv: simplify determining if a graphics pipeline uses NGG culling has_ngg_culling can only be TRUE if the last VGT shader also uses NGG. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31829>	2024-10-25 07:10:28 +00:00
Samuel Pitoiset	62efebfd70	radv: fix emitting NGG culling state for ESO It's possible to enable NGG culling with ESO if shaders are linked, or if the VS doesn't need a prolog or if TES is used. This wasn't supposed to be enabled but I think it worked just by luck because the user SGPR value was probably zero and NGGC was disabled at draw time. Found by inspection. Cc: mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31829>	2024-10-25 07:10:27 +00:00
Samuel Pitoiset	982af1a2bc	radv: capture shader statistics when RGP is enabled This is useful in order to correlate shader hashes between RGP and Fossilize. This is because Fossilize needs to pass the capture statistics flag for getting shader hashes and the pipeline key won't match otherwise. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31820>	2024-10-25 06:29:02 +00:00
Eric Engestrom	460c2eb967	ci: move shellcheck options to .shellcheckrc That way, IDEs get to have the same behaviour as the CI Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31826>	2024-10-24 22:43:03 +00:00
Francisco Jerez	e2eba3c7da	intel/brw/xe2+: Adjust performance analysis divergence weight due to EU fusion removal. This reduces the penalty the heuristic gives to SIMD32 shaders relative to SIMD16 in presence of discard control flow on Xe2+. The penalty was meant to account for the inefficient divergence behavior of SIMD32 shaders on Gfx12.x platforms, since Gfx12 hardware had EUs bundled in groups of two, and each pair shared control flow logic so both EUs could only execute instructions in lockstep, which meant that SIMD32 shaders had an effective warp size of 64 on Gfx12.x. This change switches back to more optimistic modelling of discard divergence. With it we gain about 6% performance in a Shadow of the Tomb Raider trace (tested on BMG). One may wonder if there are still workloads that would suffer materially from enabling SIMD32 for all pixel shaders on Xe2 instead of using this heuristic, since Xe2 EUs have twice the GRF space, twice the FPU throughput and better divergence behavior than Xe, but the answer seems to be yes unfortunately: E.g. Superposition has some pixel shaders where SIMD32 has substantially worse scheduling due to the increased number of false dependencies due to higher register pressure, and using SIMD32 for them reduces performance significantly. The heuristic seems to model this correctly so it doesn't look like we can do without it at least right now on Xe2. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31697>	2024-10-24 22:06:52 +00:00
Kenneth Graunke	7bed11fbde	intel/brw: Allow immediates in the BFE instruction on Gfx12+ We weren't allowing immediates in BFE at all. Gfx12+ supports immediates in src0 (value) and src2 (width), but not src1 (offset). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31437>	2024-10-24 21:31:28 +00:00
Patrick Lerda	d19e2597ce	r600: fix spec ext_packed_depth_stencil getteximage This very test was working until the commit `4da147a02b` ("mesa: remove fallback for GL_DEPTH_STENCIL"). Indeed this commit lets the driver handle this path and this was failing on evergreen r600. The test was processed through r600_blit() which loads the fragment shader util_make_fs_blit_zs(). This fragment shader loads two textures, the stencil and depth. The depth texture was processed properly but the other texture was generating incorrect values. This issue, which seems to be related to the hardware configuration, disappears when the underlying surface is allocated using a width multiple of 32. This change was tested on cayman and palm with the normal test: "piglit/bin/ext_packed_depth_stencil-getteximage -auto -fb" and the test was modified to test all the relevant width and height values. The gpu rv770 was not affected by this issue. Here is the result: spec/ext_packed_depth_stencil/getteximage: fail pass Cc: mesa-stable Signed-off-by: Patrick Lerda <patrick9876@free.fr> Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31757>	2024-10-24 21:06:36 +00:00
Aditya Swarup	e98759c7f4	anv: Use RCS engine for copying stencil resource for gfx125 HSD 14021541470 lists a HW bug on blitter engine where the compression pairing bit is not programmed correctly for stencil resources. Use RCS Engine to perform copy instead. Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31792>	2024-10-24 20:14:13 +00:00
Chia-I Wu	5fea98c4a1	panvk: fix scissor box Fix a typo in prepare_vp which causes incorrect scissor box with non-zero X in viewport/scissor. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31832>	2024-10-24 12:49:02 -07:00
Chia-I Wu	029b8b11a0	panvk: fix gl_VertexIndex According to pandecode, r32 is global attribute offset and r36 is vertex offset. Follow panfrost to use r36 instead of r32 for both non-indexed firstVertex and indexed vertexOffset. With this, gl_VertexIndex stops being zero-based which is incorrect. Fixes: `5544d39f44` ("panvk: Add a CSF backend for panvk_queue/cmd_buffer") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31810>	2024-10-24 18:19:48 +00:00
Georg Lehmann	b79950fc1f	aco: remove heuristic that restricts VOP2/C with 2 sgprs Looking at the stats, the slightly increased code size isn't a problem compared to the benefits. This also only affects gfx10+, and those generations aren't throughput limited by 64bit instructions like early gcn. Foz-DB Navi21: Totals from 12377 (15.59% of 79395) affected shaders: MaxWaves: 269323 -> 269857 (+0.20%); split: +0.23%, -0.03% Instrs: 16505304 -> 16472552 (-0.20%); split: -0.21%, +0.01% CodeSize: 89815804 -> 90130344 (+0.35%); split: -0.02%, +0.37% VGPRs: 661160 -> 658640 (-0.38%); split: -0.40%, +0.02% SpillSGPRs: 3032 -> 3049 (+0.56%) SpillVGPRs: 826 -> 796 (-3.63%) Latency: 145800231 -> 145818568 (+0.01%); split: -0.14%, +0.15% InvThroughput: 39026010 -> 38892467 (-0.34%); split: -0.36%, +0.02% VClause: 325693 -> 325992 (+0.09%); split: -0.12%, +0.21% SClause: 497938 -> 497208 (-0.15%); split: -0.23%, +0.08% Copies: 1239036 -> 1204045 (-2.82%); split: -2.90%, +0.07% Branches: 462952 -> 462934 (-0.00%); split: -0.01%, +0.00% PreSGPRs: 586066 -> 587558 (+0.25%) PreVGPRs: 550024 -> 547736 (-0.42%) VALU: 11147608 -> 11114528 (-0.30%); split: -0.31%, +0.01% SALU: 2105546 -> 2105131 (-0.02%); split: -0.03%, +0.01% VMEM: 575983 -> 575923 (-0.01%) Foz-DB Navi31: Totals from 11544 (14.54% of 79395) affected shaders: MaxWaves: 319612 -> 319804 (+0.06%) Instrs: 17563158 -> 17527341 (-0.20%); split: -0.22%, +0.02% CodeSize: 92366832 -> 92626280 (+0.28%); split: -0.03%, +0.31% VGPRs: 667620 -> 665484 (-0.32%); split: -0.33%, +0.01% SpillSGPRs: 3418 -> 3434 (+0.47%) SpillVGPRs: 896 -> 858 (-4.24%) Scratch: 4738048 -> 4736512 (-0.03%) Latency: 141366653 -> 141399756 (+0.02%); split: -0.10%, +0.12% InvThroughput: 26213994 -> 26165751 (-0.18%); split: -0.21%, +0.03% VClause: 307956 -> 308124 (+0.05%); split: -0.12%, +0.18% SClause: 477816 -> 477326 (-0.10%); split: -0.18%, +0.08% Copies: 1161148 -> 1129386 (-2.74%); split: -2.81%, +0.08% Branches: 411509 -> 411506 (-0.00%); split: -0.00%, +0.00% PreSGPRs: 531354 -> 535027 (+0.69%) PreVGPRs: 525201 -> 521861 (-0.64%) VALU: 10360363 -> 10330274 (-0.29%); split: -0.30%, +0.01% SALU: 1778044 -> 1777585 (-0.03%); split: -0.04%, +0.01% VMEM: 551379 -> 551303 (-0.01%) VOPD: 3539 -> 3471 (-1.92%); split: +0.14%, -2.06% Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31804>	2024-10-24 17:44:13 +00:00
Georg Lehmann	54fa55a3f7	radv: don't use v_mqsad_u32_u8 on gfx7 According to tests on hawaii, v_mqsad_u32_u8 always uses saturating accumulation while v_msad_u8 truncates. GFX8+ can control this with the VOP3 clamp bit, on older hardware that's not supported. We want truncation for the NIR opcode. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12062 Fixes: `c3c138b10f` ("radv: optimize msad_4x8 to mqsad_4x8") Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31809>	2024-10-24 17:20:56 +00:00
Eric Engestrom	a85ed2a28f	lavapipe/ci: document regression in the commit range 765d1c47...366f63fd There's a cts uprev in one of these commits, so it's possible they're all just new tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31825>	2024-10-24 16:50:44 +00:00
Eric Engestrom	150fd992b6	lavapipe/ci: skip builtin ray query tests that take too long and time out Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31825>	2024-10-24 16:50:44 +00:00

196923 Commits