186467 Commits

Author SHA1 Message Date
Casey Bowman 0b60969ec2 vulkan/screenshot-layer: Fix memory leaks
This frees a fairly large amount of memory from the 2D matrix by
iterating over the rows to free them individually.

Liuqiang spotted some places where we return early in the threaded
function without freeing some pointers.

To remedy this, we'll reorder the checks so that we don't have to
return early and can instead use an if/else flow to take care of
these problematic areas in a more elegant way.

Co-authored-by: Casey Bowman <casey.g.bowman@intel.com>
Co-authored-by: liuqiang <liuqiang@kylinos.cn>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
2024-11-01 17:11:29 +00:00
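
A minimal sketch of the two ideas in the memory-leak commit above: freeing a 2D matrix row by row, and replacing early returns with an if/else flow so that cleanup always runs. This is not the screenshot layer's code; `matrix_alloc`, `matrix_free`, `process_frame`, and the 64-byte scratch buffer are hypothetical names and sizes chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: a rows x cols byte matrix, one allocation per row. */
static unsigned char **matrix_alloc(size_t rows, size_t cols)
{
   unsigned char **m = calloc(rows, sizeof(*m));
   if (!m)
      return NULL;
   for (size_t r = 0; r < rows; r++) {
      m[r] = calloc(cols, 1);
      if (!m[r]) {
         /* Unwind the rows allocated so far before giving up. */
         for (size_t i = 0; i < r; i++)
            free(m[i]);
         free(m);
         return NULL;
      }
   }
   return m;
}

/* Free every row first, then the array of row pointers. */
static void matrix_free(unsigned char **m, size_t rows)
{
   if (!m)
      return;
   for (size_t r = 0; r < rows; r++)
      free(m[r]);
   free(m);
}

/* Hypothetical worker with a single exit: the checks are folded into an
 * if/else, so no path can skip the cleanup at the bottom. */
static int process_frame(size_t rows, size_t cols)
{
   assert(rows > 0 && cols > 0);

   int ret;
   unsigned char **m = matrix_alloc(rows, cols);
   char *scratch = malloc(64);

   if (m && scratch) {
      m[0][0] = 1;        /* stand-in for the real work */
      scratch[0] = '\0';
      ret = 0;
   } else {
      ret = -1;
   }

   matrix_free(m, rows);  /* safe when m is NULL */
   free(scratch);         /* free(NULL) is a no-op */
   return ret;
}
```

Calling `process_frame(4, 8)` returns 0 on success and -1 if either allocation fails; in both cases every allocation is released before the function returns.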
Casey Bowman 1438cb5c25 vulkan/screenshot-layer: Increase buffer sizes
This allows larger buffer sizes when using the env config as well
as filepath for the output directory.

This will allow, for example, using a large number of individual frames:
frames=1/2/3/4/5/6/7/8/.../300

Also fixed an issue with filepaths sometimes being appended with garbage
characters due to not being initialized.

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
2024-11-01 17:11:29 +00:00
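
One plausible way an uninitialized path buffer ends up with garbage characters, sketched under assumptions: if the path is built by appending to a buffer that was never zeroed, the append starts after whatever bytes happen to be there. `OUT_PATH_MAX` and `build_output_path` are hypothetical and are not taken from the layer's source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUT_PATH_MAX 256 /* hypothetical buffer size */

/* Build "<dir><file>" in out and return the resulting length.
 * Zeroing the buffer first guarantees strncat() starts from an empty
 * string instead of whatever the buffer previously contained. */
static size_t build_output_path(char out[OUT_PATH_MAX],
                                const char *dir, const char *file)
{
   assert(out && dir && file);

   memset(out, 0, OUT_PATH_MAX);
   strncat(out, dir, OUT_PATH_MAX - 1);
   /* Leave room for the terminating NUL after the second append. */
   strncat(out, file, OUT_PATH_MAX - 1 - strlen(out));
   return strlen(out);
}
```

Even when the caller passes a buffer full of stale bytes, the result contains only the directory followed by the file name.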
Casey Bowman 461e1f985f vulkan/screenshot-layer: Fix image index selection
Previously, only the first image in the swapchain was chosen at all times
to be copied to a file.

This meant that if a list of consecutive images was selected, multiple
duplicate images would be saved, instead of the proper frames actually
used in the workload.

Now, the index is properly obtained from AcquireNextImageKHR(), so the
image actually used by the workload is the one that is copied and saved
to a file.

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31793>
2024-11-01 17:11:29 +00:00
Mike Blumenkrantz 5fd0b634d4 zink: add VVL for RADV jobs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27705>
2024-11-01 16:49:50 +00:00
Mike Blumenkrantz 01608a4067 zink: stop leaking precompiled generated tcs
this may have been created during precompile when using shader objects

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27705>
2024-11-01 16:49:50 +00:00
Samuel Pitoiset f7636b611a radv: add a struct that describes the trap handler layout
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934>
2024-11-01 15:40:25 +00:00
Samuel Pitoiset 49682fc0cb radv,aco: save SQ_WAVE_GPR_ALLOC from the trap handler
This would be used to dump SGPRs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31934>
2024-11-01 15:40:25 +00:00
Samuel Pitoiset 31fc3199dd radv: fix dumping the faulty shader detected by the trap handler on GFX9+
The most significant bits need to be cleared.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>
2024-11-01 15:01:35 +00:00
Samuel Pitoiset 7b4da7f736 radv: only emit the TBA/TMA registers on GFX8
On GFX9+, these registers are privileged and the kernel needs to
configure them.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>
2024-11-01 15:01:35 +00:00
Samuel Pitoiset 930395c5e4 radv: check for has_trap_handler_support instead of asserting
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>
2024-11-01 15:01:35 +00:00
Samuel Pitoiset e27ba67d33 ac: add ac_gpu_info::has_trap_handler_support
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31925>
2024-11-01 15:01:35 +00:00
Samuel Pitoiset b23cc8c1d3 radv: add missing L2 non-coherent image case for mipmaps with DCC/HTILE on GFX11
According to PAL, an image with DCC/HTILE and mipmaps isn't coherent
with L2 when the mip level is in the metadata mip-tail region.

This fix isn't super optimal because the driver should rely on the
subresource range to determine if the mip level is in the mip-tail,
but it's easier to backport. Upcoming commits will optimize that.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11939
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31920>
2024-11-01 14:36:55 +00:00
David Rosca c9ade8c3b5 radeonsi/vcn: Enable VCN4 AV1 encode WA
Cc: mesa-stable
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31889>
2024-11-01 14:05:04 +00:00
Job Noorman 0d94bf1ef9 freedreno,computerator: add support for local memory
Add @localmem header to set the shared size of the shader. This allows
instructions like ldlw and stlw to be used.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31919>
2024-11-01 10:22:37 +00:00
Paulo Zanoni 5ca883505e brw: add a NOP in between WHILE instructions on LNL
This is a workaround that is still in progress, see HSD 22020521218.
If we don't have these NOPs we may see rendering corruption or even
GPU hangs.

While we still don't fully understand the issue from the hardware
point of view, let's have this workaround so we can pass CTS and move
things forward. If we need to change this later, we can. Besides, the
impact is minimal. Shaderdb/fossilize report no changes for this
patch.

On our Blackops trace, the lack of this patch causes corruption in fog
rendering (rectangles where fog was supposed to be shown don't show
the fog).

On dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters, without
this patch we get a GPU hang.

Backport-to: 24.2
Testcase: dEQP-VK.graphicsfuzz.cov-array-copies-loops-with-limiters
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11813
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31331>
2024-10-31 23:57:10 +00:00
Deborah Brouwer a39d6f5003 freedreno/ci: remove redundant skip files
When running deqp-runner with a toml suite, the skip files can be
specified in the toml configuration or on the command line. The names of
most skip files are generated in `deqp-runner.sh` and passed through on
the command line so it’s not necessary to specify them again in the toml
suite. It doesn’t hurt, but it can be confusing.

Simplify the toml files by removing the duplicate skip files.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31912>
2024-10-31 15:05:16 -07:00
Rob Clark eef0b09939 freedreno/a6xx: Random whitespace fix
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31927>
2024-10-31 20:02:00 +00:00
Rob Clark 49dd40247d freedreno/a6xx: Don't check dst coords
Only the src coords of a blit must be in-bounds.

"Fixes" dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_out_of_bounds_{min,mag}*
by virtue of avoiding the 3d u_blitter fallback, where NEAREST filtering
doesn't do what the deqp test expects.

See https://gitlab.freedesktop.org/mesa/mesa/-/issues/12085

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31927>
2024-10-31 20:02:00 +00:00
Jordan Justen 39fab9b240 intel/dev: Set L3 bank count for Xe2+ from Xe KMD
Rather than updating intel_device_info_update_l3_banks(), the Xe KMD
provides this info via the DRM_XE_DEVICE_QUERY_GT_TOPOLOGY query item.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31894>
2024-10-31 18:40:27 +00:00
Samuel Pitoiset 01f329ec82 radv/ci: skip dEQP-VK.api.command_buffers.many_indirect_disps_on_secondary
It can also hang randomly on VanGogh, let's skip it by default for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31922>
2024-10-31 11:44:12 +00:00
Erik Faye-Lund 62622c6523 panvk: enable KHR_16bit_storage
This enables the 16bit storage extensions, with the
uniformAndStorageBuffer16BitAccess feature-bit.

This seems to already be implemented, so let's just expose it!

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31907>
2024-10-31 11:06:28 +00:00
Samuel Pitoiset 77e59eefc1 radv: add an option to configure the trap handler exceptions
This introduces RADV_TRAP_HANDLER_EXCP to configure the various
shader exceptions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902>
2024-10-31 06:58:15 +00:00
Samuel Pitoiset 6b5a0f57ba radv: fix configuring the memory violation exception for the compute stage
The compute stage has two EXCP_EN fields and the memory violation bit
is in EXCP_EN_MSB. Confirmed by writing a small test on GFX8.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31902>
2024-10-31 06:58:14 +00:00
Chia-I Wu e474d4ebee panvk: add support for VK_KHR_timeline_semaphore
On panthor, VK_SYNC_FEATURE_TIMELINE is always supported.  On panfrost,
we can use vk_sync_timeline_get_type.

Note that there is a kernel issue regarding syncobj query that causes
dEQP-VK.synchronization.timeline_semaphore.wait.poll_signal_from_device
to time out when VK_SYNC_FEATURE_TIMELINE is set.  It is considered a
kernel bug and is not dealt with here.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
2024-10-30 21:04:20 +00:00
Chia-I Wu 287a4f4701 panvk/jm: assert that the submit mode is not threaded
If the submit mode was VK_QUEUE_SUBMIT_MODE_THREADED, we would need to
call vk_common_QueueWaitIdle.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
2024-10-30 21:04:20 +00:00
Chia-I Wu 60ade50d2d Revert "panvk: Set the submit mode to THREADED_ON_DEMAND"
This reverts commit aedb00ca08.
vk_device_init is able to set the submit mode correctly based on
vk_sync_type.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
2024-10-30 21:04:20 +00:00
Chia-I Wu d3eb432155 panvk: remove an incorrect assert in collect_cs_deps
src_stages_to_subqueue_sb_mask calls stages_cover_subqueue, but also has
a special case for VK_PIPELINE_STAGE_2_DRAW_INDIRECT_BIT.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31720>
2024-10-30 21:04:20 +00:00
Lionel Landwerlin 1485b5659a anv: update some of the indirect invalidations
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31915>
2024-10-30 20:39:31 +00:00
Lionel Landwerlin cb224370b6 anv: avoid L3 fabric flush in pipeline barriers
This bit is not needed for barriers and appears to trigger a
performance regression. So leave it just for AUX-TT
flushing/invalidation.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e3814dee1a ("anv: add plumbing/support for L3 fabric flush")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12090
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31915>
2024-10-30 20:39:31 +00:00
Rob Clark 98ff271c5a util/primconvert: Avoid OoB with improbable draws
Detect when the temporary index buffer cannot be generated because the
primitive count is too large, and simply drop the draw on the floor.

Fixes a webgl reachable asan/crash.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12092
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31914>
2024-10-30 19:59:14 +00:00
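
A rough sketch of the kind of check the primconvert commit above describes, assuming the temporary index buffer size is a 32-bit byte count: do the multiplication in 64 bits and report failure when the result does not fit, so the caller can drop the draw. `index_buffer_size` and its parameters are hypothetical, not util/primconvert's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compute the byte size of a temporary index buffer.
 * Returns false when the size does not fit in 32 bits; the caller is
 * then expected to skip the draw rather than write out of bounds. */
static bool index_buffer_size(uint32_t prim_count, uint32_t indices_per_prim,
                              uint32_t index_size, uint32_t *size_out)
{
   assert(size_out);
   assert(index_size == 1 || index_size == 2 || index_size == 4);

   /* Two 32-bit factors cannot overflow a 64-bit product. */
   uint64_t indices = (uint64_t)prim_count * indices_per_prim;
   if (indices > UINT32_MAX / index_size)
      return false;

   *size_out = (uint32_t)indices * index_size;
   return true;
}
```

For example, 100 triangles with 4-byte indices need 1200 bytes, while a primitive count near `UINT32_MAX` is rejected instead of wrapping around.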
Alyssa Rosenzweig 506b9a5ff5 nir/divergence_analysis: add AGX atomics
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: M Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31909>
2024-10-30 19:04:32 +00:00
Collabora's Gfx CI Team ff442e49b3 Uprev Piglit to c2b31333926a6171c3c02d182b756efad7770410
https://gitlab.freedesktop.org/mesa/piglit/-/compare/791e420b2628c1e35eea81b3bafdb1c904a141e8...c2b31333926a6171c3c02d182b756efad7770410

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31811>
2024-10-30 18:11:56 +00:00
Timur Kristóf 96b95c8427 radv: Flush L2 cache for non-L2-coherent images in EndCommandBuffer.
This fixes a CTS hang on Hawaii.

We previously only did a CB/DB flush,
but that doesn't include an L2 cache flush.
Also fix the comment that said this is for GFX9+.

Fixes: 7c62f6fa01
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31906>
2024-10-30 17:46:50 +00:00
Samuel Pitoiset 7015e22cb6 ac/nir: cull triangles/lines when all W positions are zero/NaN
It looks like the fixed-func hardware is very slow to cull primitives
with zero pos.w but shader based culling helps a lot.

This fixes a massive performance gap with the FSR2 demo compared to
AMDGPU-PRO, +228% on RDNA2.

Based on my investigation, AMDGPU-PRO seems to always cull these
primitives. Note that disabling NGG culling with AMDGPU-PRO reports the
same performance as RADV without that fix. Also note that the FSR2
sample doesn't specify any cull mode (i.e. VK_CULL_MODE_NONE is used),
so this is the only reason PRO was culling more than RADV.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7260
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31891>
2024-10-30 17:09:37 +00:00
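
The culling condition from the commit above, written as a scalar CPU-side sketch. The real change lives in NIR shader lowering and operates on shader values; `w_is_zero_or_nan` and `cull_all_w_zero_or_nan` are hypothetical names used only to illustrate the predicate "all W positions are zero/NaN".

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* True when a single W component is +0, -0 or NaN. */
static bool w_is_zero_or_nan(float w)
{
   return w == 0.0f || isnan(w);
}

/* A primitive (2 vertices for a line, 3 for a triangle) is culled only
 * when every one of its vertices has a zero or NaN W. */
static bool cull_all_w_zero_or_nan(const float *w, unsigned num_vertices)
{
   assert(w && num_vertices >= 1 && num_vertices <= 3);

   for (unsigned i = 0; i < num_vertices; i++) {
      if (!w_is_zero_or_nan(w[i]))
         return false;
   }
   return true;
}
```

A triangle with W values `{0.0f, NAN, -0.0f}` is culled, while `{0.0f, 1.0f, 0.0f}` is kept because one vertex still has a usable W.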
Connor Abbott d3533716f9 ir3: Fix detection of nontrivial continues
We may still need to insert a continue block even if there is only one
backedge, in a situation like:

for (...) {
   if (...) continue;
   foo();
   break;
}

We want foo() to be executed before reconverging. This is important for
the BVH encoding kernel, which launches an invocation for each node in
the tree and does a preorder traversal:

while (true) {
   if (!ready[node]) continue;
   encode();
   for (child node)
      ready[child] = true;
   break;
}

For the first few nodes, which will be in the same wave, we need
encode() for the root node to be called first, then its children spin
until ready, then the children call encode(), and so on. This can only
work if the children that aren't ready yet are parked while the parent
executes encode(), which requires the continue block.

This is also required because divergence analysis will assume that
uniform values written before the continue are still uniform after it,
which isn't the case now and causes an RA validation failure with Godot.

Fixes: 0fa93fb662 ("ir3: Fix convergence behavior for loops with continues")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31905>
2024-10-30 15:37:31 +00:00
Alyssa Rosenzweig 0f278bf3c5 hk: enable constant promotion
reduce the perf gap with GL :)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig c8870da833 agx: fold more inots
noticed in the tessellator.

total instructions in shared programs: 2757905 -> 2757078 (-0.03%)
instructions in affected programs: 105372 -> 104545 (-0.78%)
helped: 115
HURT: 0
helped stats (abs) min: 1 max: 29 x̄: 7.19 x̃: 6
helped stats (rel) min: 0.02% max: 6.67% x̄: 2.01% x̃: 2.44%
95% mean confidence interval for instructions value: -8.67 -5.71
95% mean confidence interval for instructions %-change: -2.31% -1.71%
Instructions are helped.

total alu in shared programs: 2172400 -> 2171573 (-0.04%)
alu in affected programs: 82535 -> 81708 (-1.00%)
helped: 115
HURT: 0
helped stats (abs) min: 1 max: 29 x̄: 7.19 x̃: 6
helped stats (rel) min: 0.03% max: 9.58% x̄: 2.90% x̃: 3.30%
95% mean confidence interval for alu value: -8.67 -5.71
95% mean confidence interval for alu %-change: -3.33% -2.47%
Alu are helped.

total fscib in shared programs: 2168107 -> 2167280 (-0.04%)
fscib in affected programs: 82535 -> 81708 (-1.00%)
helped: 115
HURT: 0
helped stats (abs) min: 1 max: 29 x̄: 7.19 x̃: 6
helped stats (rel) min: 0.03% max: 9.58% x̄: 2.90% x̃: 3.30%
95% mean confidence interval for fscib value: -8.67 -5.71
95% mean confidence interval for fscib %-change: -3.33% -2.47%
Fscib are helped.

total bytes in shared programs: 21534940 -> 21528976 (-0.03%)
bytes in affected programs: 774528 -> 768564 (-0.77%)
helped: 115
HURT: 1
helped stats (abs) min: 2 max: 192 x̄: 51.88 x̃: 42
helped stats (rel) min: 0.01% max: 6.06% x̄: 1.85% x̃: 2.11%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10%
95% mean confidence interval for bytes value: -62.70 -40.13
95% mean confidence interval for bytes %-change: -2.14% -1.52%
Bytes are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig d51ae1b634 agx: don't upload constant padding at the start
noticed in vkcube.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig d6d66bf72d asahi,agx: rework constant promotion upload
stuff promoted constants into the binary, this simplifies state management.
saves a big pile of alloc&copy in the gl driver. will unblock this for VK.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig a3696f29c1 agx: run algebraic later
to deal with ldst vectorize leftover

ironically worse due to nir_opt_preamble lottery, but confirmed it fixes ldst
vectorize silliness in preambles, making preambles a *lot* shorter.

total instructions in shared programs: 2759806 -> 2759882 (<.01%)
instructions in affected programs: 26821 -> 26897 (0.28%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 15 x̄: 7.60 x̃: 6
HURT stats (rel)   min: 0.07% max: 1.33% x̄: 0.47% x̃: 0.19%
95% mean confidence interval for instructions value: 3.65 11.55
95% mean confidence interval for instructions %-change: 0.09% 0.85%
Instructions are HURT.

total alu in shared programs: 2174292 -> 2174340 (<.01%)
alu in affected programs: 25727 -> 25775 (0.19%)
helped: 1
HURT: 10
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
HURT stats (abs)   min: 1 max: 11 x̄: 5.00 x̃: 4
HURT stats (rel)   min: 0.09% max: 0.52% x̄: 0.27% x̃: 0.23%
95% mean confidence interval for alu value: 1.92 6.81
95% mean confidence interval for alu %-change: 0.12% 0.37%
Alu are HURT.

total fscib in shared programs: 2170011 -> 2170059 (<.01%)
fscib in affected programs: 25727 -> 25775 (0.19%)
helped: 1
HURT: 10
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
HURT stats (abs)   min: 1 max: 11 x̄: 5.00 x̃: 4
HURT stats (rel)   min: 0.09% max: 0.52% x̄: 0.27% x̃: 0.23%
95% mean confidence interval for fscib value: 1.92 6.81
95% mean confidence interval for fscib %-change: 0.12% 0.37%
Fscib are HURT.

total bytes in shared programs: 18414728 -> 18415244 (<.01%)
bytes in affected programs: 234114 -> 234630 (0.22%)
helped: 1
HURT: 11
helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
helped stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02%
HURT stats (abs)   min: 4 max: 90 x̄: 47.64 x̃: 34
HURT stats (rel)   min: 0.03% max: 1.18% x̄: 0.39% x̃: 0.18%
95% mean confidence interval for bytes value: 20.47 65.53
95% mean confidence interval for bytes %-change: 0.08% 0.63%
Bytes are HURT.

total regs in shared programs: 864549 -> 864533 (<.01%)
regs in affected programs: 117 -> 101 (-13.68%)
helped: 3
HURT: 0
helped stats (abs) min: 4 max: 6 x̄: 5.33 x̃: 6
helped stats (rel) min: 10.26% max: 15.38% x̄: 13.68% x̃: 15.38%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 25c302d337 agx: test immediate packing opt
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 6d4dc9d9bf agx: negate iadd/imsub constants
total instructions in shared programs: 892853 -> 892841 (<.01%)
instructions in affected programs: 44400 -> 44388 (-0.03%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.02% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.03% -0.03%
Instructions are helped.

total alu in shared programs: 676057 -> 676045 (<.01%)
alu in affected programs: 28599 -> 28587 (-0.04%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.04% max: 0.05% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for alu value: -1.00 -1.00
95% mean confidence interval for alu %-change: -0.05% -0.04%
Alu are helped.

total fscib in shared programs: 675565 -> 675553 (<.01%)
fscib in affected programs: 28599 -> 28587 (-0.04%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.04% max: 0.05% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for fscib value: -1.00 -1.00
95% mean confidence interval for fscib %-change: -0.05% -0.04%
Fscib are helped.

total bytes in shared programs: 6047050 -> 6046978 (<.01%)
bytes in affected programs: 303744 -> 303672 (-0.02%)
helped: 12
HURT: 0
helped stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
helped stats (rel) min: 0.02% max: 0.03% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for bytes value: -6.00 -6.00
95% mean confidence interval for bytes %-change: -0.03% -0.02%
Bytes are helped.

total uniforms in shared programs: 552413 -> 552315 (-0.02%)
uniforms in affected programs: 13800 -> 13702 (-0.71%)
helped: 48
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.04 x̃: 2
helped stats (rel) min: 0.39% max: 5.26% x̄: 0.96% x̃: 1.04%
95% mean confidence interval for uniforms value: -2.13 -1.96
95% mean confidence interval for uniforms %-change: -1.18% -0.75%
Uniforms are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig f6d8bb9a66 agx: optimize wait_pix a bit
this is a start at least.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 85b3dc90e0 nir,agx: lower fmin/fmax in NIR
we want to elide flushes, doing so requires more sophisticated analysis than I'd
like in the middle of isel. also, it should be done before forming preambles for
efficiency (notice the uniform reduction here). let's do it with a NIR pass.

total instructions in shared programs: 2768481 -> 2757832 (-0.38%)
instructions in affected programs: 644084 -> 633435 (-1.65%)
helped: 2242
HURT: 18
helped stats (abs) min: 1 max: 349 x̄: 4.77 x̃: 3
helped stats (rel) min: 0.01% max: 34.91% x̄: 3.19% x̃: 2.19%
HURT stats (abs)   min: 1 max: 19 x̄: 2.89 x̃: 1
HURT stats (rel)   min: 0.24% max: 7.94% x̄: 1.27% x̃: 0.81%
95% mean confidence interval for instructions value: -5.20 -4.22
95% mean confidence interval for instructions %-change: -3.30% -3.01%
Instructions are helped.

total alu in shared programs: 2182880 -> 2172352 (-0.48%)
alu in affected programs: 513166 -> 502638 (-2.05%)
helped: 2235
HURT: 16
helped stats (abs) min: 1 max: 349 x̄: 4.73 x̃: 3
helped stats (rel) min: 0.02% max: 37.65% x̄: 3.70% x̃: 2.59%
HURT stats (abs)   min: 1 max: 19 x̄: 2.50 x̃: 1
HURT stats (rel)   min: 0.33% max: 3.74% x̄: 1.04% x̃: 0.91%
95% mean confidence interval for alu value: -5.16 -4.20
95% mean confidence interval for alu %-change: -3.83% -3.49%
Alu are helped.

total fscib in shared programs: 2178643 -> 2168059 (-0.49%)
fscib in affected programs: 514666 -> 504082 (-2.06%)
helped: 2243
HURT: 17
helped stats (abs) min: 1 max: 349 x̄: 4.74 x̃: 3
helped stats (rel) min: 0.02% max: 37.65% x̄: 3.74% x̃: 2.59%
HURT stats (abs)   min: 1 max: 19 x̄: 2.65 x̃: 1
HURT stats (rel)   min: 0.33% max: 14.71% x̄: 1.85% x̃: 0.93%
95% mean confidence interval for fscib value: -5.16 -4.20
95% mean confidence interval for fscib %-change: -3.87% -3.53%
Fscib are helped.

total bytes in shared programs: 18467348 -> 18403042 (-0.35%)
bytes in affected programs: 4403648 -> 4339342 (-1.46%)
helped: 2247
HURT: 20
helped stats (abs) min: 2 max: 2132 x̄: 28.73 x̃: 18
helped stats (rel) min: 0.01% max: 33.53% x̄: 2.80% x̃: 1.94%
HURT stats (abs)   min: 4 max: 72 x̄: 12.60 x̃: 6
HURT stats (rel)   min: 0.23% max: 6.58% x̄: 1.06% x̃: 0.75%
95% mean confidence interval for bytes value: -31.29 -25.45
95% mean confidence interval for bytes %-change: -2.90% -2.64%
Bytes are helped.

total regs in shared programs: 864605 -> 864442 (-0.02%)
regs in affected programs: 4692 -> 4529 (-3.47%)
helped: 68
HURT: 48
helped stats (abs) min: 1 max: 54 x̄: 7.25 x̃: 3
helped stats (rel) min: 4.26% max: 43.20% x̄: 13.21% x̃: 10.53%
HURT stats (abs)   min: 1 max: 36 x̄: 6.88 x̃: 6
HURT stats (rel)   min: 3.64% max: 91.67% x̄: 23.12% x̃: 24.00%
95% mean confidence interval for regs value: -3.60 0.79
95% mean confidence interval for regs %-change: -2.10% 5.75%
Inconclusive result (value mean confidence interval includes 0).

total uniforms in shared programs: 2120927 -> 2120911 (<.01%)
uniforms in affected programs: 770 -> 754 (-2.08%)
helped: 6
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.67 x̃: 2
helped stats (rel) min: 1.79% max: 2.70% x̄: 2.13% x̃: 1.96%
95% mean confidence interval for uniforms value: -3.75 -1.58
95% mean confidence interval for uniforms %-change: -2.50% -1.76%
Uniforms are helped.

total threads in shared programs: 27612224 -> 27613056 (<.01%)
threads in affected programs: 7168 -> 8000 (11.61%)
helped: 6
HURT: 3
helped stats (abs) min: 64 max: 192 x̄: 170.67 x̃: 192
helped stats (rel) min: 8.33% max: 23.08% x̄: 20.62% x̃: 23.08%
HURT stats (abs)   min: 64 max: 64 x̄: 64.00 x̃: 64
HURT stats (rel)   min: 8.33% max: 9.09% x̄: 8.59% x̃: 8.33%
95% mean confidence interval for threads value: -3.17 188.06
95% mean confidence interval for threads %-change: -0.92% 22.69%
Inconclusive result (value mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig b3ef0f5aa8 asahi: don't leak drm version
valgrind.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 9ce092c982 asahi: don't leak linked shaders
Oof!

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig ae7c9995ff asahi: don't leak binaries
ouch. valgrind.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig cb7348eac0 asahi: don't leak blit shaders
valgrind

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 6a27a3838c asahi: assert guard previously-subtle code
would've caught the bug in the previous patch.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00
Alyssa Rosenzweig 09fde905a0 asahi: fix extremely subtle UAF
we can get into weird situations and the clever logic isn't worth it. do
unclever logic instead and fix subtle CTS flakes. GL was a mistake.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31908>
2024-10-30 10:14:07 -04:00