Commit Graph

202071 Commits

Author SHA1 Message Date
Valentine Burley
5b65bbf72c ci: Simplify downloading kernel for crosvm
Directly download the kernel instead of using the
download-prebuilt-kernel.sh script.
Save the kernel to /kernel for clarity, replacing the previous
/lava-files directory.

Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33606>
2025-02-21 14:52:56 +00:00
Mike Blumenkrantz
d979cd8d9d zink: support cl_gl_sharing if dmabuf is supported
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33652>
2025-02-21 14:18:44 +00:00
Mike Blumenkrantz
93cd4ae0c0 zink: verify that adding a dmabuf bind actually chooses a modifier
this at least provides some checking to catch cases where something
stupid happens and it does a fallback to linear

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33652>
2025-02-21 14:18:44 +00:00
Mike Blumenkrantz
5176370694 zink: handle buffer import/export
just noping out of some image codepaths

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33652>
2025-02-21 14:18:44 +00:00
Mike Blumenkrantz
f7002369fa zink: wait on tc fence before checking for fd semaphore
this forces sync with pending flushes

cc: mesa-stable

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33652>
2025-02-21 14:18:44 +00:00
Daniel Schürmann
df2697c9ab aco/scheduler: remove unused include of unordered_set
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Daniel Schürmann
93872270f0 aco/scheduler: keep track of RegisterDemand at DownwardsCursor::insert_idx{_clause}
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Daniel Schürmann
f58654e98f aco/scheduler: keep track of RegisterDemand at UpwardsCursor::insert_idx
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Daniel Schürmann
52253da783 aco: unify get_addr_sgpr_from_waves() and get_addr_vgpr_from_waves() into one function
which returns the limit as RegisterDemand.

Also remove the unused get_extra_sgprs() from aco_ir.h.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Daniel Schürmann
6ea9443726 aco/scheduler: stop rounding down the target number of waves on GFX10+
This way, it can make use of uneven wave numbers.

Totals from 4078 (5.14% of 79395) affected shaders: (Navi21)
MaxWaves: 58715 -> 65460 (+11.49%); split: +11.49%, -0.01%
Instrs: 5033684 -> 5048244 (+0.29%); split: -0.09%, +0.38%
CodeSize: 26833884 -> 26898780 (+0.24%); split: -0.07%, +0.32%
VGPRs: 302360 -> 265312 (-12.25%); split: -12.26%, +0.01%
Latency: 34636448 -> 36044242 (+4.06%); split: -0.08%, +4.14%
InvThroughput: 7999403 -> 7662697 (-4.21%); split: -4.55%, +0.34%
VClause: 105403 -> 111996 (+6.26%); split: -0.40%, +6.66%
SClause: 132996 -> 133460 (+0.35%); split: -0.81%, +1.16%
Copies: 297036 -> 308122 (+3.73%); split: -0.64%, +4.37%
Branches: 89376 -> 89390 (+0.02%); split: -0.00%, +0.02%
VALU: 3477621 -> 3488510 (+0.31%); split: -0.05%, +0.36%
SALU: 484211 -> 484191 (-0.00%); split: -0.08%, +0.08%

Totals from 1840 (2.32% of 79395) affected shaders: (Navi31)

MaxWaves: 30714 -> 34182 (+11.29%)
Instrs: 3102955 -> 3131001 (+0.90%); split: -0.05%, +0.95%
CodeSize: 16160564 -> 16273100 (+0.70%); split: -0.04%, +0.74%
VGPRs: 174540 -> 150600 (-13.72%)
Latency: 23521914 -> 24515055 (+4.22%); split: -0.07%, +4.29%
InvThroughput: 4373397 -> 4202912 (-3.90%); split: -4.40%, +0.50%
VClause: 59087 -> 64091 (+8.47%); split: -0.24%, +8.71%
SClause: 74844 -> 75366 (+0.70%); split: -0.53%, +1.22%
Copies: 184396 -> 197747 (+7.24%); split: -0.25%, +7.49%
Branches: 46015 -> 46028 (+0.03%); split: -0.00%, +0.03%
VALU: 1929286 -> 1942709 (+0.70%); split: -0.02%, +0.71%
SALU: 216126 -> 215983 (-0.07%); split: -0.18%, +0.12%
VOPD: 1216 -> 1217 (+0.08%); split: +1.40%, -1.32%
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:41 +00:00
Daniel Schürmann
676b39d31f aco/scheduler: always respect min_waves on GFX10+
It could theoretically happen that for large workgroups,
the scheduler used more registers than allowed.

No fossil changes.

Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33644>
2025-02-21 13:49:40 +00:00
Collabora's Gfx CI Team
9befbf54a6 Uprev Piglit to 04d901e49de6b650f9dceaf73220371273d87f73
fc8179d319...04d901e49d

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33457>
2025-02-21 11:53:36 +00:00
Danylo Piliaiev
763ddd0fd3 nir/nir_lower_multiview: Don't assert if load_deref doesn't have var
If the deref chain has a nir_deref_type_cast, nir_intrinsic_get_var will
return null, which is valid for e.g. shader inputs, since the pass
only cares about outputs.

NIR excerpt that caused issues:

```
    32x3    %6 = deref_cast (block *)%5 (ubo block)  (ptr_stride=0, align_mul=0, align_offset=0)
    32x3    %7 = deref_struct &%6->field0 (ubo vec4[4])  // &((block *)%5)->field0
    32      %8 = load_const (0x00000001)
    32x3    %9 = deref_array &(*%7)[1] (ubo vec4)  // &((block *)%5)->field0[1]
    32x4   %10 = @load_deref (%9) (access=none)
```

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33647>
2025-02-21 11:09:22 +00:00
Daniel Stone
4f11b8d950 ci/zink: Expand flake definition on radv
We've seen a few variants of this now, so just mark them all as flaky.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
2025-02-21 09:22:03 +00:00
Erik Faye-Lund
fde6aeb886 mesa/main: wire up glapi bits for EXT_multi_draw_indirect
Turns out we were missing the glapi bits, making it impossible to get
the function pointers for this extension. Whoops?!

[daniels: Squashed in a618 SkQP fails, presumably caused by these not
          being skipped anymore.]

Fixes: 9f5af68995 ("mesa/main: expose `EXT_multi_draw_indirect`")
Reviewed-by: Antonino Maniscalco <antomani103@gmail.com>
Tested-by: Chris Healy <healych@amazon.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33546>
2025-02-21 09:22:03 +00:00
Emma Anholt
2f57cf0323 egl: Retire NV_post_sub_buffer support.
It's never been ported to DRI3, but nobody seems to care.  Since DRI2 is
untested at this point, just drop the code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
2025-02-21 02:50:56 +00:00
Emma Anholt
f6aa27a294 egl: Retire NOK_swap_region support.
It's never been ported to DRI3, but nobody seems to care.  Since DRI2 is
untested at this point, just drop the code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
2025-02-21 02:50:56 +00:00
Emma Anholt
58e73e792f egl: Apply autopep8.
My editor does this on save, so let's just apply it to EGL's python for
consistency.  The only exception is that the genCommon import needs the
sys.path.insert, so that part of autopep8 was reverted.

Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
2025-02-21 02:50:56 +00:00
Emma Anholt
34fe896715 docs: Drop some weird unhelpful text about DRI2.
Both instructions for building were the same, and there's not much sense
in calling out just xcb-dri2 out of all the deps there are.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33517>
2025-02-21 02:50:56 +00:00
Lorenzo Rossi
a3ddb223e2 nvk, nak: Implement shaderSharedInt64Atomics
Current NVIDIA devices lack support for 64-bit arithmetic atomics, so we
replace them with compare-and-swap loops using nir_lower_atomics.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10330
Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33572>
2025-02-21 00:33:17 +00:00
Lorenzo Rossi
26079c1a93 nir: support shared atomics in nir_lower_atomics
Add support for rewriting shared atomics into compare-and-swap loops.
Previously, the nir_lower_atomics pass only supported global and ssbo
atomics.
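
As a rough illustration of what a compare-and-swap loop does, here is a
minimal sketch in plain C11 rather than NIR. The helper name
atomic_add_u64_via_cas is hypothetical and is not part of Mesa; it only
shows how an arithmetic atomic can be emulated when compare-and-swap is
the only primitive available.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: a 64-bit atomic add built only from
 * compare-and-swap. Returns the value that was stored before the add. */
static uint64_t
atomic_add_u64_via_cas(_Atomic uint64_t *addr, uint64_t operand)
{
   assert(addr != NULL);

   uint64_t expected = atomic_load(addr);
   uint64_t desired;
   do {
      /* Recompute from the most recently observed value on every attempt. */
      desired = expected + operand;
      /* On failure the call stores the current contents into 'expected'. */
   } while (!atomic_compare_exchange_weak(addr, &expected, desired));

   return expected;
}
```

Other arithmetic operations (min, max, and so on) would follow the same
shape, with only the line that computes 'desired' changing.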

Only freedreno ir3 reuses nir_lower_atomics; this change does not
impact its usage since it does not support shared atomics.

Signed-off-by: Lorenzo Rossi <snowycoder@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33572>
2025-02-21 00:33:16 +00:00
Ian Romanick
15544ed858 nir/algebraic: Undistribute b2i from logic-ops
shader-db:
All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16973309 -> 16973173 (<.01%)
instructions in affected programs: 13780 -> 13644 (-0.99%)
helped: 31 / HURT: 0

total cycles in shared programs: 915620550 -> 915618604 (<.01%)
cycles in affected programs: 185962 -> 184016 (-1.05%)
helped: 30 / HURT: 1

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209748003 -> 209745278 (-0.00%)
Cycle count: 30514920400 -> 30514716506 (-0.00%); split: -0.00%, +0.00%
Max live registers: 65477183 -> 65477584 (+0.00%)
Non SSA regs after NIR: 237334710 -> 237333632 (-0.00%)

Totals from 1257 (0.18% of 706651) affected shaders:
Instrs: 693039 -> 690314 (-0.39%)
Cycle count: 39792504 -> 39588610 (-0.51%); split: -0.97%, +0.46%
Max live registers: 194170 -> 194571 (+0.21%)
Non SSA regs after NIR: 821978 -> 820900 (-0.13%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
2025-02-21 00:01:11 +00:00
Ian Romanick
a48a044cf6 nir/algebraic: Simplify equality comparisons of b2T with 1 or 0
Adding the b2i(a) == 1 and b2i(a) != 1 patterns also helps prevent
regressions when spurious negations are removed from integer equality
comparisons, as is done in !33498.
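
The identities behind these patterns can be checked by brute force. The
sketch below is plain C, not the nir_opt_algebraic pattern syntax, and
the function names are made up for illustration. It assumes b2i maps
false to 0 and true to 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed semantics of b2i: false -> 0, true -> 1. */
static int32_t
b2i(bool a)
{
   return a ? 1 : 0;
}

/* Returns true if every comparison of b2i(a) with 1 or 0 collapses to
 * either 'a' or '!a'. */
static bool
b2i_compare_identities_hold(void)
{
   const bool values[] = { false, true };

   for (unsigned i = 0; i < 2; i++) {
      bool a = values[i];

      if ((b2i(a) == 1) != a)
         return false;
      if ((b2i(a) != 1) != !a)
         return false;
      if ((b2i(a) == 0) != !a)
         return false;
      if ((b2i(a) != 0) != a)
         return false;
   }

   return true;
}
```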

v2: Make all variables part of the iteration instead of calculating some
of them. Suggested by Alyssa.

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16973331 -> 16973309 (<.01%)
instructions in affected programs: 266 -> 244 (-8.27%)
helped: 2 / HURT: 0

total cycles in shared programs: 915620774 -> 915620550 (<.01%)
cycles in affected programs: 4360 -> 4136 (-5.14%)
helped: 2 / HURT: 0

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209748011 -> 209748003 (-0.00%)
Cycle count: 30514920286 -> 30514920400 (+0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 237334726 -> 237334710 (-0.00%)

Totals from 8 (0.00% of 706651) affected shaders:
Instrs: 16956 -> 16948 (-0.05%)
Cycle count: 261052 -> 261166 (+0.04%); split: -0.92%, +0.96%
Non SSA regs after NIR: 20000 -> 19984 (-0.08%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
2025-02-21 00:01:11 +00:00
Ian Romanick
3f39d8f4ff nir/algebraic: Optimize zero comparisons of umax or umin
I observed some of the existing patterns stopped being applied after
some of the ult-to-ieq optimizations in !33498. It turns out that these
patterns occur even without those changes.
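
The exact replacement patterns are not quoted in this message. As an
assumption about the kind of identity involved, the sketch below checks
two facts about unsigned values over a small range: umax(a, b) is zero
only when both operands are zero, and umin(a, b) is zero when either
operand is zero. All names here are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t
umax(uint32_t a, uint32_t b)
{
   return a > b ? a : b;
}

static uint32_t
umin(uint32_t a, uint32_t b)
{
   return a < b ? a : b;
}

/* Checks the two identities for all pairs of values in [0, 255]. */
static bool
umax_umin_zero_identities_hold(void)
{
   for (uint32_t a = 0; a < 256; a++) {
      for (uint32_t b = 0; b < 256; b++) {
         /* umax(a, b) == 0  <=>  (a | b) == 0 */
         if ((umax(a, b) == 0) != ((a | b) == 0))
            return false;
         /* umin(a, b) == 0  <=>  a == 0 || b == 0 */
         if ((umin(a, b) == 0) != (a == 0 || b == 0))
            return false;
      }
   }

   return true;
}
```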

shader-db:

All Intel platforms had similar results. (Lunar Lake shown)
total instructions in shared programs: 16973339 -> 16973331 (<.01%)
instructions in affected programs: 7977 -> 7969 (-0.10%)
helped: 2 / HURT: 0

total cycles in shared programs: 915620938 -> 915620774 (<.01%)
cycles in affected programs: 136022 -> 135858 (-0.12%)
helped: 2 / HURT: 0

fossil-db:

Lunar Lake
Totals:
Instrs: 209748173 -> 209748011 (-0.00%); split: -0.00%, +0.00%
Cycle count: 30514361348 -> 30514920286 (+0.00%); split: -0.00%, +0.00%
Spill count: 511813 -> 511808 (-0.00%)
Fill count: 622537 -> 622533 (-0.00%)
Max live registers: 65477033 -> 65477183 (+0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 237334728 -> 237334726 (-0.00%); split: -0.00%, +0.00%

Totals from 26 (0.00% of 706651) affected shaders:
Instrs: 332073 -> 331911 (-0.05%); split: -0.05%, +0.00%
Cycle count: 959758560 -> 960317498 (+0.06%); split: -0.03%, +0.09%
Spill count: 10293 -> 10288 (-0.05%)
Fill count: 23784 -> 23780 (-0.02%)
Max live registers: 9682 -> 9832 (+1.55%); split: -0.08%, +1.63%
Non SSA regs after NIR: 232135 -> 232133 (-0.00%); split: -0.03%, +0.03%

Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 233538532 -> 233536113 (-0.00%); split: -0.00%, +0.00%
Cycle count: 24428142259 -> 24426705655 (-0.01%); split: -0.01%, +0.00%
Spill count: 513128 -> 512923 (-0.04%)
Fill count: 557329 -> 557108 (-0.04%)
Max live registers: 42129806 -> 42129881 (+0.00%); split: -0.00%, +0.00%
Non SSA regs after NIR: 256711720 -> 256711718 (-0.00%); split: -0.00%, +0.00%

Totals from 26 (0.00% of 805759) affected shaders:
Instrs: 325629 -> 323210 (-0.74%); split: -0.74%, +0.00%
Cycle count: 893896782 -> 892460178 (-0.16%); split: -0.21%, +0.05%
Spill count: 10467 -> 10262 (-1.96%)
Fill count: 24291 -> 24070 (-0.91%)
Max live registers: 4946 -> 5021 (+1.52%); split: -0.08%, +1.60%
Non SSA regs after NIR: 232980 -> 232978 (-0.00%); split: -0.03%, +0.03%

Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 237289818 -> 237289714 (-0.00%); split: -0.00%, +0.00%
Cycle count: 22959586058 -> 22960049302 (+0.00%); split: -0.00%, +0.00%
Max live registers: 42182257 -> 42182337 (+0.00%)
Non SSA regs after NIR: 255579974 -> 255579970 (-0.00%); split: -0.00%, +0.00%

Totals from 23 (0.00% of 802019) affected shaders:
Instrs: 27051 -> 26947 (-0.38%); split: -0.39%, +0.01%
Cycle count: 10545917 -> 11009161 (+4.39%); split: -0.09%, +4.49%
Max live registers: 2198 -> 2278 (+3.64%)
Non SSA regs after NIR: 31741 -> 31737 (-0.01%); split: -0.20%, +0.19%

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
2025-02-21 00:01:11 +00:00
Ian Romanick
4311121e73 nir/algebraic: More (a == 0 || a == 1 || ...) patterns
At least some Total War: Warhammer3 vertex shaders associate the
comparisons differently, so the existing patterns were not triggered.

No shader-db changes on any Intel platform.

fossil-db:

All Intel platforms had similar results. (Lunar Lake shown)
Totals:
Instrs: 209748654 -> 209748173 (-0.00%)
Cycle count: 30514333964 -> 30514361348 (+0.00%); split: -0.00%, +0.00%
Fill count: 622688 -> 622537 (-0.02%)
Max live registers: 65477039 -> 65477033 (-0.00%)
Non SSA regs after NIR: 237334768 -> 237334728 (-0.00%)

Totals from 512 (0.07% of 706651) affected shaders:
Instrs: 1000693 -> 1000212 (-0.05%)
Cycle count: 42174312 -> 42201696 (+0.06%); split: -0.15%, +0.21%
Fill count: 11456 -> 11305 (-1.32%)
Max live registers: 121599 -> 121593 (-0.00%)
Non SSA regs after NIR: 1253445 -> 1253405 (-0.00%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33648>
2025-02-21 00:01:11 +00:00
Eric R. Smith
414dba9f5c panfrost: use an accessor function to read from bi_opcode_props
Use an accessor function to read opcode properties or to change the
opcode. This would allow for different instruction descriptions to
be used for different architectures. Not necessary now, but may
be useful groundwork.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29765>
2025-02-20 23:33:00 +00:00
Faith Ekstrand
651864151f zink: Use persistent semaphores for PIPE_FD_TYPE_SYNCOBJ
These are persistent objects that you can use to signal and wait over.
We need to import without VK_SEMAPHORE_IMPORT_TEMPORARY_BIT and we can't
throw away the Vulkan semaphore after each submit.

Fixes: 32597e116d ("zink: implement GL semaphores")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
2025-02-20 23:09:00 +00:00
Faith Ekstrand
1ffa782227 zink: Use the correct array size for signal_values[]
When the size of the signals[] array was changed to 3, the
signal_values[] array was not updated accordingly.  If we have a
signal_semaphore and are presenting at the same time, this can lead to
an array overflow and the driver will read some random stack value as
the signal value.  This is causing chromium to lock up when running
WebGL.
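
One way to keep two parallel arrays from drifting apart like this is to
size both from a single constant and assert the relationship at compile
time. This is only a sketch: the struct, its field types, and the
SKETCH_MAX_SIGNALS constant are hypothetical and do not reflect the
actual zink code or the actual fix.

```c
#include <assert.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical constant; the message only says signals[] has size 3. */
#define SKETCH_MAX_SIGNALS 3

/* Hypothetical stand-in for the submit state, not the zink layout. */
struct sketch_submit {
   int signals[SKETCH_MAX_SIGNALS];            /* stand-in for semaphores */
   uint64_t signal_values[SKETCH_MAX_SIGNALS]; /* one value per signal */
};

/* Breaks the build if the two arrays ever disagree in length. */
static_assert(ARRAY_SIZE(((struct sketch_submit *)0)->signals) ==
              ARRAY_SIZE(((struct sketch_submit *)0)->signal_values),
              "signal_values[] needs one slot per signals[] entry");
```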

Fixes: 7f56fd9655 ("zink: it's kopperin' time")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33549>
2025-02-20 23:09:00 +00:00
Casey Bowman
111faf2158 vulkan/screenshot-layer: Correct queueFamilyIndex source
From the Vulkan documentation, the queueFamilyIndex value will be
created with VkDeviceQueueCreateInfo. So let's avoid counting the
index value and just refer to the already-created value.

This will resolve crashes on some GPUs for various workloads.

v2: Needed to use GetDeviceQueue() in order to map the queueFamilyIndex
values. These values can be different when obtaining the queue used
for presentation, so we need to ensure we update the mapped
queueFamilyIndex value for the associated queue_data struct.

Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33487>
2025-02-20 22:36:44 +00:00
Georg Lehmann
67d03033e4 radv: remove separate discard peephole select
This allows removing control flow with a mix of alu and discard.

Foz-DB Navi21 (ignore throughput/latency because of single iteration loops):
Totals from 1251 (1.58% of 79377) affected shaders:
Instrs: 1459317 -> 1457751 (-0.11%); split: -0.14%, +0.04%
CodeSize: 8350856 -> 8352408 (+0.02%); split: -0.03%, +0.05%
VGPRs: 53056 -> 53328 (+0.51%)
SpillSGPRs: 66 -> 62 (-6.06%)
Latency: 19784315 -> 15649290 (-20.90%); split: -21.26%, +0.36%
InvThroughput: 4080229 -> 3122717 (-23.47%); split: -23.56%, +0.09%
VClause: 29293 -> 29294 (+0.00%); split: -0.01%, +0.01%
SClause: 56060 -> 55941 (-0.21%); split: -0.23%, +0.02%
Copies: 129794 -> 127880 (-1.47%); split: -1.51%, +0.04%
Branches: 52039 -> 51275 (-1.47%); split: -1.47%, +0.01%
PreSGPRs: 50221 -> 50024 (-0.39%); split: -0.64%, +0.25%
PreVGPRs: 44058 -> 44053 (-0.01%); split: -0.02%, +0.00%
VALU: 984915 -> 984993 (+0.01%); split: -0.01%, +0.02%
SALU: 177126 -> 177184 (+0.03%); split: -0.62%, +0.65%
SMEM: 79565 -> 79525 (-0.05%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:18 +00:00
Georg Lehmann
f26069fdd9 nir: replace nir_opt_conditional_discard with nir_opt_peephole_select
Foz-DB Navi21:
Totals from 118 (0.15% of 79377) affected shaders:
Instrs: 208001 -> 207355 (-0.31%); split: -0.33%, +0.01%
CodeSize: 1080428 -> 1078432 (-0.18%); split: -0.20%, +0.02%
SpillSGPRs: 202 -> 211 (+4.46%)
Latency: 1923508 -> 1919093 (-0.23%); split: -0.62%, +0.39%
InvThroughput: 407475 -> 407081 (-0.10%); split: -0.12%, +0.02%
SClause: 7050 -> 7033 (-0.24%); split: -0.31%, +0.07%
Copies: 12156 -> 11821 (-2.76%); split: -3.04%, +0.28%
PreSGPRs: 8198 -> 8331 (+1.62%); split: -0.02%, +1.65%
PreVGPRs: 7628 -> 7528 (-1.31%)
VALU: 155747 -> 155657 (-0.06%); split: -0.06%, +0.00%
SALU: 18295 -> 17782 (-2.80%); split: -2.98%, +0.18%
SMEM: 10521 -> 10519 (-0.02%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:17 +00:00
Georg Lehmann
8251a5b846 nir/peephole_select: don't completely ignore ifs with dont_flatten
Apps are misusing this for cases where the if and else blocks are empty
(except for phis) or for conditional discard, which will become relevant
in the next commit.

Foz-DB Navi21:
Totals from 173 (0.22% of 79188) affected shaders:
Instrs: 1465214 -> 1464987 (-0.02%); split: -0.04%, +0.03%
CodeSize: 7960472 -> 7965188 (+0.06%); split: -0.01%, +0.07%
Latency: 10001176 -> 10012782 (+0.12%); split: -0.01%, +0.12%
InvThroughput: 2336017 -> 2338979 (+0.13%); split: -0.00%, +0.13%
Copies: 140105 -> 138225 (-1.34%)
Branches: 49746 -> 49732 (-0.03%)
VALU: 975632 -> 976322 (+0.07%); split: -0.01%, +0.08%
SALU: 201369 -> 200688 (-0.34%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:16 +00:00
Georg Lehmann
cfee9e1d9f nir/peephole_select: add option to allow discard without ~0 limit
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:16 +00:00
Georg Lehmann
ca8147edbe nir/peephole_select: add options struct
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:16 +00:00
Georg Lehmann
edd82bd03a nir/peephole_select: don't include nir_search_helpers.h
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:15 +00:00
Georg Lehmann
c31fadd25e nir/peephole_select: don't special case nir_opt_collapse_if + limit = ~0
Not sure if this was intentionally left when block_check_for_allowed_instrs's
param was changed from bool to int, but it certainly was broken without the
previous commit for discards. Now those should work, so the (unintentional?)
special case can be removed.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:15 +00:00
Georg Lehmann
40f96460ee nir/peephole_select: handle demote and terminate in nir_opt_collapse_if
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:15 +00:00
Georg Lehmann
58d6243f62 nir/peephole_select: support demote for non CF HW
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33590>
2025-02-20 21:59:15 +00:00
Karol Herbst
e0b62d7e2e rusticl/mem: set num_samples and num_mip_levels to 0 when importing from GL
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33653>
2025-02-20 21:37:56 +00:00
Mike Blumenkrantz
d1d2afa3ac zink: apply layer/depth to clear handling
this can avoid flushing/discarding some unnecessary clears

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33355>
2025-02-20 20:01:19 +00:00
Faith Ekstrand
2b1a97b742 nak: Use MemScope::GPU instead of MemScope::System
MemScope::System has to synchronize with everything in the system,
including across PCIe so it's horribly slow.  MemScope::GPU, on the
other hand, only has to synchronize within the GPU.  This is way faster
and still satisfies all of Vulkan's requirements because Vulkan never
allows CPU<->GPU access without full semaphores and barriers.

Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33649>
2025-02-20 19:32:24 +00:00
Faith Ekstrand
13f7ea7b3d nak: Only use suld.constant on Ampere+
Turing doesn't support it so we'll use suld.weak instead.  While we're
here, get rid of an accidental copy+paste condition.

Fixes: ffdc0d8e98 ("nak: Use suld.constant when ACCESS_CAN_REORDER is set")
Reviewed-by: Mel Henning <drawoc@darkrefraction.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33649>
2025-02-20 19:32:24 +00:00
Roland Scheidegger
61911b6a4b llvmpipe: Fix alpha-to-coverage without dithering
Implementing alpha-to-coverage dithering broke the non-dithering case.
(Discovered by accident, not really a big deal since it's almost always
enabled and can only be disabled by using a Nvidia GL extension, and
can't be disabled with Vulkan.)

Fixes: ad4635d6ef
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33586>
2025-02-20 18:59:21 +00:00
Adam Jackson
244c9cc45e mapi/glx: Remove FASTCALL/PURE
This isn't worth the complexity.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33623>
2025-02-20 15:47:23 +00:00
Adam Jackson
32a10ccbdd glx: Remove (almost) all usage of _X_HIDDEN / _X_INTERNAL
It's redundant at this point. The one exception is for GLX_PUBLIC when
building for glvnd, because then we really do want the GLX API to be
hidden.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33623>
2025-02-20 15:47:23 +00:00
Adam Jackson
43fb26f8ea mapi/glx: Remove xserver code generation
This hasn't been hooked up to the build since we deleted autotools back
in 2019. It's effectively dead code anyway, as GLX is not a moving
target, and at this point it is easier to modify the generated code
directly than to modify the generator. xserver is encouraged to copy
the generators from 2019 into its own build if it wants, or -
preferably, in this GLX greybeard's opinion - find a prettier codegen
solution in the process of finishing GL 3.0 support.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33623>
2025-02-20 15:47:23 +00:00
Adam Jackson
09bbf71e68 glx: Make #undef GLX_INDIRECT_RENDERING do something
Not that meson lets you reach this state yet, but if you did, you'd
still build all of the indirect code but the linker would gc most of it.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33623>
2025-02-20 15:47:23 +00:00
Daniel Schürmann
259b73a3ae nir/print: print phi sources sorted by predecessor blocks
We already print the predecessors sorted. Just do the same with
phi sources.
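
A minimal sketch of the idea, assuming the sort key is the predecessor
block index. The struct and function names are hypothetical and are not
the nir_print code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one phi source: which predecessor block it
 * comes from and which SSA value it carries. */
struct sketch_phi_src {
   unsigned pred_block_index;
   unsigned ssa_index;
};

/* qsort comparator: ascending by predecessor block index. */
static int
compare_phi_srcs(const void *a, const void *b)
{
   const struct sketch_phi_src *x = a;
   const struct sketch_phi_src *y = b;

   return (x->pred_block_index > y->pred_block_index) -
          (x->pred_block_index < y->pred_block_index);
}

/* Sorts the sources in place so they can be printed in block order. */
static void
sort_phi_srcs(struct sketch_phi_src *srcs, size_t count)
{
   qsort(srcs, count, sizeof(*srcs), compare_phi_srcs);
}
```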

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33574>
2025-02-20 14:22:14 +00:00
Juan A. Suarez Romero
2d91798561 broadcom/simulator: use string copy instead of memcpy
Using memcpy with the max size generates a global-buffer-overflow, as
the performance counter strings are smaller than the max size.

Instead, use a string copy function to get a copy.

This was detected with address sanitizer enabled and running vulkaninfo.
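
The difference between the two approaches can be sketched as follows.
The names SKETCH_NAME_MAX and copy_counter_name are hypothetical, and
snprintf is just one possible string copy function; the message does not
say which one the fix uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the maximum counter string size. */
#define SKETCH_NAME_MAX 64

static void
copy_counter_name(char dst[SKETCH_NAME_MAX], const char *src)
{
   assert(dst != NULL && src != NULL);

   /* memcpy(dst, src, SKETCH_NAME_MAX) would read SKETCH_NAME_MAX bytes
    * from src even when the string plus its terminator is shorter, which
    * is an out-of-bounds read on a short global string. */

   /* A string copy stops at the terminator and still bounds the write. */
   snprintf(dst, SKETCH_NAME_MAX, "%s", src);
}
```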

Fixes: 3e8b2fe053 ("broadcom/simulator: Add DRM_IOCTL_V3D_GET_COUNTER to simulator")
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33627>
2025-02-20 13:15:01 +00:00
Juan A. Suarez Romero
351bf1e524 vc4/ci: update expected results
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33627>
2025-02-20 13:15:01 +00:00