Commit Graph

9764 Commits

Author SHA1 Message Date
Danylo Piliaiev
7b09fc98fb nir/opt_16b_tex_image: Sign extension should matter for texel buffer txf
Texel buffer could be arbitrary large, so the assumption being made in
the following comment is wrong:

 "Zero-extension (u16) and sign-extension (i16) have
  the same behavior here - txf returns 0 if bit 15 is set
  because it's out of bounds and the higher bits don't matter."

Sign extension should matter for GLSL_SAMPLER_DIM_BUF.

This fixes the case of doing texelFetch with u16 offset:

  uniform itextureBuffer s1;
  uint16_t offset = some_ssbo.offset;
  value = texelFetch(s1, offset).x;

If the offset is higher than s16 optimization incorrectly
left it as 16b.

In spirv the above glsl is translated into:

  %22 = OpLoad %ushort %21
  %23 = OpUConvert %uint %22
  %24 = OpBitcast %int %23
  %26 = OpImageFetch %v4int %16 %24

Cc: mesa-stable

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31664>
2024-10-16 10:10:00 +00:00
Timothy Arceri
aa7c59e02c nir/glsl: set deref cast mode for blocks during function inlining
More cast fixes this time for UBO and SSBO. Which were missing testing
previously.

Fixes: d681cf96fb ("nir/glsl: set deref cast mode during function inlining")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11587

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31668>
2024-10-16 06:25:57 +00:00
Marek Olšák
0727634443 nir/opt_load_store_vectorize: vectorize load_smem_amd
radeonsi+ACO with the new vectorization callback:

TOTALS FROM AFFECTED SHADERS (19508/58918)
  VGPRs: 708672 -> 708864 (0.03 %)
  Code Size: 31458688 -> 31217160 (-0.77 %) bytes
  Max Waves: 305960 -> 305952 (-0.00 %)

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
a44e5cfccf nir/opt_load_store_vectorize: allow a 4-byte hole between 2 loads
If there is a 4-byte hole between 2 loads, drivers can now optionally
vectorize the loads by including the hole between them, e.g.:
    4B load + 4B hole + 8B load --> 16B load

All vectorize callbacks already reject all holes, but AMD will want to
allow it.

radeonsi+ACO with the new vectorization callback:

TOTALS FROM AFFECTED SHADERS (25248/58918)
  VGPRs: 871116 -> 871872 (0.09 %)
  Spilled SGPRs: 397 -> 407 (2.52 %)
  Code Size: 43074536 -> 42496352 (-1.34 %) bytes

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
80c156422d nir/opt_load_store_vectorize: allow overfetching, merge overfetched loads
New load merging transformations (first, second), examples:
    (vec4, vec3) ==> vec8(read=0x7f) (because NIR doesn't have vec7)
    (vec1, vec8(read=0x7f)) ==> vec8(read=0xff)
    - the unused component at the end of vec8 is dropped

Not merged:
    vec8(read=0xfe) + vec1
    - unused components at the beginning are kept

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
65ace5649b nir: reject unsupported component counts from all vectorize callbacks
If you allow an unsupported component count in the callback for loads,
nir_opt_load_store_vectorize will align num_components to the next supported
vector size, essentially overfetching.

This changes all callbacks to reject it. AMD will enable it in a later commit.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
02923e237d nir: add hole_size parameter into the vectorize callback
It will be used to allow merging loads with a hole between them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
8ce43b7765 nir/opt_load_store_vectorize: add entry::num_components
We will represent vec6..vec7, vec9..vec15 loads with 8 and 16
components respectively, so we need to track how many components
we really use.

This is a prerequisite for optimal merging up to vec16. Example:
    Step 1: vec4 + vec3 ==> vec7as8 (last component unused)
    Step 2: vec1 + vec7as8 ==> vec8 (last unused component dropped)

Without using the number of components read, the same example would end up
doing:
    Step 1: vec4 + vec3 ==> vec8
    Step 2: vec1 + vec8 ==> vec9 (fail)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Alyssa Rosenzweig
e9303c0952 nir: extract round component helper
another nir pass will use this.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29398>
2024-10-15 05:50:24 +00:00
Marek Olšák
64c4d29e65 nir/opt_vectorize_io: fix stack buffer overflow with 16-bit output stores
uncovered by unrelated work

Fixes: 2514999c9c - nir: add nir_opt_vectorize_io, vectorizing lowered IO

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31644>
2024-10-15 03:59:17 +00:00
Timothy Arceri
46facf9037 nir/glsl: set cast mode for image during function inlining
Fixes: d681cf96fb ("nir/glsl: set deref cast mode during function inlining")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11980

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31554>
2024-10-15 03:13:24 +00:00
Konstantin Seurer
70a1453537 nir/print: Fix the alignment of 8-bit definitions
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31612>
2024-10-14 21:21:04 +00:00
Adam Jackson
605d6aaf13 vtn: Handle SPV_INTEL_optnone
We don't advertise this in rusticl (and probably shouldn't, at least
until we can honor the request) but DPC++ emits this regardless so we
may as well ignore it.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31592>
2024-10-11 15:39:45 +00:00
Rhys Perry
67ad7359ff nir/divergence_analysis: disable phi undef optimization by default
If the backend does not implement this too, or some other future transform
modifiess the phi so that this isn't the case (replace the phi with a
bcsel or replace undef with zero), then it will not actually be uniform.

This keeps it enabled to some degree for RADV/ACO.

fossil-db (navi31):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 195008 -> 195282 (+0.14%)
CodeSize: 1012592 -> 1015884 (+0.33%)
Latency: 3892826 -> 3898843 (+0.15%); split: -0.00%, +0.15%
InvThroughput: 460681 -> 460964 (+0.06%)
Copies: 13508 -> 13516 (+0.06%)
Branches: 5244 -> 5412 (+3.20%)
PreVGPRs: 5092 -> 5096 (+0.08%)
VALU: 116177 -> 116197 (+0.02%)
SALU: 23449 -> 23785 (+1.43%)

fossil-db (navi21):
Totals from 76 (0.10% of 79395) affected shaders:
Instrs: 164471 -> 164981 (+0.31%)
CodeSize: 883988 -> 888420 (+0.50%)
Latency: 4074287 -> 4082043 (+0.19%)
InvThroughput: 783783 -> 784276 (+0.06%); split: -0.00%, +0.06%
Branches: 5262 -> 5430 (+3.19%)
PreVGPRs: 5100 -> 5104 (+0.08%)
VALU: 116375 -> 116381 (+0.01%)
SALU: 23589 -> 23925 (+1.42%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30211>
2024-10-10 14:59:26 +00:00
Caio Oliveira
c06a55fd39 spirv: Update SPIR-V grammar to use aliases
For enumerants and instruction names, instead of duplicating the values
now the grammar will use an aliases field to list the alternative names.
Update the Python scripts for that.

The new SPIR-V files correspond to d92cf88c371424591115a87499009dfad41b669c
("Add "aliases" fields to the grammar and remove duplicated (#447)")
in https://github.com/KhronosGroup/SPIRV-Headers.

Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31369>
2024-10-10 02:48:00 +00:00
Alyssa Rosenzweig
6287c8251d nir: add bounds_agx opcode
used to facilitate bounds checking optimization

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31532>
2024-10-05 18:30:11 +00:00
Timothy Arceri
065b45e4dc glsl: remove linker.cpp
All functionality has now been converted to NIR or moved elsewhere.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:59 +00:00
Timothy Arceri
e4c3e7e0d8 glsl: rename link_shaders() -> link_shaders_init()
And move it to the linker util file.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:59 +00:00
Timothy Arceri
37ac8f5e79 glsl: move shader cache lookup call to st
The nir shader cache read call is just below this call now so the code
is easier to follow.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:59 +00:00
Timothy Arceri
b663eb83fe glsl: move error and warning helpers to util file
These functions are already defined in linker_util.h so moving them here
is logical.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:59 +00:00
Timothy Arceri
19c27c39b4 glsl/mesa: remove ir_uniform.h
We moved its contents elsewhere in a previous patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:59 +00:00
Timothy Arceri
13301e2509 glsl: move resource_name_updated() to linker_util.cpp
We want to remove the old linker.cpp file in following patches so move
this util function to the util code file.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:58 +00:00
Timothy Arceri
08e25e091b glsl/mesa: move uniform related shader structs to shader_types.h
This is where all the other uniform related structs are and these were
the only structs used by both the compiler and gl api that were defined
in the compiler code so lets move them to where everything else is
defined.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31500>
2024-10-04 00:10:58 +00:00
Iago Toral Quiroga
aac1c074cc nir: make fclamp_pos_mali and fsat_signed_mali opcodes generic
V3D can use these too.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31480>
2024-10-03 09:02:07 +00:00
Timothy Arceri
1c58f513c4 glsl: fix gl_{Clip,Cull}Distance error messages
The error message in the linker that checked
gl_MaxCombinedClipAndCullDistances would never be issued because the
compiler was already doing the check. I think the compiler might have
been done this way in the original commit d656736b as the linker
only sets the size when the clip/cull outputs are written so the
piglit test for this wouldn't have been triggered as it does not
write to the outputs.

Here we move the error to the compiler and fix things up so the
correct messages are triggered.

Fixes: d656736bbf ("glsl: Add arb_cull_distance support (v3)")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31471>
2024-10-03 06:59:47 +00:00
Faith Ekstrand
62a4fe861a nir: Add an option to lower quad vote
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31470>
2024-10-02 21:10:32 +00:00
Marek Olšák
f546df95a6 nir/opt_vectorize_io: fix skipped output vectorization if inputs were vectorized
Fixes: 2514999c9c - nir: add nir_opt_vectorize_io, vectorizing lowered IO

Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31454>
2024-10-02 20:26:23 +00:00
Job Noorman
4d50504b26 nir/lower_int64: add nir_intrinsic_rotate
Can simply be split into 32b ops.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>
2024-10-02 06:35:49 +00:00
Job Noorman
e6a5c342da nir/lower_int64: add nir_intrinsic_read_invocation_cond_ir3
Can simply be split into 32b ops.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31455>
2024-10-02 06:35:49 +00:00
Job Noorman
584b63ecab nir/load_store_vectorize: fix division by zero
Don't use glsl_get_explicit_stride as it may return 0 for vector types,
use nir_deref_instr_array_stride instead.

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31460>
2024-10-02 05:53:57 +00:00
Rhys Perry
be64454710 nir/tests: test opt_loop_peel_initial_break with derefs in header block
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>
2024-10-01 12:24:22 +00:00
Rhys Perry
0484044b1a nir/opt_loop: rematerialize header block derefs in their use blocks
Otherwise, we could end up with phis of derefs.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 6b4b044739 ("nir/opt_loop: add loop peeling optimization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>
2024-10-01 12:24:22 +00:00
Christian Gmeiner
1421319dcf compiler/rust: Copy MappedInstrs from NAK
Rename it to SmallVec, make it more generic and switch NAK
to it.

Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31409>
2024-10-01 11:33:35 +00:00
Gert Wollny
f19f1ec17b nir/opt_algebraic: Allow two-step lowering of ftrunc@64 to use ffract@64
If ftrunc@64 is lowered by nir_lower_doubles it is turned into a
comparable long series of 32 bit operations. If the hardware
supports ffract@64 then nir_opt_algebraic can first lower ftrunc@64
to use some combinations with ffloor@64. They can then be turned
into a combination of fsub@64 and ffract@64 resulting in less
all-over instructions.

Fixes: 5218cff34b
   nir/algebraic: avoid double lowering of some fp64 operations

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29281>
2024-09-30 23:51:02 +00:00
Kenneth Graunke
0b34a7aff0 nir: Don't generate single iteration loops to zero-initialize memory
If the stride we're adding to our loop counter is larger than the total
amount of shared local memory we're trying to initialize, we know the
loop will run at most one time.  So we can skip emitting a loop.

Loop unrolling appears to be unable to detect this currently.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31312>
2024-09-30 05:27:17 +00:00
Georg Lehmann
bb7e8d51b6 nir: delete nir_opt_reuse_constants
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
60776f87c3 nir/opt_remove_phis: rematerialize constants
Foz-DB Navi31:
Totals from 749 (0.94% of 79395) affected shaders:
Instrs: 1224359 -> 1223722 (-0.05%); split: -0.07%, +0.02%
CodeSize: 6468392 -> 6466296 (-0.03%); split: -0.06%, +0.03%
Latency: 9764410 -> 9766457 (+0.02%); split: -0.01%, +0.03%
InvThroughput: 1017401 -> 1017380 (-0.00%); split: -0.03%, +0.03%
VClause: 19902 -> 19873 (-0.15%); split: -0.16%, +0.02%
SClause: 38441 -> 38424 (-0.04%); split: -0.05%, +0.01%
Copies: 86880 -> 86304 (-0.66%); split: -0.73%, +0.06%
Branches: 34206 -> 34159 (-0.14%); split: -0.14%, +0.01%
PreSGPRs: 45557 -> 45527 (-0.07%); split: -0.08%, +0.01%
PreVGPRs: 32406 -> 32408 (+0.01%)
VALU: 671633 -> 671533 (-0.01%); split: -0.02%, +0.01%
SALU: 155284 -> 154675 (-0.39%); split: -0.40%, +0.00%
VMEM: 27303 -> 27271 (-0.12%)
SMEM: 67490 -> 67455 (-0.05%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
40fc85c15b nir: make nir_instr_clone usable with load_const and undef
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
a9f8089240 nir: replace nir_opt_remove_phis_block with a single source version
This is what callers actually want, and it simplifies nir_opt_remove_phis
because we can assume dominance meta data is valid.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>
2024-09-27 05:19:16 +00:00
Georg Lehmann
41e82b8b8e nir: sink is_subgroup_invocation_lt_amd
Having it closer to the branches means we can eliminate an exec copy.

Foz-DB Navi31:
Totals from 11615 (14.63% of 79395) affected shaders:
Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05%
CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05%
VGPRs: 578616 -> 578604 (-0.00%)
SpillSGPRs: 1506 -> 1304 (-13.41%)
Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05%
InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01%
VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00%
SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13%
Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10%
PreSGPRs: 442584 -> 421344 (-4.80%)
VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00%
SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07%

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:14 +00:00
Georg Lehmann
bcfc5c09fa amd: add offset to is_subgroup_invocation_lt_amd
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>
2024-09-26 14:29:13 +00:00
Marek Olšák
09e64e3682 nir/opt_shrink_vectors: shrink memory loads, not just IO
The problem with radeonsi+ACO is that UBO loads from vec4 uniforms using
only 1 component always load all 4 components. This fixes that.

We are only interested in shrinking UBO and SSBO loads, but I added more
intrinsics because why not.

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29384>
2024-09-26 03:01:38 +00:00
Timothy Arceri
f6e7520b13 glsl: remove now unused linker code
This has all be replaced by a nir based linker implementation.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
cbfc225e2b glsl: switch to a full nir based linker
This commit does 3 things at once (3 squashed commits) as required
to make sure the commit doesn't break things.

1. convert to nir at compile time
2. enable full nir linking
3. switch standalone compiler to nir linker

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
5108a9a37d glsl: set blake3 hash in standalone scaffolding
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
1c88ed6194 glsl: add lower_derivatives_without_layout() helper
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Georg Lehmann
ff4596ae61 spirv: explicitly lower derivatives to zero
To allow removal of the existing nir_builder lowering.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
721d23b8ff glsl: add intrastage shader linking helpers for nir linker
Conversions of the existing glsl ir linking code to nir.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
fe9b93fc1c nir: handle wildcard array deref
Here we add handling of wildcard array derefs when attempting to mark
an io as partially used rather than hitting an assert.

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00
Timothy Arceri
6bb6b0e5ad nir: add nir_intrinsic_deref_implicit_array_length intrinsic
This will be used to handle .length() calls on unsized arrays

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>
2024-09-25 09:39:44 +00:00