AlexIndustrial/mesa

Author	SHA1	Message	Date
Ian Romanick	2d6f48f6ef	nir/algebraic: Do not generate 8- or 16-bit find_msb The next commit will add validation to restrict this instruction (and others) to only 32-bit or 64-bit sources. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	2119ab7319	nir/builder: Do not generate 8- or 16-bit find_msb Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	28311f9d02	nir: intel/compiler: Move ufind_msb lowering to NIR Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Cycles in all programs: 9098346105 -> 9098333765 (-0.0%) Cycles helped: 6 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	a4052e70ea	nir/algebraic: Only lower ufind_msb with 32-bit sources The 31-ufind_msb_rev(x) lowering only produces the correct result for 32-bit sources. ufind_msb_rev can also have 64-bit sources, and most platforms are expected to lower this to 32-bit instructions with extra logic operations. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	0cc7bf63b7	nir: intel/compiler: Move ifind_msb lowering to NIR Unlike ufind_msb, ifind_msb is only defined in NIR for 32-bit values, so no @32 annotation is required. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Ian Romanick	66840b98e4	nir: ifind_msb_rev can only have int32 sources Just like ifind_msb. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19042>	2023-03-10 15:27:17 +00:00
Eric Engestrom	f5d3d1e7ed	meson: inline gtest_test_protocol now that it's always 'gtest' Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21485>	2023-03-10 07:20:29 +00:00
antonino	3a59b2a670	nir: handle output being written to deref in `nir_lower_point_smooth` Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21731>	2023-03-09 04:38:24 +00:00
Daniel Schürmann	3073810397	nir/gather_info: allow terminate() in non-PS RADV will use terminate() to end ray-tracing shaders. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21736>	2023-03-08 16:59:41 +00:00
Rhys Perry	98cb7e0108	nir: add nir_lower_alu_width_test.fdot_order Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>	2023-03-08 14:38:26 +00:00
Rhys Perry	50f7e21481	nir: make fdph lowering match fdot Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>	2023-03-08 14:38:26 +00:00
Rhys Perry	3668da7c83	nir: use xyzw order for precise fdot Fixes flickering grass in Immortals Fenyx Rising. fossil-db (gfx1100): Totals from 13969 (10.38% of 134574) affected shaders: MaxWaves: 442794 -> 442878 (+0.02%) Instrs: 4861105 -> 4901408 (+0.83%); split: -0.02%, +0.85% CodeSize: 24316100 -> 24396272 (+0.33%); split: -0.03%, +0.35% VGPRs: 446256 -> 445572 (-0.15%); split: -0.20%, +0.05% Latency: 28122456 -> 28162233 (+0.14%); split: -0.10%, +0.24% InvThroughput: 2899673 -> 2904323 (+0.16%); split: -0.07%, +0.23% VClause: 119599 -> 119631 (+0.03%); split: -0.07%, +0.09% SClause: 186636 -> 186265 (-0.20%); split: -0.23%, +0.03% Copies: 301370 -> 300386 (-0.33%); split: -0.75%, +0.42% Branches: 85066 -> 85047 (-0.02%); split: -0.02%, +0.00% PreSGPRs: 436167 -> 436137 (-0.01%) PreVGPRs: 329715 -> 329809 (+0.03%); split: -0.01%, +0.04% fossil-db (gfx1100, RADV_DEBUG=invariantgeom): Totals from 43116 (32.04% of 134574) affected shaders: MaxWaves: 1332938 -> 1333012 (+0.01%); split: +0.01%, -0.00% Instrs: 16424513 -> 16658021 (+1.42%); split: -0.06%, +1.48% CodeSize: 81258868 -> 81827860 (+0.70%); split: -0.07%, +0.77% VGPRs: 1720368 -> 1719648 (-0.04%); split: -0.19%, +0.15% SpillSGPRs: 1670 -> 1600 (-4.19%); split: -5.27%, +1.08% Latency: 82063766 -> 82425418 (+0.44%); split: -0.23%, +0.67% InvThroughput: 9665803 -> 9727810 (+0.64%); split: -0.09%, +0.73% VClause: 449662 -> 451099 (+0.32%); split: -0.32%, +0.64% SClause: 498841 -> 498639 (-0.04%); split: -0.24%, +0.20% Copies: 1001020 -> 1000770 (-0.02%); split: -1.20%, +1.17% Branches: 237580 -> 239637 (+0.87%); split: -0.01%, +0.88% PreSGPRs: 1198167 -> 1198024 (-0.01%); split: -0.01%, +0.00% PreVGPRs: 1225202 -> 1225035 (-0.01%); split: -0.06%, +0.05% fossil-db (navi10): Totals from 13969 (10.38% of 134563) affected shaders: MaxWaves: 474386 -> 474508 (+0.03%); split: +0.05%, -0.03% Instrs: 3740895 -> 3771566 (+0.82%); split: -0.00%, +0.82% CodeSize: 19426592 -> 19459916 (+0.17%); split: -0.00%, +0.18% 
VGPRs: 389916 -> 389852 (-0.02%); split: -0.09%, +0.07% Latency: 25452927 -> 25502482 (+0.19%); split: -0.14%, +0.34% InvThroughput: 3880807 -> 3923144 (+1.09%); split: -0.07%, +1.16% VClause: 66835 -> 66712 (-0.18%); split: -0.38%, +0.20% SClause: 178805 -> 178802 (-0.00%); split: -0.01%, +0.01% Copies: 167601 -> 167625 (+0.01%); split: -0.54%, +0.56% Branches: 83788 -> 83784 (-0.00%) PreSGPRs: 388229 -> 388216 (-0.00%) PreVGPRs: 342984 -> 343062 (+0.02%); split: -0.01%, +0.03% fossil-db (navi10, RADV_DEBUG=invariantgeom): Totals from 43116 (32.04% of 134563) affected shaders: MaxWaves: 1260184 -> 1256414 (-0.30%); split: +0.10%, -0.40% Instrs: 12804951 -> 12983628 (+1.40%); split: -0.01%, +1.41% CodeSize: 65813224 -> 66137852 (+0.49%); split: -0.03%, +0.52% VGPRs: 1556396 -> 1561340 (+0.32%); split: -0.09%, +0.41% SpillSGPRs: 1377 -> 1395 (+1.31%) Latency: 76095867 -> 76355111 (+0.34%); split: -0.32%, +0.66% InvThroughput: 13546863 -> 13788789 (+1.79%); split: -0.05%, +1.84% VClause: 310910 -> 311283 (+0.12%); split: -0.63%, +0.75% SClause: 474878 -> 474941 (+0.01%); split: -0.09%, +0.10% Copies: 639367 -> 637610 (-0.27%); split: -1.03%, +0.76% Branches: 240178 -> 240185 (+0.00%); split: -0.00%, +0.00% PreSGPRs: 1056594 -> 1056590 (-0.00%); split: -0.00%, +0.00% PreVGPRs: 1247950 -> 1247798 (-0.01%); split: -0.05%, +0.04% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7920 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20812>	2023-03-08 14:38:26 +00:00
Marek Olšák	f7076d129d	amd: add nir_intrinsic_xfb_counter_sub_amd and fix overflowed streamout offsets Fixes: `5ec79f9899` - ac/nir/ngg: nogs support streamout Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21584>	2023-03-07 22:08:47 +00:00
Lionel Landwerlin	a278eeb719	nir: fix nir_ishl_imm Both GLSL & SPIRV have undefined values for shift > bitsize. But SM5 says : "This instruction performs a component-wise shift of each 32-bit value in src0 left by an unsigned integer bit count provided by the LSB 5 bits (0-31 range) in src1, inserting 0." Better to not hard code the wrong behavior in NIR. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e227bb9fd5` ("nir/builder: add ishl_imm helper") Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21720>	2023-03-07 08:14:34 +00:00
Alyssa Rosenzweig	f47ea3f992	glsl/nir: Use scoped_barrier for control barrier Rather than control_barrier. This avoids the need to handle control_barrier at all for backends that set use_scoped_barrier. This effectively matches what spirv_to_nir emits, so Vulkan-capable compilers should be ok. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>	2023-03-07 00:41:13 +00:00
Alyssa Rosenzweig	952bd63d6d	nir/opt_barrier: Generalize to control barriers For GLSL, we want to optimize code like memoryBarrierBuffer(); controlBarrier(); into a single scoped_barrier intrinsic for the backend to consume. Now that backends can get scoped_barriers everywhere, what's left is enabling backends to combine these barriers together. We already have an Intel-specific pass for combining memory barriers; it just needs a teensy bit of generalization to allow combining all sorts of barriers together. This avoids code quality regression on Asahi when switching to purely scoped barriers. It's probably useful for other backends too. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21661>	2023-03-06 22:09:27 +00:00
Alyssa Rosenzweig	282aeb9b9c	nir/lower_tex: Add lower_index_to_offset Some backends can handle a constant texture index or a dynamic texture index but not a constant texture index plus a dynamic texture offset. Add a nir_lower_tex option to lower to one of these options. v2: Use more straightforward code proposed by Faith. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Emma Anholt <emma@anholt.net> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21546>	2023-03-06 21:38:32 +00:00
Erik Faye-Lund	c305f97257	nir: add a print_internal debug-flag It can sometimes be useful to also print the shaders that are marked as internal, so let's add a flag that lets us do that. Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21681>	2023-03-06 09:13:52 +00:00
Alyssa Rosenzweig	586da7b329	nir: Add nir_lower_helper_writes pass This NIR pass lowers stores in fragment shaders to: if (!gl_HelperInvocation) { store(); } This implements the API requirement that helper invocations do not have visible side effects, and the lowering is required on any hardware that cannot directly mask helper invocations' side effects. The pass was originally written for Midgard (which has this issue) but is also needed for Asahi. Let's share the code, and fix it while we're at it. Changes from the Midgard pass: 1. Add an option to only lower atomics. AGX hardware can mask helper invocations for "plain" stores but not for atomics. Accordingly, the AGX compiler wants this lowering for atomics but not store_global. By contrast, Midgard cannot mask any stores and needs the lowering for all store intrinsics. Add an option to the common pass to accommodate both cases. This is an optimization for AGX. It is not required for correctness; this lowering is always legal. 2. Fix dominance issues. It's invalid to have NIR like if ... { ssa_1 = ... } foo ssa_1 Instead we need to rewrite as if ... { ssa_1 = ... } else { ssa_2 = undef } ssa_3 = phi ssa_1, ssa_2 foo ssa_3 By default, neither nir_validate nor the backends check this, so this doesn't currently fix a (known) real bug. But it's still invalid and fails validation with NIR_DEBUG=validate_ssa_dominance. Fix this in lower_helper_writes for intrinsics that return data (atomics). 3. Assert that the pass is run only for fragment shaders. This encourages backends to be judicious about which passes they call instead of just throwing everything in a giant lower everything spaghetti. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21413>	2023-03-04 13:31:05 -05:00
Rhys Perry	aa32dc704f	nir/range_analysis: fix vectorized phis and intrinsics Found by inspection. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21288>	2023-03-04 12:58:38 +00:00
Marek Olšák	b80bd58265	nir: skip nir_op_unpack_32_4x8 in nir_lower_alu_width The pass can't handle it just like the other unpack opcodes and generates invalid NIR. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>	2023-03-03 03:27:40 +00:00
Marek Olšák	ec38758e86	nir: return progress from nir_lower_io_to_scalar oversight? Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19399>	2023-03-03 03:27:40 +00:00
Faith Ekstrand	c11ac5e446	nir: Handle wider unaligned loads in lower_mem_access_bit_size Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	7e8a10be67	nir: Make chunk_align_offset const in lower_mem_load() This should make things more clear than changing the value from earlier in the loop. Also, rename chunk_offset to load_offset so they match. Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	eb9a56b6ca	nir: Rename nir_mem_access_size_align::align_mul to align It's a simple alignment so calling it align_mul is a bit misleading. Suggested-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	802bf1d9a6	nir: Rename align to whole_align in lower_mem_load Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	ca4d73ba36	nir: Add a combined alignment helper Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@colllabora.com> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	e433a7c4fa	nir: Add UBO support to nir_lower_mem_access_bit_sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	116a851264	nir: Add mode filtering to lower_mem_access_bit_sizes Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Faith Ekstrand	4b06b1a7c5	nir: Check against combined alignment in nir_lower_mem_access_bit_sizes Checking against align_mul is insufficient if align_offset > 0. We need to check against the combined alignment instead. Fixes: `2e2d7803c7` ("nir: Add a load/store bit size lowering pass") Reviewed-by: M Henning <drawoc@darkrefraction.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21524>	2023-03-03 02:00:39 +00:00
Georg Lehmann	0a3387a190	nir/lower_mediump: don't use fp16 for constants if the result is denormal Image stores are not required to preserve denorms. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21622>	2023-03-02 11:42:10 +00:00
Timothy Arceri	d75a36a9ee	glsl: remove do_copy_propagation_elements() optimisation pass Since `13b859de` do_copy_propagation_elements() has a flaw where the time it takes to complete grows exponentially as the number of nested loops increases. It can also hurt rather than help versus just letting NIR optimise the code. So if the NIR linker is enabled we let it handle it instead. shader-db results Iris (BDW): total instructions in shared programs: 11177181 -> 11199739 (0.20%) instructions in affected programs: 119424 -> 141982 (18.89%) helped: 109 HURT: 65 total cycles in shared programs: 368946819 -> 372277173 (0.90%) cycles in affected programs: 116539428 -> 119869782 (2.86%) total spills in shared programs: 3983 -> 8785 (120.56%) spills in affected programs: 2072 -> 6874 (231.76%) helped: 0 HURT: 6 total fills in shared programs: 2016 -> 6068 (200.99%) fills in affected programs: 230 -> 4282 (1761.74%) helped: 0 HURT: 6 LOST: 85 GAINED: 77 freedreno results: total instructions in shared programs: 11011122 -> 11011620 (<.01%) instructions in affected programs: 939829 -> 940327 (0.05%) total full in shared programs: 762725 -> 762674 (<.01%) full in affected programs: 1096 -> 1045 (-4.65%) total constlen in shared programs: 1772092 -> 1771596 (-0.03%) constlen in affected programs: 2780 -> 2284 (-17.84%) total stp in shared programs: 4040 -> 4058 (0.45%) stp in affected programs: 3656 -> 3674 (0.49%) total ldp in shared programs: 2160 -> 2178 (0.83%) ldp in affected programs: 1748 -> 1766 (1.03%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/13.shader_test CL: 1231 -> 1234 (0.24%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/13.shader_test CL: 1231 -> 1234 (0.24%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/15.shader_test CL: 453 -> 456 (0.66%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/15.shader_test CL: 453 -> 456 (0.66%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_high_off/17.shader_test 
CL: 144 -> 147 (2.08%) stp HURT: shaders/robclark-shaders/gfxbench5/gl_5_normal_off/17.shader_test CL: 144 -> 147 (2.08%) however, those stp counts are misleading -- gfxbench gl-5-normal actually gets its scratch (ldp/stp) stored as 16 bits instead of 32 thanks to better NIR copy prop, and the result is 2.64398% +/- 0.0991923% perf improvement! i915 results: total instructions in shared programs: 510528 -> 510489 (<.01%) instructions in affected programs: 3303 -> 3264 (-1.18%) total tex_indirect in shared programs: 16708 -> 16717 (0.05%) tex_indirect in affected programs: 134 -> 143 (6.72%) total temps in shared programs: 30181 -> 30169 (-0.04%) temps in affected programs: 1268 -> 1256 (-0.95%) LOST: 0 GAINED: 1 i915 highlights: instructions HURT: shaders/closed/steam/legend-of-grimrock/47.shader_test FS: 141 -> 144 (2.13%) instructions HURT: shaders/closed/steam/steamworld-dig/22.shader_test FS: 84 -> 108 (28.57%) temps HURT: shaders/closed/steam/left-4-dead-2/medium/3682.shader_test FS: 7 -> 13 (85.71%) r300 results: total instructions in shared programs: 1340439 -> 1340845 (0.03%) instructions in affected programs: 32354 -> 32760 (1.25%) total temps in shared programs: 179394 -> 179329 (-0.04%) temps in affected programs: 1505 -> 1440 (-4.32%) total consts in shared programs: 1177742 -> 1177885 (0.01%) consts in affected programs: 1107 -> 1250 (12.92%) total lits in shared programs: 24992 -> 25019 (0.11%) lits in affected programs: 138 -> 165 (19.57%) instructions HURT: shaders/closed/steam/legend-of-grimrock/26.shader_test FS: 47 -> 52 (10.64%) instructions HURT: shaders/closed/steam/sanctum-2/6072.shader_test FS: 43 -> 48 (11.63%) instructions HURT: shaders/closed/steam/champions-of-regnum/2378.shader_test VS: 35 -> 40 (14.29%) Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13288>	2023-03-01 16:09:25 +00:00
Emma Anholt	106019a5d8	nir/split_64bit_vec3_and_vec4: Handle 64-bit matrix types. The offset handling should already work for flattening to our split vars, just need to make sure we have enough (or any!) array elements. Fixes: #7154 Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13288>	2023-03-01 16:09:25 +00:00
Caio Oliveira	5f79e78911	spirv: Add skip_os_break_in_debug_build option to use in unit tests When running in the CI environment, instead of crashing the test binary, it is preferable to just fail gracefully (in this case return a NULL shader) as is done in release mode, so other tests continue to be executed. For convenience add a variable break_on_failure to the test so the breaking behavior can be re-enabled in individual tests when debugging. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512>	2023-03-01 13:47:57 +00:00
Caio Oliveira	8a91a33b7c	spirv/tests: Add some basic control flow tests The DISABLED test currently fails parsing. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512>	2023-03-01 13:47:57 +00:00
Caio Oliveira	4e5b520286	spirv/tests: Parametrize stage in get_nir() helper Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512>	2023-03-01 13:47:57 +00:00
Caio Oliveira	131f328a18	spirv/tests: Add script to generate C array from SPIR-V source This is useful for generating the C code to embed the SPIR-V when adding a new test. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512>	2023-03-01 13:47:57 +00:00
Caio Oliveira	17e0c75441	spirv/tests: Subclass spirv_test helper to namespace the tests Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21512>	2023-03-01 13:47:57 +00:00
Georg Lehmann	aeb68c29b4	nir/opt_algebraic: add patterns for iand/ior of feq/fneu with 0 Foz-DB Navi21: Totals from 1245 (0.92% of 134913) affected shaders: VGPRs: 66232 -> 66248 (+0.02%); split: -0.01%, +0.04% CodeSize: 5874976 -> 5868168 (-0.12%); split: -0.17%, +0.05% MaxWaves: 25278 -> 25274 (-0.02%); split: +0.01%, -0.02% Instrs: 1087502 -> 1085267 (-0.21%); split: -0.21%, +0.00% Latency: 6531489 -> 6531672 (+0.00%); split: -0.04%, +0.05% InvThroughput: 1531774 -> 1532327 (+0.04%); split: -0.02%, +0.05% VClause: 22218 -> 22202 (-0.07%); split: -0.08%, +0.00% SClause: 45906 -> 45873 (-0.07%); split: -0.08%, +0.01% Copies: 64004 -> 64102 (+0.15%); split: -0.24%, +0.39% Branches: 21529 -> 21534 (+0.02%); split: -0.00%, +0.03% PreSGPRs: 51936 -> 51850 (-0.17%) PreVGPRs: 55393 -> 55398 (+0.01%); split: -0.02%, +0.03% Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21576>	2023-03-01 11:24:43 +00:00
Caio Oliveira	863cbb3e02	spirv: Don't specify nir_var_uniform or nir_var_mem_ubo in barriers These are constant read-only data and don't need to be synchronized. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21517>	2023-03-01 09:53:29 +00:00
Eric Engestrom	78c95b2865	glsl: align definition of _mesa_problem with the one in main/error.h The ctx pointer is not used by that function anyway, so const'ing it makes no difference. Signed-off-by: Eric Engestrom <eric@igalia.com> Reviewed-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Marek Olšák <maraeo@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21557>	2023-02-28 09:04:47 +00:00
Emma Anholt	87ec94f6aa	glsl: Move lower_vector_insert to GLSL-to-NIR. We already have a nir_builder equivalent for generating this code, just use that instead of doing it in GLSL. No change on r300 shader-db. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21476>	2023-02-28 06:13:06 +00:00
Emma Anholt	2f53188f18	glsl: Remove unused as_rvalue_to_saturate(). This is not where saturate recognition happens. Dead code since `5598458e69` ("i965/vec4: Remove try_emit_saturate") in 2014! Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	d76fb3b2b1	glsl/opt_algebraic: Drop the flrp recognizer. No change to r300. freedreno looks mixed but slightly positive in instructions: total instructions in shared programs: 11012472 -> 11012453 (<.01%) instructions in affected programs: 8250 -> 8231 (-0.23%) helped: 16 HURT: 50 Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	579aca894f	glsl/opt_algebraic: Drop the ftrunc pattern recognizer. Now that it's in NIR, there's no change to r300 or freedreno shader-db when we do. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	6d52e6fd2c	nir: Port a floor->truncate algebraic opt pattern from GLSL. Prevents regression when dropping code from the GLSL optimizer. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	6229d34b91	glsl/opt_algebraic: Drop some fmul simplifications. Looks like mostly noise, trending slightly positively. freedreno: total instructions in shared programs: 11012781 -> 11012472 (<.01%) instructions in affected programs: 114072 -> 113763 (-0.27%) helped: 123 HURT: 153 r300: total instructions in shared programs: 1338236 -> 1337897 (-0.03%) instructions in affected programs: 3460 -> 3121 (-9.80%) helped: 61 HURT: 11 Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	4bf65ce221	glsl/opt_algebraic: Drop the flrp/ffma simplifiers. NIR seems to do a better job. Freedreno: total instructions in shared programs: 11013096 -> 11012781 (<.01%) instructions in affected programs: 258358 -> 258043 (-0.12%) helped: 470 HURT: 269 r300: total instructions in shared programs: 1338237 -> 1338236 (<.01%) instructions in affected programs: 161 -> 160 (-0.62%) helped: 1 HURT: 0 total presub in shared programs: 45127 -> 44881 (-0.55%) presub in affected programs: 1719 -> 1473 (-14.31%) helped: 246 HURT: 0 Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:09 +00:00
Emma Anholt	3f632ce764	glsl/opt_algebraic: Drop no-op pack/unpack optimization. No change on freedreno shader-db. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:08 +00:00
Emma Anholt	d589760f44	glsl/opt_algebraic: Drop the eq/neq add-removal optimization. No change on freedreno or r300 shader-db. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21475>	2023-02-28 03:36:08 +00:00
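
Note on the combined alignment referred to in 4b06b1a7c5 and ca4d73ba36: an access described by (align_mul, align_offset) sits at an address of the form k * align_mul + align_offset, so when align_offset is nonzero the guaranteed alignment is the lowest set bit of the offset, not align_mul. The following is a minimal C sketch of that idea, not the Mesa source. The name `combined_align` and its signature are assumptions made for illustration, and it assumes align_mul is a power of two and align_offset < align_mul.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a combined alignment helper (illustrative
 * name and signature, not taken from the log above).
 *
 * Returns the largest power of two guaranteed to divide every address
 * of the form k * align_mul + align_offset.
 */
static uint32_t
combined_align(uint32_t align_mul, uint32_t align_offset)
{
   /* Assumption: align_mul is a nonzero power of two. */
   assert(align_mul != 0 && (align_mul & (align_mul - 1)) == 0);
   /* Assumption: the offset is already reduced modulo align_mul. */
   assert(align_offset < align_mul);

   if (align_offset == 0)
      return align_mul;

   /* Isolate the lowest set bit of the offset. */
   return align_offset & (0u - align_offset);
}
```

With align_mul = 16 and align_offset = 4, every such address is a multiple of 4 but never of 16, which is why a check against align_mul alone would accept accesses that are too wide.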


7754 Commits