AlexIndustrial/mesa

Author	SHA1	Message	Date
Job Noorman	584b63ecab	nir/load_store_vectorize: fix division by zero Don't use glsl_get_explicit_stride as it may return 0 for vector types, use nir_deref_instr_array_stride instead. Signed-off-by: Job Noorman <jnoorman@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31460>	2024-10-02 05:53:57 +00:00
Rhys Perry	be64454710	nir/tests: test opt_loop_peel_initial_break with derefs in header block Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Rhys Perry	0484044b1a	nir/opt_loop: rematerialize header block derefs in their use blocks Otherwise, we could end up with phis of derefs. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31324>	2024-10-01 12:24:22 +00:00
Gert Wollny	f19f1ec17b	nir/opt_algebraic: Allow two-step lowering of ftrunc@64 to use ffract@64 If ftrunc@64 is lowered by nir_lower_doubles it is turned into a comparably long series of 32-bit operations. If the hardware supports ffract@64 then nir_opt_algebraic can first lower ftrunc@64 to use some combinations with ffloor@64. They can then be turned into a combination of fsub@64 and ffract@64, resulting in fewer instructions overall. Fixes: `5218cff34b` nir/algebraic: avoid double lowering of some fp64 operations Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29281>	2024-09-30 23:51:02 +00:00
Kenneth Graunke	0b34a7aff0	nir: Don't generate single iteration loops to zero-initialize memory If the stride we're adding to our loop counter is larger than the total amount of shared local memory we're trying to initialize, we know the loop will run at most one time. So we can skip emitting a loop. Loop unrolling appears to be unable to detect this currently. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31312>	2024-09-30 05:27:17 +00:00
Georg Lehmann	bb7e8d51b6	nir: delete nir_opt_reuse_constants Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	60776f87c3	nir/opt_remove_phis: rematerialize constants Foz-DB Navi31: Totals from 749 (0.94% of 79395) affected shaders: Instrs: 1224359 -> 1223722 (-0.05%); split: -0.07%, +0.02% CodeSize: 6468392 -> 6466296 (-0.03%); split: -0.06%, +0.03% Latency: 9764410 -> 9766457 (+0.02%); split: -0.01%, +0.03% InvThroughput: 1017401 -> 1017380 (-0.00%); split: -0.03%, +0.03% VClause: 19902 -> 19873 (-0.15%); split: -0.16%, +0.02% SClause: 38441 -> 38424 (-0.04%); split: -0.05%, +0.01% Copies: 86880 -> 86304 (-0.66%); split: -0.73%, +0.06% Branches: 34206 -> 34159 (-0.14%); split: -0.14%, +0.01% PreSGPRs: 45557 -> 45527 (-0.07%); split: -0.08%, +0.01% PreVGPRs: 32406 -> 32408 (+0.01%) VALU: 671633 -> 671533 (-0.01%); split: -0.02%, +0.01% SALU: 155284 -> 154675 (-0.39%); split: -0.40%, +0.00% VMEM: 27303 -> 27271 (-0.12%) SMEM: 67490 -> 67455 (-0.05%) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	40fc85c15b	nir: make nir_instr_clone usable with load_const and undef Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	a9f8089240	nir: replace nir_opt_remove_phis_block with a single source version This is what callers actually want, and it simplifies nir_opt_remove_phis because we can assume dominance meta data is valid. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31031>	2024-09-27 05:19:16 +00:00
Georg Lehmann	41e82b8b8e	nir: sink is_subgroup_invocation_lt_amd Having it closer to the branches means we can eliminate an exec copy. Foz-DB Navi31: Totals from 11615 (14.63% of 79395) affected shaders: Instrs: 6804372 -> 6804903 (+0.01%); split: -0.04%, +0.05% CodeSize: 33684672 -> 33680584 (-0.01%); split: -0.07%, +0.05% VGPRs: 578616 -> 578604 (-0.00%) SpillSGPRs: 1506 -> 1304 (-13.41%) Latency: 29817034 -> 29821320 (+0.01%); split: -0.03%, +0.05% InvThroughput: 3581587 -> 3581217 (-0.01%); split: -0.02%, +0.01% VClause: 124826 -> 124782 (-0.04%); split: -0.04%, +0.00% SClause: 187916 -> 187645 (-0.14%); split: -0.27%, +0.13% Copies: 520969 -> 510027 (-2.10%); split: -2.20%, +0.10% PreSGPRs: 442584 -> 421344 (-4.80%) VALU: 3810755 -> 3810267 (-0.01%); split: -0.01%, +0.00% SALU: 763402 -> 752650 (-1.41%); split: -1.48%, +0.07% Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>	2024-09-26 14:29:14 +00:00
Georg Lehmann	bcfc5c09fa	amd: add offset to is_subgroup_invocation_lt_amd Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31184>	2024-09-26 14:29:13 +00:00
Marek Olšák	09e64e3682	nir/opt_shrink_vectors: shrink memory loads, not just IO The problem with radeonsi+ACO is that UBO loads from vec4 uniforms using only 1 component always load all 4 components. This fixes that. We are only interested in shrinking UBO and SSBO loads, but I added more intrinsics because why not. Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29384>	2024-09-26 03:01:38 +00:00
Timothy Arceri	f6e7520b13	glsl: remove now unused linker code This has all been replaced by a nir based linker implementation. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	fe9b93fc1c	nir: handle wildcard array deref Here we add handling of wildcard array derefs when attempting to mark an io as partially used rather than hitting an assert. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	6bb6b0e5ad	nir: add nir_intrinsic_deref_implicit_array_length intrinsic This will be used to handle .length() calls on unsized arrays Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	60937b5286	nir: add implicit_conversion_prohibited field to nir_parameter Will be used in link time validation in following patches. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	5645495156	nir: store variable mode in nir_parameter This will be used by the nir glsl linker in following patches. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	89a2411c54	nir: serialize nir_parameter type Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	6ff3e87e5f	nir: add function in/outs to variable modes Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:44 +00:00
Timothy Arceri	1cb115abd2	nir: add nir_function_impl_clone_remap_globals() This will be used by the glsl nir linker when we are combining different shaders from the same shader stage that might have multiple declarations of global variables across the different shaders. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Timothy Arceri	7a1061e0dd	nir: add max_ifc_array_access field to vars This will be used in following patches by the nir based glsl linker code. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Timothy Arceri	7c5b21c032	glsl: add support for converting global instructions to NIR NIR doesn't really support global instructions such as global val initialisation. So here we add functionality to glsl_to_nir() to put these instructions into a temporary function that will be later inlined into main. We give the function a name starting with gl_mesa_tmp_ as functions starting with gl_ are reserved and will not have any clashes with user functions. We finish the name with the blake3 of the shader source to avoid conflicts with multiple shaders attached to a single stage. Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31137>	2024-09-25 09:39:43 +00:00
Georg Lehmann	e0bcab953d	nir: add amd shared append/consume Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31075>	2024-09-19 16:21:47 +00:00
Boris Brezillon	eeb3512498	nir/lower_ssbo: Extend the load_ssbo_address intrinsic to pass an offset On Mali(Valhall), the bounds checking can be done in hardware, but for this to work properly, we need to pass the offset to the nir_load_ssbo_address() intrinsic. Add an offset source to the intrinsic, and adjust the lowering pass to conditionally lower the offset addition. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>	2024-09-18 13:45:57 +00:00
Boris Brezillon	adadb097a3	nir/lower_ssbo: Add an option to conditionally lower loads On Mali(Valhall), we have a way to load SSBO data without going through an SSBO index -> global address translation, so let's provide a way to tell nir_lower_ssbo() when it shouldn't lower loads. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Eric R. Smith <eric.smith@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31164>	2024-09-18 13:45:57 +00:00
Georg Lehmann	a3d6a770c0	nir/instr_set: fix fp_fast_math We can't just ignore the flags of the match, we need the union. Fixes: `666647acae` ("nir: track some float controls bits per instruction") Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31195>	2024-09-17 20:00:03 +00:00
Ian Romanick	6a09d33549	nir: Add a pass to generate BFI instructions from logical operations Inspired by a commit message in !30934, I set about optimizing the code generated for nir_copysign. It would be possible to just implement an opt_algebraic pattern for the specific values used by nir_copysign, but this casts a slightly larger net. As noted in a comment in the code, there may be variations of the pattern that this pass misses. The opt_algebraic pattern would miss them too. v2: Use nir_def_replace. Suggested by Alyssa. Allow more "root" instruction types. Suggested by Georg. v3: Treat extract_u16(x, 0) as (x & 0x0000ffff), and treat extract_u8(x, 0) as (x & 0x000000ff). v4: Use nir_scalar. Suggested by Georg. Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>	2024-09-13 00:21:00 +00:00
Ian Romanick	057c7c9f53	nir/algebraic: Recognize open-coded bitfield_reverse in XCOM 2 The XCOM 2 shaders in my shader-db use iadd instead of ior. No fossil-db changes on any Intel platform. shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19787210 -> 19787034 (<.01%) instructions in affected programs: 1187 -> 1011 (-14.83%) helped: 6 / HURT: 0 total cycles in shared programs: 906024436 -> 906012612 (<.01%) cycles in affected programs: 72978 -> 61154 (-16.20%) helped: 6 / HURT: 0 Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31006>	2024-09-13 00:21:00 +00:00
Rhys Perry	97f4250a7c	nir: skip opt_loop_peel_initial_break if continue block only has phis Doing that optimization wouldn't do anything useful in this case. nir_block_has_non_copy() is used by opt_loop_peel_initial_break(). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>	2024-09-12 23:36:58 +00:00
Rhys Perry	8410b4cdd6	nir/tests: add some loop peeling tests Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>	2024-09-12 23:36:58 +00:00
Rhys Perry	64ac601049	nir/opt_loop: skip peeling if the loop ends with any kind of jump Any kind of jump prevents us from moving it to the top of the loop, not just breaks. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>	2024-09-12 23:36:58 +00:00
Rhys Perry	af3b099e0a	nir/opt_loop: skip peeling if the break is non-trivial If this nir_if contains continues or other breaks, we can't move it outside the loop. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>	2024-09-12 23:36:57 +00:00
Rhys Perry	4f44a944bb	nir/opt_if: fix fighting between split_alu_of_phi and peel_initial_break Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `6b4b044739` ("nir/opt_loop: add loop peeling optimization") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11822 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31002>	2024-09-12 23:36:57 +00:00
Georg Lehmann	7fa7812219	nir: merge out of loop decision with nir_can_move_instr logic One place to modify instead of two when adding new intrinsics here. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>	2024-09-12 21:49:34 +00:00
Georg Lehmann	91f8e32a85	nir/opt_sink: do not sink inverse_ballot out of loops Inverse_ballot result is undefined if the input is not dynamically uniform. And sinking out of loops might make the input divergent. Fixes: `18a0ff137f` ("nir: sink/move inverse_ballot like moves") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>	2024-09-12 21:49:34 +00:00
Georg Lehmann	1ec3cc2aed	nir/opt_sink: do not sink load_ubo_vec4 out of loops Same reason as for load_ubo. Fixes: `d199d65c3a` ("nir/nir_opt_move,sink: Include load_ubo_vec4 as a load_ubo instr.") Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30906>	2024-09-12 21:49:34 +00:00
Caio Oliveira	1e7f1c2039	nir: Allow Mesh/Task to use implicit LOD when DERIVATIVE_GROUP is set Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30956>	2024-09-10 18:22:42 +00:00
David Heidelberg	6bf7b5bcd8	nir_lower_mem_access_bit_sizes: Assert when 0 components or bits are requested Prevent the accidental passing of 0 components or bits, as it makes no sense. Cc: mesa-stable Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Suggested-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: David Heidelberg <david@ixit.cz> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31103>	2024-09-10 11:17:48 +00:00
Ian Romanick	a780305818	nir/algebraic: Optimize more comparisons with b2f shader-db: All Intel platforms had similar results. (Meteor Lake shown) total instructions in shared programs: 19781108 -> 19772614 (-0.04%) instructions in affected programs: 372638 -> 364144 (-2.28%) helped: 2915 / HURT: 0 total cycles in shared programs: 905907644 -> 905822682 (<.01%) cycles in affected programs: 5573453 -> 5488491 (-1.52%) helped: 2363 / HURT: 234 LOST: 42 GAINED: 16 fossil-db: All Intel platforms had similar results. (Meteor Lake shown) Totals: Instrs: 152519634 -> 152519610 (-0.00%) Cycle count: 17122707642 -> 17122710974 (+0.00%); split: -0.00%, +0.00% Totals from 5 (0.00% of 633222) affected shaders: Instrs: 2827 -> 2803 (-0.85%) Cycle count: 83089 -> 86421 (+4.01%); split: -0.12%, +4.13% Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31068>	2024-09-10 04:15:58 +00:00
Alyssa Rosenzweig	b7542c4390	nir: CSE comparisons in atan2 Same code generated on AGX but simplified NIR. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	7546ae96a7	nir: drop NaN fixup for atan this existed due to the min/max, per the comment. now that we don't do min/max, the whole routine is NaN correct so the fixup is pointless. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Suggested-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	ab8547a002	nir: push up abs in atan2 calculation everybody has abs on fmul, not everyone has abs on bcsel. should help agx and bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	398e1ad46c	nir: fuse ffma for atan range fixup Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	47e7cd268c	nir: negate an expression in atan we're going to fix up the sign immediately anyway. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	5318b8868b	nir: simplify atan range reduction fixup the original version sure is creative. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	87b99d5797	nir: use copysign for atan this does two things: * ignores sign of negative numbers which let us play fast and loose later in the series * avoids an expensive fsign instruction in favour of a cheap bitwise op Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	95215a094a	nir: extend copysign for no-integer hw Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	0a4a0df283	nir: push down fabs for atan worse in terms of NIR instruction count but lets the fabs fold easier. (on agx, which has fabs on comparisons and fmul but not on bcsel. should be no worse if ISA has fabs on all 3.) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	8579375777	nir: simplify atan range reduction just implement what the comment says, don't be clever. the clever thing is worse on all architectures i'm familiar with, because the fdiv will turn into fmul+frcp. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00
Alyssa Rosenzweig	a32b1a975d	nir: correct comment for atan range reduction the code did not match the comment, blew a sign. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30934>	2024-09-07 00:54:35 +00:00

5592 Commits