AlexIndustrial/mesa

Author	SHA1	Message	Date
Dave Airlie	01dfd65a2d	nir: port fp16 casting code from dxil This moves the dxil pass to common code and makes dxil use the new code. Acked-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9643>	2021-03-22 12:16:59 +10:00
Jesse Natalie	55d153b9f5	nir: Temporarily disable optimizations for MSVC ARM64 There's currently an MSVC optimizer bug which causes a stack overflow in the compiler if it attempts to optimize fsat. Acked-by: Rob Clark <robdclark@chromium.org> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9700>	2021-03-21 21:41:41 +00:00
Jason Ekstrand	1ba9c262fd	nir: Add image atomic_fmin/fmax intrinsics Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8750>	2021-03-18 00:13:40 +00:00
Caio Marcelo de Oliveira Filho	302183d635	nir: Handle deref_atomic_fadd in a couple of passes Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8750>	2021-03-18 00:13:40 +00:00
Jason Ekstrand	4079279051	anv/apply_pipeline_layout: Add support for A64 descriptor access Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>	2021-03-17 17:49:59 +00:00
Jason Ekstrand	c8748771bb	nir/lower_io: Support global addresses for UBOs in nir_lower_explicit_io For nir_address_format_64bit_global_32bit_offset and nir_address_format_64bit_bounded_global, we use a new intrinsics which take the base address and offset as separate parameters. For bounds- checked access, the bound is also included in the intrinsic. This gives the drive more control over the bounds checking so that UBOs don't suddenly become massively more expensive. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>	2021-03-17 17:49:59 +00:00
Jason Ekstrand	93a3f18719	nir: Add a new 64+32-bit address format This is a global address format where you have a 64-bit base pointer and a 32-bit offset. It's intentionally identical to 64bit_bounded_global except nir_lower_explicit_io does no bounds checking with it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>	2021-03-17 17:49:59 +00:00
Jason Ekstrand	1ce3660a5a	intel/fs,rt: Add a predicate to load_global_const_block This allows us to do bounds checked A64 block load without the it being counted as control-flow by NIR. This means that NIR optimizations like CSE will be able to work on these the same as a regular load. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635>	2021-03-17 17:49:58 +00:00
Timur Kristóf	4c5c610f1d	nir: Add AMD-specific Geometry Shader related intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	38df949f98	nir: Add tessellation related AMD-specific intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	744dc74078	nir: Add nir_opt_offsets to fold const adds into load/store offsets. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	eee3435757	nir: Add AMD-specific buffer load/store intrinsics. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	c2a81ebe19	nir: Add default unsigned upper bound configuration. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	8ebb8d31af	nir: Add unsigned upper bound for TCS load_invocation_id. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	9fbfafb57a	nir: Shrink vectors for load_shared. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	084863bb5d	nir: Fix unsigned upper bound of local_invocation_index for non-CS stages. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	132171dc4e	nir: Add a few more algebraic optimizations to help address calculation. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	9f9b0f583b	nir: Add nir_builder helper for I/O address offset calculations. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Timur Kristóf	f6f68d5cf1	nir: Add new nir_builder helpers for iadd with no_unsigned_wrap. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Rhys Perry	5bc42ce579	nir: Don't update base in vectorize_loads() The offset is already updated with consideration to the base above under "/* update the offset */". Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9201>	2021-03-17 12:42:23 +00:00
Iago Toral Quiroga	f29de817eb	compiler/glsl: call util_cpu_detect from glsl_type_singleton_init_or_ref Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Closes: #4393 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9457>	2021-03-17 08:15:36 +01:00
Hyunjun Ko	d82b58c03e	nir: Set access at lower_ubo_vec4 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9125>	2021-03-17 01:09:30 +00:00
Ian Romanick	da7389eced	nir/range_analysis: Simplify analysis of bcsel union_ranges was previously guarded by 'ifndef NDEBUG'. After removing that, I noticed that the two tables were identical. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	7019cd84c0	nir/search: Use range analysis for is_finite There are only a couple patterns that use is_finite, so the changes aren't huge. Mostly shaders from Batman Arkham City and a few shaders from Shadow of the Tomb Raider were affected. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake Instructions in all programs: 160902591 -> 160902489 (-0.0%) SENDs in all programs: 6812270 -> 6812270 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7429003266 -> 7428992369 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) Ice Lake Instructions in all programs: 145301634 -> 145301460 (-0.0%) SENDs in all programs: 6863890 -> 6863890 (+0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798589772 -> 8798575869 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334250 (+0.0%) Skylake Instructions in all programs: 135892010 -> 135891836 (-0.0%) SENDs in all programs: 6802916 -> 6802916 (+0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442597324 -> 8442583202 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301116 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	f4a7dbc58f	nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source Recall that when either value is NaN, fmax will pick the other value. This means the result range of the fmax will either be the "ideal" result range (calculated above) or the range of the non-NaN value. Previously, something like fmax({gt_zero}, {lt_zero, is_a_number}) would return a range of gt_zero. However, if the "gt_zero" parameter is NaN, the actual result will be the "lt_zero" parameter. This analysis depends on the is_a_number analysis also added in this MR. Assuming this doesn't cause any unforeseen problems, I believe we should wait a bit, then nominate a subset of the series for the stable branches. This fixes the piglit tests tests/spec/glsl-1.30/execution/range_analysis_fmax_of_nan.shader_test tests/spec/glsl-1.30/execution/range_analysis_fmin_of_nan.shader_test from https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463. Even with the added fsat fixes, range_analysis_fsat_of_nan.shader_test still fails. There are some other issues there that will be addressed in later commits (in another MR). v2: Add fsat fixes. Suggested by Rhys. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Shader-db results: All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21049290 -> 21049314 (<.01%) instructions in affected programs: 3175 -> 3199 (0.76%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 3 x̄: 1.41 x̃: 1 HURT stats (rel) min: 0.20% max: 1.89% x̄: 0.97% x̃: 0.92% 95% mean confidence interval for instructions value: 1.09 1.73 95% mean confidence interval for instructions %-change: 0.75% 1.19% Instructions are HURT. total cycles in shared programs: 855136176 -> 855136406 (<.01%) cycles in affected programs: 37579 -> 37809 (0.61%) helped: 0 HURT: 17 HURT stats (abs) min: 12 max: 20 x̄: 13.53 x̃: 14 HURT stats (rel) min: 0.17% max: 1.13% x̄: 0.79% x̃: 0.91% 95% mean confidence interval for cycles value: 12.53 14.53 95% mean confidence interval for cycles %-change: 0.63% 0.94% Cycles are HURT. Fossil-db results: Tiger Lake Instructions in all programs: 160901033 -> 160902591 (+0.0%) SENDs in all programs: 6812270 -> 6812270 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7430016795 -> 7429003266 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) Ice Lake Instructions in all programs: 145299102 -> 145301634 (+0.0%) SENDs in all programs: 6863890 -> 6863890 (+0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798390846 -> 8798589772 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334250 (+0.0%) Skylake Instructions in all programs: 135889478 -> 135892010 (+0.0%) SENDs in all programs: 6802916 -> 6802916 (+0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442624166 -> 8442597324 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301116 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	aa5d38decd	nir/range_analysis: Add "is a number" range analysis tracking This commit is necessary to support "nir/range_analysis: Fix analysis of fmin and fmax with NaN". No shader-db or fossil-db changes on any Intel platform. v2: Pack and unpack is_a_number. v3: Don't set is_a_number of integer constants. The bit pattern might be NaN. v4: Update handling of b2i32. intBitsToFloat(int(true)) is 1.401298464324817e-45. Return a value consistent with that. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	d4f21b53f2	nir/range_analysis: Add "is finite" range analysis tracking The obvious changes to nir_search_helpers.h are in a separate commit to limit the scope of this change. These additions are really only needed to support the next commit "nir/range_analysis: Add "is a number" range analysis tracking". This reduction in scope is intended to increase the suitability for stable branches. No shader-db or fossil-db changes on any Intel platform. v2: Pack and unpack is_finite. v3: Split nir_search_helpers.h changes into a separate commit. v4: Remove assertion intended for the next commit. Update is_finite comment for fsign. Both noticed by Rhys. Fix is_finite handling for load_const vectors. If any element is not finite, set the flag to false. This is the same way is_integral is already handled. v5: Update handling of b2i32. intBitsToFloat(int(true)) is 1.401298464324817e-45. Return a value consistent with that. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	86fb53b1be	nir/range_analysis: Refactor fsat handling This will greatly simplify a later commit. The assert(r.is_integral) in the eq_zero case is dropped because I don't think it's useful anymore. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Marek Vasut	b19f1dc7d6	compiler/nir: Increment shader input count and mark as used when adding new gl_PointCoord In case a new gl_PointCoord shader input is created, increment shader input count and set valid driver_location to the new input variable, otherwise the input gets aliased to input 0 and shows up in NIR_PRINT output as whatever shader input 0 is instead of gl_PointCoord. Also set the input as used, otherwise it might get removed. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9214>	2021-03-09 21:24:35 +00:00
Jesse Natalie	ef0d2a5b4b	nir: Add a nir_after_instr_and_phis helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9464>	2021-03-09 01:41:32 +00:00
Jason Ekstrand	e20e85f01e	nir: Make nir_ssa_def_rewrite_uses_after take an SSA value This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after() with an SSA def, and rewrites all the users as needed. Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	117668b811	nir: Make nir_ssa_def_rewrite_uses take an SSA value This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses() with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites all the users as needed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	13a0ee8a51	nir: Add and use a new nir_ssa_def_rewrite_uses_src helper This is currently an alias for nir_ssa_def_rewrite_uses but we move all the instances which used it to write a non-SSA source to the newly named helper. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Alyssa Rosenzweig	e30994a471	nir/lower_viewport_transform: Allow geom/tess This pass needs to run on the last shader in a pipeline writing gl_Position. In GLES2, that's always the vertex shader, but in ES3.2, it can be a geometry or tessellation shader. The shared code works the same in this case, just make the assert more generous. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9444>	2021-03-07 17:57:04 +00:00
Ian Romanick	2c4fd24c01	nir/algebraic: Apply addition property of equality to the other ordering too Inequality comparison operations are not commutative, so `foo < bar` and `bar < foo` both have to be explicitly listed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 20027051 -> 20026899 (<.01%) instructions in affected programs: 37181 -> 37029 (-0.41%) helped: 85 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.05% max: 6.78% x̄: 0.92% x̃: 0.68% 95% mean confidence interval for instructions value: -2.42 -1.15 95% mean confidence interval for instructions %-change: -1.23% -0.61% Instructions are helped. total cycles in shared programs: 979762793 -> 979753527 (<.01%) cycles in affected programs: 2653905 -> 2644639 (-0.35%) helped: 104 HURT: 50 helped stats (abs) min: 1 max: 1048 x̄: 119.99 x̃: 11 helped stats (rel) min: <.01% max: 9.88% x̄: 0.77% x̃: 0.20% HURT stats (abs) min: 1 max: 734 x̄: 64.26 x̃: 8 HURT stats (rel) min: <.01% max: 3.06% x̄: 0.36% x̃: 0.10% 95% mean confidence interval for cycles value: -98.65 -21.68 95% mean confidence interval for cycles %-change: -0.66% -0.15% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Ian Romanick	33031bdab6	nir/algebraic: Apply addition property of equality more conservatively This allows a lot more CSE. Depending on where the addition and the comparison are scheduled, it may also reduce register pressure by reducing the live range of the addends. Across all the platforms, the shaders affected for spills or fills were all fragment shaders from Dirt Rally. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 21043103 -> 21038804 (-0.02%) instructions in affected programs: 892878 -> 888579 (-0.48%) helped: 1549 HURT: 724 helped stats (abs) min: 1 max: 225 x̄: 4.14 x̃: 2 helped stats (rel) min: 0.05% max: 11.18% x̄: 1.04% x̃: 0.78% HURT stats (abs) min: 1 max: 71 x̄: 2.93 x̃: 1 HURT stats (rel) min: 0.07% max: 6.90% x̄: 0.80% x̃: 0.56% 95% mean confidence interval for instructions value: -2.33 -1.45 95% mean confidence interval for instructions %-change: -0.50% -0.40% Instructions are helped. total cycles in shared programs: 855054155 -> 855757566 (0.08%) cycles in affected programs: 58275918 -> 58979329 (1.21%) helped: 1213 HURT: 1680 helped stats (abs) min: 1 max: 107405 x̄: 1684.00 x̃: 10 helped stats (rel) min: <.01% max: 38.09% x̄: 1.51% x̃: 0.25% HURT stats (abs) min: 1 max: 126632 x̄: 1634.59 x̃: 12 HURT stats (rel) min: <.01% max: 85.91% x̄: 2.75% x̃: 0.49% 95% mean confidence interval for cycles value: -98.06 584.35 95% mean confidence interval for cycles %-change: 0.71% 1.22% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 9843 -> 9771 (-0.73%) spills in affected programs: 72 -> 0 helped: 5 HURT: 0 total fills in shared programs: 9600 -> 9451 (-1.55%) fills in affected programs: 149 -> 0 helped: 5 HURT: 0 LOST: 14 GAINED: 9 Skylake total instructions in shared programs: 18185074 -> 18183866 (<.01%) instructions in affected programs: 575180 -> 573972 (-0.21%) helped: 1286 HURT: 468 helped stats (abs) min: 1 max: 15 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.03% max: 4.08% x̄: 0.67% x̃: 0.65% HURT stats (abs) min: 1 max: 8 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.87% x̃: 0.45% 95% mean confidence interval for instructions value: -0.77 -0.60 95% mean confidence interval for instructions %-change: -0.30% -0.22% Instructions are helped. total cycles in shared programs: 960518105 -> 960608234 (<.01%) cycles in affected programs: 42536073 -> 42626202 (0.21%) helped: 1210 HURT: 1714 helped stats (abs) min: 1 max: 7015 x̄: 123.41 x̃: 10 helped stats (rel) min: <.01% max: 33.76% x̄: 1.32% x̃: 0.26% HURT stats (abs) min: 1 max: 14474 x̄: 139.71 x̃: 14 HURT stats (rel) min: <.01% max: 58.94% x̄: 2.00% x̃: 0.44% 95% mean confidence interval for cycles value: 4.02 57.63 95% mean confidence interval for cycles %-change: 0.43% 0.82% Cycles are HURT. LOST: 16 GAINED: 42 Broadwell total instructions in shared programs: 17856880 -> 17852158 (-0.03%) instructions in affected programs: 564836 -> 560114 (-0.84%) helped: 1243 HURT: 418 helped stats (abs) min: 1 max: 115 x̄: 4.36 x̃: 1 helped stats (rel) min: 0.03% max: 9.67% x̄: 0.90% x̃: 0.67% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 0.89% x̃: 0.46% 95% mean confidence interval for instructions value: -3.45 -2.23 95% mean confidence interval for instructions %-change: -0.51% -0.38% Instructions are helped. total cycles in shared programs: 1031140321 -> 1029856892 (-0.12%) cycles in affected programs: 66986946 -> 65703517 (-1.92%) helped: 1084 HURT: 1653 helped stats (abs) min: 1 max: 415168 x̄: 1835.32 x̃: 10 helped stats (rel) min: <.01% max: 57.16% x̄: 1.19% x̃: 0.28% HURT stats (abs) min: 1 max: 43930 x̄: 427.14 x̃: 12 HURT stats (rel) min: <.01% max: 57.53% x̄: 1.32% x̃: 0.39% 95% mean confidence interval for cycles value: -915.76 -22.07 95% mean confidence interval for cycles %-change: 0.17% 0.47% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 20891 -> 20335 (-2.66%) spills in affected programs: 1567 -> 1011 (-35.48%) helped: 70 HURT: 0 total fills in shared programs: 27307 -> 25905 (-5.13%) fills in affected programs: 5381 -> 3979 (-26.05%) helped: 71 HURT: 0 LOST: 17 GAINED: 20 Haswell total instructions in shared programs: 16411850 -> 16409414 (-0.01%) instructions in affected programs: 602666 -> 600230 (-0.40%) helped: 1152 HURT: 781 helped stats (abs) min: 1 max: 103 x̄: 3.59 x̃: 1 helped stats (rel) min: 0.03% max: 8.61% x̄: 0.85% x̃: 0.65% HURT stats (abs) min: 1 max: 41 x̄: 2.18 x̃: 1 HURT stats (rel) min: 0.12% max: 7.69% x̄: 0.88% x̃: 0.69% 95% mean confidence interval for instructions value: -1.74 -0.78 95% mean confidence interval for instructions %-change: -0.21% -0.10% Instructions are helped. total cycles in shared programs: 1035338781 -> 1036977801 (0.16%) cycles in affected programs: 68961096 -> 70600116 (2.38%) helped: 1246 HURT: 2206 helped stats (abs) min: 1 max: 392022 x̄: 1040.28 x̃: 14 helped stats (rel) min: <.01% max: 56.44% x̄: 2.32% x̃: 0.38% HURT stats (abs) min: 1 max: 68630 x̄: 1330.56 x̃: 18 HURT stats (rel) min: <.01% max: 69.97% x̄: 3.31% x̃: 0.61% 95% mean confidence interval for cycles value: 90.43 859.17 95% mean confidence interval for cycles %-change: 1.02% 1.54% Cycles are HURT. total spills in shared programs: 17805 -> 17457 (-1.95%) spills in affected programs: 1202 -> 854 (-28.95%) helped: 34 HURT: 31 total fills in shared programs: 20939 -> 20387 (-2.64%) fills in affected programs: 2702 -> 2150 (-20.43%) helped: 34 HURT: 31 LOST: 24 GAINED: 45 Ivy Bridge and earlier Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515912 -> 15516757 (<.01%) instructions in affected programs: 396569 -> 397414 (0.21%) helped: 578 HURT: 858 helped stats (abs) min: 1 max: 9 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.04% max: 3.70% x̄: 0.65% x̃: 0.65% HURT stats (abs) min: 1 max: 11 x̄: 1.87 x̃: 1 HURT stats (rel) min: 0.08% max: 12.90% x̄: 0.95% x̃: 0.53% 95% mean confidence interval for instructions value: 0.47 0.70 95% mean confidence interval for instructions %-change: 0.24% 0.37% Instructions are HURT. total cycles in shared programs: 584395455 -> 584466352 (0.01%) cycles in affected programs: 20346570 -> 20417467 (0.35%) helped: 1192 HURT: 1896 helped stats (abs) min: 1 max: 4108 x̄: 123.27 x̃: 14 helped stats (rel) min: <.01% max: 37.20% x̄: 2.27% x̃: 0.46% HURT stats (abs) min: 1 max: 3698 x̄: 114.89 x̃: 19 HURT stats (rel) min: <.01% max: 70.28% x̄: 3.02% x̃: 0.71% 95% mean confidence interval for cycles value: 10.75 35.16 95% mean confidence interval for cycles %-change: 0.73% 1.23% Cycles are HURT. LOST: 20 GAINED: 12 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Gert Wollny	81b41e0c76	nir: Add r600 specific intrinsic for loading the tesselation coords Only the XY pair is provided directly, the Z value has to be deducted from the primitive type. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9373>	2021-03-04 09:14:03 +00:00
Ian Romanick	c393ae9d84	nir/search: Constify instruction parameter to search helpers The search helps must never modify the instruction passed in, so let the compiler enforce this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9378>	2021-03-03 18:32:14 +00:00
Rhys Perry	cbb5ed476c	nir/opt_shrink_vectors: add option to skip shrinking image stores Some games declare the wrong format, so we might want to disable this optimization in that case. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e4d75c22` ("nir/opt_shrink_vectors: shrink image stores using the format") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>	2021-03-03 14:18:37 +00:00
Eric Anholt	8bd0cc1a5a	nir/vec_to_movs: Don't generate MOVs for undef channels. This appeared in softpipe's image operations, since NIR always uses 4-component values for the coords, while the GLSL IR only has 2 components for a 2D image (for example). arb_shader_image_load_store-shader-mem-barrier (which times out in CI and spends its time inside of tgsi_exec) was spending 4/51 of its instructions on moving these undefs around. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Eric Anholt	1e5ef4c60c	nir: Add a nir_src_is_undef() helper, like nir_src_is_const(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Gert Wollny	935d9e6863	nir: disaallow reordering for r600 shared load and remove component field The original shared load op can't be reordered, so it might be better to also not allow this for the lowered variant. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Rhys Perry	812dd9c9f6	nir/copy_prop: use nir_{instr,if}_rewrite_{src,condition}_ssa Compile-time (nir_copy_prop): Difference at 95.0% confidence -2470.88 +/- 19.8762 -35.7461% +/- 0.247259% (Student's t, pooled s = 23.4747) Compile-time (overall): Difference at 95.0% confidence -2175.72 +/- 178.786 -1.73627% +/- 0.140826% (Student's t, pooled s = 211.155) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	c2209d836c	nir/copy_prop: visit copies instead of sources There are less copy instructions than sources, so instead of visiting each source and rewriting it if it's uses a copy instruction, visit each copy instruction and rewrite it's users. Besides improving compile time, this also has a side effect of fixing a rare situation where copy-propagation does not happen: loop { a = phi ..., b c = vec ... b = mov c.y } It might have been the case that a phi source could not be rewritten until the copy was visited later. Compile-time (nir_copy_prop): Difference at 95.0% confidence -2613.13 +/- 15.2094 -27.4333% +/- 0.150247% (Student's t, pooled s = 17.963) Comple-time (overall): Difference at 95.0% confidence -2627.89 +/- 201.557 -2.05404% +/- 0.156221% (Student's t, pooled s = 238.048) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	41125bff4f	nir/copy_prop: remove unused copies These were hurting performance of other passes. Compile-time (overall): Difference at 95.0% confidence -5496.3 +/- 219.752 -4.11912% +/- 0.160285% (Student's t, pooled s = 259.538) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	ed9c3c4f19	nir: add nir_ssa_def_is_unused() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8784>	2021-03-01 17:38:10 +00:00
Rhys Perry	f66a7240f9	nir: fix build at -O1 At -O1 with GCC 10.2.1, _nir_visit_dest_indirect (declared ALWAYS_INLINE) will fail to inline if it's caller (nir_foreach_dest) is not inlined, because _nir_visit_dest_indirect is passed as a function pointer. This results in a compilation error. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com> Fixes: `336bcbacd0` ("nir: inline nir_foreach_{src,dest}") Tested-by: Witold Baryluk <witold.baryluk@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4353 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9301>	2021-02-26 21:54:53 +00:00
Rob Clark	a9618e7c42	util: Add accessor for util_cpu_caps In release builds, there should be no change, but in debug builds the assert will help us catch undefined behavior resulting from using util_cpu_caps before it is initialized. With fix for u_half_test for MSVC from Jesse Natalie squashed in. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>	2021-02-26 18:31:19 +00:00
Gert Wollny	e5db9c3dd4	nir: Add r600 specific CUBE opcode to evaluate cube texture coords and face The opcode evaluates tha unnormalized coordinates, the length of the major axis, and the cube face. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00
Gert Wollny	4f4e1e5ed9	nir: Add flag to tex instruction to indicate lowering cube to array E.g. r600 a cube texture lookup uses a specific cube instruction to evaluate the sample coordinates and the face ID, so that the cube texture lookup can be lowered to a array texture lookup, thereby sharing the code with the 2D array texture lopkup. However, for TXD the given gradients still need to be three-component vectors, so add a flag that the NIR validation knows that we deal with cube texture that was lowered to an array and can validate accordingly. v2: Handle new flag in serialization (Marek) v3: Rebase so that the change does not require the patch to deduct the number of offset and grad components from sampler type Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2) Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00

1 2 3 4 5 ...

3036 Commits