Commit Graph

8659 Commits

Author SHA1 Message Date
Lionel Landwerlin 6c2e7797f5 anv: tweak performance query timeout based on number of passes
This avoids device lost events when we replay a command buffer 1k
times on DG2.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 56bd81ee21 anv/perf: fixup counter/query mapping
The intel_perf_counter_pass::pass field is actually useless and
invalid.

Once you have mapped all the counters to all the metrics, the order of
the metrics capture is dictated by intel_perf_get_n_passes().

When reading values that is the order we should follow.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 2001a80d4a ("anv: Implement VK_KHR_performance_query")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 7fbfa694a8 intel/perf: simplify pass computation loop
We don't need to go through all the metric sets as we're already built
a bitset matching per counter to figure out in which metric set a
particular counter is.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 4d19685a99 intel/perf: don't ralloc on perf context a temporary hash table
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin e754bf6be4 intel/perf: allocate cleared counter infos
This array of structure needs to be initialized to 0 as it contains a
bitset we don't explicitly clear.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3144bc1d33 ("intel/perf: move query_mask and location out of gen_perf_query_counter")
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin bdacd6df5a intel/perf: add a non installable tool to print metrics
Useful to look at the layout of the queries.

v2: Rework based on Marcin's comment

v3: Rebase

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Mark Janes e3a842d627 intel/perf: fix overflow in index types
With DG2, the number of perf groups and metrics climbs into the
thousands.  16bit fields are not sufficient for storing metrics
indices, and the build throws warnings when compiling the generated
intel_perf_metrics.c

Use a 32bit integer for these values.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 7770346902 intel/perf: support new variable names
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin c1aa1059c6 intel/perf: support new operators for upcoming metrics
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin d4cbb66506 intel/perf: support more than 64 queries
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 1dd4cc0da5 intel/perf: fix variable type assumption error
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>
2022-11-17 12:57:06 +00:00
Lionel Landwerlin 440da44a84 anv: get rid of ilog2_round_up
__builtin_clz(value - 1) is undefined for with value=1 (because
__builtin_clz(0) is undefined).

Because we set rt_pipeline->stack_size = 1 when a ray tracing pipeline
doesn't need any stack allocation to differentiate from a dynamic size
(rt_pipeline->stack_size = 0) we can run into this undefinied behavior
issue.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: f68d64dac0 ("anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19781>
2022-11-17 10:06:37 +00:00
Joshua Ashton 55b6813b7b anv: Enable EXT_swapchain_colorspace
This extension is basically a no-op exposing some new enums.

Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19726>
2022-11-16 14:07:45 +00:00
Matt Coster afb8308087 intel: Use common CONCAT/PASTE macros
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16945>
2022-11-15 11:54:42 +00:00
Matt Coster 7a84473344 intel: Unify naming of CONCAT/PASTE macros
In isl/isl_priv.h:
 - __PASTE2 => PASTE2
 - __PASTE => CONCAT2

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16945>
2022-11-15 11:54:42 +00:00
Caio Oliveira e81c35d19f anv: Don't use REQUIRE_8 for Bindless Shaders
In 23c7142cd6 ("anv: disable SIMD16 for RT shaders") we were forcing the SIMD8
using the mechanism for subgroup size control, which is problematic since it has
other effects on the shader behavior.

The code was changed to select the SIMD in a different way in the previous patches,
so we can revert the behavior to the original semantics.

Fixes dEQP-VK.subgroups.builtin_var.ray_tracing.subgroupsize.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira eedbd1ddbf intel/compiler: Use SIMD selection helpers in compile_single_bs()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira 6c194ddd18 intel/compiler: Prepare SIMD selection helpers to handle different prog_datas
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira 6ffa597bcf intel/compiler: Keep track of compiled/spilled in brw_simd_selection_state
We still update the cs_prog_data, but don't rely on it for this state anymore.
This will allow use the SIMD selector with shaders that don't use cs_prog_data.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira 3c52e2d04c intel/compiler: Add a SIMD_COUNT constant
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira a0580dadfd intel/compiler: Create a struct to hold SIMD selection state
This is a preparation to decouple the storage of what SIMDs
compiled/spilled from the cs_prog_data.  This will allow reuse
of SIMD selection code by Bindless Shaders.

And since we have a struct now, move the error array there so
reduce the boilerplate of the users.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira 8cda6cd774 intel/compiler: Simplify usage of brw_simd_select_for_workgroup_size()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira a943dbf475 intel/compiler: Make brw_private.h and simd selector helpers C++
We don't intend to expose neither to drivers, so it is fine to be C++.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>
2022-11-15 04:55:18 +00:00
Caio Oliveira 494e2edb90 intel/compiler: Fix missing tie-breaker in brw_nir_analyze_ubo_ranges() ordering code
Per Ken suggestion, use ascending order for the start offset.

Fixes: 6d28c6e52c ("i965: Select ranges of UBO data to be uploaded as push constants.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19731>
2022-11-14 19:41:35 +00:00
Caio Oliveira 9fd1d47aa0 intel/compiler: Fix dynarray usage in intel_clc
The code builds up the dynamic array of objects (spirv_objs) and
collect pointers to each of them into another dynamic
array (spirv_ptr_objs).

If the growth of the first array cause a reallocation, it is
possible that the previous pointers end up invalid.

Fixes: 77e929a527 ("intel/clc: allow multiple CL files to be compiled together")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19730>
2022-11-14 19:15:05 +00:00
Lionel Landwerlin ae76bba34a anv: bump pool bucket max allocation size
Age of Empire IV generates a shader of ~2.3Mb on DG2 which is above
the limit we currently have.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19669>
2022-11-12 21:40:34 +02:00
Lionel Landwerlin bdf680cd3f intel/fs: use nir_opt_ray_query_ranges
Results on DG2 q2rtx shaders:

Totals from 6 (12.24% of 49) affected shaders:
Instrs: 88927 -> 54088 (-39.18%)
Cycles: 4115088 -> 2536902 (-38.35%)
Send messages: 2639 -> 1609 (-39.03%)
Spill count: 1321 -> 613 (-53.60%)
Fill count: 3130 -> 1104 (-64.73%)
Scratch Memory Size: 22528 -> 18432 (-18.18%)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16593>
2022-11-11 15:17:08 +00:00
Mark Collins 086b50078d common/utrace: Rename u_trace_context_actively_tracing to u_trace_should_process
Signed-off-by: Mark Collins <mark@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Yonggang Luo <luoyonggang@gmail.com>
Ack-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18271>
2022-11-11 13:50:56 +00:00
Tapani Pälli 0d85a0d7cd anv: remove dg2 condition for Wa_22011440098
We need same workaround for MTL.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19636>
2022-11-11 10:38:24 +00:00
Tapani Pälli ecd4517560 anv: setup stage bitmask for Wa_22011440098
Fixes: 40b66a4499 ("anv, iris: Add Wa_22011440098 for DG2")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19636>
2022-11-11 10:38:24 +00:00
Lionel Landwerlin 4ceaed7839 anv: split internal surface states from descriptors
On Intel HW we use the same mechanism for internal operations surfaces
as well as application surfaces (VkDescriptor).

This change splits the surface pool in 2, one part dedicated to
internal allocations, the other to application VkDescriptors.

To do so, the STATE_BASE_ADDRESS::SurfaceStateBaseAddress points to a
4Gb area, with the following layout :
   - 1Gb of binding table pool
   - 2Gb of internal surface states
   - 1Gb of bindless surface states

That way any entry from the binding table can refer to both internal &
bindless surface states but none of the driver allocations interfere
with the allocation of the application.

Based off a change from Sviatoslav Peleshko.

v2: Allocate image view null surface state from bindless heap (Sviatoslav)
    Removed debug stuff (Sviatoslav)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7110
Cc: mesa-stable
Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19275>
2022-11-11 10:13:27 +00:00
Dylan Baker 41a929d94c util/glsl2spirv: pass path to glslangValidator into the script
This allows users to override the location of glslang using normal meson
mechanisms.

Reviewed-by: Luis Felipe Strano Moraes <luis.strano@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19449>
2022-11-10 21:14:17 +00:00
Caio Oliveira ecc2dfc503 intel/compiler: Use std::unique_ptr for tracking the fs_visitors
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19605>
2022-11-10 18:01:52 +00:00
Lionel Landwerlin 68fd9d2829 anv: fixup invalid enum for nir environment
Also switching away from PIPE_

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8c4c4c3ee1 ("anv: Add softtp64 workaround")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19638>
2022-11-10 14:51:32 +00:00
Lionel Landwerlin b499a27d74 nir: make ray query load values visible in NIR prints
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19641>
2022-11-10 14:40:08 +02:00
Jason Ekstrand 4d63beaae6 hasvk: Switch to common code for command buffer lifecycles
This gets us command buffer object recycling.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18383>
2022-11-10 11:15:23 +00:00
Jason Ekstrand 415bf88637 anv: Switch to common code for command buffer lifecycles
This gets us command buffer object recycling.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18383>
2022-11-10 11:15:23 +00:00
Emma Anholt 74bbeb5116 ci/iris: Add some flakes from the new testing on JSL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19628>
2022-11-09 22:07:10 +00:00
Ian Romanick 351b8c6aec intel/fs: Enable nir_op_imul_32x16 and nir_op_umul_32x16 on pre-Gfx7
Even though Intel's CI doesn't test these old platforms anymore, the
validation added in "intel/eu/validate: Validate integer multiplication
source size restrictions" combined with full shader-db runs gives me
confidence in the changes.

Sandy Bridge
total instructions in shared programs: 13902341 -> 13902167 (<.01%)
instructions in affected programs: 30771 -> 30597 (-0.57%)
helped: 66 / HURT: 0

total cycles in shared programs: 741795500 -> 741791931 (<.01%)
cycles in affected programs: 987602 -> 984033 (-0.36%)
helped: 28 / HURT: 5

Iron Lake
total instructions in shared programs: 8365806 -> 8365754 (<.01%)
instructions in affected programs: 1766 -> 1714 (-2.94%)
helped: 10 / HURT: 0

total cycles in shared programs: 248542694 -> 248542378 (<.01%)
cycles in affected programs: 29836 -> 29520 (-1.06%)
helped: 9 / HURT: 0

GM45
total instructions in shared programs: 5187127 -> 5187101 (<.01%)
instructions in affected programs: 891 -> 865 (-2.92%)
helped: 5 / HURT: 0

total cycles in shared programs: 163643914 -> 163643750 (<.01%)
cycles in affected programs: 22206 -> 22042 (-0.74%)
helped: 5 / HURT: 0

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>
2022-11-09 21:34:26 +00:00
Ian Romanick 293ad13e3f intel/fs: Slightly restructure emitting nir_op_imul_32x16 and nir_op_umul_32x16
There are no immediate values at this point, so all of this code was
bunk. :face_palm:

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>
2022-11-09 21:34:26 +00:00
Ian Romanick ee2a299661 intel/eu/validate: Validate integer multiplication source size restrictions
v2: Expect correct result on BDW in test_eu.

v3: Fix SNB type-size check. Noticed by Marcin.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>
2022-11-09 21:34:26 +00:00
Ian Romanick d668512f88 intel/compiler: Fix signed integer range analysis of imax and imin
Some review feedback of an earlier commit caused me to rearrange some
code quite a bit. I wasn't paying enough attention while applying the
later commits, and these breaks should have been returns. As it is, the
result of the imin or imax analysis is overwritten by the default case
handling... effectively the original commit does nothing. :(

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 19914090 -> 19904772 (-0.05%)
instructions in affected programs: 121258 -> 111940 (-7.68%)
helped: 445 / HURT: 0

total cycles in shared programs: 855291535 -> 855266659 (<.01%)
cycles in affected programs: 2737005 -> 2712129 (-0.91%)
helped: 426 / HURT: 17

LOST:   0
GAINED: 3

Skylake and Broadwell had similar results. (Skylake shown)
total cycles in shared programs: 842395356 -> 842338259 (<.01%)
cycles in affected programs: 5460985 -> 5403888 (-1.05%)
helped: 458 / HURT: 0

Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16710449 -> 16708449 (-0.01%)
instructions in affected programs: 44101 -> 42101 (-4.54%)
helped: 75 / HURT: 0

total cycles in shared programs: 882760230 -> 882727923 (<.01%)
cycles in affected programs: 2867797 -> 2835490 (-1.13%)
helped: 62 / HURT: 10

No shader-db change on any other Intel platform.

No fossil-db changes on any Intel platform.

Fixes: 5ec75ca10d ("intel/compiler: Teach signed integer range analysis about imax and imin")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>
2022-11-09 21:34:26 +00:00
Jason Ekstrand 25c180b509 intel: Don't cross DWORD boundaries with byte scratch load/store
The back-end swizzles dwords so that our indirect scratch messages match
the memory layout of spill/fill messages for better cache coherency.
The swizzle happens at a DWORD granularity.  If a read or write crosses
a DWORD boundary, the first bit will get correctly swizzled but whatever
piece lands in the next dword will not because the scatter instructions
assume sequential addresses for all bytes.  For DWORD writes, this is
handled naturally as part of scalarizing.  For smaller writes, we need
to be sure that a single write never escapes a dword.

Fixes: fd04f858b0 ("intel/nir: Don't try to emit vector load_scratch instructions")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7364
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19580>
2022-11-09 19:45:10 +00:00
Jason Ekstrand 85685cf932 intel/lower_mem_access_bit_sizes: Compute alignments automatically
Because dup_mem_intrinsic() retains the SSA offset from the original
intrinsic and only modifies it by adding a constant, we can compute the
alignment based on the original alignment and the constant offset.  This
is both easier and more accurate.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19580>
2022-11-09 19:45:10 +00:00
Lionel Landwerlin 97b3dd34c1 anv: fix missing VkPhysicalDeviceExtendedDynamicState3PropertiesEXT handling
Fixes: 13c422e1b2 ("anv: toggle on EXT_extended_dynamic_state3")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19573>
2022-11-08 15:28:57 +00:00
Caio Oliveira 22d8ed84b8 intel/compiler: Remove unused fs_visitor::emit_percomp()
Since 7ef7738a61 ("i965: Write gl_FragCoord directly to the destination.") this
is not used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:09 +00:00
Caio Oliveira 90861e6fea intel/compiler: Remove various unused function declarations
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:08 +00:00
Caio Oliveira 48506a9029 intel/compiler: Remove unused data members
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>
2022-11-08 07:33:08 +00:00
Ian Romanick 9abeb3d739 intel/fs: Optimize integer multiplication of large constants by factoring
Many Intel platforms can only perform 32x16 bit multiplication.  The
straightforward way to implement 32x32 bit multiplications is by
splitting one of the operands into high and low parts called H and L,
repsectively.  The full multiplication can be implemented as:

         ((A * H) << 16) + (A * L)

On Intel platforms, special register accesses can be used to eliminate
the shift operation.  This results in three instructions and a temporary
register for most values.

If H or L is 1, then one (or both) of the multiplications will later be
eliminated.  On some platforms it may be possible to eliminate the
multiplication when H is 256.

If L is zero (note that H cannot be zero), one of the multiplications
will also be eliminated.

Instead of splitting the operand into high and low parts, it may
possible to factor the operand into two 16-bit factors X and Y.  The
original multiplication can be replaced with (A * (X * Y)) = ((A * X) *
Y).  This requires two instructions without a temporary register.

I may have gone a bit overboard with optimizing the factorization
routine.  It was a fun brainteaser, and I couldn't put it down. :) On my
1.3GHz Ice Lake, a standalone test could chug through 1,000,000 randomly
selected values in about 5.7 seconds.  This is about 9x the performance
of the obvious, straightforward implementation that I started with.

v2: Drop an unnecessary return.  Rearrange logic slightly and rename
variables in factor_uint32 to better match the names used in the large
comment.  Both suggested by Caio. Rearrange logic to avoid possibly
using `a` uninitialized. Noticed by Marcin.

v3: Use DIV_ROUND_UP instead of open coding it. Noticed by Caio.

Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown)
total instructions in shared programs: 19912558 -> 19912526 (<.01%)
instructions in affected programs: 3432 -> 3400 (-0.93%)
helped: 10 / HURT: 0

total cycles in shared programs: 856413218 -> 856412810 (<.01%)
cycles in affected programs: 122032 -> 121624 (-0.33%)
helped: 9 / HURT: 0

No shader-db changes on any other Intel platforms.

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
Instructions in all programs: 141997227 -> 141996923 (-0.0%)
Instructions helped: 71

Cycles in all programs: 9162524757 -> 9162523886 (-0.0%)
Cycles helped: 63
Cycles hurt: 5

No fossil-db changes on any other Intel platforms.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00
Ian Romanick 5ec75ca10d intel/compiler: Teach signed integer range analysis about imax and imin
This is especially helpful for a*isign(a) generated by idiv_by_const
optimization.  On many GPUs, isign(a) is lowered to imax(imin(a, 1),
-1).

There are no changes on fossil-db because ANV uses a different
optimization path for idiv with a constant denominator.  A future MR
will change this.

NOTE: This commit used to help a few hundred shader-db shaders, but
now none are affected.  I suspect this is due to some change in the
idiv_by_const optimization.  This could possibly be dropped.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>
2022-11-08 00:02:16 +00:00