AlexIndustrial/mesa

Author	SHA1	Message	Date
Lionel Landwerlin	6c2e7797f5	anv: tweak performance query timeout based on number of passes This avoids device lost events when we replay a command buffer 1k times on DG2. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	56bd81ee21	anv/perf: fixup counter/query mapping The intel_perf_counter_pass::pass field is actually useless and invalid. Once you have mapped all the counters to all the metrics, the order of the metrics capture is dictated by intel_perf_get_n_passes(). When reading values that is the order we should follow. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `2001a80d4a` ("anv: Implement VK_KHR_performance_query") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	7fbfa694a8	intel/perf: simplify pass computation loop We don't need to go through all the metric sets as we've already built a bitset matching per counter to figure out in which metric set a particular counter is. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	4d19685a99	intel/perf: don't ralloc on perf context a temporary hash table Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	e754bf6be4	intel/perf: allocate cleared counter infos This array of structure needs to be initialized to 0 as it contains a bitset we don't explicitly clear. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3144bc1d33` ("intel/perf: move query_mask and location out of gen_perf_query_counter") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	bdacd6df5a	intel/perf: add a non installable tool to print metrics Useful to look at the layout of the queries. v2: Rework based on Marcin's comment v3: Rebase Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Mark Janes	e3a842d627	intel/perf: fix overflow in index types With DG2, the number of perf groups and metrics climbs into the thousands. 16bit fields are not sufficient for storing metrics indices, and the build throws warnings when compiling the generated intel_perf_metrics.c Use a 32bit integer for these values. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	7770346902	intel/perf: support new variable names Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	c1aa1059c6	intel/perf: support new operators for upcoming metrics Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	d4cbb66506	intel/perf: support more than 64 queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	1dd4cc0da5	intel/perf: fix variable type assumption error Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18893>	2022-11-17 12:57:06 +00:00
Lionel Landwerlin	440da44a84	anv: get rid of ilog2_round_up __builtin_clz(value - 1) is undefined for value=1 (because __builtin_clz(0) is undefined). Because we set rt_pipeline->stack_size = 1 when a ray tracing pipeline doesn't need any stack allocation to differentiate from a dynamic size (rt_pipeline->stack_size = 0) we can run into this undefined behavior issue. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `f68d64dac0` ("anv: Add support for vkCmdSetRayTracingPipelineStackSizeKHR") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19781>	2022-11-17 10:06:37 +00:00
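The undefined case described above can be illustrated with a guarded helper. This is only a sketch: the commit itself removes `ilog2_round_up` rather than patching it, and `safe_ilog2_round_up` is a hypothetical name. It assumes a GCC/Clang-style `__builtin_clz` and a 32-bit `unsigned int`.

```c
#include <stdint.h>

/* Hypothetical helper (not the code from the commit): ceil(log2(value))
 * that never hands 0 to __builtin_clz. */
static inline uint32_t
safe_ilog2_round_up(uint32_t value)
{
   /* value == 1 would evaluate __builtin_clz(0), which is undefined, and
    * value == 0 has no logarithm at all, so both return 0 explicitly. */
   if (value <= 1)
      return 0;

   /* For value >= 2, value - 1 is non-zero and the builtin is well defined. */
   return 32 - (uint32_t)__builtin_clz(value - 1);
}
```

With the guard in place, a `stack_size` of 1 (the "no stack needed" marker from the message) maps to 0 instead of triggering undefined behavior.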
Joshua Ashton	55b6813b7b	anv: Enable EXT_swapchain_colorspace This extension is basically a no-op exposing some new enums. Signed-off-by: Joshua Ashton <joshua@froggi.es> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19726>	2022-11-16 14:07:45 +00:00
Matt Coster	afb8308087	intel: Use common CONCAT/PASTE macros Signed-off-by: Matt Coster <matt.coster@imgtec.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16945>	2022-11-15 11:54:42 +00:00
Matt Coster	7a84473344	intel: Unify naming of CONCAT/PASTE macros In isl/isl_priv.h: - __PASTE2 => PASTE2 - __PASTE => CONCAT2 Signed-off-by: Matt Coster <matt.coster@imgtec.com> Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16945>	2022-11-15 11:54:42 +00:00
Caio Oliveira	e81c35d19f	anv: Don't use REQUIRE_8 for Bindless Shaders In `23c7142cd6` ("anv: disable SIMD16 for RT shaders") we were forcing the SIMD8 using the mechanism for subgroup size control, which is problematic since it has other effects on the shader behavior. The code was changed to select the SIMD in a different way in the previous patches, so we can revert the behavior to the original semantics. Fixes dEQP-VK.subgroups.builtin_var.ray_tracing.subgroupsize. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	eedbd1ddbf	intel/compiler: Use SIMD selection helpers in compile_single_bs() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	6c194ddd18	intel/compiler: Prepare SIMD selection helpers to handle different prog_datas Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	6ffa597bcf	intel/compiler: Keep track of compiled/spilled in brw_simd_selection_state We still update the cs_prog_data, but don't rely on it for this state anymore. This will allow using the SIMD selector with shaders that don't use cs_prog_data. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	3c52e2d04c	intel/compiler: Add a SIMD_COUNT constant Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	a0580dadfd	intel/compiler: Create a struct to hold SIMD selection state This is a preparation to decouple the storage of what SIMDs compiled/spilled from the cs_prog_data. This will allow reuse of SIMD selection code by Bindless Shaders. And since we have a struct now, move the error array there to reduce the boilerplate of the users. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	8cda6cd774	intel/compiler: Simplify usage of brw_simd_select_for_workgroup_size() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	a943dbf475	intel/compiler: Make brw_private.h and simd selector helpers C++ We don't intend to expose either to drivers, so it is fine to be C++. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19601>	2022-11-15 04:55:18 +00:00
Caio Oliveira	494e2edb90	intel/compiler: Fix missing tie-breaker in brw_nir_analyze_ubo_ranges() ordering code Per Ken's suggestion, use ascending order for the start offset. Fixes: `6d28c6e52c` ("i965: Select ranges of UBO data to be uploaded as push constants.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19731>	2022-11-14 19:41:35 +00:00
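A tie-breaker matters because `qsort` leaves the relative order of equal elements unspecified, so a comparator with ties can produce different orderings on different C libraries. The sketch below shows the general pattern only. `struct range_entry`, its `score` field and `sort_ranges` are hypothetical stand-ins, not the structures used by `brw_nir_analyze_ubo_ranges()`.

```c
#include <stdlib.h>

/* Hypothetical entry: "score" stands in for whatever primary key the real
 * code ranks by; "start" is the start offset used to break ties. */
struct range_entry {
   int score;
   int start;
};

static int
cmp_range_entry(const void *va, const void *vb)
{
   const struct range_entry *a = va;
   const struct range_entry *b = vb;

   /* Primary key: higher score sorts first (descending). */
   if (a->score != b->score)
      return (a->score < b->score) - (a->score > b->score);

   /* Tie-breaker: lower start offset sorts first (ascending), so the
    * result no longer depends on how qsort treats equal elements. */
   return (a->start > b->start) - (a->start < b->start);
}

static void
sort_ranges(struct range_entry *entries, size_t count)
{
   qsort(entries, count, sizeof(entries[0]), cmp_range_entry);
}
```

The comparisons are written as differences of boolean results rather than as subtractions, which avoids signed overflow for extreme key values.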
Caio Oliveira	9fd1d47aa0	intel/compiler: Fix dynarray usage in intel_clc The code builds up the dynamic array of objects (spirv_objs) and collects pointers to each of them into another dynamic array (spirv_ptr_objs). If the growth of the first array causes a reallocation, it is possible that the previous pointers end up invalid. Fixes: `77e929a527` ("intel/clc: allow multiple CL files to be compiled together") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19730>	2022-11-14 19:15:05 +00:00
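The hazard described above is general to any growable array backed by `realloc`. The sketch below shows one common way to avoid it, namely taking element addresses only after the array has stopped growing. It is not necessarily how the commit fixes `intel_clc`, and `struct obj`, `obj_array_push` and `obj_array_collect_ptrs` are hypothetical names that do not use Mesa's dynarray helpers.

```c
#include <stdlib.h>

struct obj {
   int id;
};

struct obj_array {
   struct obj *data;
   size_t count;
   size_t capacity;
};

/* Append one element. realloc may move the whole array, which invalidates
 * every pointer previously taken into arr->data. Returns 0 on success. */
static int
obj_array_push(struct obj_array *arr, int id)
{
   if (arr->count == arr->capacity) {
      size_t new_cap = arr->capacity ? arr->capacity * 2 : 4;
      struct obj *grown = realloc(arr->data, new_cap * sizeof(*grown));
      if (!grown)
         return -1;
      arr->data = grown;
      arr->capacity = new_cap;
   }
   arr->data[arr->count++].id = id;
   return 0;
}

/* Build the pointer list in a second pass, once no more pushes will happen,
 * so every stored address refers to the final location of its element.
 * The caller frees the returned array. */
static struct obj **
obj_array_collect_ptrs(struct obj_array *arr)
{
   struct obj **ptrs = malloc(arr->count * sizeof(*ptrs));
   if (!ptrs)
      return NULL;
   for (size_t i = 0; i < arr->count; i++)
      ptrs[i] = &arr->data[i];
   return ptrs;
}
```

Storing indices instead of pointers would be another option when the consumer can accept them.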
Lionel Landwerlin	ae76bba34a	anv: bump pool bucket max allocation size Age of Empires IV generates a shader of ~2.3Mb on DG2 which is above the limit we currently have. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19669>	2022-11-12 21:40:34 +02:00
Lionel Landwerlin	bdf680cd3f	intel/fs: use nir_opt_ray_query_ranges Results on DG2 q2rtx shaders: Totals from 6 (12.24% of 49) affected shaders: Instrs: 88927 -> 54088 (-39.18%) Cycles: 4115088 -> 2536902 (-38.35%) Send messages: 2639 -> 1609 (-39.03%) Spill count: 1321 -> 613 (-53.60%) Fill count: 3130 -> 1104 (-64.73%) Scratch Memory Size: 22528 -> 18432 (-18.18%) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16593>	2022-11-11 15:17:08 +00:00
Mark Collins	086b50078d	common/utrace: Rename `u_trace_context_actively_tracing` to `u_trace_should_process` Signed-off-by: Mark Collins <mark@igalia.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Ack-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18271>	2022-11-11 13:50:56 +00:00
Tapani Pälli	0d85a0d7cd	anv: remove dg2 condition for Wa_22011440098 We need same workaround for MTL. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19636>	2022-11-11 10:38:24 +00:00
Tapani Pälli	ecd4517560	anv: setup stage bitmask for Wa_22011440098 Fixes: `40b66a4499` ("anv, iris: Add Wa_22011440098 for DG2") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19636>	2022-11-11 10:38:24 +00:00
Lionel Landwerlin	4ceaed7839	anv: split internal surface states from descriptors On Intel HW we use the same mechanism for internal operations surfaces as well as application surfaces (VkDescriptor). This change splits the surface pool in 2, one part dedicated to internal allocations, the other to application VkDescriptors. To do so, the STATE_BASE_ADDRESS::SurfaceStateBaseAddress points to a 4Gb area, with the following layout : - 1Gb of binding table pool - 2Gb of internal surface states - 1Gb of bindless surface states That way any entry from the binding table can refer to both internal & bindless surface states but none of the driver allocations interfere with the allocation of the application. Based off a change from Sviatoslav Peleshko. v2: Allocate image view null surface state from bindless heap (Sviatoslav) Removed debug stuff (Sviatoslav) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7110 Cc: mesa-stable Tested-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19275>	2022-11-11 10:13:27 +00:00
Dylan Baker	41a929d94c	util/glsl2spirv: pass path to glslangValidator into the script This allows users to override the location of glslang using normal meson mechanisms. Reviewed-by: Luis Felipe Strano Moraes <luis.strano@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19449>	2022-11-10 21:14:17 +00:00
Caio Oliveira	ecc2dfc503	intel/compiler: Use std::unique_ptr for tracking the fs_visitors Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19605>	2022-11-10 18:01:52 +00:00
Lionel Landwerlin	68fd9d2829	anv: fixup invalid enum for nir environment Also switching away from PIPE_ Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8c4c4c3ee1` ("anv: Add softtp64 workaround") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19638>	2022-11-10 14:51:32 +00:00
Lionel Landwerlin	b499a27d74	nir: make ray query load values visible in NIR prints Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19641>	2022-11-10 14:40:08 +02:00
Jason Ekstrand	4d63beaae6	hasvk: Switch to common code for command buffer lifecycles This gets us command buffer object recycling. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18383>	2022-11-10 11:15:23 +00:00
Jason Ekstrand	415bf88637	anv: Switch to common code for command buffer lifecycles This gets us command buffer object recycling. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18383>	2022-11-10 11:15:23 +00:00
Emma Anholt	74bbeb5116	ci/iris: Add some flakes from the new testing on JSL. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19628>	2022-11-09 22:07:10 +00:00
Ian Romanick	351b8c6aec	intel/fs: Enable nir_op_imul_32x16 and nir_op_umul_32x16 on pre-Gfx7 Even though Intel's CI doesn't test these old platforms anymore, the validation added in "intel/eu/validate: Validate integer multiplication source size restrictions" combined with full shader-db runs gives me confidence in the changes. Sandy Bridge total instructions in shared programs: 13902341 -> 13902167 (<.01%) instructions in affected programs: 30771 -> 30597 (-0.57%) helped: 66 / HURT: 0 total cycles in shared programs: 741795500 -> 741791931 (<.01%) cycles in affected programs: 987602 -> 984033 (-0.36%) helped: 28 / HURT: 5 Iron Lake total instructions in shared programs: 8365806 -> 8365754 (<.01%) instructions in affected programs: 1766 -> 1714 (-2.94%) helped: 10 / HURT: 0 total cycles in shared programs: 248542694 -> 248542378 (<.01%) cycles in affected programs: 29836 -> 29520 (-1.06%) helped: 9 / HURT: 0 GM45 total instructions in shared programs: 5187127 -> 5187101 (<.01%) instructions in affected programs: 891 -> 865 (-2.92%) helped: 5 / HURT: 0 total cycles in shared programs: 163643914 -> 163643750 (<.01%) cycles in affected programs: 22206 -> 22042 (-0.74%) helped: 5 / HURT: 0 Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Ian Romanick	293ad13e3f	intel/fs: Slightly restructure emitting nir_op_imul_32x16 and nir_op_umul_32x16 There are no immediate values at this point, so all of this code was bunk. :face_palm: Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Ian Romanick	ee2a299661	intel/eu/validate: Validate integer multiplication source size restrictions v2: Expect correct result on BDW in test_eu. v3: Fix SNB type-size check. Noticed by Marcin. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Ian Romanick	d668512f88	intel/compiler: Fix signed integer range analysis of imax and imin Some review feedback of an earlier commit caused me to rearrange some code quite a bit. I wasn't paying enough attention while applying the later commits, and these breaks should have been returns. As it is, the result of the imin or imax analysis is overwritten by the default case handling... effectively the original commit does nothing. :( Tiger Lake and Ice Lake had similar results. (Ice Lake shown) total instructions in shared programs: 19914090 -> 19904772 (-0.05%) instructions in affected programs: 121258 -> 111940 (-7.68%) helped: 445 / HURT: 0 total cycles in shared programs: 855291535 -> 855266659 (<.01%) cycles in affected programs: 2737005 -> 2712129 (-0.91%) helped: 426 / HURT: 17 LOST: 0 GAINED: 3 Skylake and Broadwell had similar results. (Skylake shown) total cycles in shared programs: 842395356 -> 842338259 (<.01%) cycles in affected programs: 5460985 -> 5403888 (-1.05%) helped: 458 / HURT: 0 Haswell and Ivy Bridge had similar results. (Haswell shown) total instructions in shared programs: 16710449 -> 16708449 (-0.01%) instructions in affected programs: 44101 -> 42101 (-4.54%) helped: 75 / HURT: 0 total cycles in shared programs: 882760230 -> 882727923 (<.01%) cycles in affected programs: 2867797 -> 2835490 (-1.13%) helped: 62 / HURT: 10 No shader-db change on any other Intel platform. No fossil-db changes on any Intel platform. Fixes: `5ec75ca10d` ("intel/compiler: Teach signed integer range analysis about imax and imin") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19602>	2022-11-09 21:34:26 +00:00
Jason Ekstrand	25c180b509	intel: Don't cross DWORD boundaries with byte scratch load/store The back-end swizzles dwords so that our indirect scratch messages match the memory layout of spill/fill messages for better cache coherency. The swizzle happens at a DWORD granularity. If a read or write crosses a DWORD boundary, the first bit will get correctly swizzled but whatever piece lands in the next dword will not because the scatter instructions assume sequential addresses for all bytes. For DWORD writes, this is handled naturally as part of scalarizing. For smaller writes, we need to be sure that a single write never escapes a dword. Fixes: `fd04f858b0` ("intel/nir: Don't try to emit vector load_scratch instructions") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7364 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19580>	2022-11-09 19:45:10 +00:00
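The constraint that a sub-DWORD access must never escape its DWORD can be sketched as a splitting loop. This is an illustration only and not the lowering code from the commit: `split_at_dword_boundaries` is a hypothetical helper, and real hardware messages restrict the allowed access sizes further than this sketch does.

```c
#include <stdint.h>

/* Hypothetical helper: split the byte range [offset, offset + size) into
 * chunks that each stay inside a single 4-byte DWORD. Chunk lengths are
 * written to chunks[]; the number of chunks produced is returned. */
static uint32_t
split_at_dword_boundaries(uint32_t offset, uint32_t size,
                          uint32_t *chunks, uint32_t max_chunks)
{
   uint32_t n = 0;

   while (size > 0 && n < max_chunks) {
      /* Bytes left before the next DWORD boundary (1 to 4). */
      uint32_t room = 4 - (offset & 3);
      uint32_t len = size < room ? size : room;

      chunks[n++] = len;
      offset += len;
      size -= len;
   }

   return n;
}
```

For example, a 2-byte access at byte offset 3 becomes two 1-byte chunks, one in each DWORD, so each piece can be swizzled independently.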
Jason Ekstrand	85685cf932	intel/lower_mem_access_bit_sizes: Compute alignments automatically Because dup_mem_intrinsic() retains the SSA offset from the original intrinsic and only modifies it by adding a constant, we can compute the alignment based on the original alignment and the constant offset. This is both easier and more accurate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19580>	2022-11-09 19:45:10 +00:00
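The arithmetic behind deriving an alignment from the original alignment plus a constant offset can be sketched as follows. The sketch assumes the alignment is tracked as a pair (`mul`, `offset`) meaning "address ≡ offset (mod mul)" with `mul` a power of two. `struct alignment`, `align_add_const` and `align_effective` are hypothetical names, not the functions touched by the commit.

```c
#include <stdint.h>

/* Hypothetical representation: address % mul == offset, where mul is a
 * power of two and offset < mul. */
struct alignment {
   uint32_t mul;
   uint32_t offset;
};

/* Alignment of (address + c) when the alignment of address is known. */
static struct alignment
align_add_const(struct alignment a, uint32_t c)
{
   a.offset = (a.offset + c) % a.mul;
   return a;
}

/* Largest power of two guaranteed to divide the address. */
static uint32_t
align_effective(struct alignment a)
{
   if (a.offset == 0)
      return a.mul;

   /* Lowest set bit of the offset. */
   return a.offset & (~a.offset + 1);
}
```

Because only a constant is added to the original SSA offset, no information about the base address is lost, which is why this is more accurate than re-deriving the alignment from scratch.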
Lionel Landwerlin	97b3dd34c1	anv: fix missing VkPhysicalDeviceExtendedDynamicState3PropertiesEXT handling Fixes: `13c422e1b2` ("anv: toggle on EXT_extended_dynamic_state3") Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19573>	2022-11-08 15:28:57 +00:00
Caio Oliveira	22d8ed84b8	intel/compiler: Remove unused fs_visitor::emit_percomp() Since `7ef7738a61` ("i965: Write gl_FragCoord directly to the destination.") this is not used. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:09 +00:00
Caio Oliveira	90861e6fea	intel/compiler: Remove various unused function declarations Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:08 +00:00
Caio Oliveira	48506a9029	intel/compiler: Remove unused data members Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19586>	2022-11-08 07:33:08 +00:00
Ian Romanick	9abeb3d739	intel/fs: Optimize integer multiplication of large constants by factoring Many Intel platforms can only perform 32x16 bit multiplication. The straightforward way to implement 32x32 bit multiplications is by splitting one of the operands into high and low parts called H and L, respectively. The full multiplication can be implemented as: ((A * H) << 16) + (A * L) On Intel platforms, special register accesses can be used to eliminate the shift operation. This results in three instructions and a temporary register for most values. If H or L is 1, then one (or both) of the multiplications will later be eliminated. On some platforms it may be possible to eliminate the multiplication when H is 256. If L is zero (note that H cannot be zero), one of the multiplications will also be eliminated. Instead of splitting the operand into high and low parts, it may be possible to factor the operand into two 16-bit factors X and Y. The original multiplication can be replaced with (A * (X * Y)) = ((A * X) * Y). This requires two instructions without a temporary register. I may have gone a bit overboard with optimizing the factorization routine. It was a fun brainteaser, and I couldn't put it down. :) On my 1.3GHz Ice Lake, a standalone test could chug through 1,000,000 randomly selected values in about 5.7 seconds. This is about 9x the performance of the obvious, straightforward implementation that I started with. v2: Drop an unnecessary return. Rearrange logic slightly and rename variables in factor_uint32 to better match the names used in the large comment. Both suggested by Caio. Rearrange logic to avoid possibly using `a` uninitialized. Noticed by Marcin. v3: Use DIV_ROUND_UP instead of open coding it. Noticed by Caio. Tiger Lake, Ice Lake, Haswell, and Ivy Bridge had similar results. (Ice Lake shown) total instructions in shared programs: 19912558 -> 19912526 (<.01%) instructions in affected programs: 3432 -> 3400 (-0.93%) helped: 10 / HURT: 0 total cycles in shared programs: 856413218 -> 856412810 (<.01%) cycles in affected programs: 122032 -> 121624 (-0.33%) helped: 9 / HURT: 0 No shader-db changes on any other Intel platforms. Tiger Lake and Ice Lake had similar results. (Ice Lake shown) Instructions in all programs: 141997227 -> 141996923 (-0.0%) Instructions helped: 71 Cycles in all programs: 9162524757 -> 9162523886 (-0.0%) Cycles helped: 63 Cycles hurt: 5 No fossil-db changes on any other Intel platforms. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00
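The two strategies in this message, splitting a constant into H and L versus factoring it into X and Y, can be checked with ordinary 32-bit arithmetic. The sketch below is an illustration under assumptions: `mul_split` and `factor_u32_naive` are hypothetical names, and the factor search is the obvious trial-division approach, not the optimized `factor_uint32` routine the commit describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* A * B built from two 32x16 products: ((A * H) << 16) + (A * L),
 * where H and L are the high and low 16 bits of B. All arithmetic
 * wraps modulo 2^32, matching a 32-bit multiply. */
static uint32_t
mul_split(uint32_t a, uint32_t b)
{
   uint32_t h = b >> 16;
   uint32_t l = b & 0xffff;

   return ((a * h) << 16) + (a * l);
}

/* Naive search for value == x * y with both factors fitting in 16 bits.
 * Returns false when no such pair exists (for example, when value has a
 * prime factor larger than 0xffff). */
static bool
factor_u32_naive(uint32_t value, uint32_t *x, uint32_t *y)
{
   for (uint32_t d = 2; (uint64_t)d * d <= value; d++) {
      if (value % d == 0 && value / d <= 0xffff) {
         *x = value / d;
         *y = d;
         return true;
      }
   }

   return false;
}
```

When a factorization exists, `(a * x) * y` gives the same 32-bit result as `a * value` using two 32x16 multiplies and no temporary for the shifted partial product.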
Ian Romanick	5ec75ca10d	intel/compiler: Teach signed integer range analysis about imax and imin This is especially helpful for a*isign(a) generated by idiv_by_const optimization. On many GPUs, isign(a) is lowered to imax(imin(a, 1), -1). There are no changes on fossil-db because ANV uses a different optimization path for idiv with a constant denominator. A future MR will change this. NOTE: This commit used to help a few hundred shader-db shaders, but now none are affected. I suspect this is due to some change in the idiv_by_const optimization. This could possibly be dropped. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>	2022-11-08 00:02:16 +00:00

8659 Commits