AlexIndustrial/mesa

Author	SHA1	Message	Date
Caio Oliveira	1cdc4be14b	intel/compiler: Don't allocate memory for SIMD select error handling The position in the error array already indicates the SIMD in question, so take off all the formatted printing from the errors -- which in some cases were just not needed. We lose a little bit of extra context but it is all easily derivable from the message and the SIMD. This also will remove the overhead when SIMD selection is being used just to find the selected dispatch width -- at a point where the shaders were already compiled -- and the errors are not used at all. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9849 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25336>	2023-09-22 16:23:02 +00:00
Alyssa Rosenzweig	d1eb17e92e	treewide: Drop nir_ssa_for_src users Via Coccinelle patch: @@ expression b, s, n; @@ -nir_ssa_for_src(b, *s, n) +s->ssa @@ expression b, s, n; @@ -nir_ssa_for_src(b, s, n) +s.ssa Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25247>	2023-09-18 10:25:17 -04:00
Iván Briano	f1bc58cb7b	intel/fs: use ffsll so we don't explode on 32 bits Fixes: `b200e5765c` ("anv: use a simpler MUE layout for fast linked libraries") Tested-by: Mark Janes <markjanes@swizzler.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25192>	2023-09-12 22:42:38 +00:00
Iván Briano	b200e5765c	anv: use a simpler MUE layout for fast linked libraries The compaction introduced in `a252123363` ("intel/compiler/mesh: compactify MUE layout") is not suitable for the case where graphics pipeline libraries are fast linked, as the fragment shader won't receive the mue_map to know where to locate its inputs. For that case, keep doing what we did before and lay things down in the order varyings are defined, which is also how it works for the non-mesh case. Fixes dEQP-VK.fragment_shading_rate.fast_linked_library.ms Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25047>	2023-09-12 02:51:31 +00:00
Alyssa Rosenzweig	f80c57c38f	treewide: Use nir_before/after_impl for more elaborate cases Via Coccinelle patch: @@ expression func_impl; @@ -nir_before_block(nir_start_block(func_impl)) +nir_before_impl(func_impl) @@ expression func_impl; @@ -nir_after_block(nir_impl_last_block(func_impl)) +nir_after_impl(func_impl) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24910>	2023-08-30 19:30:58 +00:00
Alyssa Rosenzweig	cda1961835	treewide: Also handle struct nir_builder form Via Coccinelle patch: @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(struct nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); \| - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x \| -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) \| -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) \| -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Alyssa Rosenzweig	465b138f01	treewide: Use nir_shader_intrinsic_pass sometimes This converts a lot of trivial passes. Nice boilerplate deletion. Via Coccinelle patch (with a small manual fix-up for panfrost where coccinelle got confused by genxml + ninja clang-format squashed in, and for Zink because my semantic patch was slightly buggy). @def@ typedef bool; typedef nir_builder; typedef nir_instr; typedef nir_def; identifier fn, instr, intr, x, builder, data; @@ static fn(nir_builder* builder, -nir_instr instr, +nir_intrinsic_instr intr, ...) { ( - if (instr->type != nir_instr_type_intrinsic) - return false; - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); \| - nir_intrinsic_instr intr = nir_instr_as_intrinsic(instr); - if (instr->type != nir_instr_type_intrinsic) - return false; ) <... ( -instr->x +intr->instr.x \| -instr +&intr->instr ) ...> } @pass depends on def@ identifier def.fn; expression shader, progress; @@ ( -nir_shader_instructions_pass(shader, fn, +nir_shader_intrinsics_pass(shader, fn, ...) \| -NIR_PASS_V(shader, nir_shader_instructions_pass, fn, +NIR_PASS_V(shader, nir_shader_intrinsics_pass, fn, ...) \| -NIR_PASS(progress, shader, nir_shader_instructions_pass, fn, +NIR_PASS(progress, shader, nir_shader_intrinsics_pass, fn, ...) ) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24852>	2023-08-24 15:48:02 +00:00
Faith Ekstrand	b5d6b7c402	nir: Drop most uses of nir_instr_rewrite_src() Generated by the following semantic patch: @@ expression I, S, D; @@ -nir_instr_rewrite_src(I, S, nir_src_for_ssa(D)); +nir_src_rewrite(S, D); Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24729>	2023-08-18 01:00:15 +00:00
Faith Ekstrand	b781dd6200	nir s/nir_get_ssa_scalar/nir_get_scalar/ Generated with sed: sed -i -e 's/nir_get_ssa_scalar/nir_get_scalar/g' src/*/.h src/*/.c src/*/.cpp Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24703>	2023-08-15 17:44:27 +00:00
Faith Ekstrand	4695bebc79	nir: Drop nir_dest Instead, we replace every use of it with nir_def. Most of this commit was generated by sed: sed -i -e 's/dest.ssa/def/g' src/*/.h src/*/.c src/*/.cpp A few manual fixups were required in lima and the nir_legacy code. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	9d81f13a75	nir: Get rid of nir_dest_num_components() We could add a nir_def_num_components() helper but we use ssa.num_components about 3x as often as nir_dest_num_components() today so that's a major Coccinelle refactor anyway and this doesn't make it much worse. Most of this commit was generated by the following semantic patch: @@ expression D; @@ <... -nir_dest_num_components(D) +D.ssa.num_components ... Some manual fixup was needed, especially in cpp files where Coccinelle tends to give up the moment it sees any interesting C++. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	80a1836d8b	nir: Get rid of nir_dest_bit_size() We could add a nir_def_bit_size() helper but we use ->bit_size about 3x as often as nir_dest_bit_size() today so that's a major Coccinelle refactor anyway and this doesn't make it much worse. Most of this commit was generated by the following semantic patch: @@ expression D; @@ <... -nir_dest_bit_size(D) +D.ssa.bit_size ... Some manual fixup was needed, especially in cpp files where Coccinelle tends to give up the moment it sees any interesting C++. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Faith Ekstrand	ce8b157b94	intel/fs: Stop passing around nir_dest and nir_alu_dest We want to get rid of nir_dest so back-ends need to stop storing it in structs and passing it through helpers. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24674>	2023-08-14 21:22:53 +00:00
Alyssa Rosenzweig	09d31922de	nir: Drop "SSA" from NIR language Everything is SSA now. sed -e 's/nir_ssa_def/nir_def/g' \ -e 's/nir_ssa_undef/nir_undef/g' \ -e 's/nir_ssa_scalar/nir_scalar/g' \ -e 's/nir_src_rewrite_ssa/nir_src_rewrite/g' \ -e 's/nir_gather_ssa_types/nir_gather_types/g' \ -i $(git grep -l nir \| grep -v relnotes) git mv src/compiler/nir/nir_gather_ssa_types.c \ src/compiler/nir/nir_gather_types.c ninja -C build/ clang-format cd src/compiler/nir && find .c .h -type f -exec clang-format -i \{} \; Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24585>	2023-08-12 16:44:41 -04:00
Lionel Landwerlin	9934613c74	anv/hasvk: track robustness per pipeline stage And split them into UBO and SSBO v2 (Lionel): - Get rid of robustness fields in anv_shader_bin v3 (Lionel): - Do not pass unused parameters around Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17545>	2023-08-09 09:00:12 +03:00
Alyssa Rosenzweig	11fc4f969c	intel: Collapse is_ssa checks Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:29 +00:00
Alyssa Rosenzweig	95e3df39c0	treewide: sed out more is_ssa Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24432>	2023-08-03 22:40:28 +00:00
Alyssa Rosenzweig	51db19f7a2	nir: Rename scoped_barrier -> barrier sed + ninja clang-format + fix up spacing for common code. If you are unhappy that I did not manually change the whitespace of your driver, you need to enable clang-format for it so the formatting would happen automatically. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>	2023-08-01 23:18:29 +00:00
Iván Briano	377c2a045f	intel/compiler: call brw_nir_adjust_payload from brw_postprocess_nir Calling anything after nir_trivialize_registers() risks undoing some of its work. In this case, brw_nir_adjust_payload() will do a constant folding pass if any payload adjusting happened, and that can turn a bunch of @store_regs into basically noops. Fixes dEQP-VK.subgroups.*task Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>	2023-07-25 22:48:09 +00:00
Marcin Ślusarz	48885c7fe3	intel/compiler: load debug mesh compaction options once Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	c1685f08dd	intel/compiler,anv: put some vertex and primitive data in headers Both per-primitive and per-vertex space is allocated in MUE in 8 dword chunks and those 8-dword chunks (granularity of 3DSTATE_SBE_MESH.Per[Primitive\|Vertex]URBEntryOutputReadLength) are passed to fragment shaders as inputs (either non-interpolated for per-primitive and flat vertex attributes or interpolated for non-flat vertex attributes). Some attributes have a special meaning and must be placed in separate 8/16-dword slot called Primitive Header or Vertex Header. Primitive Header contains 4 such attributes (Cull Primitive, ViewportIndex, RTAIndex, CPS), leaving 4 dwords (the rest of 8-dword slot) potentially unused. Vertex Header is similar - it starts with 3 unused dwords, 1 dword for Point Size (but if we declare that shader doesn't produce Point Size then we can reuse it), followed by 4 dwords for Position and optionally 8 dwords for clip distances. This means we have an interesting optimization problem - we can put some user attributes into holes in Primitive and Vertex Headers, which may lead to smaller MUE size and potentially more mesh threads running in parallel, but we have to be careful to use those holes only when we need it, otherwise we could force HW to pass too much data to fragment shader. Example 1: Let's assume that Primitive Header is enabled and user defined 12 dwords of per-primitive attributes. Without packing we would consume 8 + ALIGN(12, 8) = 24 dwords of MUE space and pass ALIGN(12, 8) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(12 - 4, 8) = 16 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(12 - 4, 8) = 16 dwords to fragment shader. 16/16 is better than 24/16, so packing makes sense. Example 2: Now let's assume that Primitive Header is enabled and user defined 16 dwords of per-primitive attributes. 
Without packing we would consume 8 + ALIGN(16, 8) = 24 dwords of MUE space and pass ALIGN(16, 8) = 16 dwords to fragment shader. With packing, we'll consume 4 + 4 + ALIGN(16 - 4, 8) = 24 dwords of MUE space and pass ALIGN(4, 8) + ALIGN(16 - 4, 8) = 24 dwords to fragment shader. 24/24 is worse than 24/16, so packing doesn't make sense. This change doesn't affect vk_meshlet_cadscene in default configuration, but it speeds it up by up to 25% with "-extraattributes N", where N is some small value divisible by 2 (by default N == 1) and we are bound by URB size. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Marcin Ślusarz	a252123363	intel/compiler/mesh: compactify MUE layout Instead of using 4 dwords for each output slot, use only the amount of memory actually needed by each variable. There are some complications from this "obvious" idea: - flat and non-flat variables can't be merged into the same vec4 slot, because flat inputs mask has vec4 stride - multi-slot variables can have different layout: float[N] requires N 1-dword slots, but i64vec3 requires 1 fully occupied 4-dword slot followed by 2-dword slot - some output variables occur both in single-channel/component split and combined variants - crossing vec4 boundary requires generating more writes, so avoiding them if possible is beneficial This patch fixes some issues with arrays in per-vertex and per-primitive data (func.mesh.ext.outputs.*.indirect_array.q0 in crucible) and by reduction in single MUE size it allows spawning more threads at the same time. Note: this patch doesn't improve vk_meshlet_cadscene performance because default layout is already optimal enough. Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20407>	2023-07-24 07:55:29 +00:00
Lionel Landwerlin	3384f029be	intel/compiler: rework input parameters Use a struct for various common parameters rather than per stage structure or arguments to stage specific entrypoints. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23942>	2023-07-20 09:08:08 +00:00
Marcin Ślusarz	36ff6c0004	intel/compiler: remove NV_mesh_shader support Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24071>	2023-07-14 08:27:14 +00:00
Marcin Ślusarz	7ed9ec70c0	intel/compiler: simplify reading of gl_NumWorkGroups in task/mesh Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22334>	2023-07-04 09:15:08 +00:00
Konstantin Seurer	05269047d3	intel: Use nir_builder_at Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23883>	2023-07-03 15:21:38 +00:00
Yonggang Luo	68b8aa788d	intel/compiler: Switch to use nir_foreach_function_impl Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23920>	2023-06-29 11:29:54 +00:00
Alyssa Rosenzweig	173b9ee69a	treewide: Use nir_builder_create more perl -p0e 's/nir_builder_init\(&([^,]*), /\1 = nir_builder_create(/g' -i $(git grep -l nir_builder_init) Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23860>	2023-06-27 18:13:02 +00:00
Caio Oliveira	fde8bf7b7f	intel/compiler: Respect NIR_DEBUG_PRINT_INTERNAL flag If flag is not set, don't print debugging information for internal shaders. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23756>	2023-06-21 00:01:10 +00:00
Caio Oliveira	59cc77f0fa	compiler: Move from nir_scope to mesa_scope Just moving the enum and performing renames, no behavior change. Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23328>	2023-06-19 23:29:26 +00:00
Erik Faye-Lund	6d142078bc	nir: use generated immediate comparison helpers This makes the code a bit less verbose, so let's use the helpers. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23393>	2023-06-05 13:40:08 +00:00
Erik Faye-Lund	28b1c5bca1	nir: use nir_i{ne,eq}_imm helpers We already have these, so let's use them more. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23393>	2023-06-05 13:40:07 +00:00
Rohan Garg	a15cc833f9	intel: drop unused is_scalar function parameter in brw_nir_apply_key Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Rohan Garg	212810ac8a	intel: infer scalar'ness locally for brw_postprocess_nir Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23098>	2023-05-18 15:46:06 +02:00
Lionel Landwerlin	09cdb77a92	intel/fs: report max register pressure in shader stats Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21756>	2023-03-08 13:37:07 +00:00
Marcin Ślusarz	e29a964d02	intel/compiler/mesh: follow the type of offset variable This allows copy propagation to kick in, decreasing the overall number of generated instructions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21098>	2023-02-21 11:10:24 +00:00
Marcin Ślusarz	15afb8dcc6	intel/compiler/mesh: apply URB payload mask once per program Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21098>	2023-02-21 11:10:23 +00:00
Marcin Ślusarz	dd9bf86725	intel/compiler/mesh: use slice id of task urb handles in mesh shaders When mesh shader is spawned on a different slice than the originating task shader, then input task urb handle can come from a different slice, so masking this information off will load data from the current slice, instead of the one where real data are. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21007>	2023-02-14 09:36:53 +00:00
Marcin Ślusarz	465c241266	intel/compiler/mesh: use U888X packed index format Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20910>	2023-02-10 21:03:33 +00:00
Lionel Landwerlin	ebc4893947	intel/fs: fix mesh indirect movs The size in src[2] is in bytes and needs to cover any possible data accessed in src[0] by the indirection. That way the register allocation is aware of what cannot be spilled for the instruction to execute on valid data. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `70ace2bbcd` ("intel/compiler: Implement Task Output and Mesh Input") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21188>	2023-02-09 15:35:55 +00:00
Marcin Ślusarz	af9e2b8bf1	intel/compiler/mesh: remove dead code path supporting >4 dword writes Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20858>	2023-01-31 18:28:21 +00:00
Marcin Ślusarz	be82ed28f0	intel/compiler/mesh: support longer write messages Allowing longer writes reduces the number of send messages needed to support unaligned 4-component writes. Note: nothing currently generates 8-component writes, so this change makes "second_mask" code path in emit_urb_direct_writes and emit_urb_indirect_writes_mod dead. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20858>	2023-01-31 18:28:21 +00:00
Marcin Ślusarz	3131c2fc7a	intel/compiler/mesh: optimize indirect writes Our hardware requires that we write to URB using full vec4s at aligned addresses. It gives us an ability to mask-off dwords within vec4 we don't want to write, but we have to know their positions at compile time. Let's assume that: - V represents one dword we want to write - ? is an uninitialized value - "\|" is a vec4 boundary. When we want to write 2-dword value at offset 0 we generate 1 write message: \| V1 V2 ? ? \| with mask: \| 1 1 0 0 \| When we want to write 4-dword value at offset 2 we generate 2 write messages: \| ? ? V1 V2 \| V3 V4 ? ? \| with mask: \| 0 0 1 1 \| 1 1 0 0 \| However if we don't know the offset within vec4 at compile time we currently generate 4 write messages: \| V1 V1 V1 V1 \| \| 0 0 1 0 \| \| V2 V2 V2 V2 \| \| 0 0 0 1 \| \| V3 V3 V3 V3 \| \| 1 0 0 0 \| \| V4 V4 V4 V4 \| \| 0 1 0 0 \| where masks are determined at run time. This is quite wasteful and slow. However, if we could determine the offset modulo 4 statically at compile time, we could generate only 1 or 2 write messages (1 if modulo is 0) instead of 4. This is what this patch does: it analyzes the addressing expression for modulo 4 value and if it can determine it at compile time, we generate 1 or 2 writes, and if it can't we fall back to the old 4 writes method. In mesh shader, the value of offset modulo 4 should be known for all outputs, with an exception of primitive indices. The modulo value should be known because of MUE layout restrictions, which require that user per-primitive and per-vertex data start at address aligned to 8 dwords and we should statically always know the offset from this base. There can be some cases where the offset from the base is more dynamic (e.g. indirect array access inside a per-vertex value), so we always do the analysis. Primitive indices are an exception, because they form vec3s (for triangles), which means that the offset will not be easy to analyse. 
When U888X index format lands, primitive indices will use only one dword per triangle, which means that we'll always write them using one message. Task shaders don't have any predetermined structure of output memory, so always do the analysis. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20050>	2023-01-31 13:50:08 +00:00
Marcin Ślusarz	536a2acfc2	intel/compiler/mesh: handle const data in task & mesh programs Started showing up when nir_opt_large_constants call was moved in `88756cee8d`. Fixes dEQP-VK.mesh_shader.ext.smoke.monolithic.fullscreen_gradient* Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Fixes: `88756cee8d` ("intel/compiler: Run nir_opt_large_constants before scalarizing consts") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20876>	2023-01-24 14:47:21 +00:00
Marcin Ślusarz	75375233f6	intel/compiler/mesh: extract emit_urb_direct_vec4_write No functional changes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20292>	2022-12-13 13:00:49 +00:00
Marcin Ślusarz	bb93f1bda1	intel/compiler/mesh: extract shared code for offset adjustment No functional changes. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20292>	2022-12-13 13:00:48 +00:00
Marcin Ślusarz	7fbd1dfb18	anv,intel/compiler/mesh: drop lowering of gl_Primitive*IndicesEXT Until U888X index format lands this change shouldn't have any impact on performance. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20292>	2022-12-13 13:00:48 +00:00
Marcin Ślusarz	7809f76fe8	intel/compiler/mesh: align payload size to the size of vec4 This reduces the number of instructions in task shaders when payload size is not aligned to vec4 and payload_in_shared WA is enabled, because nir_lower_task_shader will not need to handle the unaligned size case. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20080>	2022-12-06 16:31:11 +00:00
Marcin Ślusarz	db0e6f9a07	intel/compiler: user payload starts after TUE header & its padding All data written by the user are offset by TUE header size. Without this patch we copy the correct amount of user data, but both "from" and "to" offsets are wrong. Fixes: `37e78803d7` ("intel/compiler: use nir_lower_task_shader pass") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>	2022-12-01 11:19:47 +00:00
Marcin Ślusarz	7aaafaa8ae	intel/compiler: adjust [store\|load]_task_payload.base too Base also needs to be converted from bytes to words. Fixes: `c36ae42e4c` ("intel/compiler: Use nir_var_mem_task_payload") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19409>	2022-12-01 11:19:47 +00:00
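
The arithmetic in Examples 1 and 2 of c1685f08dd ("intel/compiler,anv: put some vertex and primitive data in headers") can be replayed with a short script. This is only a sketch of the numbers quoted in that commit message, not the driver's code: the names align, unpacked and packed and the constants are hypothetical, and it assumes the Primitive Header is enabled with 4 of its 8 dwords free for user attributes.

```python
# Sketch of the MUE size arithmetic from the commit message (all names hypothetical).

CHUNK_DWORDS = 8          # MUE space is allocated in 8-dword chunks
HEADER_DWORDS = 8         # Primitive Header occupies one 8-dword slot
HEADER_FREE_DWORDS = 4    # assumed: 4 dwords used by special attributes, 4 free


def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def unpacked(user_dwords):
    """Return (MUE dwords, dwords passed to the fragment shader) without packing."""
    mue = HEADER_DWORDS + align(user_dwords, CHUNK_DWORDS)
    fs = align(user_dwords, CHUNK_DWORDS)
    return mue, fs


def packed(user_dwords):
    """Return the same pair when user attributes first fill the header's free dwords."""
    in_header = min(user_dwords, HEADER_FREE_DWORDS)
    rest = user_dwords - in_header
    mue = HEADER_DWORDS + align(rest, CHUNK_DWORDS)
    # Once the header holds user data, its whole chunk is passed to the fragment shader.
    fs = (CHUNK_DWORDS if in_header else 0) + align(rest, CHUNK_DWORDS)
    return mue, fs


if __name__ == "__main__":
    for n in (12, 16):
        print(n, "unpacked:", unpacked(n), "packed:", packed(n))
```

For 12 user dwords this gives 24/16 unpacked against 16/16 packed, and for 16 user dwords 24/16 unpacked against 24/24 packed, matching the two examples in the message.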

95 Commits
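
The vec4 write masks described in 3131c2fc7a ("intel/compiler/mesh: optimize indirect writes") can be sketched as follows. This is an illustration of the mask layout in that commit message, assuming the offset modulo 4 is known at compile time; the function name vec4_write_masks is hypothetical and does not come from Mesa.

```python
# Sketch: which dwords of each aligned vec4 get written (names hypothetical).

def vec4_write_masks(offset_dwords, num_dwords):
    """Return one 4-entry mask per write message for a value of num_dwords
    dwords starting at offset_dwords, where writes must cover aligned vec4s."""
    masks = []
    pos = offset_dwords % 4      # position inside the first vec4
    remaining = num_dwords
    while remaining > 0:
        count = min(4 - pos, remaining)
        masks.append([1 if pos <= i < pos + count else 0 for i in range(4)])
        remaining -= count
        pos = 0                  # later vec4s start at their first dword
    return masks


if __name__ == "__main__":
    print(vec4_write_masks(0, 2))  # 2-dword value at offset 0
    print(vec4_write_masks(2, 4))  # 4-dword value at offset 2
```

A 2-dword value at offset 0 yields the single mask 1 1 0 0, and a 4-dword value at offset 2 yields 0 0 1 1 followed by 1 1 0 0, as in the commit message. When the modulo is unknown, the message says the driver falls back to one write per dword with run-time masks, which this sketch does not model.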