Commit Graph

9934 Commits

Author SHA1 Message Date
Alyssa Rosenzweig 17d66055ae nir: Remove reg_intrinsics parameter to convert_from_ssa
All users must set it.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24450>
2023-08-02 10:26:45 -04:00
Alyssa Rosenzweig 51db19f7a2 nir: Rename scoped_barrier -> barrier
sed + ninja clang-format + fix up spacing for common code.

If you are unhappy that I did not manually change the whitespace of your driver,
you need to enable clang-format for it so the formatting would happen
automatically.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24428>
2023-08-01 23:18:29 +00:00
José Roberto de Souza f49989148a anv: Return earlier in anv_reloc_list functions
Xe KMD don't need relocs, so calling a nop function and avoiding the
CPU cycles and memory waste with reloc.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24411>
2023-08-01 22:30:04 +00:00
José Roberto de Souza d9d284d050 anv: Remove VkAllocationCallbacks parameter from reloc functions
Mismatch allocator could cause bad things, so better set the allocator
on anv_reloc_list_init() and use it in every reloc function.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24411>
2023-08-01 22:30:04 +00:00
José Roberto de Souza 0584bb450e anv: Nuke unused READ_ONCE() from anv_batch_chain.c
Only genX_cmd_buffer.c makes use of READ_ONCE() but that file also
defines it so it can be removed from anv_batch_chain.c.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24411>
2023-08-01 22:30:04 +00:00
José Roberto de Souza 373019f9ef intel/genxml/gen125: Set MI_MATH MOCS field as non-zero
All MOCS tables have 0 as a invalid value, so this will asssert
in case some place misset to set MI_MATH MOCS field.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:45 +00:00
José Roberto de Souza b0b763d0be intel/tests/mi_builder: Set MI_MATH MOCS field
MOCS = 0 is a invalid MOCS index on MTL, so it is necessary get a
valid value and set to MI_MATH instructions.

So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:45 +00:00
José Roberto de Souza a7ab31b96a anv: Set MI_MATH MOCS field
MOCS = 0 is a invalid MOCS index, so it is necessary get a valid value
and set to MI_MATH instructions.

So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:45 +00:00
José Roberto de Souza d0890a0b8b iris: Set MI_MATH MOCS field
MOCS = 0 is a invalid MOCS index on MTL, so it is necessary get a
valid value and set to MI_MATH instructions.

So here the mocs index is set with mi_builder_set_mocs(), it can be
always set but it is required when mi_build will emit MI_MATH
instructions.
The mocs index will only be stored and used in gfx12.5+ platforms
so no changes were are required in crocus or hasvk.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:44 +00:00
José Roberto de Souza 0233e3639f intel/genxml/gen125: Add missing fields in MI_MATH
BSpec: 53415
BSpec: 45737
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22508>
2023-08-01 19:49:44 +00:00
Faith Ekstrand ae105ad5cd anv: Use the common versions of vkBegin/EndQuery()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24409>
2023-08-01 19:17:05 +00:00
Faith Ekstrand e4485bc062 anv: Use vk_query_pool
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24409>
2023-08-01 19:17:05 +00:00
Faith Ekstrand 1d6d775ffe anv: Use vk_buffer_view
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24409>
2023-08-01 19:17:05 +00:00
Faith Ekstrand 92f996d0fa anv: Use vk_sampler
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24409>
2023-08-01 19:17:05 +00:00
Rohan Garg 7f6e6eb8ec anv: partially revert 2e8b1f6d
set_image_compressed_bit checks for the image aux usage whereas
cmd_buffer_mark_image_written checks for the subresource's aux usage.

Signed-off-by: Rohan Garg <rohan.garg@intel.com>
Fixes: 2e8b1f6d ('anv: drop duplicate checks when setting the compressed bit')
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24363>
2023-07-31 15:06:39 +00:00
Lionel Landwerlin c1c0311d42 anv: enable EDS3 ConservativeRasterizationMode
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24395>
2023-07-31 12:30:37 +00:00
Lionel Landwerlin a0179c32b6 anv: fix 3DSTATE_RASTER::APIMode field setting
The APIMode field is set in the dynamic part in gfx8_cmd_buffer.c

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 55951ac28e ("anv: fix emitting dynamic primitive topology")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24395>
2023-07-31 12:30:37 +00:00
Jordan Justen 5df97c27dc intel/compiler: Use nir SUBGROUP_INVOCATION for RT TOPOLOGY_ID
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21774>
2023-07-28 22:54:59 +00:00
Jordan Justen dbf19b76e8 intel/isl: Use intel_needs_workaround() for MTL CCS WA
Also use parent WA number of 14017240301 instead of 14017353530.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22401>
2023-07-28 15:23:47 -07:00
José Roberto de Souza 2b7599dc49 intel: Rename intel_gem_add_ext() to intel_i915_gem_add_ext()
gem_add_ext() is i915 specific so adding it to the name.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>
2023-07-28 15:36:52 +00:00
José Roberto de Souza c9950786f6 intel/common: Move functions inside of C++ ifdef
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>
2023-07-28 15:36:52 +00:00
José Roberto de Souza 4198a301b3 intel: Move i915_drm.h specific code from common/intel_gem.h to common/i915/intel_gem.h
This allow us to remove one more i915_drm.h include from code shared
by both backends.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>
2023-07-28 15:36:52 +00:00
José Roberto de Souza 1174e7412e intel/dev: Port intel_dev_info tool to Xe KMD
Only hwconfig was calling i915 specifc function, so it was only
necessary split the function that fetches it from backends and call it
from intel_get_and_print_hwconfig_table() depending on the KMD loaded.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23905>
2023-07-28 15:36:52 +00:00
Illia Polishchuk 56e0aff530 anv, drirc: Add workaround to speed up Cyberpunk 2077 reg allocation
Calling the ra_allocate function after each register spill can take
several minutes. This option speeds up shader compilation by spilling
more registers after the ra_allocate failure.Required for
Cyberpunk 2077, which uses a watchdog thread to terminate the process
in case the render thread hasn't responded within 2 minutes.

Execution time of my Cyberpunk2077 shader compilation test:
https://gitlab.freedesktop.org/illia.a.polishchuk/cyberpunk-vulkan-compute-hang-test-anv

Before the patch:

real 1m28,738s
user 1m28,329s
sys 0m0,400s

After the patch

real 0m33,245s
user 32m,835s
sys 0m0,404s

I think it's acceptable patch because Cyberpunk benchmarks has
the same FPS with and without patch. (I started
it without patch with a patched binary with disabled watchdog thread)

Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Requires: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24228
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9241
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>
2023-07-28 14:51:42 +00:00
Jason Ekstrand 739e21fa9a intel/fs: Add a parameter to speed up register spilling
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24299>
2023-07-28 14:51:42 +00:00
Mike Blumenkrantz e68e612826 nir: add a helper for calculating variable slots
this will maybe avoid future bugs, but probably not

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>
2023-07-28 13:14:35 +00:00
Mike Blumenkrantz 59396eefe6 nir: fix slot calculations for compact variables with location_frac
a variable with a component offset may span multiple slots, and this cannot
be inferred from its type alone (e.g., compacted clip+cull distances)

cc: mesa-stable

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24163>
2023-07-28 13:14:35 +00:00
Lionel Landwerlin 87149cc545 blorp: update and move fast clear PIPE_CONTROLs to drivers
Before this patch, when updating the indirect clear color, BLORP only
invalidated the texture cache on gfx11. The hardware docs state that
the texture cache invalidation is also needed on gfx12 however. Add
this invalidation for gfx12 and move the fast-clear related cache
invalidations to the drivers for clarity and performance.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5850
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23588>
2023-07-28 00:07:15 +00:00
Lionel Landwerlin c94bd56114 blorp: switch blorp_update_clear_color to early return
Avoid even going to the function if we don't need to.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23588>
2023-07-28 00:07:15 +00:00
Iván Briano 71ebd9b9d7 anv,hasvk: respect provoking vertex setting on geometry shaders
We need to set the right value on ReorderMode based on the provoking
vertex mode, or the order in which the vertices for tristrip[_adj] are
delivered to the geometry shader doesn't match what Vulkan expects.

Fixes
dEQP-VK.transform_feedback.primitives_generated_query.concurrent.*triangle_strip_with_adjacency*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23243>
2023-07-27 18:52:49 +00:00
Lionel Landwerlin 365b14489d anv: wire image sparse loads
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>
2023-07-27 02:03:02 +03:00
Lionel Landwerlin fe81d40bff intel/nir: add lower for sparse images & textures
We have to lower images into image load + sampler residency.

There is also a restriction on sampler access with a compare, lower
those as 2 sampler instructions to meet the restriction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>
2023-07-27 02:02:59 +03:00
Lionel Landwerlin 300cc829de intel/nir: handle image_sparse_load in storage format lowering
The last component of sparse load is the residency data. We don't want
to touch/convert that value with the format lowering.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>
2023-07-27 02:02:34 +03:00
Lionel Landwerlin d33aff783d intel/fs: add support for sparse accesses
Purely from the backend point of view it's just an additional
parameter to sampler messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23882>
2023-07-27 02:02:30 +03:00
Lionel Landwerlin 50c29e1ffa anv: simplify buffer address+size loads from descriptor buffer
Only found a couple titles that have been helped by this :

 PERCENTAGE DELTAS Shaders   Instrs    Cycles
 cyberpunk_2077    10388     -0.00%    -0.00%
 -----------------------------------------------
 All affected      1         -2.24%    -0.39%
 -----------------------------------------------
 Total             10388     -0.00%    -0.00%

 PERCENTAGE DELTAS    Shaders   Instrs    Cycles
 red_dead_redemption2 5949      -0.10%    -0.00%
 --------------------------------------------------
 All affected         111       -0.74%    -0.14%
 --------------------------------------------------
 Total                5949      -0.10%    -0.00%

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23318>
2023-07-26 09:41:23 +00:00
Lionel Landwerlin f1f58c3bea isl: add ability to store buffer size in unused RENDER_SURFACE_STATE fields
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23318>
2023-07-26 09:41:23 +00:00
Lionel Landwerlin d099e47de0 intel/fs: add more UNDEFs around SEND messages
lower_find_live_channel() in particular is used a lot in control flow
to find the live channel for the surface/sampler handle. Adding UNDEFs
on the temporary registers used for finding the live channels helps
reduce the liveness of those temporary registers, especially in loops.

Some titles affected :

Rise Of The Tomb Raider:
Totals from 2780 (22.58% of 12311) affected shaders:
Instrs: 1294455 -> 1294592 (+0.01%); split: -0.15%, +0.16%
Cycles: 1473136441 -> 1471302617 (-0.12%); split: -1.52%, +1.40%
Max live registers: 144282 -> 143595 (-0.48%)
Max dispatch width: 22200 -> 22232 (+0.14%)

Red Dead Redemption 2:
Totals from 435 (7.28% of 5972) affected shaders:
Instrs: 488472 -> 487594 (-0.18%); split: -0.31%, +0.14%
Cycles: 11354732 -> 11384928 (+0.27%); split: -0.44%, +0.71%
Spill count: 1217 -> 1172 (-3.70%)
Fill count: 3521 -> 3447 (-2.10%)
Scratch Memory Size: 64512 -> 62464 (-3.17%)
Max live registers: 35997 -> 35798 (-0.55%)

Fallout 4:
Totals from 8 (0.49% of 1638) affected shaders:
Instrs: 41908 -> 40509 (-3.34%)
Cycles: 3638464 -> 3555680 (-2.28%); split: -2.67%, +0.39%
Spill count: 717 -> 665 (-7.25%)
Fill count: 2542 -> 2438 (-4.09%)
Scratch Memory Size: 32768 -> 16384 (-50.00%)
Max live registers: 567 -> 534 (-5.82%)

Cyberpunk 2077:
Totals from 2984 (28.97% of 10301) affected shaders:
Instrs: 3888874 -> 3891600 (+0.07%); split: -0.20%, +0.27%
Cycles: 67906489 -> 67767721 (-0.20%); split: -0.68%, +0.47%
Spill count: 200 -> 98 (-51.00%)
Fill count: 237 -> 90 (-62.03%)
Scratch Memory Size: 10240 -> 8192 (-20.00%)
Max live registers: 215715 -> 212727 (-1.39%)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>
2023-07-26 08:48:33 +00:00
Lionel Landwerlin 5c72724819 intel/fs: consider UNDEF as non-partial write
A few titles show max live register reductions, but nothing
significant in instruction count or other stats.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24282>
2023-07-26 08:48:32 +00:00
Lionel Landwerlin d62e494b37 intel/vec4: fix log_data pointer
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3384f029be ("intel/compiler: rework input parameters")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/9421
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24307>
2023-07-26 06:36:18 +00:00
Iván Briano 377c2a045f intel/compiler: call brw_nir_adjust_payload from brw_postprocess_nir
Calling anything after nir_trivialize_registers() risks undoing some of
its work.
In this case, brw_nir_adjust_payload() will do a constant folding pass
if any payload adjusting happened, and that can turn a bunch of
@store_regs into basically noops.

Fixes dEQP-VK.subgroups.*task

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24325>
2023-07-25 22:48:09 +00:00
Ian Romanick cb0de0a1d3 intel/fs: Constant fold OR and AND
The path taken in fs_visitor::swizzle_nir_scratch_addr for DG2 generates
some AND and OR instructions before the SHL. This commit folds those so
the whold calculation becomes a constant (like on older platforms).

v2: Fix return type of src_as_uint. Noticed by Marcin.

shader-db results:

DG2
total instructions in shared programs: 23190475 -> 23179540 (-0.05%)
instructions in affected programs: 36026 -> 25091 (-30.35%)
helped: 7 / HURT: 0

total cycles in shared programs: 841196807 -> 841142563 (<.01%)
cycles in affected programs: 1660670 -> 1606426 (-3.27%)
helped: 7 / HURT: 0

No shader-db changes on any older Intel platforms.

fossil-db results:

DG2
Totals:
Instrs: 197780372 -> 197773966 (-0.00%)
Cycles: 14066410782 -> 14066399378 (-0.00%); split: -0.00%, +0.00%
Subgroup size: 8438104 -> 8438112 (+0.00%)
Send messages: 8049445 -> 8049446 (+0.00%)
Scratch Memory Size: 14263296 -> 14264320 (+0.01%)

Totals from 9 (0.00% of 668055) affected shaders:
Instrs: 24547 -> 18141 (-26.10%)
Cycles: 1984791 -> 1973387 (-0.57%); split: -0.98%, +0.40%
Subgroup size: 88 -> 96 (+9.09%)
Send messages: 867 -> 868 (+0.12%)
Scratch Memory Size: 69632 -> 70656 (+1.47%)

No fossil-db changes on any older Intel platforms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>
2023-07-25 22:11:21 +00:00
Ian Romanick 61c786bad5 intel/fs: Constant fold SHL
This is a modified version of a commit originally in !7698. This version
add the changes to brw_fs_copy_propagation. If the address passed to
fs_visitor::swizzle_nir_scratch_addr is a constant, that function will
generate SHL with two constant sources.

DG2 uses a different path to generate those addresses, so the constant
folding can't occur there yet. That will be addressed in the next
commit.

What follows is the commit change history from that older MR.

v2: Previously this commit was after `intel/fs: Combine constants for
integer instructions too`.  However, this commit can create invalid
instructions that are only cleaned up by `intel/fs: Combine constants
for integer instructions too`.  That would potentially affect the
shader-db results of each commit, but I did not collect new data for
the reordering.

v3: Fix masking for W/UW and for Q/UQ types. Add an assertion for
!saturate. Both suggested by Ken. Also add an assertion that B/UB types
don't matically come back.

v4: Fix sources count. See also ed3c2f73db ("intel/fs: fixup sources
number from opt_algebraic").

v5: Fix typo in comment added in v3. Noticed by Marcin. Fix a typo in a
comment added when pulling this commit out of !7698. Noticed by Ken.

shader-db results:

DG2
No changes.

Tiger Lake, Ice Lake, and Skylake had similar results (Ice Lake shown)
total instructions in shared programs: 20655696 -> 20651648 (-0.02%)
instructions in affected programs: 23125 -> 19077 (-17.50%)
helped: 7 / HURT: 0

total cycles in shared programs: 858436639 -> 858407749 (<.01%)
cycles in affected programs: 8990532 -> 8961642 (-0.32%)
helped: 7 / HURT: 0

Broadwell and Haswell had similar results. (Broadwell shown)
total instructions in shared programs: 18500780 -> 18496630 (-0.02%)
instructions in affected programs: 24715 -> 20565 (-16.79%)
helped: 7 / HURT: 0

total cycles in shared programs: 946100660 -> 946087688 (<.01%)
cycles in affected programs: 5838252 -> 5825280 (-0.22%)
helped: 7 / HURT: 0

total spills in shared programs: 17588 -> 17572 (-0.09%)
spills in affected programs: 1206 -> 1190 (-1.33%)
helped: 2 / HURT: 0

total fills in shared programs: 25192 -> 25156 (-0.14%)
fills in affected programs: 156 -> 120 (-23.08%)
helped: 2 / HURT: 0

No shader-db changes on any older Intel platforms.

fossil-db results:

DG2
Totals:
Instrs: 197780415 -> 197780372 (-0.00%); split: -0.00%, +0.00%
Cycles: 14066412266 -> 14066410782 (-0.00%); split: -0.00%, +0.00%

Totals from 16 (0.00% of 668055) affected shaders:
Instrs: 16420 -> 16377 (-0.26%); split: -0.43%, +0.17%
Cycles: 220133 -> 218649 (-0.67%); split: -0.69%, +0.01%

Tiger Lake, Ice Lake and Skylake had similar results. (Ice Lake shown)
Totals:
Instrs: 153425977 -> 153423678 (-0.00%)
Cycles: 14747928947 -> 14747929547 (+0.00%); split: -0.00%, +0.00%
Subgroup size: 8535968 -> 8535976 (+0.00%)
Send messages: 7697606 -> 7697607 (+0.00%)
Scratch Memory Size: 4380672 -> 4381696 (+0.02%)

Totals from 6 (0.00% of 662749) affected shaders:
Instrs: 13893 -> 11594 (-16.55%)
Cycles: 5386074 -> 5386674 (+0.01%); split: -0.42%, +0.43%
Subgroup size: 80 -> 88 (+10.00%)
Send messages: 675 -> 676 (+0.15%)
Scratch Memory Size: 91136 -> 92160 (+1.12%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>
2023-07-25 22:11:21 +00:00
Ian Romanick 56e6186dcf intel/fs: Always do opt_algebraic after opt_copy_propagation makes progress
opt_copy_propagation can create invalid instructions like

    shl(8) vgrf96:UD, 2d, 8u

These instructions will be cleaned up by opt_algebraic.  The irony is
opt_algebraic converts these to simple mov instructions that
opt_copy_propagation should clean up.  I don't think we want a loop like

   do {
      progress = false;
      if (OPT(opt_copy_propagation)) {
         OPT(opt_algebraic);
         OPT(dead_code_eliminate);
      }
   } while (progress);

But maybe we do?

Maybe this would be sufficient:

   while (OPT(opt_copy_propagation))
      OPT(opt_algebraic);
   OPT(dead_code_eliminate);

No shader-db or fossil-db changes (yet) on any Intel platform.  This is
expected.

v2: Do opt_algebraic immediately after every call to
opt_copy_propagation instead of being clever. Suggested by Lionel.

Tested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23884>
2023-07-25 22:11:21 +00:00
José Roberto de Souza f59d272e93 anv: Request Xe KMD to place BOs to CPU visible VRAM when required
This is required to support discrete GPUs placed in systems with large
PCI bar or resizeble PCI bar not available or disabled.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>
2023-07-25 19:33:16 +00:00
José Roberto de Souza f9fcd7168a intel/dev/xe: Add support for small-bar setups
This adds support for discrete GPUs placed in systems with large PCI
bar or resizeble PCI bar not available or disabled.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23781>
2023-07-25 19:33:15 +00:00
Faith Ekstrand 94f36cfaa3 intel/fs: Assume NIR is in SSA form
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>
2023-07-25 16:25:11 +00:00
Faith Ekstrand 965bbe5286 intel/fs: Rework the overlapping mov/vec case
Now that we're using load/store_reg intrinsics, the previous checks for
registers aren't what we want.  Instead, we need to be looking for a mov
or vec where both the destination and a source are load/store_reg with
matching decl_reg.

Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>
2023-07-25 16:25:11 +00:00
Faith Ekstrand 45ee952efb intel/fs: Use write masks from store_reg intrinsics
Fixes: b8209d69ff ("intel/fs: Add support for new-style registers")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24310>
2023-07-25 16:25:10 +00:00
Marcin Ślusarz 4f1125e4ae intel/compiler/test: fix crashes when TEST_DEBUG is set
Dumping instructions requires that ISA info is not empty.

Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/24274>
2023-07-25 15:13:29 +00:00
Illia Polishchuk c2724b4d37 s/Intel: fix/anv: fix: potentially overflowing expression in genX
CID 1528164 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression
pool->n_passes * pool->khr_perf_preamble_stride with type
unsigned int (32 bits, unsigned) is evaluated using 32-bit arithmetic,
and then used in a context that expects an expression of type uint64_t (64 bits, unsigned).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Illia Polishchuk <illia.a.polishchuk@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20893>
2023-07-25 08:55:56 +00:00