Ian Romanick
97e3c6a12a
intel/brw: Use range analysis to optimize fsign
...
shader-db:
Meteor Lake, DG2, and Tiger Lake had similar results. (Meteor Lake shown)
total instructions in shared programs: 19674784 -> 19665960 (-0.04%)
instructions in affected programs: 933425 -> 924601 (-0.95%)
helped: 3656 / HURT: 0
total cycles in shared programs: 810343919 -> 810241030 (-0.01%)
cycles in affected programs: 56752034 -> 56649145 (-0.18%)
helped: 3032 / HURT: 434
LOST: 11
GAINED: 0
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20315795 -> 20305856 (-0.05%)
instructions in affected programs: 979698 -> 969759 (-1.01%)
helped: 3845 / HURT: 0
total cycles in shared programs: 830600281 -> 830534694 (<.01%)
cycles in affected programs: 45675615 -> 45610028 (-0.14%)
helped: 3250 / HURT: 325
total spills in shared programs: 4583 -> 4565 (-0.39%)
spills in affected programs: 180 -> 162 (-10.00%)
helped: 3 / HURT: 0
total fills in shared programs: 5245 -> 5219 (-0.50%)
fills in affected programs: 379 -> 353 (-6.86%)
helped: 3 / HURT: 0
LOST: 14
GAINED: 8
fossil-db:
All Intel platforms except Tiger Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 154024263 -> 154023814 (-0.00%)
Cycle count: 17463341602 -> 17461726239 (-0.01%); split: -0.01%, +0.00%
Totals from 322 (0.05% of 631440) affected shaders:
Instrs: 199933 -> 199484 (-0.22%)
Cycle count: 168492537 -> 166877174 (-0.96%); split: -0.96%, +0.00%
Tiger Lake
Instrs: 149984723 -> 149984287 (-0.00%)
Cycle count: 15238596937 -> 15239260415 (+0.00%); split: -0.00%, +0.01%
Max dispatch width: 5553408 -> 5553424 (+0.00%)
Totals from 318 (0.05% of 631414) affected shaders:
Instrs: 179624 -> 179188 (-0.24%)
Cycle count: 160724533 -> 161388011 (+0.41%); split: -0.06%, +0.48%
Max dispatch width: 3296 -> 3312 (+0.49%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:21 +00:00
Ian Romanick
e578657313
intel/brw: Implement more strictly correct fsign lowering
...
The huge amount of helped shaders is due to the "~" versions of the
patterns.
shader-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
total instructions in shared programs: 19672345 -> 19662605 (-0.05%)
instructions in affected programs: 1147766 -> 1138026 (-0.85%)
helped: 2691 / HURT: 1650
total cycles in shared programs: 810323688 -> 810145191 (-0.02%)
cycles in affected programs: 68918312 -> 68739815 (-0.26%)
helped: 3651 / HURT: 1832
LOST: 29
GAINED: 38
Tiger Lake
total instructions in shared programs: 19489619 -> 19479909 (-0.05%)
instructions in affected programs: 1124564 -> 1114854 (-0.86%)
helped: 2682 / HURT: 1643
total cycles in shared programs: 811468406 -> 811706747 (0.03%)
cycles in affected programs: 66397690 -> 66636031 (0.36%)
helped: 3692 / HURT: 1775
total spills in shared programs: 3906 -> 3907 (0.03%)
spills in affected programs: 16 -> 17 (6.25%)
helped: 0 / HURT: 1
total fills in shared programs: 3220 -> 3222 (0.06%)
fills in affected programs: 50 -> 52 (4.00%)
helped: 0 / HURT: 1
LOST: 33
GAINED: 36
Ice Lake and Skylake had similar results. (Ice Lake shown)
total instructions in shared programs: 20317882 -> 20307495 (-0.05%)
instructions in affected programs: 1199651 -> 1189264 (-0.87%)
helped: 2863 / HURT: 1680
total cycles in shared programs: 830880024 -> 830457927 (-0.05%)
cycles in affected programs: 63347102 -> 62925005 (-0.67%)
helped: 4118 / HURT: 1622
total spills in shared programs: 4593 -> 4583 (-0.22%)
spills in affected programs: 205 -> 195 (-4.88%)
helped: 4 / HURT: 0
total fills in shared programs: 5284 -> 5245 (-0.74%)
fills in affected programs: 464 -> 425 (-8.41%)
helped: 4 / HURT: 0
LOST: 70
GAINED: 33
fossil-db:
Meteor Lake and DG2 had similar results. (Meteor Lake shown)
Totals:
Instrs: 154025275 -> 154022035 (-0.00%); split: -0.00%, +0.00%
Cycle count: 17472869499 -> 17463289530 (-0.05%); split: -0.06%, +0.00%
Spill count: 141269 -> 141246 (-0.02%); split: -0.02%, +0.00%
Fill count: 265342 -> 265159 (-0.07%); split: -0.11%, +0.04%
Max live registers: 32597829 -> 32597986 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5536776 -> 5537048 (+0.00%)
Totals from 1590 (0.25% of 631423) affected shaders:
Instrs: 1146532 -> 1143292 (-0.28%); split: -0.44%, +0.16%
Cycle count: 1230843330 -> 1221263361 (-0.78%); split: -0.83%, +0.05%
Spill count: 15832 -> 15809 (-0.15%); split: -0.19%, +0.04%
Fill count: 36071 -> 35888 (-0.51%); split: -0.79%, +0.29%
Max live registers: 93529 -> 93686 (+0.17%); split: -0.00%, +0.17%
Max dispatch width: 15168 -> 15440 (+1.79%)
Tiger Lake, Ice Lake, and Skylake had similar results. (Tiger Lake shown)
Totals:
Instrs: 149564084 -> 149562467 (-0.00%); split: -0.00%, +0.00%
Cycle count: 15151701515 -> 15158290114 (+0.04%); split: -0.00%, +0.04%
Max live registers: 32249443 -> 32249620 (+0.00%); split: -0.00%, +0.00%
Max dispatch width: 5540536 -> 5540488 (-0.00%)
Totals from 1605 (0.25% of 630303) affected shaders:
Instrs: 584950 -> 583333 (-0.28%); split: -0.49%, +0.21%
Cycle count: 160926321 -> 167514920 (+4.09%); split: -0.05%, +4.14%
Max live registers: 90851 -> 91028 (+0.19%); split: -0.00%, +0.20%
Max dispatch width: 15440 -> 15392 (-0.31%)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
864268ff0d
intel/brw: Algebraic optimizations for CSEL
...
No shader-db or fossil-db changes on any Intel platform. In this MR, the
only benefit of these changes is to convert some "-a > 0" CSEL
comparisons to "a < 0" for improved readability.
v2: Add integer CSEL support
v3: Use fs_inst::resize_sources and brw_type_is_sint. Both suggested by
Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
033405cd4b
intel/brw: Combine constants and constant propagation for CSEL
...
No shader-db or fossil-db changes on any Intel platform. This ends up
begin helpful in "intel/brw: Use range analysis to optimize fsign."
v2: Add integer CSEL support
v3: Massive simplification (-20 lines!) of constant propagation
logic. Suggested by Ken. Add missing CSEL case in supports_src_as_imm.
Noticed by Ken.
v4: While MAD can mix F and HF sources on some platforms, CSEL
cannot. Found by skqp on TGL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org > [v3]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
504b742b83
intel/brw: Update CSEL source type validation
...
Gfx9 can only have F, but newer GPUs can have F, HF, *D, or *W. The
source and destination types must still match in size.
v2: Simplify the float vs integer logic. Suggested by Ken.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
3f151c03af
intel/brw: Handle fsign optimization in a NIR algebraic pass
...
This is a lot less code, and it makes it easier to experiment with other
pattern-based optimizations in the future.
The results here are nearly identical to the results I got from Ken's
"intel/brw: Make fsign (for 16/32-bit) in SSA form"... which are not
particularly good.
In this commit and in Ken's, all of the shader-db shaders hurt for
spills and fills are from Deus Ex Mankind Divided. Each shader has a
bunch of texture instructions with a single fsign between the
blocks. With the dependency on the flag removed, the scheduler puts all
of the texture instructions at the start... and there are a LOT of them.
shader-db:
All Intel platforms had similar results. (Meteor Lake shown)
total instructions in shared programs: 19647060 -> 19650207 (0.02%)
instructions in affected programs: 734718 -> 737865 (0.43%)
helped: 382 / HURT: 1984
total cycles in shared programs: 823238442 -> 822785913 (-0.05%)
cycles in affected programs: 426901157 -> 426448628 (-0.11%)
helped: 3408 / HURT: 3671
total spills in shared programs: 3887 -> 3891 (0.10%)
spills in affected programs: 256 -> 260 (1.56%)
helped: 0 / HURT: 4
total fills in shared programs: 3236 -> 3306 (2.16%)
fills in affected programs: 882 -> 952 (7.94%)
helped: 0 / HURT: 12
LOST: 37
GAINED: 34
fossil-db:
DG2 and Meteor Lake had similar results. (Meteor Lake shown)
Totals:
Instrs: 154005469 -> 154008294 (+0.00%); split: -0.00%, +0.00%
Cycle count: 17551859277 -> 17554293955 (+0.01%); split: -0.02%, +0.04%
Spill count: 142078 -> 142090 (+0.01%)
Fill count: 266761 -> 266729 (-0.01%); split: -0.02%, +0.01%
Max live registers: 32593578 -> 32593858 (+0.00%)
Max dispatch width: 5535944 -> 5536816 (+0.02%); split: +0.02%, -0.01%
Totals from 5867 (0.93% of 631350) affected shaders:
Instrs: 5475544 -> 5478369 (+0.05%); split: -0.04%, +0.09%
Cycle count: 1649032029 -> 1651466707 (+0.15%); split: -0.24%, +0.39%
Spill count: 26411 -> 26423 (+0.05%)
Fill count: 57364 -> 57332 (-0.06%); split: -0.10%, +0.04%
Max live registers: 431561 -> 431841 (+0.06%)
Max dispatch width: 49784 -> 50656 (+1.75%); split: +2.38%, -0.63%
Tiger Lake
Totals:
Instrs: 149530671 -> 149533588 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15261418953 -> 15264764921 (+0.02%); split: -0.00%, +0.03%
Spill count: 60317 -> 60316 (-0.00%); split: -0.02%, +0.01%
Max live registers: 32249201 -> 32249464 (+0.00%)
Max dispatch width: 5540608 -> 5540584 (-0.00%)
Totals from 5862 (0.93% of 630309) affected shaders:
Instrs: 4740800 -> 4743717 (+0.06%); split: -0.04%, +0.10%
Cycle count: 566531248 -> 569877216 (+0.59%); split: -0.13%, +0.72%
Spill count: 11709 -> 11708 (-0.01%); split: -0.09%, +0.08%
Max live registers: 424560 -> 424823 (+0.06%)
Max dispatch width: 50304 -> 50280 (-0.05%)
Ice Lake
Totals:
Instrs: 150499705 -> 150502608 (+0.00%); split: -0.00%, +0.00%
Cycle count: 15105629116 -> 15105425880 (-0.00%); split: -0.00%, +0.00%
Spill count: 60087 -> 60090 (+0.00%)
Fill count: 100542 -> 100541 (-0.00%); split: -0.00%, +0.00%
Max live registers: 32605215 -> 32605495 (+0.00%)
Max dispatch width: 5617752 -> 5617792 (+0.00%); split: +0.00%, -0.00%
Totals from 5882 (0.93% of 634934) affected shaders:
Instrs: 4737206 -> 4740109 (+0.06%); split: -0.04%, +0.10%
Cycle count: 598882104 -> 598678868 (-0.03%); split: -0.08%, +0.05%
Spill count: 10278 -> 10281 (+0.03%)
Fill count: 22504 -> 22503 (-0.00%); split: -0.01%, +0.01%
Max live registers: 424184 -> 424464 (+0.07%)
Max dispatch width: 50216 -> 50256 (+0.08%); split: +0.25%, -0.18%
Skylake
Totals:
Instrs: 139092612 -> 139095257 (+0.00%); split: -0.00%, +0.00%
Cycle count: 14533550285 -> 14533544716 (-0.00%); split: -0.00%, +0.00%
Spill count: 58176 -> 58172 (-0.01%)
Fill count: 95877 -> 95796 (-0.08%)
Max live registers: 31924594 -> 31924874 (+0.00%)
Max dispatch width: 5484568 -> 5484552 (-0.00%); split: +0.00%, -0.00%
Totals from 5789 (0.93% of 625512) affected shaders:
Instrs: 4481987 -> 4484632 (+0.06%); split: -0.04%, +0.10%
Cycle count: 578310124 -> 578304555 (-0.00%); split: -0.05%, +0.05%
Spill count: 9248 -> 9244 (-0.04%)
Fill count: 19677 -> 19596 (-0.41%)
Max live registers: 415340 -> 415620 (+0.07%)
Max dispatch width: 49720 -> 49704 (-0.03%); split: +0.10%, -0.13%
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
cd343fb9ac
intel/brw: Add support for fcsel opcodes
...
Don't enable nir_opt_algebraic to generate these opcodes yet.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
d51ad9f4e0
intel/brw: Use fs_inst::resize_sources in brw_fs_opt_algebraic
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
11c6b6c102
intel/elk: Remove dsign optimization
...
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
Ironically, the vec4 implementation is correct. The odds of encountering
an application that is performace limited by dsign performance in vertex
processing stages on Ivy Bridge or Haswell is infinitesimal.
No shader-db changes on any Intel platform.
v2: Delete 's' in emit_fsign as it is now unused.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org > [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Ian Romanick
ded8690336
intel/brw: Remove dsign optimization
...
This bit from the comment should have been a big red flag:
There are currently zero instances of fsign(double(x))*IMM in
shader-db or any test suite, so it is hard to care at this time.
The implementation of that path was incorrect. The XOR instructions
should be predicated like the OR instruction in the non-multiplication
path. As a result, dsign(zero_value) * x will not produce the correct
result.
Instead of fixing this code that is never exercised by anything, replace
it with the simple lowering in NIR.
No shader-db or fossil-db changes on any Intel platform.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29095 >
2024-05-14 01:28:20 +00:00
Caio Oliveira
b8dbd64267
intel/brw: Fix commas when dumping instructions
...
Some commas were being skipped, according to history as an attempt
to elide BAD_FILEs, but we still print them, so be consistent. Also
for instructions without any sources, the trailing comma was always
being printed. Fix that too.
Example of instruction output before the change
halt_target(8) (null):UD,
send(8) (mlen: 1) (EOT) (null):UD, 0u, 0u, g126:UD(null):UD NoMask
and after it
halt_target(8) (null):UD
send(8) (mlen: 1) (EOT) (null):UD, 0u, 0u, g126:UD, (null):UD NoMask
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114 >
2024-05-11 02:17:57 +00:00
Caio Oliveira
c9fe20fdf1
intel/brw: Use vNN instead of vgrfNN when printing instructions
...
Reduce the noise in the shader dump output.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114 >
2024-05-11 02:17:56 +00:00
Caio Oliveira
3a081106b0
intel/brw: Hide register pressure information in dumps
...
It was the default to show register pressure for each instruction,
but it gets in the way of cleaner diffs before/after an optimization pass.
Add INTEL_DEBUG=reg-pressure option to show it again.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114 >
2024-05-11 02:17:56 +00:00
Caio Oliveira
866b1245e9
intel/brw: Don't print IP as part of the dump
...
The sequential IP cause noise when diffing before/after a pass that
either add or remove instructions.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29114 >
2024-05-11 02:17:56 +00:00
Lionel Landwerlin
fd47f90d37
brw: drop dependency on libintel_common
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11136
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29128 >
2024-05-11 01:52:01 +00:00
Lionel Landwerlin
36c043e2eb
intel: move debug identifier out of libintel_dev
...
The debug identifier is put into the captured buffers for error
capture. This helps us figure out what version of the driver people
are running when encountering a GPU hang. This identifier has the
git-sha1 + driver name.
libintel_dev is also a dependency of the compiler so any change to the
git-sha1 also triggers recompile which we want to avoid.
This changes moves the debug identifier to src/intel/common which
drivers already depend on, so the compiler is not affected anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11136
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29128 >
2024-05-11 01:52:01 +00:00
Lionel Landwerlin
d1c01e256d
brw: add more condition for reducing sampler simdness
...
Running
KHR-GL46.sparse_texture_clamp_tests.SparseTextureClampLookupColor test
with Zink on Anv we run into an assert :
assert(inst->mlen <= MAX_SAMPLER_MESSAGE_SIZE * reg_unit(devinfo));
Turns out we've not covered all the cases in the SIMD lowering.
It's a bit of a shame to have both files reproduce the same logic.
Will try to think of a better way to extract the layout of the a send
message but that'll be a much bigger rework.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29118 >
2024-05-10 19:40:00 +00:00
Alyssa Rosenzweig
90866bc58c
anv,hasvk: use common stype debug
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29009 >
2024-05-10 18:49:38 +00:00
Sagar Ghuge
69fc7ee622
intel/disasm: Fix cache load/store disassembly for URB messages
...
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28868 >
2024-05-09 19:45:18 +00:00
Faith Ekstrand
91b62e9868
anv: Use spirv_capabilities for the float64 shader
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Iván Briano <ivan.briano@intel.com >
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28905 >
2024-05-09 01:14:23 +00:00
Faith Ekstrand
9d5b4a4ffd
intel/kernel: Use the new capabilities struct
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Iván Briano <ivan.briano@intel.com >
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28905 >
2024-05-09 01:14:23 +00:00
Faith Ekstrand
ce2946ae0f
vulkan: Set SPIR-V caps from supported features
...
Any drivers which use vk_spirv_to_nir() now no longer need to build a
caps table manually.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Iván Briano <ivan.briano@intel.com >
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28905 >
2024-05-09 01:14:23 +00:00
Faith Ekstrand
c1eaa03904
spirv: Drop the SubgroupUniformControlFlow check
...
It's just a vtn_fail_if() and there's no actual cap for it. It's not
really gaining us much to have the check.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Iván Briano <ivan.briano@intel.com >
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28905 >
2024-05-09 01:14:22 +00:00
Lionel Landwerlin
28a0f98123
intel/tools: add README file
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594 >
2024-05-08 22:50:47 +00:00
Tvrtko Ursulin
bab52763f4
intel/hang_replay: fix batch address
...
Also capture all buffers so that we can compare replay run with the
original error state.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594 >
2024-05-08 22:50:47 +00:00
Lionel Landwerlin
a9f1151de2
intel/hang_replay: use hw image param
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594 >
2024-05-08 22:50:47 +00:00
Lionel Landwerlin
4d69870071
intel/hang_replay: use newer API of i915 execbuffer
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27594 >
2024-05-08 22:50:47 +00:00
Lionel Landwerlin
665cad6408
anv: fix ycbcr plane indexing with indirect descriptors
...
We need to add the plane index to compute the address from which to
load the descriptor (anv_sampled_image_descriptor in this case).
This was likely broken before we added direct descriptor support so
that gets a stable backport.
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11125
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29111 >
2024-05-08 21:51:49 +00:00
Tapani Pälli
c225f89d34
anv: skip gfx push constants alloc optimization on gfx9/11
...
Always reallocate in cmd_buffer_flush_gfx_push_constants like was done
before the the optimization got introduced.
Fixes: 62d96a6546 ("anv: add dirty tracking for push constant data")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11064
Signed-off-by: Tapani Pälli <tapani.palli@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Tested-by: Mark Janes <markjanes@swizzler.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28999 >
2024-05-08 17:21:26 +00:00
José Roberto de Souza
73188a4590
intel/perf: Add function to open perf stream
...
This will make easy to add Xe KMD support and reduce code duplication.
No changes in behavior are expected here.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
d27dcb815e
intel/perf: Add and use a function to return platform OA format
...
The platform version check to return the OA format was duplicated
in a few places, so adding a function and dropping this duplication.
While at it, already making it future proof for Xe KMD support and
split i915 specific code to its own file.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
137021fbe0
hasvk: Free intel_perf_config when destroying physical device
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
a941ce746a
anv: Free intel_perf_config when destroying physical device
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
4b179e7bea
intel/ds: Nuke ralloc_ctx and ralloc_cfg
...
Now that perf config and context are freed we don't need this rallocs
contexts.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
6c3ebff569
intel/ds: Free perf config and context
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
2cecf3e8a8
intel/perf: Add intel_perf_free_context()
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
ebe8d2f9ea
intel/perf: Add intel_perf_free()
...
There was no function to free resources allocated in intel_perf_config
or it self.
Other callers will be added in separated patches.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
José Roberto de Souza
a9a53c914d
intel/perf: Store pointer intel_device_info to in intel_perf_config
...
This will reduce host memory usage a bit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29077 >
2024-05-07 21:44:34 +00:00
Oskar Viljasaar
2d575034f2
hasvk: switch to use runtime physical device properties infrastructure
...
Remove the GPDP and GPDP2 entrypoints, and fill the properties
at device initialization time instead.
Move DRM master major/minor gathering before get_properties() and WSI
init, as the latter uses the results gathered by the former.
Reviewed-by: Julia Tatz <tatz.j@northeastern.edu >
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27717 >
2024-05-07 08:45:18 +00:00
Oskar Viljasaar
55967a411d
anv: Move completely over to common runtime GetPhysicalDeviceProperties2
...
The runtime grew support for VkPhysicalDevicePresentationPropertiesANDROID,
so we can use that now and get rid of anv_GetPhysicalDeviceProperties2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27717 >
2024-05-07 08:45:17 +00:00
Lionel Landwerlin
8c1cc405d3
anv: VK_EXT_legacy_vertex_attributes
...
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29070 >
2024-05-07 08:16:20 +03:00
Sagar Ghuge
e32828f5fc
intel/compiler: Fix destination type for CMP/CMPN
...
For CMP/CMPN, use src0 type if destination is null otherwise get the
src0 type register with destination register size.
This fixes dEQP-VK.glsl.builtin_var.frontfacing.* tests cases on Xe2+.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com >
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28679 >
2024-05-06 21:46:18 +00:00
Lionel Landwerlin
610a7c84c3
anv: move empty_vs_input to physical device
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29057 >
2024-05-06 09:20:01 +00:00
Lionel Landwerlin
725397759a
anv: move device initialization as the last step of vkCreateDevice
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29057 >
2024-05-06 09:20:01 +00:00
Lionel Landwerlin
63c4d24f7d
anv: avoid requirement to put flush_data as first field
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29057 >
2024-05-06 09:20:01 +00:00
Lionel Landwerlin
ae6d20815a
anv: fix leak of custom border colors
...
Inside a HAVE_VALGRIND section.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: 4dad2a4a6f ("anv: enable shader border color capture/replay")
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29057 >
2024-05-06 09:20:01 +00:00
Lionel Landwerlin
e260b16b11
anv: fixup alloc failure handling in reserved_array_pool
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: 806281f61f ("anv: add a new reserved pool for capture/release")
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Reviewed-by: José Roberto de Souza <jose.souza@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29057 >
2024-05-06 09:20:01 +00:00
Ian Romanick
0fa17962d6
intel/elk: Fix optimize_extract_to_float for i2f of unsigned extract
...
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.
v2: Expand the comment explaining the potential problem. Suggested by
Caio.
Fixes: e6022281f2 ("intel/elk: Rename files to use elk prefix")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891 >
2024-05-03 15:01:43 -07:00
Ian Romanick
fc2360167c
intel/brw: Avoid optimize_extract_to_float when it will just be undone later
...
v2: Add bspec quotation. Suggested by Caio. With better understand of
the restriction, only apply on DG2 and newer platforms.
shader-db:
DG2 and Meteor Lake had similar results. (DG2 shown)
total instructions in shared programs: 19659363 -> 19659360 (<.01%)
instructions in affected programs: 2484 -> 2481 (-0.12%)
helped: 6 / HURT: 1
total cycles in shared programs: 823445738 -> 823432524 (<.01%)
cycles in affected programs: 2619836 -> 2606622 (-0.50%)
helped: 48 / HURT: 63
fossil-db:
DG2 and Meteor Lake had similar results. (DG2 shown)
Totals:
Instrs: 154015863 -> 153987806 (-0.02%); split: -0.02%, +0.00%
Cycle count: 17552172994 -> 17562047866 (+0.06%); split: -0.13%, +0.19%
Spill count: 142124 -> 141544 (-0.41%); split: -0.54%, +0.13%
Fill count: 266803 -> 266046 (-0.28%); split: -0.38%, +0.09%
Scratch Memory Size: 10266624 -> 10271744 (+0.05%); split: -0.02%, +0.07%
Max live registers: 32592428 -> 32592393 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5535944 -> 5535912 (-0.00%); split: +0.00%, -0.00%
Totals from 41887 (6.63% of 631367) affected shaders:
Instrs: 32971032 -> 32942975 (-0.09%); split: -0.10%, +0.01%
Cycle count: 3892086217 -> 3901961089 (+0.25%); split: -0.60%, +0.85%
Spill count: 105669 -> 105089 (-0.55%); split: -0.72%, +0.18%
Fill count: 206459 -> 205702 (-0.37%); split: -0.49%, +0.12%
Scratch Memory Size: 7766016 -> 7771136 (+0.07%); split: -0.03%, +0.09%
Max live registers: 3230515 -> 3230480 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 337232 -> 337200 (-0.01%); split: +0.00%, -0.01%
No shader-db or fossil-db changes on any earlier Intel platforms.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891 >
2024-05-03 15:01:43 -07:00
Ian Romanick
bf5d82654a
intel/brw: Fix optimize_extract_to_float for i2f of unsigned extract
...
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.
No shader-db or fossil-db changes on any Intel platform.
v2: Expand the comment explaining the potential problem. Suggested by
Caio.
Fixes: 29ce110be6 ("i965/fs: Remove extract virtual opcodes.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891 >
2024-05-03 15:01:43 -07:00