190776 Commits

Author SHA1 Message Date
Ian Romanick
fc2360167c intel/brw: Avoid optimize_extract_to_float when it will just be undone later
v2: Add bspec quotation. Suggested by Caio. With a better understanding of
the restriction, only apply on DG2 and newer platforms.

shader-db:

DG2 and Meteor Lake had similar results. (DG2 shown)
total instructions in shared programs: 19659363 -> 19659360 (<.01%)
instructions in affected programs: 2484 -> 2481 (-0.12%)
helped: 6 / HURT: 1

total cycles in shared programs: 823445738 -> 823432524 (<.01%)
cycles in affected programs: 2619836 -> 2606622 (-0.50%)
helped: 48 / HURT: 63

fossil-db:

DG2 and Meteor Lake had similar results. (DG2 shown)

Totals:
Instrs: 154015863 -> 153987806 (-0.02%); split: -0.02%, +0.00%
Cycle count: 17552172994 -> 17562047866 (+0.06%); split: -0.13%, +0.19%
Spill count: 142124 -> 141544 (-0.41%); split: -0.54%, +0.13%
Fill count: 266803 -> 266046 (-0.28%); split: -0.38%, +0.09%
Scratch Memory Size: 10266624 -> 10271744 (+0.05%); split: -0.02%, +0.07%
Max live registers: 32592428 -> 32592393 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 5535944 -> 5535912 (-0.00%); split: +0.00%, -0.00%

Totals from 41887 (6.63% of 631367) affected shaders:
Instrs: 32971032 -> 32942975 (-0.09%); split: -0.10%, +0.01%
Cycle count: 3892086217 -> 3901961089 (+0.25%); split: -0.60%, +0.85%
Spill count: 105669 -> 105089 (-0.55%); split: -0.72%, +0.18%
Fill count: 206459 -> 205702 (-0.37%); split: -0.49%, +0.12%
Scratch Memory Size: 7766016 -> 7771136 (+0.07%); split: -0.03%, +0.09%
Max live registers: 3230515 -> 3230480 (-0.00%); split: -0.00%, +0.00%
Max dispatch width: 337232 -> 337200 (-0.01%); split: +0.00%, -0.01%

No shader-db or fossil-db changes on any earlier Intel platforms.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>
2024-05-03 15:01:43 -07:00
Ian Romanick
bf5d82654a intel/brw: Fix optimize_extract_to_float for i2f of unsigned extract
Fixes fs-uint-to-float-of-extract-int8.shader_test and
fs-uint-to-float-of-extract-int16.shader_test added by piglit!883.

No shader-db or fossil-db changes on any Intel platform.

v2: Expand the comment explaining the potential problem. Suggested by
Caio.

Fixes: 29ce110be6 ("i965/fs: Remove extract virtual opcodes.")
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/27891>
2024-05-03 15:01:43 -07:00
Eric Engestrom
82dab8691e ci: uprev mold to 2.31.0
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29043>
2024-05-03 19:24:03 +00:00
Gert Wollny
7de8a01087 mesa/st: don't use base shader serialization when uniforms are not packed
When loading the base shader serialization there is a discrepancy
between the state parameters that may already have been optimized,
because after storing the serialization the shader went through
st_finalize_nir, and _mesa_optimize_state_parameters was run, so
that original state parameters may have been optimized and replaced
by new parameters.

After get_nir_shader is called, the original state parameters are
re-added in addition to the optimized parameters. This led to
a bug with the uniform offsets when lowering uniforms to UBOs.

Therefore, as a hotfix for drivers that don't support packed
uniforms, ignore the base serialization and use the
serialization obtained after st_finalize_nir was run. With that
the problem can be avoided.

Fixes: 5eb0136a3c ("mesa/st: when creating draw shader variants, use the base nir and skip driver opts")

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10881

v2: reorder conditional evaluation for better readability (zmike)
v3: revert c72bb8de7 ("r300: mark new fails") (Pavel Ondračka)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28994>
2024-05-03 18:41:36 +00:00
Dmitry Osipenko
087e9a96d1 venus: make cross-device optional
Cross-device is a virtio-gpu feature that enables sharing host blob
dma-bufs with other virtio devices, like virtio-wl or virtio-video.
This feature is mainly used by ChromeOS and not required if there is
no dma-buf sharing. Venus has a hard requirement for the cross-device
feature.

QEMU doesn't support cross-device. Relax the cross-device feature requirement
by making it optional, allowing Venus to work on QEMU.

Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29040>
2024-05-03 17:54:33 +00:00
Gert Wollny
811ed62865 zink/kopper: Wait for last QueuePresentKHR to finish before acquiring for readback
When a job is submitted to the flush_queue, the resource dt_idx is reset.
If a readback is requested, we have to make sure that the corresponding
kopper_present has finished before we can acquire the image for readback,
so wait for the matching fence in this case.

This fixes the validation error UNASSIGNED-Threading-MultipleThreads-Write
triggered by piglit "read-front" on lavapipe.

Fixes: 8ade5588e3 ("zink: add kopper api")

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28127>
2024-05-03 15:19:11 +00:00
Collabora's Gfx CI Team
fd392745c2 Uprev Piglit to 7aa7bc1b01d57b4b091c4fc82a94a6ff47f38ebf
f7ece74a10...7aa7bc1b01

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28835>
2024-05-03 13:56:10 +00:00
Daniel Schürmann
6b4b044739 nir/opt_loop: add loop peeling optimization
This optimization turns:

     loop {
        do_work_1();
        if (cond) {
           break;
        } else {
        }
        do_work_2();
     }

 into:

     do_work_1();
     if (cond) {
     } else {
        loop {
           do_work_2();
           do_work_1();
           if (cond) {
              break;
           } else {
           }
        }
     }
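
The two shapes above can be checked against each other with a small C sketch. The `struct state` type and the bodies of `do_work_1()`, `cond()` and `do_work_2()` are hypothetical stand-ins chosen only so the loop terminates; this is not the NIR pass itself.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for do_work_1(), cond and do_work_2(). */
struct state {
   int i;
   int sum;
};

static void do_work_1(struct state *s) { s->i++; }
static bool cond(const struct state *s) { return s->i > 4; }
static void do_work_2(struct state *s) { s->sum += s->i; }

/* Original shape: the break sits in the middle of the loop body. */
static int run_original(void)
{
   struct state s = {0, 0};
   for (;;) {
      do_work_1(&s);
      if (cond(&s))
         break;
      do_work_2(&s);
   }
   return s.sum;
}

/* Peeled shape: the first do_work_1() and the test are hoisted out of the
 * loop, and the loop body is rotated so the break comes last. */
static int run_peeled(void)
{
   struct state s = {0, 0};
   do_work_1(&s);
   if (!cond(&s)) {
      for (;;) {
         do_work_2(&s);
         do_work_1(&s);
         if (cond(&s))
            break;
      }
   }
   return s.sum;
}
```

Both functions perform the same sequence of calls, so they return the same sum. The peeled form duplicates `do_work_1()` and the test, which fits the code size increase reported in the statistics that follow.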

RADV GFX11:
Totals from 925 (1.17% of 79395) affected shaders:
MaxWaves: 20583 -> 20455 (-0.62%)
Instrs: 5260489 -> 5361418 (+1.92%); split: -0.63%, +2.55%
CodeSize: 26965388 -> 27501104 (+1.99%); split: -0.48%, +2.47%
VGPRs: 70304 -> 70712 (+0.58%)
SpillSGPRs: 2163 -> 2159 (-0.18%)
Scratch: 51200 -> 69632 (+36.00%)
Latency: 36404844 -> 34542213 (-5.12%); split: -5.51%, +0.39%
InvThroughput: 6628474 -> 6384249 (-3.68%); split: -4.19%, +0.50%
VClause: 124997 -> 127008 (+1.61%); split: -0.43%, +2.04%
SClause: 121774 -> 120799 (-0.80%); split: -3.21%, +2.40%
Copies: 357048 -> 360850 (+1.06%); split: -0.62%, +1.68%
Branches: 171985 -> 168082 (-2.27%); split: -3.61%, +1.34%
PreSGPRs: 59812 -> 60088 (+0.46%); split: -0.20%, +0.66%
PreVGPRs: 60325 -> 60586 (+0.43%); split: -0.29%, +0.72%
VALU: 2882263 -> 2951373 (+2.40%); split: -0.37%, +2.77%
SALU: 636373 -> 640091 (+0.58%); split: -0.87%, +1.46%
VMEM: 200059 -> 204612 (+2.28%); split: -0.09%, +2.36%
SMEM: 173328 -> 174343 (+0.59%); split: -2.34%, +2.92%
VOPD: 1064 -> 898 (-15.60%); split: +0.09%, -15.70%

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Daniel Schürmann
3a2226be47 nir/opt_if: don't split ALU of phi into otherwise empty blocks
RADV GFX11:
Totals from 1566 (1.97% of 79395) affected shaders:
Instrs: 5663011 -> 5638219 (-0.44%); split: -0.45%, +0.01%
CodeSize: 29760844 -> 29639756 (-0.41%); split: -0.42%, +0.01%
SpillSGPRs: 1750 -> 1603 (-8.40%)
Latency: 62963520 -> 62831280 (-0.21%); split: -0.22%, +0.01%
InvThroughput: 10501171 -> 10490116 (-0.11%); split: -0.11%, +0.00%
VClause: 127928 -> 128054 (+0.10%); split: -0.01%, +0.11%
SClause: 152635 -> 152956 (+0.21%); split: -0.08%, +0.29%
Copies: 476865 -> 461288 (-3.27%); split: -3.28%, +0.02%
Branches: 169038 -> 168104 (-0.55%); split: -0.56%, +0.00%
PreSGPRs: 88851 -> 88356 (-0.56%); split: -0.58%, +0.02%
PreVGPRs: 114565 -> 114559 (-0.01%); split: -0.01%, +0.01%
VALU: 3158023 -> 3157387 (-0.02%); split: -0.03%, +0.01%
SALU: 615028 -> 595360 (-3.20%); split: -3.21%, +0.01%
VMEM: 219891 -> 218287 (-0.73%); split: -0.74%, +0.01%
SMEM: 206956 -> 206484 (-0.23%)

Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Daniel Schürmann
e74f5b16e3 nir/loop_analyze: adjust negative (or huge) iteration count check for bit size
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Daniel Schürmann
52efb6cc83 panfrost: skip gles-3.0-transform-feedback-uniform-buffer-object on Mali G52 and G57
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Daniel Schürmann
ce51e48cb6 radv: move nir_opt_dead_cf() before nir_opt_loop()
This can avoid unnecessary CF transformations.

Totals from 557 (0.70% of 79395) affected shaders: (GFX11)
MaxWaves: 12020 -> 12028 (+0.07%)
Instrs: 4237096 -> 4234110 (-0.07%); split: -0.08%, +0.01%
CodeSize: 21731952 -> 21719556 (-0.06%); split: -0.06%, +0.00%
VGPRs: 40492 -> 40480 (-0.03%)
SpillSGPRs: 467 -> 416 (-10.92%)
Latency: 25704891 -> 25684156 (-0.08%); split: -0.10%, +0.02%
InvThroughput: 5545224 -> 5542998 (-0.04%); split: -0.06%, +0.02%
VClause: 107850 -> 107838 (-0.01%); split: -0.02%, +0.01%
SClause: 90450 -> 90440 (-0.01%); split: -0.05%, +0.04%
Copies: 292714 -> 291354 (-0.46%); split: -0.50%, +0.03%
Branches: 133630 -> 133617 (-0.01%); split: -0.01%, +0.00%
PreSGPRs: 42299 -> 42104 (-0.46%); split: -0.48%, +0.02%
PreVGPRs: 36396 -> 36393 (-0.01%); split: -0.02%, +0.01%
VALU: 2321811 -> 2321192 (-0.03%); split: -0.03%, +0.01%
SALU: 505001 -> 503289 (-0.34%); split: -0.35%, +0.01%
SMEM: 132622 -> 132640 (+0.01%)

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Daniel Schürmann
4453971fbb radv: mark nir_opt_loop() as not idempotent
This pass misses opportunities because foreach_list_typed_safe()
might point to disconnected cf_nodes after some optimization got
applied. No fossil-db changes.
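
One common way to cope with a pass that can miss opportunities in a single run is to repeat it until it reports no progress. The sketch below illustrates that idea with a hypothetical pass, `remove_one_zero()`, that deliberately does only part of the work per call. It is not RADV's pass loop and does not use any NIR API.

```c
#include <stdbool.h>

/* Hypothetical pass: removes at most one zero from v[0..*n) per call and
 * reports whether it changed anything. */
static bool remove_one_zero(int *v, unsigned *n)
{
   for (unsigned i = 0; i < *n; i++) {
      if (v[i] == 0) {
         for (unsigned j = i + 1; j < *n; j++)
            v[j - 1] = v[j];
         (*n)--;
         return true;
      }
   }
   return false;
}

/* Repeat the pass until it stops making progress.
 * Returns how many times the pass ran, including the final no-op run. */
static unsigned run_until_stable(int *v, unsigned *n)
{
   unsigned runs = 0;
   bool progress;
   do {
      progress = remove_one_zero(v, n);
      runs++;
   } while (progress);
   return runs;
}
```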

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28150>
2024-05-03 13:01:29 +00:00
Samuel Pitoiset
2e38cc06f8 radv/ci: document a recent regression on GFX6-8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29037>
2024-05-03 10:11:24 +00:00
Eric Engestrom
dd171d21dd vc4/ci: add fails seen overnight
Fixes: 03474500b5 ("vc4/ci: update results")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29033>
2024-05-03 08:47:21 +00:00
Pavel Ondračka
0c96b03fcf r300: better packing for immediates
How does this work? First we check which immediates are used as vectors,
i.e., have any reads that use 2 or more channels. Such immediates
are placed in free slots (but only the specific channels that are
used in the vector). This way we don't have to worry about swizzling
restrictions. The remaining scalar immediates are checked for
duplicates and placed in free slots, including any empty slots in
previously placed vector immediates (any swizzle is valid for scalars).
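
A minimal sketch of the two placement steps, assuming a constant file of vec4 slots with a per-slot mask of occupied channels. The `const_file` and `location` types and both function names are hypothetical and are not taken from the r300 compiler. Values are compared as floats for brevity, which is a simplification.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS 8

/* Hypothetical constant file: vec4 slots plus a 4-bit mask of used channels. */
struct const_file {
   float value[MAX_SLOTS][4];
   uint8_t usemask[MAX_SLOTS];
};

struct location {
   int slot;
   int chan;
};

/* Vector immediates keep their channel positions, so no swizzle changes:
 * take the first slot whose requested channels are all free.
 * Returns the slot index, or -1 if no slot fits. */
static int place_vector(struct const_file *cf, const float v[4], uint8_t mask)
{
   for (int s = 0; s < MAX_SLOTS; s++) {
      if (cf->usemask[s] & mask)
         continue;
      for (int c = 0; c < 4; c++) {
         if (mask & (1u << c))
            cf->value[s][c] = v[c];
      }
      cf->usemask[s] |= mask;
      return s;
   }
   return -1;
}

/* Scalars can live in any channel: reuse an identical value if one is
 * already stored, otherwise fill the first free channel.
 * Returns {-1, -1} if the file is full. */
static struct location place_scalar(struct const_file *cf, float v)
{
   struct location free_loc = {-1, -1};
   for (int s = 0; s < MAX_SLOTS; s++) {
      for (int c = 0; c < 4; c++) {
         bool used = (cf->usemask[s] & (1u << c)) != 0;
         if (used && cf->value[s][c] == v)
            return (struct location){s, c};
         if (!used && free_loc.slot < 0)
            free_loc = (struct location){s, c};
      }
   }
   if (free_loc.slot >= 0) {
      cf->value[free_loc.slot][free_loc.chan] = v;
      cf->usemask[free_loc.slot] |= (uint8_t)(1u << free_loc.chan);
   }
   return free_loc;
}
```

With a two-channel vector placed first, a later scalar that duplicates one of its values reuses that channel, and a new scalar lands in the vector slot's first unused channel.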

RV410:
total instructions in shared programs: 98883 -> 98905 (0.02%)
instructions in affected programs: 15414 -> 15436 (0.14%)
helped: 100
HURT: 102
total presub in shared programs: 2235 -> 2235 (0.00%)
presub in affected programs: 608 -> 608 (0.00%)
helped: 51
HURT: 72
total omod in shared programs: 419 -> 418 (-0.24%)
omod in affected programs: 15 -> 14 (-6.67%)
helped: 3
HURT: 3
total temps in shared programs: 15698 -> 15692 (-0.04%)
temps in affected programs: 952 -> 946 (-0.63%)
helped: 46
HURT: 37
total consts in shared programs: 84458 -> 83856 (-0.71%)
consts in affected programs: 14648 -> 14046 (-4.11%)
helped: 499
HURT: 0
total cycles in shared programs: 156476 -> 156493 (0.01%)
cycles in affected programs: 22532 -> 22549 (0.08%)
helped: 100
HURT: 102
LOST:   shaders/ck2/157.shader_test FS
GAINED: shaders/ck2/160.shader_test FS
GAINED: shaders/tesseract/395.shader_test FS

RV530:
total instructions in shared programs: 119543 -> 119612 (0.06%)
instructions in affected programs: 27435 -> 27504 (0.25%)
helped: 118
HURT: 183
total presub in shared programs: 7257 -> 7111 (-2.01%)
presub in affected programs: 1856 -> 1710 (-7.87%)
helped: 121
HURT: 48
total omod in shared programs: 426 -> 427 (0.23%)
omod in affected programs: 5 -> 6 (20.00%)
helped: 1
HURT: 2
total temps in shared programs: 16784 -> 16779 (-0.03%)
temps in affected programs: 392 -> 387 (-1.28%)
helped: 29
HURT: 17
total consts in shared programs: 93198 -> 92667 (-0.57%)
consts in affected programs: 14577 -> 14046 (-3.64%)
helped: 451
HURT: 0
total cycles in shared programs: 186649 -> 186590 (-0.03%)
cycles in affected programs: 26306 -> 26247 (-0.22%)
helped: 125
HURT: 111

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
2024-05-03 09:28:56 +02:00
Pavel Ondračka
11ad056ee9 r300: compact scalar uniforms into empty slots
We are not doing this on R5xx unless we have more than 200 constants,
because emitting constants one by one will add extra overhead at emit
time which we want to avoid if possible.

RV410:
total instructions in shared programs: 98778 -> 98703 (-0.08%)
instructions in affected programs: 7106 -> 7031 (-1.06%)
helped: 80
HURT: 25
total presub in shared programs: 2266 -> 2227 (-1.72%)
presub in affected programs: 134 -> 95 (-29.10%)
helped: 22
HURT: 10
total temps in shared programs: 15662 -> 15660 (-0.01%)
temps in affected programs: 330 -> 328 (-0.61%)
helped: 16
HURT: 13
total consts in shared programs: 85632 -> 84400 (-1.44%)
consts in affected programs: 6646 -> 5414 (-18.54%)
helped: 617
HURT: 0
total cycles in shared programs: 156305 -> 156234 (-0.05%)
cycles in affected programs: 14167 -> 14096 (-0.50%)
helped: 79
HURT: 28
LOST:   shaders/ck2/160.shader_test FS
GAINED: shaders/ck2/157.shader_test FS
GAINED: shaders/tropics/249.shader_test FS
GAINED: shaders/tropics/252.shader_test FS

RV530:
total consts in shared programs: 93209 -> 93198 (-0.01%)
consts in affected programs: 72 -> 61 (-15.28%)
helped: 6
HURT: 0

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
2024-05-03 09:28:56 +02:00
Pavel Ondračka
5d3483bfe4 r300: switch to a new constant remap table format
Instead of just moving constants around as full vec4s, we will now have
the flexibility to shuffle scalars around. However, this commit just
prepares the infrastructure and converts to it, while the constant
elimination logic remains the same, i.e., we only remove a constant if it
is fully unused and there is no constant compaction whatsoever.

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
2024-05-03 09:28:56 +02:00
Pavel Ondračka
71761e2117 r300: move dead constants pass earlier for vertex shaders
We need to put it before source conflict resolve because we will be
shuffling immediates around, so we can introduce new conflicts (albeit
in general there should be fewer conflicts instead).

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
2024-05-03 09:28:55 +02:00
Pavel Ondračka
a0ee1ac2b7 r300: replace constant size field with usemask
To have more flexibility in case there are some empty slots (e.g., if
the specific slot was converted to inline constant or constant swizzle).

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
2024-05-03 09:28:55 +02:00
Samuel Pitoiset
d71d189790 radv: add a new dirty state for emitting the color output state
SPI_SHADER_COL_FORMAT/CB_SHADER_MASK are used slightly differently
for PS epilogs, shader objects and monolithic graphics pipelines.

This introduces a new state that will allow us to emit these two
registers in only one place. The main motivation is for depth-only RB+
support and for tracking context registers in the cmdbuf.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
66d4188ec5 radv: store cb_shader_mask for fragment shaders and epilogs
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
0ce1bfc040 radv: rename col_format_non_compacted to spi_shader_col_format
This is always the non-compacted format because it's compacted right
before it's emitted. This looks much cleaner to me.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
199f521804 radv: compact SPI_SHADER_COL_FORMAT as late as possible
This will allow us to do more cleanups because this thing is a complete
mess.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
2024-05-03 06:29:05 +00:00
Samuel Pitoiset
e1483d022b radv: clear unwritten color attachments for monolithic PS earlier
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
2024-05-03 06:29:04 +00:00
Samuel Pitoiset
3b41fbd4b8 radv: precompute compute/task shader register values
To make emission faster.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29014>
2024-05-03 06:07:46 +00:00
Alyssa Rosenzweig
0549649bcf vulkan: optimize vk_dynamic_graphics_state_any_dirty
For drivers using the new state tracking, __bitset_test_range can be
surprisingly hot because we have a lot of dirty bits and __bitset_test_range has
to handle lots of special cases. __bitset_is_empty does not have to worry about
those special cases, so it can be much faster.
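
The difference can be sketched in C as follows. These are simplified, hypothetical helpers written for illustration and are not Mesa's `__bitset_test_range` or `__bitset_is_empty`. The range test has to mask partial first and last words, while the emptiness test only ORs whole words together.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIRTY_WORDS 4 /* hypothetical: 128 dirty bits */

/* Is any bit in the inclusive range [start, end] set?
 * The first and last words need partial masks, which is the extra work. */
static bool test_range(const uint32_t *set, unsigned start, unsigned end)
{
   for (unsigned w = start / 32; w <= end / 32; w++) {
      uint32_t mask = ~0u;
      if (w == start / 32)
         mask &= ~0u << (start % 32);
      if (w == end / 32)
         mask &= ~0u >> (31 - end % 32);
      if (set[w] & mask)
         return true;
   }
   return false;
}

/* Is the whole set empty? No masking, just OR every word together. */
static bool is_empty(const uint32_t *set, unsigned words)
{
   uint32_t any = 0;
   for (unsigned i = 0; i < words; i++)
      any |= set[i];
   return any == 0;
}
```

When the question is simply whether anything at all is dirty, `!is_empty(set, DIRTY_WORDS)` gives the same answer as `test_range(set, 0, 127)` without the per-word masking.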

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29008>
2024-05-03 02:22:28 +00:00
Colin Marc
602c62a273 vulkan/video: correctly set sub-layer ordering in H.265 VPS/SPS
The relevant sections here are F.7.3.2.1 and F.7.3.2.2.1. The code was
incorrectly assuming sub_layer_ordering_info_present_flag is always 1.
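
As a rough illustration of what the flag controls: in the H.265 VPS/SPS syntax, the loop over sub-layer ordering entries starts at 0 when sub_layer_ordering_info_present_flag is 1, and at max_sub_layers_minus1 otherwise, so only one entry is written in the second case. The helper names below are hypothetical and this is not the Mesa bitstream writer.

```c
#include <assert.h>
#include <stdbool.h>

/* First sub-layer index for which ordering info is written. */
static unsigned first_ordering_index(bool present_flag,
                                     unsigned max_sub_layers_minus1)
{
   assert(max_sub_layers_minus1 <= 6);
   return present_flag ? 0 : max_sub_layers_minus1;
}

/* Number of ordering entries written for indices first..max_sub_layers_minus1. */
static unsigned num_ordering_entries(bool present_flag,
                                     unsigned max_sub_layers_minus1)
{
   unsigned first = first_ordering_index(present_flag, max_sub_layers_minus1);
   return max_sub_layers_minus1 - first + 1;
}
```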

Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29001>
2024-05-02 23:51:56 +00:00
Colin Marc
b613566faf vulkan/video: generate profile_tier_level structure correctly
Per section 7.7.3, the structure includes additional optional layer-specific
information, which is padded if left unset, based on the value of
max_sub_layers_minus1. The vulkan input structs have no way to specify this
per-layer information, so we just need the padding.
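
A sketch of the bit count involved, assuming the usual profile_tier_level layout: two present flags per sub-layer, followed by 2-bit reserved padding up to 8 entries whenever max_sub_layers_minus1 is greater than 0. The function name is hypothetical, and all per-layer present flags are assumed to be written as 0.

```c
#include <assert.h>

/* Bits taken by the per-sub-layer present flags plus the reserved padding
 * when no per-layer information is supplied. */
static unsigned sub_layer_flag_and_padding_bits(unsigned max_sub_layers_minus1)
{
   assert(max_sub_layers_minus1 <= 6);

   /* sub_layer_profile_present_flag + sub_layer_level_present_flag */
   unsigned bits = 2 * max_sub_layers_minus1;

   /* reserved_zero_2bits for the remaining entries up to 8 */
   if (max_sub_layers_minus1 > 0)
      bits += 2 * (8 - max_sub_layers_minus1);

   return bits;
}
```

Under these assumptions the result is 0 bits for a single sub-layer and 16 bits otherwise.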

Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29001>
2024-05-02 23:51:56 +00:00
Kenneth Graunke
8d983b3425 intel/nir: Set src_type on TCS quads workaround store_output
We weren't setting this and now it's validated, causing assert failures.

Fixes: 1632948a76 ("nir: validate src_type of store_output intrinsics, require bit_size >= 16")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11107
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29027>
2024-05-02 13:58:21 -07:00
Thomas H.P. Andersen
42ed28a726 nvk: advertise EXT_depth_range_unrestricted
This enables EXT_depth_range_unrestricted starting from VOLTA_A.

Test of dEQP-VK.*depth_range_unrestricted* on TU104 shows:

Test run totals:
  Passed:        14212/14212 (100.0%)
  Failed:        0/14212 (0.0%)
  Not supported: 0/14212 (0.0%)
  Warnings:      0/14212 (0.0%)
  Waived:        0/14212 (0.0%)

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28958>
2024-05-02 20:21:00 +00:00
Faith Ekstrand
5d37a5c7b6 nvk: Only clip Z with the guardband
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28958>
2024-05-02 20:21:00 +00:00
Faith Ekstrand
14d749f13d nak: Don't saturate depth writes
This is unnecessary in Vulkan and prevents unrestricted depth.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28958>
2024-05-02 20:21:00 +00:00
Derek Foreman
c6dc61775f wsi/wayland: Add tracepoint in wsi_wl_swapchain_wait_for_present
We can spend a lot of time in wait_for_present, making it an interesting
trace point.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:27 +00:00
Derek Foreman
c4b432f83e wsi/wayland: Add a perfetto track for image presentation
Now that we have flows, custom tracks, and timestamps, we can have a track
for wayland buffer presentation times, tagged with appropriate flow ids
so we can follow when a buffer was acquired through to the time it was
displayed.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
e9596149cf perfetto: Add some functions for timestamped events
This can be useful if we know when an event happened, but our code isn't
running at that time (such as reporting when an image was presented in
the wayland wsi).

We can't really mix these with events that we log at the current time,
because there could be overlap, so also add a function for creating
custom tracks.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
57c03fe49c wsi/wayland: Add latency information to perfetto profiling
When using presentation feedback, we know when an image is presented. Use
this and the time we submit the image to calculate the delay in ms
between submission and display.
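
The calculation itself is a simple subtraction. A minimal sketch, assuming both timestamps are in nanoseconds on the same clock; the function name and the units are assumptions, not the actual WSI code.

```c
#include <stdint.h>

/* Delay in milliseconds between submitting an image and its presentation.
 * Returns 0 if the timestamps are out of order. */
static double present_latency_ms(uint64_t submit_ns, uint64_t presented_ns)
{
   if (presented_ns < submit_ns)
      return 0.0;
   return (double)(presented_ns - submit_ns) / 1e6;
}
```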

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
60eb27591f perfetto: Add simple support for counters
Perfetto can report time-varying numeric values (counters) in tracks.

Add some simple functions to use this.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
34273bc4ed wsi/wayland: Add timing debugging
If perfetto is tracing, always send presentation feedback requests
for image presentations.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
23b4fb2b4c wsi/wayland: Add flow id to presentation feedback
When we use waitforpresent we use presentation feedback. We can plumb
the flow ids into this to have slightly more expressive flows.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
5ba7b3f40c wsi/wayland: Add perfetto flows to image acquisition and presentation
Generate flow ids for slightly more informative swapchain profiling.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
16b8dbedfa perfetto: Add flows
Perfetto can assign flow ids to events, which can be used to connect
related events in tracks when they share the same id.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
8b460cf9b5 egl/wayland: Use loader_wayland_dispatch
This is just to get event tracing in perfetto, as the wrapper calls
MESA_TRACE_FUNC().

It can be useful to see how long and when we stall in wayland dispatch.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Derek Foreman
90effcceab wsi/wayland: refactor wayland dispatch
Add a thin wrapper around the wayland dispatch code for no reason other
than to add MESA_TRACE_FUNC so we can see where wayland dispatch delays
are.

Move this to loader so we can use it in the wayland egl code later.

Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Sebastian Wick
1062b3e813 vulkan/wsi/wayland: refactor wsi_wl_swapchain_wait_for_present
Split it into a part that dispatches and a part that waits for the
requested id.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
2024-05-02 19:37:26 +00:00
Philipp Zabel
0554d11f1e etnaviv/nn: Pipe through input/accumulation buffer depth from hwdb
Stop hard coding accumulation buffer depth and input buffer depth to the
values for VIPNano-QI. This allows calculating correct tile sizes
for other cores.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28956>
2024-05-02 19:17:58 +00:00
Connor Abbott
e82d70d472 freedreno/a7xx: Add A7XX_HLSQ_DP_STR location from kgsl
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29025>
2024-05-02 18:48:24 +00:00
Connor Abbott
37f9a7a9c2 freedreno/a7xx: Add AQE-related registers from kgsl
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29025>
2024-05-02 18:48:23 +00:00
Amber
bed4ad26ad tu: Disable depth and stencil tests when attachment state requires it
The depth and stencil tests should be disabled in case the respective
attachments are null in VkRenderingInfo or their format is undefined in
VkPipelineRenderingCreateInfo. Additionally, the stencil test should be
disabled in case the depth/stencil attachment has no stencil component.
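
The rule can be written down as a small C sketch. The structs below are hypothetical summaries of the attachment state, not turnip's data structures or Vulkan types.

```c
#include <stdbool.h>

/* Hypothetical summary of the depth/stencil attachment state. */
struct ds_attachment_info {
   bool depth_attachment_valid;   /* non-null and format not undefined */
   bool stencil_attachment_valid; /* non-null and format not undefined */
   bool format_has_stencil;       /* false for a depth-only format */
};

struct ds_tests {
   bool depth;
   bool stencil;
};

/* Tests the hardware should actually run, given what the application
 * requested and what the attachments provide. */
static struct ds_tests effective_ds_tests(struct ds_tests requested,
                                          struct ds_attachment_info att)
{
   struct ds_tests out;
   out.depth = requested.depth && att.depth_attachment_valid;
   out.stencil = requested.stencil && att.stencil_attachment_valid &&
                 att.format_has_stencil;
   return out;
}
```

For a format without a stencil component, such as the x8_d24_unorm_pack32 case named in the test list, the stencil test ends up disabled even if the application enabled it.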

Fixes:
dEQP-VK.pipeline.*.stencil.no_stencil_att.*.d24_unorm_s8_uint
dEQP-VK.pipeline.*.stencil.no_stencil_att.*.x8_d24_unorm_pack32

Signed-off-by: Amber Harmonia <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28556>
2024-05-02 18:18:52 +00:00
Juan A. Suarez Romero
03474500b5 vc4/ci: update results
Add new crashes caused by 1632948a76 ("nir: validate src_type of
store_output intrinsics, require bit_size >= 16").

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29024>
2024-05-02 16:45:07 +00:00