How does this work? First we check which immediates are used as vectors,
i.e., have any reads that use 2 or more channels. Such immediates
will be placed in free slots (but only the specific channels that are
used in the vector). This way we don't have to worry about swizzling
restrictions. The remaining scalar immediates will be checked for
duplicates and placed in free slots, including any empty slots in
previously placed vector immediates (any swizzle is valid for scalars).
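
The two-pass placement described above can be sketched as follows. This is
only an illustration and not the driver code: pack_immediates, the slot
representation (a list of four channels, None meaning free) and the
first-fit search are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Slot = List[Optional[float]]  # one vec4 slot; None marks a free channel


def pack_immediates(
    vectors: Sequence[Sequence[Optional[float]]],
    scalars: Sequence[float],
) -> Tuple[List[Slot], List[int], Dict[float, Tuple[int, int]]]:
    """Hypothetical helper: place vector immediates first, then scalars.

    Each entry of `vectors` has four channels, with None for channels
    that are never read. `scalars` are immediates read one channel at a time.
    """
    slots: List[Slot] = []
    vector_slots: List[int] = []

    # Pass 1: vector immediates keep every used channel in its original
    # position, so no swizzle in the reading instructions has to change.
    for vec in vectors:
        used = [c for c in range(4) if vec[c] is not None]
        for index, slot in enumerate(slots):
            if all(slot[c] is None for c in used):
                break  # first slot where all needed channels are free
        else:
            slots.append([None] * 4)
            index = len(slots) - 1
        for c in used:
            slots[index][c] = vec[c]
        vector_slots.append(index)

    # Pass 2: scalars are deduplicated and dropped into any free channel,
    # including the holes left by the vector immediates.
    scalar_locations: Dict[float, Tuple[int, int]] = {}
    for value in scalars:
        if value in scalar_locations:
            continue  # duplicate, reuse the earlier location
        location = next(
            ((i, c) for i, slot in enumerate(slots)
             for c in range(4) if slot[c] is None),
            None,
        )
        if location is None:
            slots.append([None] * 4)
            location = (len(slots) - 1, 0)
        slots[location[0]][location[1]] = value
        scalar_locations[value] = location

    return slots, vector_slots, scalar_locations


slots, vector_slots, scalar_locations = pack_immediates(
    vectors=[(1.0, 2.0, None, None), (None, None, None, 0.5)],
    scalars=[3.0, 3.0, 4.0],
)
```

In this example both vector immediates share slot 0 because their used
channels do not overlap, the scalar 3.0 fills the remaining hole, and only
4.0 needs a new slot.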
RV410:
total instructions in shared programs: 98883 -> 98905 (0.02%)
instructions in affected programs: 15414 -> 15436 (0.14%)
helped: 100
HURT: 102
total presub in shared programs: 2235 -> 2235 (0.00%)
presub in affected programs: 608 -> 608 (0.00%)
helped: 51
HURT: 72
total omod in shared programs: 419 -> 418 (-0.24%)
omod in affected programs: 15 -> 14 (-6.67%)
helped: 3
HURT: 3
total temps in shared programs: 15698 -> 15692 (-0.04%)
temps in affected programs: 952 -> 946 (-0.63%)
helped: 46
HURT: 37
total consts in shared programs: 84458 -> 83856 (-0.71%)
consts in affected programs: 14648 -> 14046 (-4.11%)
helped: 499
HURT: 0
total cycles in shared programs: 156476 -> 156493 (0.01%)
cycles in affected programs: 22532 -> 22549 (0.08%)
helped: 100
HURT: 102
LOST: shaders/ck2/157.shader_test FS
GAINED: shaders/ck2/160.shader_test FS
GAINED: shaders/tesseract/395.shader_test FS
RV530:
total instructions in shared programs: 119543 -> 119612 (0.06%)
instructions in affected programs: 27435 -> 27504 (0.25%)
helped: 118
HURT: 183
total presub in shared programs: 7257 -> 7111 (-2.01%)
presub in affected programs: 1856 -> 1710 (-7.87%)
helped: 121
HURT: 48
total omod in shared programs: 426 -> 427 (0.23%)
omod in affected programs: 5 -> 6 (20.00%)
helped: 1
HURT: 2
total temps in shared programs: 16784 -> 16779 (-0.03%)
temps in affected programs: 392 -> 387 (-1.28%)
helped: 29
HURT: 17
total consts in shared programs: 93198 -> 92667 (-0.57%)
consts in affected programs: 14577 -> 14046 (-3.64%)
helped: 451
HURT: 0
total cycles in shared programs: 186649 -> 186590 (-0.03%)
cycles in affected programs: 26306 -> 26247 (-0.22%)
helped: 125
HURT: 111
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
We are not doing this on R5xx unless we have more than 200 constants,
because emitting constants one by one will add extra overhead at emit
time which we want to avoid if possible.
RV410:
total instructions in shared programs: 98778 -> 98703 (-0.08%)
instructions in affected programs: 7106 -> 7031 (-1.06%)
helped: 80
HURT: 25
total presub in shared programs: 2266 -> 2227 (-1.72%)
presub in affected programs: 134 -> 95 (-29.10%)
helped: 22
HURT: 10
total temps in shared programs: 15662 -> 15660 (-0.01%)
temps in affected programs: 330 -> 328 (-0.61%)
helped: 16
HURT: 13
total consts in shared programs: 85632 -> 84400 (-1.44%)
consts in affected programs: 6646 -> 5414 (-18.54%)
helped: 617
HURT: 0
total cycles in shared programs: 156305 -> 156234 (-0.05%)
cycles in affected programs: 14167 -> 14096 (-0.50%)
helped: 79
HURT: 28
LOST: shaders/ck2/160.shader_test FS
GAINED: shaders/ck2/157.shader_test FS
GAINED: shaders/tropics/249.shader_test FS
GAINED: shaders/tropics/252.shader_test FS
RV530:
total consts in shared programs: 93209 -> 93198 (-0.01%)
consts in affected programs: 72 -> 61 (-15.28%)
helped: 6
HURT: 0
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
Instead of just moving around constants as full vec4, we will now have
the flexibility to shuffle scalars around. However, this commit just
prepares the infrastructure and converts to it, while the constant
elimination logic remains the same, i.e., we only remove a constant if it
is fully unused and there is no constant compaction whatsoever.
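
A per-channel remap table with the old "drop only fully unused vec4s"
behaviour could look roughly like this. The function name, the use of a
4-bit read mask per constant and the dictionary layout are assumptions for
illustration, not the actual data structures.

```python
from typing import Dict, List, Tuple


def build_channel_remap(
    used_masks: List[int],
) -> Dict[Tuple[int, int], Tuple[int, int]]:
    """Hypothetical sketch: used_masks[i] is a 4-bit mask of the channels
    that are read from constant i."""
    remap: Dict[Tuple[int, int], Tuple[int, int]] = {}
    new_index = 0
    for old_index, mask in enumerate(used_masks):
        if mask == 0:
            continue  # fully unused vec4, so the whole constant is removed
        for chan in range(4):
            # Entries are per channel, but for now every channel keeps its
            # position, so nothing is compacted within a vec4.
            remap[(old_index, chan)] = (new_index, chan)
        new_index += 1
    return remap


remap = build_channel_remap([0b0001, 0b0000, 0b1100])
```

Because the table is keyed by (index, channel), a later change can point
individual channels at a different slot without touching the users of the
table.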
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip.gawin@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28630>
SPI_SHADER_COL_FORMAT/CB_SHADER_MASK are used slightly differently
for PS epilogs, shader objects and monolithic graphics pipelines.
This introduces a new state that will allow us to emit these two
registers in only one place. The main motivation is for depth-only RB+
support and for tracking context registers in the cmdbuf.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28976>
Per section 7.7.3, the structure includes additional optional layer-specific
information, which is padded if left unset, based on the value of
max_sub_layers_minus1. The Vulkan input structs have no way to specify this
per-layer information, so we just need the padding.
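
As a rough sketch of what the padding amounts to, assuming the
profile_tier_level() syntax of the H.265 specification (two 1-bit present
flags per sub-layer, followed by reserved_zero_2bits up to index 8 when
max_sub_layers_minus1 is greater than 0). The function name and the
(value, bit count) representation are made up for the example.

```python
from typing import List, Tuple


def sub_layer_padding(max_sub_layers_minus1: int) -> List[Tuple[int, int]]:
    """Return the (value, number_of_bits) fields to write when no
    per-sub-layer information is provided. Hypothetical helper."""
    fields: List[Tuple[int, int]] = []
    for _ in range(max_sub_layers_minus1):
        fields.append((0, 1))  # sub_layer_profile_present_flag = 0
        fields.append((0, 1))  # sub_layer_level_present_flag = 0
    if max_sub_layers_minus1 > 0:
        for _ in range(max_sub_layers_minus1, 8):
            fields.append((0, 2))  # reserved_zero_2bits
    return fields


padding_bits = sum(bits for _, bits in sub_layer_padding(3))
```

With this syntax the flags and reserved bits together always add up to 16
bits whenever max_sub_layers_minus1 is non-zero, and to nothing when it is 0.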
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29001>
This enables EXT_depth_range_unrestricted from VOLTA_A onwards.
Test of dEQP-VK.*depth_range_unrestricted* on TU104 shows:
Test run totals:
Passed: 14212/14212 (100.0%)
Failed: 0/14212 (0.0%)
Not supported: 0/14212 (0.0%)
Warnings: 0/14212 (0.0%)
Waived: 0/14212 (0.0%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28958>
Now that we have flows, custom tracks, and timestamps, we can have a track
for wayland buffer presentation times, tagged with appropriate flow ids
so we can follow when a buffer was acquired through to the time it was
displayed.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
This can be useful if we know when an event happened, but our code isn't
running at that time (such as reporting when an image was presented in
the wayland wsi).
We can't really mix these with events that we log at the current time,
because there could be overlap, so also add a function for creating
custom tracks.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28634>
The depth and stencil tests should be disabled in case the respective
attachments are null in VkRenderingInfo or their format is undefined in
VkPipelineRenderingCreateInfo. Additionally, the stencil test should be
disabled in case the depth/stencil attachment has no stencil component.
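
The rule can be summarised with a small sketch. Attachment and
effective_tests are hypothetical names; None stands for either a null
attachment or an undefined format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Attachment:
    has_stencil: bool  # whether the format has a stencil component


def effective_tests(
    depth_test: bool,
    stencil_test: bool,
    depth_att: Optional[Attachment],
    stencil_att: Optional[Attachment],
) -> Tuple[bool, bool]:
    """Return (depth_enabled, stencil_enabled) after applying the rule."""
    depth_enabled = depth_test and depth_att is not None
    stencil_enabled = (
        stencil_test
        and stencil_att is not None
        and stencil_att.has_stencil
    )
    return depth_enabled, stencil_enabled


# A depth-only format such as x8_d24_unorm_pack32 has no stencil component,
# so the stencil test is dropped even if the application requested it.
x8_d24 = Attachment(has_stencil=False)
result = effective_tests(True, True, x8_d24, x8_d24)
```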
Fixes:
dEQP-VK.pipeline.*.stencil.no_stencil_att.*.d24_unorm_s8_uint
dEQP-VK.pipeline.*.stencil.no_stencil_att.*.x8_d24_unorm_pack32
Signed-off-by: Amber Harmonia <amber@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28556>
Xe KMD will not provide an enum with formats; instead, UMD needs to set
a uint64_t with type, counter_sel, counter_size and bc_report for the
format.
So change the type from int to uint64_t here. This does not cause any
issues for i915 and makes it ready for Xe KMD.
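
To illustrate why a plain int enum is not enough, here is a sketch of
packing the four fields into one 64-bit value. The 8-bit field widths and
their positions are an assumption for illustration only; the real layout is
defined by the Xe uAPI header and should be taken from there.

```python
def pack_oa_format(
    fmt_type: int, counter_sel: int, counter_size: int, bc_report: int
) -> int:
    """Hypothetical packing of the format fields into a uint64_t value."""
    for field in (fmt_type, counter_sel, counter_size, bc_report):
        if not 0 <= field <= 0xFF:
            raise ValueError("each field is assumed to fit in 8 bits")
    # Assumed layout: type in bits 0-7, counter_sel in bits 8-15,
    # counter_size in bits 16-23, bc_report in bits 24-31.
    return (
        fmt_type
        | (counter_sel << 8)
        | (counter_size << 16)
        | (bc_report << 24)
    )


packed = pack_oa_format(1, 2, 3, 4)
```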
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28997>