This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
For now, only mark the 4x8BitPacked variants as accelerated.
Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms. If we encounter an application that cares, we can do
things differently then. Ditto for the non-packed 8Bit, 4-element
vector variants.
v2: Don't memset props as this also zeros sType and pNext. Noticed by
Georg Lehmann in !12617.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
Previously, we misunderstood how conditional modifiers and saturate
interacted. We thought the condition was evaulated before the saturate
was applied. For the floating point cases, we went to some heroics to
modify the condition to maintain the same results. For integer cases,
it was not clear that this could even work. We had no use-cases and no
tests-cases, so we just disallowed everything.
Now we understand that the condition is evaluated after the saturate.
Earlier commits in this series removed the various floating point
heroics. It is easier to just delete the code that prevents some cases
that just work.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Originally this was part of "intel/fs: Remove condition-based
restriction for cmod propagation to saturated operations". With some
additional changes to that commit, it caused a lot of extra churn in the
unit tests. I felt that made it harder to see the actual changes in the
unit tests, so I split it out.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
I don't know why the float_saturate_l_mov test was #if'ed out, but it
passes... so this commit enables it.
No shader-db or fossil-db changes.
In a previous iteration of this MR, this commit helped ~200 shaders in
shader-db. Now all of those same shaders are helped by "intel/fs: cmod
propagate from MOV with any condition". All of these shaders come from
Mad Max. After initial translation from NIR to assembly, these shader
contain patterns like:
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat(8) g90<1>F g90<8,8,1>F
...
cmp.nz.f0(8) null<1>F g90<8,8,1>F 0 /* 0F */
An initial pass of cmod propagation converts this to
mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.sat.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit, XX is G. With this commit, XX is NZ. Saturate
propagation moves the saturate:
mul.sat(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
...
mov.XX.f0(8) g90<1>F g90<8,8,1>F
Without this commit (but with "intel/fs: cmod propagate from MOV with
any condition"), the G gets propagated:
mul.sat.g.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
With this commit (with or without "intel/fs: cmod propagate from MOV
with any condition"), the NZ gets propagated:
mul.sat.nz.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
There were tests related to propagating conditional modifiers from a MOV
to an instruction with a .SAT modifier for a very long time, but they
were #if'ed out.
There are restrictions later in the function that limit the kinds of MOV
instructions that can propagate. This avoids the dangers of
type-converting MOVs that may generate flags in different ways.
v2: Update the added comment to look more like the existing comment.
That makes the small differences between the two cases more obvious.
Noticed by Marcin.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19827127 -> 19826924 (<.01%)
instructions in affected programs: 62024 -> 61821 (-0.33%)
helped: 201
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36%
95% mean confidence interval for instructions value: -1.02 -1.00
95% mean confidence interval for instructions %-change: -0.36% -0.34%
Instructions are helped.
total cycles in shared programs: 954655879 -> 954655356 (<.01%)
cycles in affected programs: 1212877 -> 1212354 (-0.04%)
helped: 155
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4
helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07%
HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 8
HURT stats (rel) min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15%
95% mean confidence interval for cycles value: -3.60 -2.90
95% mean confidence interval for cycles %-change: -0.07% -0.05%
Cycles are helped.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
We were previously operating with the mindset "a MOV is just a compare
with zero." As a result, we were trying to share as much code between
the MOV path and the CMP path as possible. However, MOV instructions
can perform type conversions that affect the result of the comparison.
There was some code added to better handle this for cases like
and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD
mov.nz.f0(16) null<1>F g31<8,8,1>D
The flaw in these changed special cases is that it allowed things like
or(8) dest:D src0:D src1:D
mov.nz(8) null:D dest:F
Because both destinations were integer types, the propagation was
allowed. The source type of the MOV and the destination type of the OR
do not match, so type conversion rules have to be accounted for.
My solution was to just split the MOV and non-MOV paths with completely
separate checks. The "else" path in this commit is basically the old
code with the BRW_OPCODE_MOV special case removed.
The new MOV code further splits into "destination of scan_inst is float"
and "destination of scan_inst is integer" paths. For each case I
enumerate the rules that I belive apply. For the integer path, only the
"Z or NZ" rules are listed as only NZ is currently allowed (hence the
conditional_mod assertion in that path). A later commit relaxes this
and adds the rule.
The new rules slightly relax one of the previous rules. Previously the
sizes of the MOV destination and the MOV source had to be the same. In
some cases now the sizes can be different by the following conditions:
- Floating point to integer conversion are not allowed.
- If the conversion is integer to floating point, the size of the
floating point value does not matter as it will not affect the
comparison result.
- If the conversion is float to float, the size of the destination
must be greater than or equal to the size of the source.
- If the conversion is integer to integer, the size of the destination
must be greater than or equal to the size of the source.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
Of particular interest are the tests where the MOV performs a type
conversion. If the restriction on conditional modifier for a MOV is
ever relaxed, some of these cases must still be disallowed.
v2: s/NZ/Z/ in one of the comments. Notice by Marcin.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
We're going to start asserting for valid halign/valign values during
surface state emission. Pre-SKL, isl_surf_get_uncompressed_surf creates
uncompressed surfaces with invalid halign/valign values - 1x1. Fix this
by replacing the call to isl_surf_get_image_surf with isl_surf_init,
passing in the uncompressed format up-front.
As we're no longer using isl_surf_get_image_surf, we also need to get
the x and y offset of the image ourselves. Instead of getting a
sample-based offset, then converting to elements later on, we use
isl_surf_get_image_offset_B_tile_el to get the offset in elements
up-front.
With the above two changes, the generic code after the else-block is no
longer needed for the single-layer-view code path. We move it and
specialize it to the if-block (which is executed SKL+ and handles
multi-layer views).
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
On XeHP, NPOT and POT formatted surfaces will use different image
alignment units when emitting surface states. When BLORP fakes an RGB
image as RED, update the image alignment to prevent assert failures when
emitting surface states.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
Implement the new XeHP alignment rules for surface layout.
RENDER_SURFACE_STATE objects still need updating, but that's left for a
separate commit.
Rework:
* Nanley: Include Sagar's VALIGN fix for D16_UNORM.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
XeHP defines new tiling formats, Tile4 and Tile64. They are needed in
order to support depth/stencil surfaces and multisampling. Create new
ISL enums and define some initial tiling information in order to enable
them later on.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>