Commit Graph

7023 Commits

Author SHA1 Message Date
Ian Romanick
f2c69f8306 anv: Enable KHR_shader_integer_dot_product
For now, only mark the 4x8BitPacked variants as accelerated.

Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms.  If we encounter an application that cares, we can do
things differently then.  Ditto for the non-packed 8Bit, 4-element
vector variants.

v2: Don't memset props as this also zeros sType and pNext.  Noticed by
Georg Lehmann in !12617.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
2021-08-31 19:57:21 +00:00
Lionel Landwerlin
838c0e5eef intel/fs: fix framebuffer reads
We're missing some restrictions on those messages.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5292
Cc: mesa-stable
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12615>
2021-08-31 11:52:39 +00:00
Ian Romanick
a8d0c0af86 intel/fs: Remove type-based restriction for cmod propagation to saturated operations
Previously, we misunderstood how conditional modifiers and saturate
interacted.  We thought the condition was evaulated before the saturate
was applied.  For the floating point cases, we went to some heroics to
modify the condition to maintain the same results.  For integer cases,
it was not clear that this could even work.  We had no use-cases and no
tests-cases, so we just disallowed everything.

Now we understand that the condition is evaluated after the saturate.
Earlier commits in this series removed the various floating point
heroics.  It is easier to just delete the code that prevents some cases
that just work.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
5ad88fd499 intel/fs: Remove after parameter from test_saturate_prop
Originally this was part of "intel/fs: Remove condition-based
restriction for cmod propagation to saturated operations".  With some
additional changes to that commit, it caused a lot of extra churn in the
unit tests.  I felt that made it harder to see the actual changes in the
unit tests, so I split it out.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
e6373923a7 intel/fs: Remove condition-based restriction for cmod propagation to saturated operations
I don't know why the float_saturate_l_mov test was #if'ed out, but it
passes... so this commit enables it.

No shader-db or fossil-db changes.

In a previous iteration of this MR, this commit helped ~200 shaders in
shader-db.  Now all of those same shaders are helped by "intel/fs: cmod
propagate from MOV with any condition".  All of these shaders come from
Mad Max.  After initial translation from NIR to assembly, these shader
contain patterns like:

    mul(8)            g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.sat(8)        g90<1>F       g90<8,8,1>F
    ...
    cmp.nz.f0(8)      null<1>F      g90<8,8,1>F     0 /* 0F */

An initial pass of cmod propagation converts this to

    mul(8)            g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.sat.XX.f0(8)  g90<1>F       g90<8,8,1>F

Without this commit, XX is G.  With this commit, XX is NZ.  Saturate
propagation moves the saturate:

    mul.sat(8)        g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */
    ...
    mov.XX.f0(8)      g90<1>F       g90<8,8,1>F

Without this commit (but with "intel/fs: cmod propagate from MOV with
any condition"), the G gets propagated:

    mul.sat.g.f0(8)   g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */

With this commit (with or without "intel/fs: cmod propagate from MOV
with any condition"), the NZ gets propagated:

    mul.sat.nz.f0(8)  g90<1>F       g88<8,8,1>F     0x40400000F  /* 3F */

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
47f0cdc449 intel/fs: cmod propagate from MOV with any condition
There were tests related to propagating conditional modifiers from a MOV
to an instruction with a .SAT modifier for a very long time, but they
were #if'ed out.

There are restrictions later in the function that limit the kinds of MOV
instructions that can propagate.  This avoids the dangers of
type-converting MOVs that may generate flags in different ways.

v2: Update the added comment to look more like the existing comment.
That makes the small differences between the two cases more obvious.
Noticed by Marcin.

All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 19827127 -> 19826924 (<.01%)
instructions in affected programs: 62024 -> 61821 (-0.33%)
helped: 201
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1
helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36%
95% mean confidence interval for instructions value: -1.02 -1.00
95% mean confidence interval for instructions %-change: -0.36% -0.34%
Instructions are helped.

total cycles in shared programs: 954655879 -> 954655356 (<.01%)
cycles in affected programs: 1212877 -> 1212354 (-0.04%)
helped: 155
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4
helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07%
HURT stats (abs)   min: 2 max: 12 x̄: 7.00 x̃: 8
HURT stats (rel)   min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15%
95% mean confidence interval for cycles value: -3.60 -2.90
95% mean confidence interval for cycles %-change: -0.07% -0.05%
Cycles are helped.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
a9120eccff intel/compiler: Move type_is_unsigned_int to brw_reg_type.h
...and rename it to brw_reg_type_is_unsigned_integer.  It is now next to
brw_reg_type_is_floating_point and brw_reg_type_is_integer.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
b23432c540 intel/fs: Fix a cmod prop bug when the source type of a mov doesn't match the dest type of scan_inst
We were previously operating with the mindset "a MOV is just a compare
with zero."  As a result, we were trying to share as much code between
the MOV path and the CMP path as possible.  However, MOV instructions
can perform type conversions that affect the result of the comparison.
There was some code added to better handle this for cases like

    and(16)         g31<1>UD       g20<8,8,1>UD   g22<8,8,1>UD
    mov.nz.f0(16)   null<1>F       g31<8,8,1>D

The flaw in these changed special cases is that it allowed things like

    or(8)           dest:D  src0:D  src1:D
    mov.nz(8)       null:D  dest:F

Because both destinations were integer types, the propagation was
allowed.  The source type of the MOV and the destination type of the OR
do not match, so type conversion rules have to be accounted for.

My solution was to just split the MOV and non-MOV paths with completely
separate checks.  The "else" path in this commit is basically the old
code with the BRW_OPCODE_MOV special case removed.

The new MOV code further splits into "destination of scan_inst is float"
and "destination of scan_inst is integer" paths.  For each case I
enumerate the rules that I belive apply.  For the integer path, only the
"Z or NZ" rules are listed as only NZ is currently allowed (hence the
conditional_mod assertion in that path).  A later commit relaxes this
and adds the rule.

The new rules slightly relax one of the previous rules.  Previously the
sizes of the MOV destination and the MOV source had to be the same.  In
some cases now the sizes can be different by the following conditions:

  - Floating point to integer conversion are not allowed.

  - If the conversion is integer to floating point, the size of the
    floating point value does not matter as it will not affect the
    comparison result.

  - If the conversion is float to float, the size of the destination
    must be greater than or equal to the size of the source.

  - If the conversion is integer to integer, the size of the destination
    must be greater than or equal to the size of the source.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
0797388dc2 intel/fs: Add many cmod propagation tests involving MOV instructions
Of particular interest are the tests where the MOV performs a type
conversion.  If the restriction on conditional modifier for a MOV is
ever relaxed, some of these cases must still be disallowed.

v2: s/NZ/Z/ in one of the comments.  Notice by Marcin.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
4f6c5da025 intel/fs: Remove redundant inst->opcode checks in cmod prop
This foreach_inst_in_block_reverse_starting_from loop only applies
CMP, MOV, and AND.  AND instructions break out of the loop before this
point.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Ian Romanick
3afefb0818 intel/fs: Refactor some cmod propagation tests
This will simplify some later changes to these tests.

v2: Combine test_positive_saturate_prop and test_negative_saturate_prop
into a single function.  Suggested by Marcin.

Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>
2021-08-30 14:00:14 -07:00
Marcin Ślusarz
e0533ebf16 intel/compiler: INT DIV function does not support source modifiers
BSpec says that for all generations.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5281
CC: mesa-stable

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12518>
2021-08-26 07:51:44 +00:00
Nanley Chery
565f9105b7 anv/image: Don't assert that HiZ can be added
HiZ isn't yet enabled for Tile4/64.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
bd516c6581 intel/isl: Disable I915_FORMAT_MOD_Y_TILED on XeHP+
XeHP lacks support for Y-tiling.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
af40104e7d intel: Add underscores to HALIGN and VALIGN enums
The HALIGN enums for XeHP already have underscores. Make the other
HALIGN and VALIGN enums conform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
3d1f6342c0 intel: Update surface states for XeHP alignments
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
fbde743b07 intel/isl: Use a switch for HALIGN/VALIGN encoding
Avoid using a sparse and relatively large array for HALIGN encoding.
Additionally, this provides validation of the input alignment values.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
2e464e69b9 intel/isl: Fix halign/valign of uncompressed views
We're going to start asserting for valid halign/valign values during
surface state emission. Pre-SKL, isl_surf_get_uncompressed_surf creates
uncompressed surfaces with invalid halign/valign values - 1x1. Fix this
by replacing the call to isl_surf_get_image_surf with isl_surf_init,
passing in the uncompressed format up-front.

As we're no longer using isl_surf_get_image_surf, we also need to get
the x and y offset of the image ourselves. Instead of getting a
sample-based offset, then converting to elements later on, we use
isl_surf_get_image_offset_B_tile_el to get the offset in elements
up-front.

With the above two changes, the generic code after the else-block is no
longer needed for the single-layer-view code path. We move it and
specialize it to the if-block (which is executed SKL+ and handles
multi-layer views).

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
c7bcbc950c intel/blorp: Fix Gfx7 stencil surface state valign
Stencil on Gfx7 has a vertical alignment element of 8, but the largest
its surface state can express is 4. Apply the Gfx6 solution of changing
the alignment in blorp_surf_retile_w_to_y.

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
1f62cddaf5 intel/blorp: Fix faked RGB image alignment on XeHP
On XeHP, NPOT and POT formatted surfaces will use different image
alignment units when emitting surface states. When BLORP fakes an RGB
image as RED, update the image alignment to prevent assert failures when
emitting surface states.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
79ad9cda48 intel: Support Tile4/64 in surface states
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
dd9ae2dc7b intel: Support Tile4/64 in depth/stencil state
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
f54de77c3a intel/isl: Update tiling filter functions for XeHP
Enable the XeHP-specific tilings and restrict them to that platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
0ab2fa18e4 intel/isl: Use an allow-list in gfx6_filter_tiling
Try to avoid having to update isl_gfx6_filter_tiling when new tilings
are added for new platforms. Note that the allow-list uses
ISL_TILING_ANY_Y_MASK and thus assumes that no new Y-tilings will be
added in the future.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
602f597bc1 intel/isl: Drop ISL_SURF_USAGE_DISPLAY_*_BIT
We haven't used these since their introduction.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
ac37d7801c intel/isl: Drop extra assert on array_pitch_el_rows
ISL already asserts that the variable is a multiple of the tile height
via isl_assert_div.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
4309012774 intel/isl: Size Tile64 surfaces with 4 dimensions
In order to size Tile64 surfaces correctly, make sure that the total
physical extent is arrayed. The code should handle 3D surfaces as well,
but is untested for now.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
8fd7678241 intel/isl: Update image alignments on XeHP
Implement the new XeHP alignment rules for surface layout.
RENDER_SURFACE_STATE objects still need updating, but that's left for a
separate commit.

Rework:
* Nanley: Include Sagar's VALIGN fix for D16_UNORM.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
0bcfa2d8fb intel/isl: Define ISL_TILING_4/64 for XeHP
XeHP defines new tiling formats, Tile4 and Tile64. They are needed in
order to support depth/stencil surfaces and multisampling. Create new
ISL enums and define some initial tiling information in order to enable
them later on.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Nanley Chery
44ef425ce8 intel/isl: Add msaa_layout param to isl_tiling_get_info
The additional parameter will be used by Tile64.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Jason Ekstrand
e307d46eab intel/isl: Add more parameters to isl_tiling_get_info
They are not used yet but the layout of Yf and Ys tiles are dependent on
these parameters.  While we're here, better document the function.

Rework:
* Nanley: Update crocus.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>
2021-08-25 22:39:30 +00:00
Ian Romanick
0f809dbf40 intel/compiler: Basic support for DP4A instruction
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
2021-08-24 19:58:57 +00:00
Jason Ekstrand
31fdd26d01 intel/compiler: Add unified barrier support for CS
Program CS barrier message fields for producers/consumers.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
2021-08-24 01:31:48 +00:00
Jordan Justen
6a950bab0c intel/compiler: Add unified barrier support for TCS
Program the producers/consumer fields for TCS Barrier messages.
Producer and consumer fields are set to number of TCS threads.

Ref: Bspec 54006 for Barrier Data Payload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
2021-08-24 01:31:48 +00:00
Jordan Justen
b4055a020f intel/compiler: Regroup TCS barrier code paths
Rearrange if/else fragments to unify case for Gen11 or later
platforms. This will help the code look cleaner for adding
unified barrier support to TCS.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>
2021-08-24 01:31:48 +00:00
Nanley Chery
2944f49610 intel: Parse INTEL_NO_HW for devinfo construction
This commit does several things:

* Unify code common to several drivers by evaluating INTEL_NO_HW within
  intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
  separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
  code simplification, but we may find reason to undo this later on.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
2021-08-24 00:12:47 +00:00
Nanley Chery
7d59a66e3a intel: Use env_var_as_boolean for INTEL_NO_HW
The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
2021-08-24 00:12:47 +00:00
Nanley Chery
4003f2d48d anv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa)
Only emit the workaround as needed.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>
2021-08-20 17:50:35 +00:00
Nanley Chery
2ae70329f5 intel: Move the D16 workarounds out of ISL
Implement the workarounds in anv and iris instead.

Before this commit, ISL unconditionally modified workaround registers
while filling out depth stencil state. To account for this, drivers
unconditionally stalled prior to emitting depth stencil packets. This
hurt performance.

By having the drivers perform the workarounds, they can choose when to
modify the relevant registers. The drivers now avoid emitting the
workaround for NULL depth buffers. This reduces stalls and leads to
better performance.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>
2021-08-20 17:50:35 +00:00
Nanley Chery
14b3732b84 anv: Add genX(cmd_buffer_emit_gfx12_depth_wa)
This will replace the workaround built into ISL.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>
2021-08-20 17:50:35 +00:00
Jason Ekstrand
a6a449837b anv: Set CONTEXT_PARAM_RECOVERABLE to false
We want the kernel to ban our context immediately instead of foolhardily
attempting to recover.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12476>
2021-08-19 19:37:03 +00:00
Ian Romanick
5ce3bfcdf3 intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.

With the previous optimizations in place, this change seems to improve
the quality of the generated code.  Comparing a couple Vulkan CTS tests
on Skylake had the following results.

dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends

dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
f0a8a9816a nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO.  This results
in a lot of shifts and MOVs.  When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.

v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
7c83aa0518 intel/fs: Emit better code for u2u of extract
Emitting the instructions one by one results in two MOV instructions
that won't be propagated.  By handling both instructions at once, a
single MOV is emitted.  For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends

Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
e3f502e007 intel/fs: Allow copy propagation between MOVs of mixed sizes
This eliminates some spurious, size-converting moves.  For example, on
Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:

SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends

Unfortunately, this doesn't clean everything up.  Here's a subset of the
"before" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
mov(8)          g12<1>UB        g7<32,8,4>UB                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g12<8,8,1>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g14<1>UB        g8<32,8,4>UB                    { align1 1Q };
mov(8)          g16<1>UW        g14<8,8,1>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

And here's the same subset of the "after" assembly:

send(8)         g11<1>UW        g2<0,1,0>UD     0x02106e02
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g7<4>UB         g11<8,8,1>UD                    { align1 1Q };
send(8)         g13<1>UW        g2<0,1,0>UD     0x02106e03
                            dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q };
mov(8)          g15<1>UW        g7<32,8,4>UB                    { align1 1Q };
mov(8)          g8<4>UB         g13<8,8,1>UD                    { align1 1Q };
mov(8)          g16<1>UW        g8<32,8,4>UB                    { align1 1Q };
xor(8)          g17<1>UW        g15<8,8,1>UW    g16<8,8,1>UW    { align1 1Q };

There are a lot of regioning and type restrictions in
fs_visitor::try_copy_propagate, and I'm a little nervious about messing
with them too much.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Suggested-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick
f9665040f1 intel/compiler: Document and assert some aspects of 8-bit integer lowering
In the vec4 compiler, 8-bit types should never exist.

In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.

Some instructions are handled in non-obvious ways.

Hopefully this will save the next person some time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Jordan Justen
7faad66ab0 intel/pci-ids: Re-enable DG1 and add SG1
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584>
2021-08-18 17:35:41 +00:00
Sagar Ghuge
57bfd7122f anv: Fix VK_EXT_memory_budget to consider VRAM if available
Instead of calling the OS query, re-run anv_update_meminfo to get the
latest from either the kernel memory info API or the OS as appropriate.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5173
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
2021-08-18 17:13:00 +00:00
Jason Ekstrand
758662759d anv: compute available memory in anv_init_meminfo
We can now detect EXT_memory_budget support based on whether or not we
have non-zero available system memory.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
2021-08-18 17:13:00 +00:00
Jason Ekstrand
5c79c545e3 anv: Rework init_meminfo
Instead of making LMEM the special case, unify the two paths by setting
up a fake drm_i915_query_memory_regions struct and filling it out based
on OS queries.  The important functional change here is that we now pass
system memory through the same GTT size and 3/4 filter that we were
using with the OS queries.  This should make behavior consistent on
integrated GPUs regardless of whether or not we have the memory region
query API.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
2021-08-18 17:13:00 +00:00