Ian Romanick
0f809dbf40
intel/compiler: Basic support for DP4A instruction
...
v2: Very significant rebase on changes to previous commits.
Specifically, brw_fs_nir.cpp changes were pretty much rewritten from
scratch after changing the NIR opcode names and types.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142 >
2021-08-24 19:58:57 +00:00
Jordan Justen
6a950bab0c
intel/compiler: Add unified barrier support for TCS
...
Program the producers/consumer fields for TCS Barrier messages.
Producer and consumer fields are set to number of TCS threads.
Ref: Bspec 54006 for Barrier Data Payload
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963 >
2021-08-24 01:31:48 +00:00
Jordan Justen
b4055a020f
intel/compiler: Regroup TCS barrier code paths
...
Rearrange if/else fragments to unify case for Gen11 or later
platforms. This will help the code look cleaner for adding
unified barrier support to TCS.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963 >
2021-08-24 01:31:48 +00:00
Ian Romanick
5ce3bfcdf3
intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
...
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76 .
With the previous optimizations in place, this change seems to improve
the quality of the generated code. Comparing a couple Vulkan CTS tests
on Skylake had the following results.
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends
dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Ian Romanick
f0a8a9816a
nir: intel/compiler: Add and use nir_op_pack_32_4x8_split
...
A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results
in a lot of shifts and MOVs. When that pattern can be recognized, the
individual 8-bit components can be packed much more efficiently.
v2: Rebase on b4369de27f ("nir/lower_packing: use
shader_instructions_pass")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Ian Romanick
7c83aa0518
intel/fs: Emit better code for u2u of extract
...
Emitting the instructions one by one results in two MOV instructions
that won't be propagated. By handling both instructions at once, a
single MOV is emitted. For example, on Ice Lake this helps
dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends
Without "intel/fs: Allow copy propagation between MOVs of mixed sizes,"
the improvement is still 8 instructions, but there are more instructions
to begin with:
SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Ian Romanick
f9665040f1
intel/compiler: Document and assert some aspects of 8-bit integer lowering
...
In the vec4 compiler, 8-bit types should never exist.
In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.
Some instructions are handled in non-obvious ways.
Hopefully this will save the next person some time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025 >
2021-08-18 22:03:37 +00:00
Timothy Arceri
a654e39f15
intel/compiler: make sure swizzle is applied to if condition
...
This fixes a hang in the following piglit test when GCM moves a
UBO load outside of the loop.
tests/shaders/ssa/fs-if-def-else-break.shader_test
The end NIR ends up looking like this:
vec2 32 ssa_3 = intrinsic load_ubo (ssa_2, ssa_0) (0, 1073741824, 0, 0, 8)
vec1 32 ssa_4 = mov ssa_3.x
vec1 32 ssa_5 = inot ssa_3.y
/* succs: block_1 */
loop {
...
if ssa_5 { }
}
Fixes: 1edf67fc3f ("intel/fs: Generate if instructions with inverted conditions")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064 >
2021-08-03 10:54:50 +00:00
Jason Ekstrand
4465ca296d
nir: Suffix all the MCS texture stuff _intel
...
It's intel-specific, used to get at MSAA compression information.
Reviewed-by: Emma Anholt <emma@anholt.net >
Reviewed-by: Adam Jackson <ajax@redhat.com >
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Connor Abbott <cwabbott0@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775 >
2021-07-23 15:53:57 +00:00
Jordan Justen
8c29891fa4
intel/compiler: Remove cube array size lowering in compiler backend
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9466 >
2021-07-21 11:34:49 -07:00
Sagar Ghuge
705285b9f4
intel/compiler: Add support for ternary add instruction on XeHP
...
v2:
- Re-arragne opcode in correct order (Matt Turner)
- Move ADD3 case closer to LRP (Jason)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596 >
2021-07-16 15:59:56 +00:00
Sagar Ghuge
b67f1ff465
intel/compiler: Add support for LSC fence operations
...
v2 (Jason Ekstrand):
- Squash SLM and global fence ops together
v3 (Jason Ekstrand):
- Rework to use message descriptors instead of instruction fields
v4 (Jason Ekstrand):
- Don't pass BTI into back-end emit function. Always use FLAT.
Co-authored-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11600 >
2021-06-30 16:17:18 +00:00
Jason Ekstrand
1e242785c3
intel/fs: Implement load/store_scratch on XeHP
...
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582 >
2021-06-25 00:18:29 +00:00
Francisco Jerez
231337a13a
intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.
...
As required by HSDES:14013363432.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11433 >
2021-06-23 07:34:22 +00:00
Caio Marcelo de Oliveira Filho
8af6766062
nir: Move workgroup_size and workgroup_variable_size into common shader_info
...
Move it out the "cs" sub-struct, since these will be used for other
shader stages in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225 >
2021-06-08 09:23:55 -07:00
Caio Marcelo de Oliveira Filho
c8a7bd0dc8
nir: Rename WORK_GROUP (and similar) to WORKGROUP
...
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho
a71a780598
nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size
...
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho
430d2206da
compiler: Rename local_size to workgroup_size
...
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Marcin Ślusarz
3340d5ee02
intel: simplify is_haswell checks, part 1
...
Generated with:
files=`git grep is_haswell | cut -d: -f1 | sort | uniq`
for file in $files; do
cat $file | \
sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
sed "s/devinfo->ver >= 8 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
sed "s/devinfo->is_haswell || devinfo->ver >= 8/devinfo->verx10 >= 75/g" | \
sed "s/devinfo.is_haswell || devinfo.ver >= 8/devinfo.verx10 >= 75/g" | \
sed "s/devinfo->ver > 7 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" | \
sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" | \
sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \
> tmpXXX
mv tmpXXX $file
done
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810 >
2021-05-17 09:46:45 +00:00
Lionel Landwerlin
a297061524
intel/compiler: add support for fragment shading rate variable
...
v2: Drop old register type initializers (Jason)
Simplify instruction snippet (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Anuj Phogat
61e8636557
intel: Rename gen_device prefix to intel_device
...
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241 >
2021-04-20 20:06:33 +00:00
Michel Dänzer
2928c21eb7
Convert most remaining free-form fall-through comments to FALLTHROUGH
...
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.
Reviewed-by: Eric Anholt <eric@anholt.net >
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com >
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com >
Reviewed-by: Gert Wollny <gert.wollny@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220 >
2021-04-15 16:01:22 +00:00
Jason Ekstrand
e6c79329dd
intel: fix querying mip levels on null surfaces on SKL and prior
...
When a surface of type SURFTYPE_NULL is accessed by resinfo, the MIPCount
returned is undefined instead of 0.
Closes #4309
Fixes dEQP-VK.robustness.robustness2.*.sampled_image.*.null_descriptor.*
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10147 >
2021-04-13 13:30:09 -07:00
Anuj Phogat
1d296484b4
intel: Rename Genx keyword to Gfxx
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"
Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
b75f095bc7
intel: Rename genx keyword to gfxx in source files
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g"
Exclude pack.h and xml changes in this patch:
grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g"
grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
c1f3a778de
intel: Rename GENx prefix in macros to GFXx in source files
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN" -rIl src/intel/genxml | grep -E ".*py" | xargs sed -ie "s/GEN\([%{]\)/GFX\1/g"
grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH | grep -E ".*(\.c|\.h|\.y|\.l)" | xargs sed -ie "s/\([^_]\)GEN\([[:digit:]]\+\)/\1GFX\2/g"
Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+":
grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH | xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09
intel: Rename gen field in gen_device_info struct to ver
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
99331f6deb
intel: Rename genx10 field in gen_device_info struct to verx10
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)genx10" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)genx10/info\1\2verx10/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Jason Ekstrand
91192696e6
intel/fs: Add support for 16-bit A64 float and integer atomics
...
The messages for those 16-bit operations still use 32-bit sources and
destinations, so expand them accordingly when building the payload.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8750 >
2021-03-18 00:13:40 +00:00
Jason Ekstrand
1ce3660a5a
intel/fs,rt: Add a predicate to load_global_const_block
...
This allows us to do bounds checked A64 block load without the it being
counted as control-flow by NIR. This means that NIR optimizations like
CSE will be able to work on these the same as a regular load.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8635 >
2021-03-17 17:49:58 +00:00
Timur Kristóf
17bc587f88
intel/compiler: Make room for maximum dest size in nir_emit_texture.
...
The maximum dest_size is 5, not 4.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9634 >
2021-03-17 03:47:23 +00:00
Eric Anholt
1e5ef4c60c
nir: Add a nir_src_is_undef() helper, like nir_src_is_const().
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345 >
2021-03-03 00:51:44 +00:00
Jordan Justen
18bc7d9d3f
intel: Use devinfo genx10 field
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9329 >
2021-03-01 22:00:08 -08:00
Jason Ekstrand
e797daba53
intel/compiler: Move brw_reg_type_for_bit_size to brw_reg_type.h
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jordan Justen
9294193098
intel/compiler: Disable push constants on gen12-hp
...
We currently don't use push constants with the COMPUTE_WALKER command.
Make all uniforms to be pull constants.
The local group id previously was a push constant, but is now
available in R0.2[7:0].
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342 >
2021-01-13 13:10:27 -08:00
Jason Ekstrand
a1976e1cb2
intel/fs: Implement nir_jump_halt
...
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:18 -06:00
Jason Ekstrand
6992d2f625
intel/fs: Emit HALT_TARGET in emit_nir_code()
...
Instead of making it a fragment-specific thing based on uses_kill, track
whether or not we need one in fs_visitor and emit HALT_TARGET at the end
of emit_nir_code() if needed.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:14 -06:00
Jason Ekstrand
f9d549b2bf
intel/fs: Use BRW_OPCODE_HALT for discards
...
We're about to start using it to implement nir_jump_halt which has
nothing inherently to do with fragment shaders or discards. May as well
name it for the HW instruction it generates.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:08 -06:00
Jason Ekstrand
75209d5bd1
intel/fs: Add and implement intel-specific ray-tracing intrinsics
...
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356 >
2020-11-25 05:37:10 +00:00
Jason Ekstrand
1f6e70c85a
intel/fs: Add and implement a load_global_const_block intrinsic
...
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356 >
2020-11-25 05:37:09 +00:00
Jason Ekstrand
7280b0911d
intel/compiler: Add support for bindless shaders
...
The Intel bindless thread dispatch model is very simple. When a compute
shader is to be used for bindless dispatch, it can request a set of
stack IDs. These are allocated per-dual-subslice by the hardware and
recycled automatically when the stack ID is returned. Passed to the
bindless dispatch are a global argument address, a stack ID, and an
address of the BINDLESS_SHADER_RECORD to invoke. When the bindless
shader is dispatched, it is passed its stack ID as well as the global
and local argument pointers. The local argument pointer is the address
of the BINDLESS_SHADER_RECORD plus some offset which is specified as
part of the BINDLESS_SHADER_RECORD.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7356 >
2020-11-25 05:37:09 +00:00
Kenneth Graunke
97ebb896af
intel/compiler: Do interpolateAtOffset coordinate scaling in NIR
...
In our source languages, interpolateAtOffset() takes a floating point
offset in the range [-0.5, +0.5]. However, the hardware takes integer
valued offsets in the range [-8, 7], in units of 1/16th of a pixel.
So, we need to multiply and clamp the coordinates. We were doing this
in the FS backend, but with the advent of IBC, I'd like to avoid doing
it twice. This patch instead moves the lowering to NIR so we can reuse
it across both backends.
v2: Use nir_shader_instructions_pass (suggested by Eric Anholt).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6193 >
2020-11-18 23:26:53 +00:00
Jason Ekstrand
68092df8d8
intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
...
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:
- Byte operations don't actually execute with a byte type. The
execution type for byte operations is actually word. (I don't know
if this has implications for the HW implementation. Probably?)
- Destinations are required to be strided out to at least the
execution type size. This means that B-type operations always have
a stride of at least 2. This means wreaks havoc on the back-end in
multiple ways.
- Thanks to the strided destination, we don't actually save register
space by storing things in bytes. We could, in theory, interleave
two byte values into a single 2B-strided register but that's both a
pain for RA and would lead to piles of false dependencies pre-Gen12
and on Gen12+, we'd need some significant improvements to the SWSB
pass.
- Also thanks to the strided destination, all byte writes are treated
as partial writes by the back-end and we don't know how to copy-prop
them.
- On Gen11, they added a new hardware restriction that byte types
aren't allowed in the 2nd and 3rd sources of instructions. This
means that we have to emit B->W conversions all over to resolve
things. If we emit said conversions in NIR, instead, there's a
chance NIR can get rid of some of them for us.
We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us. It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus. There is still a bit we have to handle in the back-end. In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482 >
2020-11-09 18:58:51 +00:00
Jason Ekstrand
b98f0d3d7c
intel/nir: Lower 8-bit scan/reduce ops to 16-bit
...
We can't really support these directly on any platform. May as well let
NIR lower them. The NIR lowering is potentially one more instruction
for scan/reduce ops thanks to not being able to do the B->W conversion
as part of SEL_EXEC. For imax/imin exclusive scan, it's yet another
instruction thanks to the extra imax/imin NIR has to insert to deal with
the fact that the first live channel will contain the identity value
which, when signed, will cast wrong. However, it does let us drop some
complexity from our back-end so it's probably worth it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482 >
2020-11-09 18:58:51 +00:00
Caio Marcelo de Oliveira Filho
5d5f3e3a47
intel/fs: Implement nir_intrinsic_{load,store}_shared_block_intel
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448 >
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
9fe158e1d1
intel/fs: Implement nir_intrinsic_{load,store}_ssbo_block_intel
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448 >
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
296137df53
intel/fs: Implement nir_intrinsic_{load,store}_global_block_intel
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7448 >
2020-11-04 20:24:48 +00:00
Caio Marcelo de Oliveira Filho
ce0b72a13a
intel/fs: Don't emit_uniformize when getting a constant SSBO index
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7340 >
2020-10-29 21:54:01 +00:00
Caio Marcelo de Oliveira Filho
e7e24d5039
intel/fs: Handle nir_intrinsic_terminate
...
For terminate operation, jump the invocation without predicating on
the rest of the quad being disabled -- which is what is done for
demote and discard.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7150 >
2020-10-15 21:40:09 +00:00
Timur Kristóf
2be99012e9
nir: Add ability to count emitted GS primitives.
...
Add an option to nir_lower_gs_intrinsics which tells it to track
the number of emitted primitives, not just vertices. Additionally,
also make it per-stream.
Also rename the set_vertex_count intrinsic to
set_vertex_and_primitive_count.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00