Caio Marcelo de Oliveira Filho
8af6766062
nir: Move workgroup_size and workgroup_variable_size into common shader_info
...
Move it out the "cs" sub-struct, since these will be used for other
shader stages in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225 >
2021-06-08 09:23:55 -07:00
Caio Marcelo de Oliveira Filho
430d2206da
compiler: Rename local_size to workgroup_size
...
Acked-by: Emma Anholt <emma@anholt.net >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Acked-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190 >
2021-06-07 22:34:42 +00:00
Marcin Ślusarz
3340d5ee02
intel: simplify is_haswell checks, part 1
...
Generated with:
files=`git grep is_haswell | cut -d: -f1 | sort | uniq`
for file in $files; do
cat $file | \
sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
sed "s/devinfo->ver >= 8 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
sed "s/devinfo->is_haswell || devinfo->ver >= 8/devinfo->verx10 >= 75/g" | \
sed "s/devinfo.is_haswell || devinfo.ver >= 8/devinfo.verx10 >= 75/g" | \
sed "s/devinfo->ver > 7 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" | \
sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" | \
sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \
> tmpXXX
mv tmpXXX $file
done
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810 >
2021-05-17 09:46:45 +00:00
Caio Marcelo de Oliveira Filho
c0dc6affdc
intel/compiler: Clarify why VUE is recomputed by FS
...
FS will get the last geometry VUE, but it still needs to recompute in
case the number of position slots assigned by geometry is larger than
one -- this happens when Primitive Replication is used.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10653 >
2021-05-13 12:10:26 -07:00
Caio Marcelo de Oliveira Filho
caf9fb1a10
intel/compiler: Remove unused exported functions
...
Now that all drivers are using brw_cs_get_dispatch_info() we can
remove one function (which is now unused) and reduce the scope of the
other.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504 >
2021-05-04 08:15:19 -07:00
Caio Marcelo de Oliveira Filho
5cc758558d
intel/compiler: Add common function for CS dispatch info
...
We have this small calculations repeated in each Intel driver, so move
them to a single place to be reused. Also includes "right_mask" since
is always used in the same context and depends on the dispatch info
values.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504 >
2021-05-04 08:15:19 -07:00
Lionel Landwerlin
0421690f83
intel/compiler: add restrictions related to coarse pixel shading
...
v2: Update to BITSET_TEST()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
6d4070f3dd
intel/compiler: add support for fragment coordinate with coarse pixels
...
v2: Drop new internal opcodes (Jason)
Simplify code (Jason)
v3: Add Z computation for coarse pixels
v4: Document things a little
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
a297061524
intel/compiler: add support for fragment shading rate variable
...
v2: Drop old register type initializers (Jason)
Simplify instruction snippet (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
b6332fc4a8
intel/compiler: handle coarse pixel in render target writes descriptors
...
v2: Use the new inst->ex_desc field (Jason)
v3: Drop CPS LoD compensation from sampler messages (Lionel)
v4: Drop useless uses_rate_shading (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
64551610d1
intel/compiler: rework message descriptors for render targets
...
Render target message descriptors are slightly different from the
dataport ones. In particular the msg_type field is on bits 14:17 for
RT while bits 14:18 for DP.
v2: Drop unused send_commit_msg field in brw_fb_write_desc() (Ken)
v3: Rebase on top renaming (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Suggested-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Lionel Landwerlin
dabaaaf6c7
intel/compiler: make sure we keep the lowest dispatch limit
...
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455 >
2021-05-02 20:20:06 +00:00
Anuj Phogat
61e8636557
intel: Rename gen_device prefix to intel_device
...
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241 >
2021-04-20 20:06:33 +00:00
Anuj Phogat
926d343acf
intel: Rename files with gen_debug prefix
...
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "*gen_debug.*[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \;
grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g"
grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241 >
2021-04-20 20:06:33 +00:00
Francisco Jerez
a0e0dfe174
intel/fs: Introduce lowering pass to implement derivatives in terms of quad swizzles.
...
Unfortunately the funky Align1 regions used by the code generator in
order to implement derivatives efficiently aren't available to the
floating-point pipeline on XeHP. We need to lower them into a number
of pipelined integer shuffle instructions followed by the
floating-point difference computation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000 >
2021-04-16 08:27:35 +00:00
Rafael Antognolli
49b2d9f428
intel/fs: Lower dword integer multiplies on XeHP.
...
From the BSpec:
"When multiplying DW X DW, resulting dst can only be QW precision. If
DW precision is required at output than MUL/MACH macro must be used."
So for now simply lower it. We might want to revisit it later.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000 >
2021-04-16 08:27:35 +00:00
Francisco Jerez
0dc16965a9
intel/fs: Fix repclear assembly for XeHP+ regioning restrictions.
...
The regioning mode used here is no longer supported by the
floating-point pipeline. We could run the regioning lowering pass in
order to fix it with some extra copies, but it's more efficient to
change the instruction to use integer types.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000 >
2021-04-16 08:27:35 +00:00
Iván Briano
8328989130
intel, anv: propagate robustness setting to nir_opt_load_store_vectorize
...
Closes #4309
Fixes dEQP-VK-robustness.robustness2.*.readonly.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10147 >
2021-04-13 13:30:09 -07:00
Bas Nieuwenhuizen
580f1ac473
nir: Extract shader_info->cs.shared_size out of union.
...
It is valid for all stages, just 0 for most of them. In particular
mesh/task shaders might be using it.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10094 >
2021-04-08 14:39:28 +00:00
Lionel Landwerlin
49be175a4b
intel/fs: limit OW reads to 8 owords on XeHP+
...
We can only use 16 OW reads/writes on SLM.
v2: Update comment (Curro)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
BSpec: 47652
Fixes: 369eab9420 ("intel/fs: Emit code for Gen12-HP indirect compute data")
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10082 >
2021-04-08 09:25:38 +00:00
Anuj Phogat
f96c3b8b63
intel: Rename GEN:BUG:### to Wa_###
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN:BUG:" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN\(:BUG:\)/Wa_/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
e7e55af4d6
intel: Rename GENx keyword to GFXx
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN\([[:digit:]]\+\)/GFX\1/g"
Exclude the changes to modifiers:
grep -E "I915_.*GFX" -rIl $SEARCH_PATH | xargs sed -ie "s/\(I915_.*\)GFX/\1GEN/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
1d296484b4
intel: Rename Genx keyword to Gfxx
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"
Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
b75f095bc7
intel: Rename genx keyword to gfxx in source files
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g"
Exclude pack.h and xml changes in this patch:
grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g"
grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
c1f3a778de
intel: Rename GENx prefix in macros to GFXx in source files
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "GEN" -rIl src/intel/genxml | grep -E ".*py" | xargs sed -ie "s/GEN\([%{]\)/GFX\1/g"
grep -E "[^_]GEN[[:digit:]]+" -rIl $SEARCH_PATH | grep -E ".*(\.c|\.h|\.y|\.l)" | xargs sed -ie "s/\([^_]\)GEN\([[:digit:]]\+\)/\1GFX\2/g"
Leave out renaming GFX12_CCS_E macros. They fall under renaming pattern like "_GEN[[:digit:]]+":
grep -E "GFX12_CCS_E" -rIl $SEARCH_PATH | xargs sed -ie "s/GFX12_CCS_E/GEN12_CCS_E/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
abe9a71a09
intel: Rename gen field in gen_device_info struct to ver
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Anuj Phogat
99331f6deb
intel: Rename genx10 field in gen_device_info struct to verx10
...
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)genx10" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)genx10/info\1\2verx10/g"
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936 >
2021-04-02 18:33:07 +00:00
Caio Marcelo de Oliveira Filho
e93c8ab023
intel/compiler: Use a struct for brw_compile_cs parameters
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
05933fb0f7
intel/compiler: Use INTEL_DEBUG=blorp to dump blorp shaders
...
Make INTEL_DEBUG=blorp dump the blorp shaders instead using the
general INTEL_DEBUG=fs,vs, which is now reserved to the actual FS and
VS shaders used by the pipeline.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
7fb1e58651
intel/compiler: Make visitors take debug_enabled as a parameter
...
The callers already have this value, and we would like to make it
follow different rules other than stage that might not be visible to
the helper function, so just pass explicitly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
244d2daa00
intel/compiler: Make brw_postprocess_nir take debug_enabled as a parameter
...
The callers already have this value, and we would like to make it
follow different rules other than stage that might not be visible to
the helper function, so just pass explicitly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
82d77f0ea8
intel/compiler: Refactor the shader INTEL_DEBUG checks
...
Make the check once in a variable, that can be reused for other parts.
Also add `unlikely` to the various conditionals depending on it
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
f5e1765f98
intel/compiler: Use a struct for brw_compile_fs parameters
...
Makes calling code more explicit about what is being set, and allows
take advantage of zero initialization for the ones the callsite don't
care.
Besides moving to the struct, two extra "ergonomic" changes were done:
- Add a new shader_time boolean, so shader_time_index is ignored when
unused -- this allow taking advantage of the zero initialization of
unset fields.
- Since we have a struct, provide space for the error_str pointer.
Both iris and i965 were using it, and the extra rstrdup in case of
failure shouldn't be a burden for the others.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Caio Marcelo de Oliveira Filho
84c3d68344
intel/compiler: Make vue_map parameter const for brw_compile_fs
...
Just a documentation hint that the VUE map is not modified.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779 >
2021-03-24 23:18:46 +00:00
Jason Ekstrand
91192696e6
intel/fs: Add support for 16-bit A64 float and integer atomics
...
The messages for those 16-bit operations still use 32-bit sources and
destinations, so expand them accordingly when building the payload.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8750 >
2021-03-18 00:13:40 +00:00
Jason Ekstrand
8b7c2f1800
intel/fs: Use INTEL_MASK for pushish constant address masking
...
It's easier to compare with the HW docs than a pile of hex.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501 >
2021-03-10 22:17:41 +00:00
Jason Ekstrand
117668b811
nir: Make nir_ssa_def_rewrite_uses take an SSA value
...
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com >
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383 >
2021-03-08 16:59:55 +00:00
Jordan Justen
18bc7d9d3f
intel: Use devinfo genx10 field
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9329 >
2021-03-01 22:00:08 -08:00
Ian Romanick
3c31364f5e
intel/compiler: Use CMPN for min / max on Gen4 and Gen5
...
On Intel platforms before Gen6, there is no min or max instruction.
Instead, a comparison instruction (*more on this below) and a SEL
instruction are used. Per other IEEE rules, the regular comparison
instruction, CMP, will always return false if either source is NaN. A
sequence like
cmp.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
will generate the wrong result for min if g22 is NaN. The CMP will
return false, and the SEL will pick g22.
To account for this, the hardware has a special comparison instruction
CMPN. This instruction behaves just like CMP, except if the second
source is NaN, it will return true. The intention is to use it for min
and max. This sequence will always generate the correct result:
cmpn.l.f0.0(16) null<1>F g30<8,8,1>F g22<8,8,1>F
(+f0.0) sel(16) g8<1>F g30<8,8,1>F g22<8,8,1>F
The problem is... for whatever reason, we don't emit CMPN. There was
even a comment in lower_minmax that calls out this very issue! The bug
is actually older than the "Fixes" below even implies. That's just when
the comment was added. That we know of, we never observed a failure
until #4254 .
If src1 is known to be a number, either because it's not float or it's
an immediate number, use CMP. This allows cmod propagation to still do
its thing. Without this slight optimization, about 8,300 shaders from
shader-db are hurt on Iron Lake.
Fixes the following piglit tests (from piglit!475):
tests/spec/glsl-1.20/execution/fs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/fs-nan-builtin-min.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-max.shader_test
tests/spec/glsl-1.20/execution/vs-nan-builtin-min.shader_test
Closes : #4254
Fixes: 2f2c00c727 ("i965: Lower min/max after optimization on Gen4/5.")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8115134 -> 8115135 (<.01%)
instructions in affected programs: 229 -> 230 (0.44%)
helped: 0
HURT: 1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9027 >
2021-02-17 19:52:24 +00:00
Jason Ekstrand
f3a43e36e0
intel/fs: Add an ex_desc field to fs_inst for SHADER_OPCODE_SEND
...
I meant to do this years ago when I first added SHADER_OPCODE_SEND. At
the time, the only use for the extended descriptor was bindless handles
which were always one thing and never non-constant. However, it doesn't
actually require any extra instructions because we have to OR in ex_mlen
anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8748 >
2021-01-28 17:57:48 +00:00
Caio Marcelo de Oliveira Filho
9f3d5e99ea
compiler: Use util/bitset.h for system_values_read
...
It is currently a bitset on top of a uint64_t but there are already
more than 64 values. Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Karol Herbst <kherbst@redhat.com >
Acked-by: Jesse Natalie <jenatali@microsoft.com >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585 >
2021-01-26 20:20:47 +00:00
Lionel Landwerlin
65f7b93435
intel: silence unused var warnings in release builds
...
v2: Use ASSERTED
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4162
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8681 >
2021-01-25 09:04:32 +00:00
Jason Ekstrand
44571c6a68
intel/fs: Properly lower 64-bit MUL on 64-bit-incapable platforms
...
There are two problems this commit solves: First, is that the 64x64 MUL
lowering generates a Q MOV which, because of how late it runs in the
compile pipeline, it never gets removed. Second, it generates 32x32
MULs and we have to run it a second time to lower those.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:38 +00:00
Jason Ekstrand
69a3559efd
intel/reg,fs: Handle immediates properly in subscript()
...
Just returning the original type isn't what we want in basically any
case. Mask and shift the immediate as needed.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7329 >
2021-01-22 18:38:37 +00:00
Jason Ekstrand
369eab9420
intel/fs: Emit code for Gen12-HP indirect compute data
...
Reworks:
* Jordan: Apply to gen > 12
* Jordan: Adjust comment about loading constants
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342 >
2021-01-13 13:10:28 -08:00
Jason Ekstrand
b4ffbf1521
intel/fs: Allow compute dispatch without a pushed subgroup ID on Gen12-HP
...
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8342 >
2021-01-13 13:10:27 -08:00
Jason Ekstrand
6992d2f625
intel/fs: Emit HALT_TARGET in emit_nir_code()
...
Instead of making it a fragment-specific thing based on uses_kill, track
whether or not we need one in fs_visitor and emit HALT_TARGET at the end
of emit_nir_code() if needed.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:14 -06:00
Jason Ekstrand
4a7f0aa2e0
intel/fs: Remove unnecessary HALT_TARGET in opt_redundant_halt()
...
This means the pass has to walk all the instructions but it was doing
that in a bunch of cases anyway when it didn't have a HALT_TARGET.
However, removing HALT_TARGET frees up the scheduler a bit because
HALT_TARGET is considered a scheduling barrier. The shader-db results
are kind-of a wash but we're about to add HALT_TARGET unconditionally so
we want to be able to get rid of it.
Shader-db results on Ice Lake:
total instructions in shared programs: 19935623 -> 19935623 (0.00%)
instructions in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 976758472 -> 976766135 (<.01%)
cycles in affected programs: 11097707 -> 11105370 (0.07%)
helped: 1750
HURT: 875
helped stats (abs) min: 1 max: 866 x̄: 26.39 x̃: 4
helped stats (rel) min: <.01% max: 39.24% x̄: 1.25% x̃: 0.46%
HURT stats (abs) min: 1 max: 1678 x̄: 61.54 x̃: 10
HURT stats (rel) min: <.01% max: 65.69% x̄: 1.86% x̃: 0.42%
95% mean confidence interval for cycles value: -2.48 8.32
95% mean confidence interval for cycles %-change: -0.40% -0.03%
Inconclusive result (value mean confidence interval includes 0).
LOST: 62
GAINED: 46
All of the lost/gained programs are SIMD32 fragment shaders.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:10 -06:00
Jason Ekstrand
f9d549b2bf
intel/fs: Use BRW_OPCODE_HALT for discards
...
We're about to start using it to implement nir_jump_halt which has
nothing inherently to do with fragment shaders or discards. May as well
name it for the HW instruction it generates.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:19:08 -06:00
Jason Ekstrand
e76e359007
intel/fs: Rename PLACEHOLDER_HALT to HALT_TARGET
...
It's a bit more explicit and will play more nicely with what we're about
to do.
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5071 >
2020-12-01 16:18:50 -06:00