Rhys Perry
8850a63161
radv/aco,nir/lower_subgroups: don't lower elect
...
ACO can implement this better.
fossil-db (Navi):
Totals from 33 (0.02% of 135946) affected shaders:
SGPRs: 1736 -> 1744 (+0.46%)
VGPRs: 1680 -> 1656 (-1.43%)
CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04%
MaxWaves: 449 -> 461 (+2.67%)
Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05%
Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558 >
2020-10-13 12:47:20 +00:00
Timur Kristóf
f11f4a2a4d
nir: Add ability to count primitives per stream.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
aac5adc3c2
nir: Count vertices per stream.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Timur Kristóf
2be99012e9
nir: Add ability to count emitted GS primitives.
...
Add an option to nir_lower_gs_intrinsics which tells it to track
the number of emitted primitives, not just vertices. Additionally,
make the counting per-stream.
Also rename the set_vertex_count intrinsic to
set_vertex_and_primitive_count.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964 >
2020-10-09 15:26:14 +02:00
Jason Ekstrand
3d22de05ca
intel/fs: Add an option to use dataport messages for UBOs
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932 >
2020-10-08 01:17:06 -05:00
Jason Ekstrand
0d462dbee5
intel/fs: Add an alignment to VARYING_PULL_CONSTANT_LOAD_LOGICAL
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3932 >
2020-10-08 01:14:46 -05:00
Jason Ekstrand
dd9c34a907
intel/nir: Lower load_global_constant in lower_mem_access_bit_sizes
...
It's identical to nir_intrinsic_load_global except that it works on data
that's guaranteed to be constant throughout the shader invocation.
Fixes: ff2f44d865 "intel/fs: Implement nir_intrinsic_load_global_constant"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872 >
2020-10-08 03:56:01 +00:00
Jason Ekstrand
fd04f858b0
intel/nir: Don't try to emit vector load_scratch instructions
...
In 53bfcdeecf, we added load/store_scratch instructions which deviate
a little bit from most memory load/store instructions in that we can't
use the normal untyped read/write instructions which can read and write
up to a vec4 at a time. Instead, we have to use the DWORD scattered
read/write instructions which are scalar. To handle this, we added code
to brw_nir_lower_mem_access_bit_sizes to cause them to be scalarized.
However, one case was missing: the load-as-larger-vector case. In this
case, we take a small bit-sized constant-offset load, replace it with a
32-bit load, and shuffle the result around as needed.
For scratch, this case is much trickier to get right because it often
emits vec2 or wider which we would then have to lower again. We did
this for other load and store ops because, for lower bit-sizes we have
to scalarize thanks to the byte scattered read/write instructions being
scalar. However, for scratch we're not losing as much because we can't
vectorize 32-bit loads and stores either. It's easier to just disallow
it whenever we have to scalarize.
Fixes: 53bfcdeecf "intel/fs: Implement the new load/store_scratch..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6872 >
2020-10-08 03:56:01 +00:00
Jason Ekstrand
9df9f940f0
iris: Add support for load_work_dim as a system value
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047 >
2020-10-07 16:01:31 -05:00
Marcin Ślusarz
9c25689287
intel: drop likely/unlikely around INTEL_DEBUG
...
It's included in the declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732 >
2020-10-06 18:43:07 +00:00
Vinson Lee
81cd4c8f59
intel/vec4: Remove leftover code from Gen8+ removal.
...
Remove code missed in commit 2a49007411 ("intel/vec4: Remove all
support for Gen8+ [v2]").
Fix defect reported by Coverity Scan.
Logically dead code (DEADCODE)
dead_error_begin: Execution cannot reach this statement:
mcs.swizzle = 80U;
Signed-off-by: Vinson Lee <vlee@freedesktop.org >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6927 >
2020-10-03 03:53:46 +00:00
Jason Ekstrand
8427e56067
intel/fs: Don't use NoDDClk/NoDDClr for split SHUFFLEs
...
When I copied and pasted the code from MOV_INDIRECT for handling the
dependency controls, I missed a subtle difference between MOV_INDIRECT
and SHUFFLE. Specifically, MOV_INDIRECT gets lowered to a narrow
instruction on Gen7 by the SIMD width lowering whereas SHUFFLE has to
split it in the generator. Therefore, the safety check for
whether or not we can use dependency control has to be based on the
lowered width rather than the width of the original instruction.
Fixes: a8ac61b0ee "intel/fs: NoMask initialize the address..."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3593
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6989 >
2020-10-02 19:53:56 +00:00
Jason Ekstrand
a8ac61b0ee
intel/fs: NoMask initialize the address register for shuffles
...
Cc: mesa-stable@lists.freedesktop.org
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2979
Tested-by: Iván Briano <ivan.briano@intel.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6825 >
2020-10-02 00:42:56 +00:00
Eric Anholt
618556a8cb
nir: Drop the high_offset argument to the load_store_vectorizer filter.
...
Nothing uses it, and it's not clear to me what it provides over
alignment/num_components/bit_size.
Reviewed-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612 >
2020-09-30 19:53:43 +00:00
Eric Anholt
5f757bb95c
nir: Make the load_store_vectorizer provide align_mul + align_offset.
...
It was passing an encoding of the two that wasn't good for ensuring "Don't
combine loads that would make us straddle a vec4 boundary" for
nir_lower_ubo_vec4.
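A minimal sketch of why separate align_mul and align_offset values make
the vec4-boundary check straightforward. The helper name fits_in_vec4 and
its parameters are hypothetical and are not the actual vectorizer callback;
it assumes align_mul is a non-zero power of two and that the address
satisfies addr % align_mul == align_offset.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not Mesa's callback): returns true if a load of
// size_bytes can never straddle a 16-byte (vec4) boundary, given that
// addr % align_mul == align_offset.
// Assumption: align_mul is a non-zero power of two.
bool fits_in_vec4(uint32_t align_mul, uint32_t align_offset, uint32_t size_bytes)
{
    // Alignment beyond 16 bytes says nothing more about the position
    // inside a vec4, so clamp it and reduce the offset accordingly.
    align_mul = std::min<uint32_t>(align_mul, 16);
    align_offset %= align_mul;

    // The latest position inside a vec4 at which the load could start.
    const uint32_t worst_start = 16 - align_mul + align_offset;

    return worst_start + size_bytes <= 16;
}
```

With this formulation, an 8-byte load known only to be 4-byte aligned is
rejected, because it could start at byte 12 of a vec4.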
Reviewed-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612 >
2020-09-30 19:53:43 +00:00
Connor Abbott
b2ede6280c
intel/nir: Use nir control flow helpers
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6866 >
2020-09-30 15:47:51 +00:00
Ian Romanick
1d71b1a311
intel/vec4: Remove everything related to VS_OPCODE_SET_SIMD4X2_HEADER_GEN9
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:10 -07:00
Ian Romanick
2a49007411
intel/vec4: Remove all support for Gen8+ [v2]
...
v2: Restore the gen == 10 hunk in brw_compile_vs (around line 2940).
This function is also used for scalar VS compiles. Squash in:
intel/vec4: Reindent after removing Gen8+ support
intel/vec4: Silence unused parameter warning in try_immediate_source
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net > [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com > [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org > [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:10 -07:00
Ian Romanick
60e1d0f028
intel/compiler: Remove INTEL_SCALAR_... env variables
...
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:10 -07:00
Ian Romanick
d0ce24c8ca
intel/vec4: Remove inline lowering of LRP
...
Since dd7135d55d ("intel/compiler: Use the flrp lowering pass for all
stages on Gen4 and Gen5"), it's not possible to get to this function on
GPUs that don't have a LRP instruction.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:10 -07:00
Ian Romanick
86bab92aa4
intel/compiler: Don't fallback to vec4 when scalar GS compile fails [v2]
...
v2: Add missing error string handling. Noticed by Jason.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com > [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:04 -07:00
Ian Romanick
92f08860c9
intel/compiler: Silence unused parameter warning in brw_surface_payload_size
...
src/intel/compiler/brw_eu_emit.c: In function ‘brw_surface_payload_size’:
src/intel/compiler/brw_eu_emit.c:3070:46: warning: unused parameter ‘p’ [-Wunused-parameter]
3070 | brw_surface_payload_size(struct brw_codegen *p,
| ~~~~~~~~~~~~~~~~~~~~^
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:04 -07:00
Ian Romanick
9bcdca2455
intel/vec4: Silence unused parameter warnings in brw_vec4_generator.cpp
...
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_gs_svb_write(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:488:49: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
488 | struct brw_vue_prog_data *prog_data,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_pull_constant_load(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1269:55: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
1269 | struct brw_vue_prog_data *prog_data,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_get_buffer_size(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1331:52: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
1331 | struct brw_vue_prog_data *prog_data,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
src/intel/compiler/brw_vec4_generator.cpp: In function ‘void generate_pull_constant_load_gen7(brw_codegen*, brw_vue_prog_data*, brw::vec4_instruction*, brw_reg, brw_reg, brw_reg)’:
src/intel/compiler/brw_vec4_generator.cpp:1357:60: warning: unused parameter ‘prog_data’ [-Wunused-parameter]
1357 | struct brw_vue_prog_data *prog_data,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6826 >
2020-09-28 11:43:04 -07:00
Danylo Piliaiev
77486db867
intel/fs: Disable sample mask predication for scratch stores
...
Scratch stores are lowered to instructions with side effects. However,
they should stay enabled in FS helper invocations, since they are
produced from operations that don't imply side effects.
To fix this, we move the decision of whether sample mask predication
is enabled to the point where logical brw instructions are created.
GLSL example of the issue:
    int tmp[1024];
    ...
    do {
        // changes to tmp
    } while (some_condition(tmp));
If `tmp` is lowered to scratch memory, `some_condition` would be
undefined if the scratch write is predicated on the sample mask, making
it possible for the while loop to become infinite and hang the GPU.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3256
Fixes: 53bfcdeecf
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Acked-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6056 >
2020-09-25 09:48:06 +00:00
Kenneth Graunke
140f53e646
Revert "nir: replace lower_ffma and fuse_ffma with has_ffma"
...
This reverts commit 939ddf3f67.
Intel has a separate pass for fusing FFMAs selectively. We split
these flags in commit 1b72c31e1f and
the reasoning still stands. The patch being reverted was just a
cleanup, so there should be no issue with reverting it.
Acked-by: Matt Turner <mattst88@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6849 >
2020-09-24 13:11:50 -07:00
Marek Olšák
939ddf3f67
nir: replace lower_ffma and fuse_ffma with has_ffma
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756 >
2020-09-24 12:29:11 +00:00
Marek Olšák
771aad3027
nir: split lower_ffma into lower_ffma16/32/64
...
AMD wants different behavior for each bit size
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6756 >
2020-09-24 12:29:11 +00:00
Jason Ekstrand
9750164c09
nir: Rename get_buffer_size to get_ssbo_size
...
This makes it explicit that this intrinsic is only for SSBOs. For the
v3dv driver, we'll be adding a get_ubo_size intrinsic and we want to be
able to distinguish between the two.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6812 >
2020-09-22 13:34:12 +00:00
Lionel Landwerlin
cc3bf00cc2
intel/compiler: fixup Gen12 workaround for array sizes
...
We didn't handle the case of NULL images/textures for which we should
return 0.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: 397ff2976b ("intel: Implement Gen12 workaround for array textures of size 1")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3522
Reviewed-by: Ivan Briano <ivan.briano@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6729 >
2020-09-21 21:20:09 +00:00
Jason Ekstrand
f63ffc18e7
intel/fs/swsb: SCHEDULING_FENCE only emits SYNC_NOP
...
It's not really unordered in the sense that it can still stall on
ordered things and we don't need a SYNC_NOP for that because it is a
SYNC_NOP. However, it also doesn't count when computing instruction
distances.
Fixes: 18e72ee210 "intel/fs: Add FS_OPCODE_SCHEDULING_FENCE"
Reviewed-by: Francisco Jerez <currojerez@riseup.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6781 >
2020-09-20 14:43:40 +00:00
Gert Wollny
80cde3ad55
intel/compiler: Set lower_uniform_to_ubo compiler flag
...
Signed-off-by: Gert Wollny <gert.wollny@collabora.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6316 >
2020-09-16 10:07:42 +00:00
Marcin Ślusarz
18eb853ac8
intel/compiler: quiet Coverity warnings
...
Coverity complains about possible out-of-bounds write & read, because
it thinks that "loc + i" can be bigger than sizes of the 2 used arrays.
It's not obvious from the code that this cannot happen, so add asserts here.
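As an illustration only, the pattern looks roughly like the sketch below.
The array names, NUM_SLOTS, and mark_slots are hypothetical and are not
taken from the patch; the point is that the assert states the bound on
"loc + i" explicitly for both the reader and the static analyzer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical array size, chosen only for this sketch.
constexpr std::size_t NUM_SLOTS = 8;

// Hypothetical helper: writes `count` consecutive entries starting at `loc`
// into two equally sized arrays.
void mark_slots(unsigned char (&used)[NUM_SLOTS],
                unsigned char (&owner)[NUM_SLOTS],
                unsigned loc, unsigned count, unsigned char id)
{
    for (unsigned i = 0; i < count; i++) {
        // Document the invariant that callers are expected to uphold.
        assert(loc + i < NUM_SLOTS);
        used[loc + i] = 1;
        owner[loc + i] = id;
    }
}
```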
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667 >
2020-09-10 12:16:58 +00:00
Marcin Ślusarz
5ea0b6a9c6
intel/compiler: initialize remaining fields of various classes
...
These variables seem to be initialized before being used, so this
patch is not fixing any bug, but leaving them uninitialized may become
a bug after some refactoring.
These classes were affected: fs_reg_alloc, fs_visitor, fs_generator,
instruction_scheduler.
Found by Coverity.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667 >
2020-09-10 12:16:58 +00:00
Marcin Ślusarz
40b964dc8f
intel/compiler: remove unused fs_validator::param_size
...
Found by Coverity as an uninitialized variable.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667 >
2020-09-10 12:16:58 +00:00
Jason Ekstrand
3bd7c3c9db
intel/nir: Call validate_ssa_dominance at both ends of the NIR compile
...
This invokes it before we go into the optimization/lowering pass and
then right before we go out of SSA.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288 >
2020-09-08 19:44:01 +00:00
Marcin Ślusarz
64b0b7c274
intel/compiler: fix typo in a comment
...
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602 >
2020-09-04 17:38:25 +00:00
Marcin Ślusarz
95ce619680
intel/compiler: print dispatch width when shader fails to compile
...
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602 >
2020-09-04 17:38:25 +00:00
Marcin Ślusarz
e5f735a986
intel/compiler: move extern C functions out of namespace brw
...
brw_compile_gs and brw_compile_tcs are extern C functions, but are
defined inside of brw namespace, which somehow works but confuses
Eclipse CDT's code analysis.
Move these functions out of brw namespace and fix references to
objects from brw namespace.
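A minimal sketch of the resulting layout, using hypothetical names
(brw_demo, visitor, demo_compile) rather than the real brw_compile_gs or
brw_compile_tcs signatures: the extern "C" function is defined at global
scope and refers to namespace members through qualified names.

```cpp
// Hypothetical stand-in for the brw namespace and one of its classes.
namespace brw_demo {
struct visitor {
    int run() const { return 42; }
};
} // namespace brw_demo

// Declaration as it would appear in a header shared with C code.
extern "C" int demo_compile();

// Definition at global scope, outside the namespace. Objects from the
// namespace are referenced with an explicit qualifier.
extern "C" int demo_compile()
{
    brw_demo::visitor v;
    return v.run();
}
```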
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602 >
2020-09-04 17:38:25 +00:00
Marcin Ślusarz
d4c6e3f196
intel/compiler: use the same name for nir shaders in brw_compile_* functions
...
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602 >
2020-09-04 17:38:25 +00:00
Marcin Ślusarz
0dda209406
intel/compiler: match brw_compile_* declarations with their definitions
...
Current state confuses Eclipse CDT's code analysis.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6602 >
2020-09-04 17:38:25 +00:00
Marek Olšák
ac55b1a9a6
nir: get ffma support from NIR options for nir_lower_flrp
...
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599 >
2020-09-04 17:06:22 +00:00
Marcin Ślusarz
663c4d5377
intel/fs: add hint how to get more info when shader validation fails
...
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6559 >
2020-09-04 12:09:22 +00:00
Jason Ekstrand
a7a0315d7f
intel/nir: Stop using nir_lower_vars_to_scratch
...
Instead, we do a limited indirect deref lowering and then use
nir_lower_vars_to_explicit_types and nir_lower_explicit_io to lower it
as if it were SSBO or global memory access. Among other things, this
should enable pointer arithmetic on local variables. Fun!
The only shader-db change from this change on ICL was a few tiny cycle
count changes in 7 Aztec Ruins compute shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909 >
2020-09-03 14:26:49 +00:00
Jason Ekstrand
38a83a3048
nir/lower_indirect_derefs: Add a threshold
...
Instead of always lowering everything, we add a threshold such that if
the total indirected array size (AoA size) is above that threshold, it
won't lower. It's assumed that the driver will sort things out somehow
by, for instance, lowering to scratch.
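The decision described above can be sketched as follows. This is an
illustration under stated assumptions, not the pass itself: aoa_size,
should_lower_indirect, and the threshold parameter are hypothetical names,
and the array-of-arrays size is taken to be the product of the lengths of
the array dimensions.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: total number of elements in an array of arrays,
// computed as the product of the per-dimension lengths.
uint64_t aoa_size(const std::vector<uint32_t>& dims)
{
    uint64_t total = 1;
    for (uint32_t len : dims)
        total *= len;
    return total;
}

// Hypothetical decision function: lower the indirect access only when the
// total size does not exceed the threshold. Larger arrays are left alone
// for the driver to handle, for instance by lowering to scratch.
bool should_lower_indirect(const std::vector<uint32_t>& dims, uint64_t threshold)
{
    return aoa_size(dims) <= threshold;
}
```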
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909 >
2020-09-03 14:26:49 +00:00
Jason Ekstrand
c897cd0278
intel/compiler: Handle all indirect lowering choices in brw_nir.c
...
Since everything flows through NIR and we're doing all of our indirect
deref lowering there now, there's no reason to keep making those
decisions in brw_compiler and stuffing them in the GLSL compiler
structs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5909 >
2020-09-03 14:26:49 +00:00
Jason Ekstrand
fe18a0fd45
intel/nir: Lower load_num_work_groups to 32-bit if needed
...
For OpenCL-style kernels, this builtin is 64-bit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570 >
2020-09-02 20:38:22 +00:00
Jason Ekstrand
5799da47c7
intel/fs: Use a single untyped surface read for load_num_work_groups
...
There's no good reason to split this into three. Sure, CS indirects are
only guaranteed by the spec to be DWORD aligned, but that's all untyped
surface reads require anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6570 >
2020-09-02 20:38:22 +00:00
Jason Ekstrand
8e8701b43a
intel/fs: Don't copy-propagate stride=0 sources into ddx/ddy
...
This can come up if, for instance, the shader does a derivative of a
uniform or flat input. Ideally, NIR would use divergence analysis to
get rid of the derivative in this case but it doesn't right now. This
fixes a crash in F1 2017.
Cc: mesa-stable@lists.freedesktop.org
Reported-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Tested-by: Marcin Ślusarz <marcin.slusarz@intel.com >
Reviewed-by: Matt Turner <mattst88@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6564 >
2020-09-02 20:31:32 +00:00
Jason Ekstrand
91becd84ae
intel/fs: Add support for a new load_reloc_const intrinsic
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244 >
2020-09-02 19:48:44 +00:00
Jason Ekstrand
8d8a3815ef
intel/eu: Add a mechanism for emitting relocatable constant MOVs
...
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6244 >
2020-09-02 19:48:44 +00:00