Kenneth Graunke
873fcdff38
intel/brw: Stop using long BRW_REGISTER_TYPE enum names
...
s/BRW_REGISTER_TYPE/BRW_TYPE/g
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Kenneth Graunke
9d8f2c4421
intel/brw: Rework BRW_REGISTER_TYPE's representation semantics
...
In ancient days, we directly used the hardware register type encodings
throughout the compiler. As more GPU generations came out, encodings
shifted, and we moved to an abstract enum that we could encode/decode
to a particular GPU's hardware encoding. But there was no particular
meaning behind any particular value.
One downside to this approach is that we end up with switch statements
galore. Want to know a type's size? Switch. Convert an unsigned type
to a signed one? Switch. Get a type with the same base type, but
different bit size? Switch. This is both inefficient and inconvenient.
In contrast, nir_alu_type takes a nicer approach - the type encoding has
certain bits representing the base type, and others encoding the size of
the type. Switching base types or sizes is a simple matter of masking
out the relevant field and substituting a different one.
Tigerlake's encoding adopts a similar approach: two bits represent the
size as a 2-bit unsigned number n, where the bit size is (8 * 2^n).
Two more bits represent the base type. Past encodings were a bit ad hoc
as new data types were added over time, but Gfx12 is organized (mostly).
This patch converts our brw_reg_type enum over to a new system that's
patterned after the Tigerlake style (for easy conversion) while
deviating in a few ways that make our vector immediate type size
handling simpler. Should we add additional base types, we're likely
to continue deviating. Still, converting is much simpler.
Type size calculations (which are performed all the time) are now a
simple mask and shift, instead of a switch.
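As a rough illustration of the mask-and-shift idea, here is a minimal sketch. The `ex_` names and the exact bit positions are assumptions made for this example; they are not the actual brw_reg_type values or helpers.

```c
#include <assert.h>

/* Hypothetical layout (NOT the real brw_reg_type encoding):
 *   bits 1:0  n, where the type is (8 << n) bits wide
 *   bits 3:2  base type
 */
#define EX_TYPE_SIZE_MASK  0x3u
#define EX_TYPE_BASE_MASK  0xcu
#define EX_TYPE_BASE_UINT  (0u << 2)
#define EX_TYPE_BASE_SINT  (1u << 2)
#define EX_TYPE_BASE_FLOAT (2u << 2)

/* Size in bits: a mask and a shift, no switch statement. */
static inline unsigned
ex_type_size_bits(unsigned type)
{
   return 8u << (type & EX_TYPE_SIZE_MASK);
}

/* Same base type, different bit size (8, 16, 32, or 64). */
static inline unsigned
ex_type_with_size(unsigned type, unsigned bits)
{
   unsigned n = 0;
   while (n < 4 && (8u << n) < bits)
      n++;
   assert(n <= 3 && (8u << n) == bits);
   return (type & ~EX_TYPE_SIZE_MASK) | n;
}

/* Same bit size, different base type. */
static inline unsigned
ex_type_with_base(unsigned type, unsigned base)
{
   assert((base & ~EX_TYPE_BASE_MASK) == 0);
   return (type & ~EX_TYPE_BASE_MASK) | base;
}
```

With this layout, turning a 32-bit unsigned type into its signed counterpart is `ex_type_with_base(type, EX_TYPE_BASE_SINT)`, and the size field is left untouched.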
We also adopt the name BRW_TYPE_* instead of BRW_REGISTER_TYPE_* because
it's much shorter and easier to type. Similarly, we create new helper
functions named brw_type_* for working with these types, with a cleaner
naming convention. Legacy names still exist but will be dropped over
the next few patches as pieces get cleaned up.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Kenneth Graunke
c45e235df5
intel/brw: Drop NF type support
...
Icelake removed the PLN instruction for interpolating fragment shader
inputs, instead adding a special "Native Float" (NF) data type which
was a 66-bit floating point data type that could only be used with the
accumulator. On Tigerlake, they dropped NF support in favor of just
doing the interpolation with MAD instructions.
We stopped using NF years ago (commit 9ea90aae1e),
instead just using the fs_visitor::lower_linterp() pass to emit MADs.
Since this existed only for a short time, and had very limited utility,
we drop it from the compiler. One downside is that we can no longer
disassemble Icelake shaders containing NF types properly, but I doubt
anyone really minds.
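For readers unfamiliar with the replacement, interpolating an input amounts to evaluating a plane equation, which takes two multiply-add steps. The sketch below only shows that arithmetic on the CPU; the `ex_` names and the coefficient layout are assumptions for illustration and do not reflect the operands that fs_visitor::lower_linterp() actually emits.

```c
#include <assert.h>

/* Hypothetical plane coefficients for one fragment shader input. */
struct ex_plane {
   float a;   /* change per unit x */
   float b;   /* change per unit y */
   float c;   /* value at the origin */
};

/* Evaluate a*x + b*y + c as two multiply-add steps. */
static inline float
ex_interp(struct ex_plane p, float x, float y)
{
   float t = p.b * y + p.c;   /* first multiply-add */
   return p.a * x + t;        /* second multiply-add */
}
```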
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Kenneth Graunke
1c6f863fc7
intel/brw: Delete gfx10 table for align1 3src type encoding
...
align1 three-source instructions do not exist on gfx9, and this
compiler does not support gfx10. So the oldest case is gfx11.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28847 >
2024-04-25 11:41:48 +00:00
Mary Guillemard
40422927dc
nak: Pass has_mod to all form of src2 requiring it
...
This was missing from the original changes and was causing HFMA2 to
misbehave with an immediate value.
Also fix inverted value passed around for cbuf and ureg forms.
Fixes: bad23ddb48 ("nak: Add F16 and F16v2 sources")
Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28828 >
2024-04-25 11:19:00 +00:00
Konstantin
46598758e7
radv: Trace indirect dispatch sizes
...
For figuring out hanging indirect dispatches.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28838 >
2024-04-25 10:20:03 +00:00
Konstantin
2b2f67aa2b
radv: Use a struct for the trace_bo layout
...
Now we can use the members on the CPU side and offsetof on the GPU side
instead of magic offsets.
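A minimal sketch of the pattern, assuming a made-up layout: the struct members and `ex_` helpers below are hypothetical and are not the real trace buffer definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real trace buffer struct has different members. */
struct ex_trace_layout {
   uint32_t primary_id;
   uint32_t secondary_id;
   uint64_t pipeline_va;
   uint32_t indirect_dispatch[3];
};

/* GPU side: address to write a member = buffer base + offsetof(member). */
static inline uint64_t
ex_member_va(uint64_t buffer_va, size_t member_offset)
{
   assert(member_offset < sizeof(struct ex_trace_layout));
   return buffer_va + member_offset;
}

/* CPU side: read through the struct instead of a hand-computed offset. */
static inline uint32_t
ex_read_secondary_id(const void *mapped_buffer)
{
   const struct ex_trace_layout *layout = mapped_buffer;
   return layout->secondary_id;
}
```

Adding or reordering a member then updates both sides at once, e.g. `ex_member_va(va, offsetof(struct ex_trace_layout, indirect_dispatch))`.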
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28838 >
2024-04-25 10:20:03 +00:00
Konstantin
575565af58
ac/debug,radv: Read UMR wave dumps into memory before parsing
...
Allows RADV to reuse the wave dump, which leads to more consistency
between pipeline.log and umr_waves.log.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28838 >
2024-04-25 10:20:03 +00:00
Georg Lehmann
f6143d3f48
aco/tests: validate before and after post-ra tests
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881 >
2024-04-25 09:47:19 +00:00
Georg Lehmann
47d824a644
aco/lower_to_hw: fix 16bit p_insert on gfx8
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881 >
2024-04-25 09:47:19 +00:00
Georg Lehmann
bb80ac7a70
aco/lower_to_hw: fix v_cvt_pk_u16_u32 instruction format
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881 >
2024-04-25 09:47:18 +00:00
Georg Lehmann
619470732f
aco/tests/post_ra: fix various validation errors
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881 >
2024-04-25 09:47:18 +00:00
Georg Lehmann
f85e6c82a6
aco/tests: don't use undef for descriptors
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28881 >
2024-04-25 09:47:18 +00:00
Lionel Landwerlin
68dfe17abc
anv: disable dual source blending state if not used in shader
...
Fixing some simulation issues on Gfx9/11 with zink on anv running dual
source blending piglit tests like:
./bin/arb_blend_func_extended-dual-src-blending-discard-without-src1 -auto -fbo
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Cc: mesa-stable
Reviewed-by: Tapani Pälli <tapani.palli@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28901 >
2024-04-25 09:03:30 +00:00
Kenneth Graunke
e6fb3ba037
isl: Set MOCS to uncached for Gfx12.0 blitter sources/destinations
...
We were accidentally leaving XY_BLOCK_COPY_BLT's Source and Destination
MOCS fields set to 0 (Error: Reserved for Non-Use) on Gfx12.0 systems.
This was causing assert fails in debug builds, since we try to ensure
that we don't do that. In theory, MOCS 0 is supposed to be equivalent
to MOCS 2 (all the caching), but...we probably ought to use MOCS 3
(uncached). Every Gfx12.5+ platform requires it, so although there
isn't a note about Gfx12.0 needing that, it's possible that it does.
We're currently only using the blitter for DRI PRIME blits on Gfx12.0,
anyway, and I think we're flushing all the caches regardless.
This bug was somewhat obscure to hit:
- You need a hybrid graphics system with Gfx12.0 and some other GPU
- You have to be using "reverse PRIME", i.e. rendering on the integrated
GPU and displaying on the discrete one. This is not the common case.
- You have to be using a debug build.
No observable performance delta in GfxBench5 Car Chase (an arbitrary
program) when rendering on Alderlake GT1 and displaying on an Arc A770.
Fixes: 194afe8416 ("anv/iris/blorp: use the right MOCS values for each engine")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Rohan Garg <rohan.garg@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28894 >
2024-04-25 08:05:48 +00:00
Samuel Pitoiset
e8d94536d2
radv: fix image format properties with fragment shading rate usage
...
This check was missing, which caused test failures for formats other
than VK_FORMAT_R8_UINT, which is the only one supported for FSR.
Fixes recent
dEQP-VK.api.info.unsupported_image_usage.*.fragment_shading_rate_attachment.*.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28893 >
2024-04-25 06:33:39 +00:00
Juston Li
ce1bbd241e
venus: extend image cache to vkGetDeviceImageMemoryRequirements
...
Signed-off-by: Juston Li <justonli@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28887 >
2024-04-25 02:48:50 +00:00
Juston Li
f4f8f2ecbb
venus: refactor out image requirements helpers
...
Signed-off-by: Juston Li <justonli@google.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28887 >
2024-04-25 02:48:50 +00:00
Karol Herbst
5e1a988003
nir: document base_global_invocation_id and base_workgroup_id
...
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
d22f936019
nir: remove workgroup_id_zero_base
...
This removes the need for drivers to handle both versions. The base will
get added once in nir_lower_system_values when converting from deref to
intrinsic and will be replaced by a zero for users not supporting it.
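The arithmetic this describes can be sketched in plain C as below. This is not NIR builder code; `ex_uvec3`, `ex_lower_workgroup_id`, and the `has_base` flag are hypothetical names used only to show where the base is added and when it becomes zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ex_uvec3 {
   uint32_t x, y, z;
};

/* The ID seen by the shader is the zero-based ID plus the base.
 * Users that do not support a base get zero substituted for it. */
static inline struct ex_uvec3
ex_lower_workgroup_id(struct ex_uvec3 zero_based_id, struct ex_uvec3 base,
                      bool has_base)
{
   if (!has_base)
      base = (struct ex_uvec3){ 0, 0, 0 };

   return (struct ex_uvec3){
      zero_based_id.x + base.x,
      zero_based_id.y + base.y,
      zero_based_id.z + base.z,
   };
}
```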
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
3217838fef
nir: remove global_invocation_id_zero_base
...
This removes the need for drivers to handle both versions. The base will
get added once in nir_lower_system_values when converting from deref to
intrinsic and will be replaced by a zero for users not supporting it.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
a2c96b8e7f
mesa/st: lower base invoc and workgroup id
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
e040a08e5e
lavapipe: lower base_workgroup_id to zero
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
a62fb368d6
v3d: call nir_lower_compute_system_values to get rid of base intrinsics
...
OpenGL doesn't have them and rusticl handles them for CL already.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:49 +00:00
Karol Herbst
51f54cdec4
intel/compiler: lower workgoup id to index only for mesh shaders
...
The compiler supports those intrinsics only for task/mesh shaders, and it
never caused any issues because of the way `nir_lower_compute_system_values`
does its lowering.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:48 +00:00
Karol Herbst
3625a44dcc
nir/divergence_analysis: handle load_base_global_invocation_id
...
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:48 +00:00
Karol Herbst
25d697ef25
nir: add SYSTEM_VALUE_BASE_WORKGROUP_ID
...
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26800 >
2024-04-24 20:18:48 +00:00
Marek Olšák
c3fc214a98
radeonsi: implement user_data_amd for 5, 6, and 7 components correctly
...
NIR can't handle those component counts, so we have to split it into 2
SGPR vectors where each has max 4 components.
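One way to picture the split is the sketch below. Filling the first vector with 4 components and putting the remainder in the second is an assumption for illustration, and `ex_split_components` is a hypothetical helper, not a radeonsi function.

```c
#include <assert.h>

struct ex_split {
   unsigned first;    /* components in the first vector, at most 4 */
   unsigned second;   /* components in the second vector, at most 4 */
};

/* Split a component count of 1..8 into two vectors of at most 4 each. */
static inline struct ex_split
ex_split_components(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 8);

   struct ex_split s;
   s.first = num_components > 4 ? 4 : num_components;
   s.second = num_components - s.first;
   return s;
}
```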
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
882ee264a6
radeonsi: use ip_type in debug code instead of hardcoding GFX
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
e7000c02e4
radeonsi: always run nir_opt_16bit_tex_image
...
It optimizes constants in srcs to 16 bits.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
18bcdbb634
radeonsi: only expose 8 EQAA samples due to shader limitations
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
256cc77f84
radeonsi: don't add whether NIR is used into the shader key
...
This is from when we had TGSI and NIR was a debug option.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
e5c8f0781c
radeonsi: make clear_render_target clear DCC directly instead of via pipe->clear()
...
This extracts the relevant parts from si_fast_clear.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
eccaba9dfa
radeonsi: enable fast FB clears for conditional rendering
...
They use compute shaders, which always support the render condition.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
9a47fbecd7
radeonsi: don't flush CB and DB if there have been no draw calls
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
f0160443a2
radeonsi: don't flush CB in si_launch_grid_internal_images if not needed
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
708f57e681
radeonsi: don't use si_get_flush_flags() for flushing images
...
si_make_{CB/DB}_shader_coherent are more correct.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
38f74d6277
radeonsi: disable VRS flat shading for selected 8xMSAA and thick tiling cases
...
for better slow clear performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
86131c25a1
radeonsi/gfx11: implement DCC clear to "single" for fast non-0/1 clears
...
If the clear color isn't 0 or 1, we used a slow clear. This adds a new
DCC clear where the DCC buffer is cleared to a special value and the clear
color is stored at the beginning of each 256B block in the image.
It can be very fast, but it's not always faster than a slow clear.
There is a heuristic that determines whether this new fast clear is
better.
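To make the memory layout concrete, here is a CPU-side sketch that stores a clear color at the start of every 256B block of a buffer. It is only an illustration: the real clear runs on the GPU and also clears the DCC buffer to the special value, and neither that nor the heuristic is shown. `EX_BLOCK_SIZE` and `ex_store_clear_color` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_BLOCK_SIZE 256

/* Write the clear color at the beginning of each 256B block.
 * The rest of each block is left untouched. */
static inline void
ex_store_clear_color(uint8_t *image, size_t image_size,
                     const void *color, size_t color_size)
{
   assert(color_size <= EX_BLOCK_SIZE);

   for (size_t offset = 0; offset + color_size <= image_size;
        offset += EX_BLOCK_SIZE)
      memcpy(image + offset, color, color_size);
}
```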
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
10ec468983
radeonsi: don't call resource_copy_region in pipe->blit
...
It's slower because it forces preservation of NaNs.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
26a5955821
radeonsi: change allow_flat_shading to make it a single condition
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
494cad56c4
radeonsi: remove si_use_compute_copy_for_float_formats
...
Gfx blits preserve NaNs now, so this is no longer needed.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
18b7b2c806
radeonsi: use simpler UINT fallback formats for draw-based resource_copy_region
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
8235d3aa19
radeonsi: preserve NaNs in draw-based resource_copy_region
...
Gfx copies are faster sometimes, so they should be able to copy anything.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
a03df53d3b
radeonsi: move blitter clear_render_target impl into si_gfx_clear_render_target
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
82e63db91f
radeonsi: move blitter resource_copy_region implementation to si_gfx_copy_image
...
for a new performance test.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
e94813204a
radeonsi: allow input NIR to use descriptors in image opcodes
...
Skip lowering because there is nothing to lower.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
30fab15f39
radeonsi: don't expose samples_identical and don't lower FMASK if it's disabled
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
dab4295cd5
radeonsi: fix initialization of occlusion query buffers for disabled RBs
...
GFX9+ should assume the enabled RB results are packed (no holes).
Same as PAL.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00
Marek Olšák
aad2302cf5
radeonsi: move TCS epilog key bits to the key->ge.opt section
...
Since the TCS epilog is no more, this is required to apply those bits
to monolithic shaders.
tessfactors_are_def_in_all_invocs was unused.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/28725 >
2024-04-24 19:17:10 +00:00