The resouce_intel intrinsic doesn't not result in an actual
instruction, it's just a wrapper around another value, usually a
load_const.
Allowing this intrinsic to be moved anywhere means it's going to be
closer to the value it wraps, enabling opt_gcm to move a load_ubo
using this resource_intel.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
Intel HW has multiple ways to access resources like UBO/SSBO/images :
- binding tables : a small ~240 heap of surfaces
- bindless surfaces : a 64Mb heap of surfaces up to Gfx12+, 4Gb on Gfx12.5+
- surfaces : a 4Gb heap on Gfx12.5+ (mostly unused at the moment,
only available through the LSC)
For samplers, we have 2 options since Gfx11+ :
- samplers indexed from the Dynamic State Heap (4Gb)
- samplers indexed from the Bindless Sampler Heap (4Gb)
Additionally our whole push constant promotion mechanism is based
around binding table indices. This is problematic if you want to also
promote to push constants things that would be accessed through the
bindless heap.
To solve this issue, we introduce a new intrinsic that will cary a
block index that is not based off the binding table index nor the
bindless table offset.
We will also use this intrinsic to identify whether the buffer/surface
index in load_ubo/load_ssbo/store_ssbo/etc... is relative to the
binding table or the bindless heap.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21645>
This adds a little extra work, since now dominance is computed and
blocks that don't just have then-return or else-return are looked at.
However it means that nir_lower_returns can now keep phis up to date
by inserting undefs without causing some phis to become non-trivial.
This ends up obviating a couple of tests for lower_returns.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22913>
sample_mask_agx maps to the AGX instruction used to write out a sample mask.
api_sample_mask_agx is a system value that returns the value of glSampleMask
(or its Vulkan equivalent), used to lower glSampleMask (etc).
This is distinct from sample_mask_in, which we map to the hardware thing and
AND with this as a lowering.
sample_positions_agx is a system value returning the sample positions in a
packed fixed-point format matching the hardware register, used to lower
gl_SamplePositions.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23040>
Hardware that lacks dedicated image atomics can still implement image atomics
with regular atomics on global memory, as long as there is a way to get the
address of a texel in memory. I've open-coded this lowering in my first 2
compilers, so before I add another crappy vendored version in my 3rd, let's add
a common NIR pass to do the lowering.
Thanks to unified atomics, the pass itself is fairly concise.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
Some hardware has an instruction to load the address of a texel in a writeable
image, given the coordinates ("LEA_IMAGE"). This operation is defined only for
uncompressed images, but it is well-defined regardless of the underlying
twiddling. As such, it is not expected to be produced by APIs but is useful for
internal lowering when it is known that images will be uncompressed (e.g.
because image_store does not support compression on the hardware).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23120>
The idea of this pass is to promote small bit-sizes to larger, supported
bit-sizes for certain operations. It doesn't handle emulating large
bit-size operations on smaller bit-sizes; passes like nir_lower_int64
and nir_lower_doubles handle that.
So, assert that we aren't shrinking the bit-size, as this will almost
certainly produce incorrect results.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23123>
We'd like to postpone most int64 lowering until pretty late in the
process, because e.g. turning iadd@64 into (unpack + add-low + add-high
+ compare + b2i32 + repack) sequences makes it difficult for many
optimization passes to detect basic arithmetic patterns. In particular,
nir_opt_load_store_vectorizer becomes unable to handle basic offset math
on 64-bit addresses.
We'd like to do double precision lowering earlier in the process,
however. One snag is that nir_lower_int64's lower_2f and lower_f2 can
produce operations that may need lowering by nir_lower_doubles(), so
it's crucial to run those sets of lowering together.
To handle this, we make a new entrypoint that does nir_lower_int64
but skips everything except float conversions. Note that the newly
produced instructions will still be lowered according to the full set
of int64 lowering options; this shouldn't be a huge deal.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23064>
Since 624e799cc3 ("nir: Drop nir_ssa_def::name and nir_register::name"), SSA
defs don't have names, making the name argument unused. Drop it from the
signature and fix the call sites. This was done with the help of the following
Coccinelle semantic patch:
@@
expression A, B, C, D, E;
@@
-nir_ssa_dest_init(A, B, C, D, E);
+nir_ssa_dest_init(A, B, C, D);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23078>
This should make writing some lowering/meta code easier. It also keeps
the num_inputs/outputs updated, when sometimes passes forgot to do so (for
example, nir_lower_input_attachments updated for one of the two vars it
creates). The names of the variables change in many cases, but it's
probably nicer to see "VERT_ATTRIB_POS" than "in_0" or whatever.
I've only converted mesa core (compiler and GL), not all the driver meta
code.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22809>