We currently support offset scaling on a per-intrinsic type basis. Since
the introduction of the offset_shift index, different instantiations of
the same type can now have a different scale. Add support for this by
calculating the offset scale on the fly for instructions that have
offset_shift.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
Note: this was implemented and tested for ir3. The code paths that are
never used there [1] seem non-trivial to implement. Since they cannot be
easily tested, asserts and TODOs are added to ensure we don't
accidentally hit them for intrinsics with offset_shift.
[1]: these paths are never used on ir3 since lower_mem_access_bit_sizes
is only used for SSBO accesses to lower 64b accesses (which are 64b
aligned) to 32b ones. So we'll never request an increase of alignment.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
In ir3, SSBO offsets are in units of the accessed type size so we want
to start using the new offset_shift index.
Even though the shift is implicit for the ir3 intrinsics, we use
nir_intrinsic_copy_const_indices when creating them so we need to make
sure our indices match the ones used by the generic intrinsics.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For intrinsics supporting offset_shift, dealing with their offset is a
bit tricky as we cannot simply add a byte offset to it anymore (which is
what most passes want to do). This commit adds some helpers to add byte
offsets (and adjusting offset_shift accordingly) so that individual
passes don't have to worry about this.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
For load/store intrinsics that take an offset, this specifies the amount
the offset is shifted left to calculate the final offset:
offset = (offset_src + base) << offset_shift
This is useful for backends that have memory operations that use offset
units other than bytes (i.e., where the shift is implicit).
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35092>
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit follows the blob and disassembles scalar predicates as
up0.c. The "u" presumably stands for "uniform".
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
Predicate registers can be written from the scalar ALU by using a
special cat2 encoding: if the dst is encoded as a0.c, the instruction
will execute on the scalar ALU and write to p0.c.
This commit makes the ir3 backend aware of scalar predicates. A new
register flag (IR3_REG_UNIFORM) is added that can be used to mark
predicate dsts as being written by the scalar ALU. For such dsts, the
same synchronization rules apply as for shared registers written by the
scalar ALU (e.g., (ss) is needed to read them from the vector ALU).
Scalar predicates can be used in the early preamble, which makes control
flow available there.
In many ways, the backend treats IR3_REG_UNIFORM the same as
IR3_REG_SHARED. A new flag was added because IR3_REG_SHARED is mainly
used to denote a separate register file, not as a flag to indicate usage
by the scalar ALU. Scalar predicates still use the normal predicate
register file but allow it to be written from the scalar ALU.
Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36614>
Various small optimizations that have been accumulating, deal with them
in one commit:
- Add erase functionality for vector util, remove memsets for time opt.
- Update should_gen_cmd_info to take in any stream variables.
- Program funcs should directly program - update mpcc mux hook func to
take in blend_mode.
- Add reserved bits for debug flags.
Signed-off-by: Brendan Steven, Leder <BrendanSteven.Leder@amd.com>
Acked-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36809>
Usually JIP will be valid, but as part of other changes, it will be
possible to have a shader that have multiple EOT messages and end with
and ENDIF instruction. Its JIP will point after the program ends.
This is fine but was tripping up the compaction code.
Change compaction to not read its internal structures beyond the last
instruction.
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36822>
The enum pan_earlyzs is just enum mali_pixel_kill under a different
name, which was needed because the enum was missing from common.xml.
However, because pan_earlyzs_lut is used in files that are both included
with PAN_ARCH unset and set to values including values lower than 6, we
get issues with the way genxml/common_pack.h gets included, resulting in
the enum not being defined.
We don't really depend on the values for this, only on the size. So
let's just use unsigned values in the struct instead, to side-step the
issue.
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>
While this was also using translate_zs_format() before the commit in
question, that's didn't lead to any real issues, because only a single
value was legal here before. While it's not entirely in-spec to use
other values, it seems the HW doesn't mind.
But when this logic was reworked, the typed field was used instead. This
lead to a compiler warning on Clang.
Let's correct this properly here, rather than papering over the compiler
warning.
Fixes: 7a763bb0a3 ("pan/genxml: Rework the RT/ZS emission logic")
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Eric R. Smith <eric.smith@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36606>