anv: add dynamic buffer offsets support with independent sets

With independent sets, we're not able to compute immediate values for
the index at which to read anv_push_constants::dynamic_offsets to get
the offset of a dynamic buffer. This is because the pipeline layout
may not have all the descriptor set layouts when we compile the
shader.

To solve that issue, we insert a layer of indirection.

This reworks the dynamic buffer offset storage with a 2D array in
anv_cmd_pipeline_state :

   dynamic_offsets[MAX_SETS][MAX_DYN_BUFFERS]

When the pipeline or the dynamic buffer offsets are updated, we
flatten that array into the
anv_push_constants::dynamic_offsets[MAX_DYN_BUFFERS] array.

For shaders compiled with independent sets, the bottom 6 bits of
element X in anv_push_constants::desc_sets[] is used to specify the
base offsets into the anv_push_constants::dynamic_offsets[] for the
set X.

The computation in the shader is now something like :

  base_dyn_buffer_set_idx = anv_push_constants::desc_sets[set_idx] & 0x3f
  dyn_buffer_offset = anv_push_constants::dynamic_offsets[base_dyn_buffer_set_idx + dynamic_buffer_idx]

It was suggested by Faith to use a different push constant buffer with
dynamic_offsets prepared for each stage when using independent sets
instead, but it feels easier to understand this way. And there is some
room for optimization if you are set X and that you know all the sets in
the range [0, X], then you can still avoid the indirection. Separate
push constant allocations per stage do have a CPU cost.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15637>
This commit is contained in:
Lionel Landwerlin
2022-04-06 18:12:02 +03:00
committed by Marge Bot
parent 16c7c37718
commit 0b8a2de2a1
11 changed files with 384 additions and 100 deletions
+2 -1
View File
@@ -434,7 +434,8 @@ visit_intrinsic(nir_shader *shader, nir_intrinsic_instr *instr)
case nir_intrinsic_image_load_raw_intel:
case nir_intrinsic_get_ubo_size:
case nir_intrinsic_load_ssbo_address:
case nir_intrinsic_load_desc_set_address_intel: {
case nir_intrinsic_load_desc_set_address_intel:
case nir_intrinsic_load_desc_set_dynamic_index_intel: {
unsigned num_srcs = nir_intrinsic_infos[instr->intrinsic].num_srcs;
for (unsigned i = 0; i < num_srcs; i++) {
if (instr->src[i].ssa->divergent) {
+5
View File
@@ -1741,6 +1741,11 @@ intrinsic("load_reloc_const_intel", dest_comp=1, bit_sizes=[32],
intrinsic("load_desc_set_address_intel", dest_comp=1, bit_sizes=[64],
src_comp=[1], flags=[CAN_ELIMINATE, CAN_REORDER])
# Base offset for a given set in the flatten array of dynamic offsets
# src[0] = { set }
intrinsic("load_desc_set_dynamic_index_intel", dest_comp=1, bit_sizes=[32],
src_comp=[1], flags=[CAN_ELIMINATE, CAN_REORDER])
# OpSubgroupBlockReadINTEL and OpSubgroupBlockWriteINTEL from SPV_INTEL_subgroups.
intrinsic("load_deref_block_intel", dest_comp=0, src_comp=[-1],
indices=[ACCESS], flags=[CAN_ELIMINATE])