From 1e0e521a7dee1481d7b32ea4c95470beb9125809 Mon Sep 17 00:00:00 2001
From: "Juan A. Suarez Romero"
Date: Thu, 30 Jan 2025 16:41:57 +0100
Subject: [PATCH] broadcom/compiler: move stores to the end of shader
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

It is possible that a shader comes with output stores executed before
loading inputs.

As the memory to read the inputs and store the outputs is the same, this
means it could be overwriting the inputs before reading them.

This move avoids this situation.

This partially improves
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33053.

Reviewed-by: Alyssa Rosenzweig
Reviewed-by: Marek Olšák
Reviewed-by: Iago Toral Quiroga
Signed-off-by: Juan A. Suarez Romero
Part-of:
---
 src/broadcom/ci/traces-broadcom.yml | 4 ++--
 src/broadcom/compiler/vir.c         | 8 ++++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/broadcom/ci/traces-broadcom.yml b/src/broadcom/ci/traces-broadcom.yml
index e522ab84fde..c1470ad2d7f 100644
--- a/src/broadcom/ci/traces-broadcom.yml
+++ b/src/broadcom/ci/traces-broadcom.yml
@@ -198,9 +198,9 @@ traces:
 
   neverball/neverball-v2.trace:
     broadcom-rpi4:
-      checksum: 4f4b4b6f37c124fdda6a9efcad577257
+      checksum: c8e8ee352bdb303e4ed144b69272575e
     broadcom-rpi5:
-      checksum: 174394638c6f774948e7aac91c12f84d
+      checksum: 56a0adb0efdf799f269da2d734a6817c
 
   nheko/nheko-colors.trace:
     broadcom-rpi4:
diff --git a/src/broadcom/compiler/vir.c b/src/broadcom/compiler/vir.c
index 0b526768d8b..fd0fae5a9c2 100644
--- a/src/broadcom/compiler/vir.c
+++ b/src/broadcom/compiler/vir.c
@@ -1051,6 +1051,14 @@ v3d_nir_lower_vs_early(struct v3d_compile *c)
         NIR_PASS(_, c->s, nir_lower_io, nir_var_shader_in | nir_var_shader_out,
                  type_size_vec4,
                  (nir_lower_io_options)0);
+
+        /* For geometry stages using the same segment for inputs and outputs
+         * we need to read all inputs before writing any output. If we switch
+         * to separate segments in the future this may no longer be strictly
+         * required.
+         */
+        NIR_PASS(_, c->s, nir_move_output_stores_to_end);
+
         /* clean up nir_lower_io's deref_var remains and do a constant folding pass
          * on the code it generated.
          */
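
Illustration (not part of the patch): the hazard described in the commit
message can be reproduced with a small stand-alone C sketch. Everything in
it is hypothetical: the toy_* names, the two-opcode instruction set and the
register file are invented for this note and are not the NIR or VIR data
structures. It also assumes each register is written only once (SSA-like),
which is what makes it safe to delay a store.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set, invented for illustration only. */
enum toy_op { TOY_LOAD_INPUT, TOY_STORE_OUTPUT };

struct toy_instr {
        enum toy_op op;
        int slot; /* index into the segment shared by inputs and outputs */
        int reg;  /* register index; each register is written only once */
};

#define TOY_MAX_INSTRS 16
#define TOY_NUM_REGS 8

/* Stable partition: keep every non-store instruction in its original
 * order, then append the output stores, also in their original order.
 */
static void
toy_move_output_stores_to_end(struct toy_instr *prog, size_t count)
{
        struct toy_instr stores[TOY_MAX_INSTRS];
        size_t kept = 0, num_stores = 0;

        assert(count <= TOY_MAX_INSTRS);

        for (size_t i = 0; i < count; i++) {
                if (prog[i].op == TOY_STORE_OUTPUT)
                        stores[num_stores++] = prog[i];
                else
                        prog[kept++] = prog[i];
        }

        for (size_t i = 0; i < num_stores; i++)
                prog[kept + i] = stores[i];
}

/* Execute the program against a single segment that holds the inputs on
 * entry and the outputs on exit.
 */
static void
toy_run(const struct toy_instr *prog, size_t count, int *segment)
{
        int regs[TOY_NUM_REGS] = {0};

        for (size_t i = 0; i < count; i++) {
                if (prog[i].op == TOY_LOAD_INPUT)
                        regs[prog[i].reg] = segment[prog[i].slot];
                else
                        segment[prog[i].slot] = regs[prog[i].reg];
        }
}
```

Take a program meant to swap two values: load r0 from slot 0, store r0 to
slot 1, load r1 from slot 1, store r1 to slot 0. Run as written on a segment
holding {10, 20}, the second load reads the value the first store just wrote,
and the segment ends up as {10, 10}. After toy_move_output_stores_to_end()
both loads happen first and the segment ends up as {20, 10}.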