aco/insert_NOPs: remove redundant VALUMaskWriteHazard waits

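Only request a wait on sa_sdst, va_vcc or va_sdst when the tracked wait
value for that counter is still nonzero. This removes a lot of VALU->SALU
waits.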

Foz-DB Navi31:
Totals from 8908 (10.84% of 82179) affected shaders:
Instrs: 17118986 -> 17084870 (-0.20%)
CodeSize: 91057212 -> 90919300 (-0.15%); split: -0.15%, +0.00%
Latency: 154044128 -> 154036848 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 26608698 -> 26607933 (-0.00%); split: -0.00%, +0.00%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38445>
Author: Georg Lehmann
Date: 2025-11-14 10:43:11 +01:00
Committed by: Marge Bot
Parent: c3170d11ac
Commit: b1d730982e

@@ -1496,7 +1496,7 @@ handle_instruction_gfx11(State& state, NOP_ctx_gfx11& ctx, aco_ptr<Instruction>&
       unsigned reg = op.physReg() + i;
       /* s_waitcnt_depctr on sa_sdst */
-      if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu[reg]) {
+      if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_salu[reg] && wait.sa_sdst > 0) {
          imm &= 0xfffe;
          wait.sa_sdst = 0;
       }
@@ -1504,11 +1504,13 @@ handle_instruction_gfx11(State& state, NOP_ctx_gfx11& ctx, aco_ptr<Instruction>&
       /* s_waitcnt_depctr on va_sdst (if non-VCC SGPR) or va_vcc (if VCC SGPR) */
       if (ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_valu[reg]) {
          bool is_vcc = reg == vcc || reg == vcc_hi;
-         imm &= is_vcc ? 0xfffd : 0xf1ff;
-         if (is_vcc)
+         if (is_vcc && wait.va_vcc > 0) {
+            imm &= 0xfffd;
             wait.va_vcc = 0;
-         else
+         } else if (!is_vcc && wait.va_sdst > 0) {
+            imm &= 0xf1ff;
             wait.va_sdst = 0;
+         }
       }
    }
 }
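
For readers without the surrounding file at hand, the guarded logic in the two hunks can be sketched in isolation. This is a minimal standalone model, not the ACO code: `DepctrWait` and `mask_write_hazard_imm` are hypothetical names, the default counter values are an assumption chosen to match the widths of the masks in the diff, and the hazard flags stand in for the `ctx.sgpr_read_by_valu_as_lanemask_then_wr_by_*` lookups. Only the mask constants and the `> 0` guards are taken from the diff.

```cpp
#include <cstdint>

// Hypothetical stand-in for the wait state used by the pass. Only the three
// counters touched by the diff are modelled; the nonzero defaults are an
// assumption (1-bit, 1-bit and 3-bit fields, matching the masks below).
struct DepctrWait {
   unsigned sa_sdst = 1;
   unsigned va_vcc = 1;
   unsigned va_sdst = 7;
};

// Returns the s_waitcnt_depctr immediate for one SGPR operand.
// 0xffff means no field was cleared, i.e. no extra wait is requested.
uint16_t mask_write_hazard_imm(bool wr_by_salu, bool wr_by_valu, bool is_vcc,
                               DepctrWait& wait)
{
   uint16_t imm = 0xffff;

   // sa_sdst: only clear the field if the tracked value is still nonzero.
   if (wr_by_salu && wait.sa_sdst > 0) {
      imm &= 0xfffe;
      wait.sa_sdst = 0;
   }

   // va_vcc for VCC registers, va_sdst for any other SGPR.
   if (wr_by_valu) {
      if (is_vcc && wait.va_vcc > 0) {
         imm &= 0xfffd;
         wait.va_vcc = 0;
      } else if (!is_vcc && wait.va_sdst > 0) {
         imm &= 0xf1ff;
         wait.va_sdst = 0;
      }
   }
   return imm;
}
```

In this model, a second call for the same hazard with the same `DepctrWait` returns 0xffff, because the tracked value is already zero and the corresponding field is left untouched.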