aco: don't insert s_sendmsg dealloc_vgprs with few vgprs allocated

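When a shader allocates few enough VGPRs, new waves are unlikely to be
VGPR limited, so deallocating early brings little benefit. Skipping the
message in that case reduces message bus traffic.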

Foz-DB Navi31:
Totals from 3752 (4.67% of 80273) affected shaders:
Instrs: 1999755 -> 1992249 (-0.38%)
CodeSize: 10531824 -> 10501800 (-0.29%)
Latency: 14935247 -> 14935147 (-0.00%)
InvThroughput: 5976053 -> 5975262 (-0.01%)

Foz-DB Navi33:
Totals from 2614 (3.26% of 80273) affected shaders:
Instrs: 969475 -> 964247 (-0.54%)
CodeSize: 5171240 -> 5150328 (-0.40%)
Latency: 7891519 -> 7891434 (-0.00%)
InvThroughput: 4815008 -> 4814287 (-0.01%); split: -0.01%, +0.00%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37508>
Georg Lehmann
2025-09-22 16:20:26 +02:00
committed by Marge Bot
parent 27cc6317f9
commit 8e03505782

@@ -891,6 +891,11 @@ deallocate_vgprs(wait_ctx& ctx, std::vector<aco_ptr<Instruction>>& instructions)
    if (ctx.gfx_level < GFX11)
       return;
+   /* New waves are likely not vgpr limited. */
+   unsigned max_waves_limit = ctx.program->dev.physical_vgprs / ctx.program->dev.max_waves_per_simd;
+   if (ctx.program->config->num_vgprs <= max_waves_limit)
+      return;
    /* s_sendmsg dealloc_vgprs waits for all counters except stores. */
    if (!(ctx.nonzero & counter_vs))
       return;
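
The threshold in the new check can be looked at in isolation. The following is a minimal sketch, not code from the patch: `DeviceInfo` and `worth_deallocating` are hypothetical names, and the numbers in the note after it are illustrative rather than taken from any specific chip. It only mirrors the arithmetic: the per-SIMD VGPR budget is divided by the maximum wave count, and the message is considered worthwhile only if the shader uses more than that share.

```cpp
// Hypothetical stand-in for the two device fields the check reads
// (ctx.program->dev.physical_vgprs and ctx.program->dev.max_waves_per_simd).
struct DeviceInfo {
   unsigned physical_vgprs;     // VGPR budget per SIMD
   unsigned max_waves_per_simd; // maximum number of resident waves per SIMD
};

// Hypothetical helper: returns true when the shader uses more VGPRs than
// each wave would get if the SIMD ran its maximum number of waves, i.e.
// when VGPR usage can actually limit how many new waves fit.
bool worth_deallocating(const DeviceInfo& dev, unsigned num_vgprs)
{
   // Same integer division as the patch; the name is kept from the diff.
   unsigned max_waves_limit = dev.physical_vgprs / dev.max_waves_per_simd;
   return num_vgprs > max_waves_limit;
}
```

With an assumed budget of 1536 VGPRs and 16 waves, the threshold is 96: a shader using 96 VGPRs or fewer skips the message, one using 97 or more still emits it. With an assumed 1024 VGPRs and 16 waves, the threshold drops to 64.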