aco: move VMEM instructions below descriptor loads
This is to prevent sequences like: a = descriptor_load() vmem(a) b = descriptor_load() vmem(b) and instead create: a = descriptor_load() b = descriptor_load() vmem(a) vmem(b) fossil-db (GFX10.3): Totals from 114521 (78.30% of 146267) affected shaders: VGPRs: 4540352 -> 4540216 (-0.00%); split: -0.03%, +0.02% CodeSize: 289864228 -> 289114652 (-0.26%); split: -0.29%, +0.03% MaxWaves: 2940234 -> 2940338 (+0.00%); split: +0.00%, -0.00% Instrs: 55112418 -> 54919910 (-0.35%); split: -0.38%, +0.03% Latency: 956528393 -> 954682011 (-0.19%); split: -0.24%, +0.05% InvThroughput: 229280830 -> 229238107 (-0.02%); split: -0.04%, +0.02% VClause: 1141832 -> 1139002 (-0.25%); split: -0.63%, +0.38% SClause: 2357840 -> 2225008 (-5.63%); split: -6.01%, +0.38% Copies: 3316040 -> 3331519 (+0.47%); split: -0.31%, +0.77% Branches: 1187212 -> 1186919 (-0.02%); split: -0.03%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6489>
This commit is contained in:
@@ -645,7 +645,8 @@ void schedule_SMEM(sched_ctx& ctx, Block* block,
|
||||
/* break when encountering another MEM instruction, logical_start or barriers */
|
||||
if (candidate->opcode == aco_opcode::p_logical_start)
|
||||
break;
|
||||
if (candidate->isVMEM())
|
||||
/* only move VMEM instructions below descriptor loads. be more aggressive at higher num_waves to help create more vmem clauses */
|
||||
if (candidate->isVMEM() && (cursor.insert_idx - cursor.source_idx > (ctx.num_waves * 4) || current->operands[0].size() == 4))
|
||||
break;
|
||||
/* don't move descriptor loads below buffer loads */
|
||||
if (candidate->format == Format::SMEM && current->operands[0].size() == 4 && candidate->operands[0].size() == 2)
|
||||
|
||||
Reference in New Issue
Block a user