4e7a777093
This patch allows forming clauses even if the register pressure is at the
limit, with the effect that VMEM instructions are less scattered after the
first clause in a block. It respects the previous clause size to avoid
excessive moving of VMEM instructions.

VMEM_CLAUSE_MAX_GRAB_DIST is further reduced to compensate for some of the
effects.

Totals from 28922 (19.26% of 150170) affected shaders: (GFX10.3)
VGPRs: 1546568 -> 1523072 (-1.52%); split: -1.52%, +0.00%
CodeSize: 117524892 -> 117510288 (-0.01%); split: -0.08%, +0.07%
MaxWaves: 605554 -> 611120 (+0.92%)
Instrs: 22292568 -> 22291927 (-0.00%); split: -0.10%, +0.09%
Latency: 488975399 -> 490230904 (+0.26%); split: -0.06%, +0.32%
InvThroughput: 117842300 -> 116521653 (-1.12%); split: -1.15%, +0.03%
VClause: 541550 -> 522464 (-3.52%); split: -9.73%, +6.20%
SClause: 718185 -> 718298 (+0.02%); split: -0.00%, +0.02%
Copies: 1420603 -> 1386949 (-2.37%); split: -2.64%, +0.27%
Branches: 559559 -> 559278 (-0.05%); split: -0.06%, +0.01%

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10896>