ac/nir: fix check for increasing size of non-descriptor loads
In the previous version, "end" could have been zero, which would have
allowed an increase of "mul" bytes, when it should not be increased at
all.

For example:
- align_offset=4
- mul=4
- unaligned_new_size=96
- aligned_new_size=128
This would have loaded a dword which was not loaded previously.

fossil-db (gfx1201):
Totals from 115 (0.14% of 79839) affected shaders:
Instrs: 286697 -> 287097 (+0.14%); split: -0.16%, +0.30%
CodeSize: 1477728 -> 1481256 (+0.24%); split: -0.13%, +0.37%
SpillSGPRs: 1662 -> 1658 (-0.24%); split: -0.42%, +0.18%
Latency: 2288612 -> 2290248 (+0.07%); split: -0.04%, +0.11%
InvThroughput: 467307 -> 467602 (+0.06%); split: -0.03%, +0.10%
VClause: 3689 -> 3691 (+0.05%)
SClause: 5052 -> 5064 (+0.24%); split: -0.20%, +0.44%
Copies: 34837 -> 35103 (+0.76%); split: -0.80%, +1.56%
Branches: 7402 -> 7401 (-0.01%)
PreSGPRs: 9147 -> 9143 (-0.04%); split: -0.44%, +0.39%
VALU: 159333 -> 159372 (+0.02%); split: -0.01%, +0.04%
SALU: 52047 -> 52276 (+0.44%); split: -0.55%, +0.99%
SMEM: 9556 -> 9697 (+1.48%)

fossil-db (navi31):
Totals from 238 (0.30% of 79825) affected shaders:
Instrs: 484480 -> 485105 (+0.13%); split: -0.05%, +0.17%
CodeSize: 2514012 -> 2517928 (+0.16%); split: -0.06%, +0.22%
SpillSGPRs: 1064 -> 1059 (-0.47%)
Latency: 3941121 -> 3944670 (+0.09%); split: -0.04%, +0.13%
InvThroughput: 897483 -> 898090 (+0.07%); split: -0.04%, +0.11%
VClause: 7101 -> 7098 (-0.04%)
SClause: 9036 -> 9052 (+0.18%); split: -0.44%, +0.62%
Copies: 42790 -> 43096 (+0.72%); split: -0.30%, +1.01%
PreSGPRs: 14357 -> 14342 (-0.10%); split: -0.37%, +0.26%
VALU: 298325 -> 298347 (+0.01%); split: -0.01%, +0.02%
SALU: 57288 -> 57577 (+0.50%); split: -0.20%, +0.70%
SMEM: 18768 -> 18967 (+1.06%); split: -0.01%, +1.07%

fossil-db (navi21):
Totals from 239 (0.30% of 79825) affected shaders:
Instrs: 444783 -> 445177 (+0.09%); split: -0.07%, +0.15%
CodeSize: 2371776 -> 2373136 (+0.06%); split: -0.13%, +0.19%
Latency: 4226478 -> 4219221 (-0.17%); split: -0.24%, +0.07%
InvThroughput: 1430962 -> 1428445 (-0.18%); split: -0.23%, +0.06%
SClause: 9357 -> 9398 (+0.44%); split: -0.20%, +0.64%
Copies: 42742 -> 42927 (+0.43%); split: -0.53%, +0.96%
Branches: 12975 -> 12970 (-0.04%); split: -0.05%, +0.02%
PreSGPRs: 14368 -> 14312 (-0.39%); split: -0.47%, +0.08%
VALU: 306642 -> 306720 (+0.03%); split: -0.02%, +0.05%
SALU: 63702 -> 63790 (+0.14%); split: -0.31%, +0.45%
SMEM: 20030 -> 20231 (+1.00%); split: -0.00%, +1.01%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14458
Backport-to: 25.3
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38903>
@@ -582,8 +582,8 @@ ac_nir_mem_vectorize_callback(unsigned align_mul, unsigned align_offset, unsigne
                                low->intrinsic == nir_intrinsic_load_global ? NIR_ALIGN_MUL_MAX : 4;
       uint32_t page_size = 4096;
       uint32_t mul = MIN3(align_mul, page_size, resource_align);
-      unsigned end = (align_offset + unaligned_new_size / 8u) & (mul - 1);
-      if ((aligned_new_size - unaligned_new_size) / 8u > (mul - end))
+      unsigned end = (align_offset + unaligned_new_size / 8u);
+      if ((aligned_new_size - unaligned_new_size) / 8u > (align(end, mul) - end))
          return false;
    }
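The example from the commit message can be checked by hand with a small standalone sketch. It is not the Mesa code: `rejects_old`, `rejects_new` and `align_pot` are hypothetical names, `align_pot` is an assumed stand-in for Mesa's `align()` (round up to a power-of-two multiple), and the sizes are assumed to be in bits because the diff divides them by `8u`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for Mesa's align(): round v up to a multiple of a,
 * where a must be a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Old check. Returns true when the size increase is rejected.
 * Sizes are in bits, offsets and mul in bytes. */
static bool rejects_old(uint32_t align_offset, uint32_t mul,
                        uint32_t unaligned_new_size, uint32_t aligned_new_size)
{
   /* "end" wraps to 0 when the load already ends on a mul boundary,
    * so (mul - end) becomes a full mul bytes of allowed padding. */
   unsigned end = (align_offset + unaligned_new_size / 8u) & (mul - 1);
   return (aligned_new_size - unaligned_new_size) / 8u > (mul - end);
}

/* New check. The allowed padding is the distance from "end" to the
 * next mul boundary, which is 0 when "end" is already aligned. */
static bool rejects_new(uint32_t align_offset, uint32_t mul,
                        uint32_t unaligned_new_size, uint32_t aligned_new_size)
{
   unsigned end = align_offset + unaligned_new_size / 8u;
   return (aligned_new_size - unaligned_new_size) / 8u >
          (align_pot(end, mul) - end);
}
```

With the values from the message (align_offset=4, mul=4, unaligned_new_size=96, aligned_new_size=128), the load ends at byte 16. The old check computes end = 16 & 3 = 0 and compares 4 > 4, so `rejects_old` returns false and the extra dword is permitted. The new check compares 4 > 0, so `rejects_new` returns true. When the load does not end on a boundary (for instance align_offset=0, mul=16 with the same sizes), both checks allow the same 4 bytes of padding.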