a1278095d3
There seems to be a hardware bug that sometimes causes a GPU hang when
an alias...sam sequence crosses an instruction cache line boundary. This
commit adds a workaround pass that inserts padding nops to ensure that
no such sequence crosses a cache line. Until an alternative solution is
found, this is the best we can do.

While the number of nops we have to insert is fixed at this point, we
can try to minimize the number of nops executed at runtime by replacing
nops encoded in instructions with standalone nops. That is, if this pass
has to insert one nop, it will try to make one of the following
replacements:
- (rptN)nop -> (rptN-1)nop; nop
- (nopN)foo -> (nopN-1)foo; nop

It does so by keeping track of "insert points". Each insert point
records an instruction and the maximum number of nops that can be
inserted there without pushing any subsequent alias sequence over the
next cache line. Whenever we need to insert nops, we first try the
insert points encountered so far, and only if that doesn't work do we
insert them right before the first alias. The pass makes sure the insert
points are only visited a bounded number of times in total, which keeps
the whole pass O(n).

Totals:
Instrs: 48207402 -> 48278230 (+0.15%)
CodeSize: 101907026 -> 102294524 (+0.38%)
NOPs: 8386320 -> 8457148 (+0.84%)
(ss)-stall: 4013046 -> 4012931 (-0.00%)
(sy)-stall: 16741190 -> 16741033 (-0.00%)
Preamble Instrs: 11506988 -> 11520671 (+0.12%)
Last helper: 11686328 -> 11701615 (+0.13%)
Cat0: 9241457 -> 9312285 (+0.77%)

Totals from 25237 (15.32% of 164705) affected shaders:
Instrs: 22172360 -> 22243188 (+0.32%)
CodeSize: 44372164 -> 44759662 (+0.87%)
NOPs: 4201698 -> 4272526 (+1.69%)
(ss)-stall: 1982473 -> 1982358 (-0.01%)
(sy)-stall: 7379552 -> 7379395 (-0.00%)
Preamble Instrs: 4552074 -> 4565757 (+0.30%)
Last helper: 6260280 -> 6275567 (+0.24%)
Cat0: 4616677 -> 4687505 (+1.53%)

Signed-off-by: Job Noorman <jnoorman@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36639>
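
For illustration only, the amount of padding a single alias...sam
sequence needs could be computed roughly as sketched below. This is not
the code from this commit: INSTRS_PER_LINE and padding_needed() are
hypothetical names, the cache line size of 16 instructions is an assumed
placeholder (the real value is hardware specific and not stated here),
and the insert point bookkeeping is left out entirely.

```c
#include <assert.h>

/* Hypothetical placeholder: instructions per instruction cache line. */
#define INSTRS_PER_LINE 16u

/* Number of nops to place before a sequence of seq_len instructions that
 * would otherwise start at instruction index start, so that the whole
 * sequence ends up inside a single cache line.
 */
static unsigned
padding_needed(unsigned start, unsigned seq_len)
{
   /* A sequence longer than a cache line can never fit in one. */
   assert(seq_len <= INSTRS_PER_LINE);

   unsigned offset = start % INSTRS_PER_LINE;

   /* The sequence already fits in the current line: no padding. */
   if (offset + seq_len <= INSTRS_PER_LINE)
      return 0;

   /* Otherwise push the start of the sequence to the next line. */
   return INSTRS_PER_LINE - offset;
}
```

In this sketch, a 3-instruction sequence starting at index 14 would need
2 nops, while a 4-instruction sequence starting at index 12 fits exactly
and needs none.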