intel/brw/xe3+: Override P value of GRF register classes to increase thread parallelism.
This causes the graph coloring allocator to use the optimistic coloring
codepath for all nodes whose total Q value exceeds the threshold of 96
GRFs, so that it does a better job of minimizing the register
requirement of programs even when they are trivially colorable. At the
threshold of 96 GRFs the number of threads available per EU starts
decreasing as the number of register blocks requested by the program
increases, so decreasing the number of registers can increase
performance.

This showed up in some test cases as a performance inversion when VRT
was enabled: extending the register set to 256 GRFs has the side effect
of making some non-trivially colorable programs trivially colorable.
Because the optimistic coloring path is skipped for those programs, the
register allocator does a worse job of ordering the (trivial)
allocations, leading to increased register use and reduced performance.

The following Traci test cases improve significantly as a result of
this change (4 iterations, 5% significance):

MetroExodus-trace-dx11-2160p-ultra:          1.90% ±0.85%
BaldursGate3-trace-dx11-1440p-ultra:         1.47% ±0.38%
Palworld-trace-dx11-1080p-med:               1.01% ±0.09%
TerminatorResistance-trace-dx11-2160p-ultra: 0.95% ±0.29%
Control-trace-dx11-1440p-high:               0.87% ±0.50%

In theory, lowering the P value threshold is expected to cost compile
time, due to the increased use of the slower optimistic path of the
graph coloring allocator. This doesn't actually show up in my numbers:
my shader-db and fossil-db compile-time results don't show any
statistically significant change (13 iterations, 5% significance).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36618>
Committed by: Marge Bot
Parent: 74168a601e
Commit: 760437c4c4
@@ -119,6 +119,9 @@ brw_alloc_reg_sets(struct brw_compiler *compiler)
 
          for (int reg = 0; reg <= base_reg_count - class_sizes[i]; reg++)
             ra_class_add_reg(classes[i], reg);
+
+         if (devinfo->ver >= 30 && !INTEL_DEBUG(DEBUG_NO_VRT))
+            ra_class_override_p(classes[i], 96 - class_sizes[i] + 1);
       }
 
       ra_set_finalize(regs, NULL);
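
As a rough illustration of the arithmetic in the hunk above, the sketch below computes the P value of a contiguous register class and applies a P/Q-style test. It is not the Mesa implementation: `class_p` and `trivially_colorable` are hypothetical helpers, and the sketch assumes that P is the number of valid start registers for a class (register count minus class size plus one), that a node counts as trivially colorable when its total Q value is below P, and that the full register file has 256 GRFs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper (not a Mesa function): number of valid start
 * registers for a class of `class_size` contiguous GRFs in a register
 * file of `reg_count` GRFs.  With reg_count = 96 this matches the
 * expression 96 - class_sizes[i] + 1 used in the hunk above. */
unsigned class_p(unsigned reg_count, unsigned class_size)
{
   return reg_count - class_size + 1;
}

/* Hypothetical helper: P/Q-style test.  Assumption: a node is treated
 * as trivially colorable when its total Q value is strictly below the
 * P value of its class. */
bool trivially_colorable(unsigned q_total, unsigned p)
{
   return q_total < p;
}

/* Usage example: a node with a total Q value of 100 is trivially
 * colorable against an assumed 256-GRF file, but not against the
 * 96-GRF threshold, so it would take the optimistic path instead. */
void check_examples(void)
{
   assert(class_p(256, 1) == 256);
   assert(class_p(96, 1) == 96);
   assert(class_p(96, 4) == 93);
   assert(trivially_colorable(100, class_p(256, 1)));
   assert(!trivially_colorable(100, class_p(96, 1)));
}
```

Under these assumptions, lowering P from the full register file to the 96-GRF threshold only changes the outcome for nodes whose total Q value falls between the two P values, which are the programs that became trivially colorable once the register set grew.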