i965: Use SIMD16 instead of SIMD8 on Gen4 when possible.

Gen5+ systems allow you to specify multiple shader programs - both SIMD8
and SIMD16 - and the hardware will automatically dispatch to the most
appropriate one, given the number of subspans to be processed.

However, that is not the case on Gen4.  Instead, you program a single
shader.  If you enable multiple dispatch modes (SIMD8 and SIMD16), the
shader is supposed to contain a series of jump instructions at the
beginning.  The hardware will launch the shader at a small offset,
hitting one of the jumps.

We've always thought that sounded like a pain, and it wasn't clear how
it affected performance - is it worth having multiple shader types?  So,
we never bothered with SIMD16 until now.

This patch takes a simpler approach: try to compile a SIMD16 shader.
If that succeeds, set the no_8 flag, telling the hardware to just use
the SIMD16 variant all the time.
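
The selection rule above can be sketched as a small standalone function.
This is only an illustration, not the driver code: DispatchChoice and
choose_dispatch() are hypothetical names, and the three parameters stand
in for brw->gen, a non-NULL simd16_cfg, and the no_simd8 value computed in
brw_wm_fs_emit().

```cpp
#include <cassert>

// Hypothetical summary of which shader variants end up being used.
struct DispatchChoice {
    bool use_simd8;   // a SIMD8 program is kept
    bool use_simd16;  // a SIMD16 program is kept
    bool no_8;        // stands in for prog_data->no_8
};

// gen:             hardware generation (4, 5, 6, ...)
// simd16_compiled: true if the SIMD16 compile succeeded
// no_simd8:        true if SIMD8 was disabled by debug flag or driver setting
inline DispatchChoice choose_dispatch(int gen, bool simd16_compiled,
                                      bool no_simd8)
{
    DispatchChoice c{};
    if ((no_simd8 || gen < 5) && simd16_compiled) {
        // Gen4 cannot pick between variants by itself, so when a SIMD16
        // program exists it is used exclusively.
        c.use_simd8 = false;
        c.use_simd16 = true;
        c.no_8 = true;
    } else {
        // Otherwise keep SIMD8, plus SIMD16 if it compiled (Gen5+ hardware
        // then dispatches to whichever variant fits).
        c.use_simd8 = true;
        c.use_simd16 = simd16_compiled;
        c.no_8 = false;
    }
    return c;
}
```

Under these assumptions, choose_dispatch(4, true, false) selects SIMD16
only, choose_dispatch(4, false, false) falls back to SIMD8 only, and
choose_dispatch(5, true, false) keeps both variants.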

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke
2015-02-20 14:09:17 -08:00
parent 108b92b1e9
commit 8aee87fe4c
2 changed files with 4 additions and 3 deletions
+2 -3
@@ -4030,8 +4030,7 @@ brw_wm_fs_emit(struct brw_context *brw,
cfg_t *simd16_cfg = NULL;
fs_visitor v2(brw, mem_ctx, key, prog_data, prog, fp, 16);
-if (brw->gen >= 5 && likely(!(INTEL_DEBUG & DEBUG_NO16) ||
-brw->use_rep_send)) {
+if (likely(!(INTEL_DEBUG & DEBUG_NO16) || brw->use_rep_send)) {
if (!v.simd16_unsupported) {
/* Try a SIMD16 compile */
v2.import_uniforms(&v);
@@ -4049,7 +4048,7 @@ brw_wm_fs_emit(struct brw_context *brw,
cfg_t *simd8_cfg;
int no_simd8 = (INTEL_DEBUG & DEBUG_NO8) || brw->no_simd8;
-if (no_simd8 && simd16_cfg) {
+if ((no_simd8 || brw->gen < 5) && simd16_cfg) {
simd8_cfg = NULL;
prog_data->no_8 = true;
} else {
@@ -1433,6 +1433,8 @@ fs_visitor::emit_texture_gen4(ir_texture_opcode op, fs_reg dst,
bool simd16 = false;
fs_reg orig_dst;
+no16("SIMD16 texturing on Gen4 not supported yet.");
/* g0 header. */
mlen = 1;