intel/fs: Add DP4A to get_lowered_simd_width
While working on cooperative matrix support, I noticed some invalid
DP4A instructions being generated.
dp4a(32) g33<1>UD g21<8,8,1>UD g1.0<0,1,0>UD g9<1,1,1>UD
This violates the constraint that the destination or a source can only
access two consecutive GRFs.
I'm a little surprised that validation didn't catch this. Perhaps
because it's a 3 source instruction? Either way, it seems like a bigger
project to fix that.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 0f809dbf40 ("intel/compiler: Basic support for DP4A instruction")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25554>
This commit is contained in:
@@ -5154,6 +5154,7 @@ get_lowered_simd_width(const struct brw_compiler *compiler,
|
||||
const struct intel_device_info *devinfo = compiler->devinfo;
|
||||
|
||||
switch (inst->opcode) {
|
||||
case BRW_OPCODE_DP4A:
|
||||
case BRW_OPCODE_MOV:
|
||||
case BRW_OPCODE_SEL:
|
||||
case BRW_OPCODE_NOT:
|
||||
|
||||
Reference in New Issue
Block a user