The unordered mode stored in dependencies might be a bitmask and not
only a single mode. In practice, only the "stronger" mode will stick.
Make sure that the code testing for the mode uses "&" instead of "==",
to avoid prevent some valid combinations to happen, e.g.
```
// ...
add(16) g104<1>F g94<1,1,0>F g34<1,1,0>F { align1 1H @7 $7.dst compacted };
```
which without the fix ends up as
```
// ...
sync nop(1) null<0,1,0>UB { align1 WE_all 1N F@7 };
add(16) g104<1>F g94<1,1,0>F g34<1,1,0>F { align1 1H $7.dst compacted };
```
Enables two tests for the scoreboard pass that illustrate this case.
For measuring the effect, re-enabled the sync.nop accounting on total of
instructions and got the following results.
```
Totals:
Instrs: 322041261 -> 321748285 (-0.09%)
Cycle count: 22864587567 -> 22863073741 (-0.01%)
Max dispatch width: 7989040 -> 7989024 (-0.00%); split: +0.00%, -0.00%
Totals from 88212 (9.78% of 902056) affected shaders:
Instrs: 102282050 -> 101989074 (-0.29%)
Cycle count: 12787629859 -> 12786116033 (-0.01%)
Max dispatch width: 525336 -> 525320 (-0.00%); split: +0.01%, -0.01%
```
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36096>
It was used by the pass tests to verify output with TEST_DEBUG=1,
replace it with brw_print_instructions().
The output is slightly different (not printing IP, not reordering the
blocks), we can add those features as we need, but given the usage was
already very reduced, don't bother with that until need arises.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/34012>
And use brw_builder(brw_shader *) and brw_builder() constructors
where possible.
The way tests are written, it is necessary to initialize an "empty"
builder -- which is later replaced by a proper one. Default parameter
NULL make that initialization implicit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33815>
Make the "block after DO" more stable so that adding instructions after
a DO doesn't require repairing the CFG. Use a new SHADER_OPCODE_FLOW
instruction that is a placeholder representing "go to the next block"
and disappears at code generation.
For some context, there are a few facts about how CFG currently works
- Blocks are assumed to not be empty;
- DO is always by itself in a block, i.e. starts and ends a block;
- There are no empty blocks;
- Predicated WHILE and CONTINUE will link to the "block after DO";
- When nesting loops, it is possible that the "block after DO" is
another "DO".
Reasons and further explanations for those are in the brw_cfg.c comments.
What makes this new change useful is that a pass might want to add
instructions between two DO instructions. When that happens, a new
block must be created and any predicated WHILE and CONTINUE must be
repaired.
So, instead of requiring a repair (which has proven to be tricky in
the past), this change adds a block that can be "virtually" empty but
allow instructions to be added without further changes.
One alternative design would be allowing empty blocks, that would be
a deeper change since the blocks are currently assumed to be not empty
in various places. We'll save that for when other changes are made to
the CFG.
The problem described happens in brw_opt_combine_constants, and a
different patch will clean that up.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33536>
When a SEND instruction is a EOT, the scoreboard lowering will not
allocate a new SBID for it, since nothing needs to wait for it. In
Gfx12 this allowed the SEND to get out-of-order $.dst or $.src
dependencies.
Starting on Xe2+ this is not supported anymore, in favor of supporting
more combined modes.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32712>