Francisco Jerez 5bf7bb5cf9 intel/brw/xe3+: Re-enable static analysis-based SIMD32 FS heuristic for the moment.
This disables for now the "optimistic" SIMD heuristic that was
implemented for xe3+ and makes it dependent on a debugging option.
Instead, use the static analysis-based codepath that was used in
previous generations and was extended by previous commits in this MR
to model the xe3 trade-off between register use and thread
parallelism.

The reason is that the main assumption of the optimistic SIMD
heuristic didn't hold up with reality: Real-world testing on PTL shows
that there are many cases where SIMD32 shows performance degradation
relative to SIMD16 despite the ability of xe3 hardware to scale the
GRF file of a thread on demand.  Unfortunately that scenario seems to
be more pervasive than hoped when the optimistic SIMD heuristic was
implemented pre-silicon.

In many cases what seems to be going on is that even when the register
file is able to scale with the increased register use of SIMD32, the
thread parallelism of the EU is scaled down by a similar factor.  The
bottom line is that SIMD32 (depending on the actual ratio of register
use between both variants) may not buy us anything, and it frequently
encounters constraints (like SIMD lowering and less effective
scheduling) that lead to worse codegen than SIMD16, easily tipping the
balance in favor of SIMD16.  The extension of the performance analysis
pass that was done in a previous commit allows the original SIMD32
heuristic to take this effect into account quantitatively, and that
seems pretty effective at disabling SIMD32 shaders that underperform,
judging from the statistically significant improvement of most Traci
test-cases that run on my PTL system (4 iterations, 5% significance).
No statistically significant regressions were observed:

Nba2K23-trace-dx11-2160p-ultra:                    10.16% ±0.34%
Superposition-trace-dx11-2160p-extreme:             4.06% ±0.50%
TotalWarWarhammer3-trace-dx11-1080p-high:           3.52% ±0.76%
Payday3-trace-dx11-1440p-ultra:                     2.41% ±0.81%
MetroExodus-trace-dx11-2160p-ultra:                 2.28% ±0.78%
Borderlands3-trace-dx11-2160p-ultra:                1.89% ±0.65%
MountAndBlade2-trace-dx11-1440p-veryhigh:           1.81% ±0.40%
Blackops3-trace-dx11-1080p-high:                    1.66% ±0.29%
HogwartsLegacy-trace-dx12-1080p-ultra:              1.53% ±0.22%
TotalWarPharaoh-trace-dx11-1440p-ultra:             1.44% ±0.31%
Fortnite-trace-dx11-2160p-epix:                     1.44% ±0.27%
Naraka-trace-dx11-1440p-highest:                    1.39% ±0.27%
PubG-trace-dx11-1440p-ultra:                        1.30% ±0.49%
Destiny2-trace-dx11-1440p-highest:                  1.10% ±0.23%
Factorio-trace-1080p-high:                          1.10% ±1.77%
TerminatorResistance-trace-dx11-2160p-ultra:        1.08% ±0.31%
Ghostrunner2-trace-dx11-1440p-ultra:                1.05% ±0.15%
ShadowTombRaider-trace-dx11-2160p-ultra:            0.98% ±0.19%
CitiesSkylines2-trace-dx11-1440p-high:              0.67% ±0.19%
Palworld-trace-dx11-1080p-med:                      0.44% ±0.22%
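
The register-use versus thread-parallelism trade-off described above
can be illustrated with a toy model.  This is only a sketch under
stated assumptions and is not the performance analysis pass used by
the brw compiler: the function names, the per-EU register budget of
1024 and the "codegen efficiency" factor are all hypothetical values
chosen for illustration.

```python
# Toy model only: every name and number here is a hypothetical
# illustration, not taken from the brw compiler or from xe3 hardware
# documentation.

def lanes_in_flight(simd_width, grf_per_thread, grf_budget,
                    codegen_efficiency=1.0):
    """Estimate useful SIMD lanes in flight on one EU.

    The number of resident threads is assumed to be limited only by
    how many per-thread register allocations fit into a fixed budget.
    """
    threads = grf_budget // grf_per_thread
    return simd_width * threads * codegen_efficiency


def prefer_simd32(grf16, grf32, grf_budget, simd32_efficiency):
    """Return True if the SIMD32 variant beats SIMD16 in this model."""
    simd16 = lanes_in_flight(16, grf16, grf_budget)
    simd32 = lanes_in_flight(32, grf32, grf_budget, simd32_efficiency)
    return simd32 > simd16
```

In this model, a SIMD32 variant that needs twice the registers of the
SIMD16 variant (for example 128 versus 64 out of a budget of 1024)
halves the thread count and ends up with the same number of lanes in
flight, so any codegen penalty (an efficiency below 1.0) tips the
result towards SIMD16.  SIMD32 only wins when its register use grows
by clearly less than 2x, for example 96 versus 64 registers with an
efficiency of 0.9.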

The downside is that this will reverse the large reduction in
compile-time we gained from the optimistic SIMD heuristic -- The
run-time of both shader-db and fossil-db jumps back up by nearly 20%
with this change.  I'm working on a better compromise based on
run-time feedback that will hopefully allow us to preserve the
compile-time benefit of the optimistic heuristic without the reduction
in run-time performance, but in the meantime it seems like the
run-time performance gap from SIMD32 is the more urgent issue to
address since it has an impact on titles across the board.  Despite
the reversal of that compile-time improvement xe3 still achieves
slightly lower compile time on average than previous generations
as a result of VRT, so this doesn't seem terribly tragic.

v2: Add bit to brw_get_compiler_config_value() (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/36618>

`Mesa <https://mesa3d.org>`_ - The 3D Graphics Library
======================================================


Source
------

This repository lives at https://gitlab.freedesktop.org/mesa/mesa.
Other repositories are likely forks, and code found there is not supported.


Build & install
---------------

You can find more information in our documentation (`docs/install.rst
<https://docs.mesa3d.org/install.html>`_), but the recommended way is to use
Meson (`docs/meson.rst <https://docs.mesa3d.org/meson.html>`_):

.. code-block:: sh

  $ meson setup build
  $ ninja -C build/
  $ sudo ninja -C build/ install

Support
-------

Many Mesa devs hang out on IRC; if you're not sure which channel is
appropriate, you should ask your question on `OFTC's #dri-devel
<irc://irc.oftc.net/dri-devel>`_; someone will redirect you if
necessary.
Remember that not everyone is in the same timezone as you, so it might
take a while before someone qualified sees your question.
To figure out who you're talking to, or which nick to ping for your
question, check out `Who's Who on IRC
<https://dri.freedesktop.org/wiki/WhosWho/>`_.

The next best option is to ask your question in an email to the
mailing list: `mesa-dev\@lists.freedesktop.org
<https://lists.freedesktop.org/mailman/listinfo/mesa-dev>`_.


Bug reports
-----------

If you think something isn't working properly, please file a bug report
(`docs/bugs.rst <https://docs.mesa3d.org/bugs.html>`_).


Contributing
------------

Contributions are welcome, and step-by-step instructions can be found in our
documentation (`docs/submittingpatches.rst
<https://docs.mesa3d.org/submittingpatches.html>`_).

Note that Mesa uses GitLab for patch submission, review, and discussion.