Shader-db shows this is beneficial, even if it comes with a small
increase in register pressure.
total instructions in shared programs: 10889197 -> 10869857 (-0.18%)
instructions in affected programs: 3625014 -> 3605674 (-0.53%)
helped: 14911
HURT: 8324
Instructions are helped.
total threads in shared programs: 431034 -> 431014 (<.01%)
threads in affected programs: 40 -> 20 (-50.00%)
helped: 0
HURT: 10
Threads are HURT.
total uniforms in shared programs: 5308006 -> 5432767 (2.35%)
uniforms in affected programs: 2204951 -> 2329712 (5.66%)
helped: 9
HURT: 30766
Uniforms are HURT.
total max-temps in shared programs: 2226471 -> 2235269 (0.40%)
max-temps in affected programs: 272670 -> 281468 (3.23%)
helped: 2372
HURT: 8479
Max-temps are HURT.
total spills in shared programs: 4318 -> 4331 (0.30%)
spills in affected programs: 39 -> 52 (33.33%)
helped: 2
HURT: 7
total fills in shared programs: 6514 -> 6527 (0.20%)
fills in affected programs: 42 -> 55 (30.95%)
helped: 2
HURT: 7
total sfu-stalls in shared programs: 15166 -> 15808 (4.23%)
sfu-stalls in affected programs: 2389 -> 3031 (26.87%)
helped: 513
HURT: 944
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 10904363 -> 10885665 (-0.17%)
inst-and-stalls in affected programs: 3660930 -> 3642232 (-0.51%)
helped: 14878
HURT: 8450
Inst-and-stalls are helped.
total nops in shared programs: 183672 -> 184256 (0.32%)
nops in affected programs: 12532 -> 13116 (4.66%)
helped: 1841
HURT: 2251
Nops are HURT.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>
ldvary instructions implicitly write to rf0 (r5 on Pi4), and follow-up
instructions read that register to complete the interpolation calculations,
so we would rather not allocate the ldunif(a) destination to rf0/r5 during
these sequences either, to facilitate pairing.
In shader-db, this reduces instructions in affected fragment shaders by
0.25% on Pi5 and by 0.64% on Pi4.
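The register preference described above can be sketched as a toy
allocator. This is only an illustration, not the actual v3d register
allocator: TOY_NUM_REGS, toy_pick_ldunif_dst and the "fall back to
register 0 when nothing else is free" behavior are assumptions made for
the example, with register 0 standing in for rf0 (r5 on Pi4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_REGS 8 /* hypothetical, tiny register file */

/* Pick a destination register for an ldunif(a).
 *
 * Register 0 stands in for rf0 (r5 on Pi4). While inside an ldvary
 * sequence we prefer any other free register and only fall back to
 * register 0 when nothing else is available (assumed behavior).
 * Returns -1 if every register is in use.
 */
static int
toy_pick_ldunif_dst(const bool *in_use, bool in_ldvary_sequence)
{
        assert(in_use != NULL);

        int fallback = -1;
        for (int reg = 0; reg < TOY_NUM_REGS; reg++) {
                if (in_use[reg])
                        continue;
                if (reg == 0 && in_ldvary_sequence) {
                        /* Remember it, but keep looking for a better one. */
                        fallback = reg;
                        continue;
                }
                return reg;
        }
        return fallback;
}
```

Outside an ldvary sequence the helper returns the first free register as
usual, so the preference only costs anything while the implicit rf0/r5
write is pending.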
Shader-db Pi5:
total instructions in shared programs: 10890641 -> 10889197 (-0.01%)
instructions in affected programs: 575506 -> 574062 (-0.25%)
helped: 2506
HURT: 1378
Instructions are helped.
total max-temps in shared programs: 2226555 -> 2226471 (<.01%)
max-temps in affected programs: 5061 -> 4977 (-1.66%)
helped: 139
HURT: 78
Max-temps are helped.
total sfu-stalls in shared programs: 15143 -> 15166 (0.15%)
sfu-stalls in affected programs: 310 -> 333 (7.42%)
helped: 134
HURT: 195
Inconclusive result (value mean confidence interval includes 0).
total inst-and-stalls in shared programs: 10905784 -> 10904363 (-0.01%)
inst-and-stalls in affected programs: 577053 -> 575632 (-0.25%)
helped: 2497
HURT: 1415
Inst-and-stalls are helped.
total nops in shared programs: 183945 -> 183672 (-0.15%)
nops in affected programs: 3862 -> 3589 (-7.07%)
helped: 478
HURT: 234
Nops are helped.
Shader-db Pi4:
total instructions in shared programs: 12842116 -> 12835720 (-0.05%)
instructions in affected programs: 996970 -> 990574 (-0.64%)
helped: 6027
HURT: 367
Instructions are helped.
total max-temps in shared programs: 2251877 -> 2251707 (<.01%)
max-temps in affected programs: 2670 -> 2500 (-6.37%)
helped: 167
HURT: 9
Max-temps are helped.
total sfu-stalls in shared programs: 21132 -> 21093 (-0.18%)
sfu-stalls in affected programs: 114 -> 75 (-34.21%)
helped: 92
HURT: 55
Sfu-stalls are helped.
total inst-and-stalls in shared programs: 12863248 -> 12856813 (-0.05%)
inst-and-stalls in affected programs: 1008237 -> 1001802 (-0.64%)
helped: 6070
HURT: 359
Inst-and-stalls are helped.
total nops in shared programs: 281645 -> 281200 (-0.16%)
nops in affected programs: 2241 -> 1796 (-19.86%)
helped: 501
HURT: 88
Nops are helped.
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31355>
This reverts commit 0b85476d86.
When a BO is mapped in v3d, the mapping is kept until the BO is freed. If
the map is needed again later, we reuse it instead of mapping from
scratch.
This avoids calling map/unmap continuously, and also avoids the need for a
mechanism to track map usage, such as a reference count.
Thus, when reallocating a BO, the fact that it is mapped only means the map
was used in the past, not necessarily that it is in use right now.
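A minimal sketch of the caching behavior described above, assuming a
simplified buffer object. struct toy_bo and the toy_bo_* helpers are
hypothetical names, and calloc() stands in for the real mapping call; the
point is only that a non-NULL map pointer says nothing about current use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer object; not the real v3d_bo. */
struct toy_bo {
        size_t size;
        void *map;     /* NULL until the first map request */
        int map_calls; /* how many times the expensive path ran */
};

/* Map lazily and keep the mapping until the BO is freed. */
static void *
toy_bo_map(struct toy_bo *bo)
{
        assert(bo->size > 0);

        if (bo->map)
                return bo->map; /* reuse the cached mapping */

        /* calloc() stands in for the real mapping operation. */
        bo->map = calloc(1, bo->size);
        if (bo->map)
                bo->map_calls++;
        return bo->map;
}

/* True if the BO was mapped at some point; this does not tell us
 * whether anyone is still using the mapping right now.
 */
static int
toy_bo_was_ever_mapped(const struct toy_bo *bo)
{
        return bo->map != NULL;
}

/* Freeing the BO is the only place where the mapping is dropped. */
static void
toy_bo_free(struct toy_bo *bo)
{
        free(bo->map);
        bo->map = NULL;
}
```

Because there is no unmap call and no reference count in this scheme, any
check of the form "is this BO mapped?" can only answer "was it ever
mapped?".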
The reverted commit was causing performance regressions in multiple
applications, with frame rates dropping from 60fps to 5fps.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11783
Backport-to: 24.2
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31049>
Zink currently requires VK_EXT_queue_family_foreign to set
PIPE_CAP_DMABUF, which is one of the requirements to create a
gbm context.
v3dv already conditionally supported this extension for Android.
As it is now required for Zink in Mesa, move it to the driver's
common set.
This allows v3dv to create gbm contexts with Zink again since
this was made a stricter requirement as a side effect of
ab08b79ef7 ("gbm: use driver check for dmabuf export").
Tested with Zink on a gbm EGL application as well as sway with
the wlroots Vulkan backend, which also requires this extension.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30938>
When importing a resource with DRM_FORMAT_MOD_INVALID,
v3d_resource_from_handle assumed that the resource layout was linear on a
render-only device and tiled otherwise.
This change honors the SCANOUT flag used at resource creation for the
DRM_FORMAT_MOD_INVALID modifier, regardless of whether render-only is
available.
It also fixes most of the failing piglit tests for:
spec@ext_image_dma_buf_import@ext_image_dma_buf_import.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11594
Cc: mesa-stable
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30946>
It was set to 170min, which made sense when the job timeout was 3h, but
then 4bb564f40d ("broadcom/ci: add more jobs to test with rpi5")
lowered the job timeout to 2h without lowering the test timeout to match.
Fixes: 4bb564f40d ("broadcom/ci: add more jobs to test with rpi5")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30800>
The CI-wide default is 20min, so if we set a 20min job timeout here, we
can't get the results of our jobs when they time out.
Instead of setting the test timeout to 15min, which would be too short
for some jobs, leave it at 20min (but be explicit about it and protect
against a future change of that default), and bump the job timeout by
5min to allow for results to be uploaded.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30694>
ir3's lowering of variables to scratch memory has to treat 8-bit values as
16-bit ones when comparing such a value's size against the given threshold,
since those values are handled through 16-bit half-registers. But those
values can still use their natural 8-bit size and alignment when stored in
scratch memory.
nir_lower_vars_to_scratch now accepts two size-and-alignment functions,
one used for calculating the variable size and the other for calculating
the size and alignment needed for storing inside scratch memory. Non-ir3
uses of this pass can just duplicate the currently-used function. ir3
provides a separate variable-size function that special-cases 8-bit types.
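The split between the two callbacks can be sketched as follows. This is a
simplified model, not the NIR API: struct toy_type, the toy_* function
names and the "lower when the size exceeds the threshold" comparison are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified type description (not Mesa's glsl_type). */
struct toy_type {
        unsigned bit_size;   /* 8, 16 or 32 */
        unsigned components; /* total number of scalar elements */
};

typedef void (*toy_size_align_func)(const struct toy_type *type,
                                    unsigned *size, unsigned *align);

/* Natural layout, used for placing the variable in scratch memory. */
static void
toy_natural_size_align(const struct toy_type *type,
                       unsigned *size, unsigned *align)
{
        assert(type->bit_size % 8 == 0);

        unsigned bytes = type->bit_size / 8;
        *size = bytes * type->components;
        *align = bytes;
}

/* Variable size for the threshold check: 8-bit values count as 16-bit
 * ones, since they would occupy 16-bit half-registers.
 */
static void
toy_variable_size_align(const struct toy_type *type,
                        unsigned *size, unsigned *align)
{
        struct toy_type widened = *type;
        if (widened.bit_size == 8)
                widened.bit_size = 16;
        toy_natural_size_align(&widened, size, align);
}

/* Decide whether to lower a variable; on success, report how many bytes
 * of scratch memory it needs.
 */
static bool
toy_should_lower(const struct toy_type *type, unsigned threshold,
                 toy_size_align_func variable_size_align,
                 toy_size_align_func scratch_layout_size_align,
                 unsigned *scratch_size)
{
        unsigned size, align;

        variable_size_align(type, &size, &align);
        if (size <= threshold)
                return false;

        scratch_layout_size_align(type, &size, &align);
        *scratch_size = size;
        return true;
}
```

With a 64-byte threshold, a 48-element 8-bit array is counted as 96 bytes
by the variable-size callback and therefore lowered, yet it still takes
only 48 bytes of scratch. Passing the natural function for both callbacks
mirrors what a caller without the half-register constraint would do.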
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29875>
We usually emit flags right before consuming them but this is
suboptimal from the point of view of register pressure: if an
instruction is only used to generate flags then waiting to emit
it right before reading the flags extends the liveness of the
sources used to generate the flags for no gain. This pass will
check for such instructions and try to move them as early as
possible.
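The idea can be illustrated on a toy instruction list. This is a sketch
under simplifying assumptions, not the actual pass: struct toy_inst and
the toy_* helpers are hypothetical, an instruction has at most two
sources, and a flags-only instruction may move up until it meets the
definition of one of its sources or another instruction that touches the
flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy instruction; temps are small integers, -1 is "none". */
struct toy_inst {
        int dst;           /* temp written, or -1 */
        int src[2];        /* temps read, or -1 */
        bool writes_flags;
        bool reads_flags;
};

/* Can flag_inst not be moved above prev? */
static bool
toy_blocks_hoist(const struct toy_inst *prev,
                 const struct toy_inst *flag_inst)
{
        /* Never reorder against other flag producers or consumers. */
        if (prev->writes_flags || prev->reads_flags)
                return true;

        /* Never move above the definition of one of our sources. */
        for (int i = 0; i < 2; i++) {
                if (flag_inst->src[i] != -1 &&
                    flag_inst->src[i] == prev->dst)
                        return true;
        }
        return false;
}

/* Move the flags-only instruction at index ip as early as possible and
 * return its new index.
 */
static int
toy_hoist_flags_inst(struct toy_inst *insts, int ip)
{
        assert(insts[ip].writes_flags && insts[ip].dst == -1);

        while (ip > 0 && !toy_blocks_hoist(&insts[ip - 1], &insts[ip])) {
                struct toy_inst tmp = insts[ip - 1];
                insts[ip - 1] = insts[ip];
                insts[ip] = tmp;
                ip--;
        }
        return ip;
}
```

In this model, hoisting a compare so that it sits right after the
definition of its source ends that source's live range earlier, which is
where the register pressure savings would come from.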
Shader-db results below show this is effective at reducing register
pressure, allowing a few shaders to increase thread counts and/or
reduce spilling:
total instructions in shared programs: 11057173 -> 11057076 (<.01%)
instructions in affected programs: 1955543 -> 1955446 (<.01%)
helped: 4214
HURT: 3905
Inconclusive result (value mean confidence interval includes 0).
total threads in shared programs: 425096 -> 425170 (0.02%)
threads in affected programs: 74 -> 148 (100.00%)
helped: 37
HURT: 0
Threads are helped.
total uniforms in shared programs: 3846275 -> 3845674 (-0.02%)
uniforms in affected programs: 23574 -> 22973 (-2.55%)
helped: 217
HURT: 30
Uniforms are helped.
total max-temps in shared programs: 2222910 -> 2220488 (-0.11%)
max-temps in affected programs: 61904 -> 59482 (-3.91%)
helped: 2145
HURT: 113
Max-temps are helped.
total spills in shared programs: 4294 -> 4280 (-0.33%)
spills in affected programs: 148 -> 134 (-9.46%)
helped: 8
HURT: 0
total fills in shared programs: 6497 -> 6468 (-0.45%)
fills in affected programs: 291 -> 262 (-9.97%)
helped: 8
HURT: 0
total sfu-stalls in shared programs: 14344 -> 14611 (1.86%)
sfu-stalls in affected programs: 1308 -> 1575 (20.41%)
helped: 217
HURT: 335
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11071517 -> 11071687 (<.01%)
inst-and-stalls in affected programs: 1946767 -> 1946937 (<.01%)
helped: 4191
HURT: 3909
Inconclusive result (value mean confidence interval includes 0).
total nops in shared programs: 270628 -> 269829 (-0.30%)
nops in affected programs: 22032 -> 21233 (-3.63%)
helped: 1213
HURT: 571
Inconclusive result (%-change mean confidence interval includes 0).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30511>