Dave Airlie
0f989a840e
crocus: fix leak on gen4/5 stencil fallback blit path.
...
Noticed by Ilia.
Fixes: f3630548f1 ("crocus: initial gallium driver for Intel gfx 4-7")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15100 >
2022-02-21 10:21:56 +10:00
Ilia Mirkin
357dae424f
freedreno/a4xx: make luminance formats renderable, add missing L8A8_SNORM
...
If the luminance formats aren't renderable, they fall back to R*
formats, but those will end up with a 1 in alpha rather than 0 when
textured. So instead make them explicitly renderable, which will cause
the correct texture format swizzle to be applied.
Fixes query-rgba-signed-components and probably others.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097 >
2022-02-20 16:58:03 +00:00
Ilia Mirkin
56b1bd086f
freedreno/a4xx: use correct macro for color
...
Doesn't actually matter since all the colors are encoded the same. But
for consistency...
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15097 >
2022-02-20 16:58:03 +00:00
Danylo Piliaiev
a814a4f9db
turnip: Add a refcount mechanism to BOs
...
Until now we have lived without a refcount mechanism in the driver
because in Vulkan the user is responsible for handling the lifetime
of memory allocations for all Vulkan objects. However,
imported BOs are tricky because the kernel doesn't refcount them,
so user-space needs to make sure that:
1. When importing a BO into the same device used to create it
(self-importing), it does not double-free the same BO.
2. It frees imported BOs that were not allocated through the same
device.
Our initial implementation always freed BOs when requested,
so on DRM we handled 2) correctly but not 1): we would
double-free self-imported BOs because the kernel doesn't return
a unique gem_handle on each import.
Besides this, the submit ioctl checks for duplicates in the
BO list and returns an error if there is one.
This fixes the problem for good by adding refcounts to BOs
so that self-imported BOs have a refcnt > 1 and are only freed
when all references are freed.
KGSL, on the other hand, does not have the same problems,
at least not with ION buffers, which are used for exportable
BOs on pre-5.10 Android kernels.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5936
Fixes CTS tests: dEQP-VK.drm_format_modifiers.export_import.*
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15031 >
2022-02-19 15:16:55 +00:00
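The refcounting idea described in this commit can be sketched roughly as follows. This is a minimal single-threaded illustration, not turnip's actual code: `struct bo`, `struct device`, `bo_import`, `bo_finish`, the fixed-size table, and the `closed_handles` counter (a stand-in for the real GEM close call) are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BOS 64 /* hypothetical fixed table size for this sketch */

struct bo {
   uint32_t gem_handle; /* 0 marks an unused slot */
   uint32_t refcnt;
};

struct device {
   struct bo bos[MAX_BOS];
   unsigned closed_handles; /* counts simulated GEM close calls */
};

/* Look up or create the BO for a handle. A self-import yields the same
 * gem_handle again, so it only gains a reference instead of a new BO. */
static struct bo *
bo_import(struct device *dev, uint32_t gem_handle)
{
   assert(gem_handle != 0);

   struct bo *free_slot = NULL;
   for (unsigned i = 0; i < MAX_BOS; i++) {
      struct bo *bo = &dev->bos[i];
      if (bo->gem_handle == gem_handle) {
         bo->refcnt++;
         return bo;
      }
      if (bo->gem_handle == 0 && free_slot == NULL)
         free_slot = bo;
   }

   if (free_slot == NULL)
      return NULL; /* table full */

   free_slot->gem_handle = gem_handle;
   free_slot->refcnt = 1;
   return free_slot;
}

/* Drop one reference. Returns true only when the last reference went away
 * and the handle was actually closed. */
static bool
bo_finish(struct device *dev, struct bo *bo)
{
   assert(bo->refcnt > 0);
   if (--bo->refcnt > 0)
      return false;

   dev->closed_handles++; /* stand-in for the real GEM close ioctl */
   bo->gem_handle = 0;
   return true;
}
```

With this shape, importing the same handle twice returns one BO with a refcount of 2, and only the second `bo_finish` closes it. A real driver would also need locking around the table, which the sketch leaves out.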
Lionel Landwerlin
2763a8af5a
anv/genxml/intel/fs: fix binding shader record entry
...
Bit is flipped compared to all the other packets.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Fixes: 705395344d ("intel/fs: Add support for compiling bindless shaders with resume shaders")
Fixes: c3ac9afca3 ("anv: Create and return ray-tracing pipeline SBT handles")
Acked-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15078 >
2022-02-19 13:50:56 +00:00
Chia-I Wu
5f3e50b27c
venus: trace vn_ring_wait_space
...
It is good to know that we run out of ring space and have to wait. This
happens easily with fossilize-replay because encoding a
vkCreateGraphicsPipeline takes microseconds while executing it can take
milliseconds, >100ms sometimes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966 >
2022-02-19 03:57:30 +00:00
Chia-I Wu
7cb2e9a8f0
venus: cache VkFormatProperties
...
This is for fossilize-replay which keeps querying for the same formats.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14966 >
2022-02-19 03:57:30 +00:00
Alyssa Rosenzweig
e392dd8237
pan/bi: Promote MUX to CSEL in the scheduler
...
Helps scheduling, and makes scheduling more predictable when deciding between
MUX and CSEL.
total tuples in shared programs: 1523328 -> 1516256 (-0.46%)
tuples in affected programs: 509800 -> 502728 (-1.39%)
helped: 1977
HURT: 181
helped stats (abs) min: 1.0 max: 48.0 x̄: 3.71 x̃: 2
helped stats (rel) min: 0.04% max: 14.29% x̄: 1.98% x̃: 1.28%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.43 x̃: 1
HURT stats (rel) min: 0.14% max: 7.69% x̄: 1.40% x̃: 0.70%
95% mean confidence interval for tuples value: -3.47 -3.08
95% mean confidence interval for tuples %-change: -1.79% -1.60%
Tuples are helped.
total clauses in shared programs: 350552 -> 349906 (-0.18%)
clauses in affected programs: 34839 -> 34193 (-1.85%)
helped: 570
HURT: 49
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.22 x̃: 1
helped stats (rel) min: 0.67% max: 20.00% x̄: 3.26% x̃: 2.22%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.92% max: 16.67% x̄: 4.31% x̃: 4.17%
95% mean confidence interval for clauses value: -1.13 -0.96
95% mean confidence interval for clauses %-change: -2.95% -2.38%
Clauses are helped.
total cycles in shared programs: 202589.37 -> 202512.25 (-0.04%)
cycles in affected programs: 7644.46 -> 7567.33 (-1.01%)
helped: 771
HURT: 147
helped stats (abs) min: 0.041665999999999315 max: 1.8333360000000027 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.16% max: 14.29% x̄: 2.10% x̃: 1.35%
HURT stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.24% max: 7.41% x̄: 1.49% x̃: 1.11%
95% mean confidence interval for cycles value: -0.09 -0.07
95% mean confidence interval for cycles %-change: -1.69% -1.36%
Cycles are helped.
total arith in shared programs: 56755.96 -> 56585.50 (-0.30%)
arith in affected programs: 18746.29 -> 18575.83 (-0.91%)
helped: 1605
HURT: 352
helped stats (abs) min: 0.04166399999999726 max: 1.8333360000000027 x̄: 0.12 x̃: 0
helped stats (rel) min: 0.07% max: 20.00% x̄: 1.92% x̃: 1.12%
HURT stats (abs) min: 0.041665999999999315 max: 0.3333340000000007 x̄: 0.06 x̃: 0
HURT stats (rel) min: 0.17% max: 33.33% x̄: 2.09% x̃: 1.08%
95% mean confidence interval for arith value: -0.09 -0.08
95% mean confidence interval for arith %-change: -1.34% -1.07%
Arith are helped.
total quadwords in shared programs: 1429737 -> 1424670 (-0.35%)
quadwords in affected programs: 418175 -> 413108 (-1.21%)
helped: 1682
HURT: 198
helped stats (abs) min: 1.0 max: 35.0 x̄: 3.17 x̃: 2
helped stats (rel) min: 0.04% max: 13.33% x̄: 1.72% x̃: 1.29%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.38 x̃: 1
HURT stats (rel) min: 0.15% max: 7.41% x̄: 1.30% x̃: 0.92%
95% mean confidence interval for quadwords value: -2.86 -2.53
95% mean confidence interval for quadwords %-change: -1.48% -1.32%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
a8418abd74
pan/bi: Revert "Fix load_const of 1-bit booleans"
...
This reverts commit 29d319c767.
Now that we use nir_lower_bool_to_bitsize, we don't see 1-bit booleans
anymore, so the issue this fixed doesn't apply. Actually, that issue was
(in part) why I started looking into boolean handling in the first
place.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
21bdee7bcc
pan/bi: Switch to lower_bool_to_bitsize
...
Instead of ingesting 1-bit booleans and trying to force everything to be
16-bit, except when it isn't, and creating a mess in the backend... just
use the NIR pass designed to select bitsize for booleans. Yes, this
means we need to handle more NIR instructions, but the handling is
easier and the conversion is more obvious (except for some edge cases
like 16-bit vectorized b32csel). This generates noticeably better code,
and the generated code will be easier to optimize.
total instructions in shared programs: 90257 -> 88941 (-1.46%)
instructions in affected programs: 49145 -> 47829 (-2.68%)
helped: 201
HURT: 2
helped stats (abs) min: 1.0 max: 40.0 x̄: 6.57 x̃: 3
helped stats (rel) min: 0.29% max: 13.89% x̄: 2.57% x̃: 1.90%
HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
HURT stats (rel) min: 2.15% max: 2.74% x̄: 2.45% x̃: 2.45%
95% mean confidence interval for instructions value: -7.71 -5.26
95% mean confidence interval for instructions %-change: -2.84% -2.20%
Instructions are helped.
total tuples in shared programs: 73740 -> 72922 (-1.11%)
tuples in affected programs: 36564 -> 35746 (-2.24%)
helped: 184
HURT: 7
helped stats (abs) min: 1.0 max: 74.0 x̄: 4.49 x̃: 2
helped stats (rel) min: 0.30% max: 16.67% x̄: 2.86% x̃: 1.89%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.29 x̃: 1
HURT stats (rel) min: 0.12% max: 12.50% x̄: 4.26% x̃: 3.33%
95% mean confidence interval for tuples value: -5.29 -3.28
95% mean confidence interval for tuples %-change: -3.06% -2.13%
Tuples are helped.
total clauses in shared programs: 15993 -> 15928 (-0.41%)
clauses in affected programs: 2464 -> 2399 (-2.64%)
helped: 35
HURT: 16
helped stats (abs) min: 1.0 max: 27.0 x̄: 2.31 x̃: 1
helped stats (rel) min: 0.49% max: 18.88% x̄: 7.63% x̃: 5.88%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.79% max: 6.25% x̄: 1.91% x̃: 1.01%
95% mean confidence interval for clauses value: -2.46 -0.09
95% mean confidence interval for clauses %-change: -6.38% -2.90%
Clauses are helped.
total cycles in shared programs: 7622.13 -> 7594.75 (-0.36%)
cycles in affected programs: 1078.67 -> 1051.29 (-2.54%)
helped: 103
HURT: 4
helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.27 x̃: 0
helped stats (rel) min: 0.32% max: 21.05% x̄: 3.62% x̃: 2.44%
HURT stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0
HURT stats (rel) min: 0.13% max: 7.14% x̄: 2.94% x̃: 2.25%
95% mean confidence interval for cycles value: -0.33 -0.19
95% mean confidence interval for cycles %-change: -4.14% -2.61%
Cycles are helped.
total arith in shared programs: 2762.46 -> 2728.08 (-1.24%)
arith in affected programs: 1550.12 -> 1515.75 (-2.22%)
helped: 197
HURT: 6
helped stats (abs) min: 0.041665999999999315 max: 3.0833319999999986 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.32% max: 21.05% x̄: 2.93% x̃: 1.61%
HURT stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0
HURT stats (rel) min: 0.13% max: 20.00% x̄: 5.78% x̃: 3.37%
95% mean confidence interval for arith value: -0.21 -0.13
95% mean confidence interval for arith %-change: -3.20% -2.15%
Arith are helped.
total quadwords in shared programs: 68155 -> 67555 (-0.88%)
quadwords in affected programs: 27944 -> 27344 (-2.15%)
helped: 151
HURT: 9
helped stats (abs) min: 1.0 max: 52.0 x̄: 4.09 x̃: 3
helped stats (rel) min: 0.23% max: 12.35% x̄: 2.87% x̃: 2.17%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.89 x̃: 1
HURT stats (rel) min: 0.20% max: 6.76% x̄: 1.91% x̃: 1.13%
95% mean confidence interval for quadwords value: -4.67 -2.83
95% mean confidence interval for quadwords %-change: -2.99% -2.21%
Quadwords are helped.
total threads in shared programs: 2232 -> 2233 (0.04%)
threads in affected programs: 1 -> 2 (100.00%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
a64534754d
pan/bi: Handle vectorized u2f16/i2f16
...
Will be useful when we enable int16, I guess...
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
6a05852f5b
pan/bi: Handle trivial i2i32
...
lower_bool_to_bitsize can generate i2i32 from a 32-bit source, which is
trivial but needs to be handled explicitly to avoid going down the 8-bit
conversion path.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
f7d44a46cd
pan/bi: Optimize replication
...
Bifrost's 16-bit support comes in the form of vectorized instructions,
so when we manipulate scalars, we usually replicate to both bottom and
top halves of 32-bit registers. Add an analysis pass that detects
replication. Then, use that replication pass to optimize out useless
swizzle instructions (by changing them to plain moves, which can be
copypropped).
This optimization is a slight shader-db win on its own, and allows us to
transition to lower_bool_to_bitsize without regressing shader-db.
total instructions in shared programs: 90323 -> 90257 (-0.07%)
instructions in affected programs: 2513 -> 2447 (-2.63%)
helped: 20
HURT: 0
helped stats (abs) min: 1.0 max: 16.0 x̄: 3.30 x̃: 2
helped stats (rel) min: 1.25% max: 11.11% x̄: 4.80% x̃: 4.29%
95% mean confidence interval for instructions value: -5.05 -1.55
95% mean confidence interval for instructions %-change: -6.06% -3.54%
Instructions are helped.
total tuples in shared programs: 73769 -> 73740 (-0.04%)
tuples in affected programs: 1611 -> 1582 (-1.80%)
helped: 17
HURT: 0
helped stats (abs) min: 1.0 max: 9.0 x̄: 1.71 x̃: 1
helped stats (rel) min: 0.58% max: 16.67% x̄: 4.80% x̃: 3.33%
95% mean confidence interval for tuples value: -2.70 -0.71
95% mean confidence interval for tuples %-change: -7.06% -2.54%
Tuples are helped.
total clauses in shared programs: 15997 -> 15993 (-0.03%)
clauses in affected programs: 27 -> 23 (-14.81%)
helped: 4
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 7.69% max: 25.00% x̄: 18.17% x̃: 20.00%
95% mean confidence interval for clauses value: -1.00 -1.00
95% mean confidence interval for clauses %-change: -29.91% -6.44%
Clauses are helped.
total cycles in shared programs: 7623.13 -> 7622.13 (-0.01%)
cycles in affected programs: 64.83 -> 63.83 (-1.54%)
helped: 13
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.08 x̃: 0
helped stats (rel) min: 1.02% max: 5.56% x̄: 2.82% x̃: 2.50%
95% mean confidence interval for cycles value: -0.13 -0.02
95% mean confidence interval for cycles %-change: -3.79% -1.85%
Cycles are helped.
total arith in shared programs: 2763.75 -> 2762.46 (-0.05%)
arith in affected programs: 67.17 -> 65.88 (-1.92%)
helped: 18
HURT: 0
helped stats (abs) min: 0.0416660000000002 max: 0.375 x̄: 0.07 x̃: 0
helped stats (rel) min: 1.02% max: 22.22% x̄: 5.68% x̃: 3.16%
95% mean confidence interval for arith value: -0.11 -0.03
95% mean confidence interval for arith %-change: -8.56% -2.80%
Arith are helped.
total quadwords in shared programs: 68173 -> 68155 (-0.03%)
quadwords in affected programs: 1258 -> 1240 (-1.43%)
helped: 14
HURT: 0
helped stats (abs) min: 1.0 max: 3.0 x̄: 1.29 x̃: 1
helped stats (rel) min: 0.42% max: 8.70% x̄: 3.88% x̃: 3.67%
95% mean confidence interval for quadwords value: -1.64 -0.93
95% mean confidence interval for quadwords %-change: -5.27% -2.49%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
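The property that the replication analysis relies on can be illustrated on concrete values. The sketch below is not the compiler pass itself (which reasons about IR, not runtime values); the `enum swizzle` names and helper functions are hypothetical. It only shows why a half-word swizzle of a replicated value is equivalent to a plain move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-word swizzles on a 32-bit register: the first digit
 * names the half placed in the bottom 16 bits, the second the top. */
enum swizzle { SWZ_H01, SWZ_H00, SWZ_H11, SWZ_H10 };

/* Replicate a 16-bit scalar into both halves of a 32-bit register */
static uint32_t
replicate16(uint16_t x)
{
   return (uint32_t)x | ((uint32_t)x << 16);
}

/* A value is replicated when its bottom and top halves match */
static bool
is_replicated(uint32_t v)
{
   return (v & 0xffff) == (v >> 16);
}

static uint32_t
apply_swizzle(uint32_t v, enum swizzle s)
{
   uint32_t lo = v & 0xffff, hi = v >> 16;

   switch (s) {
   case SWZ_H00: return lo | (lo << 16);
   case SWZ_H11: return hi | (hi << 16);
   case SWZ_H10: return hi | (lo << 16);
   case SWZ_H01:
   default:      return v; /* identity */
   }
}

/* If the source is known to be replicated, every swizzle above returns the
 * source unchanged, so the instruction can become a move. */
static bool
swizzle_is_move(uint32_t src, enum swizzle s)
{
   assert(s <= SWZ_H10);
   return s == SWZ_H01 || is_replicated(src);
}
```

For example, `replicate16(0x1234)` gives `0x12341234`, and applying any of the four swizzles to it returns the same value, whereas `0xAAAABBBB` is not replicated and `SWZ_H10` changes it.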
Alyssa Rosenzweig
35ff537814
pan/bi: Constant fold swizzles on constants
...
This lets us avoid generating SWZ instructions. Those instructions could
be constant folded but that complicates the replication analysis
introduced in the next commit.
Almost no shader-db changes.
quadwords HURT: shaders/glmark/1-22.shader_test MESA_SHADER_FRAGMENT: 718 -> 722 (0.56%)
total quadwords in shared programs: 68169 -> 68173 (<.01%)
quadwords in affected programs: 718 -> 722 (0.56%)
helped: 0
HURT: 1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
62533a6e64
pan/bi: Lower swizzles on MUX.v2i16
...
We'll generate this in a moment.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Alyssa Rosenzweig
8bd4976d98
pan/bi: Lower swizzles on CSEL.i32/MUX.i32
...
This is counter-intuitive, but required for correct operation when
CSEL.i32 takes a 1-bit (stored 16-bit) boolean argument. The impedance
mismatch ultimately is between CSEL.b32 (nir's bcsel, nonexistent in the
hardware) and the lowering CSEL.i32. However, a similar problem exists
even with MUX.i32 which lacks a good way of zero/sign-extending
booleans.
Cherry-picked from my Valhall branch though the issue also affects
Bifrost. Fixes piglit shaders@glsl-vs-if-bool on Bifrost.
Unfortunately, shader-db is quite unhappy :-(
The proper fix is to use lower_bool_to_bitsize, but that can't be
backported to mesa-stable.
total instructions in shared programs: 157539 -> 158953 (0.90%)
instructions in affected programs: 55621 -> 57035 (2.54%)
helped: 2
HURT: 259
helped stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2
helped stats (rel) min: 2.11% max: 2.67% x̄: 2.39% x̃: 2.39%
HURT stats (abs) min: 1.0 max: 40.0 x̄: 5.47 x̃: 2
HURT stats (rel) min: 0.36% max: 16.13% x̄: 2.55% x̃: 1.59%
95% mean confidence interval for instructions value: 4.44 6.40
95% mean confidence interval for instructions %-change: 2.21% 2.82%
Instructions are HURT.
total tuples in shared programs: 132322 -> 132907 (0.44%)
tuples in affected programs: 31806 -> 32391 (1.84%)
helped: 5
HURT: 152
helped stats (abs) min: 1.0 max: 2.0 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.39% max: 3.03% x̄: 1.70% x̃: 1.61%
HURT stats (abs) min: 1.0 max: 42.0 x̄: 3.89 x̃: 2
HURT stats (rel) min: 0.29% max: 18.18% x̄: 2.50% x̃: 1.79%
95% mean confidence interval for tuples value: 2.88 4.58
95% mean confidence interval for tuples %-change: 1.87% 2.85%
Tuples are HURT.
total clauses in shared programs: 28672 -> 28698 (0.09%)
clauses in affected programs: 869 -> 895 (2.99%)
helped: 1
HURT: 24
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 5.88% max: 5.88% x̄: 5.88% x̃: 5.88%
HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.12 x̃: 1
HURT stats (rel) min: 0.49% max: 33.33% x̄: 8.46% x̃: 3.59%
95% mean confidence interval for clauses value: 0.82 1.26
95% mean confidence interval for clauses %-change: 3.84% 11.93%
Clauses are HURT.
total cycles in shared programs: 15119.04 -> 15137.88 (0.12%)
cycles in affected programs: 922.87 -> 941.71 (2.04%)
helped: 4
HURT: 79
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.05 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.57% x̃: 1.35%
HURT stats (abs) min: 0.041665999999999315 max: 1.75 x̄: 0.24 x̃: 0
HURT stats (rel) min: 0.30% max: 20.00% x̄: 2.83% x̃: 2.12%
95% mean confidence interval for cycles value: 0.17 0.29
95% mean confidence interval for cycles %-change: 1.86% 3.37%
Cycles are HURT.
total arith in shared programs: 4922.71 -> 4947.71 (0.51%)
arith in affected programs: 1423.79 -> 1448.79 (1.76%)
helped: 5
HURT: 177
helped stats (abs) min: 0.0416669999999999 max: 0.0833330000000001 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.40% max: 3.17% x̄: 1.82% x̃: 1.67%
HURT stats (abs) min: 0.041665999999999315 max: 1.75 x̄: 0.14 x̃: 0
HURT stats (rel) min: 0.30% max: 22.22% x̄: 2.50% x̃: 1.52%
95% mean confidence interval for arith value: 0.11 0.17
95% mean confidence interval for arith %-change: 1.86% 2.90%
Arith are HURT.
total quadwords in shared programs: 120605 -> 120956 (0.29%)
quadwords in affected programs: 26535 -> 26886 (1.32%)
helped: 6
HURT: 143
helped stats (abs) min: 1.0 max: 7.0 x̄: 2.83 x̃: 1
helped stats (rel) min: 0.93% max: 6.33% x̄: 2.29% x̃: 1.71%
HURT stats (abs) min: 1.0 max: 21.0 x̄: 2.57 x̃: 2
HURT stats (rel) min: 0.34% max: 13.79% x̄: 2.02% x̃: 1.22%
95% mean confidence interval for quadwords value: 1.86 2.86
95% mean confidence interval for quadwords %-change: 1.45% 2.24%
Quadwords are HURT.
total threads in shared programs: 4670 -> 4669 (-0.02%)
threads in affected programs: 2 -> 1 (-50.00%)
helped: 0
HURT: 1
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14576 >
2022-02-19 03:02:10 +00:00
Emma Anholt
a2b7d9b9cd
ci/freedreno: Add a known spilling hangcheck flake.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085 >
2022-02-19 02:37:13 +00:00
Emma Anholt
b39d5e9705
ci/freedreno: Cut down pre-merge a630 VK coverage.
...
We've got lots of VK coverage on 618, so take some of the load off (but
leave a little bit of testing just to make sure we don't totally break
630). This should help with our Marge times since we've added some other
coverage to 630 that's started overloading the runners.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085 >
2022-02-19 02:37:13 +00:00
Emma Anholt
04790ec8bb
ci/freedreno: Move a 60s timeout test to skips instead of flakes.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15085 >
2022-02-19 02:37:13 +00:00
Connor Abbott
7e8d885919
spirv: Rewrite determinant calculation
...
The old calculation for mat3 was clever, but it turns out that a
straightforward application of subdeterminants similar to how mat4 is
handled is more efficient: on a scalar architecture with some sort of
combined multiply+add instruction with a negate modifier (both fairly
common), the new determinant is 9 instructions vs. 15 for the old one,
and without the multiply-add it's 14 instructions vs. 18 for the old
one. When used as a routine for inverse() the savings are compounded,
because we now use the same method as used to compute the adjugate
matrix and so CSE can combine most of the calculations with the adjugate
matrix ones.
Once mat3 and mat4 use the same method for computing determinants, we
can combine them into a single recursive function. I also pulled up the
mat_subdet() function because it was doing basically what we need, so
it's now shared between determinant and inverse. This shrinks the
implementation significantly, as can be seen from the diffstat.
The real reason I want to change this, though, is that it fixes
dEQP-VK.glsl.builtin.precision_fp16_storage16b.inverse.compute.mat3 with
turnip. Qualcomm uses round-to-zero for 16-bit frcp, which combined with
some inaccuracy in the old method of calculating the determinant led us
to fail. Qualcomm's driver uses something like the new method to
calculate the determinant in the inverse. We could argue that Mesa's
method should be allowed, because round-to-zero for floating-point
division is within spec and there are no precision guarantees given for
determinant() or inverse(). However, we might as well use the more
efficient method.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14652 >
2022-02-19 02:03:25 +00:00
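The "single recursive function" built on subdeterminants can be sketched in plain scalar C. This is only an illustration of the math (cofactor expansion along the top remaining row), not the NIR-building code in the SPIR-V front end; `det_rec`, `determinant`, and `MAX_N` are hypothetical names.

```c
#include <assert.h>

#define MAX_N 4 /* covers mat2, mat3 and mat4 */

/* Determinant of the submatrix made of rows [row, n) and the columns whose
 * bits are still set in `cols`, by expansion along row `row`. */
static float
det_rec(float m[MAX_N][MAX_N], unsigned n, unsigned row, unsigned cols)
{
   if (row == n)
      return 1.0f; /* empty submatrix */

   float sum = 0.0f;
   float sign = 1.0f;

   for (unsigned c = 0; c < n; c++) {
      if (!(cols & (1u << c)))
         continue;

      /* Element times the subdeterminant with this column removed; the
       * sign alternates over the columns that remain. */
      sum += sign * m[row][c] * det_rec(m, n, row + 1, cols & ~(1u << c));
      sign = -sign;
   }

   return sum;
}

static float
determinant(float m[MAX_N][MAX_N], unsigned n)
{
   assert(n >= 1 && n <= MAX_N);
   return det_rec(m, n, 0, (1u << n) - 1);
}
```

For a 3x3 input this expands to m00*(m11*m22 - m12*m21) - m01*(m10*m22 - m12*m20) + m02*(m10*m21 - m11*m20), and the 2x2 subdeterminants are the same terms an adjugate computation needs, which is what lets CSE share work between the two.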
Connor Abbott
c21065c87a
util/blob: Clarify rules on blob::data
...
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028 >
2022-02-19 01:25:46 +00:00
Connor Abbott
6761550357
nir/serialize: Don't access blob->data directly
...
It won't work if the blob is fixed-size and we overrun the size, which
will be the case with the Vulkan pipeline cache.
This gets a bit tricky for the repeated-header optimization, because we
can't read the header from the blob. Instead we have to store the header
itself.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15028 >
2022-02-19 01:25:46 +00:00
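The constraint behind this change can be sketched with a toy fixed-size writer. This is not Mesa's util/blob API or the nir_serialize encoding: `struct fixed_blob`, `struct writer`, the tag bytes, and `write_header` are hypothetical. The point is only that once a fixed-size buffer overruns, its contents cannot be read back, so the writer keeps its own copy of the last header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct fixed_blob {
   uint8_t *data;
   size_t capacity;
   size_t size;  /* bytes requested so far, may exceed capacity */
   bool overrun; /* set once a write did not fit */
};

static void
fixed_blob_write(struct fixed_blob *b, const void *src, size_t n)
{
   /* After an overrun nothing more is stored, only counted */
   if (!b->overrun && n <= b->capacity - b->size)
      memcpy(b->data + b->size, src, n);
   else
      b->overrun = true;

   b->size += n;
}

struct writer {
   struct fixed_blob *blob;
   uint32_t last_header; /* our own copy, never read back from the blob */
   bool has_last;
};

/* Emit tag 1 for "same header as before", or tag 0 plus the full header.
 * Returns true when the full header was written. */
static bool
write_header(struct writer *w, uint32_t header)
{
   assert(w->blob != NULL);

   if (w->has_last && w->last_header == header) {
      uint8_t tag = 1;
      fixed_blob_write(w->blob, &tag, sizeof(tag));
      return false;
   }

   uint8_t tag = 0;
   fixed_blob_write(w->blob, &tag, sizeof(tag));
   fixed_blob_write(w->blob, &header, sizeof(header));
   w->last_header = header;
   w->has_last = true;
   return true;
}
```

Because the comparison uses `last_header` rather than the buffer, the writer keeps making the same decisions and reporting the same total size whether or not the buffer has overrun.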
Alyssa Rosenzweig
9168dcbbc1
pan/bi: Disambiguate IDVS variants in shader-db
...
Label IDVS variants as being MESA_SHADER_{POSITION, VARYING} stages;
reserve the MESA_SHADER_VERTEX label for non-IDVS shaders. This reduces
confusion where a single shader compiles to two MESA_SHADER_VERTEX
shaders with different stats.
While we're at it, de-vendor the blend shader stage name; these stats
are internal anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15086 >
2022-02-19 00:01:07 +00:00
Alyssa Rosenzweig
01d1bf6228
asahi: Wire in pure integer texture formats
...
Passes dEQP-GLES3.functional.texture.format.sized.2d.r*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig
fded99b1c5
asahi: Support LOD clamps
...
Passes:
dEQP-GLES3.functional.texture.mipmap.2d.min_lod.*
dEQP-GLES3.functional.texture.mipmap.2d.max_lod.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig
cc3e98e201
asahi: Identify minimum/maximum LOD fields
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:33 +00:00
Alyssa Rosenzweig
6554790dfb
asahi: Add LOD clamp packing unit tests
...
With GTest.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
e3a5c1b478
asahi: Add LOD type
...
Automatically packs and unpacks float <==> clamped 4:6 fixed point, used
for min/max LOD fields on the Sampler descriptor.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
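A rough sketch of the float <==> clamped 4:6 fixed-point conversion follows. It assumes "4:6" means 4 integer bits and 6 fractional bits in an unsigned 10-bit field, and that packing rounds to nearest; both are assumptions made for illustration, and `pack_lod` / `unpack_lod` are hypothetical names rather than the GenXML-generated helpers.

```c
#include <assert.h>
#include <stdint.h>

#define LOD_FRAC_BITS 6
#define LOD_MAX_FIXED 0x3ffu /* assumed: 4 integer + 6 fractional bits, all ones */

/* Pack a float LOD into unsigned 4:6 fixed point, clamping to the
 * representable range [0, 1023/64]. */
static uint32_t
pack_lod(float lod)
{
   if (!(lod > 0.0f))
      return 0; /* negative values and NaN clamp to zero */

   float scaled = lod * (float)(1 << LOD_FRAC_BITS);
   if (scaled >= (float)LOD_MAX_FIXED)
      return LOD_MAX_FIXED;

   return (uint32_t)(scaled + 0.5f); /* round to nearest (assumed) */
}

/* Unpack 4:6 fixed point back to a float; exact for all 10-bit inputs */
static float
unpack_lod(uint32_t fixed)
{
   assert(fixed <= LOD_MAX_FIXED);
   return (float)fixed / (float)(1 << LOD_FRAC_BITS);
}
```

Under these assumptions a LOD of 2.5 packs to 160 and unpacks back to 2.5, while anything at or above 1023/64 (about 15.98) saturates to 0x3ff.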
Alyssa Rosenzweig
db93090ffc
asahi: Allow GenXML to be used in C++
...
C++ requires explicit casts from integers to enums. Fixes errors like
the following when trying to use Asahi GenXML from a GTest unit test.
src/asahi/lib/agx_pack.h:554:23: error: assigning to 'enum agx_channels' from incompatible type 'uint64_t' (aka 'unsigned long long')
values->channels = __gen_unpack_uint(cl, 0, 6);
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
055c5a59f8
agx: Round and clamp array indices
...
Conforming with the GLSL spec. Fixes:
dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darray_fixed_fragment
(and probably others)
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
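The round-and-clamp rule can be written out on the CPU as a minimal sketch. In the driver this happens in generated shader code, so the function below is only an illustration; `clamp_array_index` is a hypothetical name, its second parameter mirrors the "array size minus 1" value pushed as a uniform, and rounding halves upward is an assumption about tie handling.

```c
#include <assert.h>
#include <stdint.h>

/* Select an array layer from a float coordinate: round to the nearest
 * integer, then clamp to [0, array_size_minus_1]. */
static uint32_t
clamp_array_index(float layer, uint32_t array_size_minus_1)
{
   /* Keeps the integer-to-float conversion below exact */
   assert(array_size_minus_1 < (1u << 24));

   if (!(layer > 0.0f))
      return 0; /* negative values and NaN select layer 0 */

   if (layer >= (float)array_size_minus_1)
      return array_size_minus_1;

   /* layer is positive here, so truncation equals floor(layer + 0.5) */
   return (uint32_t)(layer + 0.5f);
}
```

With four layers (`array_size_minus_1 == 3`), a coordinate of 1.4 selects layer 1, 1.6 selects layer 2, and 7.0 is clamped to layer 3.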
Alyssa Rosenzweig
a822b7b6cc
agx: Naturally align uniform pushes
...
Required to pack correctly, e.g. if we push a 16-bit value then a 64-bit
value.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
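Natural alignment here means each pushed value starts at an offset that is a multiple of its own size. A minimal sketch, assuming offsets and sizes are counted in 16-bit units and sizes are powers of two (`struct push_layout`, `align_pot`, and `push_uniform` are hypothetical names, not the driver's helpers):

```c
#include <assert.h>

/* Round v up to a multiple of a, where a is a power of two */
static unsigned
align_pot(unsigned v, unsigned a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

struct push_layout {
   unsigned next; /* first free slot, in 16-bit units */
};

/* Reserve space for a value of size_16 16-bit units and return its
 * naturally aligned offset. */
static unsigned
push_uniform(struct push_layout *layout, unsigned size_16)
{
   unsigned offset = align_pot(layout->next, size_16);
   layout->next = offset + size_16;
   return offset;
}
```

Pushing a 16-bit value (size 1) and then a 64-bit value (size 4) places them at offsets 0 and 4, leaving three padding slots, instead of packing the 64-bit value at the misaligned offset 1.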
Alyssa Rosenzweig
0c2bbb470a
agx: Add agx_size_align_16 helper
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
9aeb5156bc
agx: Add typed move helper
...
Useful for u2u16 in lowering code.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
830d16e9f0
asahi: Add AGX_PUSH_ARRAY_SIZE_MINUS_1
...
Required to clamp array indices against the array sizes per the GLSL
spec. Metal also does this, implying it's required by the hardware for
correct operation.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
7b4ea2fd38
asahi: Implement texturing with non-zero start level
...
Unsure if this comes up anywhere.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
11072cfd21
asahi: Handle reloads of specific cube/mipfaces
...
The texture descriptor we construct for reloading needs to respect the
surface's texture/layer selection. Fix exactly the same bug as
b8c31ac06d ("lima: fix glCopyTexSubImage2D").
Fixes:
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgb
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.2d_rgba
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgb
dEQP-GLES2.functional.texture.specification.basic_copytexsubimage2d.cube_rgba
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
062ca49ca7
asahi: Add agx_map_texture_{cpu,gpu} helpers
...
Streamline access to particular layer/levels. These patterns show up
across the driver and are easy to screw up, so add a helper.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
a8bf729f8a
asahi: Support 2D array and 3D textures
...
As far as I can tell, these *must* be tiled. Other than that, the
implementation is completely routine. Passes
dEQP-GLES3.functional.texture.format.unsized.*2d_array*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
204e2ffe1b
asahi: Track mipmap state explicitly
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
e714fae263
asahi: Pass correct tile shift to tiling routines
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
5f10ffd6e2
asahi: Handle page alignment of miptrees
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
2c490cd4e3
asahi: Align linear texture's strides to 64 bytes
...
Required to pack the stride, and should improve cache performance.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
5d957011ff
asahi: Align allocations to effective tile size
...
May be smaller than 64x64.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
25f48996a6
asahi: Rename bpp to blocksize
...
Will matter for block compressed formats.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
856f64de24
asahi: Allow tiling of all bpps
...
Use the usual macro trick via Panfrost. Fixes textures with formats with
non-32-bit bpp, including:
dEQP-GLES2.functional.texture.specification.basic_teximage2d.*
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
2028873ef6
asahi: Dynamically configure tile size
...
We need to shrink the tile size when using small images (including
due to mipmapping) or when using large block sizes.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
d103d64df6
asahi: Add some notes to XML about mipmapping
...
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
aea6d7f17f
asahi: Handle tiling of 2D arrays and 3D
...
Nothing special required, just need to respect the Z coordinate.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
06b2d97666
asahi: Add 2D Array and 3D texture dimensions
...
Add to XML and translate in the driver.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00
Alyssa Rosenzweig
266382d252
asahi: Respect mip level when rendering
...
Use hardware mip level field.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14903 >
2022-02-18 23:48:32 +00:00