Alyssa Rosenzweig
accffda30d
pan/bi: Skip ATEST for colour blit shaders
...
Small win.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9002 >
2021-02-12 12:33:19 +00:00
Alyssa Rosenzweig
e279606232
panfrost: Pass is_blit flag around
...
There are blit shader specific optimizations available.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9002 >
2021-02-12 12:33:19 +00:00
Erik Faye-Lund
5159f406d8
zink: use gallium api to copy to display-target
...
This allows us to avoid forcing linear and host-visible
display-targets.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
1b8b14172f
zink: ignore irrelevant bind-flags
...
We don't need to create display-targets for shared or scanout, because
we never even see those in the sw-winsys case.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
9d0ad591f9
zink: limit host-visible bind-flags
...
The only type that really needs to be host-visible is the
display-target, and that's just because of our silly flush_frontbuffer
implementation.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
9fc179c774
zink: don't always require linear display-targets
...
We only need these display-targets to be linear in the case of a
software winsys. In the DRM case, they can be tiled without issues.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
708327472b
zink: do not use extra staging resource unless needed
...
The reason we check for staging-resources here is really because they
are the only images guaranteed to be host-visible.
But on UMA architectures, it's quite likely to have memory that is
*both* host-visible *and* device-local, so let's see what we found
instead.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
5e4ae3466b
zink: drop extra set of parens
...
We don't need to be doubly sure here, we can just use a single set of
parens instead.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8858 >
2021-02-12 11:24:50 +00:00
Erik Faye-Lund
8e52b7b46d
lavapipe: handle null-buffers for xfb
...
The Vulkan spec says the following for vkCmdBeginTransformFeedbackEXT:
"For each element of pCounterBuffers that is VK_NULL_HANDLE, transform
feedback will start capturing vertex data to byte zero in the
corresponding bound transform feedback buffer."
While not quite as explicit, similar wording exists for
vkCmdEndTransformFeedbackEXT in the "Valid Usage" section.
So, this means that we should handle NULL in this case, and simply
ignore the corresponding reads and writes.
This fixes a whole lot of crashes when using transform-feedback with
Zink.
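A minimal sketch of the null handling described above, not the lavapipe
code itself. The struct and function names are hypothetical, and a NULL
pointer stands in for VK_NULL_HANDLE:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a counter buffer holding one byte offset. */
struct counter_buffer {
   uint32_t offset;
};

/* On begin: read the start offset per binding; a null counter buffer
 * (or a null array) means capture starts at byte zero, so skip the read. */
static void
xfb_begin(struct counter_buffer *const *counters, size_t count,
          uint32_t *start_offsets)
{
   assert(start_offsets != NULL);
   for (size_t i = 0; i < count; i++) {
      if (counters != NULL && counters[i] != NULL)
         start_offsets[i] = counters[i]->offset;
      else
         start_offsets[i] = 0;
   }
}

/* On end: write back the final offset, skipping null counter buffers. */
static void
xfb_end(struct counter_buffer *const *counters, size_t count,
        const uint32_t *written)
{
   assert(written != NULL);
   for (size_t i = 0; i < count; i++) {
      if (counters != NULL && counters[i] != NULL)
         counters[i]->offset = written[i];
   }
}
```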
Reviewed-by: Dave Airlie <airlied@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8982 >
2021-02-12 11:33:27 +01:00
Giovanni Mascellani
72b8e643b0
anv: Allow null handle in DestroyDescriptorUpdateTemplate.
...
By the Vulkan specification, and similarly to many other Vulkan calls,
it is allowed to destroy a null descriptor update template.
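As an illustration only (the names below are hypothetical and this is
not the anv implementation), the fix amounts to an early return when the
handle is null:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor update template object. */
struct update_template {
   int entry_count;
};

static int live_templates; /* bookkeeping for the example only */

static struct update_template *
create_template(void)
{
   struct update_template *templ = calloc(1, sizeof(*templ));
   if (templ != NULL)
      live_templates++;
   return templ;
}

static void
destroy_template(struct update_template *templ)
{
   /* Destroying a null handle is valid and must be a no-op. */
   if (templ == NULL)
      return;

   assert(live_templates > 0);
   live_templates--;
   free(templ);
}
```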
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com >
Fixes: af5f13e58c ("anv: add VK_KHR_descriptor_update_template support")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9005 >
2021-02-12 09:30:46 +00:00
Iago Toral Quiroga
82981ccbb1
broadcom/compiler: use unifa for UBO loads from uniform addresses
...
This basically processes UBO loads as uniform loads by writing
the load address to the unifa register and reading sequential
values with ldunifa.
This process is faster than going through the TMU, but we can only
use it when the address we are reading from is uniform across all
channels, since we are basically reading from the UBO address
as if it was a uniform stream.
This leads to better performance in the UE4 Shooter demo.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:22 +00:00
Iago Toral Quiroga
878555976e
broadcom/compiler: emit ldunifarf when needed
...
Just like ldunif and ldunifrf, ldunifa writes to the r5 accumulator
and ldunifarf writes to the register file.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
c2a04aca48
broadcom/compiler: do not DCE ldunifa
...
ldunifa reads a uniform from the unifa address and updates the unifa
address implicitly, so if we dead-code-eliminate one, a follow-up
ldunifa will not read from the appropriate address.
We could avoid this if the compiler ensures that every ldunifa is
paired with an explicit unifa, so for example if we are reading a
vec4, we could emit:
unifa (addr)
ldunifa
unifa (addr+4)
ldunifa
unifa (addr+8)
ldunifa
unifa (addr+12)
ldunifa
instead of:
unifa (addr)
ldunifa
ldunifa
ldunifa
ldunifa
But since each unifa has a 3-slot delay before we can do ldunifa,
that would end up being quite expensive.
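To see why eliminating one ldunifa breaks the ones after it, here is a
toy software model of the implicit address update. It is a sketch under
assumptions: the struct and functions are hypothetical and only mimic
the behaviour described above (one 32-bit read per ldunifa, followed by
an implicit advance of 4 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the unifa address register and the memory it streams. */
struct unifa_stream {
   const uint32_t *words; /* backing memory, viewed as 32-bit words */
   uint32_t addr;         /* current unifa address, in bytes */
};

/* unifa: set the address that the following ldunifa reads start from. */
static void
unifa(struct unifa_stream *s, uint32_t addr)
{
   s->addr = addr;
}

/* ldunifa: read one 32-bit word, then implicitly advance the address. */
static uint32_t
ldunifa(struct unifa_stream *s)
{
   assert(s->addr % 4 == 0); /* the model only handles aligned reads */
   uint32_t value = s->words[s->addr / 4];
   s->addr += 4;
   return value;
}
```

In this model, reading the second component of a vec4 needs both loads
even if the first result is unused: dropping the first ldunifa makes the
second one return the first component instead.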
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
efc75e13ea
broadcom/compiler: disallow reading two uniforms in the same instruction
...
The simulator asserts on this, which can happen if we merge a ldunif
(or any other instruction that reads a uniform implicitly) and
ldunifa in the same instruction.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
e8e4bdae8d
broadcom/compiler: ensure 3-slot delay between unifa and ldunifa
...
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
42880fdf5d
broadcom/compiler: preserve ordering of unifa/ldunifa sequences
...
unifa writes the address from which follow-up ldunifa instructions load,
and each ldunifa increments the unifa address by 32 bits, so the
loads need to be ordered too.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
97c078488f
broadcom/compiler: disallow unifa overlap with thread switch/end
...
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
24db1a5112
broadcom/compiler: add a helper to check if an instruction writes unifa
...
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
4b929ae9f0
broadcom/compiler: don't check for GFXH-1633 on V3D 4.2.x
...
This has been fixed since V3D 4.2.14 (Rpi4), which is the hardware
we are targeting. Our version resolution doesn't allow us to check
for 4.2 versions lower than .14, but that is okay because the
simulator would still validate this in any case.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
457ed5aa01
broadcom/compiler: name registers correctly based on V3D version
...
So we can differentiate between TMU for V3D 3.x and UNIFA for V3D 4.x,
which are aliased.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
f85fcaa494
broadcom/compiler: pass a devinfo to check if an instruction writes to TMU
...
V3D 3.x has V3D_QPU_WADDR_TMU which in V3D 4.x is V3D_QPU_WADDR_UNIFA
(which isn't a TMU write address). This change passes a devinfo to
any functions that need to do these checks so we can account for the
target V3D version correctly.
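A rough sketch of the idea, assuming a hypothetical devinfo struct whose
ver field encodes the version as major * 10 + minor and an arbitrary
value for the shared write address. Neither is taken from the driver
source:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device info: 33 means V3D 3.3, 42 means V3D 4.2. */
struct devinfo {
   int ver;
};

/* Arbitrary illustrative value for the write address both names share. */
#define WADDR_ALIASED 9u

/* The aliased address is a TMU write only on V3D 3.x. */
static bool
waddr_is_tmu(const struct devinfo *devinfo, unsigned waddr)
{
   assert(devinfo != NULL);
   return waddr == WADDR_ALIASED && devinfo->ver < 40;
}

/* On V3D 4.x the same address means UNIFA instead. */
static bool
waddr_is_unifa(const struct devinfo *devinfo, unsigned waddr)
{
   assert(devinfo != NULL);
   return waddr == WADDR_ALIASED && devinfo->ver >= 40;
}
```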
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Iago Toral Quiroga
449af48f42
broadcom/compiler: add V3D_QPU_WADDR_UNIFA
...
This only exists in V3D 4.x and aliases V3D_QPU_WADDR_TMU from V3D 3.x.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980 >
2021-02-12 08:24:21 +00:00
Giovanni Mascellani
c6731daa5e
disk_cache: Fail creation when cannot initialize queue.
...
Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com >
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Fixes: e2c4435b07 ("util/disk_cache: add thread queue to disk cache")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8983 >
2021-02-12 08:01:46 +00:00
Arcady Goldmints-Orlov
9909fe6bac
broadcom/compiler: Skip bool_to_cond where possible
...
This change keeps track of when a boolean temp is loaded into the flags
by a comparison instruction and uses that information to skip emitting
instructions to set the flags in ntq_emit_bool_to_cond when the flags
already have the right contents.
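The tracking can be pictured roughly as follows. This is a simplified
sketch with hypothetical names, not the code in ntq_emit_bool_to_cond:
it remembers which temp the flags currently reflect and only counts a
new flag-setting instruction when that temp changes.

```c
#include <assert.h>

/* Hypothetical tracker for what the condition flags currently hold. */
struct flag_tracker {
   int flags_temp;  /* temp whose boolean value is in the flags, or -1 */
   int flag_writes; /* flag-setting instructions emitted so far */
};

/* A comparison computes dest_temp and leaves its result in the flags. */
static void
emit_compare(struct flag_tracker *t, int dest_temp)
{
   assert(dest_temp >= 0);
   t->flag_writes++;
   t->flags_temp = dest_temp;
}

/* Any other instruction that writes the flags invalidates the tracking. */
static void
clobber_flags(struct flag_tracker *t)
{
   t->flags_temp = -1;
}

/* Load a boolean temp into the flags, skipping the work when possible. */
static void
bool_to_cond(struct flag_tracker *t, int temp)
{
   assert(temp >= 0);
   if (t->flags_temp == temp)
      return; /* flags already have the right contents */

   t->flag_writes++;
   t->flags_temp = temp;
}
```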
total instructions in shared programs: 11116502 -> 11112225 (-0.04%)
instructions in affected programs: 631691 -> 627414 (-0.68%)
helped: 1591
HURT: 754
helped stats (abs) min: 1 max: 94 x̄: 4.14 x̃: 3
helped stats (rel) min: 0.11% max: 13.46% x̄: 2.10% x̃: 1.58%
HURT stats (abs) min: 1 max: 19 x̄: 3.07 x̃: 2
HURT stats (rel) min: 0.13% max: 19.67% x̄: 1.88% x̃: 1.15%
95% mean confidence interval for instructions value: -2.02 -1.63
95% mean confidence interval for instructions %-change: -0.94% -0.71%
Instructions are helped.
total uniforms in shared programs: 3281555 -> 3281513 (<.01%)
uniforms in affected programs: 1754 -> 1712 (-2.39%)
helped: 10
HURT: 5
helped stats (abs) min: 1 max: 19 x̄: 7.90 x̃: 5
helped stats (rel) min: 0.56% max: 11.11% x̄: 7.37% x̃: 11.05%
HURT stats (abs) min: 1 max: 15 x̄: 7.40 x̃: 3
HURT stats (rel) min: 0.64% max: 9.55% x̄: 5.31% x̃: 3.41%
95% mean confidence interval for uniforms value: -8.57 2.97
95% mean confidence interval for uniforms %-change: -7.35% 1.07%
Inconclusive result (value mean confidence interval includes 0).
total max-temps in shared programs: 1758419 -> 1758174 (-0.01%)
max-temps in affected programs: 7006 -> 6761 (-3.50%)
helped: 290
HURT: 14
helped stats (abs) min: 1 max: 8 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.79% max: 22.86% x̄: 6.61% x̃: 4.88%
HURT stats (abs) min: 1 max: 13 x̄: 6.00 x̃: 3
HURT stats (rel) min: 1.54% max: 54.17% x̄: 23.99% x̃: 9.12%
95% mean confidence interval for max-temps value: -1.03 -0.58
95% mean confidence interval for max-temps %-change: -6.24% -4.16%
Max-temps are helped.
total sfu-stalls in shared programs: 23676 -> 23610 (-0.28%)
sfu-stalls in affected programs: 1578 -> 1512 (-4.18%)
helped: 257
HURT: 252
helped stats (abs) min: 1 max: 3 x̄: 1.37 x̃: 1
helped stats (rel) min: 11.11% max: 100.00% x̄: 46.70% x̃: 40.00%
HURT stats (abs) min: 1 max: 2 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.00% max: 200.00% x̄: 41.65% x̃: 25.00%
95% mean confidence interval for sfu-stalls value: -0.25 -0.01
95% mean confidence interval for sfu-stalls %-change: -8.24% 2.33%
Inconclusive result (%-change mean confidence interval includes 0).
total inst-and-stalls in shared programs: 11140178 -> 11135835 (-0.04%)
inst-and-stalls in affected programs: 633972 -> 629629 (-0.69%)
helped: 1581
HURT: 755
helped stats (abs) min: 1 max: 94 x̄: 4.26 x̃: 3
helped stats (rel) min: 0.11% max: 13.46% x̄: 2.12% x̃: 1.59%
HURT stats (abs) min: 1 max: 17 x̄: 3.17 x̃: 2
HURT stats (rel) min: 0.05% max: 19.67% x̄: 1.93% x̃: 1.20%
95% mean confidence interval for inst-and-stalls value: -2.06 -1.66
95% mean confidence interval for inst-and-stalls %-change: -0.93% -0.70%
Inst-and-stalls are helped.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8933 >
2021-02-12 07:05:33 +00:00
Arcady Goldmints-Orlov
8762f29e9c
broadcom/compiler: Add a v3d_compile argument to vir_set_[pu]f
...
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8933 >
2021-02-12 07:05:33 +00:00
Bas Nieuwenhuizen
c78b372dd0
radv: Define supported extensions in C.
...
One python generator less.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8900 >
2021-02-12 01:56:00 +00:00
Bas Nieuwenhuizen
8331b7c8d5
radv: Remove custom icd json generation.
...
No Android.mk changes as the radv provided json file isn't used.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8900 >
2021-02-12 01:56:00 +00:00
Alyssa Rosenzweig
2f44a76ab4
panfrost: Set barriers flag for compute shaders
...
Pipe in the info from NIR. Fix incorrect handling of helper invocations,
which also use the barrier flag.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312 >
2021-02-12 01:37:05 +00:00
Alyssa Rosenzweig
9f934e922d
compiler, nir: Add and set barrier metadata
...
Useful for determining whether certain optimizations are legal for a
compute shader (e.g. optimizing workgroup size in the driver).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6312 >
2021-02-12 01:37:05 +00:00
Alyssa Rosenzweig
2bd2a03657
panfrost: Enable ES3 conformant floating-point
...
Don't suppress inf/nan. Triggers bugs in broken apps like glmark2 (fixed
upstream but traces don't have the fix yet), so update the trace
expectations.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7550 >
2021-02-12 01:21:29 +00:00
Kenneth Graunke
dcf6247fcb
iris: Remove context from iris_disk_cache_retrieve
...
We don't use the context other than getting the screen and uploader.
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Kenneth Graunke
b65680d59f
iris: Remove context from iris_create_uncompiled_shader
...
Nothing uses the context here, just the screen.
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Kenneth Graunke
cee922940b
iris: Remove context from iris_compile_vs and friends
...
Instead, we pass the screen, an uploader, and a debug callback.
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Kenneth Graunke
730ce52104
iris: Remove context from iris_upload_shader()
...
Shaders are now shared across contexts, so we'd like to avoid requiring
access to a full context. Instead, we pass the screen and an uploader
to use.
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Kenneth Graunke
979434639e
iris: Remove context from iris_debug_recompile
...
This doesn't and shouldn't use the context. It just wants a debug
callback to print things on.
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Kenneth Graunke
4256f7ed58
iris: Fill out scratch base address dynamically
...
Now that shaders are shared between contexts, we can't pre-bake the
shader scratch address into the derived 3DSTATE_XS packets. Scratch
buffers are and must be per-context, as multiple contexts could be
executing shaders using scratch at the same time.
So instead, we leave that field blank when pre-filling those packets
up-front, and merge in the actual address when emitting them. It's
a little more overhead, but only in the case where scratch is used.
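A sketch of the merge-at-emit idea, with a made-up packet layout: the
dword count, the position of the scratch field, and the alignment are
all assumptions for illustration, not the real 3DSTATE_XS encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PACKET_DWORDS 4
#define SCRATCH_DWORD 2          /* hypothetical position of the field */
#define SCRATCH_ADDR_ALIGN 1024u /* hypothetical alignment of the address */

/* Copy the packet that was pre-filled at shader compile time and, only
 * when the shader uses scratch, OR in this context's scratch address.
 * The pre-filled packet leaves the address bits zero. */
static void
emit_shader_packet(uint32_t *out, const uint32_t *prebaked,
                   uint32_t scratch_addr, bool uses_scratch)
{
   for (size_t i = 0; i < PACKET_DWORDS; i++)
      out[i] = prebaked[i];

   if (!uses_scratch)
      return;

   /* The low bits of the dword hold other fields in this sketch. */
   assert(scratch_addr % SCRATCH_ADDR_ALIGN == 0);
   out[SCRATCH_DWORD] |= scratch_addr;
}
```

Two contexts can then emit the same shared packet with different scratch
addresses without touching the shader object.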
Fixes: 84a38ec133 ("iris: Enable PIPE_CAP_SHAREABLE_SHADERS.")
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8922 >
2021-02-11 20:51:18 +00:00
Mike Blumenkrantz
564a9e18a7
zink: lower flrp64 and ffma64 when in softfp64 mode
...
fixes a bunch of crashes
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8986 >
2021-02-11 20:22:54 +00:00
Mike Blumenkrantz
a64fe5ae5b
zink: add spirv interfaces for bo and image/sampler/push variables
...
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8985 >
2021-02-11 20:11:18 +00:00
Jordan Justen
89580073f3
anv: Add ANV_QUEUE_OVERRIDE env-var to override advertised queues
...
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8771 >
2021-02-11 19:58:23 +00:00
Jason Ekstrand
1326e1c0fe
anv: Add fake graphics-only and compute-only queue families
...
Rework:
* Jordan: Add graphics-only queue
* Jordan: Bump ANV_MAX_QUEUE_FAMILIES and add related asserts
* Jordan: Fix queueCount on compute-only family
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8771 >
2021-02-11 19:58:23 +00:00
Michel Zou
664a803879
vulkan: Fix windows api conflict
...
It must be undefined in the header too
Fixes: e487ae1b
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8975 >
2021-02-11 17:56:27 +00:00
Alyssa Rosenzweig
a27d76a2d9
pan/bi: Push UBOs on Bifrost
...
Based on the Midgard pass. Results look better since Midgard already had
a basic UBO pushing pass to begin with. Particularly nice to see the
dramatic reduction in spilling.
total instructions in shared programs: 169141 -> 161215 (-4.69%)
instructions in affected programs: 164102 -> 156176 (-4.83%)
helped: 1269
HURT: 90
helped stats (abs) min: 1 max: 61 x̄: 6.50 x̃: 4
helped stats (rel) min: 0.15% max: 17.58% x̄: 6.31% x̃: 5.88%
HURT stats (abs) min: 1 max: 170 x̄: 3.58 x̃: 1
HURT stats (rel) min: 0.08% max: 133.33% x̄: 16.65% x̃: 5.26%
95% mean confidence interval for instructions value: -6.28 -5.38
95% mean confidence interval for instructions %-change: -5.39% -4.18%
Instructions are helped.
total nops in shared programs: 121049 -> 120997 (-0.04%)
nops in affected programs: 110024 -> 109972 (-0.05%)
helped: 501
HURT: 758
helped stats (abs) min: 1 max: 45 x̄: 5.54 x̃: 2
helped stats (rel) min: 0.25% max: 47.06% x̄: 6.81% x̃: 4.55%
HURT stats (abs) min: 1 max: 102 x̄: 3.59 x̃: 3
HURT stats (rel) min: 0.32% max: 50.00% x̄: 7.13% x̃: 6.06%
95% mean confidence interval for nops value: -0.45 0.37
95% mean confidence interval for nops %-change: 1.07% 2.09%
Inconclusive result (value mean confidence interval includes 0).
total clauses in shared programs: 40388 -> 31610 (-21.73%)
clauses in affected programs: 38825 -> 30047 (-22.61%)
helped: 1367
HURT: 2
helped stats (abs) min: 1 max: 58 x̄: 6.43 x̃: 5
helped stats (rel) min: 1.34% max: 55.56% x̄: 24.97% x̃: 25.00%
HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 7
HURT stats (rel) min: 5.08% max: 6.67% x̄: 5.88% x̃: 5.88%
95% mean confidence interval for clauses value: -6.74 -6.08
95% mean confidence interval for clauses %-change: -25.50% -24.35%
Clauses are helped.
total quadwords in shared programs: 144937 -> 130686 (-9.83%)
quadwords in affected programs: 140419 -> 126168 (-10.15%)
helped: 1369
HURT: 13
helped stats (abs) min: 1 max: 112 x̄: 10.50 x̃: 7
helped stats (rel) min: 0.23% max: 31.82% x̄: 11.36% x̃: 10.78%
HURT stats (abs) min: 1 max: 106 x̄: 10.00 x̃: 1
HURT stats (rel) min: 5.88% max: 10.24% x̄: 9.26% x̃: 10.00%
95% mean confidence interval for quadwords value: -10.96 -9.66
95% mean confidence interval for quadwords %-change: -11.52% -10.82%
Quadwords are helped.
total spills in shared programs: 1106 -> 705 (-36.26%)
spills in affected programs: 1058 -> 657 (-37.90%)
helped: 41
HURT: 0
total fills in shared programs: 2241 -> 1645 (-26.60%)
fills in affected programs: 2219 -> 1623 (-26.86%)
helped: 43
HURT: 2
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
040a350b1e
pan/bi: Add SSA-based scalar copy propagation
...
This is a very simple (and slow...) copyprop pass. It's good enough to
get rid of redundant moves from FAU, but it doesn't help for vector
combines.
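For readers unfamiliar with the technique, a generic SSA scalar copy
propagation can be sketched as below. The IR here is hypothetical and
much smaller than the Bifrost IR, and the quadratic scan is chosen for
brevity; it is not meant to mirror the actual pass.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_MOV, OP_ADD };

/* Hypothetical scalar SSA instruction. */
struct instr {
   enum op op;
   int dest;   /* SSA value written; each value is defined exactly once */
   int src[2]; /* SSA values read; OP_MOV uses only src[0] */
};

static size_t
num_srcs(enum op op)
{
   return op == OP_MOV ? 1 : 2;
}

/* Rewrite every source that reads the result of an earlier move so that
 * it reads the move's own source instead. Because instructions are
 * visited in order, chains of moves collapse to the original value. */
static void
copy_propagate(struct instr *ins, size_t count)
{
   assert(ins != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      for (size_t s = 0; s < num_srcs(ins[i].op); s++) {
         for (size_t j = 0; j < i; j++) {
            if (ins[j].op == OP_MOV && ins[j].dest == ins[i].src[s]) {
               ins[i].src[s] = ins[j].src[0];
               break;
            }
         }
      }
   }
}
```

The moves themselves are left in place; once nothing reads them, a
separate dead code elimination pass can remove them.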
total instructions in shared programs: 175219 -> 169141 (-3.47%)
instructions in affected programs: 91439 -> 85361 (-6.65%)
helped: 599
HURT: 0
helped stats (abs) min: 1 max: 112 x̄: 10.15 x̃: 6
helped stats (rel) min: 0.30% max: 33.33% x̄: 8.61% x̃: 8.04%
95% mean confidence interval for instructions value: -11.06 -9.24
95% mean confidence interval for instructions %-change: -9.07% -8.16%
Instructions are helped.
total nops in shared programs: 120011 -> 121049 (0.86%)
nops in affected programs: 47355 -> 48393 (2.19%)
helped: 110
HURT: 309
helped stats (abs) min: 1 max: 6 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.44% max: 16.67% x̄: 3.59% x̃: 3.16%
HURT stats (abs) min: 1 max: 56 x̄: 4.10 x̃: 2
HURT stats (rel) min: 0.32% max: 80.85% x̄: 6.85% x̃: 3.12%
95% mean confidence interval for nops value: 1.86 3.09
95% mean confidence interval for nops %-change: 3.08% 5.14%
Nops are HURT.
total clauses in shared programs: 40576 -> 40388 (-0.46%)
clauses in affected programs: 3074 -> 2886 (-6.12%)
helped: 106
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.77 x̃: 2
helped stats (rel) min: 0.42% max: 22.22% x̄: 7.17% x̃: 6.90%
95% mean confidence interval for clauses value: -1.91 -1.63
95% mean confidence interval for clauses %-change: -7.80% -6.53%
Clauses are helped.
total quadwords in shared programs: 146590 -> 144937 (-1.13%)
quadwords in affected programs: 59475 -> 57822 (-2.78%)
helped: 493
HURT: 1
helped stats (abs) min: 1 max: 28 x̄: 3.35 x̃: 2
helped stats (rel) min: 0.28% max: 15.38% x̄: 4.08% x̃: 3.85%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38%
95% mean confidence interval for quadwords value: -3.61 -3.08
95% mean confidence interval for quadwords %-change: -4.33% -3.81%
Quadwords are helped.
total spills in shared programs: 1106 -> 1106 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0
total fills in shared programs: 2241 -> 2241 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
fa79168b9e
pan/bi: Simplify derivative lowering
...
Now that we lower FAU correctly, we don't need to write the extra move
explicitly; the lowering pass will insert it later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
0acc6b564e
pan/bi: Rework FAU lowering
...
Move and reshape bi_lower_fau to bi_schedule.c. This generalizes the
pass for FAU reads, allowing copyprop to work with FAU without problems.
The pass must run immediately before scheduling. Its post-conditions are
directly specified as the scheduler's pre-conditions. It will soon
depend on internal scheduler predicates. It is, for all intents and
purposes, part of the scheduler. Keep it all together.
Finally, adjust the 0 handling to avoid a move at the expense of
constrained scheduling of something like `FADD.v2f16.clamp_0_1 u0, #0`
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
6106fb5d8d
pan/bi: Handle modifiers in rewrite_fau_to_pass
...
Will prevent failures when we start using FAU together with modifiers in
a few commits.
Fixes: fc7770b1dd ("pan/bi: Add trivial rewrite helpers")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
e9572ff3e9
pan/bi: Generalize bi_update_fau with fast zero
...
Ensure we don't fall over if we have an instruction like
FADD.f32 u0, #0
In this case, the tuple's FAU requirement implies the instruction can be
scheduled to the FMA slot without lowering, but not to the ADD slot.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
0f27e24934
pan/bi: Print FAU uniforms in IR
...
Uses "u3, u3[1]" syntax which is close enough to the assembly syntax
"u3.w0, u3.w1".
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
97e5181fe4
pan/bi: Add bi_is_ssa helper
...
Convenient for SSA-based opt passes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00
Alyssa Rosenzweig
be02c0868c
pan/bi: Add bi_replace_index helper
...
I keep open-coding this, incorrectly... Since bi_index contains both
"position" and "modifier" data, it's common to want to swap the position
while preserving modifiers.
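A sketch of what such a helper might look like, assuming a simplified
index struct with hypothetical fields (the real bi_index has more):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified index: a position plus source modifiers. */
struct index {
   unsigned value;   /* "position": which value is read */
   bool abs;         /* modifier: take the absolute value */
   bool neg;         /* modifier: negate */
   unsigned swizzle; /* modifier: component selection */
};

/* Return the replacement's position with the old index's modifiers. */
static struct index
replace_index(struct index old, struct index replacement)
{
   /* Assumption of this sketch: the replacement is a bare position. */
   assert(!replacement.abs && !replacement.neg);

   replacement.abs = old.abs;
   replacement.neg = old.neg;
   replacement.swizzle = old.swizzle;
   return replacement;
}
```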
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8973 >
2021-02-11 17:24:37 +00:00