Alyssa Rosenzweig
86a6597714
panfrost: Remove unused batch_fence->ctx
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Alyssa Rosenzweig
f18e5371cf
panfrost: Remove unused batch_fence->signaled
...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Alyssa Rosenzweig
64d6f56ad2
panfrost: Allocate syncobjs in panfrost_flush
...
For implementing panfrost_flush, it suffices to wait on only a single
syncobj, not an entire array of them. This lets us wait on it directly,
without coercing to/from syncfds in the middle (although some complexity
may be added later to support Android winsys).
Further, we should let the fence own the syncobj, tying together the
lifetimes and thus removing the connection between syncobjs and
batch_fence.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Alyssa Rosenzweig
85a2216fe4
panfrost: Skip specifying in_syncs
...
With the current kernel UABI, there is no benefit to explicitly
specifying dependencies, since the kernel by design adds implicit
dependencies to any referenced BOs. This is something we'd like to
address in the future, but efficient handling with future kernels will
require a tweaked design in userspace as well. So let's do the obvious
thing now, and extend later.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Alyssa Rosenzweig
e5ef5a381e
panfrost: Remove wait parameter to flush_all_batches
...
It is always false now.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Alyssa Rosenzweig
0c4db886b6
panfrost: Avoid wait=true flushing all batches
...
What is intended is to flush the batches and wait on a particular BO at
a later time. Explicitly forcing a wait immediately is redundant.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5995 >
2020-07-21 13:57:43 +00:00
Rhys Perry
04ea4f1ce4
aco: implement b2i8/b2i16
...
Fixes lots of tests under dEQP-VK.spirv_assembly.type.*
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5993 >
2020-07-21 12:27:30 +00:00
Karol Herbst
3a7cd7bd65
nv50/ir: initialize persampleInvocation to false
...
Fixes: random KHR-GL45.sample_variables.mask.* fails
Fixes: 66ed9792ed ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001 >
2020-07-21 12:16:54 +00:00
Karol Herbst
618b355504
nv50/ir/tgsi: silence warning about unhandled GS_INPUT_PRIM property
...
Fixes: 66ed9792ed ("nv50: Clear nv50_ir_prog_info of dead and codegen specific variables")
Signed-off-by: Karol Herbst <kherbst@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6001 >
2020-07-21 12:16:54 +00:00
Samuel Pitoiset
6ced98c94e
radv: disable CPU caching for the upload BO to reduce fetch latency
...
AMDGPU_GEM_CREATE_CPU_GTT_USWC should be faster when CPU reads
are unexpected (because they aren't cached).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978 >
2020-07-21 11:54:39 +00:00
Samuel Pitoiset
b3eae4e037
radv: do not perform read-modify-write with the upload BO
...
To disable CPU caching.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5978 >
2020-07-21 11:54:39 +00:00
Rhys Perry
d9072a113b
radv: replace discard with demote for Quantic Dream games
...
Detroit: Become Human uses dFdx/dFdy immediately after a quad-divergent
discard, which can cause the image to become white.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Cc: <mesa-stable@lists.freedesktop.org >
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3212
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991 >
2020-07-21 11:34:23 +00:00
Rhys Perry
51bc11abc2
aco: always set FI on GFX10
...
bounds_ctrl is set to true by default which works around some game bugs,
but that isn't enough on GFX10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5991 >
2020-07-21 11:34:23 +00:00
Eric Anholt
8b3452a556
ci: Set XDG_CACHE_HOME to tmpfs for bare-metal runners to avoid NFS.
...
We don't want these files shared between builds (it'll get blown away by
the next rsync), and NFS will just increase our latency for hitting the
cache.
Drops a630 gles31 run from 11-17 minutes to 5.5. Maximum cache size on a
run I've seen is 153M, which it seems we can easily spare.
Fixes: f97acb4bb4 ("freedreno/ir3: disk-cache support")
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com >
Reviewed-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5998 >
2020-07-21 11:04:14 +00:00
Tomeu Vizoso
b2cd6a0b15
gitlab-ci: Fix needs: of the arm64 LAVA test jobs
...
They were still depending on arm_build, but the build of kernel and
rootfs has been moved to a separate job.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com >
Reviewed-by: Daniel Stone <daniels@collabora.com >
Reviewed-By: Rohan Garg <rohan.garg@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472 >
2020-07-21 09:22:19 +00:00
Tomeu Vizoso
a1947f059f
gitlab-ci: Upload tracie artifacts to MinIO
...
Upload failed images and the results.yml file to MinIO, to facilitate
debugging.
Also, fix version checking when git is installed, as Mesa is going to
output a different renderer string in that case.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com >
Reviewed-by: Daniel Stone <daniels@collabora.com >
Reviewed-By: Rohan Garg <rohan.garg@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472 >
2020-07-21 09:22:19 +00:00
Tomeu Vizoso
20507f8b17
gitlab-ci: Download traces from MinIO
...
Downloading the traces directly from git causes very high egress from
GCE, which is expensive.
So that we can expand trace testing further, we are going to keep a cache in
freedesktop.org's MinIO instance. This commit implements downloading
from it.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com >
Reviewed-by: Daniel Stone <daniels@collabora.com >
Reviewed-By: Rohan Garg <rohan.garg@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472 >
2020-07-21 09:22:18 +00:00
Rohan Garg
087be7e322
gitlab-ci: Replay traces on lava devices
...
Submit lava jobs to replay traces on Veyron (Mali T760) and Kevin (Mali
T860) boards.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com >
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com >
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Daniel Stone <daniels@collabora.com >
Reviewed-By: Rohan Garg <rohan.garg@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5472 >
2020-07-21 09:22:18 +00:00
Kenneth Graunke
576c53dadf
iris: Fix CCS check in iris_texture_subdata().
...
The intention here was to check "Would the GPU be able to compress
this if we used the PBO-based texture upload path?" Prior to Gen12,
that meant checking for CCS_E. On Gen12, there are a lot more types
of compression, and basic CCS_E was replaced by GEN12_CCS_E, making
this check simply not work, so we'd take the CPU path instead.
Instead, check if it has CCS, and isn't the basic "fast clear" CCS_D.
Fixes: 39f06e2848 ("iris: Implement pipe->texture_subdata directly")
Tested-by: Mark Janes <mark.a.janes@intel.com >
Reviewed-by: Tapani Pälli <tapani.palli@intel.com >
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6005 >
2020-07-21 09:10:37 +00:00
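The iris_texture_subdata() check described above can be pictured with a toy model. The enum below is a hypothetical stand-in, not the driver's real aux-usage type; only the names CCS_D, CCS_E and GEN12_CCS_E come from the commit message, and Python is used here purely for brevity.

```python
from enum import Enum, auto


class AuxUsage(Enum):
    """Hypothetical stand-in for the driver's aux-usage values."""
    NONE = auto()
    CCS_D = auto()        # basic "fast clear" CCS
    CCS_E = auto()        # pre-Gen12 compression
    GEN12_CCS_E = auto()  # Gen12 replacement for CCS_E


# Every value in this toy model that counts as "has CCS".
CCS_KINDS = {AuxUsage.CCS_D, AuxUsage.CCS_E, AuxUsage.GEN12_CCS_E}


def old_check(aux):
    """Old behaviour as described: only plain CCS_E counts as compressible."""
    return aux is AuxUsage.CCS_E


def new_check(aux):
    """New behaviour as described: has CCS, and is not the fast-clear-only CCS_D."""
    return aux in CCS_KINDS and aux is not AuxUsage.CCS_D
```

In this model old_check() rejects GEN12_CCS_E while new_check() accepts it, which is the mismatch the commit message describes.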
Rhys Perry
0868638aed
nir/lower_int64: lower 64-bit amul
...
Fixes an issue with Renderdoc's shader debugging with ACO.
If nir_opt_algebraic isn't called in-between nir_lower_explicit_io and
nir_lower_int64, we can end up with 64-bit multiplications.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: 6320e37d4b ('nir: add amul instruction')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5709 >
2020-07-21 06:47:10 +00:00
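For readers unfamiliar with 64-bit multiply lowering, the generic schoolbook decomposition into 32-bit pieces looks roughly like this. It is a sketch of the arithmetic only, and is not claimed to be the exact instruction sequence nir_lower_int64 emits; the helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split64(value):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def mul64_from_32(a, b):
    """Low 64 bits of a * b, built only from 32-bit operands."""
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)
    # 32x32 -> 64 product of the low halves (low word plus carry word).
    lo_full = a_lo * b_lo
    res_lo = lo_full & MASK32
    carry = lo_full >> 32
    # Cross terms only affect the high word; a_hi * b_hi overflows out entirely.
    res_hi = (carry + a_lo * b_hi + a_hi * b_lo) & MASK32
    return (res_hi << 32) | res_lo
```

The result matches `(a * b) & MASK64` for any pair of 64-bit inputs, because the term a_hi * b_hi is shifted past bit 63.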
Jason Ekstrand
4d44848c47
anv: Advertise support for VK_EXT_shader_atomic_float
...
We already have all of the shader code for load/store/exchange.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992 >
2020-07-21 05:01:34 +00:00
Jason Ekstrand
675d7b19a9
intel/fs: Use the correct logical op for global float atomics
...
Fixes: e644ed468f "intel/fs: Implement nir_intrinsic_global_atomic_*"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992 >
2020-07-21 05:01:34 +00:00
Jason Ekstrand
84086b620e
spirv: Add support for SPV_EXT_shader_atomic_float
...
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992 >
2020-07-21 05:01:34 +00:00
Jason Ekstrand
2a568c595b
spirv: Update headers and grammar json
...
This pulls in commit 63cb1fc131573fa from KhronosGroup/SPIRV-Headers
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5992 >
2020-07-21 05:01:34 +00:00
Eric Engestrom
cc03448008
egl: inline _EGLAPI into _EGLDriver
...
_EGLDriver was an empty wrapper around _EGLAPI, so let's only keep one
of them. "driver" represents better what's being accessed, so that's the
one we're keeping.
Signed-off-by: Eric Engestrom <eric@engestrom.ch >
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5987 >
2020-07-21 00:59:43 +00:00
Bas Nieuwenhuizen
7b7917a424
radeonsi: Inhibit clock-gating for perf counters.
...
Otherwise most counters return 0. Should be much more user friendly
than having to totally disable clock-gating on the kernel cmdline.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972 >
2020-07-20 23:56:26 +00:00
Bas Nieuwenhuizen
794ba3efd7
amd/registers: add RLC_PERFMON_CLK_CNTL for pre-GFX10
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5972 >
2020-07-20 23:56:26 +00:00
Jason Ekstrand
36e6ac65c5
anv: Advertise VK_EXT_image_robustness
...
We already support a superset of VK_EXT_image_robustness via
VK_EXT_robustness2.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5985 >
2020-07-20 22:30:18 +00:00
Eric Anholt
d973e50f69
freedreno/ir3: Add missing ld_args_build_id to the ir3_delay unit test.
...
It triggers the disk cache for me, and asserts about not getting the
build id right.
Fixes: f97acb4bb4 ("freedreno/ir3: disk-cache support")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5989 >
2020-07-20 22:11:51 +00:00
Samuel Pitoiset
3688da2192
radv: advertise VK_EXT_image_robustness
...
All new dEQP-VK.robustness.image_robustness.* pass.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5979 >
2020-07-20 21:18:27 +00:00
Christian Gmeiner
096adbe369
ci: bare-metal: use nginx to get results from DUT
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2655
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5661 >
2020-07-20 20:21:12 +00:00
Yevhenii Kolesnikov
101400d449
mesa: change error code of *TextureSubImage* for incorrect target
...
According to the "Errors" list of the OpenGL 4.6 spec, section 8.6
"Alternate Texture Image Specification Commands":
An INVALID_OPERATION error is generated by *TextureSubImage* if the
effective target of texture does not match the command, as shown in table 8.15.
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5934 >
2020-07-20 19:58:14 +00:00
Eric Anholt
af92348b1c
freedreno/ir3: Fix disasm of register offsets in ldp/stp.
...
I had a stp testcase that was getting its offset wrong, and by twiddling
bits and feeding it to qc disasm, I found that the comment was sort of
right: some of the cat6a bits implicated in the old comment do get used, as
the high bits of the cat6c offset. Reallocating those bits also fixes how
we were getting r960.y for r0.y.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815 >
2020-07-20 19:42:45 +00:00
Eric Anholt
d6d8dc133e
freedreno/ir3: Refactor cat6 general dst printing.
...
We didn't need the extra branch and temp, we can move it inside of the dst
handling by just duplicating the print of the dst reg.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815 >
2020-07-20 19:42:45 +00:00
Eric Anholt
62dcf75432
freedreno/ir3: Add a bunch more tests for cat6 opcodes.
...
This started with making note of some ldp/stp instructions from the blob
and how we differ from them. In the process of fixing it, I accidentally
modified behavior of other opcodes, and the other instructions listed will
keep us from doing that. I also dropped an old stl test that it looks like I
took from after a shader 'end' instruction.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815 >
2020-07-20 19:42:45 +00:00
Eric Anholt
ed3338f581
freedreno/ir3: Add a note about the instructions in the disasm test.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5815 >
2020-07-20 19:42:45 +00:00
Jason Ekstrand
4ab3a219cc
vulkan: Update Vulkan XML and headers to 1.2.148
...
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5983 >
2020-07-20 18:28:10 +00:00
Eric Anholt
fd24a95995
ci: Use FDO_CI_CONCURRENT as our -j flags when present in the runner env.
...
fd.o has retuned the x86 runners on packet for -j8. Rather than having to
tweak our CI every time fd.o decides to rebalance job concurrency, respect
what the runner admin has chosen for their builds (this will also be
convenient for people with large local runners).
Reviewed-by: Michel Dänzer <michel@daenzer.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5669 >
2020-07-20 17:22:17 +00:00
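The FDO_CI_CONCURRENT idea above amounts to "use the runner's value when it is set, otherwise fall back". A minimal sketch follows; the real CI scripts are not shown in this log, so the function name and the fallback of 4 jobs are assumptions made for illustration.

```python
import os


def jobs_flag(env=None, default_jobs=4):
    """Build a -jN flag from FDO_CI_CONCURRENT, with a hypothetical fallback."""
    env = os.environ if env is None else env
    value = env.get("FDO_CI_CONCURRENT", "")
    # Only trust the variable when it is a positive integer.
    if value.isdigit() and int(value) > 0:
        jobs = int(value)
    else:
        jobs = default_jobs
    return f"-j{jobs}"
```

With the runner environment described in the commit, `jobs_flag({"FDO_CI_CONCURRENT": "8"})` yields `-j8`.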
Daniel Schürmann
5f79e4e69a
nir/algebraic: fold some nested bcsel
...
Totals from 14266 (10.62% of 134368) affected shaders (Polaris):
SGPRs: 761756 -> 762732 (+0.13%); split: -0.00%, +0.13%
VGPRs: 430392 -> 430924 (+0.12%); split: -0.05%, +0.17%
SpillSGPRs: 4652 -> 4628 (-0.52%); split: -0.60%, +0.09%
CodeSize: 30133000 -> 29949780 (-0.61%); split: -0.66%, +0.05%
MaxWaves: 102122 -> 102111 (-0.01%); split: +0.00%, -0.01%
Instrs: 5845085 -> 5841668 (-0.06%); split: -0.08%, +0.03%
Cycles: 69033140 -> 68889188 (-0.21%); split: -0.22%, +0.01%
VMEM: 8479021 -> 8474978 (-0.05%); split: +0.03%, -0.08%
SMEM: 831437 -> 830464 (-0.12%); split: +0.06%, -0.18%
VClause: 105411 -> 105410 (-0.00%); split: -0.01%, +0.01%
SClause: 327727 -> 327780 (+0.02%); split: -0.00%, +0.02%
Copies: 372704 -> 373306 (+0.16%); split: -0.16%, +0.32%
Branches: 112260 -> 112269 (+0.01%); split: -0.00%, +0.01%
PreSGPRs: 433308 -> 433631 (+0.07%); split: -0.01%, +0.09%
PreVGPRs: 397888 -> 397905 (+0.00%); split: -0.01%, +0.01%
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:46 +00:00
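The log does not say which nested bcsel patterns the commit above folds, so the two rewrites below are illustrative assumptions rather than the commit's actual rules. The sketch only checks that such folds preserve the selected value.

```python
from itertools import product


def bcsel(cond, if_true, if_false):
    """Model of a boolean select: if_true when cond holds, else if_false."""
    return if_true if cond else if_false


def same_condition_fold_holds():
    """bcsel(a, bcsel(a, b, c), d) == bcsel(a, b, d) for every input."""
    values = ("B", "C", "D")
    return all(
        bcsel(a, bcsel(a, b, c), d) == bcsel(a, b, d)
        for a in (False, True)
        for b, c, d in product(values, repeat=3)
    )


def shared_else_fold_holds():
    """bcsel(a, bcsel(b, c, d), d) == bcsel(a and b, c, d) for every input."""
    values = ("C", "D")
    return all(
        bcsel(a, bcsel(b, c, d), d) == bcsel(a and b, c, d)
        for a, b in product((False, True), repeat=2)
        for c, d in product(values, repeat=2)
    )
```

Both checks enumerate every combination of conditions, so a True result means the fold is valid in this model.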
Daniel Schürmann
27244662f2
nir/algebraic: propagate b2i out of ior/iand
...
Totals from 761 (0.57% of 134368) affected shaders (Polaris):
SGPRs: 29496 -> 29488 (-0.03%)
SpillSGPRs: 41 -> 43 (+4.88%)
CodeSize: 1922036 -> 1882408 (-2.06%); split: -2.08%, +0.02%
Instrs: 366051 -> 360362 (-1.55%); split: -1.57%, +0.02%
Cycles: 7692516 -> 7661216 (-0.41%); split: -0.41%, +0.01%
VMEM: 365175 -> 365172 (-0.00%)
VClause: 15324 -> 15322 (-0.01%)
SClause: 9825 -> 9824 (-0.01%); split: -0.02%, +0.01%
Copies: 41216 -> 41294 (+0.19%); split: -0.01%, +0.20%
Branches: 7020 -> 7033 (+0.19%)
PreSGPRs: 22103 -> 22106 (+0.01%)
PreVGPRs: 26518 -> 26515 (-0.01%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:46 +00:00
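Propagating b2i out of ior/iand, as the commit title above puts it, relies on a small identity: OR-ing or AND-ing two 0/1 integers gives the same result as converting the OR or AND of the booleans. A sketch of that identity, with helper names chosen for illustration:

```python
from itertools import product


def b2i(flag):
    """Boolean to integer: True -> 1, False -> 0."""
    return 1 if flag else 0


def b2i_propagation_holds():
    """Check ior/iand of b2i values against b2i of the boolean or/and."""
    for a, b in product((False, True), repeat=2):
        if (b2i(a) | b2i(b)) != b2i(a or b):
            return False
        if (b2i(a) & b2i(b)) != b2i(a and b):
            return False
    return True
```

After the rewrite only one conversion remains, and the logic operation runs on booleans.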
Daniel Schürmann
baee5a9812
nir/algebraic: add distributive rules for ior/iand
...
Totals from 581 (0.43% of 134368) affected shaders (Polaris):
CodeSize: 1389560 -> 1386488 (-0.22%)
Instrs: 264488 -> 263984 (-0.19%)
Cycles: 1057952 -> 1055936 (-0.19%)
VMEM: 296016 -> 291613 (-1.49%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:46 +00:00
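The distributive laws for bitwise OR and AND are standard; the exact rule set added by the commit above is not listed in the log, so the two directions below are an assumption about its general shape. The check is exhaustive over 4-bit values.

```python
from itertools import product


def factored_and(a, b, c):
    """(a & b) | (a & c) rewritten with a single AND."""
    return a & (b | c)


def factored_or(a, b, c):
    """(a | b) & (a | c) rewritten with a single OR."""
    return a | (b & c)


def distributive_rules_hold(bits=4):
    """Compare the expanded and factored forms for every small input."""
    for a, b, c in product(range(1 << bits), repeat=3):
        if ((a & b) | (a & c)) != factored_and(a, b, c):
            return False
        if ((a | b) & (a | c)) != factored_or(a, b, c):
            return False
    return True
```

Each factored form uses two operations instead of three, which is consistent with the small instruction-count drop reported above.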
Daniel Schürmann
70d3efeb88
nir/algebraic: optimize (a < 0.0) ? -a : a -> fabs(a)
...
Totals from affected shaders: (VEGA)
SGPRS: 13920 -> 13920 (0.00 %)
VGPRS: 10252 -> 10252 (0.00 %)
Spilled SGPRs: 62 -> 62 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 587648 -> 587224 (-0.07 %) bytes
LDS: 5 -> 5 (0.00 %) blocks
Max Waves: 1489 -> 1489 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:46 +00:00
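The select-to-fabs rewrite in the commit title above can be checked numerically. This sketch compares the two forms on a few sample values; the sample list and helper names are illustrative. One IEEE 754 detail is worth noting: for an input of -0.0 the select form returns -0.0 while fabs returns +0.0, so the two agree in value but not in sign bit.

```python
import math


def select_abs(a):
    """The source pattern: (a < 0.0) ? -a : a."""
    return -a if a < 0.0 else a


SAMPLES = [-3.5, -1.0, -1e-30, 0.0, 1e-30, 2.25, float("inf"), float("-inf")]


def matches_fabs(samples=SAMPLES):
    """True when the select form equals fabs for every sample value."""
    return all(select_abs(a) == math.fabs(a) for a in samples)


def sign_of(value):
    """Return -1.0 or 1.0 according to the IEEE sign bit, even for zero."""
    return math.copysign(1.0, value)
```

Whether the signed-zero difference matters depends on the float-control rules in effect; the log does not say how the commit treats it.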
Daniel Schürmann
9d22c5ed71
nir/algebraic: optimize fmul(x, bcsel(c, -1.0, 1.0)) -> bcsel(c, -x, x)
...
Totals from affected shaders: (VEGA)
SGPRS: 545712 -> 545712 (0.00 %)
VGPRS: 413092 -> 413116 (0.01 %)
Spilled SGPRs: 10616 -> 10616 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 37031684 -> 36984248 (-0.13 %) bytes
LDS: 427 -> 427 (0.00 %) blocks
Max Waves: 54350 -> 54340 (-0.02 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:46 +00:00
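The rewrite named in the commit title above replaces a multiply by a selected +/-1.0 with a select between -x and x. A sketch that compares the two forms bit for bit on non-NaN inputs (NaN is left out because its sign bit is not guaranteed to match after a multiply); the helper names are illustrative:

```python
import struct


def bcsel(cond, if_true, if_false):
    """Model of a boolean select."""
    return if_true if cond else if_false


def before(x, c):
    """fmul(x, bcsel(c, -1.0, 1.0))"""
    return x * bcsel(c, -1.0, 1.0)


def after(x, c):
    """bcsel(c, -x, x)"""
    return bcsel(c, -x, x)


def same_bits(p, q):
    """Compare two doubles by their exact bit patterns."""
    return struct.pack("<d", p) == struct.pack("<d", q)


SAMPLES = [0.0, -0.0, 1.5, -2.75, 1e300, float("inf"), float("-inf")]


def rewrite_is_exact(samples=SAMPLES):
    """True when both forms produce identical bits for every sample."""
    return all(
        same_bits(before(x, c), after(x, c))
        for x in samples
        for c in (False, True)
    )
```

Multiplying by -1.0 or 1.0 is exact in IEEE 754 arithmetic, which is why the select form can replace it without changing results for these inputs.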
Daniel Schürmann
56ec814b56
nir/algebraic: add some more unop + bcsel optimizations
...
Totals from affected shaders: (VEGA)
SGPRS: 284392 -> 284400 (0.00 %)
VGPRS: 261080 -> 261076 (-0.00 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 24698596 -> 24277788 (-1.70 %) bytes
LDS: 196 -> 196 (0.00 %) blocks
Max Waves: 10101 -> 10105 (0.04 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:45 +00:00
Daniel Schürmann
2fca183910
nir/algebraic: add optimizations for fsign/isign
...
This just reverts fsign/isign lowering.
Totals from affected shaders:
SGPRS: 257496 -> 256672 (-0.32 %)
VGPRS: 181800 -> 178864 (-1.61 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 11355852 -> 11141840 (-1.88 %) bytes
LDS: 3789 -> 3789 (0.00 %) blocks
Max Waves: 30453 -> 30951 (1.64 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:45 +00:00
Daniel Schürmann
8e1b75b330
nir/algebraic: optimize iand/ior of (n)eq zero
...
Found in some Detroit: Become Human shaders.
Totals from affected shaders:
SGPRS: 700256 -> 700256 (0.00 %)
VGPRS: 507208 -> 507212 (0.00 %)
Spilled SGPRs: 142531 -> 142531 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 76404616 -> 76301768 (-0.13 %) bytes
LDS: 43 -> 43 (0.00 %) blocks
Max Waves: 21438 -> 21438 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:45 +00:00
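One plausible reading of "iand/ior of (n)eq zero" in the commit above is that two comparisons against zero are merged into one comparison of the OR-ed operands. The log does not spell the rules out, so the forms below are an assumption; the sketch only verifies the underlying integer identities.

```python
from itertools import product


def both_zero_merged(a, b):
    """(a == 0) and (b == 0), written as a single comparison."""
    return (a | b) == 0


def any_nonzero_merged(a, b):
    """(a != 0) or (b != 0), written as a single comparison."""
    return (a | b) != 0


def merged_compares_hold(bits=4):
    """Check the merged forms against the two-comparison originals."""
    for a, b in product(range(1 << bits), repeat=2):
        if both_zero_merged(a, b) != (a == 0 and b == 0):
            return False
        if any_nonzero_merged(a, b) != (a != 0 or b != 0):
            return False
    return True
```

The merged form trades two comparisons and a logic operation for one OR and one comparison.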
Daniel Schürmann
e4281dbecc
nir: also move b2i in case of nir_move_copies
...
Booleans are often more efficient with register usage.
This also allows comparisons to be moved further.
Totals from affected shaders: (VEGA)
SGPRS: 451608 -> 450320 (-0.29 %)
VGPRS: 351448 -> 351256 (-0.05 %)
Spilled SGPRs: 105 -> 105 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 1008 -> 1008 (0.00 %) dwords per thread
Code Size: 26555596 -> 26551080 (-0.02 %) bytes
LDS: 10323 -> 10323 (0.00 %) blocks
Max Waves: 42850 -> 42934 (0.20 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:45 +00:00
Daniel Schürmann
de0ebaf09d
nir/algebraic: optimize bcsel(a, 0, 1) to b2i
...
This avoids combination with other bcsel operations,
and as b2i is often a no-op (when used for iadd and such),
the resulting pattern is preferable.
Totals from affected shaders: (VEGA)
SGPRS: 598448 -> 598448 (0.00 %)
VGPRS: 457940 -> 457352 (-0.13 %)
Spilled SGPRs: 127154 -> 127154 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 64836352 -> 64802728 (-0.05 %) bytes
LDS: 781 -> 781 (0.00 %) blocks
Max Waves: 22931 -> 22931 (0.00 %)
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4830 >
2020-07-20 15:56:45 +00:00
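A select between the constants 0 and 1 is a boolean-to-integer conversion in disguise. The sketch below shows both orderings; mapping bcsel(a, 0, 1) to b2i of the inverted condition is an assumption about how the commit above handles that case, since the log gives only the title.

```python
def bcsel(cond, if_true, if_false):
    """Model of a boolean select."""
    return if_true if cond else if_false


def b2i(flag):
    """Boolean to integer: True -> 1, False -> 0."""
    return 1 if flag else 0


def select_forms_match_b2i():
    """bcsel(a, 1, 0) == b2i(a) and bcsel(a, 0, 1) == b2i(not a)."""
    return all(
        bcsel(a, 1, 0) == b2i(a) and bcsel(a, 0, 1) == b2i(not a)
        for a in (False, True)
    )
```

As the commit message notes, b2i is often free when its result feeds an integer add, which makes this form preferable to a select.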
Icecream95
e764192f40
pan/mdg: Use the blend RT for blend shader framebuffer fetches
...
Fixes piglit test fbo-drawbuffers-blend-add when fixed-function
blending is disabled in panfrost_get_blend_for_context.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892 >
2020-07-20 14:15:49 +00:00
Icecream95
3ec252a3b2
panfrost: 8x MRT support
...
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5892 >
2020-07-20 14:15:49 +00:00