Pierre-Eric Pelloux-Prayer
87f7ec8a2c
st/dri: use st->flush callback to flush the backbuffer
...
Previously the flush was done before the call to st->flush, but this
could lead to problems because FLUSH_VERTICES could push some work
that would modify the backbuffer.
With this commit, all the backbuffer flushing code is executed
right before the call to st_flush.
Closes: https://gitlab.freedesktop.org/drm/amd/issues/842
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205049
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer
cc0d0afe3b
st/mesa: add a notify_before_flush callback param to flush
...
The new callback is called right before the flush is done to allow
users of st->flush to do some work after all the previous work has
been flushed.
This will be used by dri_flush in the next commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-10 09:25:28 +01:00
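The ordering described in the two commits above (87f7ec8a2c and cc0d0afe3b) can be sketched in isolation. This is a toy model only, not the Mesa code: `toy_context`, `toy_flush`, `toy_push_pending` and `toy_resolve_backbuffer` are hypothetical names, and the real st->flush signature is not reproduced here. The point it illustrates is that the callback runs after queued work has been pushed and before the flush itself, so backbuffer handling sees everything that was queued.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: called once queued work has been pushed,
 * right before the flush itself. */
typedef void (*toy_notify_before_flush_cb)(void *data);

struct toy_context {
   int pending_draws;   /* work queued but not yet pushed */
   int submitted_draws; /* work already pushed */
   int flush_count;
};

/* Stand-in for the step that pushes queued work (the role the commit
 * message attributes to FLUSH_VERTICES). */
static void toy_push_pending(struct toy_context *ctx)
{
   ctx->submitted_draws += ctx->pending_draws;
   ctx->pending_draws = 0;
}

/* Push queued work, then notify, then flush. */
static void toy_flush(struct toy_context *ctx,
                      toy_notify_before_flush_cb cb, void *cb_data)
{
   toy_push_pending(ctx);
   if (cb)
      cb(cb_data);
   ctx->flush_count++;
}

struct toy_backbuffer {
   const struct toy_context *ctx;
   int draws_seen_at_resolve;
};

/* Example callback: records how much work had been pushed when the
 * backbuffer handling ran. */
static void toy_resolve_backbuffer(void *data)
{
   struct toy_backbuffer *bb = data;
   bb->draws_seen_at_resolve = bb->ctx->submitted_draws;
}
```

Calling `toy_flush(&ctx, toy_resolve_backbuffer, &bb)` with three pending draws leaves `bb.draws_seen_at_resolve` at 3. Doing the backbuffer work before `toy_flush` would have recorded 0, which is the kind of ordering problem the first commit message describes.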
Pierre-Eric Pelloux-Prayer
f5c1cb2383
radeonsi: dcc dirty flag
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-10 09:25:28 +01:00
Pierre-Eric Pelloux-Prayer
e3e91cebcd
radeonsi: fix multi plane buffers creation
...
When using 3 planes, the sequence produces this chain:
plane0 -> plane2
This commit fixes this to produce:
plane0 -> plane1 -> plane2
Fixes: 86e60bc265 ("radeonsi: remove si_vid_join_surfaces and use combined planar allocations")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2193
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-10 08:52:16 +01:00
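The intended plane0 -> plane1 -> plane2 chain from the commit above (e3e91cebcd) can be illustrated with a minimal linked-list sketch. The message does not state the cause of the bug, so this shows only the desired result. `toy_plane`, `toy_chain_planes` and `toy_chain_length` are hypothetical names, not radeonsi structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one plane of a multi-plane buffer. */
struct toy_plane {
   int index;
   struct toy_plane *next;
};

/* Link planes[0..count-1] in order by appending each plane to the
 * current tail. Returns the head of the chain (NULL if count is 0). */
static struct toy_plane *toy_chain_planes(struct toy_plane *planes,
                                          unsigned count)
{
   struct toy_plane *head = NULL, *last = NULL;

   for (unsigned i = 0; i < count; i++) {
      planes[i].index = (int)i;
      planes[i].next = NULL;
      if (!head)
         head = &planes[i];
      else
         last->next = &planes[i];
      last = &planes[i]; /* advance the tail so no plane is skipped */
   }
   return head;
}

/* Count the planes reachable from the head. */
static unsigned toy_chain_length(const struct toy_plane *p)
{
   unsigned n = 0;
   for (; p; p = p->next)
      n++;
   return n;
}
```

With three planes, walking the chain from the head visits indices 0, 1 and 2. A chain of plane0 -> plane2 would have length 2 and lose plane1.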
Pierre-Eric Pelloux-Prayer
ff0f108666
radeonsi: use gfx9.surf_offset to compute texture offset
...
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2177
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-10 08:52:07 +01:00
Sonny Jiang
6c901f0675
radeonsi: use compute shader for clear 12-byte buffer
...
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-09 23:25:57 -05:00
Marek Olšák
38e9eb9561
st/mesa: release the draw shader properly to fix driver crashes (iris)
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 22:41:41 -05:00
Marek Olšák
41118246c6
draw, st/mesa: generate TGSI for ffvp/ARB_vp if draw lacks LLVM
...
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
a3de63fbb3
st/mesa: don't generate VS TGSI if NIR is enabled
...
it's no longer needed
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
a90f4453fe
st/mesa: remove struct st_vp_variant in favor of st_common_variant
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
6299b90fd4
st/mesa: remove st_vp_variant::num_inputs
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
bc99b22a30
st/mesa: use a separate VS variant for the draw module
...
instead of keeping the IR indefinitely in st_vp_variant.
This trivially fixes Selection/Feedback/RasterPos for NIR.
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
17e8839a2f
st/mesa: support shader images for Selection/Feedback/RasterPos
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
b7393f1115
st/mesa: support SSBOs for Selection/Feedback/RasterPos
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
e91b044bd8
st/mesa: support samplers for Selection/Feedback/RasterPos
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
2891c4b2e2
st/mesa: save currently bound vertex samplers and sampler views in st_context
...
for st_draw_feedback.c
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
226e7aee70
st/mesa: support UBOs for Selection/Feedback/RasterPos
...
Reviewed-by: Dave Airlie <airlied@redhat.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
60db75cb77
gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe
...
This is already used in st_draw_feedback.c, because it uses shaders
generated for drivers.
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
2019-12-09 21:09:28 -05:00
Marek Olšák
525c8b90c7
llvmpipe: implement TEX_LZ and TXF_LZ opcodes
...
gallivm receives these opcodes anyway because st_draw_feedback.c uses
shaders that were assembled for drivers, not llvmpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
2019-12-09 21:09:28 -05:00
Gurchetan Singh
3c8ddc8f4b
drirc: set allow_higher_compat_version for Faster Than Light
...
With 781a78 ("mesa: enable ARB_direct_state_access in compat for
GL3.1+"), it's possible to have DSA with GL3.1+.
FTL creates a GL3.1 compat context, but fails the
_mesa_has_geometry_shaders(..) check in frame_buffer_texture.
Bump the compat version to pass the check.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-09 15:27:02 -08:00
Roland Scheidegger
23f1b78e8f
util/atomic: Fix p_atomic_add for unlocked and msvc paths
...
Braces mismatch (flagged by CI, untested).
Fixes: 385d13f26d "util/atomic: Add a _return variant of p_atomic_add"
Reviewed-by: Brian Paul <brianp@vmware.com >
Reviewed-by: Jose Fonseca <jfonseca@vmware.com >
Reviewed-by: Dylan Baker <dylan@pnwbakers.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
2019-12-09 15:02:58 -08:00
Eric Anholt
0470a03769
freedreno: Track the set of UBOs to be uploaded in UBO analysis.
...
We were iterating over the entire 32-entry array each time, when we
can just use a bitset to know that we're only uploading from the first
entry normally.
Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL
fishtank.
Reviewed-by: Rob Clark <robdclark@chromium.org >
2019-12-09 14:13:50 -08:00
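The bitset idea in the commit above (0470a03769) can be sketched as follows. This is an illustration under assumptions, not the freedreno implementation: `toy_ubo_state`, `toy_mark_ubo` and `toy_upload_enabled` are hypothetical names, and the "upload" is reduced to summing sizes. It shows how a 32-bit mask lets the loop visit only the enabled entries instead of all 32 slots.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_UBOS 32

/* Hypothetical state: one bit per UBO slot that needs uploading. */
struct toy_ubo_state {
   uint32_t enabled_mask;
   unsigned sizes[TOY_MAX_UBOS];
};

/* Record that slot idx must be uploaded. */
static void toy_mark_ubo(struct toy_ubo_state *s, unsigned idx, unsigned size)
{
   assert(idx < TOY_MAX_UBOS);
   s->enabled_mask |= UINT32_C(1) << idx;
   s->sizes[idx] = size;
}

/* Visit only the slots whose bit is set. Returns the number of slots
 * visited and stores the summed size in *total_bytes. */
static unsigned toy_upload_enabled(const struct toy_ubo_state *s,
                                   unsigned *total_bytes)
{
   unsigned visited = 0;
   uint32_t mask = s->enabled_mask;

   *total_bytes = 0;
   while (mask) {
      /* Find the lowest set bit; mask is non-zero, so idx stays < 32. */
      unsigned idx = 0;
      while (!(mask & (UINT32_C(1) << idx)))
         idx++;

      *total_bytes += s->sizes[idx];
      visited++;
      mask &= mask - 1; /* clear the lowest set bit */
   }
   return visited;
}
```

When only slot 0 is marked, the outer loop runs once rather than 32 times, which is the common case the commit message describes.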
Eric Anholt
10da0a9d18
freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.
...
The default is to not throw GL errors when drawing with mapped
buffers, but we were forcing it on for unclear reasons. Internally we
keep all our buffers mapped anyway, so it should be a no-op other than
reducing CPU overhead (.23% in a perf report for WebGL fishtank).
Reviewed-by: Rob Clark <robdclark@chromium.org >
2019-12-09 14:13:47 -08:00
Rob Clark
dc791d3c68
freedreno/fdperf: use drmOpen()
...
Signed-off-by: Rob Clark <robdclark@chromium.org >
2019-12-09 13:09:58 -08:00
Alyssa Rosenzweig
a37822f5f7
gallium/util: Support POLYGON in u_stream_outputs_for_vertices
...
u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is
trivial to support as a special case directly (since we have the number
of vertices directly).
Fixes aborts in Panfrost in apps using GL_POLYGON.
Fixes: e881aa8c12 ("gallium/util: Add u_stream_outputs_for_vertices helper")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-12-09 21:09:05 +00:00
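The special case described in the commit above (a37822f5f7) might look roughly like the sketch below. It is a toy model: the `toy_*` enum and functions are hypothetical stand-ins rather than the gallium helpers, and counting one output per polygon vertex is an assumption read from the message ("we have the number of vertices directly"), not a copy of the actual fix.

```c
#include <assert.h>

/* Hypothetical primitive types for the sketch. */
enum toy_prim {
   TOY_PRIM_TRIANGLES,
   TOY_PRIM_TRIANGLE_STRIP,
   TOY_PRIM_POLYGON,
};

/* Number of triangles a draw decomposes into. As in the commit
 * message, the decomposition helper does not handle POLYGON. */
static unsigned toy_decomposed_tris(enum toy_prim prim, unsigned nr)
{
   switch (prim) {
   case TOY_PRIM_TRIANGLES:
      return nr / 3;
   case TOY_PRIM_TRIANGLE_STRIP:
      return nr >= 3 ? nr - 2 : 0;
   default:
      assert(!"unsupported primitive");
      return 0;
   }
}

/* Number of vertices written to stream output for a draw. */
static unsigned toy_stream_outputs(enum toy_prim prim, unsigned nr)
{
   /* Assumed special case: a polygon is a single primitive, so the
    * vertex count is used directly (degenerate polygons give 0). */
   if (prim == TOY_PRIM_POLYGON)
      return nr >= 3 ? nr : 0;

   /* Otherwise: three outputs per decomposed triangle. */
   return toy_decomposed_tris(prim, nr) * 3;
}
```

Handling the polygon case before calling the decomposition helper avoids reaching its unsupported-primitive path, which corresponds to the aborts mentioned in the commit message.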
Anuj Phogat
1a32fbd48c
intel: Add pci-ids for Jasper Lake
...
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
2019-12-09 12:22:57 -08:00
Anuj Phogat
11fdd5f52c
intel: Add device info for 1x4x6 Jasper Lake
...
Also removing the FIXME comments after matching the numbers with
updated documentation.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
2019-12-09 12:22:56 -08:00
Vasily Khoruzhick
9f5fa496cb
lima: expose tiled format modifier in query_dmabuf_modifiers()
...
Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com >
2019-12-09 15:21:55 +00:00
Vasily Khoruzhick
01a451b04d
lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()
...
Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID
in resource_from_handle() and we don't have RO.
Fixes: 8c12f4e5f2 ("lima: enable tiling")
Reviewed-by: Qiang Yu <yuq825@gmail.com >
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com >
2019-12-09 15:21:55 +00:00
Jonathan Marek
9d78cf4584
turnip: add hw binning
...
Signed-off-by: Jonathan Marek <jonathan@marek.ca >
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com >
Reviewed-by: Eric Anholt <eric@anholt.net >
2019-12-09 08:22:18 -05:00
Samuel Pitoiset
86dfe92bd0
radv: do not use VK_TRUE/VK_FALSE
...
For consistency.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
2019-12-09 09:21:26 +01:00
Dave Airlie
d7dc14628a
gallivm: add bitfield reverse and ufind_msb
...
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com >
2019-12-09 06:05:02 +10:00
Roland Scheidegger
1c7693e3bd
gallium/scons: fix graw_gdi build
...
Fixes: 44a6b0107b (gallivm: add nir->llvm translation (v2))
Reviewed-by: Dave Airlie <Airlied@redhat.com >
Reviewed-by: Jose Fonseca <jfonseca@vmware.com >
2019-12-07 17:50:53 +01:00
Daniel Schürmann
8259c97b2d
aco: propagate temporaries into expanded vectors
...
Gives a very slight decrease in code size:
Totals from affected shaders:
Code Size: 1708488 -> 1702768 (-0.33 %) bytes
Max Waves: 2858 -> 2855 (-0.10 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
df3e674fb3
aco: improve readfirstlane after uniform ssbo loads on GFX7
...
pipeline-db changes for GFX7:
80310 shaders in 40472 tests
Totals:
SGPRS: 3655900 -> 3643916 (-0.33 %)
VGPRS: 2678324 -> 2686324 (0.30 %)
Spilled SGPRs: 1730 -> 1634 (-5.55 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
Code Size: 136106120 -> 135457616 (-0.48 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601014 -> 600206 (-0.13 %)
Totals from affected shaders:
SGPRS: 307832 -> 295848 (-3.89 %)
VGPRS: 267864 -> 275864 (2.99 %)
Spilled SGPRs: 770 -> 674 (-12.47 %)
Spilled VGPRs: 14 -> 21 (50.00 %)
Scratch size: 16 -> 12 (-25.00 %) dwords per thread
Code Size: 22007488 -> 21358984 (-2.95 %) bytes
LDS: 65 -> 65 (0.00 %) blocks
Max Waves: 28668 -> 27860 (-2.82 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0837471463
aco: use soffset for MUBUF instructions on SI/CI
...
pipeline-db changes for GFX7:
80310 shaders in 40472 tests
Totals:
SGPRS: 3655300 -> 3655900 (0.02 %)
VGPRS: 2677732 -> 2678324 (0.02 %)
Spilled SGPRs: 1730 -> 1730 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
Code Size: 136488364 -> 136106120 (-0.28 %) bytes
LDS: 1259 -> 1259 (0.00 %) blocks
Max Waves: 601039 -> 601014 (-0.00 %)
Totals from affected shaders:
SGPRS: 316312 -> 316912 (0.19 %)
VGPRS: 273844 -> 274436 (0.22 %)
Spilled SGPRs: 770 -> 770 (0.00 %)
Spilled VGPRs: 14 -> 14 (0.00 %)
Scratch size: 16 -> 16 (0.00 %) dwords per thread
Code Size: 22724904 -> 22342660 (-1.68 %) bytes
LDS: 114 -> 114 (0.00 %) blocks
Max Waves: 30861 -> 30836 (-0.08 %)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
7b38d95b32
radv: Enable ACO on GFX7 (Sea Islands)
...
This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used.
Note that shader_ballot works correctly, but performance seems inferior.
To enable shader_ballot use RADV_PERFTEST=shader_ballot.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
28c95cc402
aco: return to loop_active mask at continue_or_break blocks
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
0f9447ccb0
radv: disable Youngblood app profile if ACO is used
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
746165e540
aco: implement exclusive scan for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
7ae227effd
aco: implement inclusive_scan for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
f895a8b1df
aco: implement (clustered) reductions for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
9254fb4fc7
aco: don't use a scalar temporary for reductions on GFX10
...
This patch also adds the scalar temporary for scans on SI/CI
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
8ad43d8838
aco: flush denorms after fmin/fmax on pre-GFX9
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
21f67a3bdc
radv: only flush scalar cache for SSBO writes with ACO on GFX8+
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
79ce6c1b33
aco: disable disassembly for SI/CI due to lack of support by LLVM
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
1c4afe38f2
aco: implement 64bit ine/ieq for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
1e1356b2ad
aco: implement 64bit i2b for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
da7ff58835
aco: make 1/2*PI a literal constant on SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00
Daniel Schürmann
90fad7360d
aco: implement 64bit VGPR shifts for SI/CI
...
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
2019-12-07 11:23:11 +01:00