39203 Commits

Author SHA1 Message Date
Rob Clark 010d255656 freedreno/a6xx: fix MSAA resolve hangs
Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size.

Fixes intermittent GPU hangs triggered by some of the three.js samples
on https://threejs.org/

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-07-29 15:15:31 -07:00
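
The alignment idea in 010d255656 can be sketched as follows. This is a minimal illustration, not the freedreno code: the struct, the helper names, and the 32x32 tile size used in the usage note are assumptions, and the tile size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scissor rectangle; x2/y2 are exclusive. */
struct scissor_rect {
    uint32_t x1, y1, x2, y2;
};

/* Round v down to a multiple of the power-of-two alignment a. */
static inline uint32_t align_down_pot(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return v & ~(a - 1);
}

/* Round v up to a multiple of the power-of-two alignment a. */
static inline uint32_t align_up_pot(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Grow the rectangle outward so that every edge lands on a tile boundary. */
static inline struct scissor_rect
scissor_align_to_tile(struct scissor_rect s, uint32_t tile_w, uint32_t tile_h)
{
    struct scissor_rect r;
    r.x1 = align_down_pot(s.x1, tile_w);
    r.y1 = align_down_pot(s.y1, tile_h);
    r.x2 = align_up_pot(s.x2, tile_w);
    r.y2 = align_up_pot(s.y2, tile_h);
    return r;
}
```

With an assumed 32x32 tile, the rectangle (5,7)-(33,40) grows to (0,0)-(64,64).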
Leo Liu 8d7f2e2221 radeon/vcn/vp9: add Arcturus VP9 support
Arcturus CHIP enum is less than Navi10, since it's still gfx9,
but its VCN version belongs to VCN2.x

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:58 -04:00
Leo Liu a439863918 radeon/vcn: add Arcturus decode support
Arcturus uses different internal register offsets than previous HW.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:56 -04:00
Marek Olšák 417ab8ef6b radeonsi: add AMD_DEBUG=nogfx for testing
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:53 -04:00
Marek Olšák 19d04191c4 radeonsi: add support for compute-only chips
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:51 -04:00
Sonny Jiang c82f338855 gallium/auxiliary/vl: add compute shaders for deint yuv
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:49 -04:00
Sonny Jiang ef77a92bca gallium/auxiliary/vl: don't call gfx functions on compute-only chips
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:46 -04:00
James Zhu b618b65c98 gallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositor
Init graphics shaders only when PIPE_CAP_GRAPHICS is true.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:42 -04:00
Marek Olšák 187cc07d05 gallium: create multimedia contexts as compute-only if graphics is unsupported
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:41 -04:00
Marek Olšák ea7646dc13 gallium: add PIPE_CAP_GRAPHICS
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-07-29 17:52:39 -04:00
Eric Anholt 65aeeae670 freedreno: Fix helgrind complaint on shader-db key setup.
If the variable's going to be static, we shouldn't be memsetting it
from every thread and instead just have it in the data section.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2019-07-29 12:50:49 -07:00
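
The pattern described in 65aeeae670 can be illustrated roughly like this. The struct and function names are hypothetical and are not taken from the freedreno sources; the sketch only contrasts a static that is rewritten on every call with one that is initialized once at load time.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical key type standing in for the real shader key. */
struct shader_key {
    unsigned flags;
    unsigned char pad[12];
};

/* Racy pattern: every caller rewrites the same static storage, so two
 * threads calling this at once both write to `key` without synchronization. */
static const struct shader_key *key_setup_racy(void)
{
    static struct shader_key key;
    memset(&key, 0, sizeof(key));
    return &key;
}

/* Race-free pattern: the static is zero-initialized by the loader and
 * never written at run time, so concurrent readers are fine. */
static const struct shader_key *key_setup_static(void)
{
    static const struct shader_key key = {0};
    return &key;
}
```

Both functions return an all-zero key; only the second avoids the write that a race detector such as helgrind would flag.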
Gert Wollny 4ee638cd78 softpipe: Don't draw when rasterizer_discard is set
Fixes:
  dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points
  dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points
  dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points
  dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points
  dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points
  dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-07-29 15:47:34 +02:00
Gert Wollny 45ac0dfad4 softpipe: Fix cube arrays layer selection
To select the correct layer the z-coordinate must be rounded before it
is multiplied by six.

Fixes a number of tests out of
   dEQP-GLES31.functional.texture.filtering.cube_array.formats.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2019-07-29 15:47:34 +02:00
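
A small sketch of the ordering issue fixed in 45ac0dfad4. The helper names are hypothetical and this is not the softpipe code; it uses round-half-up for brevity, and clamps the cube index to the valid range as an added assumption.

```c
#include <assert.h>

/* Hypothetical helper: index of the first 2D layer (face 0) of the cube
 * selected by the array coordinate z. Rounds first, then multiplies by six. */
static inline int cube_array_first_layer(float z, int num_cubes)
{
    assert(num_cubes > 0);
    int cube;
    if (!(z > 0.0f))                      /* negatives and NaN map to cube 0 */
        cube = 0;
    else if (z >= (float)(num_cubes - 1)) /* clamp to the last cube */
        cube = num_cubes - 1;
    else
        cube = (int)(z + 0.5f);           /* round to nearest */
    return cube * 6;
}

/* For contrast: multiplying before converting to an integer lets the
 * fractional part of z leak into the face index. */
static inline int cube_array_first_layer_wrong(float z)
{
    return (int)(z * 6.0f);
}
```

For z = 1.4 the first helper selects layer 6 (cube 1, face 0), while the contrast version yields 8, which is face 2 of cube 1.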
Connor Abbott 6fc7384fd4 lima/gpir/sched: Handle more special ops in can_use_complex()
We were missing handling for a few other ops that rearrange their
sources somehow in codegen, namely complex2 and select.

This should fix spec@glsl-1.10@execution@built-in-functions@vs-asin-vec3
and possibly other random regressions from the new scheduler which were
supposed to be fixed in the commit right after.

Fixes: 54434fe670 ("lima/gpir: Rework the scheduler")
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Acked-by: Qiang Yu <yuq825@gmail.com>
2019-07-28 23:38:31 +02:00
Connor Abbott af95f80a24 lima/gp: Clean up lima_program_optimize_vs_nir() a little
Remove an unnecessary nir_lower_regs_to_ssa as that should be done by
the state tracker, and add a missing DCE pass after running copy
propagation in order to remove the dead copies. This shouldn't fix
anything but the second part will reduce shader sizes.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
2019-07-28 23:38:31 +02:00
Connor Abbott d26d8c5617 lima/gpir/sched: Don't try to spill when something else has succeeded
In try_node(), we assume that the node we pick can still be scheduled
successfully after speculatively trying all the other nodes. Normally we
always undo every node after speculating it, so that when we finally
schedule best_node the scheduler state is exactly the same and it
succeeds. However, we also try to spill nodes, which can change the
state and in a corner case that can make scheduling best_node fail. In
particular, the following sequence of events happened with piglit
shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a
register store node S, which is a use of N, was created and then later
the other uses of N were scheduled, so that S is now ready and N is
partially ready. First we try to schedule S and succeed, then we try to
schedule another node M, which fails, so we try to spill the remaining
uses of N. This succeeds, but scheduling M still fails so that best_node
is still S. However since one of the uses of N is one cycle ago, and
therefore we inserted a read dependent on S one cycle ago when spilling
N, S can no longer be scheduled as read-after-write latency is three
cycles.

While we could ad-hoc try to catch cases like this, or (the best option
but very complicated) treat the spill as speculative and roll it back if
we decide not to schedule the node, a simpler solution is to just
give up on spilling if we've already successfully speculatively
scheduled another node. We'd give up a few cases where we discover that
by spilling even harder we could schedule a more desirable node, but
that seems like it would be pretty rare in practice. With this we
guarantee that nothing has been touched after best_node was successfully
scheduled. We also cut down on pointless spilling, since if we already
scheduled a node it's unlikely that spilling harder will let us schedule
an even better node, and hence any spilling at this point is probably
useless.

While we're here, clean up the code around spilling by flattening the
two if's and getting rid of the second unnecessary check for INT_MIN.

Fixes: 54434fe670 ("lima/gpir: Rework the scheduler")
Acked-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
2019-07-28 23:38:31 +02:00
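
The rule introduced in d26d8c5617 can be shown with a toy model. Everything here is hypothetical (the candidate struct, its fields, and pick_best() are not part of the lima scheduler); the sketch only demonstrates "stop spilling once some candidate has already been scheduled speculatively".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy stand-in for a schedulable node; all fields are hypothetical. */
struct candidate {
    int score;             /* higher is more desirable */
    bool fits;             /* can be scheduled without spilling */
    bool fits_after_spill; /* could be scheduled if we spilled first */
};

/* Returns the index of the chosen candidate, or -1 if none can be
 * scheduled. *spills counts how many spill attempts were made. */
static int pick_best(const struct candidate *c, int n, int *spills)
{
    int best = -1;
    int best_score = INT_MIN;
    *spills = 0;

    for (int i = 0; i < n; i++) {
        if (c[i].score <= best_score)
            continue;

        /* Speculative attempt; assumed to be fully undone afterwards. */
        bool ok = c[i].fits;

        /* Spilling changes state that is not rolled back, so only try it
         * while no other candidate has succeeded yet. */
        if (!ok && best < 0) {
            (*spills)++;
            ok = c[i].fits_after_spill;
        }

        if (ok) {
            best = i;
            best_score = c[i].score;
        }
    }
    return best;
}
```

In this model, once a candidate has succeeded, a later and more desirable candidate that would need a spill is simply skipped, which is the trade-off the commit message accepts.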
Ilia Mirkin de17922b8a nv50/ir: don't consider the main compute function as taking arguments
With OpenCL, kernels can take arguments and return values (?). However
in practice, there is no more TGSI compute implementation, and even if
there were, it would probably have named functions and no explicit main.

This improves RA considerably for compute shaders, since temps are not
kept around as return values.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-07-27 18:24:11 -04:00
Ilia Mirkin 3e468ff2fe nv50/ir: handle insn not being there for definition of CVT arg
This can happen if it's e.g. a uniform or a function argument.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2019-07-27 18:24:11 -04:00
Ilia Mirkin 23dfff0669 nouveau: flip DEBUG -> !NDEBUG
The meson conversion chose to change the meaning of DEBUG from "used for
debugging" to "used for expensive things for debugging", primarily
for nir_validate. Flip things over so that we get nice things with
optimizations enabled.

While we're at it, also kill off nouveau_statebuf.h which is unused (and
has a mention of DEBUG which is how I found it).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
2019-07-27 18:24:11 -04:00
Ilia Mirkin 9f8ed5aa67 nvc0: allow a non-user buffer to be bound at position 0
Previously the code only handled it for positions 1 and up (as would be
the case for UBOs in GL). It's not a lot of trouble to handle this, and vl or
vdpau want this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2019-07-27 18:24:11 -04:00
Ilia Mirkin c52b057e00 nv50,nvc0: update sampler/view bind functions to accept NULL array
Apparently vl (or vdpau) wants to pass that in now. Handle it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Cc: mesa-stable@lists.freedesktop.org
2019-07-27 18:24:11 -04:00
Ilia Mirkin face27fdc5 gallium/vl: fix compute tgsi shaders to not process undefined components
This caused nouveau's function handling logic to think that the MAIN
function was due to receive external parameters, and cascaded some
failures after that. Instead avoid having the undefined components in
the first place.

Fixes: f6ac0b5d71 (gallium/auxiliary/vl: Add compute shader to support video compositor render)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-07-27 18:24:11 -04:00
Boyuan Zhang b0626c1f30 radeon/vcn: enable rate control for hevc encoding
Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2019-07-26 14:33:09 -04:00
Boyuan Zhang 5115c25bb8 radeon/uvd: enable rate control for hevc encoding
Set cu_qp_delta_enable_flag on when rate control is enabled, and set it
off when rate control is disabled (e.g. constant qp).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: fix typo and add bugzilla info

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2019-07-26 14:33:09 -04:00
Boyuan Zhang 9aaf3aaf5d radeon/vcn: fix poc for hevc encode
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: Fix typo

V3: Use the MAX2 macro instead of open-coding the check. Also,
MaxPicOrderCntLsb should be a power of 2 according to the spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2019-07-26 14:33:09 -04:00
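
The constraints mentioned in the two "fix poc for hevc encode" commits can be sketched as below. The function names are hypothetical and MAX2 is defined locally rather than taken from Mesa's headers; the upper bound of 65536 assumes the HEVC range of 0 to 12 for log2_max_pic_order_cnt_lsb_minus4.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition with the same shape as the MAX2 macro named in the commit. */
#define MAX2(a, b) ((a) > (b) ? (a) : (b))

/* Bounds implied by log2_max_pic_order_cnt_lsb_minus4 being in [0, 12]. */
#define POC_LSB_MIN 16u
#define POC_LSB_MAX 65536u

/* Hypothetical helper: pick a legal MaxPicOrderCntLsb, i.e. a power of
 * two that is at least 16, no smaller than the requested value. */
static inline uint32_t hevc_max_pic_order_cnt_lsb(uint32_t requested)
{
    if (requested >= POC_LSB_MAX)
        return POC_LSB_MAX;

    uint32_t pot = 1;
    while (pot < requested)
        pot <<= 1;

    return MAX2(pot, POC_LSB_MIN);
}

/* Hypothetical helper: the low bits of the picture order count that go
 * into the slice header, given a POC supplied by the caller. */
static inline uint32_t hevc_slice_poc_lsb(uint32_t poc, uint32_t max_lsb)
{
    assert(max_lsb >= POC_LSB_MIN && (max_lsb & (max_lsb - 1)) == 0);
    return poc & (max_lsb - 1);
}
```

For example, a requested value of 17 is raised to 32, and a POC of 35 with MaxPicOrderCntLsb = 16 yields an LSB value of 3.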
Boyuan Zhang 77cf700fa3 radeon/uvd: fix poc for hevc encode
MaxPicOrderCntLsb should be at least 16 according to the spec,
therefore add minimum value check.

Also use poc value passed from st instead of calculation
in slice header encoding.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673
Cc: mesa-stable@lists.freedesktop.org

V2: Fix typo

V3: Use the MAX2 macro instead of open-coding the check. Also,
MaxPicOrderCntLsb should be a power of 2 according to the spec.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
2019-07-26 14:33:09 -04:00
Iago Toral Quiroga 1a99fc0fd0 v3d: fix glDrawTransformFeedback{Instanced}()
This needs to take the vertex count from the provided transform
feedback buffer.

v2:
 - don't take the vertex count from the underlying buffer, instead,
   take it from a v3d subclass of pipe_stream_output_target (Eric).

Fixes piglit tests:
spec/ext_transform_feedback2/draw-auto
spec/ext_transform_feedback2/draw-auto instanced

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-26 08:29:41 +02:00
Iago Toral Quiroga 47eb74ae00 v3d: subclass pipe_stream_output_target to record TF vertices written
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-26 08:29:41 +02:00
Iago Toral Quiroga 39df568ca1 v3d: refactor v3d_tf_statistics_record slightly
Reviewed-by: Eric Anholt <eric@anholt.net>
2019-07-26 08:29:41 +02:00
Alyssa Rosenzweig 2f9236096a Revert "panfrost: Don't DIY point size/coord fields"
This reverts commit 4508f43eed, which
broke a bunch of dEQP tests (e.g. in
dEQP-GLES2.functional.draw.draw_arrays.*)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 13:17:22 -07:00
Kenneth Graunke 0e24d10ff5 iris: Use gen_mi_builder to handle CS ALU operations.
In a few cases, we switch to MI_MATH instead of MI_PREDICATE,
just because we were already doing math and it's easier to chain
together.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke fe7ed6b057 iris: Make iris_query.c a genxml-compiled file.
This will let us use Jason's new MI-builder shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke 975f7e4a59 iris: Move iris_resolve_conditional_render to the vtable.
It's going to be in genxml code shortly.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke 6c4c7b600d iris: Refactor genxml macros and inlines into iris_genx_macros.h.
This will let us put the genxml boilerplate in one place, before we
expand genxml to more files shortly.  Like i965/genX_boilerplate.h.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Kenneth Graunke 204a3bb816 iris: Make an iris_genx_protos.h header for prototypes.
This lets us specify the prototypes once, instead of cut and pasting
them per generation.  isl uses a similar approach (isl_genX_priv.h).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-25 18:42:55 +00:00
Marek Olšák 068093e84c radeonsi: fix DAL hang due to incorrect DCC offset on Raven
Set the correct relative offset.

Fixes: f8b6c5a "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support"
2019-07-25 14:09:11 -04:00
Alyssa Rosenzweig 5534fdb7bf panfrost: Compute I/O counts from shader_info
...rather than exposing it in the vendored compiler region.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig 4508f43eed panfrost: Don't DIY point size/coord fields
Again, it's in shader_info for us!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig bab4f6c724 panfrost: Use nir_gather_info information about discards
No need to track this ourselves!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig 48991c7a1f panfrost: Use NIR helper invocations info
We don't need to guesstimate this ourselves. This will help when we
bring up derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig fb2fe6e7bc panfrost/sfbd: Flesh out fragment job
We include a zsbuf attachment function based on how the corresponding
MFBD code works, as well as extending cbufs to mipmapped rendering while
we're at it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig e6802af8c3 panfrost: Disable tiled formats on SFBD systems
Just because we don't have the format codes to render to them yet.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:20 -07:00
Alyssa Rosenzweig 990e24469c panfrost: Move require_sfbd to screen
We'll need it to specialize resource creation by chip.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:20 -07:00
Alyssa Rosenzweig a9c73e825a panfrost: Reserve, but do not upload, shader padding
Fixes invalid read errors reported by valgrind.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:20 -07:00
Tomeu Vizoso 688d9b4fb7 panfrost/ci: Update kernel to 5.2
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 15:08:44 +02:00
Alyssa Rosenzweig 31c9fcbd0f panfrost: Don't expose some atomic stuff even with dEQP
Fixes dEQP crashes.

Fixes: 2f93ecd654 ("panfrost: Fake CAPs for dEQP-GLES31")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-24 17:21:12 -07:00
Dave Airlie 16fcbb2eba gallium: fix windows build from params change.
This is why we can't have nice things. I'm sure there's some way
to do this with {0} but I really don't have time for that.

Fixes: 2631fd3b0b ("gallivm: rework lp_build_tgsi_soa to take a struct")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2019-07-25 10:02:22 +10:00
Jonathan Marek bc3b6168ba nir: replace lower_sincos with algebraic opt
This version has fewer ops for the same precision.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Matt Turner <mattst88@gmail.com>
2019-07-24 17:36:21 -04:00
Rob Clark b4f4768672 gallium/u_transfer_helper: fix assert in RGTC case
Previously we'd hit the unreachable() for uploading RGTC.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-07-24 21:11:06 +00:00
Jason Ekstrand c84b8eeeac intel/compiler: Be more conservative about subgroup sizes in GL
The rules for gl_SubgroupSize in Vulkan require that it be a constant
that can be queried through the API.  However, all GL requires is that
it's a uniform.  Instead of always claiming that the subgroup size in
the shader is 32 in GL like we have to do for Vulkan, claim 8 for
geometry stages, the maximum for fragment shaders, and the actual size
for compute.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-07-24 12:55:40 -05:00
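
The policy described in c84b8eeeac can be summarized in a short sketch. The enums and the function are hypothetical and do not mirror the intel compiler's interfaces; treating vertex and tessellation stages as "geometry stages" is an interpretation, and the fragment maximum and compute dispatch size are passed in as parameters because the commit message does not state them.

```c
#include <assert.h>

/* Hypothetical enums for the client API and the shader stage. */
enum client_api { API_VULKAN, API_GL };

enum shader_stage {
    STAGE_VERTEX,
    STAGE_TESS_CTRL,
    STAGE_TESS_EVAL,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COMPUTE
};

/* Subgroup size the shader is told about, following the commit message:
 * Vulkan always reports 32; GL reports 8 for geometry stages, the maximum
 * for fragment shaders, and the actual size for compute. */
static unsigned claimed_subgroup_size(enum client_api api,
                                      enum shader_stage stage,
                                      unsigned max_fragment_size,
                                      unsigned compute_dispatch_size)
{
    if (api == API_VULKAN)
        return 32;

    switch (stage) {
    case STAGE_FRAGMENT:
        return max_fragment_size;
    case STAGE_COMPUTE:
        return compute_dispatch_size;
    default:
        return 8; /* vertex, tessellation, and geometry stages */
    }
}
```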