Commit Graph

32777 Commits

Author SHA1 Message Date
Rob Clark 4a9aad96aa freedreno/a5xx: fix SSBO emit for non-zero offset
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:29:00 -05:00
Rob Clark 5f25ab4fee freedreno/a5xx: remove obsolete comment
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:29:00 -05:00
Rob Clark 8fcee858d5 freedreno/ir3: don't create split/fo if only writing .x
In case an instruction only writes one register, and it is .x, we can
skip the extra level of fanout indirection.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark e7b2719f69 freedreno/a5xx: indirect grids
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 471aa1b6d0 freedreno/a5xx: add global size compute cap
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 62981bbe65 freedreno/ir3: turn on std430 packing
Seems to fix dEQP compute-related tests, and matches what i965 does, so
perhaps there is some assumption that std430 packing is on by default
somewhere in NIR?
2017-11-12 12:28:59 -05:00
Rob Clark bedbe7f90c freedreno/a5xx: image support
2017-11-12 12:28:59 -05:00
Rob Clark 819a613ae3 freedreno/ir3: moar better scheduler
Add a new pass that inserts additional dependencies, rather than simply
relying on SSA srcs added in the nir->ir3 frontend.  This makes it
easier to deal with barriers, and the additional false deps also let us
properly ensure that a write depends on all previous reads.

Since conversion to barrier instructions is lossy (ie. just knowing the
instruction doesn't tell us enough about what other instructions the
barrier applies to), use barrier_class/barrier_conflict fields in the
ir3_instruction to retain this information.

This could probably be relaxed somewhat by considering *which* array/
buffer/image variable is being referenced.  Ie. a write to buffer A
can overtake a read from buffer B, if B is not coherent.  (right?)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
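
The false-dependency idea in the scheduler commit above can be sketched roughly as follows. This is a simplified illustration in Python, not the actual ir3 pass: the `Barrier` flags, the `Instr` class, and the `load`/`store` helpers are hypothetical, and only the `barrier_class`/`barrier_conflict` field names are taken from the commit message. Each instruction records which class it belongs to and which classes it conflicts with, and a later instruction gains a dependency on every earlier instruction whose class it conflicts with.

```python
from dataclasses import dataclass, field
from enum import IntFlag


class Barrier(IntFlag):
    # Hypothetical barrier classes; the real set in ir3 may differ.
    NONE = 0
    BUFFER_R = 1
    BUFFER_W = 2


@dataclass
class Instr:
    name: str
    barrier_class: Barrier = Barrier.NONE     # what this instruction is
    barrier_conflict: Barrier = Barrier.NONE  # what it must not be reordered with
    deps: list = field(default_factory=list)  # names of earlier instructions


def add_false_deps(block):
    """Add a dependency on every earlier instruction we conflict with."""
    for i, instr in enumerate(block):
        if not instr.barrier_conflict:
            continue
        for prior in block[:i]:
            if prior.barrier_class & instr.barrier_conflict:
                instr.deps.append(prior.name)
    return block


def load(name):
    # A read only conflicts with writes.
    return Instr(name, Barrier.BUFFER_R, Barrier.BUFFER_W)


def store(name):
    # A write conflicts with both earlier reads and earlier writes.
    return Instr(name, Barrier.BUFFER_W, Barrier.BUFFER_R | Barrier.BUFFER_W)


block = add_false_deps(
    [load("ld0"), load("ld1"), Instr("add"), store("st0"), load("ld2")]
)
for instr in block:
    print(instr.name, instr.deps)
```

In this sketch the store ends up depending on both earlier loads (the write-after-read case), while the two loads remain free to be reordered relative to each other.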
Rob Clark 15ea8d128a freedreno/ir3: move macros
I want to add a growable array to ir3_instruction, so we can append
false dependencies for purposes of scheduling barriers, atomics, and
dealing with write after read hazards.

Just code motion preparing for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 9edfc369c0 freedreno/ir3: image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark eaae81058c freedreno/ir3: shared variable support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark dd75abc6f3 freedreno/ir3: some SSBO cleanups/fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 2f8bdf2e2b freedreno/ir3: split out INSTR4F instructions
Atomic instructions take a different # of src args depending on .g or .l
variant, split these out into different helpers with INSTR*F() helper
macro that lets you specify instruction flag.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 0038deb256 freedreno/ir3: cat6 encoding fixes
Instruction encoding/decoding fixes needed for images, shared variables,
etc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 4e9a6c6868 freedreno/ir3: add barriers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 4c711f4d18 freedreno/ir3: invert is_same_type_mov() logic
Some instructions (like barriers) have no dst, which causes problems
with dereferencing a NULL dst.  Flip the logic around to reject opc's
that can't be a type of move first, to filter out those instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
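
The reordering described in the commit above might look something like this. It is a minimal sketch in Python rather than the driver's C code; the `MOVE_OPCODES` set, the `Reg` and `Instr` classes, and the type strings are all hypothetical stand-ins. The point is only the order of the checks: the opcode is tested first, so instructions without a destination never reach the code that reads it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of opcodes that can act as a move.
MOVE_OPCODES = {"mov", "absneg.f", "absneg.s"}


@dataclass
class Reg:
    type: str


@dataclass
class Instr:
    opc: str
    src_type: str = ""
    dst: Optional[Reg] = None  # barriers and similar have no destination


def is_same_type_mov(instr):
    # Reject anything that cannot be a move before looking at dst.
    if instr.opc not in MOVE_OPCODES:
        return False
    # Only move-like opcodes get here, and in this sketch they
    # are assumed to always carry a destination register.
    return instr.dst.type == instr.src_type


barrier = Instr("bar")
same = Instr("mov", src_type="f32", dst=Reg("f32"))
convert = Instr("mov", src_type="f32", dst=Reg("u32"))
print(is_same_type_mov(barrier), is_same_type_mov(same), is_same_type_mov(convert))
```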
Rob Clark 6da5130074 freedreno/ir3: add cat7 instructions
Needed for memory and execution barriers.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 33f5f63b8f freedreno/ir3: add SSBO get_buffer_size() support
Somehow I overlooked this when adding initial SSBO support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark b267a08404 freedreno/ir3: extract helper for common consts
User consts and driver consts such as UBO addresses and immediates are
handled the same for all shader stages, so split out a shared helper for
these, to make it easier to add more.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 13fe1feb62 freedreno: add image view state tracking
It is unfortunate that image state isn't a real CSO, since (at least for
a4xx/a5xx) it is a combination of sampler and "SSBO" image state, and it
would be useful to pre-compute the state block "register" values rather
than doing it at emit time.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 12c1c3ab23 freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 5009dc55f2 freedreno/ir3: rename ir3_compile -> ir3_context
Having both an ir3_compile (which was really context for compiling a
single shader variant) and ir3_compiler (which is the compiler object
that compiles all variants, ie. basically holds the RA regset) is a
bit confusing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Timothy Arceri 8fe6abd964 ac: add emit_vertex to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Timothy Arceri dc42a2177c radeonsi: rework gs_vtx_offset handling
This simplifies things a bit and will enable it to work with the
common NIR -> LLVM code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-12 11:08:26 +11:00
Marek Olšák 3a71eac783 st/dri: fix deadlock when waiting on android fences
Android fences can't be deferred, because st/dri calls fence_finish
with ctx = NULL, so the driver can't flush u_threaded_context.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-11 04:12:53 +01:00
Rob Clark 881f6e741f meson: Guard freedreno build with with_gallium_freedreno.
This prevents build failures when libdrm_freedreno is unavailable,
which started happening after the ir3_compiler build was enabled.

(Patch by Rob, commit message by Ken).

Fixes: fecd04a66a ("freedreno/ir3: fix standalone compiler meson build")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-11-10 17:11:48 -08:00
Dylan Baker ad9c2f5469 meson: build gallium-xlib based glx
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 13:00:01 -08:00
Dylan Baker 140b688c57 meson: add nir_builder_opcodes_h to gallium_auxiliary
This creates a dependency on this header being generated before trying
to compile any of these targets, as well as passing the correct -I to
the compiler to ensure it's included correctly.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-10 12:59:54 -08:00
Dylan Baker 7210d0096a gallium/xlib: remove GL_{MAJOR,MINOR,TINY}
These variables were removed from autotools in 2008 (sha:
80f68e1b6a), but they have lived on here. The Scons build
meanwhile doesn't set a patch/tiny version at all, just major and minor.
This patch removes the unused variables and simply sets the version,
leaving patch/tiny as 0 since that's what the autotools build has been
doing forever. This shouldn't change any behavior.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2017-11-10 12:40:08 -08:00
Timothy Arceri f9e5216f71 radeonsi: get llvm types from ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-11 06:54:25 +11:00
Marek Olšák e456d4def5 st/dri: fix android fence regression
Fixes piglit - egl_khr_fence_sync/android_native tests.
Broken by 884a0b2a9e.

Introduce state-tracker flush flags, analogous to the pipe ones. Use
the former with stapi->flush().

Fixes: 884a0b2a9e ("st/dri: use stapi flush instead of pipe flush
when creating fences")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-10 17:17:13 +01:00
Nicolai Hähnle ee880e91cc gallium/u_threaded: fix end_query regression
Ouch...

Fixes: 244536d3d6 ("gallium/u_threaded: avoid syncs for get_query_result")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103653
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-10 16:37:37 +01:00
Bruce Cherniak d473f91758 swr: Fixed an uncommon freed-memory access during state validation
State validation is performed during clear and draw calls.  Validation
during clear was still accessing vertex buffer state.  When the currently
set vertex buffers are client arrays, this could lead to accessing freed
memory.  Such is the case with the VMD application.

Previously, vertex buffer validation depended on a dirty bit or the
draw info indicating an indexed draw.  This required special handling for
clears.  But, vertex buffer validation still occurred which was unnecessary
and wrong.

Now, only minimal validation is performed during clear, deferring the
remainder to the next draw.  And, by setting the dirty bit in swr_draw_vbo
for indexed draws, vertex buffer validation is only dependent upon a
single dirty bit.

This fixes a bug exposed by the VMD application when changing models.

Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
2017-11-10 08:55:42 -06:00
Rob Clark fecd04a66a freedreno/ir3: fix standalone compiler meson build
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Rob Clark 86154acb57 freedreno/ir3: correct # of dest components for intrinsics
Don't rely on intr->num_components having a valid value.  It doesn't
seem to anymore for non-vectorized intrinsics.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Rob Clark 3fcf18634c freedreno/ir3: remove bogus assert
The ssbo atomic instructions are not vectorized.  So num_components is
not expected to be valid.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-10 08:57:33 -05:00
Eric Anholt 62deeaa23a broadcom/vc4: Fix simulator mode for the MADVISE usage.
2017-11-09 15:51:56 -08:00
Dave Airlie 06993e4ee3 r600: add support for hw atomic counters. (v3)
This adds support for the evergreen/cayman atomic counters.

These are implemented using GDS append/consume counters. The values
for each counter are loaded before drawing and saved after each draw
using special CP packets.

v2: move hw atomic assignment into driver.
v3: fix messing up caps (Gert Wollny), only store ranges in driver,
drop buffers.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
2017-11-10 08:39:36 +10:00
Dave Airlie cca5617348 gallium: add hw atomic buffer binding API.
This API binds atomic buffers for all bound shaders (as per the
GL semantics).

This is needed to support cross shader hw atomic counters.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie 4b0b82770a gallium/tgsi: start adding hw atomics (v3.2)
This adds support for hw atomic counters to TGSI.

A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.

v2: drop semantic, move hw counter to backend,
Ilia pointed out SSO would have busted my plan, and he
was right.
v3: drop BUFFER decls. (Marek)
v3.1: minor fixups for whitespace, set ureg error
if we overflow the hw atomic limits. (nha)
v3.2: fix some docs inconsistencies (Ilia)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:35 +10:00
Dave Airlie 2a06423c00 gallium: add CAPs to support HW atomic counters. (v3)
This looks like an evergreen specific feature, but with atomic
counters AMD have hw specific counters they use instead of operating
on buffers directly. These are separate to the buffer atomics,
so require different limits and code paths.

I've left the CAP for atomic type extensible in case someone
else has a variant on this sort of thing (freedreno maybe?)
and needs to change it.

This adds all the CAPs required to add support for those atomic
counters, along with a related CAP for limiting the number of
output resources.

I'd like to land this and the st patch then I can start to
upstream the evergreen support for these and other GL4.x features.

v2: drop the ATOMIC_COUNTER_MODE cap, just use the return
from the HW counters. If 0 we use the current mode.
v3: fix some rebase errors (Gert Wollny)

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:34 +10:00
Dave Airlie 24baca6e75 r600/query: drop rest of vi workaround code.
This isn't needed in r600 anymore.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-10 08:39:16 +10:00
Boris Brezillon 359a8f6ae5 broadcom/vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all
BOs placed in the mesa BO cache as purgeable so that the system can
reclaim this memory under memory pressure.

v2:
- Removed BOs from the cache when they've been purged by the kernel
- Check whether the madvise ioctl is supported or not before using it

v3: Don't walk the whole list when we find a busy BO (by anholt, acked by
    Boris)

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 10:57:17 -08:00
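
The cache behaviour described in the commit above (mark on entry, check on reuse, drop purged BOs, stop at the first busy BO) can be modelled roughly as below. This is a pure-Python simulation under stated assumptions, not the vc4 driver code: `BO`, `BOCache`, and the `madvise` function are hypothetical stand-ins, the real ioctl is not called, and a single size bucket is assumed.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class BO:
    handle: int
    busy: bool = False       # still in use by the GPU
    purgeable: bool = False  # hint given to the simulated kernel
    purged: bool = False     # simulated kernel reclaimed the pages


def madvise(bo, will_need):
    """Stand-in for the madvise ioctl; returns True if pages were retained."""
    bo.purgeable = not will_need
    return not bo.purged


class BOCache:
    def __init__(self):
        self.free = deque()  # oldest entry first

    def put(self, bo):
        # Entering the cache: the kernel may reclaim this BO under pressure.
        madvise(bo, will_need=False)
        self.free.append(bo)

    def get(self):
        while self.free:
            bo = self.free[0]
            if bo.busy:
                # Later entries were freed more recently, so stop walking.
                return None
            self.free.popleft()
            if madvise(bo, will_need=True):
                return bo
            # Purged by the kernel: drop it and try the next entry.
        return None


cache = BOCache()
a, b = BO(1), BO(2)
cache.put(a)
cache.put(b)
a.purged = True  # pretend the kernel reclaimed the first BO
reused = cache.get()
print(reused.handle)
```

Here the purged BO is silently dropped from the cache and the second one is handed back with its purgeable hint cleared.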
Eric Anholt ebcb4c2156 meson: Enable VC4's NEON assembly support.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:30 -08:00
Eric Anholt 9c9fd8ff37 meson: Always link libgallium_dri.so against dep_thread.
Somehow on my cross build the -pthread is getting lost.  All the other
deps seem to work out fine.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:27 -08:00
Marek Olšák 9ceb057ebf radeonsi: pack r600_surface better
160 -> 136 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
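
Size reductions like the one above usually come from reordering or narrowing fields so the compiler inserts less padding. The following sketch uses Python's `ctypes` to show the effect on two made-up structures; the field names and types are hypothetical and unrelated to the actual r600_surface layout, and the exact sizes depend on the platform ABI.

```python
import ctypes


class Loose(ctypes.Structure):
    # Small fields interleaved with large ones force padding
    # before each more strictly aligned member.
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("address", ctypes.c_uint64),
        ("flag_b", ctypes.c_uint8),
        ("count", ctypes.c_uint32),
    ]


class Packed(ctypes.Structure):
    # Same fields, ordered from largest to smallest alignment.
    _fields_ = [
        ("address", ctypes.c_uint64),
        ("count", ctypes.c_uint32),
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
    ]


loose_size = ctypes.sizeof(Loose)
packed_size = ctypes.sizeof(Packed)
print(f"{loose_size} -> {packed_size} bytes")
```

Both structures hold the same data; only the declaration order differs.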
Marek Olšák 169525684f radeonsi: pack r600_texture better
1752 -> 1736 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák f8a4b606a2 radeonsi: clean up r600_surface
216 -> 160 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák 6916ee7e17 radeonsi: remove r600_texture::non_disp_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák a06fe75eac radeonsi: remove DBG_NO_DISCARD_RANGE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00