Timur Kristóf
aa75be05af
aco: Clean up usages of PhysReg::reg from aco_assembler.
...
These are no longer needed, since PhysReg has an implicit
conversion operator to unsigned int,
which is equivalent to accessing this field.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
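The implicit conversion that commit aa75be05af relies on can be sketched as follows. This is a simplified, hypothetical stand-in for ACO's PhysReg: only the `reg` field and the conversion to unsigned int come from the commit message, and `encode_dst` is an illustrative helper that does not exist in the assembler.

```cpp
// Simplified stand-in for ACO's PhysReg (assumption: the real class has more members).
struct PhysReg {
   constexpr PhysReg() = default;
   explicit constexpr PhysReg(unsigned r) : reg(r) {}

   // Implicit conversion: a PhysReg can be used wherever an unsigned is expected.
   constexpr operator unsigned() const { return reg; }

   unsigned reg = 0;
};

// Hypothetical encoder helper: packs a register index into bits 8..15.
// Before the cleanup this would have been written as (dst.reg & 0xffu) << 8.
constexpr unsigned encode_dst(PhysReg dst)
{
   return (dst & 0xffu) << 8;
}
```

Both spellings produce the same value, so dropping `.reg` only shortens the encoder code.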
Timur Kristóf
d729d8f1dc
aco: Add extra assertion for number of FS input VGPRs.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
a89153d038
aco: Fix s_dcache_wb on GFX10.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
68c9554732
aco: Have s_waitcnt_vscnt write to NULL.
...
Not sure if this instruction actually writes anything, but LLVM
disassembles a destination and sets it to NULL.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
619f0a71cc
aco: Use the VOP3-only add/sub GFX10 instructions if needed.
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
6a6bef59b0
aco: Initial work to avoid GFX10 hazards.
...
Currently just breaks up SMEM groups and fixes
FeatureVMEMtoScalarWriteHazard (name from LLVM).
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
d63c175897
aco: pad code with s_code_end on GFX10
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
83993f535e
aco: workaround GFX10 0x3f branch bug
...
According to LLVM, branches with an offset of 0x3f are buggy.
v2: (by Timur Kristóf)
- extract the GFX10 specific part to its own function
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
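One way to work around the 0x3f branch offset described in commit 83993f535e is to pad the code until no branch has that offset. The sketch below is not the code from the commit: the `Branch` struct, the function names, the dword-based offset formula (target - position - 1) and the use of an `s_nop 0` encoding as padding are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping for one branch, in dword units.
struct Branch {
   int pos;    // index of the branch instruction in the code
   int target; // index of the instruction the branch jumps to
};

// Assumption: the encoded offset is relative to the dword after the branch.
inline int branch_offset(const Branch& b)
{
   return b.target - b.pos - 1;
}

// Inserts a NOP after every branch whose offset is 0x3f and returns the
// number of NOPs inserted. Repeats until no such branch is left, because
// an insertion can push another branch onto the bad offset.
int avoid_0x3f_offsets(std::vector<uint32_t>& code, std::vector<Branch>& branches)
{
   constexpr uint32_t s_nop_0 = 0xbf800000u; // assumed encoding of "s_nop 0"
   int inserted = 0;
   bool found;
   do {
      found = false;
      for (const Branch& b : branches) {
         if (branch_offset(b) != 0x3f)
            continue;

         const int nop_pos = b.pos + 1;
         code.insert(code.begin() + nop_pos, s_nop_0);

         // Everything at or after the NOP moves down by one dword.
         for (Branch& other : branches) {
            if (other.pos >= nop_pos)
               other.pos++;
            if (other.target >= nop_pos)
               other.target++;
         }
         inserted++;
         found = true;
         break; // rescan from the start with the updated positions
      }
   } while (found);
   return inserted;
}
```

Offsets only ever grow in this scheme, so each branch can land on 0x3f at most once and the loop terminates.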
Timur Kristóf
0be1dd8564
aco: Fix VS input VGPRs on GFX10.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
c24cd97515
aco: Assemble opsel in VOP3 instructions.
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry
818bdab796
aco: Allow literals on VOP3 instructions.
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 09:57:53 +02:00
Timur Kristóf
7cf1dcf22d
aco: Support subvector loops in aco_assembler.
...
These are currently not used, but could be useful later.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
21f1953383
aco: Set GFX10 dimensionality on the instructions that need it.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
eaa2a7cdf6
aco: Use ac_get_sampler_dim, delete duplicate code.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
1de9ef9c96
aco: Set GFX10 DLC bit properly.
...
The DLC bit is now set to 1 for all loads when GLC is also set,
but cleared to 0 for all stores (otherwise it causes issues),
and also cleared to 0 for atomics.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
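The DLC rule stated in commit 1de9ef9c96 reduces to a single predicate. The sketch below is illustrative only: the `MemOp` enum and the `gfx10_dlc` function are hypothetical names, not taken from ACO.

```cpp
// Hypothetical classification of a memory instruction.
enum class MemOp { Load, Store, Atomic };

// DLC is 1 only for loads that also have GLC set;
// stores and atomics always get DLC = 0.
constexpr bool gfx10_dlc(MemOp op, bool glc)
{
   return op == MemOp::Load && glc;
}
```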
Timur Kristóf
89b074be86
aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
d3a48c272f
aco: Support GFX10 EXP in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
e6330d71b5
aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
64d74ca816
aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
c0df15e645
aco: Support GFX10 MTBUF in aco_assembler.
...
Also remove img_format from aco_ir, since it can be calculated
from dfmt and nfmt. So only the assembler needs to deal with it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf
e96124bd65
aco: Link ACO with amd/common.
...
ACO needs some functions from amd/common (AC), for example
some ac_shader_util functions, so we need to link
ACO against AC.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
c57503b932
amd/common: Add extern "C" to some headers that were missing it.
...
We'd like to include some of these in C++ code later.
Specifically, ACO is written in C++ and we would like to use
some of this code in ACO in order to avoid code duplication.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
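The header change in commit c57503b932 follows the usual pattern for making a C header usable from C++. The sketch below shows that pattern with a hypothetical header, `ac_example.h`, and a hypothetical function, `ac_example_align`; neither exists in amd/common. Header and definition are shown in one block so the example is self-contained.

```cpp
// --- contents of a hypothetical C header, ac_example.h ---
#ifndef AC_EXAMPLE_H
#define AC_EXAMPLE_H

#ifdef __cplusplus
extern "C" {   // give the declarations C linkage when compiled as C++
#endif

// Rounds value up to a multiple of alignment (assumed to be a power of two).
unsigned ac_example_align(unsigned value, unsigned alignment);

#ifdef __cplusplus
}
#endif

#endif /* AC_EXAMPLE_H */

// --- definition, which would normally live in a separate .c file ---
unsigned ac_example_align(unsigned value, unsigned alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}
```

Without the `extern "C"` guard, a C++ caller would look for a name-mangled symbol and fail to link against the C object file.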
Timur Kristóf
9e27816252
aco: Support GFX10 MUBUF in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
6106d4bce9
aco: Support GFX10 DS in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
bbe87eb6c3
aco: Support GFX10 VINTRP in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
b6235651b9
aco: Support GFX10 SMEM in aco_assembler.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
fd1d947457
aco: Add missing GFX10 specific fields and some README notes.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf
a01d796de4
aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
...
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Alejandro Piñeiro
fa41a51891
v3d: take into account prim_counts_offset
...
Specifically when reading the primitive counters.
This fixed ~700 CTS tests using this pattern:
dEQP-GLES3.functional.transform_feedback.*
when run after tests like
dEQP-GLES3.functional.prerequisite.read_pixels on the same
caselist. When run individually those tests were passing because
prim_counts_offset was zero.
Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback")
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-10-10 09:51:50 +02:00
Samuel Pitoiset
42b2d1119a
radv: get the device name from radeon_info::name
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 08:15:41 +02:00
Dave Airlie
b1f3173d0f
st/mesa: fix R8 bitmap texture for TGSI paths.
...
The initial patch only fixed up the NIR path, but forgot
the TGSI path needed fixing as well.
Fixes: f92226931b ("st/mesa: Prefer R8 for bitmap textures")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-10-10 10:22:37 +10:00
Jason Ekstrand
c7e5d24d8f
anv/pipeline: Capture serialized NIR
...
This allows the serialized NIR to be displayed in RenderDoc and similar
tools.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-09 22:28:01 +00:00
Matt Turner
b2f6fda542
clover: Remove unused code
...
Fixes: 96b592696f ("gallium: Require LLVM >= 3.9")
Bug: https://bugs.gentoo.org/685678
2019-10-09 14:54:07 -07:00
Greg V
6da865bcfe
clover: use iterator_range in get_kernel_nodes
...
With libc++ (LLVM's STL implementation), the original code does not compile because an
appropriate vector constructor cannot be found (for the _ForwardIterator one, requirement
is_constructible is not satisfied).
2019-10-09 14:54:07 -07:00
Marek Olšák
aed1f7ad34
radeonsi: enable MSAA shader images
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:38 -04:00
Marek Olšák
095a58204d
radeonsi: expand FMASK before MSAA image stores are used
...
Image stores don't use FMASK, so we have to turn it into identity.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:36 -04:00
Marek Olšák
98b88cc1f6
radeonsi: apply FMASK to MSAA image loads
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:34 -04:00
Marek Olšák
c0575a6241
radeonsi: clean up image_fetch_rsrc
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:33 -04:00
Marek Olšák
743a9d85e2
radeonsi: add FMASK slots for shader images (for MSAA images)
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:31 -04:00
Marek Olšák
1881b35bf6
radeonsi: set the sample index for shader images correctly
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:30 -04:00
Marek Olšák
0a0def7317
radeonsi: fix GLSL imageSamples()
...
We haven't supported MSAA images, so it doesn't matter much.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:28 -04:00
Marek Olšák
279da8a201
tgsi/scan: add tgsi_shader_info::msaa_images_declared
...
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:27 -04:00
Marek Olšák
e26bd397a8
nir: add shader_info::last_msaa_image
...
for radeonsi
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:19 -04:00
Marek Olšák
e4f4bb8abd
radeonsi: don't set BO metadata for non-zero planes
...
pointed out by Bas
2019-10-09 17:06:54 -04:00
Marek Olšák
28da990bed
radeonsi: ignore metadata for non-zero planes
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
86e60bc265
radeonsi: remove si_vid_join_surfaces and use combined planar allocations
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
0f7c9dad44
radeonsi: allocate planar multimedia formats in 1 buffer
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
35680bfea1
vl: use u_format in vl_video_buffer_formats
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
a122e70858
gallium/u_tests: test NV12 allocation and export
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00
Marek Olšák
20f132e5ef
gallium/util: add planar format layouts and helpers
...
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-09 17:06:54 -04:00