Commit Graph

89132 Commits

Author SHA1 Message Date
Dave Airlie 9aec76aca3 radv: handle layered fast clears.
This iterates the fast clear flush across the layers in the
specified range.

It also moves the compute resolve flush into the function
and builds the range in there.

This fixes:
dEQP-VK.geometry.layered.* regressions since fast clears.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-02-19 20:30:01 +10:00
Dave Airlie efc89edf5a radv: pass subresourceRange by pointer.
This struct is 5 dwords, we should really just pass a pointer
to it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-19 20:28:22 +10:00
Dave Airlie 2b3c490e23 radv: fix typo in a2b10g10r10 fast clear calculation.
This fixes:
dEQP-VK.renderpass.formats.a2b10g10r10_unorm_pack32*
regressions.

Fixes:
f22836dbdd radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-02-19 20:27:28 +10:00
Bas Nieuwenhuizen c7fcaf2314 radv: Invert ring SGPR check.
I assume this wants to check if all pipelines use the same SGPR for
the rings.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-19 10:13:11 +01:00
Bas Nieuwenhuizen e12cf3f9bf radv: Clamp framebuffer dimensions to min. attachment dimensions.
Even though the preferred stance is not to fix incorrect applications
via the driver, this prevents some nasty GPU hangs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-19 10:13:01 +01:00
Marek Olšák ad019bf5c6 gallium: remove TGSI_OPCODE_CLAMP
Not used and not widely supported. Use MIN+MAX instead.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák 675ef9c0c7 ac/llvm: use min+max instead of AMDGPU.clamp on LLVM 5.0
It selects v_med3_f32, which has the same rate & size.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák 660b55e6d9 radeonsi: stop using TGSI_OPCODE_CLAMP by moving it amd/common
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák 73d1c8c686 tgsi/lowering: stop using TGSI_OPCODE_CLAMP
v2: do it correctly

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák 1d1b769561 st/mesa: stop using TGSI_OPCODE_CLAMP
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 02:58:43 +01:00
Marek Olšák 45240ce598 radeonsi: use R600_RESOURCE_FLAG_UNMAPPABLE where it's desirable
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák a41587433c gallium/radeon: add R600_RESOURCE_FLAG_UNMAPPABLE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 9434421213 gallium/radeon: change r600_aligned_buffer_create to take flags, not bind
All call sites set bind = 0. The next commit will use this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák ac6007460a radeonsi: upload constants into VRAM instead of GTT
This lowers lgkm wait cycles by 30% on VI and normal conditions.
The might be a measurable improvement when CE is disabled (radeon)
or under L2 thrashing.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák a550fbb510 gallium/radeon: use TCC line size as alignment in other places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 791e8ce04a radeonsi: use a clever alignment for index buffer uploads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák d6c8c26851 radeonsi: use a clever alignment for descriptor uploads
Non-VBO descriptors won't be smaller than the cache line, so simply use
the cache line size.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 6b73aafceb radeonsi: use a clever alignment for constant buffer uploads
This results in a very tiny decrease in lgkm wait cycles.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 620aded541 radeonsi: move index buffer flushing into a non-upload indexed case
The other codepaths don't need this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 22b8a773e1 radeonsi: use SI_MAX_ATTRIBS where it should be used
for consistency; no change in behavior

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 054f853035 radeonsi: sort members of si_shader_key::part
and improve some comments

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 1fabb29717 radeonsi: have separate LS and ES main shader parts in the shader selector
This might reduce the on-demand compilation if the initial VS/LS/ES
determination is wrong.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák a02117ba6e radeonsi: don't compile pure monolithic shaders asynchronously
there is no point, we have to wait anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 9b91e0b54c radeonsi: allow unaligned vertex buffer offsets and strides on CIK-VI
So that we can disable u_vbuf for GL core profiles.

This is a v2 of the previous VI-only patch.
It requires SH_MEM_CONFIG.ALIGNMENT_MODE = UNALIGNED on CIK-VI.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 2fb021b620 radeonsi: remove the fix_size3 workaround
not needed with the shader fallback

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák dbd38f2a92 radeonsi: add a workaround for clamping unaligned RGB 8 & 16-bit vertex loads
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 41a2157a68 radeonsi: make fix_fetch an array of uint8_t
so that we can add 3-component fallbacks.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák f246ae1ee9 vl: fix a buffer leak in the bicubic filter by using an uploader
there's no error checking, because the previous code didn't do it either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák c8d84801b7 gallium/hud: create files after graphs are created to get final names
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 22c34bbc55 gallium/u_suballoc: allow setting pipe_resource::flags
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák edf6bcf6c6 gallium/u_suballoc: use clear_buffer if available
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 02cd8b20d1 gallium/util: correctly unref a buffer in u_prim_restart
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 42297c862f gallium/util: remove unused u_index_modify helpers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák 7f0bf00dc9 gallium/util: remove unused helper util_draw_texquad
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-18 01:22:08 +01:00
Marek Olšák b5b0936677 gallium/docs: remove documentation of non-existent instructions
trivial
2017-02-18 01:22:08 +01:00
Jason Ekstrand 5f02c2a054 anv/TODO: Check off Storage Image Without Format
The code for this landed a few days ago.
2017-02-17 14:18:34 -08:00
Marek Olšák edd23e0606 ac/llvm: fix various findMSB bugs
sffbh needs to be suffixed with ".i32"

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-18 06:24:32 +10:00
Jose Maria Casanova Crespo 429f112a11 glsl: link error if unsized array not-last in ssbo
If an unsized declared array is not the last in an SSBO
and an implicit size can not be defined on linking time,
the linker should raise an error instead of reaching
an assertion on GL.

This reverts part of commit 3da08e1664
getting back to the behavior of commit 5b2675093e

The original patch was correct for GLES that should produce
a compile-time error but the linker error is still necessary
in desktop GL.

Fixes the following piglit tests:
tests/spec/arb_shader_storage_buffer_object/non_integral_size_array_member.shader_test
tests/spec/arb_shader_storage_buffer_object/unsized_array_member.shader_test

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
2017-02-17 15:49:16 +02:00
Lionel Landwerlin a0ac118398 i965/fs: fix uninitialized memory access
Found while running shader-db under valgrind.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
2017-02-17 10:07:56 +00:00
Timothy Arceri 62c90492ef glsl: disable on disk shader cache when running as another user
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-02-17 20:21:22 +11:00
Alejandro Piñeiro 966ddd5d3d mesa/formatquery: use consistent local function names
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2017-02-17 10:17:54 +01:00
Bas Nieuwenhuizen d5bf4c7394 radv: Use different allocator for descriptor set vram.
This one only keeps allocated memory in the list, and list nodes
in the descriptor sets. Thsi doesn't need messing around with
max_sets, and we get automatic merging of free regions.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-17 09:28:23 +01:00
Bas Nieuwenhuizen f448701622 radv: Never try to create more than max_sets descriptor sets.
We only use the freed ones after all free space has been used. If
the app only allocates small descriptor sets, we might go over
max_sets before the memory is full.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79
2017-02-17 09:28:14 +01:00
Samuel Iglesias Gonsálvez fccbad73ef i965/fs: fix 32-bit data type to int64 conversion on BSW/BXT
The 32-bit to 64-bit conversions need to have the 32-bit
data source elements aligned to 64-bit but only with doubles as
destination type.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-02-17 06:50:22 +01:00
Timothy Arceri 172c48cc15 glsl: fix scons builds with shader cache
For now its disabled for scons so wrap glsl cache calls in a
define conditional.
2017-02-17 16:31:47 +11:00
Timothy Arceri a2bf0954fb util/disk_cache: fix typo in function stub 2017-02-17 15:54:00 +11:00
Jason Ekstrand b073811617 i965/fs: Remove hand-coded 64-bit packing optimizations
The optimization in unpack_64 is clearly subsumed with the opt_algebraic
optimizations in the previous commit.  The pack optimization may not be
quite handled by opt_algebraic but opt_algebraic should get the really
bad cases.  Also, it's been broken since it was merged and we've never
noticed so it must not be doing anything.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand 70e86a3f2d nir/algebraic: Optimize 64bit pack/unpack
This reduces the instruction count in some fp64 and int64 piglit tests

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand e10f522cd7 nir: Rename lower_double_pack to lower_64bit_pack
There's nothing "double" about it other than, perhaps, the fact that it
packs two 32-bit values.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Jason Ekstrand 161d3e81be nir: Combine the int and double [un]pack opcodes
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing.  There's no reason to have two versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00