Commit Graph

74545 Commits

Author SHA1 Message Date
Connor Abbott 18e73bf7f8 i965/fs: remove special case in setup_payload_interference()
regs_read() will handle LINTERP for us since the previous commit. In
addition, we were being too conservative, since it will only read 2
registers on SIMD8.

instructions in affected programs:     9061 -> 8893 (-1.85%)
helped:                                10
HURT:                                  0
GAINED:                                0
LOST:                                  0

All of the changes were due to spills being eliminated, mostly in KSP
shaders.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 10:10:51 -07:00
Jordan Justen c4a2217e79 i965/fs: Mark last used ip for all regs read in the payload
If a source register in the push constant registers uses more than one
register, then we wouldn't update payload_last_use_ip for subsequent
registers.

Unlike most uniform data pushed into registers, the CS gl_LocalInvocationID
data varies per execution channel. Therefore for SIMD16 mode, we have vec16
data in the payload. In this case we then need to mark 2 registers in
payload_last_use_ip as last used by the instruction. There's a similar
situation for the z and w coordinates of gl_FragCoord for fragment shaders,
where it had only happened to work before because of some bogus interferences
which the next commit removes.

(Connor: added bit about gl_FragCoord to commit message)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 10:10:48 -07:00
Connor Abbott 9f344b908a i965/fs: fix regs_read() for LINTERP
The second source always stays within the same SIMD8 register.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 10:10:39 -07:00
Connor Abbott eaf799ddff nir: add nir_foreach_instr_safe_reverse()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:49:53 -07:00
Connor Abbott 8eea091747 nir: add nir_instr_is_first() and nir_instr_is_last() helpers
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
2015-07-17 09:47:22 -07:00
Jordan Justen 01cdbba341 i965/cs: Use dispatch width of 8 for cs terminate payload setup
This prevents an assertion failure in brw_fs_live_variables.cpp,
fs_live_variables::setup_one_write: Assertion `var < num_vars' failed.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-07-16 21:37:24 -07:00
Jordan Justen 7e337859ff i965/cs: Return 1 for regs_read on CS_OPCODE_CS_TERMINATE
This prevents an assertion failure in brw_fs_live_variables.cpp,
fs_live_variables::setup_one_read: Assertion `var < num_vars' failed.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-07-16 21:37:07 -07:00
Kenneth Graunke 4b17f0d9f5 program: Allow redundant OPTION ARB_fog_* directives.
A fragment program from "Pixel Piracy" contains redundant OPTION
directives:

!!ARBfp1.0
OPTION ARB_precision_hint_fastest;
OPTION ARB_fog_exp2;
OPTION ARB_precision_hint_fastest;
OPTION ARB_fog_exp2;
...

We already allow redundant ARB_precision_hint_fastest directives, but
disallow the redundant (yet consistent) ARB_fog_exp2 directives, failing
to compile the program.

The specification seems to contradict itself - the main text says that
only one fog application option may be specified, but then backpedals,
indicating the intent is to disallow /contradictory/ flags.  One of the
issues suggests that specifying contradictory ones is stupid, but
allowed, and only the last one should take effect.

Accepting multiple redundant (but consistent) directives seems harmless,
and like a reasonable interpretation of the specification.  It also
fixes a fragment program found in the wild.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-07-16 20:26:43 -07:00
Ben Widawsky 3a31876600 i965: Push miptree tiling request into flags
With the last few patches a way was provided to influence lower layer miptree
layout and allocation decisions via flags (replacing bools). For simplicity, I
chose not to touch the tiling requests because the change was slightly less
mechanical than replacing the bools.

The goal is to organize the code so we can continue to add new parameters and
tiling types while minimizing risk to the existing code, and not having to
constantly add new function parameters.

v2: Rebased on Anuj's recent Yf/Ys changes
Fix non-msrt MCS allocation (was only happening in gen8 case before)

v3: small fix in assertion requested by Chad

v4: Use parens to get the order right from v3.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2015-07-16 17:02:35 -07:00
Ben Widawsky ef42352ff4 Revert "i965: Push miptree tiling request into flags"
This reverts commit 51e8d549e1.
2015-07-16 16:52:08 -07:00
Ben Widawsky 51e8d549e1 i965: Push miptree tiling request into flags
With the last few patches a way was provided to influence lower layer miptree
layout and allocation decisions via flags (replacing bools). For simplicity, I
chose not to touch the tiling requests because the change was slightly less
mechanical than replacing the bools.

The goal is to organize the code so we can continue to add new parameters and
tiling types while minimizing risk to the existing code, and not having to
constantly add new function parameters.

v2: Rebased on Anuj's recent Yf/Ys changes
Fix non-msrt MCS allocation (was only happening in gen8 case before)

v3: small fix in assertion requested by Chad

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v2)
Reviewed-by: Chad Versace <chad.versace@intel.com> (v2)
2015-07-16 13:28:33 -07:00
Connor Abbott b2cfd85060 nir/spirv: don't declare builtin blocks
They aren't used, and the backend was barfing on them. Also, remove a
hack in in compiler.cpp now that they're gone.
2015-07-16 11:04:22 -07:00
Connor Abbott b599735be4 nir/spirv: add support for loading UBO's
We directly emit ubo load intrinsics based off of the offset information
handed to us from SPIR-V.
2015-07-16 10:54:09 -07:00
Francisco Jerez 4bddd82bf3 i965/fs: Factor out universally broken calculation of the register component size.
This in principle simple calculation was being open-coded in a number
of places (in a series I haven't yet sent for review there will be a
couple more), all of them were subtly broken in one way or another:
None of them were handling the HW_REG case correctly as pointed out by
Connor, and fs_inst::regs_read() was handling the stride=0 case rather
naively.  This patch solves both problems and factors out the
calculation as a new fs_reg method.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-07-16 18:31:01 +03:00
Francisco Jerez b00cd6e4a0 i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator.
This gets rid of two no16() fall-backs and should allow better
scheduling of the generated IR.  There are no uses of usubBorrow() or
uaddCarry() in shader-db so no changes are expected.  However the
"arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and
"arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit
tests go from 40 to 28 instructions.  The reason is that the plain ADD
instruction can easily be CSE'ed with the original addition, and the
b2i negation can easily be propagated into the source modifier of
another instruction, so effectively both operations are performed with
just one instruction.

v2: Rely on carry_to_arith() and borrow_to_arith() to lower these
    (Ilia Mirkin).

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-07-16 18:29:32 +03:00
Francisco Jerez 3ee2daf23d i965: Implement b2f and b2i using negation.
Booleans are represented as 0/-1 on modern hardware which means we can
just negate them to convert them into a numeric type.  Negation has
the benefit that it can be implemented using a source modifier which
can easily be propagated into some other instruction.  shader-db
results on HSW:

total instructions in shared programs: 6349082 -> 6346693 (-0.04%)
instructions in affected programs:     40948 -> 38559 (-5.83%)
helped:                                123
HURT:                                  1
GAINED:                                1
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-07-16 18:29:32 +03:00
Marek Olšák 8fba933ca2 gallium: add interface for writable shader buffers
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-07-16 16:52:21 +02:00
Marek Olšák 05a12c53a3 gallium: add interface for writable shader images
PIPE_CAPs will be added some other time.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-07-16 16:52:20 +02:00
Marek Olšák b73bec0ecd gallium: add new limits for shader buffers and images
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-07-16 16:52:17 +02:00
Marek Olšák f9f79d29ce gallium: add BIND flags for R/W buffers and images
PIPE_CAPs and TGSI support will be added later. The TGSI support should be
straightforward. We only need to split TGSI_FILE_RESOURCE into TGSI_FILE_IMAGE
and TGSI_FILE_BUFFER, though duplicating all opcodes shouldn't be necessary.

The idea is:
* ARB_shader_image_load_store should use set_shader_images.
* ARB_shader_storage_buffer_object should use set_shader_buffers(slots 0..M-1)
  if M shader storage buffers are supported.
* ARB_shader_atomic_counters should use set_shader_buffers(slots M..N)
  if N-M+1 atomic counter buffers are supported.

PIPE_CAPs can describe various constraints for early DX11 hardware.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-07-16 16:52:02 +02:00
Marek Olšák 26222932c0 gallium: add PIPE_CAP_MAX_SHADER_PATCH_VARYINGS
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-07-16 16:09:20 +02:00
Francisco Jerez af768922ca i965/gen9: Use custom MOCS entries set up by the kernel.
Instead of relying on hardware defaults the i915 kernel driver is
going program custom MOCS tables system-wide on Gen9 hardware.  The
"WT" entry previously used for renderbuffers had a number of problems:
It disabled caching on eLLC, it used a reserved L3 cacheability
setting, and it used to override the PTE controls making renderbuffers
always WT on LLC regardless of the kernel's setting.  Instead use an
entry from the new MOCS tables with parameters: TC=LLC/eLLC, LeCC=PTE,
L3CC=WB.

The "WB" entry previously used for anything other than renderbuffers
has moved to a different index in the new MOCS tables but it should
have the same caching semantics as the old entry.

Even though the corresponding kernel change ("drm/i915: Added
Programming of the MOCS") is in a way an ABI break it doesn't seem
necessary to check that the kernel is recent enough because the change
should only affect Gen9 which is still unreleased hardware.

v2: Update MOCS values for the new Android-incompatible tables
    introduced in v7 of the kernel patch.

Cc: 10.6 <mesa-stable@lists.freedesktop.org>
Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-July/071080.html
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2015-07-16 13:48:20 +03:00
EdB 7e0180d57d clover: little OpenCL status code logging clean
s/build_error/compile_error in order to match the stored OpenCL status code.
Make program::build catch and log every OpenCL error.
Make tgsi error triggering uniform with the llvm one.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2015-07-16 13:48:20 +03:00
Renaud Gaubert 7b9ebf879b glsl: avoid compiler's segfault when processing operators with void arguments
This is done by returning an rvalue of type void in the
ast_function_expression::hir function instead of a void expression.

This produces (in the case of the ternary) an hir with a call
to the void returning function and an assignment of a void variable
which will be optimized out (the assignment) during the optimization
pass.

This fix results in having a valid subexpression in the many
different cases where the subexpressions are functions whose
return values are void.

Thus preventing to dereference NULL in the following cases:
  * binary operator
  * unary operators
  * ternary operator
  * comparison operators (except equal and nequal operator)

Equal and nequal had to be handled as a special case because
instead of segfaulting on a forbidden syntax it was now accepting
expressions with a void return value on either (or both) side of
the expression.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85252

Signed-off-by: Renaud Gaubert <renaud@lse.epita.fr>
Reviewed-by: Gabriel Laskar <gabriel@lse.epita.fr>
Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
2015-07-16 08:06:41 +02:00
Connor Abbott 513ee7fa48 nir/types: add more nir_type_is_xxx() wrappers 2015-07-15 21:58:32 -07:00
Roland Scheidegger 779cabfc7d r200: fix some potential big endian issues
The formats chosen (both by texture format choser, fbo storage allocation)
are different for big endian not just for rgba8 but also lower bit width
formats (why I don't actually know). Even the function to test for renderable
formats used different formats, however the actual colorbuffer setup did not.
And the blitter did not take that into account neither.
Untested (what could possibly go wrong...).
Same as for r100.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-16 03:55:59 +02:00
Roland Scheidegger d21320f625 radeon: fix some potential big endian issues
The formats chosen (both by texture format choser, fbo storage allocation)
are different for big endian not just for rgba8 but also lower bit width
formats (why I don't actually know). Even the function to test for renderable
formats used different formats, however the actual colorbuffer setup did not.
And the blitter did not take that into account neither.
Untested (what could possibly go wrong...).

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-16 03:54:53 +02:00
Roland Scheidegger 882476fea3 radeon/r200: mark state atoms as dirty after blits
Blit submits lots of packets which are usually handled by state atoms, so
these must be dirtied.
Not sure if this fixes anything, but it was a concern raised by bug 51658
(with this all issues there seen as actual bugs should be fixed, with the
exception of the patch to upload non-used texenv state atoms which I just
don't understand).

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-16 03:07:07 +02:00
Roland Scheidegger 26c1361ac3 r200: fix fbo rendering by disabling optimized texture format chooser
It is rather unfortunate that we don't know if a texture is going to be used
as a rt later, and we lack the means to do something about a format chosen
which we can't render to directly, so disable this and always chose renderable
format for rgba8 textures.
This addresses an issue raised on (old) bug,
https://bugs.freedesktop.org/show_bug.cgi?id=51658 with gnome-shell, don't
know if that's still applicable but it might fix other things as well.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-16 03:06:47 +02:00
Connor Abbott 9fa0989ff2 nir: move to two-level binding model for UBO's
The GLSL layer above is still hacky, so we're really just moving the
hack into GLSL-to-NIR. I'd rather not go all the way and make GLSL
support the Vulkan binding model too, since presumably we'll be
switching to SPIR-V exclusively, and so working on proper GLSL support
will be a waste of time. For now, doing this keeps it working as we add
SPIR-V->NIR support though.
2015-07-15 17:18:48 -07:00
Chad Versace 756d8064c1 vk/0.132: Do type-safety 2015-07-15 17:16:07 -07:00
Jason Ekstrand 927f54de68 vk/cmd_buffer: Move batch buffer padding to anv_batch_bo_finish() 2015-07-15 17:11:04 -07:00
Jason Ekstrand 9c0db9d349 vk/cmd_buffer: Rename bo_count to exec2_bo_count 2015-07-15 16:56:29 -07:00
Jason Ekstrand 6037b5d610 vk/cmd_buffer: Add a helper for allocating dynamic state
This matches what we do for surface state and makes the dynamic state pool
more opaque to things that need to get dynamic state.
2015-07-15 16:56:29 -07:00
Jason Ekstrand 7ccc8dd24a vk/private.h: Move cmd_buffer functions to near the cmd_buffer struct 2015-07-15 16:56:29 -07:00
Jason Ekstrand d22d5f25fc vk: Split command buffer state into its own structure
Everything else in anv_cmd_buffer is the actual guts of the datastructure.
2015-07-15 16:56:29 -07:00
Jason Ekstrand da4d9f6c7c vk: Move most of the anv_Cmd related stuff to its own file 2015-07-15 16:56:28 -07:00
Jason Ekstrand d862099198 vk: Pull the guts of anv_cmd_buffer into its own file 2015-07-15 16:56:28 -07:00
Chad Versace 498ae009d3 vk/glsl: Replace raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace 6f140e8af1 vk/meta: Remove raw casts
Needed for upcoming type-safety changes.
2015-07-15 15:51:37 -07:00
Chad Versace badbf0c94a vk/x11: Remove raw casts
The raw casts in the WSI functions will break the build when the
type-safety changes arrive.
2015-07-15 15:49:10 -07:00
Chad Versace 61a4bfe253 vk: Delete vkDbgSetObjectTag()
Because VkObject is going away.
2015-07-15 15:34:20 -07:00
Jason Ekstrand e1c78ebe53 vk/device: Remove unneeded checks for NULL 2015-07-15 15:22:32 -07:00
Jason Ekstrand f4748bff59 vk/device: Provide proper NULL handling in anv_device_free
The Vulkan spec does not specify that the free function provided to
CreateInstance must handle NULL properly so we do it in the wrapper.  If
this ever changes in the spec, we can delete the extra 2 lines.
2015-07-15 15:22:32 -07:00
Chad Versace 4c8e1e5888 vk: Stop internally calling anv_DestroyObject()
Replace each anv_DestroyObject() with anv_DestroyFoo().

Let vkDestroyObject() live for a while longer for Crucible's sake.
2015-07-15 15:11:16 -07:00
Chad Versace f5ad06eb78 vk: Fix vkDestroyObject dispatch for VkRenderPass
It called anv_device_free() instead of anv_DestroyRenderPass().
2015-07-15 15:07:41 -07:00
Chad Versace 188f2328de vk: Fix vkCreate/DestroyRenderPass
While updating vkDestroyObject, I discovered that vkDestroyPass reliably
crashes. That hasn't been an issue yet, though, because it is never
called.

In vkCreateRenderPass:
    - Don't allocate empty attachment arrays.
    - Ensure that pointers to empty attachment arrays are NULL.
    - Store VkRenderPassCreateInfo::subpassCount as
      anv_render_pass::subpass_count.

In vkDestroyRenderPass:
    - Fix loop bounds: s/attachment_count/subpass_count/
    - Don't call anv_device_free on null pointers.
2015-07-15 15:07:41 -07:00
Anuj Phogat 642f289824 i965: Fix 32 bit build warnings in intel_get_yf_ys_bo_size()
Along with fixing the type of pitch parameter, patch also changes
the types of few local variables and function return type.

Warnings fixed are:
intel_mipmap_tree.c:671:7: warning: passing argument 3 of
'intel_get_yf_ys_bo_size' from incompatible pointer type

intel_mipmap_tree.c:563:1: note: expected 'uint64_t *' but
argument is of type 'long unsigned int *'

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-07-15 15:02:02 -07:00
Chad Versace c6270e8044 vk: Refactor create/destroy code for anv_descriptor_set
Define two new functions:
    anv_descriptor_set_create
    anv_descriptor_set_destroy
2015-07-15 14:31:22 -07:00
Chad Versace 365d80a91e vk: Replace some raw casts with safe casts
That is, replace some instances of
    (VkFoo) foo
with
    anv_foo_to_handle(foo)
2015-07-15 14:00:21 -07:00