Commit Graph

68144 Commits

Author SHA1 Message Date
Matt Turner bb33a31c38 i965/fs: Add algebraic optimizations for MAD.
total instructions in shared programs: 5764176 -> 5763808 (-0.01%)
instructions in affected programs:     25121 -> 24753 (-1.46%)
helped:                                164
HURT:                                  2

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 8cfd1e2ac6 i965/fs: Emit MAD instructions when possible.
Previously we didn't emit MAD instructions since they cannot take
immediate arguments, but with the opt_combine_constants() pass we can
handle this properly.

total instructions in shared programs: 5920017 -> 5733278 (-3.15%)
instructions in affected programs:     3625153 -> 3438414 (-5.15%)
helped:                                22017
HURT:                                  870
GAINED:                                91
LOST:                                  49

Without constant pooling, this patch is a complete loss:

total instructions in shared programs: 5912589 -> 5987888 (1.27%)
instructions in affected programs:     3190050 -> 3265349 (2.36%)
helped:                                1564
HURT:                                  17827
GAINED:                                27
LOST:                                  101

And since the constant pooling patch by itself hurt a bunch of things,
from before constant pooling to this patch the results are:

total instructions in shared programs: 5895414 -> 5747946 (-2.50%)
instructions in affected programs:     3617993 -> 3470525 (-4.08%)
helped:                                20478
HURT:                                  4469
GAINED:                                54
LOST:                                  146

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 36bc5f06dd i965/fs: Allow immediates in MAD and LRP instructions.
And then the opt_combine_constants() pass will pull them out into
registers. This will allow us to do some algebraic optimizations on MAD
and LRP.

total instructions in shared programs: 5946656 -> 5931320 (-0.26%)
instructions in affected programs:     778247 -> 762911 (-1.97%)
helped:                                3780
HURT:                                  6
GAINED:                                12
LOST:                                  12

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 2dad1e3abd i965/fs: Add pass to combine immediates.
total instructions in shared programs: 5885407 -> 5940958 (0.94%)
instructions in affected programs:     3617311 -> 3672862 (1.54%)
helped:                                3
HURT:                                  23556
GAINED:                                31
LOST:                                  165

... but will allow us to always emit MAD instructions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 0d8f27eab7 i965/fs: Remove force_writemask_all assertion for execsize < 8.
This doesn't seem to be necessary.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 662c645318 i965/cfg: Add function to generate a dot file of the dominator tree.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner b06eef05d0 i965/cfg: Add function to generate a dot file of the CFG.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 0e3dbc0248 i965/cfg: Calculate the immediate dominators.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 08f304bb3b i965/cfg: Allow cfg::dump to be called without a visitor.
The fs_visitor's dump_instruction() implementation calls cfg_t()
indirectly through calculate_live_intervals, so if you have an infinite
loop in the CFG code, you can't call cfg::dump(fs_visitor *) to debug
it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Matt Turner 1af5c4a526 i965: Allow exec_list sentinels as arguments to insert functions.
To insert an instruction at the end of a basic block, we typically do
something like

   inst = block->last_non_control_flow_inst();
   inst->insert_after(block, new_inst);

But blocks can consist of a single control flow instruction, so inst
will actually be the exec_list's head sentinel. We shouldn't use it as
if it were a regular instruction, but it is safe to insert something after
it.

This patch avoids assert-failing because an exec_list sentinel wasn't in
the basic block's instruction list.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-17 20:44:09 -08:00
Alan Coopersmith b7ce7c00e3 Make _mesa_swizzle_and_convert argument types in .c match those in .h
Caused Solaris Studio compilers to fail to build with errors about
incompatible function redefinitions.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith 4671dca0ee Use __typeof instead of typeof with Solaris Studio compilers
While the C compiler accepts typeof, C++ requires __typeof.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86944
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith d602fbd861 Avoid fighting with Solaris headers over isnormal()
When compiling in C99 or C++11 modes, Solaris defines isnormal() as
a macro via <math.h>, which causes the function definition to become
too mangled to compile.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith 815b3bd096 Remove extraneous ; after DECL_TYPE usage
The macro is defined to provide a trailing ; so this caused the expansion
to end in ";;" which made the Solaris Studio compilers issue warnings for
every line of:
  "builtin_type_macros.h", line 113: Warning: extra ";" ignored.
for every file that included the header, filling build logs with thousands
of useless warnings.

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 18:16:33 -08:00
Alan Coopersmith 60ad5103b9 Bracket arguments to tr so they work with Solaris tr
https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Limitations-of-Usual-Tools.html#index-g_t_0040command_007btr_007d-1842

Without this fix, egl fails to build on Solaris, with the error:

<command-line>:0:22: error: '_EGL_PLATFORM_x11' undeclared (first use in this function)
egldisplay.c:207:31: note: in expansion of macro '_EGL_NATIVE_PLATFORM'
             native_platform = _EGL_NATIVE_PLATFORM;
                               ^

Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-17 18:16:32 -08:00
Kenneth Graunke 76960a55e6 glsl: Reduce memory consumption of copy propagation passes.
opt_copy_propagation and opt_copy_propagation_elements create new ACP
and Kill sets each time they enter a new control flow block.  For if
blocks, they also copy the entire existing ACP set contents into the
new set.

When we exit the control flow block, we discard the new sets.  However,
we weren't freeing them - so they lived on until the pass finished.
This can waste a lot of memory (57MB on one pessimal shader).

This patch makes the pass allocate ACP entries using this->acp as the
memory context, and Kill entries out of this->kill.  It also steals
kill entries when moving them from the inner kill list to the parent.

It then frees the lists, including their contents.

v2: Move ralloc_free(this->acp) just before this->acp = orig_acp
    (suggested by Eric Anholt).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.5 10.4" <mesa-stable@lists.freedesktop.org>
2015-02-17 17:33:27 -08:00
Chris Forbes eda3dd0076 i965: Add device limits for tess threads & URB entries
This should cover all platforms prior to Skylake.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
2015-02-17 17:33:27 -08:00
Dave Airlie e8e4437ed0 r600g/sb: treat undefined values like constants
When we schedule an instructions with undefined value, we
eventually will use 0, which is a constant, however sb wasn't
taking this into account and creating ops with illegal scalar
swizzles.

this replaces my fix for op3 in t slots.

Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-18 11:13:06 +10:00
Kenneth Graunke 598d144cef i915c: Use the actual MIN instruction.
Matt Turner noticed that the hardware has always had a MIN
instruction, but the driver always used MAX+MOV for no
apparent reason.

This should cut an instruction, and a temporary, allowing
more programs to run in hardware.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke 7bf774034a i915g: Use the actual MIN instruction.
Matt Turner noticed that the hardware has always had a MIN
instruction, but the driver always used MAX+MOV for no
apparent reason.

This should cut an instruction, and a temporary, allowing
more programs to run in hardware.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke 27b6ef7eca i965: Add a function to disassemble an instruction from the 4 dwords.
I used this a while back when debugging GPU hangs, and it seems like it
could be useful, so I figured I'd add it so people can use it in the
debugger.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-17 15:24:15 -08:00
Kenneth Graunke 0b499abb51 i965: Do Sandybridge workaround flushes before each primitive.
Sandybridge requires the post-sync non-zero workaround in a ton of
places, and if you ever miss one, the GPU usually hangs.

Currently, we try to track exactly when a workaround flush is
necessary (via the brw->batch.need_workaround_flush flag).  This is
tricky to get right, and we've botched it several times in the past.

This patch unconditionally performs the post-sync non-zero flush at the
start of each primitive's state upload (including BLORP).  We drop the
needs_workaround_flush flag, and drop all the other callers, as the
flush has already been performed.

We have no data to indicate that simply flushing all the time will
hurt performance, and it has the potential to help stability.

v2: Add post-sync workaround to initial GPU state upload to be extra
    cautious (suggested by Chad Versace).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2015-02-17 15:24:14 -08:00
Laura Ekstrand 92163482bd main: Fixed _mesa_GetCompressedTexImage_sw to copy slices correctly.
Previously array textures were not working with GetCompressedTextureImage,
leading to failures in the test
arb_direct_state_access/getcompressedtextureimage.c.

Tested-by: Laura Ekstrand <laura@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>

Cc: "10.4, 10.5" <mesa-stable@lists.freedesktop.org>
2015-02-17 13:45:48 -08:00
Ian Romanick 4470bf1f49 i965/vec4: Silence unused parameter warnings
brw_vec4_copy_propagation.cpp:243:59: warning: unused parameter 'reg' [-Wunused-parameter]
                    int arg, struct copy_entry *entry, int reg)
                                                           ^

brw_vec4_generator.cpp:869:57: warning: unused parameter 'inst' [-Wunused-parameter]
 vec4_generator::generate_unpack_flags(vec4_instruction *inst,
                                                         ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick 2524f9b80d mesa/main: Silence unused parameter warning
Just remove the _mesa_free_lighting_data function.  The body has been
empty since the shine table was moved into the tnl module (commit
ba1d921).

main/light.c:1216:46: warning: unused parameter 'ctx' [-Wunused-parameter]
 _mesa_free_lighting_data( struct gl_context *ctx )
                                              ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick 1424bbfb57 util/hash: Silence comparison between signed and unsigned integer warnings in tests
delete_management.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
delete_management.c:69:27: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = size - 100; i < size; i++) {
                           ^
delete_management.c:79:31: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       assert(key_value(entry->key) >= size - 100 &&
                               ^
delete_management.c:79:70: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       assert(key_value(entry->key) >= size - 100 &&
                                                                      ^
insert_many.c:56:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
insert_many.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^
insert_many.c:67:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    assert(ht->entries == size);
                  ^
random_entry.c:62:18: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    for (i = 0; i < size; i++) {
                  ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick 3d8f9570cd util/hash: Silence unused parameter warnings in tests
delete_and_lookup.c:37:21: warning: unused parameter ‘key’ [-Wunused-parameter]
 badhash(const void *key)
                     ^
delete_and_lookup.c:43:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
delete_and_lookup.c:43:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
collision.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
collision.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
destroy_callback.c:50:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
destroy_callback.c:50:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
insert_many.c:46:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
insert_many.c:46:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
insert_and_lookup.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
insert_and_lookup.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
null_destroy.c:32:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
null_destroy.c:32:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
random_entry.c:52:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
random_entry.c:52:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
remove_null.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
remove_null.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^
replacement.c:34:10: warning: unused parameter ‘argc’ [-Wunused-parameter]
 main(int argc, char **argv)
          ^
replacement.c:34:23: warning: unused parameter ‘argv’ [-Wunused-parameter]
 main(int argc, char **argv)
                       ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Ian Romanick 147afac80c glcpp: Silence GCC warning
glcpp/glcpp.c:124:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
 const static struct option
 ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-17 12:29:58 -08:00
Marek Olšák 2ead74888a radeonsi: fix a crash if a stencil ref state is set before a DSA state
+ minor indentation fixes

Discovered by Axel Davy.

This can't be reproduced with any app, because all state trackers set a DSA
state first.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
2015-02-17 17:41:00 +01:00
Marek Olšák 7713d594e4 r600g,radeonsi: implement GL_AMD_pinned_memory
v2: update release notes

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák c688988b0d winsys/radeon: test the userptr ioctl to see if it's present
There is no other way to check for support.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 064847122a winsys/radeon: allow unaligned size for user-memory buffers
This is not required, but being user-friendly doesn't hurt.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák e8d727a2b6 winsys/radeon: allow mapping a user buffer
OpenGL requires this.

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 8b587ee701 gallium: add interface and state tracker support for GL_AMD_pinned_memory
v2: add alignment restrictions to docs, fix indentation in headers

Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 11ebb03c26 mesa: implement GL_AMD_pinned_memory
It's not possible to query the current buffer binding, because the extension
doesn't define GL_..._BUFFER__BINDING_AMD.

Drivers should check the target parameter of Drivers.BufferData. If it's
equal to GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, the memory should be pinned.
That's all there is to it.

A piglit test is on the piglit mailing list.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-02-17 17:31:48 +01:00
Christian König 4fa61b1a23 winsys/radeon: add user pointer support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák e8625a29fe mesa: fix AtomicBuffer typo in _mesa_DeleteBuffers
Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 218b15715e radeonsi: initialize TC_L2_dirty to false after buffer allocation
I forgot to do this, though "true" should have no effect on correctness.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák a27b74819a radeonsi: small fix in SPI state
Cc: 10.5 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák 5f1cef76f9 r600g,radeonsi: use fences to implement PIPE_QUERY_GPU_FINISHED
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89014

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Marek Olšák f1103f6a1e r600g,radeonsi: demote TIMESTAMP_DISJOINT query to be a software query
The query result is always constant.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-17 17:31:48 +01:00
Dave Airlie 59292b38eb st/glsl_to_tgsi: fix whitespace
everytime I open this file in emacs with show trailing whitespace
or git add from it my screen flares with red.

Just do a general cleanup, makes working on fp64 support not as
jarring.

I'm not saying this is perfect, its just better than before.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-17 14:49:19 +10:00
Ilia Mirkin b53fbec01d glsl/tests: add IMAGE type.
This fixes a warning when running make check.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-02-17 11:26:06 +10:00
Chia-I Wu faaf13f6bf ilo: always set up BLEND_STATE on Gen8
There is now an DW0 that seems to be always referenced.
2015-02-17 04:59:33 +08:00
Chia-I Wu 6d4475d7bf ilo: fix alpha test on Gen8
Shoudl use GEN8_BLEND_DW0_ALPHA_TEST_ENABLE instead of
GEN6_RT_DW1_ALPHA_TEST_ENABLE (and others).
2015-02-17 04:59:33 +08:00
Ben Widawsky d9cd982d55 i965/simd8vs: Fix SIMD8 atomics
The short version: we need to set bits in R0.7 which provide a mask to be used
for PS kill samples/pixels. Since the VS has no such concept, we just need to
set all 1.

The longer version...
Execution for SIMD8 atomics is defined as follows:
SIMD8: The low 8 bits of the execution mask are ANDed with 8 bits of the
Pixel/Sample Mask from the message header. For the typed messages, the Slot
Group in the message descriptor selects either the low or high 8 bits. For the
untyped messages, the low 8 bits are always selected. The resulting mask is used
to determine which slots are read into the destination GRF register (for read),
or which slots are written to the surface (for write). If the header is not
present, only the low 8 bits of the execution mask are used.

The message header for untyped messages is defined in R0.7 "This field contains
the 16-bit pixel/sample mask to be used for SIMD16 and SIMD8 messages. All 16
bits are used for SIMD16 messages.  For typed SIMD8 messages, Slot Group selects
which 8 bits of this field are used. For untyped SIMD8 messages, the low 8 bits
of this field are used." Furthermore, "The message header for the untyped
messages only needs to be delivered for pixel shader threads, where the
execution mask may indicate pixels/samples that are enabled only due to
derivative (LOD) calculations, but the corresponding slot on the surface must
not be accessed." We're not using a pixel shader here, but AFAICT, this mask is
used for all stages.

This leaves two options, Remove the header, or make the VS code emit the correct
thing for the header. I believe one of the goals of using SIMD8 VS was to get as
much code reuse as possible, and so I chose the latter. Since the VS has no such
thing as kill instructions, the mask is derived simple as all 1's.

v2:
Add a comment to the code (stolen from Curro on the mailing list)
Change the control flow style (Curro + Jason)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87258
Cc: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-16 12:22:44 -08:00
Brian Paul 9ac3700146 mesa: move assertion after declarations in texstore.c
To fix MSVC build.
2015-02-16 08:39:25 -07:00
Brian Paul 4d2cee4d5e mesa: silence uninitialized var warning in get_tex_rgba_uncompressed()
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-16 08:33:28 -07:00
Neil Roberts bb77745681 meta: Fix saving the results of the current occlusion query
When restoring the current state in _mesa_meta_end it was previously trying to
copy the on-going sample count of the current occlusion query into the new
query after restarting it so that the driver will continue adding to the
previous value. This wouldn't work for two reasons. Firstly, the query might
not be ready yet so the Result member will usually be zero. Secondly the saved
query is stored as a pointer to the query object, not a copy of the struct, so
it is actually restarting the exact same object. Copying the result value is
just copying between identical addresses with no effect. The call to
_mesa_BeginQuery will have always reset it back to zero.

This patch fixes it by making it actually wait for the query object to be
ready before grabbing the previous result. The downside of doing this is that
it could introduce a stall but I think this situation is unlikely so it might
not matter too much. A better solution might be to introduce a real
suspend/resume mechanism to the driver interface. This could be implemented in
the i965 driver by saving the depth count multiple times like it does in the
i945 driver.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88248
Reviewed-by: Carl Worth <cworth@cworth.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
2015-02-16 12:09:17 +00:00
Francisco Jerez 946e29847b i965/vec4: Override destination register writemask in sampler message send.
This line was removed by accident in commit
16b9112574 causing a regression in the
ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_vert Khronos conformance
test.  It's necessary because the swizzle_result() code below expects
all four components of the vector to be valid.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89094
Tested-by: Lu Hua <huax.lu@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-16 13:51:08 +02:00