67815 Commits

Author SHA1 Message Date
Neil Roberts
af8fd694d4 dir-locals.el: Don't set variables for non-programming modes
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.

v2: Apply to all the .dir-locals.el files, not just the one in the
    root directory.

Acked-by: Michel Dänzer <michel.daenzer@amd.com>
2015-02-02 12:02:55 +00:00
Iago Toral Quiroga
68155e5a36 i965: Fix intel_miptree_copy_teximage for GL_TEXTURE_1D_ARRAY
For GL_TEXTURE_1D_ARRAY targets we store the depth of the array
in the Height field and leave Depth=1 in the underlying texture
object. When we call intel_miptree_copy_teximage in the process
of re-creating a miptree (possibly because the number of miplevels
has changed) we didn't account for this, so we were only copying
texture images for the first slice.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-02 09:29:18 +01:00
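
A minimal sketch of the slice-count issue described in 68155e5a36, assuming a simplified image descriptor. The enum, the struct, and `slices_to_copy` are hypothetical names used only for illustration; they are not the driver's actual types or the code in intel_miptree_copy_teximage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GL texture targets involved. */
enum tex_target { TEX_1D_ARRAY, TEX_2D_ARRAY, TEX_2D };

/* Hypothetical, simplified view of a texture image's dimensions. */
struct tex_image {
    enum tex_target target;
    unsigned width, height, depth;
};

/* Number of slices that have to be copied when the miptree is re-created. */
static unsigned slices_to_copy(const struct tex_image *img)
{
    assert(img != NULL);
    /* For 1D arrays the layer count is stored in height and depth stays 1,
     * so looking only at depth would copy just the first slice. */
    if (img->target == TEX_1D_ARRAY)
        return img->height;
    return img->depth;
}
```

With a 1D array of six layers (height 6, depth 1) the helper reports six slices, whereas a depth-only check would report one.
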
Eric Anholt
753c327151 vc4: Kill a bunch of color write calculation when colormask is all off.
I could have done this in the bit that generates the ANDs and ORs, but
it's probably generally useful.  Sadly, I still need this even if I move
to NIR, because I can't yet express my read of the destination color in
NIR, which I would need to move my blend/logicop/colormask handling into
NIR.

total uniforms in shared programs: 13497 -> 13455 (-0.31%)
uniforms in affected programs:     101 -> 59 (-41.58%)
total instructions in shared programs: 40797 -> 40296 (-1.23%)
instructions in affected programs:     1639 -> 1138 (-30.57%)
2015-02-01 16:07:24 -08:00
Fredrik Höglund
0508032413 docs: Update ARB_direct_state_access
Mark vertex array objects as started.
2015-02-01 23:00:42 +01:00
Martin Peres
9272022353 doc: break down ARB_direct_state_access in GL3.txt
A student was wondering what was going on + I started working on it too.

CC: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Martin Peres <martin.peres@linux.intel.com>
Reviewed-by: Laura Ekstrand <laura@jlekstrand.net>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
2015-02-01 22:50:35 +01:00
Eric Anholt
12ebd7e20e vc4: Dump the VPM read index in QIR disasm.
Since the VPM reads have to be in order, it's useful to see their indices
in the dump.
2015-02-01 12:53:08 -08:00
Jason Ekstrand
6094619c02 i965/pixel_read: Don't try to do a tiled_memcpy from a multisampled buffer
The GL spec guarantees that glGetTexImage will never get a multisampled
texture, but this is not true for glReadPixels.  If we get a multisampled
buffer, we have to do a multisample resolve on it before we can pull the
data down for the user.  Since this isn't practical to handle in
tiled_memcpy, we just fall back to the other paths that can handle this.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-31 08:54:32 -08:00
Francisco Jerez
11f5d8a5d4 i965: Enable L3 caching of buffer surfaces.
And remove the mocs argument of the emit_buffer_surface_state vtbl hook.  Its
semantics vary greatly from one generation to another, so it kind of
encourages the caller to pass 0 which is the only valid setting across
generations.  After this commit the hardware-specific code decides what the
best cacheability settings are for buffer surfaces, just like we do for
textures.

This together with some additional changes coming is expected to improve
performance of pull constants, buffer textures, atomic counters and image
objects on Gen7 and up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-31 17:01:49 +02:00
José Fonseca
11a955aef4 egl: Pass the correct X visual depth to xcb_put_image().
The dri2_x11_add_configs_for_visuals() function happily matches a 32-bit
EGLConfig with a 24-bit X visual.  However, it was passing a 32-bit
depth to xcb_put_image(), making the X server unhappy:

  https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911

Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2015-01-31 09:14:36 +00:00
Jason Ekstrand
5c31184cf5 intel/pixel_read: Properly flip the results for window system buffers
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-30 18:56:56 -08:00
Jason Ekstrand
837a4c42a6 i965/tiled_memcpy: Support a signed linear pitch
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-01-30 18:56:56 -08:00
Jason Ekstrand
7cc3bb2318 main: Add STENCIL_INDEX formats to base_tex_format
This fixes a bug on BDW when our meta-based stencil blit path assert-fails
due to an invalid internal format even though we do support the
ARB_stencil_texturing extension.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 15:49:45 -08:00
Jason Ekstrand
16875bc5cd teximage: Don't indent switch cases
No functional change.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 15:49:45 -08:00
Brian Paul
b930ef1ce8 mesa: remove some dead display list code
The size of a Node is always four bytes so no need for the old code
that was used when sizeof(Node)==8.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:18 -07:00
Brian Paul
20bc72b791 mesa: remove stale comment in dlist.c code
sizeof(Node) is always 4 bytes.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:18 -07:00
Brian Paul
613974b774 mesa: s/union gl_dlist_node/Node/ in dlist.c code
Just minor clean-up.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 13:27:17 -07:00
Brian Paul
53b01938ed mesa: fix display list 8-byte alignment issue
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-01-30 08:48:19 -07:00
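
The padding idea in 53b01938ed can be sketched as follows. This assumes that each instruction starts with a one-node opcode header followed by its payload, and that the block holding the nodes starts on an 8-byte boundary; both are assumptions made for the example, and `nops_needed` is a hypothetical helper, not the body of _mesa_dlist_alloc_aligned().

```c
#include <assert.h>
#include <stddef.h>

#define NODE_SIZE 4u   /* the log above states that sizeof(Node) is 4 bytes */

/*
 * next_offset is the byte offset, from an 8-byte aligned block start, at
 * which the next instruction would begin. Returns how many 4-byte NOP
 * nodes to emit first so that the payload following the opcode header
 * lands on an 8-byte boundary.
 */
static unsigned nops_needed(size_t next_offset)
{
    assert(next_offset % NODE_SIZE == 0);
    size_t payload_offset = next_offset + NODE_SIZE;  /* skip opcode header */
    return (payload_offset % 8u) ? 1u : 0u;
}
```

At offset 0 the payload would start at byte 4, so one NOP is inserted; at offset 4 the payload already starts at byte 8 and no padding is required.
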
José Fonseca
fbc3e030e6 util/u_atomic: Provide a _InterlockedCompareExchange8 for older MSVC.
Fixes build with Windows SDK 7.0.7600.

Tested with u_atomic_test, both on x86 and x86_64.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-30 15:24:34 +00:00
José Fonseca
d7f2dfb67e util/u_atomic: Use _Interlocked* intrinsics for non 64bits.
The intrinsics are universally available, whereas older Windows SDKs (e.g.
7.0.7600) don't have the non-intrinsic entrypoint.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2015-01-30 15:24:33 +00:00
Neil Roberts
a7eec6d620 i965/skl: Force a BINDING_TABLE_POINTER_* after push constant command
According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take
effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_*
command. This patch just makes it set the BRW_NEW_SURFACES state when
uploading the push constants to ensure the binding tables will be
updated.

This fixes the fbo-blending-formats Piglit test and possibly others.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-30 12:25:13 +00:00
Topi Pohjolainen
083fb215e1 meta: Don't write depth when decompressing tex-images
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:59:13 +02:00
Topi Pohjolainen
c49c750579 meta: Don't write depth when generating miptrees
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:59:04 +02:00
Topi Pohjolainen
941aced635 meta/blit: Compile programs with and without depth
When color buffers alone are concerned the depth is not needed.

No regression on BDW where meta blit is used instead of blorp. I
also disabled blorp temporarily for fbo-blits on IVB and saw no
regressions there either.
I also compared several graphics benchmarks on BDW and saw neither
regressions nor improvements.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:58:32 +02:00
Topi Pohjolainen
97caf5fa04 meta/blit: Write depth only when asked for
Implementing an idea from Ken, on i965 the shader program for 2D
blits becomes significantly simpler.

Before:

pln(8)   g6<1>F    g4<0,1,0>F    g2<8,8,1>F  { align1 1Q compacted };
pln(8)   g7<1>F    g4.4<0,1,0>F  g2<8,8,1>F  { align1 1Q compacted };
send(8)  g2<1>UW   g6<8,8,1>F
         sampler (1, 0, 0, 1) mlen 2 rlen 4  { align1 1Q };
mov(8)   g123<1>F  g2<8,8,1>F                { align1 1Q compacted };
mov(8)   g124<1>F  g3<8,8,1>F                { align1 1Q compacted };
mov(8)   g125<1>F  g4<8,8,1>F                { align1 1Q compacted };
mov(8)   g126<1>F  g5<8,8,1>F                { align1 1Q compacted };
mov(8)   g127<1>F  g2<8,8,1>F                { align1 1Q compacted };
nop                                                             ;
sendc(8) null        g123<8,8,1>F
    render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };

After:

pln(8)   g6<1>F     g4<0,1,0>F    g2<8,8,1>F   { align1 1Q compacted };
pln(8)   g7<1>F     g4.4<0,1,0>F  g2<8,8,1>F   { align1 1Q compacted };
send(8)  g124<1>UW  g6<8,8,1>F
         sampler (1, 0, 0, 1) mlen 2 rlen 4    { align1 1Q };
sendc(8) null        g124<8,8,1>F
   render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };

v2 (Matt): Removed unintended white-space change

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:57:51 +02:00
Topi Pohjolainen
4c157d34c0 meta/blit: Add plumbing for shaders without depth
Currently all blit programs are unconditionally compiled with
gl_FragDepth.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-30 09:54:53 +02:00
Jason Ekstrand
604ae33c8b nir/opt_algebraic: Add some constant bcsel reductions
total instructions in shared programs: 5998190 -> 5997603 (-0.01%)
instructions in affected programs:     54276 -> 53689 (-1.08%)
helped:                                293

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:11:13 -08:00
Jason Ekstrand
7f19cd5a56 nir/opt_algebraic: Add some boolean simplifications
total instructions in shared programs: 5998321 -> 5998287 (-0.00%)
instructions in affected programs:     4520 -> 4486 (-0.75%)
helped:                                8

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:11:10 -08:00
Jason Ekstrand
70273c5cd5 nir/algebraic: Support specifying variable as constant or by type
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
81f77e4f3a nir/algebraic: Fail to compile if a variable is used in a replace but not the search
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Jason Ekstrand
026b5cc792 nir/search: Allow for matching variables based on types
This allows you to match on an unknown value but only if it is of a given
type.  90% of the uses of this are for matching only booleans, but adding
the generality of arbitrary types is no more complex.

nir_algebraic.py doesn't handle this yet but that's ok because the C
language will ensure that the default type on all variables is void.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
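
A rough sketch of the type-constrained matching described in 026b5cc792. All names here are hypothetical and the real nir_search structures are not shown in this log; the point is only that a zero-valued type acts as "no constraint", so variables that never set the field keep the old behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type tags; 0 means "unconstrained", so a zero-initialised
 * variable matches a value of any type. */
enum val_type { TYPE_ANY = 0, TYPE_FLOAT, TYPE_INT, TYPE_BOOL };

/* Hypothetical search variable: an index plus an optional type constraint. */
struct search_variable {
    unsigned index;
    enum val_type type;
};

/* Hypothetical value being matched against the pattern. */
struct value {
    enum val_type type;
};

static bool variable_matches(const struct search_variable *var,
                             const struct value *v)
{
    assert(var != NULL && v != NULL);
    return var->type == TYPE_ANY || var->type == v->type;
}
```

A variable declared as `{ .index = 0 }` leaves `type` at zero and matches anything, while `{ .index = 1, .type = TYPE_BOOL }` matches only boolean values.
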
Jason Ekstrand
d8999bcdce nir/search: Add support for matching unknown constants
There are some algebraic transformations that we want to do but only if
certain things are constants.  For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.

nir_algebraic.py doesn't handle this yet, but that's ok because the C
language will make sure that false is the default for now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
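
The rewrite condition from d8999bcdce can be illustrated with plain integers. The `operand` struct and the three functions are hypothetical and only model the arithmetic; they do not reflect how nir_search represents expressions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: a value plus a flag marking compile-time constants. */
struct operand {
    int value;
    bool is_const;
};

/* The condition from the commit message: a and either b or c is constant. */
static bool should_distribute(struct operand a, struct operand b,
                              struct operand c)
{
    return a.is_const && (b.is_const || c.is_const);
}

/* Original form a * (b + c): one add and one multiply, nothing to fold. */
static int eval_original(struct operand a, struct operand b, struct operand c)
{
    return a.value * (b.value + c.value);
}

/* Distributed form (a * b) + (a * c): whichever product has two constant
 * factors can be folded ahead of time. */
static int eval_distributed(struct operand a, struct operand b,
                            struct operand c)
{
    assert(should_distribute(a, b, c));
    int ab = a.value * b.value;
    int ac = a.value * c.value;
    return ab + ac;
}
```

For a = 3 (constant), b = x and c = 4 (constant), the distributed form is 3 * x + 12: the product 3 * 4 folds to a constant, so the extra multiply does not survive to run time.
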
Jason Ekstrand
5ab1489ae6 nir: Add an invalid type
This allows us to indicate a concept of an invalid type.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-01-29 17:07:45 -08:00
Roland Scheidegger
f01e8d3ba5 gallium/docs: fix docs wrt ARL/ARR/FLR
Since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-01-29 22:08:12 +01:00
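
A small sketch of the distinction drawn in f01e8d3ba5: FLR keeps a float result, while ARL and ARR also convert to integer. The function names are hypothetical, ARR is assumed to round to the nearest integer (the tie-breaking used here is an assumption), and the floor helper is hand-written only to keep the example free of libm.

```c
#include <assert.h>

/* Floor for values that fit in an int, written without libm. */
static int floor_to_int(float x)
{
    assert(x > -2147483648.0f && x < 2147483648.0f);
    int i = (int)x;                      /* the cast truncates toward zero */
    return ((float)i > x) ? i - 1 : i;   /* step down for negative fractions */
}

/* FLR: floor, result stays a float. */
static float flr(float x)
{
    return (float)floor_to_int(x);
}

/* ARL: floor followed by an implicit float-to-int conversion. */
static int arl(float x)
{
    return floor_to_int(x);
}

/* ARR: round to nearest, then convert; ties round up in this sketch. */
static int arr(float x)
{
    return floor_to_int(x + 0.5f);
}
```

For an input of 1.6, `flr` yields the float 1.0, `arl` yields the integer 1, and `arr` yields the integer 2.
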
Eric Anholt
fc884eadf1 nir: Add variants of some of the comparison simplifications.
We end up with these from TGSI-to-NIR because the pass generating the
comparisons doesn't know if the arg is actually a bool input or not.  vc4
results:

total instructions in shared programs: 41801 -> 41508 (-0.70%)
instructions in affected programs:     4253 -> 3960 (-6.89%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-01-29 11:44:06 -08:00
Eric Anholt
2b9c3bace7 vc4: Fix point size handling when it's the first output.
2015-01-29 11:43:33 -08:00
Eric Anholt
9a3a60cb13 nir: Don't try to to-SSA ALU instructions that are already SSA.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:43:33 -08:00
Eric Anholt
68d476167c nir: Fix a bit of broken indentation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:42:08 -08:00
Eric Anholt
36c604c824 nir: Add a couple of helpers for glsl types.
This will be used by tgsi_to_nir, which needs to get vec4 types for
declaring shader input/output variables.

v2: Add a missing space.

Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-01-29 11:41:17 -08:00
Emil Velikov
765cfe9a90 docs: fix mesa 10.4.3 release date
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2015-01-29 14:02:48 +00:00
Kalyan Kondapally
e638841b87 Mesa: Advertise GL_OES_texture_*float* extensions support with i965.
This patch advertises support for GL_OES_texture_*float* extensions
when using i965 drivers.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:22:12 +02:00
Kalyan Kondapally
2c2a92d5b8 Mesa: Add support for HALF_FLOAT_OES type.
This patch adds needed support for accepting HALF_FLOAT_OES as valid type
for TexImage*D and TexSubImage*D when Texture Float extensions are supported.

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:21:41 +02:00
Kalyan Kondapally
a63c8a524b Mesa: Add support for GL_OES_texture_*float* extensions.
This patch series adds support for following GLES2 Texture Float extensions:
1)GL_OES_texture_float,
2)GL_OES_texture_half_float,
3)GL_OES_texture_float_linear,
4)GL_OES_texture_half_float_linear.

This patch adds basic infrastructure and needed boolean flags to advertise
support for these extensions, by default the support is disabled. Next patch
in the series introduces support for HALF_FLOAT_OES token.

v4: take assert away and make valid_filter_for_float conditional (Tapani),
    fix the alphabetical order (Emil)

Signed-off-by: Kevin Rogovin <kevin.rogovin@intel.com>
Signed-off-by: Kalyan Kondapally <kalyan.kondapally@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-01-29 08:16:47 +02:00
Eric Anholt
dd4d9a4e62 nir: Make vec-to-movs handle src/dest aliasing.
It now emits vector MOVs instead of a series of individual MOVs, which
should be useful to any vector backends.  This pushes the problem of
src/dest aliasing of channels on a scalar chip to the backend, but if
there are any vector operations in your shader then you needed to be
handling this already.

Fixes fs-swap-problem with my scalarizing patches.

v2: Rename to insert_mov(), and add a comment about what it does.
v3: Rewrite the comment.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v3)
2015-01-28 16:33:34 -08:00
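
The aliasing hazard mentioned in dd4d9a4e62 can be shown with a four-channel swizzled move. Both functions are hypothetical illustrations of the problem a scalar backend has to handle; neither is the NIR pass nor any driver's code.

```c
#include <assert.h>

/* Naive scalar lowering: one MOV per channel, written straight into dst.
 * If dst and src are the same register, later channels read values that
 * earlier channels have already overwritten. */
static void mov_per_channel(float *dst, const float *src,
                            const unsigned swz[4])
{
    for (unsigned i = 0; i < 4; i++) {
        assert(swz[i] < 4);
        dst[i] = src[swz[i]];
    }
}

/* Vector semantics: every source channel is read before any destination
 * channel is written, so aliasing is harmless. */
static void mov_vector(float *dst, const float *src, const unsigned swz[4])
{
    float tmp[4];
    for (unsigned i = 0; i < 4; i++) {
        assert(swz[i] < 4);
        tmp[i] = src[swz[i]];
    }
    for (unsigned i = 0; i < 4; i++)
        dst[i] = tmp[i];
}
```

Swapping x and y in place (swizzle 1, 0, 2, 3 on the vector 1, 2, 3, 4) gives 2, 1, 3, 4 with `mov_vector`, but 2, 2, 3, 4 with `mov_per_channel`, because y is read after x has already been overwritten.
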
Eric Anholt
d70eb38517 gallium: Replace u_simple_list.h with util/simple_list.h
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
Eric Anholt
7c99187c6a mesa: Port a variant of 68afbe89c7 to util/
The idea is that after a remove_from_list(), you might want to be able to
do a remove_from_list() on it again or an is_empty_list().  This is
apparently relied on by r300g.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
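
A sketch of the behaviour 7c99187c6a asks for, assuming a circular doubly linked list with a sentinel head. The names mirror the operations mentioned in the commit message, but these are plain functions written for illustration; the actual definitions in simple_list.h are not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *prev, *next;
};

/* An empty list is a sentinel that points at itself. */
static void make_empty_list(struct node *sentinel)
{
    sentinel->next = sentinel;
    sentinel->prev = sentinel;
}

static void insert_at_head(struct node *list, struct node *elem)
{
    elem->prev = list;
    elem->next = list->next;
    list->next->prev = elem;
    list->next = elem;
}

/* Unlink elem, then leave it pointing at itself so that a second removal
 * or an emptiness check on it is still well defined. */
static void remove_from_list(struct node *elem)
{
    assert(elem->next != NULL && elem->prev != NULL);
    elem->next->prev = elem->prev;
    elem->prev->next = elem->next;
    elem->next = elem;
    elem->prev = elem;
}

static bool is_empty_list(const struct node *list)
{
    return list->next == list;
}
```

Because a removed node becomes a self-linked singleton, calling `remove_from_list()` on it again only rewrites its own pointers and leaves the original list untouched.
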
Eric Anholt
8ab6759cef mesa: Move simple_list.h to src/util.
We have two copies of it in the tree, I'm going to delete one.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-01-28 16:33:34 -08:00
Tom Stellard
2397a72129 radeonsi: Enable VGPR spilling for all shader types v5
v2:
  - Only emit write SPI_TMPRING_SIZE once per packet.
  - Use context global scratch buffer.

v3:
  - Patch shaders using WRITE_DATA packet instead of map/unmap.
  - Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
    VS_PARTIAL_FLUSH when patching shaders.

v4:
  - Code cleanups.
  - Remove unnecessary multiplies.

v5:
  - Patch shaders in system memory and re-upload to vram.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:47 +00:00
Tom Stellard
5dcd97f25c radeonsi/compute: Allocate the scratch buffer during state creation
This moves scratch buffer allocation from si_launch_grid() to
si_create_compute_state().  This helps to reduce the overhead of
launching a kernel and also fixes a bug in the code that would cause
the scratch buffer to be too small if a kernel with smaller scratch size
was launched before a kernel with a larger scratch size.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Tom Stellard
32206c5e56 radeonsi: Add radeon_shader_binary member to struct si_shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00
Tom Stellard
37559f8dfc radeonsi/compute: Rename si_compute::program to si_compute::shader
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-28 21:03:46 +00:00