Commit Graph

82916 Commits

Author SHA1 Message Date
Topi Pohjolainen
39fdee6b2d i965/blorp/gen7+: Do not trigger push constant space reconfig
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
cc2d0e64c0 i965/blorp/gen7+: Stop trashing push constant allocation
Packet 3DSTATE_CONSTANT_PS is still emitted explicitly as ps stage
itself is enabled and hardware may try to prefetch constants from
the buffer. From the BSpec: 3D Pipeline - Windower -
3DSTATE_PUSH_CONSTANT_ALLOC_PS

  "Specifies the size of the PS constant buffer. This value will
   determine the amount of data the command stream can pre-fetch
   before the buffer is full."

This is not possible on gen6. From the BSpec about 3DSTATE_CONSTANT_PS:

"This packet must be followed by WM_STATE."

Binding table emissions for stages other than PS can be now dropped,
they were only needed for the 3DSTATE_CONSTANT_XS to be effective:

From the BSpec:

  "The 3DSTATE_CONSTANT_* command is not committed to the shader unit
   until the corresponding (same shader) 3DSTATE_BINDING_TABLE_POINTER_*
   command is parsed."

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
175e095744 i965/blorp: Remove support for push constants
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
46e1132b80 i965/blorp: Use flat inputs instead of uniforms
v2 (Jason): Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
07db95c24d i965/blorp: Fix the size requirement for vertex elements
v2: Rebased as this is needed before flat inputs are enabled

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
741a245ae4 i965/blorp: Load tranformation coordinates as vec4
In preparation for loading as flat vertex input.

v2: Use LOAD_INPUT() macro

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
01f2f364d4 i965/blorp: Rename LOAD_UNIFORM to LOAD_INPUT
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Topi Pohjolainen
641868103c i965/blorp: Organize pixel kill and blend/scaled inputs into vec4s
In addition, as these are never used in parallel, add a few
assertions.

v2 (Jason): Skip some complexity by putting them into a union but
            pad rectangle grid into a vec4 instead. Also keep the
            LOAD_UNIFORM macro.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 20:43:11 +03:00
Lionel Landwerlin
dbbc4fb4cc anv/wsi: create swapchain images using specified image usage
The image usage specified by the caller of vkCreateSwapchainKHR should be
passed onto the internal image creation. Otherwise the driver might later
crash when the user tries to use the image as a combined sampler even though
the creation was explicitly created with VK_IMAGE_USAGE_TRANSFER_SRC_BIT.

Leaving the previous VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT as this might be
expected even if the swapchain is created without any flag.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96791
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
2016-07-04 10:15:48 -07:00
Indrajit Das
51227b41c6 radeon/uvd: fix overflow error while calculating bit stream buffer size
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-04 11:38:05 +02:00
Topi Pohjolainen
9e3774a460 i965/blorp: Prepare for more than two vertex attributes
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:05:02 +03:00
Topi Pohjolainen
e762354309 i965/blorp: Tell vertex fetcher about flat inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:38 +03:00
Topi Pohjolainen
89e6b4ef5d i965/blorp: Add support for flat input buffer
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:04:00 +03:00
Topi Pohjolainen
9b2fa17e97 i965/blorp: Store input read mask
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 09:03:41 +03:00
Topi Pohjolainen
73f78ab44b i965/blorp: Rename push constants to inputs
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:51 +03:00
Topi Pohjolainen
f2c472fcb3 i965/blorp: Use core vertex buffer state setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:37:44 +03:00
Topi Pohjolainen
4f7e68799f i965/blorp: Split vertex data and element setup
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Topi Pohjolainen
575c8cbb54 i965: Unify vertex buffer setup
On gen >= 8 one doesn't provide ending address but number of bytes
available. This is relative to the given offset.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Topi Pohjolainen
bdab945edd i965/draw: Expose vertex buffer state setup
Also change the interface to use start and end offsets.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-07-04 08:33:41 +03:00
Rob Clark
7295428e41 freedreno: fix crash on smaller gpus and higher resolutions
Devices with smaller GMEM size need more tiles.  On db410c at 2048x1152,
glmark2 shadow needed ~330 tiles for fullscreen.  Lets bump it up to
512.  (Maybe with MRT you could end up needing more, but at that point
things are probably going to be painfully slow.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-03 11:16:28 -04:00
Rob Clark
01ccb0d91e i965: don't drop const initializers in vector splitting
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark
f78a6b1ce3 glsl: add driconf to zero-init unintialized vars
Some games are sloppy.. perhaps because it is defined behavior for DX or
perhaps because nv blob driver defaults things to zero.

So add driconf param to force uninitialized variables to default to zero.

This issue was observed with rust, from steam store.  But has surfaced
elsewhere in the past.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-07-02 09:00:19 -04:00
Rob Clark
202710d110 freedreno/ir3: support glsl linking for cmdline compiler
For .vert/.frag, now multiple can be specified on the cmdline for
purposes of linking, and the last one specified is the one that is
fed into the ir3 backend (and dumped along the way if --verbose is
specified)

Without this, varyings in frag shaders would appear as undefined.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark
07cfe4e6aa glsl/standalone: initialize MaxUserAssignableUniformLocations
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-07-02 09:00:19 -04:00
Rob Clark
1759eb1d19 freedreno: update valid_buffer_range for SO buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
da39ac9c51 freedreno/ir3: support non-user_buffer consts
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
2081c1ecc0 freedreno/a2xx: move setup/restore cmds into binning pass
Rather than doing a separate submit at context create, move these cmds
to before first tile, as is done on a3xx/a4xx.  Otherwise state can
be overwritten by other contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
2c3b54c278 freedreno: pass index buffer as a pipe_resource
This will be useful in a following patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Rob Clark
88cc11e971 freedreno: switch emit_const_bo() to take prsc's
We can push the unwrap of pipe_resource down.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-02 08:58:50 -04:00
Hans de Goede
d7dfd4cb51 nv30: Fix "array subscript is below array bounds" compiler warning
gcc6 does not like the trick where we point to one entry before the
array start and then start a while with a pre-increment.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
Hans de Goede
110ef733dc nouveau: Fix a couple of "foo may be used uninitialized' compiler warnings
These are all new false positives with gcc6.

In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer
to a variable into a function initialises that variable.

In nv50_ir_from_tgsi.cpp op and mode are not set if there are 0
enabled dst channels, this never happens, but gcc cannot know this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-02 12:21:28 +02:00
Hans de Goede
1f3c8f3664 nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
2aa1197eee nouveau: Add support for SV_WORK_DIM
Add support for SV_WORK_DIM for nvc0 and nve4.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
3345f70f63 nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument
This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO
and NVC0_CB_AUX_TEX_INFO.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
ef8e50a841 clover: Pass work_dim parameter of clEnqueueNDRangeKernel() to driver
In order to implement get_work_dim() the driver may need to know the
clEnqueueNDRangeKernel() work_dim parameter, so pass it to the driver.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Hans de Goede
d386cef246 tgsi: Add WORK_DIM System Value
Add a new WORK_DIM SV type, this is will return the grid dimensions
(1-4) for compute (opencl) kernels.

This is necessary to implement the opencl get_work_dim() function.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2016-07-02 12:21:28 +02:00
Alejandro Piñeiro
da7efadf04 mesa/main: fix error checking logic on CopyImageSubData
For the case (both src or dst) where we had a texobject, but the
texobject target was not the same that the method target, this spec
paragraph was appplied:

 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  *     "An INVALID_VALUE error is generated if either name does not
  *     correspond to a valid renderbuffer or texture object according
  *     to the corresponding target parameter."
  */

But for that case, the correct spec paragraph should be:
 /* Section 18.3.2 (Copying Between Images) of the OpenGL 4.5 Core
  * Profile spec says:
  *
  *     "An INVALID_ENUM error is generated if either target is
  *      not RENDERBUFFER or a valid non-proxy texture target;
  *      is TEXTURE_BUFFER or one of the cubemap face selectors
  *      described in table 8.18; or if the target does not
  *      match the type of the object."
  */

specifically the last sentence: "or if the target does not match the
type of the object".

This patch fixes the error returned (s/INVALID/ENUM) for that case,
and moves up the INVALID_VALUE spec paragraph, as that case (invalid
texture object) was handled before.

Fixes:
GL44-CTS.copy_image.target_miss_match

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-07-02 11:54:40 +02:00
Dave Airlie
27d456cc87 st/glsl_to_tgsi: don't increase immediate index by 1.
Immediates are stored into a separate table, and are
consolidated, so if we get an immediate we don't need
to offset it as the index it has is correct.

Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-07-02 17:01:25 +10:00
Ilia Mirkin
6f4d35212b st/mesa: get max supported number of image samples from driver
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-07-01 23:01:03 -04:00
Ilia Mirkin
b2b5075e04 nvc0: fix up image support for allowing multiple samples
Basically we just have to scale up the coordinates and then add the
relevant sample offset. The code to handle this was already largely
present from Christoph's earlier attempts to pipe images through back in
the dark ages, this just hooks it all up.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 23:01:02 -04:00
Nicolai Hähnle
07cc838b10 st/mesa: check the texture image level in st_texture_match_image
Otherwise, 1x1 images of arbitrarily high level are accepted.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96639#add_comment
Cc: 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 17:55:19 +02:00
Nicolai Hähnle
0ba053b34c st/mesa: an incomplete texture may have a zero-size first image
Fixes a regression introduced by commit 42624ea83 which triggered
an assertion in
dEQP-GLES2.functional.texture.completeness.cube.not_positive_level_0

While stImage must have a non-zero size as verified by the caller, we also
look at the size of the base image in an attempt to make a better guess at
the level0 size (this is important when the base image size is odd). However,
the base image may have a zero size even when it exists.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96629
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 17:54:40 +02:00
Nayan Deshmukh
de772bc060 st/vdpau: use bicubic filter for scaling(v6.1)
use bicubic filtering as high quality scaling L1.

v2: fix a typo and add a newline to code
v3: -render the unscaled image on a temporary surface (Christian)
    -apply noise reduction and sharpness filter on
     unscaled surface
    -render the final scaled surface using bicubic
     interpolation
v4: support high quality scaling
v5: set dst_area and dst_clip in bicubic filter
v6: set buffer layer before setting dst_area
v6.1: add PIPE_BIND_LINEAR when creating resource

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-01 12:54:58 +02:00
Nayan Deshmukh
872dd9ad15 vl: add a bicubic interpolation filter(v5)
This is a shader based bicubic interpolater which uses cubic
Hermite spline algorithm.

v2: set dst_area and dst_clip during scaling (Christian)
v3: clear the render target before rendering
v4: intialize offsets while initializing shaders
    use a constant buffer to send dst_size to frag shader
    small changes to reduce calculation in shader
v5: send half pixel offset instead of sending dst_size

Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2016-07-01 12:54:33 +02:00
Vinson Lee
3fea592c4e mesa/st: Use 'struct nir_shader' instead of 'nir_shader'.
Fix this build error with GCC 4.4.

  CC     state_tracker/st_nir_lower_builtin.lo
In file included from state_tracker/st_nir_lower_builtin.c:61:
state_tracker/st_nir.h:34: error: redefinition of typedef ‘nir_shader’
../../src/compiler/nir/nir.h:1830: note: previous declaration of ‘nir_shader’ was here

Suggested-by: Rob Clark <robdclark@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96235
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-07-01 00:19:24 -07:00
Alejandro Piñeiro
a97ee60926 docs: update MESA_DEBUG envvar documentation.
silent, flush, incomplete_tex and incomplete_fbo flags were not
documented (see src/mesa/main.debug.c for more info).

FP is not checked anymore.

v2 (Brian Paul):
 * MESA_DEBUG accepts a comma-separated list of parameters.
 * Clarify how MESA_DEBUG behaves with mesa debug and release builds.
 * Updated wording.

v3: Better wording for one paragraph (Brian Paul)

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-07-01 08:15:15 +02:00
Alejandro Piñeiro
5e553a6bb3 i965: intel_texture_barrier reimplemented
Fixes:
GL44-CTS.texture_barrier_ARB.same-texel-rw-multipass

On Haswell, Broadwell and Skylake (note that in order to execute that
test, it is needed to override GL and GLSL versions).

On gen6 this test was already working without this change. It keeps
working after it.

This commit replaces the call to brw_emit_mi_flush for gen6+ with two
calls to brw_emit_pipe_control_flush:

 * The first one with RENDER_TARGET_FLUSH and CS_STALL set to initiate
   a render cache flush after any concurrent rendering completes and
   cause the CS to stop parsing commands until the render cache
   becomes coherent with memory.

 * The second one have TEXTURE_CACHE_INVALIDATE set (and no CS stall)
   to clean up any stale data from the sampler caches before rendering
   continues.

Didn't touch gen4-5, basically because I don't have a way to test
them.

More info on commits:
0aa4f99f56
72473658c5

Thanks to Curro to help to tracking this down, as the root case was a
hw race condition.

v2: use two calls to pipe_control_flush instead of a combination of
    gen7_emit_cs_stall_flush and brw_emit_mi_flush calls (Curro)
v3: no need to const cache invalidation (Curro)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-07-01 08:09:27 +02:00
Ilia Mirkin
51ca57df01 nv30: go back to not using viewport validate function for swtnl
The output of draw requires a null viewport transform, which the regular
code is ill-equiped to do. Reinstate the original settings in the render
path, and add setting of the viewport clip polygon based on fb
width/height (as that is all taken care of by draw).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 01:04:10 -04:00
Ilia Mirkin
71609c9954 nv30: fix viewport clipping settings to be based on viewport, not rt
This fixes a ton of "*clip*" dEQP GLES2 tests, as well as
triangle-guardband-viewport in piglit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-07-01 00:02:23 -04:00
Brian Paul
c823ff8dfb gallium/util: check for window cliprects in util_can_blit_via_copy_region()
We can't blit with resource_copy_region() if there are window clip rects.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-06-30 18:19:09 -06:00