Commit Graph

77304 Commits

Author SHA1 Message Date
Kenneth Graunke de505f7d7b i965: Whack UAV bit when FS discards and there are no color writes.
dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no
framebuffer attachments, using a shader that discards based on
gl_FragCoord.  It uses occlusion queries to inspect whether pixels
are rendered or not.

Unfortunately, the hardware is not dispatching any pixel shaders,
so discards never happen, and the full quad of pixels increments
PS_DEPTH_COUNT, making the occlusion query results bogus.

To understand why, we have to delve into the WM_INT internal
signalling mechanism's formulas.

The "WM_INT::Pixel Shader Kill Pixel" signal is defined as:

    3DSTATE_WM::ForceKillPixel == ON ||
    (3DSTATE_WM::ForceKillPixel != Off &&
     !WM_INT::WM_HZ_OP &&
     3DSTATE_WM::EDSC_Mode != PREPS &&
     (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) &&
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     (3DSTATE_PS_EXTRA::PixelShaderKillsPixels ||
      3DSTATE_PS_EXTRA:: oMask Present to RenderTarget ||
      3DSTATE_PS_BLEND::AlphaToCoverageEnable ||
      3DSTATE_PS_BLEND::AlphaTestEnable ||
      3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable))

Because there is no depth or stencil buffer, writes to those buffers
are disabled.  So the highlighted condition is false, making the whole
"Kill Pixel" condition false.  This then feeds into the following
"WM_INT::ThreadDispatchEnable" condition:

    3DSTATE_WM::ForceThreadDispatch != OFF &&
    !WM_INT::WM_HZ_OP &&
    3DSTATE_PS_EXTRA::PixelShaderValid &&
    (3DSTATE_PS_EXTRA::PixelShaderHasUAV ||
     WM_INT::Pixel Shader Kill Pixel ||
     WM_INT::RTIndependentRasterizationEnable ||
     (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT &&
      3DSTATE_PS_BLEND::HasWriteableRT) ||
     (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF &&
      (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) ||
     (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) ||
     (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable ||
                                     WM_INT::Depth Write Enable ||
                                     WM_INT::Stencil Test Enable)))

Given that there's no depth/stencil testing, no writeable render target,
and the hardware thinks kill pixel doesn't happen, all of these
conditions are false.  We have to whack some bit to make PS invocations
happen.  There are many options.

Curro suggested using the UAV bit.  There's some precedent for doing
that - we set it for fragment shaders that do SSBO/image/atomic writes
when no color buffer writes are enabled.  We can simply include discard
here too.
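
The reasoning above can be checked with a small model of the two formulas. This is only a sketch: the field names are shorthand invented for illustration (they are not Mesa or hardware identifiers), and the terms that are off in this test case (oMask, alpha test, alpha-to-coverage, chroma key, RT-independent rasterization, computed depth/stencil, EDSC) are assumed false and left out.

```python
from dataclasses import dataclass


@dataclass
class WMState:
    # Hypothetical shorthand for the signals quoted in the commit message.
    force_kill_pixel: str = "NORMAL"       # "ON", "OFF" or "NORMAL"
    force_thread_dispatch: str = "NORMAL"  # "ON", "OFF" or "NORMAL"
    wm_hz_op: bool = False
    edsc_mode_preps: bool = False
    depth_write: bool = False
    stencil_write: bool = False
    ps_kills_pixels: bool = False
    ps_valid: bool = True
    ps_has_uav: bool = False
    ps_writes_rt: bool = False
    has_writeable_rt: bool = False


def kill_pixel(s: WMState) -> bool:
    """Simplified "WM_INT::Pixel Shader Kill Pixel" signal."""
    return s.force_kill_pixel == "ON" or (
        s.force_kill_pixel != "OFF"
        and not s.wm_hz_op
        and not s.edsc_mode_preps
        and (s.depth_write or s.stencil_write)  # the highlighted condition
        and s.ps_kills_pixels                   # other kill sources assumed off
    )


def thread_dispatch_enable(s: WMState) -> bool:
    """Simplified "WM_INT::ThreadDispatchEnable" signal."""
    return (
        s.force_thread_dispatch != "OFF"
        and not s.wm_hz_op
        and s.ps_valid
        and (
            s.ps_has_uav
            or kill_pixel(s)
            or (s.ps_writes_rt and s.has_writeable_rt)
        )
    )


# A discarding shader with no attachments: no depth/stencil writes, no RT.
state = WMState(ps_kills_pixels=True)
print(kill_pixel(state), thread_dispatch_enable(state))

# Setting the UAV bit is enough to get pixel shader threads dispatched.
state.ps_has_uav = True
print(thread_dispatch_enable(state))
```

In this model the discard alone never reaches ThreadDispatchEnable, because Kill Pixel is gated on depth or stencil writes; the UAV term is independent of that gate.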

Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests.

v2: Add a comment suggested and written by Jason Ekstrand.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-28 14:36:47 -07:00
Rhys Kidd 668b6ddfc5 vc4: Remove unused include from vc4_nir_lower_txf_ms.c
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
2016-03-28 11:51:11 -07:00
Adam Jackson 2b8492d63e glapi/glx: Treat xserver generated targets as .PHONY
Meaning, always rebuild them when asked instead of bothering to look at
timestamps (and then wondering why nothing happened when you said make).

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:37:12 -04:00
Adam Jackson c2f0bc2537 glapi/glx: Thunk non-ABI calls through GetProcAddress
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:37:12 -04:00
Adam Jackson ce3f0b23d1 glapi/glx: Emit direct GL calls instead of dispatch lookup
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:28:51 -04:00
Adam Jackson c0a9cbea4d glx: Unbreak generating some of the xorg glx headers
Broken by:

    commit 9ace0b5422
    Author: Dylan Baker <baker.dylan.c@gmail.com>
    Date:   Wed May 20 15:49:11 2015 -0700

	glapi: glX_proto_size.py: use argparse instead of getopt

Which changed most, but not all, callers to use --header-tag instead of
-h.
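
A minimal sketch of why a leftover -h caller stops working after such a conversion, assuming the parser keeps argparse's default behaviour of reserving -h for help. The parser below is hypothetical and is not the real glX_proto_size.py option set; only the --header-tag spelling comes from the message above.

```python
import argparse
import contextlib
import io


def build_parser():
    # Hypothetical stand-in for the converted script's parser.
    parser = argparse.ArgumentParser(prog="proto_size_demo")
    parser.add_argument("--header-tag", dest="header_tag", metavar="TAG")
    return parser


def parse_header_tag(argv):
    """Return the header tag, or None if argparse bailed out."""
    parser = build_parser()
    try:
        # Swallow the help text so the demo stays quiet.
        with contextlib.redirect_stdout(io.StringIO()):
            return parser.parse_args(argv).header_tag
    except SystemExit:
        # argparse treats -h as "print help and exit" by default.
        return None


print(parse_header_tag(["--header-tag", "_INDIRECT_SIZE_H_"]))
print(parse_header_tag(["-h", "_INDIRECT_SIZE_H_"]))
```

The tag value used here is a placeholder. With the default help action in place, the old spelling exits before any header is generated.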

Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2016-03-28 14:28:36 -04:00
Bas Nieuwenhuizen dd5f0950e4 mesa/st: Fix NULL access if no fragment shader is bound
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-28 18:02:07 +02:00
Rob Clark b4c72b792c freedreno/ir3: fix for load_front_face intrinsic
Seems like trying to widen in the same instruction as the add.s does a
non-sign-extending widen.
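
To illustrate the difference between the two kinds of widen, here is a small sketch. The 16-bit to 32-bit widths are an assumption made for the example; the message does not state them, and the helper names are hypothetical.

```python
def widen_zero_extend(value16):
    """Widen 16 -> 32 bits, filling the upper half with zeros."""
    return value16 & 0xFFFF


def widen_sign_extend(value16):
    """Widen 16 -> 32 bits, copying the sign bit into the upper half."""
    value16 &= 0xFFFF
    if value16 & 0x8000:
        return value16 | 0xFFFF0000
    return value16


def as_signed32(bits):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return bits - (1 << 32) if bits & 0x80000000 else bits


minus_one_16 = 0xFFFF  # -1 as a 16-bit pattern
print(as_signed32(widen_sign_extend(minus_one_16)))
print(as_signed32(widen_zero_extend(minus_one_16)))
```

A negative value survives a sign-extending widen but turns into a large positive number under a non-sign-extending one, which is the kind of mismatch described above.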

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-28 10:19:53 -04:00
Rob Clark 3ca034cada freedreno/ir3: fix compiler warn
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-28 10:19:09 -04:00
Ilia Mirkin b9f1affb2e nvc0: make sure to disable fetches from previously-set VBOs when blitting
We disable the vertex attributes, but also disable the VBO fetch details
as well, just in case. Not known to fix anything.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 08:36:34 -04:00
Ilia Mirkin 41100b6b44 nvc0: disable primitive restart and index bias during blits
Back in the dawn of time, we used to do immediate uploads for the vertex
data, and all was well. However Maxwell dropped support for immediate
vertex data, so we started feeding in a VBO (in all cases). But we
forgot to disable some things that apply in such cases, specifically
primitive restart and index bias. The latter was causing WoW and other
Blizzard games trouble as they use a pattern where they draw with a base
vertex (aka index bias), followed by texture uploads (aka blits,
internally).
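
As a rough illustration of why stale draw state matters once the blit reads from a VBO, consider this toy vertex fetcher. It is a generic model of index bias and primitive restart, not a description of how nvc0 hardware or the driver's blit path actually works; all names and data are made up.

```python
def fetch_primitives(vbo, indices, index_bias=0, restart_index=None):
    """Toy vertex fetch: returns a list of primitives (lists of vertices)."""
    primitives, current = [], []
    for idx in indices:
        if restart_index is not None and idx == restart_index:
            # Primitive restart cuts the current primitive short.
            if current:
                primitives.append(current)
            current = []
            continue
        # Index bias (base vertex) is added to every fetched index.
        current.append(vbo[idx + index_bias])
    if current:
        primitives.append(current)
    return primitives


# Four blit vertices followed by unrelated data in the same buffer.
vbo = ["q0", "q1", "q2", "q3", "x4", "x5"]
quad = [0, 1, 2, 3]

print(fetch_primitives(vbo, quad))                   # clean state
print(fetch_primitives(vbo, quad, index_bias=2))     # stale base vertex
print(fetch_primitives(vbo, quad, restart_index=2))  # stale restart index
```

With a leftover bias the fetch walks past the intended vertices, and with a leftover restart index the quad is split, so either one can corrupt a blit that follows an application draw.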

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Karol Herbst <nouveau@karolherbst.de>
2016-03-28 08:35:38 -04:00
Ilia Mirkin f667d15561 nvc0/ir: fix picking of coordinates from tex instruction for textureGrad
On Fermi, there's an argument in front of the coords that combines array
and indirect handle, while on Kepler the array and the indirect handle
are separate (and in front of the coords). We were previously only
accounting for the array part; an indirect access, if present, was not
counted in the formula.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
2016-03-28 08:35:38 -04:00
Ilia Mirkin 6711f159d9 nv50/ir: saturate depth writes
Apparently there's no post-FS clamping logic, so we have to do this by
hand. The depth will never be outside of the 0..1 range, even on
floating point zeta buffers, so this should be safe.
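
Saturation here means clamping to the 0..1 range. A minimal sketch of the operation applied to the shader's depth output (the function name is illustrative, not the compiler's):

```python
def saturate(value):
    """Clamp a value to the closed range [0.0, 1.0]."""
    return min(max(value, 0.0), 1.0)


for depth in (-0.25, 0.5, 1.5):
    print(depth, "->", saturate(depth))
```

Values already inside the range pass through unchanged, which is why the extra clamp is harmless for valid depth output.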

Fixes dEQP-GLES3.functional.fbo.depth.*clamp.* which tests writing
invalid values on various zeta buffer formats.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 08:35:38 -04:00
Marek Olšák 6262d6125a gallium/util: fix up inaccurate behavior of util_framebuffer_state_equal (v2)
v2: move the nr_cbufs check above the loop

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> (v1)
2016-03-28 00:46:23 +02:00
Marek Olšák 21c479256a st/mesa: only minify height if target != 1D array in st_finalize_texture
The st_texture_object documentation says:
  "the number of 1D array layers will be in height0"

We can't minify that.
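
A short sketch of the distinction, assuming the usual mip minification rule max(1, size >> level). The `u_minify` helper mirrors the Mesa utility of the same name; `level_size` is a hypothetical function written for this example and is not the code in st_finalize_texture.

```python
def u_minify(value, level):
    """Size of a dimension at the given mip level, never below 1."""
    return max(1, value >> level)


def level_size(width0, height0, level, is_1d_array):
    """Return (width, height) at a mip level.

    For 1D array textures, height0 holds the number of array layers,
    and the layer count does not shrink with the mip level.
    """
    width = u_minify(width0, level)
    height = height0 if is_1d_array else u_minify(height0, level)
    return width, height


# A 64-texel-wide 1D array with 6 layers, at mip level 2.
print(level_size(64, 6, 2, is_1d_array=True))
# Minifying the same numbers as if they were a 2D texture loses layers.
print(level_size(64, 6, 2, is_1d_array=False))
```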

Spotted by luck. No app is known to hit this issue.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2016-03-28 00:44:45 +02:00
Miklós Máté 50d653c2bb mesa: optimize out the realloc from glCopyTexImagexD()
v2: comment about the purpose of the code
v3: also compare texFormat,
 add a perf debug message,
 formatting fixes

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté baab345b19 st/mesa: fix handling the fallback texture
This fixes a crash when post-processing is enabled in SW:KotOR.

v2: fix const-ness
v3: move assignment into the if() block

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté 920fbecf57 st/mesa: enable GL_ATI_fragment_shader
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté dee274477f st/mesa: implement GL_ATI_fragment_shader
v2: fix arithmetic for special opcodes,
 fix fog state, cleanup
v3: simplify handling of special opcodes,
 fix rebinding with different textargets or fog equation,
 lots of formatting fixes
v4: adapt to the "compile early, fix later" architecture,
 formatting fixes

Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté d71c1e9e54 program: add ATI_fragment_shader to shader stages list
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Miklós Máté e2d5a6fac5 mesa: optionally associate a gl_program to ATI_fragment_shader
The state tracker will use it.

Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Miklós Máté <mtmkls@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 19:58:33 +02:00
Edward O'Callaghan 11bd53933e gallium/p_context.h: Make comment more readable
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:03:04 +02:00
Edward O'Callaghan 2df141087a mesa/st: Remove GLSLVersion clamping
While here, remove intermediate glsl_feature_level variable.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:36 +02:00
Edward O'Callaghan ca22d2f1fd radeon/r600: Fix return type in failure branch
Commit `d4e847ea` introduced a warning about making an
integer from a pointer without a cast, fix it here.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:35 +02:00
Edward O'Callaghan 1fb05a9a0c radeon/r600_query.c: Minor style fix
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2016-03-27 18:00:35 +02:00
Dave Airlie fc3b000fef virgl: drop next shader property for now.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-03-26 17:50:32 +10:00
Timothy Arceri 8683d54d2b glsl: reduce buffer block duplication
This reduces some of the craziness required for handling buffer
blocks. The problem is each shader stage holds its own information
about a block in memory, we were copying that information to a
program wide list but the per stage information remained meaning
when a binding was updated we needed to update all versions of it.

This changes the per stage blocks to instead point to a single
version of the block information in the program list.
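
The difference between the two layouts can be sketched as follows. The class and function names are hypothetical and much simpler than the real GLSL linker structures; the point is only copy versus shared reference.

```python
import copy


class BufferBlock:
    """Hypothetical stand-in for a uniform/shader-storage block record."""

    def __init__(self, name, binding):
        self.name = name
        self.binding = binding


def link_with_copies(program_block, stages):
    # Old layout: every stage keeps its own copy of the block information.
    return {stage: copy.copy(program_block) for stage in stages}


def link_with_shared(program_block, stages):
    # New layout: every stage points at the single program-wide record.
    return {stage: program_block for stage in stages}


stages = ["vertex", "fragment"]

block = BufferBlock("Params", binding=0)
per_stage = link_with_copies(block, stages)
block.binding = 3
print([b.binding for b in per_stage.values()])  # copies are now stale

block = BufferBlock("Params", binding=0)
per_stage = link_with_shared(block, stages)
block.binding = 3
print([b.binding for b in per_stage.values()])  # all stages see the update
```

With copies, a binding update has to be repeated for every stage; with a shared record, one write is visible everywhere.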

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-26 09:26:30 +11:00
Brian Paul a8e5edaadf st/xa: emit sampler view declarations in shaders
Fixes recent regressions with the VMware gallium driver.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Tested-by: Charmaine Lee <charmainel@vmware.com>
2016-03-25 14:53:59 -06:00
Tim Rowley 74a04840e5 swr: [rasterizer jitter] Fix MASKLOADD AVX prototype (float -> i32) 2016-03-25 14:45:40 -05:00
Tim Rowley 93c1a2dedf swr: [rasterizer core] NUMA optimizations...
- Affinitize hot-tile memory to specific NUMA nodes.
- Only do BE work for macrotiles associated with the NUMA node
2016-03-25 14:45:40 -05:00
Tim Rowley 090be2e434 swr: [rasterizer jitter] Fix logic bug for alpha-to-coverage. 2016-03-25 14:45:40 -05:00
Tim Rowley 0767e820fd swr: [rasterizer core] Fix Compute workitem retirement 2016-03-25 14:45:40 -05:00
Tim Rowley 813e89c0cc swr: [rasterizer core] Cleanup state ring arena after last draw that references it completes
Rather than waiting for the API thread to re-use it.
2016-03-25 14:45:40 -05:00
Tim Rowley 83822d7ed5 swr: [rasterizer jitter] add missing include for llvm jitevents 2016-03-25 14:45:40 -05:00
Tim Rowley 51549912d1 swr: [rasterizer core] Reduce Arena blocksize to 128KB (from 1MB).
With global allocator this doesn't seem to affect performance at all.
Overall memory consumption drops by up to 85%.
2016-03-25 14:45:40 -05:00
Tim Rowley ed5b953919 swr: [rasterizer core] One last pass at Arena optimizations 2016-03-25 14:45:40 -05:00
Tim Rowley ee6be9e92d swr: [rasterizer core] CachedArena optimizations
Reduce list traversal during Alloc and Free.

Add ability to have multiple lists based on alloc size (not used for now)
2016-03-25 14:45:39 -05:00
Tim Rowley 68314b6769 swr: [rasterizer jitter] support llvm-svn 2016-03-25 14:45:39 -05:00
Tim Rowley ec9d4c4b37 swr: [rasterizer core] Globally cache allocated arena blocks for fast re-allocation. 2016-03-25 14:45:39 -05:00
Tim Rowley 12ce9d9aa1 swr: [rasterizer] more arena work 2016-03-25 14:45:39 -05:00
Tim Rowley 4893224e28 swr: [rasterizer core] Add clipping against user clip distances in the NullPS backend. 2016-03-25 14:45:39 -05:00
Tim Rowley 700a5b06e0 swr: [rasterizer core] Arena optimizations - preparing for global allocator. 2016-03-25 14:45:39 -05:00
Tim Rowley 5899076b6b swr: [rasterizer core] Reset DrawContext arena at end of draw rather than upon reclaim of DC
Keeps overall memory consumption lower.
Also, remove unused knobs.
2016-03-25 14:45:39 -05:00
Tim Rowley 7390418441 swr: [rasterizer core] Add clipping of user clip planes in clipper. 2016-03-25 14:45:39 -05:00
Tim Rowley 4b4547a721 swr: [rasterizer] Reduce max in-flight draws to 96 (by default) 2016-03-25 14:45:39 -05:00
Tim Rowley 9111d63228 swr: [rasterizer] Fix run-time check asserts
One innocuous (uninitialized variable), and one not so innocuous
(stack corruption).
2016-03-25 14:45:39 -05:00
Tim Rowley 257db3610a swr: [rasterizer jitter] signed immediate builder 2016-03-25 14:45:39 -05:00
Tim Rowley b958aea78a swr: [rasterizer common] changes for cygwin 2016-03-25 14:45:39 -05:00
Tim Rowley e1222ade00 swr: [rasterizer] code styling and update copyrights 2016-03-25 14:45:14 -05:00
Tim Rowley c75314ec67 swr: [rasterizer core] Guard against enqueuing work to invalid hot tiles 2016-03-25 14:43:15 -05:00