Commit Graph

107606 Commits

Author SHA1 Message Date
Eric Anholt 581eba072d v3d: Add a function to describe what the c->execute.file check means.
This is what pointed out that we were misusing the check for last_thrsw in
the previous commit.
2019-02-18 18:09:07 -08:00
Eric Anholt 441294962c v3d: Fix the check for "is the last thrsw inside control flow"
The execute.file check used to be good enough, until I stopped setting up
the execute mask for uniform ifs.

No known tests fixed, noticed while doing a refactor.

Fixes: 0805060573 ("v3d: Handle dynamically uniform IF statements with uniform control flow.")
2019-02-18 18:09:07 -08:00
Eric Anholt 07d5b5a972 v3d: Fix f2b32 behavior.
Now that we don't have the vir_PF() magic, it's obvious that we were doing
the wrong thing for f2b32 by allowing -0.0 to produce true instead of
false.
2019-02-18 18:09:07 -08:00
Eric Anholt 3022b4bd82 v3d: Kill off vir_PF(), which is hard to use right.
You were allowed to pass in any old temp so that you could hopefully fold
the PF up into the def of the temp.  If we couldn't find one, it
implicitly generated a MOV(nop, reg).  However, that PF could have
different behavior depending on whether the def being folded into was a
float or int opcode, which the caller doesn't necessarily control.

Due to the fragility of the function, just switch all callers over to
vir_set_pf().  This also encourages the callers to use a _dest call for
the inst they're putting the PF on, eliminating a bunch of temps in the
pre-optimization VIR.

shader-db says the change is in the noise:

total instructions in shared programs: 6226247 -> 6227184 (0.02%)
instructions in affected programs: 851068 -> 852005 (0.11%)
2019-02-18 18:09:06 -08:00
Eric Anholt 6186a8d44e v3d: Do bool-to-cond for discard_if as well.
Turns this minimal conditional discard (glsl-fs-discard-01.shader_test):

0x3de0b086c5fe9000 fcmp.pushn  -, r1, r5; mov  r2, 0
0x3dec3086bbfc001f nop                  ; mov.ifa  r2, -1
0x3c047186bbe80000 nop                  ; mov.pushz  -, r2
0x3dea3186ba837000 setmsf.ifna  -, 0    ; nop

into:

0x3c00b186c582a000 fcmp.pushn  -, r2, r5; nop
0x3de83186ba837000 setmsf.ifa  -, 0     ; nop

total instructions in shared programs: 6229820 -> 6226247 (-0.06%)
2019-02-18 18:09:06 -08:00
Eric Anholt 718eef62cb v3d: Refactor bcsel and if condition handling.
Both were doing the same thing to try to get a condition to predicate on.
Noticed when I wanted to do this for discard_if as well.

No change in shader-db.
2019-02-18 18:09:06 -08:00
Eric Anholt 4586f9f902 v3d: Add a helper function for getting a nop register.
Just a little refactor to explain what's going on with QFILE_NULL.
2019-02-18 18:09:06 -08:00
Eric Anholt 339155122b v3d: Drop our hand-lowered nir_op_ffract.
The NIR lowering works fine, though it causes some slight noise due to
what looks like choices about propagating constants up multiply chains
changing.

total instructions in shared programs: 6229671 -> 6229820 (<.01%)
total uniforms in shared programs: 2312171 -> 2312324 (<.01%)
2019-02-18 18:09:06 -08:00
Eric Anholt 16f5085490 v3d: Drop a perf note about merging unpack_half_*, which has been implemented.
This is handled with copy-propagation now.
2019-02-18 18:09:06 -08:00
Eric Anholt 146e432b49 v3d: Fix incorrect flagging of ldtmu as writing r4 on v3d 4.x.
Fixes some stalls in 3DMMES's main vertex shader.

total instructions in shared programs: 6280751 -> 6211270 (-1.11%)
instructions in affected programs: 2935050 -> 2865569 (-2.37%)
2019-02-18 18:09:06 -08:00
Eric Anholt cd5e0b2729 v3d: Use the early_fragment_tests flag for the shader's disable-EZ field.
Apparently we need disable-EZ flagged, not just "does Z writes".

Fixes
dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
on 7278, even though it passed in simulation.

Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 051a41d3d5 ("v3d: Add support for the early_fragment_tests flag.")
2019-02-18 18:09:06 -08:00
Eric Anholt 332b969c4e v3d: Sync indirect draws on the last rendering.
Fixes intermittent fails in
dEQP-GLES31.functional.draw_indirect.compute_interop.separate.drawelements_compute_cmd_and_data_and_indices
and others (particularly when run as part of a CTS run)
2019-02-18 18:09:06 -08:00
Eric Anholt 32f16b0b1e v3d: Clear the GMP on initialization of the simulator.
Otherwise, we might have pages accessible that shouldn't be and miss out
on errors.  This is unlikely for most tests since v3d_hw_get_mem() is big
enough that it'll be a freshly zeroed mmap, but if screens are destroyed
and recreated then we'd be reusing the old v3d_hw_get_mem() contents.
2019-02-18 18:09:06 -08:00
Emil Velikov ba652394a3 docs: update calendar, add news item and link release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
2019-02-18 18:38:14 +00:00
Emil Velikov d7108dac73 docs: add sha256 checksums for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bfb5bdaa97272537567cdf1e6caf1c7db9f28aba)
2019-02-18 18:36:23 +00:00
Emil Velikov a1ccff4aaf docs: add release notes for 18.3.4
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit b26488deadc3a8221d558a323dbe81dcf09115ab)
2019-02-18 18:36:21 +00:00
Ilia Mirkin 57441af8bf i965: always enable EXT_float_blend
From the table in isl_format.c, it appears that all generations
support blending on 32-bit float surfaces.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Ilia Mirkin 9fec653093 st/mesa: enable GL_EXT_float_blend when possible
If the driver supports PIPE_BIND_BLENABLE on RGBA32F, flip
EXT_float_blend on (which will affect ES3 contexts).

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
2019-02-18 12:13:54 -05:00
Ilia Mirkin 070a5e5d92 mesa: add explicit enable for EXT_float_blend, and error condition
If EXT_float_blend is not supported, error out on blending of FP32
attachments in an ES2 context.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-18 12:13:54 -05:00
Samuel Pitoiset 47616810ed radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled
This version is better and safer.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 18:06:07 +01:00
Rob Clark d6c43cceff freedreno/ir3: handle quirky atomic dst for a6xx
The new encoding returns a value via the 2nd src.  The legalize pass
needs to be aware of this to set the correct needs_sy flag, otherwise we
can, in cases where the atomic dst is not used, overwrite the register
that hardware will asynchronously load result into without (sy) flag, so
it gets clobbered by the atomic result.

This fixes a whole lot of rando ssbo+atomic fails, like
dEQP-GLES31.functional.ssbo.layout.single_basic_type.packed.highp_vec4.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 12:01:36 -05:00
Rob Clark 28fc6733cd freedreno/a6xx: fix helper_invocation (sampler mask/id)
Since gl_HelperInvocation is lowered to:

  !((1 << sample_id) & sample_mask_in))

Not setting these enable bits was causing it be broken.  (And probably a
bunch of other stuff too.)

Fixes dEQP-GLES31.functional.shaders.helper_invocation.*

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-18 10:37:54 -05:00
Samuel Pitoiset 32ab7a59bb radv: remove unused variable in gather_push_constant_info()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-02-18 13:30:16 +01:00
Lionel Landwerlin 8c87d029bc i965: scale factor changes should trigger recompile
Found by inspection.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3da858a6b9 ("intel/compiler: add scale_factors to sampler_prog_key_data")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-02-18 12:18:13 +00:00
Samuel Pitoiset 0d8f096293 radv: write the alpha channel of MRT0 when alpha coverage is enabled
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:22 +01:00
Samuel Pitoiset 2cf5433b99 ac: use new LLVM 8 intrinsic when loading 16-bit values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:20 +01:00
Samuel Pitoiset f0223143a8 ac: add ac_build_llvm8_tbuffer_load() helper
It uses the new LLVM intrinsics.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-18 12:14:17 +01:00
Tapani Pälli 9762a9f893 mesa: return NULL if we exceed MaxColorAttachments in get_fb_attachment
This fixes invalid access to Attachment array which would occur if caller
would exceed MaxColorAttachments. In practice this should not ever happen
because DiscardFramebufferEXT specifies only GL_COLOR_ATTACHMENT0 to be
valid and InvalidateFramebuffer will error out before but this should
make coverity happy.

v2: const, remove _EXT (Ian)

CID: 1442559
Fixes: 0c42b5f3cb "mesa: wire up InvalidateFramebuffer"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-18 07:51:55 +02:00
Alyssa Rosenzweig 2c6a7fbeb7 panfrost: Fix clipping region
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:50 +00:00
Alyssa Rosenzweig fa1b36ddc2 panfrost: Preserve w sign in perspective division
This fixes issues where polygons that should be culled (due to negative
w, for instance) may not be.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:34 +00:00
Alyssa Rosenzweig 49985cebea panfrost: Cleanup mali_viewport (clipping) code
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:13:03 +00:00
Alyssa Rosenzweig a94463732a panfrost: Swap order of tiled texture (de)alloc
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:33 +00:00
Alyssa Rosenzweig 4a4ed53c01 panfrost: Free imported BOs
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:10:06 +00:00
Alyssa Rosenzweig b5a01296f4 panfrost: Fix various leaks unmapping resources
v2: Don't check for NULL before free()

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
2019-02-18 05:09:41 +00:00
Kenneth Graunke 535251487b nir: Don't reassociate add/mul chains containing only constants
The idea here is to reassociate a * (b * c) into (a * c) * b, when
b is a non-constant value, but a and c are constants, allowing them
to be combined.

But nothing was enforcing that 'b' must be non-constant, which meant
that running opt_algebraic in a loop would never terminate if the IR
contained non-folded constant expressions like 256 * 0.5 * 2.  Normally,
we call constant folding in such a loop too, but IMO it's better for
nir_opt_algebraic to be robust and not rely on that.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109581
Fixes: 32e266a9a5 i965: Compile fp64 funcs only if we do not have 64-bit hardware support

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-02-16 23:36:14 -08:00
Chris Wilson e9882b879b i965: Assert the execobject handles match for this device
Object handles are local to the device fd, so double check we are not
mixing together objects from multiple screens on execbuf submission.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-02-16 23:35:29 -08:00
Rob Clark 99b90ecd35 freedreno/a6xx: cache flush harder
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 1af0c5d320 freedreno/a6xx: compute support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 5118dcf8c3 freedreno/a6xx: image/ssbo state emit
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 2183d9cff7 freedreno/a6xx: border-color offset helper
Soon we'll need this logic to deal w/ image/SSBO case, so split out a
helper rather than duplicate the logic.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark c1a27ba9ba freedreno/ir3: HIGH reg w/a for a6xx
It seems like some instructions (noticed this w/ cat3), cannot read HIGH
regs.. cat1 (mov/cov) can, and possibly some/all of cat2.

The blob seems to stick w/ an extra mov into low regs.  So lets do the
same.

This fixes WGID on a6xx, which unsurprisingly is related to a lot of
deqp compute fails.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark 947848524d freedreno/ir3: add a6xx+ SSBO/image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:28:00 -05:00
Rob Clark b46d5b8a84 freedreno/ir3: add a6xx instruction encoding
For the handful of instructions that use a new encoding.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark 2e0ea3f09c freedreno/ir3: add image/ssbo <-> ibo/tex mapping
Images and SSBOs don't map directly to the hw.  They end up being part
texture and part something else.  Starting with a6xx, the hack used for
a5xx to smash the image tex state into hw texture state starting from
MAX counting down won't work, because we start using tex state also for
SSBO read.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark 75f3a5245e freedreno/ir3: fix ncomp for _store_image() src
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark feee3050d3 freedreno/ir3: split out a4xx+ instructions
Note that image/ssbo support is currently only implemented for a5xx.
But the instruction encoding is the same for a4xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark 42af0640f6 freedreno/ir3: split out image helpers
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark aefdb9bed2 freedreno/a6xx: clean up some open-coded bits
Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:27:59 -05:00
Rob Clark b51de44dea freedreno/a6xx: move stream-out emit to helper
Split out of the main fd6_emit() code, since it was already getting to
be a pretty giant function.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00
Rob Clark c0d6be11d6 freedreno/ir3: fix varying packing vs. tex sharp edge
We probably need to rethink how we detect which instruction first
defines higher register classes.  But for now, this at least fixes
the symptom.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2019-02-16 16:26:14 -05:00