Commit Graph

20820 Commits

Author SHA1 Message Date
Emil Velikov a75baba2f1 targets/xa: limit the amount of exported symbols
In the presence of LLVM the final library exports every symbol from
the llvm namespace. Resolve this by using a version script (w/o the
version/name tag).

Considering that there are only ~35 symbols, explicitly list them
to minimize the chances of rogue symbols sneaking in.

v2: Conditionally include the version-script.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25 23:21:46 +01:00
Emil Velikov b52a530ce2 gallium/egl: st_profiles are build time decision, treat them as such
The profiles are present depending on the defines at build time.
Drop the extra functions and feed the defines directly into the
state-tracker at build time.

v2: Drop unused variable i.

Acked-by: Chia-I Wu <olvaffe@gmail.com> (v1)
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-25 23:21:46 +01:00
Joakim Sindholt 404387ecd7 nv50: count wrapped textures towards the tex_obj count
But don't count their size towards the allocated memory, since that
belongs to whoever created it.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23 12:34:39 -04:00
Christoph Bumiller caa34a7a64 nvc0: assert that we have vertex elements state
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23 12:34:39 -04:00
Christoph Bumiller 2595682689 nvc0: use PRIxPTR for sizeof()
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23 12:34:39 -04:00
Christoph Bumiller 7669e362ab nv50,nvc0: allow 15,16,30 bpp display formats
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23 12:34:39 -04:00
Christoph Bumiller b9142c246d nv50,nvc0: handle guard band defines
[imirkin: moved default case out of switch]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-05-23 12:34:39 -04:00
Christoph Bumiller d479713d25 nv50/ir/tgsi: optimize KIL
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:39 -04:00
Christoph Bumiller 452a4151aa nv50/ir: fix lowering of predicated instructions (without defs)
Note that predicated instructions with defs are still not supported
because transformation to SSA doesn't handle them yet.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Christoph Bumiller 3b0867f35b nv50/ir/opt: fix constant folding with saturate modifier
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Christoph Bumiller 2f2d1b3d9b nv50/ir/tgsi: TGSI_OPCODE_POW replicates its result
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Christoph Bumiller 49eccef06b nv50,nvc0: set constbufs dirty on pipe context switch
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Christoph Bumiller 200382be85 nv50: setup scissors on clear_render_target/depth_stencil
[imirkin: add logic to also clear the "regular" scissors]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Christoph Bumiller 7d11b761f2 nv50,nvc0: always pull out bufctx on context destruction
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-05-23 12:34:38 -04:00
Jon TURNEY 45f9aae004 Make DRI dependencies and build depend on the target
- Don't require xcb-dri[23] etc. if we aren't building for a target with DRM, as
we won't be using dri[23]

- Enable a more fine-grained control of what DRI code is built, so that a libGL
using direct swrast can be built on targets which don't have DRM.

The HAVE_DRI automake conditional is retired in favour of a number of other
conditionals:

HAVE_DRI2 enables building of code using the DRI2 interface (and possibly DRI3
with HAVE_DRI3)

HAVE_DRISW enables building of DRI swrast

HAVE_DRICOMMON enables building of target-independent DRI code, and also enables
some makefile cases where a more detailled decision is made at a lower level.

HAVE_APPLEDRI enables building of an Apple-specific direct rendering interface,
still which requires additional fixing up to build properly.

v2:
Place xfont.c and drisw_glx.c into correct categories.
Update 'make check' as well

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-23 15:24:04 +01:00
Jon TURNEY ff90a8784c Fix build for darwin
Fix build for darwin, when ./configured --disable-driglx-direct

- darwin ld doesn't support -Bsymbolic or --version-script, so check if ld
supports those options before using them
- define GLX_ALIAS_UNSUPPORTED as config/darwin used to, as aliasing of non-weak
symbols isn't supported
- default to -with-dri-drivers=swrast

v2:
Use -Wl,-Bsymbolic, as before, not -Bsymbolic
Test that ld --version-script works, rather than just looking for it in ld --help
Don't use -Wl,--no-undefined on darwin, either

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-05-23 15:24:01 +01:00
Emil Velikov e0372239a5 targets/egl-static: add missing line break in ldflags
Accidently omitted by commit 7b7944ee1c.

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
2014-05-23 15:23:59 +01:00
Emil Velikov d4c3968c25 targets/osmesa: limit the amount of exported symbols
src/gallium/targets/osmesa/Makefile.am |  1 +
src/gallium/targets/osmesa/osmesa.sym  | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
create mode 100644 src/gallium/targets/osmesa/osmesa.sym
2014-05-23 07:40:24 -06:00
José Fonseca 172ef0c5a5 gallivm: Disable workaround for PR12833 on LLVM 3.2+.
Fixed upstream.
2014-05-23 11:37:47 +01:00
José Fonseca 2c02f34fcc gallivm: Support MCJIT on Windows.
It works fine, though it requires using ELF objects.

With this change there is nothing preventing us to switch exclusively
to MCJIT, everywhere.  It's still off though.
2014-05-23 11:37:47 +01:00
Alexander von Gluck IV d4225f803b haiku: Add missing u_memory.h for FREE()
Acked-by: Brian Paul <brianp@vmware.com>
2014-05-21 20:58:06 -04:00
Rob Clark a4d229b099 freedreno/a3xx: fix blend opcode
Seems the opcodes are slightly different from a2xx.  Resync headers and
move blend_func() helper into hw generation specific code.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-21 17:29:13 -04:00
Rob Clark b81de5352d freedreno/a3xx: fix depth/stencil gmem restore
We already multiply by bytes per pixel for this, so f3ba7611 broke
mem2gmem for depth/stencil.  Drop the now-redundant mutiply by cpp.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-21 16:11:46 -04:00
Rob Clark f3ba761129 freedreno/a3xx: fix depth/stencil GMEM positioning
In cases where there was no color buf bound, there were inconsistancies
in register settings related to position of depth/stencil inside GMEM.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-21 12:06:38 -04:00
Rob Clark 4da8267c36 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-21 12:06:38 -04:00
Rob Clark 0d54904c04 freedreno: use OUT_RELOCW when buffer is written
These aren't buffers we ever read back from CPU, so using incorrect
reloc fxn wasn't really harming anything.  But might as well be correct.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-21 12:06:38 -04:00
Rob Clark cb9ed57072 rbug: add missing pipe->blit() entrypoint
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
2014-05-21 12:06:38 -04:00
Ilia Mirkin cdeb7004e0 tgsi: add GS_INVOCATIONS to property names array
In commit 4be146b1, I neglected to add the new property to the strings
array. This leads to the string '(null)' to be printed instead when
converting a GS shader to text.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-05-21 09:31:16 -04:00
Ilia Mirkin 28360fcad7 nv50,nvc0: fix 3d blits with mipmap levels
Make sure to normalize the z coordinates as well as the x/y ones when
there are mipmaps present. Fixes 3d mipmap generation, which now uses
the blit path.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
2014-05-21 09:31:16 -04:00
Ilia Mirkin d2a3de19c6 nv50/ir: fix constant folding for OP_MUL subop HIGH
These instructions can come in either through IMUL_HI/UMUL_HI TGSI
opcodes, or from OP_DIV constant folding.

Also make sure that the constant foldings which delete the original
instruction still get counted as having done something.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
2014-05-21 09:31:16 -04:00
Ilia Mirkin d3a5cf052c nv50/ir: fix s32 x s32 -> high s32 multiply logic
Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have the perform the 2's complement
negation "by hand".

This logic can come into use by the IMUL_HI instruction (very unlikely
to be seen), as well as from constant folding of division by a constant.
Fixes dolphin's divisions by 255.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
2014-05-21 09:31:16 -04:00
Rob Clark 57e68a91f5 freedreno: don't advertise texture arrays for now
I think a3xx and later should support (it is part of GLES3), but this
isn't needed for the time being and still needs to be reversed.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-20 10:52:56 -04:00
Rob Clark 52381a7ffb freedreno/a3xx: shadow sampler support
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-19 21:17:25 -04:00
Rob Clark 08b9180819 freedreno/a3xx/compiler: refactor trans_samp()
Split it up into some smaller fxns so it doesn't grow into a huge
monster as we add things.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-19 21:17:25 -04:00
Rob Clark 1686a0edc0 freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-19 21:17:25 -04:00
Roland Scheidegger 1e9cbbb1c4 llvmpipe: do IR counting for shader cache management after optimization.
2ea923cf57 had the side effect of IR counting
now being done after IR optimization instead of before. Some quick analysis
shows that there's roughly 1.5 times more IR instructions before optimization
than after, hence the effective shader cache size got quite a bit smaller.
Could counter this with an increase of the instruction limit but it probably
makes more sense to count them after optimizations, so move that code.

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-05-19 17:07:41 +02:00
Ilia Mirkin 5b8f1a0f7c nv50/ir: fix integer mul lowering for u32 x u32 -> high u32
UNION appears to expect that all of its sources are conditionally
defined. Otherwise it inserts an unpredicated mov instruction which
overwrites the desired result. This fixes tests that use UMUL_HI, and
much less directly, unsigned integer division by a constant, which uses
this functionality in a peephole pass.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
2014-05-18 17:59:16 -04:00
Ilia Mirkin 4ebaabcccb nv50/ir: make sure that texprep/texquerylod's args get coalesced
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
2014-05-18 17:59:16 -04:00
Rob Clark acc1651711 freedreno/a3xx: use util_format_compose_swizzles()
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-18 16:05:06 -04:00
Rob Clark 88ba9de917 freedreno/a3xx/compiler: 1D textures
Gallium already gives us height==1 for these, so the texture state is
already setup correctly to emulate 1D textures as a Nx1 2D texture.  We
just need to supply the .y coord.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-18 15:23:53 -04:00
Rob Clark 6f84f64643 freedreno: fix caps
In particular, we want mesa to emulate primitive restart for us.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-18 15:22:55 -04:00
Rob Clark f7debd4a3e freedreno: fix index buffer offset
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-18 15:22:25 -04:00
Rob Clark 5646319f25 freedreno/a3xx: add sRBG texture support
That was easy.  Turns out it is just a matter of setting one bit.
Enable sampling from sRGB texture, and therefore enable GL 2.1 :-)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-16 20:48:40 -04:00
Rob Clark 9227e6c98c freedreno: update generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2014-05-16 20:08:09 -04:00
Roland Scheidegger 3bf2d86c09 gallivm: (trivial) fix compilation with llvm 3.1, 3.2
I actually checked the getModuleIdentifier() function exists with 3.1 but
missed that the file moved...
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=78803
2014-05-17 02:03:35 +02:00
Roland Scheidegger 3a1da0abee gallivm: print out how long it takes to optimize shader IR.
Enabled with GALLIVM_DEBUG=perf (which up to now was only used to print
warnings for unoptimized code).

While some unexpectedly long shader compile times for some shaders were fixed
with 8a9f5ecdb1 this should help recognize such
problems in the future. For now though only available in debug builds (which
are not always suitable for such analysis). And since this uses system time,
it might not be all that accurate (even llvmpipe's own rasterization threads
might be running at the same time, or just other tasks).
(llvmpipe also has LP_DEBUG=counters but this only gives an average per shader
and the the total time for all shaders.)
This prints information like this:
optimizing module fs17_variant0 took 1 msec
optimizing module setup_variant_0 took 0 msec
optimizing module draw_llvm_vs_variant0 took 9 msec
optimizing module draw_llvm_vs_variant0 took 12 msec
optimizing module fs17_variant1 took 2 msec

v2: rebase for recent gallivm compilation changes, and print time for whole
modules instead of functions (otherwise it would be very spammy since it would
include all trivial inline sse2 functions), using the shiny new module names,
prying them off LLVM using new helper (not available through C bindings).
Per function timings, while possibly giving more information (if there'd be
a problem only in for instance the partial not the whole function), don't seem
all that useful for now.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-16 22:50:14 +02:00
Roland Scheidegger 26cac02c51 gallivm: give more verbose names to modules
When we had just one module "gallivm" was an appropriate name. But now we have
modules containing all functions for a particular variant, so give it a
corresponding name (this is really just for helping debugging).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-16 22:50:14 +02:00
Roland Scheidegger b416645387 gallivm: remove optimization workaround when not having sse 4.1
This workaround doesn't list any llvm version, but it was introduced
2010-06-10 (e277d5c1f6). It is unlikely
this bug is still present in llvm versions we support (3.1+).
There's no specific test listed, but I ran lp_test_arit (which uses
the mentioned functions) on llvm 3.1 and 3.3 with sse41 disabled and
this pass enabled without issues.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-16 01:09:34 +02:00
Roland Scheidegger 93731fbeec gallivm: remove workaround for reversing optimization pass order.
32bit code generation and llvm >= 2.7 used a different optimization pass
order - this code was initially introduced (2010-07-23) by
815e79e72c, apparently due to buggy code being
generated with then brand new llvm versions (which was llvm 2.7 plus pre 2.8
devel).
It seems very highly likely that whatever this bug was it has been fixed in
newer llvm versions, though there's no easy way to test this - the mentioned
piglit test has been removed years ago, and even if you'd build it I'm
sceptical the glsl compiler would still produce the required code to trigger
it.
I have no idea what a good order of passes is, but just remove the workaround
and use the same order everywhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-05-16 01:09:34 +02:00
Emil Velikov 39ae284a69 egl-static: include libradeonwinsys.la only once
With this and the previous patch, we no longer have multiple
definitions in the final egl_gallium.so.

v2: Drop duplicate libloader link.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Chia-I Wu <olv@lunarg.com> (v1)
Reviewed-by: Tom Stellard <thomas.stellard@amd.com> (v1)
2014-05-15 17:32:31 +01:00