Commit Graph

87933 Commits

Author SHA1 Message Date
Marek Olšák
22f5dfd300 radeonsi: don't read the number of TCS out vertices from an SGPR in TCS
-16 bytes in one shader binary.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:07 +02:00
Marek Olšák
17dd4856a6 radeonsi: don't always apply the PrimID instancing bug workaround on SI
It looks like commit 391673af7a that should
have fixed the perf regression didn't really change much if anything.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:06 +02:00
Marek Olšák
a0823df148 radeonsi: remove 2 callbacks from si_shader_context
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 13:00:06 +02:00
Marek Olšák
1cda9a2fee winsys/amdgpu: disable local BOs on Raven
It hangs with a high degree of reproducibility.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 12:57:48 +02:00
Marek Olšák
7b4b8f6373 disk_cache: make the thread queue resizable and low priority
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 12:57:14 +02:00
Thomas Hellstrom
e96d175c7d loader/dri3: Make sure we invalidate a drawable on size change
If we're seeing a drawable size change, in particular after processing a
configure notify event, make sure we invalidate so that the state tracker
picks up the new geometry.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 12:43:29 +02:00
Thomas Hellstrom
a727c804a2 loader/dri3: Process event after each fence wait
This tries to mimic dri2 behaviour where events are typically processed
while waiting for X replies. Since, during steady-state dri3 rendering, we
seldom wait for xcb replies, and haven't enabled any automatic event
processing, instead check for events after a fence wait.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 12:43:29 +02:00
Marek Olšák
e4018fdd85 st/mesa: skip draw calls with pipe_draw_info::count == 0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102502

Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-09-07 12:34:28 +02:00
Samuel Pitoiset
86b99893eb radv: do not use a bitfield when dirtying the vertex buffers
Useless to track which one has been updated because we
re-upload all the vertex buffers in one shot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-07 10:01:21 +02:00
Samuel Pitoiset
2408f616e8 radv: remove unused radv_meta_saved_state::vertex_saved field
It's always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-07 10:01:21 +02:00
Eric Engestrom
77713a0acb mesa: allow user to set MESA_NO_ERROR=0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102530
Cc: Michel Dänzer <michel@daenzer.net>
Cc: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2017-09-07 08:54:44 +01:00
Eric Engestrom
56f16c4fbb util: rename include guard to avoid clash
src/mesa/main/debug.h uses the same include guard.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-07 08:54:44 +01:00
Roland Scheidegger
6d9d6071ee llvmpipe, tgsi: hook up dx10 gather4 opcode
Trivial. We already support tg4 for legacy tex opcodes, so the actual
texture sampling code already handles it.
(Just like TG4, we don't handle additional capabilities and always sample
red channel.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-07 03:32:01 +02:00
Roland Scheidegger
de6810d9be llvmpipe, draw: increase shader cache limits
We're not particularly concerned with memory usage if the tradeoff is
shader recompiles. It's common for apps to have a lot of shaders
nowadays, and since our shaders include a LOT of context state, we may
create quite a few more shaders still.
So quadruple the number of shaders draw will cache (from 128 to 512).
For llvmpipe (fs shaders), quadruple the number of instructions but keep
the number of variants the same for now (the variant limit could only
really be reached with very simple, non-texturing shaders). Also simplify
the definition; it's probably easier to just have one different definition
per branch...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-09-07 03:32:01 +02:00
Dave Airlie
e852ecd22b ac/surface: reduce gfx9_surface_layout size.
152->144.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
cc73ab9884 radv: reduce radv_amdgpu_winsys struct size.
1168->1160.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
3cc620bf55 radv: reduce radv_image struct size.
1480->1472.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
66031d8925 radv: reduce radv_shader_variant struct size.
544->536

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
a2c2a76c9e radv: reduce radv_cmd_state struct size.
1632->1624.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:08 +10:00
Dave Airlie
f45e768413 radv: reduce meta_saved_state struct size.
904->896.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:07 +10:00
Dave Airlie
42d50c779b nir: put compact into bitfields in nir_variable_data
Because this is declared bool, it won't get merged with the preceding
bitfields. This seems like an oversight rather than a deliberate choice.

Noticed when running pahole.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-07 11:00:04 +10:00
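
A minimal sketch of the bitfield point made in 42d50c779b above. The two structs are hypothetical stand-ins, not the real nir_variable_data layout, and bitfield layout is implementation-defined, so only a weak inequality is asserted.

```cpp
#include <cstddef>

// Hypothetical stand-ins, not the real nir_variable_data layout.
struct flags_with_bool {
   unsigned location : 31;
   bool compact;            // a plain bool gets its own storage
};

struct flags_with_bitfield {
   unsigned location : 31;
   unsigned compact : 1;    // can share the storage unit of the field above
};

// Bitfield layout is implementation-defined, so only the weak
// inequality is checked at compile time.
static_assert(sizeof(flags_with_bitfield) <= sizeof(flags_with_bool),
              "packing the flag must not grow the struct");

// Number of bytes the bitfield version saves on the current ABI.
constexpr std::size_t bytes_saved()
{
   return sizeof(flags_with_bool) - sizeof(flags_with_bitfield);
}
```

On common ABIs the bool variant is typically larger because the bool cannot reuse the spare bit of the preceding unit; a tool such as pahole, mentioned in the commit message, reports this kind of hole.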
Chad Versace
ec8ed2f277 anv: Annotate entrypoint table with index and func name
This helps when debugging a broken entrypoint table.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-09-06 13:07:12 -07:00
Leo Liu
e1e3c0384b radeon/uvd: fix the assertion check for YUYV format
Fixes: 7319ff87 ("radeon/uvd: add YUYV format support for target buffer")

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2017-09-06 15:53:18 -04:00
Anuj Phogat
4c4c28ca70 intel: Remove unused device info for KBL GT1.5
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-09-06 10:09:38 -07:00
Emil Velikov
54a789aa2a mesa: replace date/time macros with MESA_GIT_SHA1
The former are non-deterministic, result in non-reproducible builds, and
cause compilers to emit a warning.

Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
acf7f84564 mesa: don't use %s for PACKAGE_VERSION macro
The macro itself is a well defined string, which cannot cause issues
with printf or other printf-like functions.

All other places through Mesa already use it directly, so let's update
the final two instances.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-09-06 17:48:50 +01:00
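
A minimal sketch of the idea in acf7f84564 above: a macro that expands to a string literal can be concatenated directly instead of being passed through %s. The PACKAGE_VERSION value and both function names are assumptions made for illustration; the real macro comes from the build system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical value; the real macro is provided by the build system.
#define PACKAGE_VERSION "17.3.0-devel"

// Passes the version as a %s argument.
std::string version_with_format_arg()
{
   char buf[64];
   int n = std::snprintf(buf, sizeof(buf), "Mesa %s", PACKAGE_VERSION);
   assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
   (void)n;
   return buf;
}

// Relies on the compiler merging adjacent string literals.
std::string version_with_concatenation()
{
   return "Mesa " PACKAGE_VERSION;
}
```

Both functions produce the same text. When the concatenated literal is itself used as a printf-style format string, this is only safe because the version string contains no '%' characters.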
Emil Velikov
c9d449de64 egl/x11: advertise __DRI_USE_INVALIDATE for DRI2
Back in 2012 (commit 1e7776ca2b - egl: Remove bogus invalidate code.)
the loader use of invalidate() was purged as "bogus". One of the factors
defining that statement was the lack of the loader-side invalidate
extension - __DRI_USE_INVALIDATE.

Since then the commit was reverted (commit eed0a80137 - egl: Restore
"bogus" DRI2 invalidate event code.), always performing the driver
invalidate call, although the loader was never updated to expose the
extension.

Do so now, allowing the driver to do fine-grained tuning.

Cc: Eric Anholt <eric@anholt.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-09-06 17:48:50 +01:00
Emil Velikov
f24bc18162 egl/x11/dri3: adding missing __DRI_BACKGROUND_CALLABLE extension
Fixes: 3b7b6adf3a ("egl: Implement __DRI_BACKGROUND_CALLABLE")
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 17:48:50 +01:00
Emil Velikov
731ba6924a i965: expose RGBA visuals only on Android
As Marek pointed out in an earlier commit, exposing RGBA on other
platforms introduces ~500 Visuals, which are not tested.

Note that this does not quite happen yet, because the GLX code does not
check the masks - see scalarEqual().

Thus as we fix that, we'll run into the issue described.

v2: Rebase, while keeping loaderPrivate
v3: Beef-up commit message, getCapability() returns unsigned (Tapani)

Fixes: 1bf703e4ea ("dri_interface,egl,gallium: only expose RGBA visuals
on Android")
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Chad Versace <chadversary@chromium.org>
Cc: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2017-09-06 17:48:50 +01:00
Tim Rowley
dad32fc61c swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:36 -05:00
Tim Rowley
1ebf6fc865 swr/rast: Remove use of C++14 template variable
SWR rasterizer must remain C++11 compliant.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:29 -05:00
Tim Rowley
9df5691fff swr/rast: SIMD16 FE remove templated immediates workaround
Fixed properly in gcc-compatible fashion.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:23 -05:00
Tim Rowley
404ac6da9e swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble
For consistency and to support overloading.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:17 -05:00
Tim Rowley
6cb20c9f3a swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:12 -05:00
Tim Rowley
6afdc8732c swr/rast: Removed some trailing whitespace caught during review
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:02:06 -05:00
Tim Rowley
4edc5d8305 swr: set caps for VB 4-byte alignment
Needed to compensate for change to fetch jit requiring
alignment.

Fixes regressions in piglit: vertex-buffer-offsets and about
another hundred of the vs-input*byte* tests.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:59 -05:00
Tim Rowley
4475583f5e swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
2017-09-06 11:01:39 -05:00
Samuel Pitoiset
5c9af800cb radv: fix error code when resizing the upload BO
malloc() failures are unrelated to the device memory.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-09-06 15:52:19 +02:00
Gert Wollny
107ecd97f1 mesa/st/st_glsl_to_tgsi_temprename.cpp: Fix compilation with MSVC
If <windows.h> is included then max is a macro that clashes
with std::numeric_limits::max, hence undefine it.
For some reason the struct access_record is not recognized
outside the anonymous namespace, so make it a class.
The patch was successfully tested on AppVeyor.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 15:12:19 +02:00
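
A minimal sketch of the macro clash described in 107ecd97f1 above. To keep it buildable on any platform, the max macro that <windows.h> defines (unless NOMINMAX is set) is simulated with a local #define; the function names are hypothetical. The parenthesized call is a commonly used alternative, not something the commit message mentions.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Stand-in for <windows.h>, which defines min/max as function-like macros
// unless NOMINMAX is set. Simulated here so the sketch builds anywhere.
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Approach described in the commit message: drop the macro before use.
#ifdef max
#undef max
#endif

int largest_int()
{
   return std::numeric_limits<int>::max();
}

// Alternative: parenthesize the name so that a function-like macro called
// max could not expand here even if it were still defined.
int largest_int_parenthesized()
{
   int value = (std::numeric_limits<int>::max)();
   assert(value == INT_MAX);
   return value;
}
```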
Gert Wollny
09ffe274b0 mesa/st: glsl_to_tgsi: tie in new temporary register merge approach
This patch replaces the old register lifetime estimation and
rename mapping evaluation with the new one.

The performance of the current and the new implementation was compared
by running shader-db in one thread.

-----------------------------------------------------------
                    old          new(std::sort)

---------------- time ./run -j1 shaders --------------------

  real              5.80s          5.75s
  user              5.75s          5.70s
  sys               0.05s          0.05s

---- valgrind --tool=callgrind --dump-instr=yes------------

 merge               0.08%         0.18%
 estimate lifetime   0.02%         0.11%
 evaluate mapping  (incl=0.3%)     0.04%
 apply mapping       0.03%         0.02%

---   perf (approximate because of statistic sampling) ----

merge (total)        0.09%         0.16%
estimate lifetime    0.03%         0.10%
evaluate mapping  (incl=0.02%)     0.04%
apply mapping        0.04%         0.04%

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:52 +02:00
Gert Wollny
33b7728bf9 mesa/st: glsl_to_tgsi: Add test set for evaluation of rename mapping
The patch adds tests for the register rename mapping evaluation and
combined life time estimation and renaming.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:49 +02:00
Gert Wollny
84529c077b mesa/st: glsl_to_tgsi: add register rename mapping evaluator
The remapping evaluator first sorts the temporary registers in ascending
order of the first instruction of their life time, and then uses a binary
search to find merge candidates.
For the initial sorting it uses std::sort because qsort is quite slow in
comparison. By removing the define USE_STL_SORT in
  src/mesa/state_tracker/st_glsl_to_tgsi_temprename.cpp
one can enable the alternative code path that uses qsort.

Registers that are not written to are not considered for renaming since in
glsl_to_tgsi_visitor::renumber_registers they are eliminated anyway.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:46 +02:00
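
The sort-then-binary-search idea from 84529c077b above can be sketched roughly as follows. This is a simplified illustration, not the code in st_glsl_to_tgsi_temprename.cpp: the live_range type and compute_renames function are hypothetical, register indices are assumed to be 0..n-1, and a candidate is only merged when it starts strictly after the target ends, which is a conservative choice for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical record: a temporary register and the first and last
// instruction at which it is live.
struct live_range {
   int reg;
   int begin;
   int end;
};

// Returns a table where renames[r] is the register that r is merged into,
// or r itself if it keeps its own slot. Assumes reg values are 0..n-1.
std::vector<int> compute_renames(std::vector<live_range> ranges)
{
   const std::size_t n = ranges.size();
   std::vector<int> renames(n);
   std::iota(renames.begin(), renames.end(), 0);
   std::vector<bool> merged(n, false);

   // Step 1: sort ascending by the first instruction of the life time.
   std::sort(ranges.begin(), ranges.end(),
             [](const live_range &a, const live_range &b) {
                return a.begin < b.begin;
             });

   for (std::size_t t = 0; t < n; ++t) {
      if (merged[t])
         continue;
      std::size_t search_start = t + 1;
      while (search_start < n) {
         // Step 2: binary search for the first range that begins after
         // the target's current end.
         auto it = std::upper_bound(
            ranges.begin() + search_start, ranges.end(), ranges[t].end,
            [](int end, const live_range &r) { return end < r.begin; });
         std::size_t c = static_cast<std::size_t>(it - ranges.begin());
         while (c < n && merged[c])
            ++c; // skip ranges already merged into an earlier target
         if (c == n)
            break;
         assert(ranges[c].reg >= 0 &&
                static_cast<std::size_t>(ranges[c].reg) < n);
         renames[ranges[c].reg] = ranges[t].reg;
         ranges[t].end = ranges[c].end; // the target now covers both ranges
         merged[c] = true;
         search_start = c + 1;
      }
   }
   return renames;
}
```

For example, with the ranges r0 = [0, 5], r1 = [2, 8], r2 = [6, 10] and r3 = [11, 12], r2 and r3 are both folded into r0, while r1 overlaps r0 and keeps its own slot.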
Gert Wollny
7be6d8fe12 mesa/st: glsl_to_tgsi: add tests for the new temporary lifetime tracker
This patch adds a set of unit tests for the new lifetime tracker.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:43 +02:00
Gert Wollny
978c437b12 mesa/st: glsl_to_tgsi: implement new temporary register lifetime tracker
This patch adds a class for tracking the life times of temporary registers
in the glsl to tgsi translation. The algorithm runs in three steps:
First, in order to minimize the number of needed memory allocations, the
program is scanned to evaluate the number of scopes.
Then, the program is scanned a second time to record the important register
access time points: first and last reads and writes and their link to the
execution scope (loop, if/else branch, switch case).
In the third step, the actual minimal life time is evaluated for each
register.

In addition, when compiled in debug mode (i.e. NDEBUG is not defined)
the shaders and estimated temporary life times can be logged to stderr
by setting the environment variable GLSL_TO_TGSI_RENAME_DEBUG.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:39 +02:00
Gert Wollny
732246701f mesa/st: glsl_to_tgsi move some helper classes to extra files
To prepare the implementation of a temp register lifetime tracker,
some of the classes are moved into separate header/implementation
files to make them accessible from other files.

Specifically these are:

    class st_src_reg;
    class st_dst_reg;
    class glsl_to_tgsi_instruction;
    struct rename_reg_pair;

    int swizzle_for_type(const glsl_type *type, int component);

  as inline:

    bool is_resource_instruction(unsigned opcode);
    unsigned num_inst_dst_regs(const glsl_to_tgsi_instruction *op);
    unsigned num_inst_src_regs(const glsl_to_tgsi_instruction *op);

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-09-06 11:49:27 +02:00
Dave Airlie
b65ff7a02d st_glsl_to_tgsi: rewrite rename registers to use array fully.
Instead of having to search the whole array, just use the whole
thing and store a valid bit in there with the rename.

Removes this from the profile on some of the fp64 tests

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-09-06 11:44:16 +02:00
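
A minimal sketch of the lookup scheme described in b65ff7a02d above: one table slot per register with a valid bit, so applying a rename is a direct index rather than a search. The rename_entry type and both helper functions are hypothetical names used for illustration, not the definitions in st_glsl_to_tgsi.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical entry type: one slot per temporary register.
struct rename_entry {
   bool valid;   // true if this register has a rename
   int new_reg;  // target register, meaningful only when valid is set
};

// Builds a full-size table from (old_reg, new_reg) pairs.
std::vector<rename_entry>
build_rename_table(std::size_t num_regs,
                   const std::vector<std::pair<int, int>> &renames)
{
   std::vector<rename_entry> table(num_regs, rename_entry{false, 0});
   for (const auto &r : renames) {
      assert(r.first >= 0 && static_cast<std::size_t>(r.first) < num_regs);
      table[r.first] = rename_entry{true, r.second};
   }
   return table;
}

// Constant-time lookup: index by register and check the valid bit.
int apply_rename(const std::vector<rename_entry> &table, int reg)
{
   assert(reg >= 0 && static_cast<std::size_t>(reg) < table.size());
   return table[reg].valid ? table[reg].new_reg : reg;
}
```

The trade-off is a table sized to the full register count in exchange for avoiding a scan over the list of renames for every operand.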
Nicolai Hähnle
45c5c44451 radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug
When the HS wave is empty, the hardware writes the LS VGPRs starting at
v0 instead of v2. Work around this by shifting them back into place when
necessary. For simplicity, this is always done in the LS prolog.

According to the hardware team, this will be fixed in future chips,
so take that into account already.

Note that this is not a bug fix, as the bug was already worked
around by commit 166823bfd2 ("radeonsi/gfx9: add a temporary workaround
for a tessellation driver bug"). This change merely replaces the
workaround by one that should be better.

v2: add workaround code to shader only when necessary
v3: clarify the prefer_mono comment

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 10:02:49 +02:00
Nicolai Hähnle
552aaa11ed ac/debug: take ASIC generation into account when printing registers
There were some overlapping changes in gfx9 especially in the CB/DB
blocks which made register dumps rather misleading.

The split is along the lines of the header files, so we'll print VI-only
fields on SI and CI, for example, but we won't print GFX9 fields on
SI/CI/VI, and we won't print SI/CI/VI fields on GFX9.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:19 +02:00
Nicolai Hähnle
274f1dace7 amd/common: pass chip_class to ac_dump_reg
Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:17 +02:00
Nicolai Hähnle
925ad7d2f6 ac/sid_tables: add FieldTable object
Automatically re-use table entries like StringTable and IntTable do.
This allows us to get rid of the "fields_owner" logic, and simplifies
the next change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-09-06 09:59:14 +02:00