Commit Graph

9511 Commits

Author SHA1 Message Date
Stefan Dösinger e866bd1ade r300g: Give CLIP_DISABLE another try
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-12-04 00:07:13 +01:00
James Benton 16f0d70ffe llvmpipe: Implement PIPE_QUERY_TIMESTAMP and PIPE_QUERY_TIME_ELAPSED.
This required an update for the query storage in llvmpipe, there
can now be an active query per query type, so an occlusion query
can run at the same time as a time elapsed query.

Based on PIPE_QUERY_TIME_ELAPSED patch from Dave Airlie.

v2: fix up piglits for timers (also from Dave Airlie)

a) if we don't render anything the result is 0, so just
return the current time

b) add missing screen get_timestamp callback.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-12-03 17:21:57 +00:00
José Fonseca 6a2f2300a8 llvmpipe: Refactor convert_to/from_blend_type to convert in place.
This fixes the "Source and destination overlap in memcpy" valgrind
warnings.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca 03aa3fd54b llvmpipe: Improve color buffer loads/stores alignment.
Tell LLVM the exact alignment we can guarantee, based on the fs block
dimensions, pixel format, and the alignment of the resource base pointer
and stride.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-12-03 14:02:43 +00:00
José Fonseca 0bc6ec238b llvmpipe: Recompute the fs shader key when framebuffer varies.
The fs shader now depends on the color buffer formats. The shader key was
extended to accommodate this, but llvmpipe_update_derived needs to be
updated to check the framebuffer dirty flag.

This fixes bug 57674.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2012-12-03 14:02:43 +00:00
Marek Olšák 54ff536823 r300g: increment num_z_clears only if we have Hyper-Z 2012-12-02 22:22:39 +01:00
Marek Olšák 838b19609f r300g: add blacklist for apps that shouldn't steal hyperz access 2012-12-02 22:18:11 +01:00
Marek Olšák 12dcbd5954 r300g: enable Hyper-Z by default on r500
I fixed the only known bugs on r500 with 0222b2bd41.
Now there are no piglit regressions with Hyper-Z and all apps I tested seem
to work.

To summarize how it works:
- Only one process can use it at a time. This is a hardware limitation.
- The first process to clear a zbuffer gets the exclusive access to use
  Hyper-Z.
- Compositors don't use any zbuffer, so they won't steal it, but some web
  browsers do, so make sure there's no web browser running if you want your
  game to use Hyper-Z.
- There's no need to restart an app which couldn't get the access to Hyper-Z.
  Just quit the app which took it, the driver can turn it on for the other app
  in the middle of rendering.
- If an app gets the access to Hyper-Z, it prints "radeon: Acquired Hyper-Z"
  to stdout.

r300-r400:
  Hyper-Z will be enabled by default on r300-r400 once sufficient testing is
  done with piglit and Lightsmark at least.
  Be sure to set the env var RADEON_HYPERZ and run piglit with parameters: -c 0
2012-12-02 18:07:26 +01:00
Marek Olšák 0222b2bd41 r300g: clear the ZB cache before clearing ZMASK or HIZ
This fixes wrong rendering in Lightsmark and
the piglit/depthstencil-render-miplevels.

I think I fixed Hyper-Z. So far every app seems to work like a charm.
2012-12-02 07:07:33 +01:00
Marek Olšák 62cba629c0 Revert "r300g: fix occlusion queries when depth test is disabled or zbuffer is missing"
It broke Hyper-Z terribly.
2012-12-02 07:07:33 +01:00
Marek Olšák 8ad9d42b33 r300g: refuse to create too large textures 2012-12-01 22:41:39 +01:00
Marek Olšák e694ea09f5 r300g: fix memory leaks in texture_create error paths 2012-12-01 22:38:36 +01:00
Marek Olšák 3e3a586236 r300g: fix revoking hyperz access
The bug was uncovered by 67c8e96f5a.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57763
2012-12-01 21:43:17 +01:00
Marek Olšák 224d0e4a3f r300g: handle map flag DISCARD_WHOLE_RESOURCE
This should improve performance in apps which trigger this codepath.
(e.g. Wine does)
2012-12-01 14:33:11 +01:00
Dave Airlie d128ae347a svga: remove pointless assert on unsigned >= 0
all unsigneds are >= 0 :-)

There may be an argument for leaving this in, in case someone
changes min_lod to an integer, so feel free to apply or drop.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-01 11:25:15 +10:00
Dave Airlie 67c8e96f5a r300g: fix comparison of hyperz flush time.
I haven't confirmed this is doing the correct thing, but at
least this might make someone review it!

Reported by internal RH coverity scan.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-12-01 11:23:48 +10:00
José Fonseca e7177e362e llvmpipe: Remove remnants of lp_tile_soa from Makefile.
Completely forgot about updating Makefile when removing it. Stephane
already fixed the make build, but there were a few mentions of
lp_tile_soa left in the tree.
2012-11-30 07:07:38 +00:00
Vinson Lee f126f34c1d llvmpipe: Fix incorrect sizeof.
Fixes sizeof not portable defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-29 21:08:48 -08:00
Stéphane Marchesin 4430d44eac llvmpipe: Fix build break from 75da95c50
The Makefile looks for a file which is gone (lp_tile_soa.c)

http://bugs.freedesktop.org/show_bug.cgi?id=57713
2012-11-29 19:54:34 -08:00
Vincent Lejeune 3fcb3fbf22 r600g: mirror simplification of if/break opcodes
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Vincent Lejeune 5fda2990aa r600g: separate resource_id and sampler_id tex info in tgsi-to-llvm
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-11-29 22:15:18 +01:00
Roland Scheidegger 6d50148742 llvmpipe: support array textures
This adds array (1d,2d) texture support to llvmpipe.
Though probably should do something about 1d array textures requiring gobs
of memory (this issue is not strictly limited to arrays but it is probably
worse there).
Initial code by Jakob Bornecrantz <jakob@vmware.com>

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-29 15:30:19 +01:00
José Fonseca 88e92f5bcd llvmpipe: Remove lp_build_blend_soa()
No longer used/necessary, as we always blend in AoS now.

Trivial.
2012-11-29 14:08:43 +00:00
José Fonseca 75da95c50a llvmpipe: Eliminate color buffer swizzling.
Now dead code.

Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.

Depth-stencil buffers are still swizzled.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:43 +00:00
José Fonseca 6916387e53 llvmpipe: Only advertise unswizzled formats.
Update llvmpipe_is_format_supported and llvmpipe_is_format_unswizzled
so that only the formats that we can render without swizzling are
advertised.

We can still render all D3D10 required formats except
PIPE_FORMAT_R11G11B10_FLOAT, which needs to be implemented in a future
opportunity.

Removal of rendertarget swizzling will be done in a subsequent change.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2012-11-29 14:08:42 +00:00
Michel Dänzer 8b6aec6533 radeonsi: Bitcast result of packf16 intrinsic to float for export intrinsic.
Fixes 7 piglit tests, and prevents many more from crashing.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-and-Tested-by: Christian König <christian.koenig@amd.com>
2012-11-29 10:08:53 +01:00
José Fonseca 1cead8845b llvmpipe: Implement logic ops for the AoS path.
It was forgotten in the previous patch series, but it is trivial to
implement, based on the SoA path.

This fixes glean logicOp failures.
2012-11-28 20:45:18 +00:00
José Fonseca 547efc76df llvmpipe: Don't use dynamically sized arrays.
Unfortunately for MSVC arrays with a constant variable size are still
considered dynamically sized.
2012-11-28 19:58:47 +00:00
James Benton 960ab06da0 llvmpipe: Update llvmpipe_is_format_unswizzled to reflect latest changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton 66fdf626bb llvmpipe: Enable vertex color clamping.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton fa1b481c09 llvmpipe: Unswizzled rendering.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton 1d3789bccb gallivm: Updated lp_build_const_mask_aos to input number of channels.
Also updated lp_build_const_mask_aos_swizzled to reflect this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
Michel Dänzer 6e33b55ee1 radeonsi: Reinstate assertions against invalid colour/depth formats.
radeonsi now supports Z16 and doesn't fail these assertions anymore.

This partially reverts commit 7bba4879bb, but
leaves the error messages in place to allow diagnosing such problems even with
non-debugging builds.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 15:48:50 +01:00
Michel Dänzer a8d46d0173 radeonsi: Re-enable Z16 depth buffers.
8 more piglits.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 13:53:54 +01:00
Marek Olšák 726fe54cbc radeonsi: remove redundant parameter in r600_init_surface
[ Cherry-picked from r600g commit f5ac60152b ]
2012-11-28 13:35:17 +01:00
Michel Dänzer fa83d52961 radeonsi: Use explicit stencil mipmap level offsets.
Extracted from r600g commit 428e37c2da.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Marek Olšák 39b56afaa2 radeonsi: correct texture memory size for Z32F_S8X24
[ Cherry-picked from r600g commit ea72351a91 ]
2012-11-28 13:35:17 +01:00
Michel Dänzer 20f651d003 radeonsi: Depth/stencil fixes.
Adapted from r600g commit 018e3f75d6.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Michel Dänzer 1a616c1009 radeonsi: Flesh out support for depth/stencil exports from the pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Michel Dänzer 49003a5cb6 radeonsi: Fix sampler views for depth textures.
Consistently reference the flushed depth texture in the sampler view, not the
original one.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Jerome Glisse 3c024624fd radeonsi: Fix z/stencil texture creation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>

[ Cherry-picked from r600g commit b4f0ab0b22 ]
2012-11-28 13:35:16 +01:00
Roland Scheidegger 0b6554ba6f gallivm,llvmpipe: handle TXF (texelFetch) instruction, including offsets
This also adds some code to handle per-quad lods for more than 4-wide fetches,
because otherwise I'd have to integrate the texelFetch function into
the splitting stuff... (but it is not used yet outside texelFetch).
passes piglit fs-texelFetch-2D, fails fs-texelFetchOffset-2D due to I believe
a test error (results are undefined for out-of-bounds fetches, we return
whatever is at offset 0, whereas the test expects [0,0,0,1]).
Texel offsets are only handled by texelFetch for now, though the interface
can handle it for everything.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 03:26:49 +01:00
Marek Olšák cff4c948ed r600g: fix broken streamout if streamout_begin caused a context flush
This fixes graphics corruption in the case where the DISCARD_RANGE flag
is used to map a buffer.

NOTE: This is a candidate for the stable branches.
2012-11-23 00:42:02 +01:00
Marek Olšák d172fa825b r600g: fix ARB_map_buffer_alignment with unaligned offsets and staging buffers 2012-11-22 22:40:06 +01:00
Tom Stellard 71877143b6 r300/compiler: Avoid generating MOV instructions for invalid IMM swizzles v2
If an instruction reads from a constant register that contains
immediates using an invalid swizzle, we can avoid generating MOV
instructions to fix up the swizzle by loading the immediates into a
different constant register that can be read using a valid swizzle.

This only affects r300 and r400 cards.

For example:

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;

========== Before this change would be lowered to: =========

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MOV temp[0].x, const[1].x___;
MOV temp[0].y, const[1]._z__;
MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;

========== After this change is lowered to:  ===============

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }
CONST[2] = {     0.0000    -3.5000     2.5000     0.0000 }

MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;

============================================================

This change reduces one of the Lightsmark shaders from 133 to 91
instructions.

v2:
  - Fix crash caused by swizzles with only inline constants.
2012-11-16 17:07:11 -05:00
Alex Deucher 26463b8996 radeonsi: clean up some magic numbers
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-16 13:02:42 -05:00
Alex Deucher ce17964fe5 radeonsi: emit PA_SC_RASTER_CONFIG
Use per asic golden values.

Programming this register doesn't seem to be strictly
necessary on SI, but programming it wrong leads to
rendering issues or reduced performance so just
go ahead and program the golden values explicitly
to avoid any potential problems down the road.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-16 13:02:42 -05:00
Alex Deucher 7bba4879bb radeonsi: remove new asserts and replace with warnings
Fixes piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-11-15 15:46:02 -05:00
Alex Deucher 3893593732 radeonsi: cleanup si_db()
Clean up a few magic numbers and rework the code a bit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-15 12:11:28 -05:00
Alex Deucher 565c29f221 radeonsi: assert the CB format is valid (v2)
Assert the the CB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.

v2: use INVALID hw format rather than ~0U

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-15 12:10:48 -05:00