Commit Graph

47068 Commits

Author SHA1 Message Date
Tom Stellard 438b1da7e5 radeon/llvm: Handle loads from the constants address space.
Reading from constant memory is not supported yet, so constant reads use
global memory.
2012-09-21 19:30:58 +00:00
Tom Stellard 3882d7b5e4 radeon/llvm: Add support for v4f32 stores on R600 2012-09-21 19:30:58 +00:00
Tom Stellard e866dbd1b5 radeon/llvm: Add support for i8 reads on R600 2012-09-21 19:30:57 +00:00
Tom Stellard b282c9611e radeon/llvm: Expand vector fadd and fmul on R600 2012-09-21 19:30:57 +00:00
Tom Stellard aa8367dd13 radeon/llvm: Add optimization for FP_ROUND 2012-09-21 19:30:57 +00:00
Tom Stellard 87decd6e66 radeon/llvm: Replace AMDGPU pow intrinsic with the llvm version 2012-09-21 19:30:53 +00:00
Paul Berry aa3c2e3186 i965/blorp: Fix narrowing warnings.
Blorp has to convert rectangle coordinates from integers to floats in
order to send them down the GPU pipeline.  Recent versions of GCC
issue a warning for this, since a float is not capable of precisely
representing all possible 32-bit integer values.  Suppress the warning
with an explicit type cast in the case of blorp, since rectangle
coordinates will never be large enough to cause a loss of precision.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-21 10:53:25 +02:00
Kenneth Graunke cd49025aff i965: Remove brw_set_predicate_inverse(p, true) from scratch offset code
Given that it exists between a push/pop of instruction state, this call
can only affect the MOV or ADD instruction generated just below it.
Neither of those instructions are predicated, so it makes no sense to
ask for the inverse predicate.

This fixes grumblings from the simulator debugger, which was
complaining about an invalid predicate.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-09-21 01:29:40 -07:00
Kenneth Graunke 328961d955 mesa: Don't override S3TC internalFormat if data is pre-compressed.
Commit 42723d88d intended to override an S3TC internalFormat to a
generic compressed format when the application requested online
compression of uncompressed data.  Unfortunately, it also broke
pre-compressed textures when libtxc_dxtn isn't installed but the
extensions are forced on.

Both glCompressedTexImage2D() and glTexImage2D() call teximage(), which
calls _mesa_choose_texture_format(), hitting this override code.  If we
have actual S3TC source data, we can't treat it as any other format, and
need to avoid the override.

Since glCompressedTexImage2D() passes in a format of GL_NONE (which is
illegal for glTexImage), we can use that to detect the pre-compressed
case and avoid the overrides.

Fixes a regression since 42723d88d3.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-and-tested-by: Jordan Justen <jordan.l.justen@intel.com>
2012-09-20 14:49:19 -07:00
Kenneth Graunke e2249e8c4d i965/blorp: Add support for blits between SRGB and linear formats.
Fixes colorspace issues in L4D2 when multisampling is enabled (the
scene was far too dark, but the flashlight area was way too bright).

The nVidia and AMD binary drivers both allow this kind of blit.

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-20 14:48:02 -07:00
Kenneth Graunke c96828ecb4 mesa: Ignore SRGB when determining compatible resolve formats.
MSAA resolves and other blit-like operations ignore SRGB state anyway,
so we should be able to safely allow resolves between compatible
SRGB/linear formats like SRGBA8 and RGBA8888.

This matches the behavior of the nVidia and AMD binary drivers.

Fixes completely black rendering when using multisampling in L4D2.

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-20 14:47:23 -07:00
Brian Paul 58f386b20b gallium: mention PIPE_TIMEOUT_INFINITE in the fence_finish() comment 2012-09-20 09:49:12 -06:00
Brian Paul 0bcad02955 llvmpipe: fix overflow bug in total texture size computation
v2: use uint64_t for the total_size variable, per Jose.

Also add two earlier checks for exceeding the max texture size.
For example a 1K^3 RGBA volume would overflow the lpr->image_stride
variable.

Use simple algebra to avoid overflow in intermediate values.
So instead of "x * y > z" use "x > z / y".

This should work if we happen to be on a platform that doesn't have
64-bit types.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-20 09:47:09 -06:00
Alex Deucher 7b4aefd3c9 r600g/llvm: rs780/rs880 are r600 asics
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2012-09-20 11:17:52 -04:00
Ian Romanick ae3023e967 mesa: Allow glGetTexParameter of GL_TEXTURE_SRGB_DECODE_EXT
This was already (correctly) supported for glGetSamplerParameter paths.

NOTE: This is a candidate for stable branches.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-09-20 11:42:56 +02:00
Tom Stellard bd8fb9e805 r300/compiler: Use precomputed q values in the register allocator 2012-09-19 19:25:53 -04:00
Tom Stellard 886a4d4a6a r300g: Init regalloc state during context creation
Initializing the regalloc state is expensive, and since it is always
the same for every compile we only need to initialize it once per
context.  This should help improve shader compile times for the driver.
2012-09-19 19:25:53 -04:00
Tom Stellard 9282adcae9 r300/compiler: Don't create register classes for inputs 2012-09-19 19:25:53 -04:00
Tom Stellard e0f64a837f ra: Add q_values parameter to ra_set_finalize()
This allows the user to pass precomputed q values to the allocator.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-19 19:25:53 -04:00
Tom Stellard cfeb99c7da ra: Clarify usage of ra_set_node_reg()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-09-19 19:25:53 -04:00
Tom Stellard 69b387fbdc r600g: Invalidate texture cache when creating vertex buffers for compute v2
Compute shaders fetch data from vertex buffers via the texture cache, so
we need to make sure the texture cache is flushed.

v2:
  - Fix rebase mistake
  - Fix spelling in comment

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard 810345492e r600g: Use LOOP_START_DX10 for loops
LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited
to 4096 iterations like the other LOOP_* instructions.  Compute shaders
need to use this instruction, and since we aren't optimizing loops with
the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like
we should just use it for everything.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard 3e3ca92718 r600g: Set the correct value of COLOR*_DIM for RATs
For buffers (which is what is being used for RATs), the
COLOR*_DIM.WIDTH_MASK field needs to be set to the low 16-bits of the
buffer size, and the COLOR*_DIM.HEIEGHT_MAX needs to be set to the
high bits.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard 9db64530bb r600g: Make sure to initialize DB_DEPTH_CONTROL register for compute
The kernel CS checker will fail if this register is not initialized.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard 69d814885b r600g: Add some comments and debug printfs to compute code
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 14:58:53 -04:00
Tom Stellard 6bd11bc9d5 r600g: Add missing break to case statement 2012-09-19 15:27:32 -04:00
Michal Sciubidlo 0e0c21e00e radeon/llvm: Emit ISA for ALU instructions in the R600 code emitter
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
2012-09-19 13:17:41 -04:00
Tom Stellard d525ed1a84 radeon/llvm: Only support 512 constant registers on R600
This is necessary upcoming encoding changes, since we will only be
using 9-bits for register encoding.
2012-09-19 13:11:36 -04:00
Brian Paul ead9cfdcc4 Revert "mesa: consolidate subtexture x/y/width/height error checking code"
This reverts commit 5b807400a8.

accidentally pushed.
2012-09-19 10:07:45 -06:00
Brian Paul e1e302c7f6 Revert "more comment"
This reverts commit 5205db6a7c.

accidentally pushed
2012-09-19 10:07:34 -06:00
Brian Paul f51d232e5f Revert "mesa: clean-up and fix glCompressedTexSubImage error checking"
This reverts commit 0c67fe5d2d.

accidentally pushed.
2012-09-19 10:07:22 -06:00
Brian Paul 0c67fe5d2d mesa: clean-up and fix glCompressedTexSubImage error checking 2012-09-19 09:21:03 -06:00
Brian Paul 5205db6a7c more comment 2012-09-19 09:21:03 -06:00
Brian Paul 5b807400a8 mesa: consolidate subtexture x/y/width/height error checking code
This is the code that checks if a subtexure region is aligned to the
compressed format's block size.
2012-09-19 09:21:03 -06:00
Vadim Girlin 9aa8bac98b winsys/radeon: fix relocs caching
Don't cache pointers to elements of reallocatable array.
In some circumstances it caused false cache hits resulting in incorrect
command stream and gpu lockup.

Note: This is a candidate for the stable branches.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-09-19 04:48:16 +04:00
Vincent Lejeune 175fdd7b86 radeon/llvm: Add a fdiv pattern.
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-09-18 18:00:20 +02:00
Vincent Lejeune 12c4526157 radeon/llvm: reserve also corresponding 128bits reg
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
2012-09-18 17:59:51 +02:00
Brian Paul 7d624799b9 softpipe: implement the new can_create_resource() function
And define a SP_MAX_TEXTURE_SIZE value as we do in llvmpipe.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul b9e88c5592 llvmpipe: implement the new can_create_resource() function
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul ead8847d44 st/mesa: implement new proxy texture code
If the gallium driver implements the can_create_resource() function, call
it to do proxy texture size checks.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul bd8b43a9f4 gallium: add new pipe_screen::can_create_resource() function
Used to implement proxy textures.  If a gallium driver doesn't implement
this function we'll just continue to use the core Mesa fallback code.

Without this hook we really have no good way to implement OpenGL proxy
textures with gallium drivers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:27 -06:00
Brian Paul a0fc7620f5 mesa: take cube faces into account in _mesa_test_proxy_teximage()
There will always be six cube faces so take that into consideration when
computing the texture size and comparing against the limit.
2012-09-17 19:49:27 -06:00
Brian Paul 90ca4c0c62 mesa: handle GL_PROXY_TEXTURE_CUBE_MAP in _mesa_num_tex_faces() 2012-09-17 19:49:27 -06:00
Brian Paul df73be9105 llvmpipe: set max cube texture size to 4K x 4K
Before, the limit was 8K.  For 32-bit RGBA that would be require 1.5 GB
of memory (w/out mipmaps).  That's well beyond the LP_MAX_TEXTURE_SIZE
of 1GB.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul 7dc76e9424 mesa: move/fix levels check for glTexStorage()
Fix copy&paste error and move min levels check closer to max levels check.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul ff24ed09fa mesa: rewrite glTexStorage() code
Simplify the code and make it more like the other glTexImage commands.
Call _mesa_legal_texture_dimensions() to validate width, height, depth.
Call ctx->Driver.TestProxyTexImage() to make sure texture is not too large.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul e6eaa85a43 mesa: rework texture size error checking
There are two aspects to texture image size checking:
1. Are the width, height, depth legal values (not negative, not larger
   than the max size for the mipmap level, etc)?
2. Is the texture just too large to handle?  For example, we might not be
   able to really allocate memory for a 3D texture of maxSize x maxSize x
   maxSize.

Previously, we did (1) via the ctx->Driver.TestProxyTextureImage() hook
but those tests are really device-independent.  Now we do (2) via that
hook since the max texture memory and texture shape are device-dependent.

Also, (1) is now done outside the general texture parameter error checking
functions because of the special interaction with proxy textures.  The
recently introduced PROXY_ERROR token is removed.

The teximage() and copyteximage() functions are bit simpler now (less
if-then nesting, etc.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul ce2ae3c3a2 mesa: refactor _mesa_test_proxy_teximage() code
Basically, move the body into a new _mesa_legal_texture_dimensions() function.
More refactoring to come.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul b1874ec931 mesa: move glTexImage 'level' error checking
Move level checking out of _mesa_test_proxy_teximage() and into
the other error-checking functions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-09-17 19:49:26 -06:00
Brian Paul 35f16600b3 mesa: change create_version_string() return type to void
Fixes "warning: no return statement in function returning non-void"
2012-09-17 19:46:20 -06:00