Commit Graph

74545 Commits

Author SHA1 Message Date
Jason Ekstrand 9d18555c8d anv/gen7: Add push constant support 2015-11-10 15:14:11 -08:00
Glenn Kennard 3f45d29fe4 r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 09:06:25 +10:00
Dave Airlie b3e793f2db Revert "r600g: Pass conservative depth parameters to hw"
This reverts commit a1fc78911e.

I pushed the wrong patch.
2015-11-11 09:05:50 +10:00
Jason Ekstrand 427978d933 anv/device: Use an actual int64_t in WaitForFences 2015-11-10 15:02:52 -08:00
Jason Ekstrand d9079648d0 anv/meta: Create a sampler in meta_emit_blit 2015-11-10 14:43:18 -08:00
Glenn Kennard c878d61124 r600g: Implement ARB_texture_view
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-11-11 08:36:08 +10:00
Glenn Kennard a1fc78911e r600g: Pass conservative depth parameters to hw
Supported on R700 and up.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-11 08:32:35 +10:00
Eduardo Lima Mitev de51676b41 i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const
When both fadd and fmul instructions have at least one operand that is a
constant and it is only used once, the total number of instructions can
be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because
the constants will be progagated as immediate operands of fmul and fadd.

This patch detects these situations and prevents fusing fmul+fadd into ffma.

Shader-db results on i965 Haswell:

total instructions in shared programs: 6235835 -> 6225895 (-0.16%)
instructions in affected programs:     1124094 -> 1114154 (-0.88%)
total loops in shared programs:        1979 -> 1979 (0.00%)
helped:                                7612
HURT:                                  843
GAINED:                                4
LOST:                                  0

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
Eduardo Lima Mitev fb3b5669ce util: Add list_is_singular() helper function
Returns whether the list has exactly one element.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-11-10 21:13:35 +01:00
Eduardo Lima Mitev 94ff35204d nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver
Because the next patch will add an optimization that is specific to i965,
we want to move this loweing pass to that driver altogether.

This is safe because i965 is the only consumer.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-11-10 21:13:35 +01:00
Kristian Høgsberg Kristensen 96b22fb080 glsl: Use array deref for access to vector components
We've assumed that we could lower per-component vector access from

  vec[i] = scalar

to

  vec = ir_triop_vector_insert(vec, scalar, i)

but with SSBOs (and compute shader SLM and tesselation outputs) this is
no longer valid. If a vector is "externally visible", multiple threads
can write independent components simultaneously. With lowering to
ir_triop_vector_insert, each thread read the entire vector, changes one
component, then writes out the entire vector. This is racy.

Instead of generating a ir_binop_vector_extract when we see v[i], we
generate ir_dereference_array. We then add a lowering pass to lower the
ir_dereference_array to ir_binop_vector_extract for rvalues and for to
vector_insert for lvalues in a separate lowering pass.

The resulting IR is the same as before, but we now have a window between
ast->ir conversion and the lowering pass where v[i] appears in the IR as
an array deref. This lets us run lowering passes that lower the vector
access to I/O (eg for SSBO load/store) before we lower the per-component
access to full vector writes.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen 60dd5287ff glsl: Lower UBO and SSBO access in glsl linker
All GLSL IR consumers run this lowering pass so we can move it to the
linker. This moves the pass up quite a bit, but that's the point: it
needs to run before we throw away information about per-component vector
access.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen f0e95c2500 glsl: Drop exec_list argument to lower_ubo_reference
We always pass in shader->ir and we already pass in the shader, so just
drop the exec_list. Most passes either take just a exec_list or a
shader, so this seems more consistent.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
2015-11-10 12:02:46 -08:00
Jason Ekstrand b461744c52 anv/gen7: Properly handle VS with VertexID but no vertices 2015-11-10 11:31:31 -08:00
Jason Ekstrand aafc87402d anv/device: Work around the i915 kernel driver timeout bug
There is a bug in some versions of the i915 kernel driver where it will
return immediately if the timeout is negative (it's supposed to wait
indefinitely).  We've worked around this in mesa for a few months but never
implemented the work-around in the Vulkan driver.

I rediscovered this bug again while working on Ivy Bridge becasuse the
drive in my Ivy Bridge currently has Fedora 21 installed which has one of
the offending kernels.
2015-11-10 11:24:11 -08:00
Connor Abbott 213f86416f nir/glsl: switch to using the builder
v2: use nir_bulder_cf_insert (Ken)

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:43 -05:00
Connor Abbott fbbfb7c025 nir/glsl: make emit() take nir_ssa_def * sources
Again, this matches what the builder will have to do.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:56:35 -05:00
Connor Abbott a60e990dd2 nir/glsl: convert nir_visitor::result to a nir_ssa_def *
Its only user now returns a nir_ssa_def *, and we'll need this since the
builder returns a nir_ssa_def *.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:54 -05:00
Connor Abbott 30fe8eaa8e nir/glsl: make evaluate_rvalue() return a nir_ssa_def *
A long time ago, before NIR was even merged to master, glsl_to_nir used
registers and these sources were actually register sources. But nowadays
everything in glsl_to_nir is an SSA value, so stop pretending that by
evaluating an rvalue we can get an arbitrary nir_src. Most importantly,
we need this since the builder takes nir_ssa_def * sources directly.

Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-11-10 13:55:14 -05:00
Jose Fonseca 6f42162329 st/mesa: Destroy buffer object's mutex.
Ideally we should have a _mesa_cleanup_buffer_object function in
src/mesa/bufferobj.c so that the destruction logic resided in a single
place.

Reviewed-by: Brian Paul <brianp@vmware.com>
2015-11-10 11:04:28 +00:00
Kenneth Graunke db54673b54 nir: Store PatchInputsRead and PatchOutputsWritten in nir_shader_info.
These tessellation shader related fields need plumbing through NIR.

v2: Use uint32_t instead of uint64_t to match the source type of
    GLbitfield (caught by Iago Toral).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2015-11-10 01:03:43 -08:00
Eric Anholt 437d7b6119 vc4: Avoid loading undefined (newly-allocated) FBO contents.
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear.  For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
2015-11-09 19:17:36 -08:00
Eric Anholt 5980389bbf vc4: Return NULL when we can't make our shadow for a sampler view.
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt eb8fb0064d vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails.
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
2015-11-09 19:17:36 -08:00
Eric Anholt 84608e07e7 vc4: Add CL dumping for GL_ARRAY_PRIMITIVE. 2015-11-09 19:17:36 -08:00
Eric Anholt 855a3ca598 vc4: Fix a compiler warning. 2015-11-09 19:17:36 -08:00
Jordan Justen fb3da129d1 glsl: Use shared storage variable type for shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:24 -08:00
Jordan Justen 32746fc9b4 glsl: Add shared variable type
Shared variables are stored in a common pool accessible by all threads
in a compute shader local work group.

These variables are similar to OpenCL's local/__local variables.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:24 -08:00
Jordan Justen c0ac4740a7 glsl: Add space to shader_storage in print_visitor
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:17 -08:00
Jordan Justen 007d96730e glsl: Align comments on variables types
v2:
 * Split from patch to add ir_var_shader_shared (tarceri)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:17 -08:00
Jordan Justen 8b28b35531 glsl: Parse shared keyword for compute shader variables
v2:
 * Move shared parsing under storage qualifiers (tarceri)
 * Fail to compile if shared is used in non-compute shader (tarceri)
 * Use separate shared_storage bit for shared variables (tarceri)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
2015-11-09 17:21:12 -08:00
Timothy Arceri a4a46fe3fa glsl: simplify interface block stream qualifier validation
Qualifiers on member variables are redundent all we need to do
if check if it matches the stream associated with the block and
throw an error if its not.

Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
2015-11-10 12:02:30 +11:00
Jason Ekstrand 06f466a770 anv/nir: Fix codegen in lower_push_constants 2015-11-09 16:29:05 -08:00
Jason Ekstrand abede04314 anv/gen7: Fix the length of 3DSTATE_SF 2015-11-09 16:04:07 -08:00
Jason Ekstrand e8c2a52a70 anv/gen7: Properly handle missing color-blend state 2015-11-09 16:04:06 -08:00
Jason Ekstrand 862da6a891 anv/device: Add a newline to the end of a comment 2015-11-09 16:04:06 -08:00
Nanley Chery 9c2b37a9c3 anv/formats: Define ETC2 formats
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery 41cf35d1d8 anv/image: Determine the alignment units for compressed formats
Alignment units, i and j, match the compressed format block
width and height respectively.

v2: Don't assert against HALIGN* and VALIGN* enums (Chad)

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery 381f602c6b anv/image: Handle compressed format qpitch and padding
Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery 300f7c2be3 anv/image: Handle compressed format stride and size
These formulas did not take compressed formats into account.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery 7b4244dea0 anv/formats: Add fields for block dimensions
A non-compressed texture is a 1x1x1 block. Compressed
textures could have values which vary in different
dimensions WxHxD.

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Nanley Chery a6c7d1e016 anv/formats: Add surface_format initializer
v2: Rename __brw_fmt to __hw_fmt (Chad)

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Chad Versace chad.versace@intel.com
2015-11-09 15:41:41 -08:00
Nanley Chery 3ee923f1c2 anv: Rename cpp variable to "bs"
cpp (chars-per-pixel) is an integer that fails to give useful data
about most compressed formats. Instead, rename it to "bs" which
stands for block size (in bytes).

v2: Rename vk_format_for_bs to vk_format_for_size (Chad)
    Use "block size" instead of "bs" in error message (Chad)

Reviewed-by: Chad Versace <chad.versace@intel.com>
2015-11-09 15:41:41 -08:00
Brian Paul 28f6faca51 st/wgl: add null pointer check for HUD texture
Fixes crash when using HUD with Nobel Clinician Viewer.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
Brian Paul 75d1e363ff st/wgl: fix double-present on swapbuffers bug
The stw_st_framebuffer_present_locked() function was getting called
twice per SwapBuffers.  First, when st_context_iface::flush() was
called from DrvSwapBuffers() because the ST_FLUSH_FRONT flag was
given.  Second, by stw_st_swap_framebuffer_locked() which does the
actual SwapBuffers.

Two code changes:
1. Pass ST_FLUSH_END_OF_FRAME, instead of ST_FLUSH_FRONT.
2. Move the implementation of stw_flush_current_locked() into
DrvSwapBuffers() since it's not called anywhere else.

Not much change in perf for benchmarks like Lightsmark, but some simple
Mesa demos are measurably faster.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
Brian Paul 8083943e2e st/wgl: reorder pixel formats to put MSAA formats last
And put 8-bit/channel formats before 5/6/5 formats.

The ChoosePixelFormat() function seems to be finicky about format
selection.  Putting the MSAA formats after the non-MSAA formats
means most apps get a low-numbered format.  Now we generally get
the same pixel format regardless of whether using vgpu9 or 10.

VMware bug 1455030

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2015-11-09 11:25:59 +00:00
José Fonseca e524df5ef3 st/wgl: Don't rely on GDI to bookkeep pixelformat for us.
This allows to use apitrace's retracediff script on Windows to retrace and
compare two builds of a Mesa based opengl32.dll/ICD side-by-side.

See also https://github.com/apitrace/apitrace/commit/e4a4f15f5b92e0abbd24d7d053da25f8278c9f64
2015-11-09 11:08:27 +00:00
Michel Dänzer 24abbaff9a winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3
Fixes GPUVM conflicts with non-4K page size.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738

v2: Replace sanitization of VM base address alignment with comment why
    that's not necessary.
v3: Use unsigned instead of long as the type for the size_align member.
    (Marek)

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2015-11-09 17:24:32 +09:00
Leo Liu 519502d08f st/omx: add headless support
This will allow dec/enc/transcode without X

v2:  use env override even with X,
     use loader_open_device instead of open
v3:  clean up

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00
Leo Liu 25526d77b1 st/va: use vl screen drm support from vl_wys_drm
v2: move the dup to vl_wys_drm for pipe loader

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2015-11-08 18:15:57 -05:00