AlexIndustrial/mesa

Author	SHA1	Message	Date
Dave Airlie	b3e793f2db	Revert "r600g: Pass conservative depth parameters to hw" This reverts commit `a1fc78911e`. I pushed the wrong patch.	2015-11-11 09:05:50 +10:00
Glenn Kennard	c878d61124	r600g: Implement ARB_texture_view Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-11 08:36:08 +10:00
Glenn Kennard	a1fc78911e	r600g: Pass conservative depth parameters to hw Supported on R700 and up. Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-11 08:32:35 +10:00
Eduardo Lima Mitev	de51676b41	i965/nir/opt_peephole_ffma: Bypass fusion if any operand of fadd and fmul is a const When both fadd and fmul instructions have at least one operand that is a constant and it is only used once, the total number of instructions can be reduced from 3 (1 ffma + 2 load_const) to 2 (1 fmul + 1 fadd); because the constants will be propagated as immediate operands of fmul and fadd. This patch detects these situations and prevents fusing fmul+fadd into ffma. Shader-db results on i965 Haswell: total instructions in shared programs: 6235835 -> 6225895 (-0.16%) instructions in affected programs: 1124094 -> 1114154 (-0.88%) total loops in shared programs: 1979 -> 1979 (0.00%) helped: 7612 HURT: 843 GAINED: 4 LOST: 0 Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-10 21:13:35 +01:00
Eduardo Lima Mitev	fb3b5669ce	util: Add list_is_singular() helper function Returns whether the list has exactly one element. Reviewed-by: Matt Turner <mattst88@gmail.com>	2015-11-10 21:13:35 +01:00
Eduardo Lima Mitev	94ff35204d	nir/nir_opt_peephole_ffma: Move this lowering pass to the i965 driver Because the next patch will add an optimization that is specific to i965, we want to move this lowering pass to that driver altogether. This is safe because i965 is the only consumer. Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>	2015-11-10 21:13:35 +01:00
Kristian Høgsberg Kristensen	96b22fb080	glsl: Use array deref for access to vector components We've assumed that we could lower per-component vector access from vec[i] = scalar to vec = ir_triop_vector_insert(vec, scalar, i) but with SSBOs (and compute shader SLM and tessellation outputs) this is no longer valid. If a vector is "externally visible", multiple threads can write independent components simultaneously. With lowering to ir_triop_vector_insert, each thread reads the entire vector, changes one component, then writes out the entire vector. This is racy. Instead of generating an ir_binop_vector_extract when we see v[i], we generate ir_dereference_array. We then add a separate lowering pass that lowers the ir_dereference_array to ir_binop_vector_extract for rvalues and to vector_insert for lvalues. The resulting IR is the same as before, but we now have a window between ast->ir conversion and the lowering pass where v[i] appears in the IR as an array deref. This lets us run lowering passes that lower the vector access to I/O (eg for SSBO load/store) before we lower the per-component access to full vector writes. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen	60dd5287ff	glsl: Lower UBO and SSBO access in glsl linker All GLSL IR consumers run this lowering pass so we can move it to the linker. This moves the pass up quite a bit, but that's the point: it needs to run before we throw away information about per-component vector access. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
Kristian Høgsberg Kristensen	f0e95c2500	glsl: Drop exec_list argument to lower_ubo_reference We always pass in shader->ir and we already pass in the shader, so just drop the exec_list. Most passes either take just an exec_list or a shader, so this seems more consistent. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>	2015-11-10 12:02:46 -08:00
Connor Abbott	213f86416f	nir/glsl: switch to using the builder v2: use nir_bulder_cf_insert (Ken) Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:56:43 -05:00
Connor Abbott	fbbfb7c025	nir/glsl: make emit() take nir_ssa_def * sources Again, this matches what the builder will have to do. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:56:35 -05:00
Connor Abbott	a60e990dd2	nir/glsl: convert nir_visitor::result to a nir_ssa_def * Its only user now returns a nir_ssa_def *, and we'll need this since the builder returns a nir_ssa_def *. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:55:54 -05:00
Connor Abbott	30fe8eaa8e	nir/glsl: make evaluate_rvalue() return a nir_ssa_def * A long time ago, before NIR was even merged to master, glsl_to_nir used registers and these sources were actually register sources. But nowadays everything in glsl_to_nir is an SSA value, so stop pretending that by evaluating an rvalue we can get an arbitrary nir_src. Most importantly, we need this since the builder takes nir_ssa_def * sources directly. Signed-off-by: Connor Abbott <cwabbott0@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2015-11-10 13:55:14 -05:00
Jose Fonseca	6f42162329	st/mesa: Destroy buffer object's mutex. Ideally we should have a _mesa_cleanup_buffer_object function in src/mesa/bufferobj.c so that the destruction logic resided in a single place. Reviewed-by: Brian Paul <brianp@vmware.com>	2015-11-10 11:04:28 +00:00
Kenneth Graunke	db54673b54	nir: Store PatchInputsRead and PatchOutputsWritten in nir_shader_info. These tessellation shader related fields need plumbing through NIR. v2: Use uint32_t instead of uint64_t to match the source type of GLbitfield (caught by Iago Toral). Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2015-11-10 01:03:43 -08:00
Eric Anholt	437d7b6119	vc4: Avoid loading undefined (newly-allocated) FBO contents. Since X has undefined contents in new pixmaps, it will allocate new textures for an FBO and draw to them without an explicit clear. For VC4, it's much faster to emit a clear than the load of the actual undefined memory contents, so just do that instead.	2015-11-09 19:17:36 -08:00
Eric Anholt	5980389bbf	vc4: Return NULL when we can't make our shadow for a sampler view. I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-09 19:17:36 -08:00
Eric Anholt	eb8fb0064d	vc4: Return GL_OUT_OF_MEMORY when buffer allocation fails. I was afraid our callers weren't prepared for this, but it looks like at least for resource creation, mesa/st throws an error appropriately. Cc: "11.0" <mesa-stable@lists.freedesktop.org>	2015-11-09 19:17:36 -08:00
Eric Anholt	84608e07e7	vc4: Add CL dumping for GL_ARRAY_PRIMITIVE.	2015-11-09 19:17:36 -08:00
Eric Anholt	855a3ca598	vc4: Fix a compiler warning.	2015-11-09 19:17:36 -08:00
Jordan Justen	fb3da129d1	glsl: Use shared storage variable type for shared variables Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:24 -08:00
Jordan Justen	32746fc9b4	glsl: Add shared variable type Shared variables are stored in a common pool accessible by all threads in a compute shader local work group. These variables are similar to OpenCL's local/__local variables. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:24 -08:00
Jordan Justen	c0ac4740a7	glsl: Add space to shader_storage in print_visitor Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:17 -08:00
Jordan Justen	007d96730e	glsl: Align comments on variables types v2: * Split from patch to add ir_var_shader_shared (tarceri) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:17 -08:00
Jordan Justen	8b28b35531	glsl: Parse shared keyword for compute shader variables v2: * Move shared parsing under storage qualifiers (tarceri) * Fail to compile if shared is used in non-compute shader (tarceri) * Use separate shared_storage bit for shared variables (tarceri) Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2015-11-09 17:21:12 -08:00
Timothy Arceri	a4a46fe3fa	glsl: simplify interface block stream qualifier validation Qualifiers on member variables are redundant; all we need to do is check whether the qualifier matches the stream associated with the block and throw an error if it does not. Reviewed-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com> Cc: Emil Velikov <emil.l.velikov@gmail.com>	2015-11-10 12:02:30 +11:00
Brian Paul	28f6faca51	st/wgl: add null pointer check for HUD texture Fixes crash when using HUD with Nobel Clinician Viewer. Reviewed-by: Jose Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
Brian Paul	75d1e363ff	st/wgl: fix double-present on swapbuffers bug The stw_st_framebuffer_present_locked() function was getting called twice per SwapBuffers. First, when st_context_iface::flush() was called from DrvSwapBuffers() because the ST_FLUSH_FRONT flag was given. Second, by stw_st_swap_framebuffer_locked() which does the actual SwapBuffers. Two code changes: 1. Pass ST_FLUSH_END_OF_FRAME, instead of ST_FLUSH_FRONT. 2. Move the implementation of stw_flush_current_locked() into DrvSwapBuffers() since it's not called anywhere else. Not much change in perf for benchmarks like Lightsmark, but some simple Mesa demos are measurably faster. Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
Brian Paul	8083943e2e	st/wgl: reorder pixel formats to put MSAA formats last And put 8-bit/channel formats before 5/6/5 formats. The ChoosePixelFormat() function seems to be finicky about format selection. Putting the MSAA formats after the non-MSAA formats means most apps get a low-numbered format. Now we generally get the same pixel format regardless of whether using vgpu9 or 10. VMware bug 1455030 Reviewed-by: José Fonseca <jfonseca@vmware.com>	2015-11-09 11:25:59 +00:00
José Fonseca	e524df5ef3	st/wgl: Don't rely on GDI to bookkeep pixelformat for us. This allows using apitrace's retracediff script on Windows to retrace and compare two builds of a Mesa based opengl32.dll/ICD side-by-side. See also https://github.com/apitrace/apitrace/commit/e4a4f15f5b92e0abbd24d7d053da25f8278c9f64	2015-11-09 11:08:27 +00:00
Michel Dänzer	24abbaff9a	winsys/radeon: Use CPU page size instead of hardcoding 4096 bytes v3 Fixes GPUVM conflicts with non-4K page size. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92738 v2: Replace sanitization of VM base address alignment with comment why that's not necessary. v3: Use unsigned instead of long as the type for the size_align member. (Marek) Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Christian König <christian.koenig@amd.com> (v1) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-09 17:24:32 +09:00
Leo Liu	519502d08f	st/omx: add headless support This will allow dec/enc/transcode without X v2: use env override even with X, use loader_open_device instead of open v3: clean up Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	25526d77b1	st/va: use vl screen drm support from vl_wys_drm v2: move the dup to vl_wys_drm for pipe loader Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	7da86e0ec0	vl: add drm support for vl_screen This will allow the state trackers to use render nodes with screen creation v2: dup fd for pipe loader Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Leo Liu	d115e47099	st/va: fix build fails with pipe loader There is no dev in drv, and dev should be from vl_screen here Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2015-11-08 18:15:57 -05:00
Samuel Pitoiset	ffb60e7788	nvc0: enable compute support on Fermi Although the compute support is still not complete because textures and surfaces need to be implemented, it allows launching very simple compute kernels, like one which reads MP performance counters. This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-08 16:47:59 +01:00
Ilia Mirkin	e06238cb9e	nv50/ir: fix emission of s[] args in certain situations There might only be a single arg (e.g. cvt), so use mode rather than looking at the source directly. Also we don't want to rely on the type of the value, which can be unreliable, but instead use the instruction's. This works out well since mkSplit doesn't adjust the type. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Ilia Mirkin	af218217d7	nv50/ir: only take abs value when computing high result Not reachable from TGSI since it only has UMUL, no IMUL. However it's surprising that setting argument types to s32 will cause sign to get lost. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Ilia Mirkin	53cbb11707	nouveau: avoid queueing too much work onto a single fence Force the fence to get kicked off, which won't actually wait for its completion, but any additional work will be put onto a fresh list. This fixes crashes in teximage-colors --benchmark with too many active maps. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2015-11-07 18:58:58 -05:00
Dave Airlie	0f5b1409fd	llvmpipe: disable front updates for now As pointed out by Emil, this sometimes hangs, which appears to be due to threading. We need to rethink how this stuff works for llvmpipe. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-08 07:55:17 +10:00
Dave Airlie	87711183ac	virgl: wrap ret assignment with braces to do correct thing Coverity reported that ret could only be 0 or 1, since it was setting ret = fn() > 0, instead of doing (ret = fn()) > 0. Signed-off-by: Dave Airlie <airlied@redhat.com>	2015-11-08 06:27:02 +10:00
Jason Ekstrand	6c731d8566	nir: Add a nir_deref_tail helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 12:09:44 -08:00
Jason Ekstrand	7d90e570f3	nir/types: Add an is_vector_or_scalar helper Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 12:09:38 -08:00
Jason Ekstrand	d43e16b163	i965/fs: Use regs_read/written for post-RA scheduling in calculate_deps Previously, we were assuming that everything read/wrote exactly 1 logical GRF (1 in SIMD8 and 2 in SIMD16). This isn't actually true. In particular, the PLN instruction reads 2 logical registers in one of the components. This commit changes post-RA scheduling to use regs_read and regs_written instead so that we add enough dependencies. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92770 Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 08:41:48 -08:00
Jason Ekstrand	c839174d55	nir/validate: Add better validation of load/store types Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2015-11-07 08:41:35 -08:00
Marek Olšák	d57ede92b7	radeonsi: add register definitions for Stoney There are a few non-stoney changes too. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	2658777f46	radeonsi: add workarounds for CP DMA to stay on the fast path v2: set emit_scratch_reloc, add a NULL check Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	fc0416ef5d	radeonsi: unify CP DMA preparation logic Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:13 +01:00
Marek Olšák	89da3b4458	radeonsi: unify CP DMA code determining various flags v2: don't call get_flush_flags twice per function Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:12 +01:00
Marek Olšák	c3e527f93d	radeonsi: only enable write confirmation on the last CP DMA packet This should improve performance for big copies that need to be split. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-11-07 10:22:12 +01:00
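
The precedence issue described in commit 87711183ac (virgl) can be illustrated with a minimal sketch. This is not the actual virgl code: `fake_read`, `read_buggy` and `read_fixed` are hypothetical names invented for this example. It only shows why `ret = fn() > 0` stores 0 or 1, while `(ret = fn()) > 0` stores the real return value.

```c
/* Hypothetical stand-in for a call that returns a byte count,
 * or a value <= 0 on failure. Not a real virgl function. */
static int fake_read(int bytes_available)
{
    return bytes_available;
}

/* Buggy form: '>' binds tighter than '=', so ret receives the
 * result of the comparison (0 or 1), not the byte count. */
static int read_buggy(int bytes_available)
{
    int ret;
    if ((ret = fake_read(bytes_available) > 0))
        return ret;
    return -1;
}

/* Fixed form: the inner parentheses make the assignment happen
 * first, so ret holds the byte count before it is compared. */
static int read_fixed(int bytes_available)
{
    int ret;
    if ((ret = fake_read(bytes_available)) > 0)
        return ret;
    return -1;
}
```

With 42 bytes available, `read_buggy` returns the comparison result while `read_fixed` returns the byte count; both return -1 when nothing is available.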


67423 Commits