AlexIndustrial/mesa

Author	SHA1	Message	Date
Rob Herring	574a92b048	Android: fix build break from nir/glsl move to compiler/ Commits `a39a8fbbaa` ("nir: move to compiler/") and `eb63640c1d` ("glsl: move to compiler/") broke Android builds. Fix them. There is also a missing dependency between generated NIR headers and several libraries. This isn't a new issue, but seems to have been exposed by the NIR move. Built with i915, i965, freedreno, r300g, r600g, vc4, and virgl enabled. Cc: "11.2" <mesa-stable@lists.freedesktop.org> Cc: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-29 10:51:44 +00:00
Oded Gabbay	a640ad15e1	gallium/radeon: disable evergreen_do_fast_color_clear for BE This function is currently broken for BE. I assume it's because of util_pack_color(). Until I fix this path, I prefer to disable it so users would be able to see correct colors on their desktop and applications. Together with the two following patches: - gallium/r600: Don't let h/w do endian swap for colorformat - gallium/radeon: remove separate BE path in r600_translate_colorswap it fixes BZ#72877 and BZ#92039 Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	e3dfc0e095	gallium/r600: Don't let h/w do endian swap for colorformat Since the rework on gallium pipe formats, there is no more need to do endian swap of the colorformat in the h/w, because the conversion between mesa format and gallium (pipe) format takes endianess into account (see the big #if in p_format.h). v2: return ENDIAN_NONE only for four 8-bits components (V_0280A0_COLOR_8_8_8_8) Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Oded Gabbay	9559071ed6	gallium/radeon: remove separate BE path in r600_translate_colorswap After further testing, it appears there is no need for separate BE path in r600_translate_colorswap() The only fix remaining is the change of the last if statement, in the 4 channels case. Originally, it contained an invalid swizzle configuration that never got hit, in LE or BE. So the fix is relevant for both systems. This patch adds an additional 120 available visuals for LE and BE, as seen in glxinfo v2: Tested for regressions by running piglit gpu.py with CAICOS (r600g) on x86-64 machine. No regressions found. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-29 12:26:27 +02:00
Samuel Pitoiset	07ed003faf	nv50/ir: emit VOTE instruction Changes from v2: - add missing NOT modifier for GK110/GM107 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 23:58:11 +01:00
Samuel Pitoiset	b3efa0a59e	gk110/ir: add ld lock/st unlock emission Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-28 19:20:20 +01:00
Ilia Mirkin	aa3b85fd18	nv50,nvc0: bump minimum texture buffer offset alignment It appears that it actually needs to be aligned to the datum size, so it was 1 when testing with R8, but it can be as high as 16 with RGBA32. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>	2016-02-27 16:26:34 -05:00
Samuel Pitoiset	aad48f8691	nvc0: rework nvc0_compute_validate_program() Reduce the amount of duplicated code by re-using nvc0_program_validate(). While we are at it, change the prototype to return void and remove nvc0_compute.h which is now useless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:27 +01:00
Samuel Pitoiset	e1f5c76047	nvc0: make sure to validate compute global buffers on Fermi No reason to not validate those global buffers and this might avoid fails if someone try to use the global memory from compute programs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:23 +01:00
Samuel Pitoiset	dcf7938833	nvc0: move nvc0_validate_global_residents() to nvc0_compute.c While we are at it, rename it to nvc0_compute_validate_globals() and update its prototype. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre Moreau <pierre.morrow@free.fr> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-26 14:00:18 +01:00
Dave Airlie	840aa52f50	virgl: add missing CAP turned off.	2016-02-26 04:03:09 +00:00
Emil Velikov	b08dbc84fe	st/nine: don't forget to bundle the nine_limits.h file Without this mesa 11.2.0-rc1 ended up busted :-( Cc: "11.2" <mesa-stable@lists.freedesktop.org> Repored-by: Ondřej Súkup <mimi.vx@gmail.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-25 19:56:07 +00:00
Oded Gabbay	439b5b008f	gallium/radeon: return correct values for BE in r600_translate_colorswap Because I changed the swizzle check, I also need to adapt the return values for each check. It's basically almost the same as before, we just cross between STD and STD_REV, and cross between ALT and ALT_REV This fixes the rgba test in gl-1.0-readpixsanity (piglit) and also fixes tri-flat (mesa demos). Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-25 09:21:08 +02:00
Oded Gabbay	ff8b41b702	gallium: remove duplicate define from enum pipe_format Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> Reviewed-by: Thomas Helland <thomashelland90@gmail.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com>	2016-02-25 09:21:08 +02:00
Oded Gabbay	4b7e219e61	gallium/radeon: Correctly translate colorswaps for big endian The current code in r600_translate_colorswap uses the swizzle information to determine which colorswap to use. This works for BE & LE when the nr_channels is <4, but when nr_channels==4 (e.g. PIPE_FORMAT_A8R8G8B8_UNORM), this method can not be used for both BE and LE, because the swizzle info is the same for both of them. As a result, r600g doesn't support 24bit color formats, only 16bit, which forces the user to choose 16bit color in X server. This patch fixes this bug by separating the checks for LE and BE and adapting the swizzle conditions in the BE part of the checks. Tested on an Evergreen GPU (Cedar GL FirePro 2270) running inside POWER7 Big-Endian Machine. Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com> CC: "11.2" "11.1" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-02-23 20:55:40 +02:00
Marek Olšák	190a291b03	tgsi/scan: handle holes between VS inputs, assert-fail in other cases "st/mesa: overhaul vertex setup for clearing, glDrawPixels, glBitmap" added a vertex shader declaring IN[0] and IN[2], but not IN[1]. Drivers relying on tgsi_shader_info can't handle holes in declarations, because tgsi_shader_info doesn't track that. This is just a quick workaround meant for stable that will work for vertex shaders. This fixes radeonsi DrawPixels and CopyPixels crashes. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Brian Paul <brianp@vmware.com>	2016-02-23 16:42:16 +01:00
Samuel Pitoiset	2999257e0f	nvc0: rename 3d binding points to NVC0_BIND_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9c6a7bfb40	nvc0: rename 3d dirty flags to NVC0_NEW_3D_XXX Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	2c48369f54	nvc0: prefix compute macros with _CP_ instead of _COMPUTE_ Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	bbff97ae39	nvc0: rename NVXX_COMPUTE to NVXX_CP Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	5330ed959e	nvc0: rename nvc0_context::dirty to nvc0_context::dirty_3d Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	84b9b8f0a3	nvc0/ir: add missing emission of locked load predicate Like unlocked store on shared memory, locked store can fail and the second dest which is a predicate must be emitted. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: mesa-stable@lists.freedesktop.org	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	9f0d059d4b	nvc0/ir: add ld lock/st unlock emission on GK104 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Samuel Pitoiset	6526225f88	nv50/ir: restore OP_SELP to be a regular instruction Actually OP_SELP doesn't need to be a compare instruction. Instead we just need to set the NOT modifier when building the instruction. While we are at it, fix the dst register type and use a GPR. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2016-02-22 21:28:51 +01:00
Brian Paul	9de3b0273d	svga: unbind index buffer when drawing non-indexed primitives Silences a warning reported by the svga3d device. v2: also null-out the index buffer pointer Reviewed-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>	2016-02-22 12:14:48 -07:00
Emil Velikov	4cd5e5b48e	nouveau: update the Makefile.sources list Reflect the nv50->g80 change and the new gm107_texture header. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2016-02-22 11:40:29 +00:00
Marek Olšák	ff360a52e6	radeonsi: implement binary shaders & shader cache in memory (v2) v2: handle _mesa_hash_table_insert failure other cosmetic changes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1132910e50	gallium/radeon: remove unused radeon_shader_binary_free_* functions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	50ac2612d0	radeonsi: make radeon_shader_reloc name string fixed-sized This will simplify implementations of binary shaders. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	1fe73d55e3	radeonsi: move some struct si_shader members to new struct si_shader_info This will be part of shader binaries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	10fa269f4f	radeonsi: use smaller types for some si_shader members in order to decrease the shader size for a shader cache. v2: add & use SI_MAX_VS_OUTPUTS Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	9aaf28da62	radeonsi: enable compiling one variant per shader Shader stats from VERDE: Default scheduler: Totals: SGPRS: 491272 -> 488672 (-0.53 %) VGPRS: 289980 -> 311093 (7.28 %) Code Size: 11091656 -> 11219948 (1.16 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave Max Waves: 78063 -> 77352 (-0.91 %) Wait states: 0 -> 0 (0.00 %) Looking at some of the worst regressions, I get: - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there. - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions. SI scheduler: Totals: SGPRS: 374848 -> 374576 (-0.07 %) VGPRS: 284456 -> 307515 (8.11 %) Code Size: 11433068 -> 11535452 (0.90 %) bytes LDS: 97 -> 97 (0.00 %) blocks Scratch: 509952 -> 522240 (2.41 %) bytes per wave Max Waves: 79456 -> 78217 (-1.56 %) Wait states: 0 -> 0 (0.00 %) VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	754cf171e9	radeonsi: print full shader name before disassembly Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	3c98e0b369	radeonsi: compile non-GS middle parts of shaders immediately if enabled Still disabled. Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process. VS is always compiled as hw VS. TES is always compiled as hw VS. LS and ES stages are always compiled on demand. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e038f8fd49	radeonsi: rework polygon stippling for PS prolog Don't use the pstipple module. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	4636d9be4a	radeonsi: add PS prolog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:58 +01:00
Marek Olšák	e79bb746ab	radeonsi: add PS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	eb10919b83	radeonsi: add TCS epilog Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e1b21696a3	radeonsi: add VS epilog It only exports the primitive ID. Also used by TES when it's compiled as VS. The VS input location of the primitive ID input is v2. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	70de433dea	radeonsi: add VS prolog This is disabled with use_monolithic_shaders = true. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	19a92886a8	radeonsi: first bits for non-monolithic shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	0303886b10	radeonsi: add code for dumping all shader parts together (v2) v2: unify some code into si_get_shader_binary_size Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	17eb99d8b9	radeonsi: add code for combining and uploading shaders from 3 shader parts Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	9d5bf1a3ef	radeonsi: fail compilation if non-GS non-CS shaders have rodata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	09408764c1	radeonsi: separate 2 pieces of code from create_function Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	292759220c	radeonsi: add samplemask parameter to si_export_mrt_color Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	e6aea08b86	radeonsi: add start_instance parameter to get_instance_index_for_fetch Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	dc27456194	radeonsi: separate out shader key bits for prologs & epilogs Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	d995d4830e	radeonsi: compute how many input VGPRs fragment shaders have Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00
Marek Olšák	fe1b6ede01	radeonsi: compute how many input SGPRs and VGPRs shaders have Prologs (shader binaries inserted before the API shader binary) need to know this, so that they won't change the input registers unintentionally. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-02-21 21:08:57 +01:00

1 2 3 4 5 ...

26224 Commits