AlexIndustrial/mesa

Author	SHA1	Message	Date
Ilia Mirkin	19963231a3	nv50/ir: optimize shl + and Address loading can often end up as shl + shr + shl combinations. The latter two are equal shifts, which get converted into an and mask. However if the previous shl is more than the mask is trying to remove (in terms of low bits), we can just remove the and entirely. This reduces some large shaders by as many as 3% of instructions (out of 2K). total instructions in shared programs : 6495509 -> 6491076 (-0.07%) total gprs used in shared programs : 954621 -> 954623 (0.00%) local gpr inst bytes helped 0 0 1014 1014 hurt 0 2 0 0 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 21:13:09 -05:00
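The reasoning in the commit above can be checked with a small sketch. This is not the nv50/ir code; the helper names below are hypothetical. It assumes 32-bit values and models the "shr k, shl k" pair as an AND with a mask that clears the low k bits. When the earlier shl has already zeroed those bits, the AND changes nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: "shr k" followed by "shl k" is equivalent to an
 * AND with this mask, which clears the low k bits. */
static uint32_t equal_shift_mask(unsigned k)
{
    assert(k < 32); /* shifting by >= 32 is undefined for uint32_t */
    return UINT32_MAX << k;
}

/* Hypothetical helper: returns true when `(x << shift) & mask` equals
 * `x << shift` for every 32-bit x, i.e. when the mask only clears bits
 * that the left shift has already zeroed. */
static bool and_is_redundant_after_shl(unsigned shift, uint32_t mask)
{
    if (shift >= 32)
        return false; /* out of range for this sketch */

    /* Bits that can still be set after the left shift. */
    uint32_t live_bits = UINT32_MAX << shift;

    /* The AND is a no-op if it keeps every bit that can still be set. */
    return (mask & live_bits) == live_bits;
}
```

For example, a shl by 4 followed by a mask that clears the low 2 bits makes the AND removable, while a shl by 1 followed by the same mask does not.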
Ilia Mirkin	5ba380c226	nvc0: enable FBFETCH with a special slot for color buffer 0 We don't need to support all the color buffers for advanced blend, just cb0. For Fermi, we use the special binding slots so that we don't overlap with user textures, while Kepler+ gets a dedicated position for the fb handle in the driver constbuf. This logic is only triggered when a FBFETCH is actually present so it should be a no-op most of the time. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	a1c8484271	gallium: add flags parameter to texture barrier This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	ee3ebe68f9	gallium: add PIPE_CAP_TGSI_FS_FBFETCH Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:09 -05:00
Ilia Mirkin	1393999541	gallium: add FBFETCH opcode to retrieve the current sample value Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 21:13:08 -05:00
Ilia Mirkin	0baa639f76	nvc0: true up exposing of the HW_METRIC_QUERY_GROUP for maxwell This had been updated in one place but not the other. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-16 16:04:55 -05:00
Ilia Mirkin	5eeebca12f	nv50/ir: handle new DDIV op which will be used for double divisions The existing lowering is in place to lower that to RCP + MUL, or fancier things down the line if necessary. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-16 14:45:46 -05:00
Nicolai Hähnle	6be4a40430	tgsi: add DDIV instruction Double-precision division, to allow more precision than a DRCP + DMUL sequence. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-16 20:17:22 +01:00
Nicolai Hähnle	5e94e5bb9b	radeonsi: fix R600_DEBUG=nooptvariant Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net>	2017-01-16 20:16:18 +01:00
Marek Olšák	d523415609	radeonsi: implement GL_FIXED vertex format Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Marek Olšák	018fb2ecb3	radeonsi: implement 32-bit SNORM/UNORM/SSCALED/USCALED vertex formats Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Marek Olšák	44e9b67229	radeonsi: make fix_fetch 64-bit v2: add u_bit_consecutive64 Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 18:07:08 +01:00
Thomas Hindoe Paaboel Andersen	8daf6de3de	gallium/hud: avoid buffer overrun Renaming data sources was added in `e8bb97ce30` It was possible to use a new name longer than the name array in hud_graph of 128. This patch truncates the name to fit the array. CC: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-01-16 18:07:08 +01:00
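A minimal sketch of the kind of truncation the commit above describes. The struct and function names are hypothetical stand-ins; only the 128-byte name array size comes from the commit message, and the actual patch may truncate differently.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the HUD graph; only the array size of 128
 * is taken from the commit message. */
struct graph_sketch {
    char name[128];
};

/* Copy new_name into the fixed-size array, truncating when it does not
 * fit. snprintf writes at most sizeof(gr->name) bytes and always
 * NUL-terminates when the size is non-zero. */
static void graph_set_name(struct graph_sketch *gr, const char *new_name)
{
    assert(gr != NULL && new_name != NULL);
    snprintf(gr->name, sizeof(gr->name), "%s", new_name);
}
```

With this approach a 299-character name is stored as its first 127 characters plus the terminator, and shorter names are stored unchanged.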
Marek Olšák	0d9a4efce9	gallium/radeon: add GPU-shaders-busy HUD query It should be close to the GPU load, but it can be much lower if something is stalling shader execution (e.g. CP DMA). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	aa0de724c7	gallium/radeon: make the GPU load / GRBM_STATUS monitoring extensible The next patch will add SPI_BUSY monitoring. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	935d58ac73	radeonsi: show average results per frame for perf counters in HUD so that the graphs are independent from FPS. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	1fe7c8d3c9	gallium/hud: disable queries during HUD draw calls Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Marek Olšák	5b2eddc40f	gallium/hud: increase the vertex buffer size for background quads Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-16 15:35:30 +01:00
Nayan Deshmukh	4b0e9babc6	st/va: delay calling begin_frame until we have all parameters If begin_frame is called before setting intra_matrix and non_intra_matrix it leads to segmentation faults when vl_mpeg12_decoder.c is used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92634 Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-16 15:09:01 +01:00
Ilia Mirkin	dd39e48726	nvc0/ir: emit FMZ flag when requested on FFMA Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-15 13:13:58 -05:00
sguttula	9b14a828db	st/va: flush pipeline after post processing This flushes the pipeline, which allows sharing dma-buf based buffers. Signed-off-by: Suresh Guttula <Suresh.Guttula@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>	2017-01-13 14:21:29 +01:00
Samuel Pitoiset	e1ea70d9f3	radeonsi: replace si_shader_context::soa by bld_base We no longer need to use lp_build_tgsi_soa_context. No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:08 +01:00
Samuel Pitoiset	ecf04b84e5	radeonsi: replace ctx->soa.outputs by ctx->outputs The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:06 +01:00
Samuel Pitoiset	f04088a7ba	radeonsi: move si_shader_context::soa::addr to si_shader_context The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:41:02 +01:00
Samuel Pitoiset	6f0d955b6d	radeonsi: allocate the array of immediates dynamically Currently, we can store up to 256 immediates in a static array, but this is not always enough. Instead, allocate a dynamic array like what we currently do for temps. This fixes a segfault with dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 10:40:57 +01:00
Ilia Mirkin	f897036978	nvc0/ir: only try to check for zero LOD if we aren't already forcing it There's a levelZero flag which forces texturing to pick level zero (and not consume an explicit LOD argument). This is set for MS targets, but could also be set for any other incoming instruction. As that is what determines whether a LOD argument is present, check that rather than the more indirect isMS logic. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-01-12 21:08:42 -05:00
Ilia Mirkin	eb60a89bc3	nouveau: take extra push space into account for pushbuf_space calls Ever since a long time ago when I messed around with fences, I ensure that after a PUSH_SPACE call there is enough space to write a fence out into the pushbuf. However the PUSH_SPACE macro is not all-knowing, and so sometimes we have to invoke nouveau_pushbuf_space manually with the relocs/pushes args set. If we don't take the extra allocation from PUSH_SPACE into account, then we will end up accidentally flushing when the code was not expecting a flush. This can lead to various runtime and rendering failures. The amount of extra allocation isn't that important - it has to be at least 8 based on the current nouveau_winsys.h setting, but even more won't hurt. I just rounded up to powers of 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354 Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Ben Skeggs <bskeggs@redhat.com>	2017-01-12 20:39:19 -05:00
Nicolai Hähnle	fccf29373d	radeonsi: remove unused si_prepare_cube_coords Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:13 +01:00
Nicolai Hähnle	a0ce09b4b2	amd/common: unify cube map coordinate handling between radeonsi and radv Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations). v2: add 0.5 offset to tex coords only after derivative calculation v3: - really only touch the first three coordinates - rebase on the removal of the 1.5 --> 0.5 offset change Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:10 +01:00
Nicolai Hähnle	0ee1ee5fbb	radeonsi: only touch first three coordinates in si_prepare_cube_coords Sourcing coords_arg[4] is actually never correct, since bias is handled differently in tex_fetch_args anyway. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:07 +01:00
Nicolai Hähnle	9f590ee9d9	radeonsi: remove unused si_llvm_cube_to_2d_coords Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:39:03 +01:00
Nicolai Hähnle	205ad5234a	radeonsi: restrict cube map derivative computations to the correct plane As remarked by the comment in the original code, the old algorithm fails when (tc + deriv) points at a different cube face. Instead, simply project the derivative directly to the plane of the selected cube face. The new code is based on exactly differentiating (using the chain rule) the projection onto a plane corresponding to a fixed cube map face (which is still selected in the usual way based on the texture coordinate itself). The computations end up fairly involved, but we do save two reciprocal computations. Fixes GL45-CTS.texture_cube_map_array.sampling. v2: add 0.5 offset to tex coords only after derivative calculation v3: go back to 1.5 offset Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v2) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:38:59 +01:00
Nicolai Hähnle	e01deee42f	radeonsi: communicate cube map coordinates more explicitly v2: fix compile error that snuck in during rebase Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-01-13 00:38:34 +01:00
Grazvydas Ignotas	c728051131	ac/debug: move .gitignore for sid_tables.h too `b838f642` "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-13 00:37:52 +01:00
Chuck Atkins	e9a4ec4bd8	glx: Add missing glproto dependency for gallium-xlib glx Cc: mesa-stable@lists.freedesktop.org Cc: Bruce Cherniak <bruce.cherniak@intel.com> Signed-off-by: Chuck Atkins <chuck.atkins@kitware.com> Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 22:01:55 +00:00
Emil Velikov	c90f921273	ac, radeonsi: automake: add missing builddir include The generated file is correctly stored in the builddir as of an earlier commit. Yet that commit forgot to add the respective include flag, so the compiler would error out, failing to find sid_tables.h Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99389 Fixes: `d1dc22eb46` "ac: automake: rework sid_tables.h generation" Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 22:01:55 +00:00
Axel Davy	970556292b	st/nine: Protect dtors with mutex When the flag D3DCREATE_MULTITHREAD is set, a global mutex is used to protect nine calls. However for performance reasons, AddRef and Release didn't hold the mutex, and instead used atomics. Unfortunately at item release, the item can be destroyed, and that destruction path should be protected by a mutex (at least for some objects). Without this patch, it is possible that an app thread is in a dtor while another thread is making gallium nine calls. It is possible that two threads are using the same gallium pipe, which is forbidden. The problem has been made worse with csmt, because it can cause a hang, since nine_csmt_process is not threadsafe. Fixes Hitman hang, and possibly others. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	5f4359ea0e	st/nine: Flush the queue at device dtor Flush the queue to get refcounts right, and properly release the items, instead of throwing away all pending commands. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	4e922c81f6	st/nine: Process pending commands on Reset Some nine_state_* and nine_context_* functions used for Reset() require all pending commands are flushed. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	6b87a2a77a	st/nine: Flush pending commands if needed for surface9 changes nine_context uses NineSurface9 fields, thus we need to flush pending commands using the surface before changing the fields. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	f895ab8e22	st/nine: Rework CreatePipeSurface Create both surfaces in one call. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Axel Davy	d43bc05e8b	st/nine: Remove duplicated checks There is no need to check on csmt_active before calling nine_csmt_process, because the function checks already. Signed-off-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
Masanori Kakura	9b5f5de9e9	st/nine: Don't call u_box_union_* when dirty region is empty When dirty region is empty, u_box_union_* incorrectly expands the new region. This fixes broken font rendering issue in WOLF RPG Editor v2.10 games. Signed-off-by: Masanori Kakura <kakurasan@gmail.com> Reviewed-by: Axel Davy <axel.davy@ens.fr>	2017-01-12 20:33:11 +01:00
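To illustrate why a union with an empty dirty region expands the result, here is a simplified one-dimensional sketch. It is not the u_box_union_* code from gallium; the `box1d` type and both functions are hypothetical and only model the behaviour described in the commit message.

```c
/* Hypothetical simplified 1D box: covers [x, x + width). */
struct box1d {
    int x;
    int width;
};

/* Plain bounding union: smallest box that covers both inputs. It has no
 * notion of "empty", so an empty box at x = 0 still drags the result
 * down to 0. */
static struct box1d box1d_union(struct box1d a, struct box1d b)
{
    int lo = a.x < b.x ? a.x : b.x;
    int a_hi = a.x + a.width;
    int b_hi = b.x + b.width;
    int hi = a_hi > b_hi ? a_hi : b_hi;
    struct box1d r = { lo, hi - lo };
    return r;
}

/* Guarded version in the spirit of the fix: when nothing is dirty yet,
 * the new region simply becomes the dirty region. */
static struct box1d add_dirty(struct box1d dirty, struct box1d added)
{
    if (dirty.width == 0)
        return added;
    return box1d_union(dirty, added);
}
```

With an empty dirty region at x = 0 and an added region covering [10, 15), the plain union yields [0, 15), while the guarded version yields [10, 15).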
Emil Velikov	a5f0cdb36f	winsys/etnaviv: automake: introduce Makefile.sources ... and list the public header within it. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:30:15 +00:00
Emil Velikov	0467700536	etnaviv: automake: include all files in the sources lists Note: the currently mentioned etnaviv_utils.h is a typo. Signed-off-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:30:09 +00:00
Christian Gmeiner	e8626e3b31	imx: gallium driver for imx-drm scanout driver Changes from V1 -> V2: - updated Copyright - added $(top_srcdir)/src/gallium/winsys to include path (suggested by Emil) - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:27:11 +00:00
The etnaviv authors	c9e8b49b88	etnaviv: gallium driver for Vivante GPUs This driver supports a wide range of Vivante IP cores like GC880, GC1000, GC2000 and GC3000. Changes from V1 -> V2: - added missing files to actually integrate the driver into build system. - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Russell King <rmk@arm.linux.org.uk> Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com>	2017-01-12 19:27:11 +00:00
Christian Gmeiner	848b49b288	gallium: add renderonly library This is a very lightweight library to add basic support for renderonly GPUs. A kms gallium driver must specify how a renderonly_scanout object gets created. It must also provide file handles to the used kms device and the used gpu device. This could look like: struct renderonly ro = { .create_for_resource = renderonly_create_gpu_import_for_resource, .kms_fd = fd, .gpu_fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC) }; The renderonly_scanout object exists for two reasons: - Do any special treatment for a scanout resource like importing the GPU resource into the scanout hw. - Make it easier for a gallium driver to detect if anything special needs to be done in flush_resource(..) like a resolve to linear. A GPU gallium driver which gets used as renderonly GPU needs to be aware of the renderonly library. This library will likely break android support and hopefully will get replaced with a better solution based on gbm2. Changes from V1 -> V2: - reworked the lifecycle of renderonly object (suggested by Nicolai Hähnle) - killed the midlayer (suggested by Thierry Reding) - made the API more explicit regarding gpu and kms fd's - added some docs Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Acked-by: Emil Velikov <emil.velikov@collabora.com> Tested-by: Alexandre Courbot <acourbot@nvidia.com>	2017-01-12 19:27:11 +00:00
George Kyriazis	a61528fa33	Always defer memory free in swr_resource_destroy Defer delete on regular resources. This ensures that any work being done on the resource is completed before freeing up the resource's memory. Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>	2017-01-12 09:10:15 -06:00
Samuel Pitoiset	f0997e2aa8	nvc0: enable GL 4.3 on gm107+ Although, arb_shader_image_load_store-atomicity will most likely hang your box, I think it's now quite reasonable to enable GL 4.3 on Maxwell/Pascal GPUs. I suspect that test to be wrong because it doesn't even work on the NVIDIA blob. I have tested a bunch of benchmarks (UE4 demos) and real games like Shadow of Mordor and they all work fine. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-01-12 15:22:21 +01:00

29858 Commits