AlexIndustrial/mesa

Author	SHA1	Message	Date
Marek Olšák	bb860f63f6	mesa: create glBitmap textures while creating display lists This makes glCallList just a textured draw, which is blazingly fast. Reviewed-by: Brian Paul <brianp@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17780>	2022-08-24 18:13:02 +00:00
Marek Olšák	6da2fb81a7	Revert "mesa: implement a display list / glBitmap texture atlas" This reverts commit `b26ddda12f` and commit `06d3b0a006`. Reviewed-by: Brian Paul <brianp@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17780>	2022-08-24 18:13:02 +00:00
Lionel Landwerlin	f242c9af76	intel/fs: bump max SIMD size for A64 atomics with LSC Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Tested-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>. Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Lionel Landwerlin	407f2beb97	intel/fs: port block a64/surface messages to use LSC v2: Fixup block load/store on surfaces/shared-memory (Rohan) v3: drop write specific size_written case (Rohan) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Lionel Landwerlin	37b3601052	intel/fs: switch register allocation spilling to use LSC on Gfx12.5+ v2: drop the hardcoded inst->mlen=1 (Rohan) v3: Move back to LOAD/STORE messages (limited to SIMD16 for LSC) v4: Also use 4 GRFs transpose loads for fills (Curro) v5: Reduce amount of needed register to build per lane offsets (Curro) Drop some now useless SIMD32 code Unify unspill code Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Lionel Landwerlin	3c6fa2703d	intel/fs: fixup SEND validation check on overlapping src0/src1 With the following SEND instruction : send(1) nullUD nullUD g0UD 0x4200c504 a0.1<0>UD This instruction although valid but somewhat nonsensical (SEND message to write at offset contained in NULL register), triggers an error in the validator. The restriction is that we cannot have overlapping sources. The validator not checking the type of register incorrectly thinks that the null register (offset 0) is the same as g0. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Lionel Landwerlin	a81ca32f96	intel/fs: remove unused opcode Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Acked-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Lionel Landwerlin	aa65f83203	intel/fs: switch compute push constant loads to LSC We're now able to load up to 8 GRFs in one send. v2: Switch to use transpose + vector of up to 64 (Thanks Curro!) v3: Increase parallelism by not reusing the same register for push constant offset (Curro) v4: Drop dead ADD() instruction (Curro) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17555>	2022-08-24 17:51:40 +00:00
Mike Blumenkrantz	1e7a131fd1	tu: fix invalid free on alloc failure this is not an allocated pointer cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18230>	2022-08-24 17:29:53 +00:00
Georg Lehmann	b3cc213f56	radv: Fold 16bit image sources. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>	2022-08-24 17:04:03 +00:00
Georg Lehmann	9151048957	aco: Combine 16bit undef and constants instead of using s_pack. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>	2022-08-24 17:04:03 +00:00
Georg Lehmann	46f6e2ddbb	aco: Implement storage image A16. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>	2022-08-24 17:04:03 +00:00
Georg Lehmann	c8ad1aeeb2	nir/fold_16bit_tex_image: Add an option to fold image sources. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18106>	2022-08-24 17:04:03 +00:00
Gert Wollny	13355232e4	nir_lower_atomics_to_ssbo: Initialize deref struct This fixes the use of an uninitialized value: Conditional jump or move depends on uninitialised value(s) bcmp (vg_replace_strmem.c:1203) _mesa_add_sized_state_reference (prog_parameter.c:434) st_nir_assign_uniform_locations(gl_context, gl_program, nir_shader) (st_glsl_to_nir.cpp:209) st_finalize_nir (st_glsl_to_nir.cpp:1041) by 0x58271B9: st_glsl_to_nir_post_opts(st_context, gl_program, gl_shader_program) (st_glsl_to_nir.cpp:571) ... Uninitialised value was created by a heap allocation malloc (vg_replace_malloc.c:381) ralloc_size (ralloc.c:114) ralloc_array_size (ralloc.c:218) deref_offset_var (nir_lower_atomics_to_ssbo.c:47) lower_instr (nir_lower_atomics_to_ssbo.c:111) nir_lower_atomics_to_ssbo (nir_lower_atomics_to_ssbo.c:204) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18227>	2022-08-24 16:02:03 +00:00
Georg Lehmann	8eac45b274	nir: Add nir_ssa_scalar_is_undef. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18183>	2022-08-24 15:22:40 +00:00
Konstantin Seurer	78564b5b84	radv: Advertise subgroup ops for rt stages Closes: #7098 Signed-off-by: Konstantin Seurer <konstantin.seurer@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18169>	2022-08-24 13:05:38 +00:00
Mike Blumenkrantz	c4f78396d4	zink: support PIPE_CAP_FBFETCH_COHERENT that's what VK_EXT_rasterization_order_attachment_access is for Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18133>	2022-08-24 12:19:13 +00:00
Mike Blumenkrantz	9f7195949b	vulkan: Update the XML and headers to 1.3.225 Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18133>	2022-08-24 12:19:13 +00:00
Samuel Pitoiset	15a7361ce9	radv: merge gather_tess_info() with radv_fill_shader_info() Shouldn't introduce any functional changes. The dependencies between stages might be improved with a new helper that will link shader_info. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18184>	2022-08-24 11:17:05 +00:00
Samuel Pitoiset	7b94ca287b	radv: remove unused num_tess_patches assignment for VS This is never used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18184>	2022-08-24 11:17:05 +00:00
Samuel Pitoiset	068891a383	radv: remove unused tcs_vertices_out assignment for VS This is never used. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18184>	2022-08-24 11:17:05 +00:00
Samuel Pitoiset	76f33cbf25	radv: remove redundant assignment of tcs.tcs_vertices_out It's already assigned from radv_nir_shader_info_pass() and it's only used to configure the VGT_TF_PARAM register. Otherwise, we read it from NIR shader info during compilation. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18184>	2022-08-24 11:17:05 +00:00
Lucas Stach	8b8beae8d5	etnaviv: expose ARB_draw_instanced Just set the pipe cap correctly. The InstanceID support is already hooked up in the NIR compiler. All enabled piglit tests pass. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18046>	2022-08-24 09:13:31 +00:00
Vinson Lee	1dffad2f83	zink: Remove duplicate variable zero. Fix defect reported by Coverity Scan. Evaluation order violation (EVALUATION_ORDER) write_write_typo: In zero = zero = nir_imm_zero(b, nir_dest_num_components(intr->dest), nir_dest_bit_size(intr->dest)), zero is written twice with the same value. Fixes: `0f97e317e3` ("zink: rewrite all undefined shader reads as 0001 instead of undef") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18164>	2022-08-24 04:48:10 +00:00
Timothy Arceri	0c8492cd3b	glsl: fix location for array subscript xfb_decl_assign_location() assumes that arrays are going to be packed. But some conditions might prevent packing (e.g: explicit location or smooth interpolation mode). Instead of assuming that packing will happen, this commit adds a check to determine if it'll happen and use the result to compute the proper location. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2214 Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18175>	2022-08-24 02:19:34 +00:00
Timothy Arceri	04e7ed8323	glsl: make packed varying helper needs_lowering() external We will use this helper to correctly calculate xfb offsets in the following patch. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18175>	2022-08-24 02:19:34 +00:00
Qiang Yu	ff7c59672f	radeonsi: fix tcs_out_lds_offsets arg alignment tcs_out_lds_offsets is not sure to be 16 byte aligned, it's calculated like this: num_patches * patch_vertices * lshs_vertex_stride num_patches and patch_vertices are not sure to be any value aligned, lshs_vertex_stride is added one extra dword, so it's only 4 byte aligned. This may cause problem even before we switch to nir tess output lower when write tess factor before read tail of input. But it's more likely to cause problem after we switch to nir tess output lower because the main body won't eliminate the low 4bit offset but epilog will, so they use different offset to read/write tess factor. Fixes: `7598bfd768` ("radeonsi: replace llvm tcs output with nir lower pass") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7083 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18174>	2022-08-24 02:04:15 +00:00
Caio Oliveira	bee2df64d2	intel/compiler: Use fs_reg helpers for GS icp_handle selection Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18221>	2022-08-24 01:42:23 +00:00
Caio Oliveira	b4aff6ab49	intel/compiler: Use fs_reg helpers for TCS icp_handle selection Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18221>	2022-08-24 01:42:22 +00:00
Caio Oliveira	a1b1fdf70d	intel/compiler: Rename 8_PATCH to MULTI_PATCH Make it clearer we are dealing with multiple patches, works better in contrast with SINGLE_PATCH. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>	2022-08-24 00:39:57 +00:00
Caio Oliveira	7cd06249b9	intel/compiler: Remove INTEL_DEBUG=tcs8 For Gen11 and prior, the dispatch mode for TCS was SINGLE_PATCH, and this debug setting could be used to change it to 8_PATCH (falling back to SINGLE_PATCH when shader couldn't be in the multi dispatch mode). However after talking to Ken, seems this debug setting is not really worth keeping around, so removing it. For Gen12+ the only option is 8_PATCH, so it was always using that dispatch mode as before. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18151>	2022-08-24 00:39:57 +00:00
Bas Nieuwenhuizen	bb2a444324	vulkan/wsi: Take max extent into consideration for modifier selection. For AMD we kinda have some modifiers with a max size ... (Which is really a compositor/kms issue, but getting them to try kinda falls into the unsolved "how to allocate/what pitch to use" bucket, so we solve it on the allocating side) Cc: mesa-stable Tested-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Joshua Ashton <joshua@froggi.es> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18139>	2022-08-23 23:36:53 +00:00
Jordan Justen	e9f40e42de	iris: Drop extra file-descriptor dup in iris_drm_screen_create() In `a99e85db9e`, we added a dup into iris_screen_create(). Apparently some android code paths must be hitting iris_screen_create() without calling iris_drm_screen_create(). After `a99e85db9e`, the code paths that do hit iris_drm_screen_create() will now dup the fd twice, but iris_screen_destroy() will only close 1 of these fds. Fixes: `a99e85db9e` ("iris:Duplicate DRM fd internally instead of reuse.") Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18020>	2022-08-23 22:54:23 +00:00
Lionel Landwerlin	3c78e94ff3	intel/fs: fixup scratch load/store handling on Gfx12.5+ We did not handle the operation with data size < 4. It works fine on all other messages (global/shared). The initial commit was just too restrictive. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1e242785c3` ("intel/fs: Implement load/store_scratch on XeHP") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>	2022-08-23 22:19:16 +00:00
Lionel Landwerlin	46a13404c0	intel/fs: fix load_scratch intrinsic The selection of the internal opcode to deal with load_scratch is incorrect. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c643979228` ("intel/fs: Choose memory message type based on bit size") Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16964>	2022-08-23 22:19:16 +00:00
Caio Oliveira	0a2cfa14dd	intel/compiler: Make component() work for FIXED_GRF/ARF Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18157>	2022-08-23 19:52:38 +00:00
Francisco Jerez	6f33b22495	intel/fs: Fix horiz_offset() to handle FIXED_GRFs with non-trivial 2D regions. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18157>	2022-08-23 19:52:38 +00:00
Kai Wasserbäch	559c027ade	chore(deps): clover: raise the minimum LLVM version to 11.0.0 LLVM 11 was released in October 2020. If you want to build against Mesa's Git version, that seems like enough time to upgrade to at least LLVM 11 (Debian stable has this too). It reduces the amount of #if gates we need and more will be incoming again, given the Opaque Pointer transition. Additionally radeonsi is already requiring LLVM 11. Therefore the minimum will have been LLVM 11 for many builds anyway. Note that clc is kept to LLVM 10 for the time being. Signed-off-by: Kai Wasserbäch <kai@dev.carbon-project.org> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16047>	2022-08-23 19:23:05 +00:00
John Brooks	98ba1e0d81	radv: Fix mipmap views on GFX10+ As explained in the previous commit, GFX9+ has issues with addressing mipmaps in block-compressed images. In the case of copy commands, we fix this by doing an extra copy for the missing blocks. For GFX10, the mipmap layout in memory allows us to do better than that. We can change the base level of the descriptor to one level bigger than the requested level and adjust the extent and address to match. This is done by ComputeNonBlockCompressedView in addrlib. Thus on GFX10 we can skip the fixup copy workaround, and this will also fix cases outside of explicit copy commands. Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:18 +00:00
John Brooks	35f053ba8c	radv: Fix corrupted mipmap copies on GFX9+ GFX9+ hardware has an issue where mipmap degradations are calculated incorrectly due to using divide-by-two integer math and certain mipmap sizes lose blocks. This issue has been documented before, and we ported a workaround from AMDVLK to increase the extent that is programmed into the descriptor, so that the hardware arrives at the correct result. However, this is insufficient as we cannot safely increase the extent beyond the physical extent of the image in memory. If we can't increase it enough, the image will still be missing blocks. But there is still hope. In cases where RADV is responsible for copying to or from an image (such as vkCmdCopyBufferToImage/vkCmdCopyImageToBuffer), we can perform a second copy of the blocks that the hardware excluded so that the resulting image is complete. This is another workaround from AMDVLK. This fixes corrupted textures in Halo: The Master Chief Collection. v2: Add RADV_CMD_FLAG_INV_L2 \| RADV_CMD_FLAG_INV_VCACHE to flush_bits just in case (Samuel Pitoiset) Closes: #3347 Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:17 +00:00
John Brooks	ea84143d1e	radv: Only apply mipmap view adjustments to block compressed images This workaround need not apply to subsampled formats. Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:17 +00:00
John Brooks	88401e031b	vulkan: Introduce vk_format_is_block_compressed function Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:17 +00:00
John Brooks	ef6a8a9a6f	radv: Add get_addrlib function to radv_radeon_winsys Signed-off-by: John Brooks <john@fastquake.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17970>	2022-08-23 19:01:17 +00:00
Eric Engestrom	c535434fd9	anv: convert assert into unreachable to avoid fallthrough error Signed-off-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18192>	2022-08-23 18:37:41 +00:00
Karol Herbst	f56609a679	nvc0: limit max global and alloc size Signed-off-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
Pierre Moreau	16b07b342d	nv50/nir: A group barrier is CTA-level not global-level Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
Pierre Moreau	9236af8b6c	nv50/ir: Avoid generating splits of splits Among others, it would result in the spill offsets being wrong due to being relative to the parent split and not absolute. For example when computing a 64-bit multiply on Tesla (which only supports 16-bit mul in hardware), the sources will first be split into 32-bit values and then a second time down to 16-bit ones. Looking at the first source, the spill offsets ended being computed as follows: { .hihi = +2, .hilo = +0, .lohi = +2, .lolo = +0 } instead of the expected { .hihi = +6, .hilo = +4, .lohi = +2, .lolo = +0 } This is resolved with this patch. Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
Pierre Moreau	b327f46e45	nv50/ra: Fix the offset computation for compounds compMask is expressed in terms of colours, not bytes, where on Tesla we have 1 colour per 16-bit (whereas it is 1 per 32-bit for later architectures). By multiplying by units we will get back to a result in bytes. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
Pierre Moreau	4d892829f3	nv50/peephole: Disallow combining sub 4-byte ld/st for now Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
Pierre Moreau	81828284b2	nv50/ir: Handle non-32-bit values when cst folding SPLIT Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10711>	2022-08-23 18:29:44 +00:00
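
The spill-offset arithmetic described in commit 9236af8b6c ("nv50/ir: Avoid generating splits of splits") can be reproduced with a small sketch. It is not the nv50 code: the function name `leaf_offsets` and its flag are hypothetical, and the only facts taken from the commit message are the two-stage split (64-bit to 32-bit to 16-bit) and the two sets of offsets it quotes.

```python
def leaf_offsets(absolute: bool) -> dict:
    """Byte offsets of the four 16-bit parts of a 64-bit value.

    absolute=False mimics offsets relative to the parent 32-bit split
    (the faulty result quoted in the commit message); absolute=True
    adds the parent's own offset (the expected result).
    Hypothetical helper, for illustration only.
    """
    offsets = {}
    # First split: 64-bit value into two 32-bit halves at bytes 0 and 4.
    for half_name, half_off in (("lo", 0), ("hi", 4)):
        # Second split: each 32-bit half into two 16-bit parts at bytes 0 and 2.
        for part_name, part_off in (("lo", 0), ("hi", 2)):
            base = half_off if absolute else 0
            offsets[half_name + part_name] = base + part_off
    return offsets


if __name__ == "__main__":
    print("relative to parent:", leaf_offsets(absolute=False))
    print("absolute:          ", leaf_offsets(absolute=True))
```

With offsets relative to the parent split, `hihi` and `lohi` both land on +2, so two distinct 16-bit parts would share one spill slot; the absolute form gives each part its own slot.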

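Commit 3c6fa2703d ("intel/fs: fixup SEND validation check on overlapping src0/src1") describes a validator that treated the null register (offset 0) as the same register as g0 because it did not look at the register type. The sketch below illustrates that kind of check under stated assumptions: the `Reg` class, the file labels `"ARF"` and `"GRF"`, and both function names are hypothetical and are not taken from the Mesa sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    file: str       # hypothetical label for the register file, e.g. "GRF" or "ARF"
    nr: int         # first register number within that file
    count: int = 1  # number of consecutive registers covered


def overlaps_ignoring_file(a: Reg, b: Reg) -> bool:
    # Compares register number ranges only, so the null register at
    # number 0 appears to collide with g0.
    return a.nr < b.nr + b.count and b.nr < a.nr + a.count


def overlaps(a: Reg, b: Reg) -> bool:
    # Registers in different files can never overlap, whatever their numbers.
    return a.file == b.file and overlaps_ignoring_file(a, b)


if __name__ == "__main__":
    null_reg = Reg("ARF", 0)
    g0 = Reg("GRF", 0)
    print(overlaps_ignoring_file(null_reg, g0))  # range-only comparison
    print(overlaps(null_reg, g0))                # file-aware comparison
```

Checking the register file before comparing ranges keeps the restriction on overlapping sources intact for two general registers while letting a null source through.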

146666 Commits