AlexIndustrial/mesa

Author	SHA1	Message	Date
Lionel Landwerlin	08f3950d6b	anv: stop using old entrypoint/struct/enum names for 1.3 v2: More replacements Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15920>	2022-04-13 21:13:56 +00:00
Lionel Landwerlin	e11bedb9f5	intel/fs: add a note on possible optimization of root node address Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>	2022-04-13 11:24:49 +00:00
Lionel Landwerlin	9c0805ef91	intel/fs: fix metadata preserve on trace_ray intrinsic `c78be5da30` ("intel/fs: lower ray query intrinsics") introduced a helper function using nir_(push\|pop)_if which invalidated dominance & block_index for the replacement of nir_intrinsic_rt_trace_ray. We can still keep dominance/block_index metadata for the lowering of nir_intrinsic_rt_execute_callable though. This change uses 2 different lowering functions with correct metadata preservation. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15910>	2022-04-13 11:24:49 +00:00
Jason Ekstrand	69b5424ea4	intel/nir: Lower 8 and 16-bit bitwise unops Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>	2022-04-12 23:19:38 +00:00
Jason Ekstrand	a482877c70	intel/fs: Implement 16-bit [ui]mul_high Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15829>	2022-04-12 23:19:38 +00:00
Mykhailo Skorokhodov	9c7e750ffe	intel/fs: Enable b2f(inot(a)) and b2i(inot(a)) optimization for Gfx12+ The commit enables the optimization for Intel Gfx12+ graphics. Tigerlake ``` total instructions in shared programs: 1289326 -> 1289015 (-0.02%) instructions in affected programs: 37841 -> 37530 (-0.82%) helped: 78 HURT: 9 helped stats (abs) min: 1 max: 26 x̄: 4.69 x̃: 3 helped stats (rel) min: 0.10% max: 12.50% x̄: 2.07% x̃: 1.21% HURT stats (abs) min: 1 max: 18 x̄: 6.11 x̃: 4 HURT stats (rel) min: 0.16% max: 1.95% x̄: 0.94% x̃: 0.61% 95% mean confidence interval for instructions value: -4.95 -2.20 95% mean confidence interval for instructions %-change: -2.34% -1.18% Instructions are helped. total cycles in shared programs: 105606388 -> 105606442 (<.01%) cycles in affected programs: 620119 -> 620173 (<.01%) helped: 49 HURT: 28 helped stats (abs) min: 2 max: 3618 x̄: 228.63 x̃: 12 helped stats (rel) min: 0.02% max: 23.31% x̄: 4.60% x̃: 1.11% HURT stats (abs) min: 1 max: 2142 x̄: 402.04 x̃: 29 HURT stats (rel) min: 0.01% max: 36.42% x̄: 5.01% x̃: 0.46% 95% mean confidence interval for cycles value: -151.80 153.20 95% mean confidence interval for cycles %-change: -3.00% 0.79% Inconclusive result (value mean confidence interval includes 0). ``` Related-to: https://gitlab.freedesktop.org/mesa/mesa/-/commit/7725d609387a8165ccb71e2d9e0221d9248b1729 Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14017>	2022-04-12 10:55:05 +00:00
Marcin Ślusarz	65600a34c2	anv: initialize 3DMESH_1D.ExtendedParameter0 when ExtendedParameter0Present When IndirectParameterEnable==true it's not actually used by the hardware, but if it's not initialized and INTEL_DEBUG=bat is set, then Valgrind complains. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>	2022-04-12 09:10:31 +00:00
Marcin Ślusarz	f844ce66c8	anv: fix push constant lowering for task/mesh Fixes: `a6031cd9bd` ("anv: fix push constant lowering with bindless shaders") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15850>	2022-04-12 09:10:31 +00:00
Francisco Jerez	e858da39e5	intel/perf: Fix OA report accumulation on Gfx12+. The intel_perf_query path used for performance queries on GL was passing a bogus "end" pointer to intel_perf_query_result_accumulate(), causing it to accumulate garbage values. This was causing the values of many performance counters to be corrupted. The "end" pointer was incorrect because the current code was assuming that different OA reports were located TOTAL_QUERY_DATA_SIZE bytes apart, which is a hard-coded preprocessor define. However recent (Gfx12+) hardware generations use a variable query size determined by the query layout. Use the size derived from it instead, and remove the stale define. Fixes: `3c51325025` ("intel/perf: switch query code to use query layout") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15783>	2022-04-12 00:11:47 +00:00
Kenneth Graunke	b05ac36f01	intel/genxml: Add SAMPLER_MODE bits for enabling Small PL on Icelake This enables a lower power mode in the sampler hardware in certain common scenarios. On Tigerlake, SAMPLER_MODE is not programmable by userspace but the kernel already sets this bit for us. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>	2022-04-11 19:17:07 +00:00
Kenneth Graunke	e3defe7ae7	intel/genxml: Delete SAMPLER_MODE register definition on Gfx12+ While this register still exists, it's no longer a per-context register. Instead, on Gfx12+, SAMPLER_MODE exists per dual-subslice and is accessed as a "multicast" register, where the "steering control register" controls which version is accessed. At any rate, userspace cannot write it any longer, and so there's not much point to it existing in our genxml (which was missing most of the fields anyway). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>	2022-04-11 19:17:07 +00:00
Kenneth Graunke	8092704705	intel/genxml: Add new "Low Quality Filter" field on Gfx12+. This allows the sampler to perform faster filtering of 8-bit UNORM textures by filtering them at a different precision. The filtering is intended to still be OpenGL and DirectX spec compliant. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>	2022-04-11 19:17:07 +00:00
Kenneth Graunke	9a70385e2b	intel/genxml: Add SAMPLER_STATE::Allow Low Quality LOD Calculation field This allows the hardware to perform a faster LOD calculation in many simple cases. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15628>	2022-04-11 19:17:07 +00:00
Vitalii.Lomaka	1407a4db69	intel/batch-decoder: Fix uninitialized scalar variables CID: 1498516 CID: 1498560 Signed-off-by: Vitalii Lomaka <vitalii.lomaka@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15685>	2022-04-08 18:35:34 +00:00
Benjamin Cheng	0666b7fecc	anv: drop from_wsi bit from anv_image It was originally introduced in `ca791f5c` but it was never actually set anywhere. It doesn't serve any purpose other than some sanity checking so let's clean it up for now. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15799>	2022-04-07 18:46:50 +00:00
Ian Romanick	b5fa43952a	intel/fs: Better handle constant sources of FS_OPCODE_PACK_HALF_2x16_SPLIT I noticed that a LOT of fragment shaders in Shadow of the Tomb Raider, for instance, end up with a sequence of NIR like: vec1 32 ssa_2 = load_const (0x00000000 = 0.000000) ... vec1 32 ssa_191 = pack_half_2x16_split ssa_188, ssa_2 vec1 32 ssa_192 = pack_half_2x16_split ssa_189, ssa_2 vec1 32 ssa_193 = pack_half_2x16_split ssa_190, ssa_2 This results in an assembly sequence like: mov(8) g28<1>UD 0x00000000UD mov(8) g21<2>HF g28<8,8,1>F shl(8) g21<1>UD g21<8,8,1>UD 0x00000010UD mov(8) g21<2>HF g25<8,8,1>F mov(8) g19<2>HF g28<8,8,1>F shl(8) g19<1>UD g19<8,8,1>UD 0x00000010UD mov(8) g19<2>HF g23<8,8,1>F mov(8) g20<2>HF g28<8,8,1>F shl(8) g20<1>UD g20<8,8,1>UD 0x00000010UD mov(8) g20<2>HF g24<8,8,1>F After this commit, the generated assembly is: mov(8) g21<1>UD 0x00000000UD mov(8) g21<2>HF g23<8,8,1>F mov(8) g19<1>UD 0x00000000UD mov(8) g19<2>HF g17<8,8,1>F mov(8) g20<1>UD 0x00000000UD mov(8) g20<2>HF g18<8,8,1>F Tiger Lake, Ice Lake, Skylake, and Haswell had similar results. (Ice Lake shown) total instructions in shared programs: 20119086 -> 20119034 (<.01%) instructions in affected programs: 9056 -> 9004 (-0.57%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 16 x̄: 6.50 x̃: 4 helped stats (rel) min: 0.29% max: 1.75% x̄: 1.00% x̃: 0.98% 95% mean confidence interval for instructions value: -11.01 -1.99 95% mean confidence interval for instructions %-change: -1.56% -0.44% Instructions are helped. 
total cycles in shared programs: 861019414 -> 861021044 (<.01%) cycles in affected programs: 279862 -> 281492 (0.58%) helped: 4 HURT: 2 helped stats (abs) min: 6 max: 936 x̄: 239.00 x̃: 7 helped stats (rel) min: 0.03% max: 8.13% x̄: 2.09% x̃: 0.09% HURT stats (abs) min: 18 max: 2568 x̄: 1293.00 x̃: 1293 HURT stats (rel) min: 0.36% max: 1.14% x̄: 0.75% x̃: 0.75% 95% mean confidence interval for cycles value: -972.56 1515.89 95% mean confidence interval for cycles %-change: -4.77% 2.49% Inconclusive result (value mean confidence interval includes 0). Broadwell total instructions in shared programs: 17812327 -> 17812263 (<.01%) instructions in affected programs: 9867 -> 9803 (-0.65%) helped: 8 HURT: 0 helped stats (abs) min: 2 max: 28 x̄: 8.00 x̃: 4 helped stats (rel) min: 0.32% max: 1.80% x̄: 1.00% x̃: 0.95% 95% mean confidence interval for instructions value: -15.46 -0.54 95% mean confidence interval for instructions %-change: -1.54% -0.47% Instructions are helped. total cycles in shared programs: 904768620 -> 904773291 (<.01%) cycles in affected programs: 454799 -> 459470 (1.03%) helped: 4 HURT: 4 helped stats (abs) min: 36 max: 586 x̄: 344.50 x̃: 378 helped stats (rel) min: 0.47% max: 4.04% x̄: 2.01% x̃: 1.77% HURT stats (abs) min: 1 max: 5572 x̄: 1512.25 x̃: 238 HURT stats (rel) min: <.01% max: 2.77% x̄: 1.46% x̃: 1.53% 95% mean confidence interval for cycles value: -1122.40 2290.15 95% mean confidence interval for cycles %-change: -2.26% 1.71% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 18581 -> 18579 (-0.01%) spills in affected programs: 323 -> 321 (-0.62%) helped: 1 HURT: 0 total fills in shared programs: 24985 -> 24981 (-0.02%) fills in affected programs: 1348 -> 1344 (-0.30%) helped: 1 HURT: 0 Tiger Lake, Ice Lake, and Skylake had similar results. 
(Ice Lake shown) Instructions in all programs: 143585431 -> 143513657 (-0.0%) Instructions helped: 14403 Cycles in all programs: 8439312778 -> 8439371578 (+0.0%) Cycles helped: 10570 Cycles hurt: 3290 Gained: 146 Lost: 74 All of the lost and gained fossil-db shaders are SIMD32 fragment shaders. 14,247 of the affected shaders are from Shadow of the Tomb Raider. 154 are from Batman Arkham Origins, and the remaining two are from Octopath Traveler. Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15089>	2022-04-07 18:26:23 +00:00
Ian Romanick	c08302670b	intel/compiler: Fix sample_d messages on DG2 DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces. The maximum number of gradient components and coordinate components should be 2. In spite of this limitation, the Bspec lists a mysterious R component before the min_lod, so the maximum coordinate components is 3. Fixes the following Vulkan CTS failures on DG2: dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1d_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2d_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1d_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2d_fragment The Fixes: tag below is a bit misleading. This commit fixes some test cases similar to ones fixed by the Fixes: commit. I just want to make sure this commit gets applied everywhere that commit was also applied. Fixes: `635ed58e52` ("intel/compiler: Lower txd for 3D samplers on XeHP.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15781>	2022-04-07 17:09:28 +00:00
Jason Ekstrand	13fc698cef	anv/formats: Relax usage checks if EXTENDED_USAGE_BIT is set Reviewed-by: Ivan Briano <ivan.briano@intel.com> Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14153>	2022-04-07 15:56:33 +00:00
Lionel Landwerlin	b5031bd6f7	intel/nir: don't report progress on rayqueries if no queries Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `c78be5da30` ("intel/fs: lower ray query intrinsics") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15769>	2022-04-07 08:24:19 +00:00
Lionel Landwerlin	56ef501e3a	blorp: disable depth bounds Otherwise the driver setting interacts with it. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `939ddccb7a` ("anv: Add support for depth bounds testing.") Fixes: `1df871f8ff` ("iris: Add support for depth bounds testing.") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15763>	2022-04-06 19:00:50 +00:00
Lionel Landwerlin	3069337144	anv: remove unused 3DSTATE_DEPTH_BOUNDS fields Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15763>	2022-04-06 19:00:50 +00:00
Lionel Landwerlin	88f77aa811	anv: disable preemption on 3DPRIMITIVE on gfx12 To workaround a push constant corruption issue. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5963 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5662 Cc: mesa-stable Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15753>	2022-04-06 12:51:15 +00:00
Vadym Shovkoplias	04a6693871	anv: fix EXT_depth_clip_control This fixes arb_clip_control-clip-control and depth_clamp piglit tests on zink. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6186 Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15561>	2022-04-06 13:26:52 +03:00
Jason Ekstrand	29b8097408	anv: Enable VK_EXT_debug_utils It's implemented in common code as long as you use vk_command_buffer. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>	2022-04-06 01:18:23 +00:00
Mike Blumenkrantz	6fd344ff98	anv: expose VK_EXT_image_2d_view_of_3d sampling only available on gen9+ Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15754>	2022-04-05 20:30:31 +00:00
Omar Akkila	4208895175	ci: bump VK-GL-CTS to 1.3.1.1 Signed-off-by: Omar Akkila <omar.akkila@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15668>	2022-04-04 23:04:33 +00:00
Jason Ekstrand	94ce812497	anv: Advertise two more formats These both require swizzling so border colors won't work. However, they're conveniently in the list of formats for which custom border colors require you to specify a format in the sampler. That list consists of: - VK_FORMAT_B4G4R4A4_UNORM_PACK16 - VK_FORMAT_B5G6R5_UNORM_PACK16 - VK_FORMAT_B5G5R5A1_UNORM_PACK16 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6226 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>	2022-04-04 21:42:23 +00:00
Jason Ekstrand	e32b9e5c3f	anv: Generalize border color swizzles Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>	2022-04-04 21:42:23 +00:00
Jason Ekstrand	54509d27d9	anv: Disallow blending on swizzled formats Fixes: `c20f78dc5d` ("anv: Support swizzled formats.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>	2022-04-04 21:42:23 +00:00
Jason Ekstrand	257a20f40d	intel/isl: Add a helper for swizzling color values Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15624>	2022-04-04 21:42:23 +00:00
Ian Romanick	7fd1955412	nir: intel/compiler: Lower TXD on array surfaces on DG2+ DG2 can only do sample_d and sample_d_c on 1D and 2D surfaces. Cube maps and 3D surfaces were already handled, but 1D array and 2D array surfaces were not. Fixes the following Vulkan CTS failures on DG2: dEQP-VK.glsl.texture_functions.texturegradclamp.isampler1darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.isampler2darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler1darray_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_fixed_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.sampler2darray_float_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler1darray_fragment dEQP-VK.glsl.texture_functions.texturegradclamp.usampler2darray_fragment The Fixes: tag below is a bit misleading. This commit adds another lowering, similar to the one in the Fixes: commit, that probably should have been added at the same time. I just want to make sure this commit gets applied everywhere that commit was also applied. Fixes: `635ed58e52` ("intel/compiler: Lower txd for 3D samplers on XeHP.") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15681>	2022-03-31 12:59:18 -07:00
Rohan Garg	d876abeaa8	anv: Drop dead code in anv_UpdateDescriptorSets Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15666>	2022-03-30 15:19:47 +02:00
Lionel Landwerlin	684a4ea30c	intel/clc: fix missing pointer write Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `346a7f14fb` ("intel/compiler: Add code for compiling CL-style SPIR-V kernels") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15611>	2022-03-30 07:56:25 +00:00
Mike Blumenkrantz	65ec846f77	intel/isl: fix 2d view of 3d textures according to KHR_gl_texture_3D_image: If <target> is EGL_GL_TEXTURE_3D_KHR, <buffer> must be the name of a complete, nonzero, GL_TEXTURE_3D (or equivalent in GL extensions) target texture object, cast into the type EGLClientBuffer. <attr_list> should specify the mipmap level (EGL_GL_TEXTURE_LEVEL_KHR) and z-offset (EGL_GL_TEXTURE_ZOFFSET_KHR) which will be used as the EGLImage source; the specified mipmap level must be part of <buffer>, and the specified z-offset must be smaller than the depth of the specified mipmap level. thus a 2d view of a 3d surface is not only legal, it's part of the spec and must be supported when available cc: mesa-stable Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15584>	2022-03-29 21:44:51 +00:00
Kenneth Graunke	8831cb38aa	anv: Stop updating STATE_BASE_ADDRESS on XeHP Now that we're using 3DSTATE_BINDING_TABLE_POOL_ALLOC to set the base address for the binding table pool separately from surface states, we don't actually need to update surface state base address anymore. Instead, we can just set STATE_BASE_ADDRESS once at context creation, and never bother updating it again, saving some heavyweight flushes and freeing us from the need for address offsetting trickery. This patch was originally written by Jason Ekstrand, with fixes from Lionel Landwerlin, but was targeting Icelake. Doing it there requires additional changes (15:5 -> 18:8 binding table pointer formats) which also involve some trade-offs, whereas the XeHP change is purely a win, so we'll do it here first. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15616>	2022-03-29 20:45:59 +00:00
Kenneth Graunke	1967fd3b10	intel/compiler: Call inst->resize_sources before setting the sources You should probably resize the sources array before accessing entries that might be out of bounds. inst->resize_sources() always allocates enough space for at least 3 sources, so this is really only an issue when there are 4+ sources. Fixes: `a920979d4f` ("intel/fs: Use split sends for surface writes on gen9+") Fixes: `4f86a70599` ("intel/fs: Lower DW untyped r/w messages to LSC when available") Fixes: `d372abe397` ("intel/fs: Add surface OWORD BLOCK opcodes") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15632>	2022-03-29 13:06:17 -07:00
Kenneth Graunke	9bc97e4fc1	intel/decoder: Fix decoder handling of binding table pool alloc on XeHP 3DSTATE_BINDING_TABLE_POOL_ALLOC no longer has a "Binding Table Pool Enable" bit. It is always enabled. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15625>	2022-03-29 02:35:54 -07:00
Georg Lehmann	922916bf64	nir: Move lower_usub_sat64 to nir_lower_int64_options. Signed-off-by: Georg Lehmann <dadschoorse@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15421>	2022-03-28 20:02:52 +00:00
Kenneth Graunke	823745dc27	intel/compiler: Use nir_opt_uniform_atomics() In general, an atomic intrinsic may perform separate atomics for every enabled SIMD channel, as each channel may operate on different memory. However, an extremely common case is for all channels to access the same memory location. In this case, we can simply perform a reduction/scan across the subgroup, and perform one atomic for the whole subgroup, rather than one per channel. For example, if an intrinsic says to take the minimum value of the existing memory and the value in each channel, we can do a thread-local minimum of all enabled channels, then do a single atomic to take the minimum of that and the existing memory. Our hardware doesn't optimize the case where multiple channels ask for atomics on the same memory location; it assumes the compiler will do so. nir_opt_uniform_atomics() uses divergence analysis to detect this case, adds the necessary subgroup operations, and moves the atomic inside a conditional that disables all but a single invocation. It even detects cases where the shader code already performs this kind of optimization, and avoids doing it a second time. This may not be the optimal solution for us. In the backend, we could detect this case and emit send(1) instructions with NoMask, rather than generating if...send(16)...endif, and a lot of unnecessary ALU ops. But it's simple to do, reuses the same path as ACO, and still provides most of the benefit by cutting up to 16x atomics down to a single atomic, which is more merciful to the memory bus. Improves performance of Shadow of the Tomb Raider by 5.5% on XeHP. Improves performance of a customer-internal benchmark on XeHP at 3840x2160 and low settings by approximately 30%. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	49ef23f4a6	intel/compiler: Convert to LCSSA and use divergence analysis. We'll use this more shortly. For now, enable it separately in case anything bisects to this. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	b3942beecf	intel/compiler: Set divergence analysis options Although we don't use divergence analysis yet, we've had several work-in-progress series that make use of it. We may as well set our options so that those series can assume they're in place. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Kenneth Graunke	6fa66ac228	intel/compiler: Implement nir_intrinsic_last_invocation We haven't exposed this intrinsic as it doesn't directly correspond to anything in SPIR-V. However, it's used internally by some NIR passes, namely nir_opt_uniform_atomics(). We reuse most of the infrastructure in brw_find_live_channel, but with LZD/ADD instead of FBL. A new SHADER_OPCODE_FIND_LAST_LIVE_CHANNEL is like SHADER_OPCODE_FIND_LIVE_CHANNEL but from the other side. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15484>	2022-03-26 00:28:19 +00:00
Caio Oliveira	c32d386ce2	intel/compiler: Inline TUE map computation into TUE Input lowering Refactor since the TUE compute function is simpler now and the comments make sense being near the lowering. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>	2022-03-25 23:29:19 +00:00
Caio Oliveira	c36ae42e4c	intel/compiler: Use nir_var_mem_task_payload Instead of reusing the in/out slot mechanism, use a separate NIR variable mode. This will make it easier later to implement staging the output in shared memory (and storing all at the end to the URB). Note that to get 64-bit type support we currently rely on the brw_nir_lower_mem_access_bit_sizes() pass. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15022>	2022-03-25 23:29:19 +00:00
Boris Brezillon	49c8b93288	anv: Stop using VK_OUTARRAY_MAKE() We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED() so people don't get tempted to use it and make things incompatible with MSVC (which doesn't support typeof()). Suggested-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>	2022-03-25 11:00:03 +00:00
Caio Oliveira	f82731d0d7	intel/fs: Fix IsHelperInvocation for the case no discard/demote are used Use emit_predicate_on_sample_mask() helper that does check where to get the correct mask depending on whether discard/demote was used or not. Fixes: `45f5db5a84` ("intel/fs: Implement "demote to helper invocation"") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>	2022-03-25 08:20:27 +00:00
Caio Oliveira	bb311c22df	intel/fs: Initialize the sample mask in flags register when using demote Without this change, a check for "is helper invocation" could read uninitialized values. Fixes: `45f5db5a84` ("intel/fs: Implement "demote to helper invocation"") Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15400>	2022-03-25 08:20:27 +00:00
Lionel Landwerlin	8cdd5647c6	anv: don't store sample location sample count This information should match the current pipeline sample count. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>	2022-03-24 10:49:07 +00:00
Lionel Landwerlin	6f5f817c0f	anv: fix dynamic sample locations on Gen7/7.5 3DSTATE_MULTISAMPLE should be baked into the pipeline if not dynamic. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `27ee40f4c9` ("anv: Add support for sample locations") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>	2022-03-24 10:49:07 +00:00
Lionel Landwerlin	8ad78671b3	anv: use local dynamic pointer more Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15310>	2022-03-24 10:49:07 +00:00
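
The message of commit 823745dc27 above describes replacing one atomic per enabled SIMD channel with a subgroup reduction followed by a single atomic when every channel targets the same address, and commit 6fa66ac228 mentions finding the last live channel with LZD/ADD. The following is only a scalar C simulation of those two ideas, not Mesa code: `atomic_umin`, `per_channel_umin`, `uniform_umin`, `last_live_channel`, the 16-wide subgroup, and the `exec_mask` representation are all hypothetical names and assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SUBGROUP_SIZE 16

/* Counts how many simulated atomic memory operations were issued. */
static unsigned atomic_ops;

/* Hypothetical stand-in for a hardware atomic unsigned-min on memory. */
static uint32_t atomic_umin(uint32_t *mem, uint32_t value)
{
    uint32_t old = *mem;
    if (value < old)
        *mem = value;
    atomic_ops++;
    return old;
}

/* Naive form: every enabled channel issues its own atomic. */
static void per_channel_umin(uint32_t *mem, const uint32_t *values,
                             uint32_t exec_mask)
{
    for (unsigned ch = 0; ch < SUBGROUP_SIZE; ch++) {
        if (exec_mask & (1u << ch))
            atomic_umin(mem, values[ch]);
    }
}

/* Uniform-address form: reduce across enabled channels first,
 * then issue one atomic for the whole subgroup. */
static void uniform_umin(uint32_t *mem, const uint32_t *values,
                         uint32_t exec_mask)
{
    if (exec_mask == 0)
        return; /* no live channels, nothing to do */

    uint32_t reduced = UINT32_MAX;
    for (unsigned ch = 0; ch < SUBGROUP_SIZE; ch++) {
        if ((exec_mask & (1u << ch)) && values[ch] < reduced)
            reduced = values[ch];
    }
    atomic_umin(mem, reduced);
}

/* Index of the highest enabled channel, computed as 31 minus the
 * leading-zero count of the mask (the LZD/ADD idea, done in software). */
static int last_live_channel(uint32_t exec_mask)
{
    assert(exec_mask != 0);
    int leading_zeros = 0;
    while (!(exec_mask & 0x80000000u)) {
        exec_mask <<= 1;
        leading_zeros++;
    }
    return 31 - leading_zeros;
}
```

With eight enabled channels, `per_channel_umin` issues eight simulated atomics while `uniform_umin` issues one, and both leave the same minimum in memory. The sketch assumes the address has already been proven uniform; the commit message says the real pass establishes that with divergence analysis.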


7861 Commits