AlexIndustrial/mesa

Author	SHA1	Message	Date
Lionel Landwerlin	1908d2c171	anv: split image view from anv_image.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	eff01c46d8	anv: split buffer view from anv_image.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	f5af56528b	anv: split sampler from anv_device.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	543c726781	anv: split buffer from anv_device.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	c59e8e814a	anv: split events from anv_device.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	ca51a02e7b	anv: split physical_device from anv_device.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
Lionel Landwerlin	c7ecf10c20	anv: split instance from anv_device.c Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30285>	2024-07-22 18:46:05 +00:00
José Roberto de Souza	69ee1c4b46	anv: Drop useless 'if (total_scratch > 0) {' block in cmd_buffer_ensure_cfe_state() cmd_buffer_ensure_cfe_state() returns earlier if total_scratch == 0 here: if (total_scratch <= comp_state->scratch_size) return; Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>	2024-07-22 18:17:38 +00:00
José Roberto de Souza	de5d767f9a	intel/brw: Add a maximum scratch size restriction Gfx 12.5 moved scratch to a surface and SURFTYPE_SCRATCH has this pitch restriction: RENDER_SURFACE_STATE::Surface Pitch For surfaces of type SURFTYPE_SCRATCH, valid range of pitch is: [63,262143] -> [64B, 256KB] The pitch of the surface is the scratch size per thread and the surface should be large enough to accommodate every physical thread. So here adding a new field to intel_device_info, setting it in intel_device_info_init_common() so even offline tools can have it set. And finally adding a check to fail shader compilation if needed scratch is larger than supported. This issue can be reproduced in debug builds when running dEQP-VK.protected_memory.stack.stacksize_1024 on Gfx 12.5 or newer platforms. Ref: BSpec 43862 (r52666) Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30271>	2024-07-22 18:17:38 +00:00
Paulo Zanoni	c65a76db85	anv/trtt: don't just crash when we can't find device->trtt.queue Please refer to the big comment this patch introduces. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	3ab8ff99fa	anv/trtt: fix the process of picking device->trtt.queue We want to use actual sparse-capable queues as the default trtt->queue, not copy queues that may have a companion_rcs_batch. Before this patch, if we expose more than one queue and the application creates a copy queue first, we'll end up setting trtt->queue as the copy queue, which will GPU hang when we submit the TR-TT batches as they don't support the pipe_control commands we issue. The trtt->queue queue is used for binding/unbinding buffers in code paths where there's no specific queue coming from user space, such as when we're creating or destroying a sparse resource. This is not a problem yet on i915.ko since we are exposing only a single queue, and it is not a problem for xe.ko since TR-TT is not the default there. This is also not a problem in applications that create the render or compute queue first. We plan to expose more queues when using TR-TT, so this would become a problem without this patch. None of VK-GL-CTS seems to exercise that, and none of the Steam games I tested exercise that as well. I was able to reproduce this issue using our internal tracing tool. v2: New implementation that doesn't break when we only have a compute queue (Lionel). Fixes: `04bfe828db` ("anv/sparse: allow sparse resouces to use TR-TT as its backend") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	5ca224aa0c	anv/trtt: make all contexts have the same TR-TT programming On Gen12 (the oldest we support on Mesa right now for TR-TT) we started having per-engine TR-TT registers and we are supposed to make all contexts share the same TR-TT programming. On LNL+, this is documented in the BSpec page for the TRTT_CNTRL register (68417), with more details in HSDs 14020454786 and 16022013154. On Gen12 platforms this information is a little harder to find and there's a whole trail of HSDs leading up to 1209977595, which links to the documents that describe the programming. BSpec for TR-TT on Gen12 is very confusing as it still contains registers and other information from Gen11 that were not removed. Regarding the additional BLT and COMP registers, please notice that on the BSpec pages for the TR-TT registers, the "Register Instance" section only lists the GFX registers as non-privileged. However, the "User Mode Privileged Commands" lists the other instances of the TR-TT Registers as non-privileged, which matches what we see: there's no need to put these addresses in the FORCE_TO_NONPRIV registers. Notice that for now, when TR-TT is being used we only expose a single queue, so this change effectively does nothing until we start exposing extra queues. I left that part for later to help bisectability. v2: - s/trtt_init_context_state/trtt_init_queues_state/ (José) - pass device as the argument to init_queues_state (José) v3: - use async_submit_end (José) Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	6415027d85	anv/trtt: submit a separate batch in anv_trtt_init_context_state() Having this as a separate batch was the normal behavior until `7da5b1caef` ("anv: move trtt submissions over to the anv_async_submit"). While it certainly sounds better to do everything related to TR-TT initialization in one batch, we need to revert it back to be a separate batch (but now using the new anv_async_submit infrastructure) because we'll want to run this batch on every engine. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	abbb4b20f3	anv/trtt: check the return value of anv_trtt_init_context_state() I haven't seen this happening anywhere, but let's have it for correctness. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	fb9d94f4ed	anv/trtt: make genX(init_trtt_context_state) a little more compact In this series we're going to further change this function, adding a lot more lines, so this patch should make the next diffs a little easier to comprehend and review. Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:34 -07:00
Paulo Zanoni	6bc9a57173	intel/genxml: add the BLT and COMP_CTX0 versions of the TR-TT registers Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30252>	2024-07-22 10:04:33 -07:00
Rohan Garg	fe387e14b5	anv: use the WA infrastructure when emitting WA 16013994831 Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30295>	2024-07-22 13:43:39 +00:00
Sushma Venkatesh Reddy	2f6919e6c2	intel/clflush: Utilize clflushopt in intel_invalidate_range On MTL ChromeOS boards, during AI based video conference, we were observing a lot of overhead from invalidations. Upon debug, it was found that we were using clflush in this function and that isn't efficient. With this change, while executing compute workloads like zoo models, we are getting ~25% performance improvements in a best case scenario. Rework: * Jordan: Call intel_clflushopt_range() rather than __builtin_ia32_clflushopt() because intel_mem.c is not compiled with -mclflushopt. Backport-to: 24.1 24.2 Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30238>	2024-07-20 16:10:16 +00:00
Francisco Jerez	ff3c3792b4	anv/gfx12.5: Pass non-empty push constant data to PS stage for TBIMR workaround. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10728 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11399 Fixes: `57decad976` ("intel/xehp: Enable TBIMR by default.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>	2024-07-20 01:13:19 +00:00
Francisco Jerez	b98eebbcb2	intel/brw: Implement null push constant workaround. This implements an undocumented workaround for a hardware bug that affects draw calls with a pixel shader that has 0 push constant cycles when TBIMR is enabled, which has been seen to lead to a hang with Fallout 3 and Metal Gear Rising Revengeance. This hardware bug has been reported as HSDES#22020184996 which is still pending a resolution by the hardware team. However since this workaround found empirically has been confirmed to fix the issue reliably and it's relatively harmless it seems worth checking in already even though no final W/A number is available nor has the W/A json file been updated. To avoid the issue we simply pad the push constant payload to be at least 1 register. This is enabled via a brw_wm_prog_key since the driver needs to be in agreement with the compiler on whether the dummy push constant cycle is present, and it can be avoided in cases where the driver knows that TBIMR will be disabled (e.g. for BLORP). Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10728 Related: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11399 Fixes: `57decad976` ("intel/xehp: Enable TBIMR by default.") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>	2024-07-20 01:13:19 +00:00
Francisco Jerez	bb2513918a	intel/dev: Add devinfo flag for TBIMR push constant workaround. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30031>	2024-07-20 01:13:19 +00:00
Jordan Justen	bb8063e1f4	anv/generated_indirect_draws: Adjust xe2 simd32 sends_count_expectation Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `e9f63df` ('intel/dev: Enable LNL PCI IDs without INTEL_FORCE_PROBE') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30093>	2024-07-19 16:09:06 +00:00
Daniel Stone	e05415a82e	format: Generate endian-independent format aliases Instead of having a hardcoded list of endian-independent format aliases in the header, generate them from the format definitions. Signed-off-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29649>	2024-07-19 13:50:42 +00:00
Lionel Landwerlin	692e1ab2c1	anv: get rid of the second dynamic state heap Pretty big change... Sorry for that. I can't exactly remember why I created 2 heaps. I think it's because I mistakenly thought the samplers in the binding sampler pointers needed to be indexed from the binding table. But that's not the case, they just need to be in the dynamic state heap. In the future, this change will allow to also allocate buffers for push constant data in the newly created dynamic_visible_pool which will be useful on < Gfx12.0 where this is the only place push constant data can live for compute shaders. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30047>	2024-07-19 12:21:46 +00:00
David Heidelberg	decc040abe	intel/debug: allow silencing CL warnings Useful for CI and users previously aware of the warning. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: David Heidelberg <david@ixit.cz> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29691>	2024-07-19 00:24:29 +00:00
Lionel Landwerlin	67b778445a	brw: fix uniform rebuild of sources If you have something like this : con 32 %66 = @load_reg (%62) (base=0, legacy_fabs=0, legacy_fneg=0) con 32 %27 = @resource_intel (%22 (0xdeaddead), %66, %67, %17 (0x0)) (desc_set=2, binding=96, resource_intel=0, resource_block_intel=-1) Just copying the brw_reg in ssa_values[] is not enough for the load_reg intrinsic. We need to call get_nir_src() to force some logic to create the register correct. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `b8209d69ff` ("intel/fs: Add support for new-style registers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30050>	2024-07-18 19:58:46 +00:00
Kenneth Graunke	d630ff1f79	intel/brw: Disallow scalar byte to float conversions on DG2+ I haven't been able to find this restriction mentioned anywhere in the hardware documentation, but the simulator has code to reject this case as invalid, and it doesn't appear to work on hardware anymore. Having lower_regioning() handle this takes care of the issue so we don't have to worry about generating it in random places. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11489 Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30140>	2024-07-18 18:51:35 +00:00
Sushma Venkatesh Reddy	7ca77370d2	anv: Fix I915_PARAM_HAS_CONTEXT_FREQ_HINT check When I915_PARAM_HAS_CONTEXT_FREQ_HINT is not supported the intel_ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) will return -1 and that will cause i915_gem_get_param() to return false. val will be different than 1 when not using GuC submission, so we force the val check to ensure this holds on platforms that don't support GuC submission. Fixes: `d52dd5a9` ("anv/drirc: add option to provide low latency hint") Signed-off-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30234>	2024-07-18 18:26:38 +00:00
Kenneth Graunke	534f0019d7	intel/brw: Don't mix types for unary extended math instructions We were generating odd instructions like: math inv(8) g93<1>HF g85<8,8,1>HF null<8,8,1>F { align1 1Q @7 $4 }; It's unclear whether the type of the null operand matters, but sometimes these things don't get ignored properly. Out of caution, retype the null source to match the actual operand's type. It'll at least look less surprising in assembly dumps. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30193>	2024-07-18 03:25:06 +00:00
Iván Briano	c8d64860ec	anv: set MOCS for protected memory when needed We were missing setting the EncryptedData bit in the MOCS field when emitting the surface states for protected buffer/images. How this works on ADL remains a mystery to me. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11313 Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30097>	2024-07-17 22:56:51 +00:00
Iván Briano	ece7abb599	anv: get scratch surface from the correct pool Fixes: `3ccf80f9b1` ("anv: prepare 2 variants of all shader instructions") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30097>	2024-07-17 22:56:51 +00:00
José Roberto de Souza	0500e35165	intel/dev: Drop writeback_incoherent from Xe2 Xe2 platforms are only supported by Xe KMD that do not support CPU WB + 0 way coherent. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	6d77dfa75d	intel/dev: Use GPU WB PAT for Xe2 writecombining So for this entry we want the CPU mapping to be WC but GPU caches can be WB. This way the GPU doesn't need to snoop CPU caches, and at the end of workloads the L3 cache is flushed, so CPU access is coherent after getting the signal that the workload has finished. With this the transient (XD) L3 flushes will only affect displayable buffers. Ref: Bspec 71582 (r59285) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	48da8eab55	intel/dev: Add comment documenting the PAT entries As said in the previous patch, coherency is not needed and there was a misunderstanding about the caching used by the CPU and GPU. This new comment explains it much better. Ref: Bspec 45101 (r51017) Ref: Bspec 71582 (r59285) Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	7295e09b53	intel/dev: Drop coherency from intel_device_info_pat_entry It is not used in run-time so we can drop from the struct. It might have value as PAT entries documentation but that will be done in the next patch. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	fa1129540a	intel/dev: Add documentation about intel_device_info_pat_entry::mmap My initial understanding was that L3_CACHE_POLICY would be the CPU caching mode, but it has nothing to do with CPU caching; it is the GPU caching mode. Due to this misunderstanding we were using a suboptimal PAT index, which will be fixed in the next patches. To avoid such issues in the future, add comments to the intel_device_info_pat_entry struct. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	4173e0f910	intel/dev: Drop DG1 PAT entries It inherits that table from TGL. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	178950bf9b	anv: Fix return of PAT index for compressed bos for discrete GPUs Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jianxun Zhang <jianxun.zhang@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29950>	2024-07-17 17:41:32 +00:00
José Roberto de Souza	4fd7cad05d	intel: Rename XE_PERF to XE_OBSERVATION Xe KMD renamed XE_PERF to XE_OBSERVATION to better match with Intel specification and avoid confusion. This uAPI rename will land in the same kernel version that added the uAPI being renamed. There is no uAPI change, just renames. Sync xe_drm.h with 63347fe031e3 ("drm/xe/uapi: Rename xe perf layer as xe observation layer"). Acked-by: Caio Oliveira <caio.oliveira@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30027>	2024-07-17 01:00:34 +00:00
Caio Oliveira	e3e712e74e	intel/elk: Convert missing uses of ralloc to linear in fs_live_variables And use the non-zeroing variant in cases we are filling the data immediately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30201>	2024-07-16 23:53:45 +00:00
Caio Oliveira	3700e49fff	intel/brw: Convert missing uses of ralloc to linear in fs_live_variables And use the non-zeroing variant in cases we are filling the data immediately. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30201>	2024-07-16 23:53:45 +00:00
Paulo Zanoni	241585667f	anv: reimplement the anv_fake_nonlocal_memory workaround Commit `94989b45a5` ("anv,driconf: Add fake non device local memory WA for Total War: Warhammer 3") implemented a workaround to make Warhammer 3 work on ADL, but the game still doesn't work on LNL, which uses xe.ko, and MTL, which uses i915.ko: it still fails at launch claiming it couldn't allocate memory. So in this implementation, instead of clearing DEVICE_LOCAL_BIT we just duplicate our memory types, one having the bit and one not having. v2: - Check for VK_MAX_MEMORY_TYPES (José) - Invert the order of the memory types (José) - Fix white space issue (José) v3: - Comment our non-spec-compliance (José) - Remove useless lines (José) Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8721 Fixes: `94989b45a5` ("anv,driconf: Add fake non device local memory WA for Total War: Warhammer 3") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30162>	2024-07-16 20:43:02 +00:00
Dave Airlie	d94a40fe08	anv/video: use correct offset for MPR row store scratch buffer. While playing with zink video, I found this was using the wrong offset. Fixes: `98c58a16ef` ("anv: add initial video decode support for h264.") Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30143>	2024-07-15 01:05:18 +00:00
Caio Oliveira	f48b3bee31	intel/brw: Split off assembler logic into library Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30006>	2024-07-12 19:34:23 +00:00
Rohan Garg	5bb9c1cca9	anv: reuse existing macro to query for flushes Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30102>	2024-07-12 10:50:12 +00:00
Caio Oliveira	c2d1e10315	intel/brw: Don't print extra newlines in assembler Handle '\n' when inside the MSGDESC start condition, otherwise the lexer would apply its default rule (write to stdout). Without that, newlines were "leaking" to the output when parsing a multiple line "MsgDesc". E.g. given the file example.asm below ``` send(8) nullUD g126UD nullUD 0x02000000 0x00000000 thread_spawner MsgDesc: mlen 1 ex_mlen 0 rlen 0 { align1 WE_all 1Q @1 EOT }; ``` the assembler would produce one extra newline ``` $ brw_asm -t hex -g tgl example.asm 31 01 03 80 04 00 00 00 0c 7e 00 70 00 00 00 00 ``` Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30100>	2024-07-11 21:07:54 +00:00
Caio Oliveira	e63b0571bc	intel/brw: Account for reg_unit() in assembler Use reg_unit() to match the internal representation in brw_reg. Fixes the assembler tool when targeting Xe2. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30060>	2024-07-11 16:38:54 +00:00
Caio Oliveira	6cdd56e7ed	intel/brw: Use brw_inst_set_group() to set QtrCtrl and NibCtrl The function handles the Xe2 case where NibCtrl is gone. Also add error messages for invalid input when assembling for Xe2, e.g. "2N". Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30060>	2024-07-11 16:38:54 +00:00
Caio Oliveira	c3c65e8821	intel/brw: Don't set acc_wr_control for Xe2 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30060>	2024-07-11 16:38:54 +00:00
Kenneth Graunke	837c441acb	intel/nir: Don't needlessly split u2f16 for nir_type_uint32 Commit `f695a9fed2` moved the 64-bit float <-> 16-bit float conversion splitting into a core NIR pass, so the code remaining here is only needed for 64-bit integer types. Presumably in an attempt to remove the float handling, it replaced simple bit_size == 64 checks with this expression: (full_type & (nir_type_int64 \| nir_type_uint64)) I believe that the intended expression was: (full_type == nir_type_int64 \|\| full_type == nir_type_uint64) Unfortunately, the former is incorrect. Any integer or unsigned NIR type would trigger the former expression. For example: nir_type_uint32 & (nir_type_int64 \| nir_type_uint64) => nir_type_uint This meant that we were splitting e.g. u2f16 on 32-bit unsigned types into u2f32 and f2f16, when we can easily natively handle that case. To fix this, we go back to simple bit_size == 64 checks. This pass is already run after nir_lower_fp16_casts which will split the float case, so we will never see it here. fossil-db on Alchemist shows a -1.14% reduction in affected shaders for google-meet-clvk shaders. In another ChromeOS workload, it improves performance by around 8% on Meteorlake. Thanks to Sushma Venkatesh Reddy for finding this performance issue! Fixes: `f695a9fed2` ("intel/compiler: use nir_lower_fp16_casts") Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30091>	2024-07-11 02:37:05 -07:00
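
The type-mask problem described in commit 837c441acb can be reproduced in isolation. The sketch below is not Mesa code: the `SKETCH_*` names are hypothetical local stand-ins, and their numeric values (a base-type bit OR'd with the bit size, plus the size mask) are assumed to mirror the NIR ALU type encoding. It shows why the bitwise AND test also fires for 32-bit unsigned types, while a plain bit-size comparison does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the NIR ALU type encoding (assumed values):
 * each sized type is a base-type bit OR'd with its bit size. */
enum sketch_alu_type {
   SKETCH_TYPE_INT    = 2,
   SKETCH_TYPE_UINT   = 4,
   SKETCH_TYPE_UINT32 = 32 | SKETCH_TYPE_UINT,
   SKETCH_TYPE_INT64  = 64 | SKETCH_TYPE_INT,
   SKETCH_TYPE_UINT64 = 64 | SKETCH_TYPE_UINT,
};

/* Assumed mask covering the bit-size part of the encoding (1, 8, 16, 32, 64). */
#define SKETCH_SIZE_MASK (1u | 8u | 16u | 32u | 64u)

/* The faulty test: the mask contains the INT and UINT base-type bits,
 * so any integer type of any size produces a non-zero result. */
static bool mask_check_is_64bit(unsigned full_type)
{
   return (full_type & (SKETCH_TYPE_INT64 | SKETCH_TYPE_UINT64)) != 0;
}

/* The fix described in the commit message: compare the bit size only. */
static bool bit_size_check_is_64bit(unsigned full_type)
{
   return (full_type & SKETCH_SIZE_MASK) == 64;
}

/* Small self-check covering the case from the commit message. */
static void sketch_self_check(void)
{
   /* uint32 & (int64 | uint64) leaves the UINT base-type bit set. */
   assert((SKETCH_TYPE_UINT32 & (SKETCH_TYPE_INT64 | SKETCH_TYPE_UINT64))
          == SKETCH_TYPE_UINT);
   assert(mask_check_is_64bit(SKETCH_TYPE_UINT32));      /* false positive */
   assert(!bit_size_check_is_64bit(SKETCH_TYPE_UINT32)); /* correctly rejected */
   assert(bit_size_check_is_64bit(SKETCH_TYPE_UINT64));
}
```

Calling `sketch_self_check()` from a `main()` exercises the 32-bit unsigned case: under the assumed encoding, the mask test misclassifies it as 64-bit, which corresponds to the needless u2f16 split the commit removes.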

12381 Commits