AlexIndustrial/mesa

Author	SHA1	Message	Date
Timothy Arceri	cb58d75224	nir/nir_opt_copy_prop_vars: don't call memset when cloning This makes the pass significantly faster, cutting execution time by around 30% in the CTS test dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20. This 30% improvement is in addition to all the improvements from the preceding patches. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
Timothy Arceri	d1a41d9c64	nir/nir_opt_copy_prop_vars: reorder clone calls This helps with the reuse of dynamic arrays. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
Timothy Arceri	2a2d85e254	nir/nir_opt_copy_prop_vars: reuse dynamic arrays As per the previous commit if we don't reuse these dynamic arrays we end up needlessly thrashing the memory handling functions. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
Timothy Arceri	ffe0f3fda1	nir/nir_opt_copy_prop_vars: reuse hash tables Due to how this pass works we can end up thrashing memory if we recreate these hash tables rather than reusing them. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
Timothy Arceri	731e9fd535	nir/nir_opt_copy_prop_vars: avoid comparison explosion Previously the pass was comparing every deref to every load/store, causing the pass to slow down more the larger the shader is. Here we use a hash table so we can simply store everything needed for comparison of a var separately. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
Timothy Arceri	8f6f5730f6	nir/nir_opt_copy_prop_vars: remove extra loop The fix in `947f7b452a` introduced an extra loop over the copies array to find the correct entry in the case it had been moved. The problem is these loops can be iterated over millions of times, so let's simply update the entry pointer in the case we change its location in the array. Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20381>	2023-02-16 23:31:59 +00:00
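The idea in `8f6f5730f6` of updating the entry pointer when an element changes position, instead of searching for it again, can be sketched roughly as follows. This is a minimal illustration and not the code from the pass: `struct entry`, `swap_remove_tracking` and the swap-with-last removal scheme are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an element of the copies array. */
struct entry {
   int id;
};

/* Removes arr[idx] by moving the last element into its slot and returns the
 * new element count. If *tracked pointed at the element that was moved, it
 * is repointed at the new slot, so the caller never has to loop over the
 * array to find it again. If *tracked pointed at the removed element, it is
 * cleared to NULL. */
static size_t
swap_remove_tracking(struct entry *arr, size_t count, size_t idx,
                     struct entry **tracked)
{
   assert(arr != NULL && idx < count);

   struct entry *slot = &arr[idx];
   struct entry *last = &arr[count - 1];

   if (tracked != NULL) {
      if (*tracked == slot)
         *tracked = NULL; /* the tracked entry itself was removed */
      else if (*tracked == last)
         *tracked = slot; /* the tracked entry is about to move */
   }

   if (slot != last)
      *slot = *last;

   return count - 1;
}
```

The pointer fix-up is constant time, which is what matters when the surrounding loop runs millions of times.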
Faith Ekstrand	4e09d37f3b	nir/from_ssa: Move the loop bounds check in resolve_parallel_copy We loop, effectively, over two stacks: ready and to_do and finish only when both are empty. In the case where ready is empty, we pull one off of to_do, add a copy to a temporary, and push it onto the ready stack. Previously, we assumed that we would never get to the temporary copy case if to_do has exactly one entry because that would imply that there was only one copy left which means there can't possibly be a cycle to break. This was true until `c7fc44f9eb` ("nir/from_ssa: Respect and populate divergence information") which changed things such that temporary copies sometimes get added in the case where a convergent value is copied both to convergent and divergent destinations. This patch adjusts our loop iteration to always attempt to clear the ready stack before checking if there's anything left on the to_do stack. I also added an assert to make the exit condition more clear. Fixes: `c7fc44f9eb` ("nir/from_ssa: Respect and populate divergence information") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/8037 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>	2023-02-16 20:23:42 +00:00
Faith Ekstrand	5afba073c6	nir/from_ssa: Only re-locate values that are destinations There is an optimization in the parallel copy algorithm where, after a copy has been performed, we can treat the destination as the new source for future copies of the same source. In particular, consider the following parallel copy: A -> B, C -> A, A -> C. In this case, after we have done the A -> B copy, we can make note that the value in A is now in B and emit the sequence: A -> B, C -> A, B -> C. This allows us to resolve the swap cycle between A and C without allocating a temporary register because we know B is also a copy of A. When one of the registers involved is convergent and the other is divergent, this optimization is problematic because, while convergent to divergent copies are fine, we can't re-use the divergent copy in later copies if any of those copies are to a convergent variable. We could, but it would require a read_first_invocation which would get messy. In `c7fc44f9eb` ("nir/from_ssa: Respect and populate divergence information"), we attempted to deal with this by limiting the rename optimization to the case where the divergence matched. The problem is that we did the re-name part whenever the divergence matched but only marked it as ready if the thing being copied was a destination. (We actually left two instances of loc[a] = b, one which always happened and one which only happened if we also wanted to flag the source as being ready to use as a destination.) While this technically doesn't cause any problems, it may result in more inter-mov dependencies which hurts instruction scheduling. For example, if we had the parallel copy A -> B, A -> C, A -> D, we now end up emitting the sequence A -> B, B -> C, C -> D which has many more data hazards between instructions caused by the constant shuffling. This commit restores the original logic in which we only perform the rename optimization if the rename would free up a register we will later use as a destination. This isn't entirely optimal as it still doesn't prove that there is a cycle involved first, but it should lead to a reduction in unnecessary dependencies. No shader-db changes on SKL or DG2 Fixes: `c7fc44f9eb` ("nir/from_ssa: Respect and populate divergence information") Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21315>	2023-02-16 20:23:42 +00:00
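The ready/to_do loop and the rename rule described in `4e09d37f3b` and `5afba073c6` can be sketched as below. This is a simplified illustration under stated assumptions, not the code in nir_from_ssa: registers are plain integer indices, divergence is not modelled, a single scratch register `tmp` is supplied by the caller, and the names `struct move` and `sequentialize` are hypothetical.

```c
#include <assert.h>

#define MAX_REGS 16

struct move {
   int from, to;
};

/* Turns the parallel copy src[i] -> dst[i] (destinations must be unique)
 * into an ordered list of moves written to out, which needs room for 2 * n
 * entries. tmp is a scratch register that appears in neither src nor dst.
 * Returns the number of moves emitted. */
static int
sequentialize(const int *src, const int *dst, int n, int tmp,
              struct move *out)
{
   int loc[MAX_REGS], pred[MAX_REGS];
   int ready[MAX_REGS], to_do[MAX_REGS];
   int n_ready = 0, n_to_do = 0, n_out = 0;

   assert(n <= MAX_REGS && tmp >= 0 && tmp < MAX_REGS);
   for (int r = 0; r < MAX_REGS; r++)
      loc[r] = pred[r] = -1;

   for (int i = 0; i < n; i++) {
      assert(src[i] >= 0 && src[i] < MAX_REGS);
      assert(dst[i] >= 0 && dst[i] < MAX_REGS);
      if (src[i] == dst[i])
         continue;             /* self-copy, nothing to do */
      loc[src[i]] = src[i];    /* the value of src currently lives in src */
      pred[dst[i]] = src[i];   /* dst wants the value of src */
      to_do[n_to_do++] = dst[i];
   }

   /* A destination nobody reads from can be overwritten right away. */
   for (int i = 0; i < n_to_do; i++) {
      if (loc[to_do[i]] == -1)
         ready[n_ready++] = to_do[i];
   }

   while (n_ready > 0 || n_to_do > 0) {
      /* Always drain the ready stack before looking at to_do. */
      while (n_ready > 0) {
         int b = ready[--n_ready];
         int a = pred[b];
         out[n_out++] = (struct move){loc[a], b};
         pred[b] = -1;         /* b has been filled */

         /* Rename only if a is itself a pending destination: its value now
          * also lives in b, so a is free to be overwritten. */
         if (a == loc[a] && pred[a] != -1) {
            loc[a] = b;
            ready[n_ready++] = a;
         }
      }

      if (n_to_do == 0)
         break;

      int b = to_do[--n_to_do];
      if (pred[b] == -1)
         continue;             /* already filled */

      /* Only cycles are left: save b in the scratch register. */
      out[n_out++] = (struct move){b, tmp};
      loc[b] = tmp;
      ready[n_ready++] = b;
   }

   return n_out;
}
```

With A, B, C numbered 0, 1, 2, the parallel copy A -> B, C -> A, A -> C comes out as A -> B, C -> A, B -> C without touching the scratch register, while A -> B, A -> C, A -> D keeps every move reading from A because A is never a destination.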
Rob Clark	9673502b3b	freedreno/drm: Optimize stateobj re-emit For long-lived stateobjs, it is common to re-emit to the same submit multiple times. By giving each submit a unique sequence # we can detect this case and skip the extra append_bo(). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	6747d30155	freedreno: Add seqno helper It is a pretty common pattern to allocate a non-zero sequence # for lightweight checking if an object is the same, changed, for use in cache keys, etc. (And also pretty common to forget to handle the rollover zero case.) Add a helper for this. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
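The non-zero sequence number pattern that `6747d30155` describes might look like the sketch below. The name `next_seqno_nonzero` and the 16-bit counter width are assumptions for illustration only; this version is also single-threaded, whereas a helper shared between threads would need an atomic increment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the freedreno API: hands out the next sequence
 * number from *counter and never returns zero, so zero can always be used
 * to mean "no seqno assigned yet". */
static uint16_t
next_seqno_nonzero(uint16_t *counter)
{
   assert(counter != NULL);

   uint16_t n = ++(*counter);
   if (n == 0)          /* the counter rolled over, skip the zero value */
      n = ++(*counter);
   return n;
}
```

An object can then store the seqno it last saw and compare it against the current one to cheaply detect "same as before", which is the check `9673502b3b` uses to skip a repeated append_bo().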
Rob Clark	8f2b22ba66	freedreno: Drop batch lock Now that we are not tracking cross-context batch dependencies, there is no scenario where one context could trigger flushing another context's batch. So we can drop the batch lock intended to protect against this. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	9a6de00e98	freedreno/batch: Stop tracking cross-context deps The app is expected to provide suitable cross-context synchronization (fences, etc), so don't try to do its job for them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	a4b949fe61	freedreno: Avoid taking screen lock Avoid taking screen lock for batch unref. Instead just split the destroy fxn into locked and unlocked variants. That way we only end up taking the screen lock on final unref but avoid it in the common case. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	35fc1595b3	freedreno/a6xx: Pre-compute PROG related LRZ state PROG state mostly just disables various LRZ related flags, which can be handled as a simple mask. The exception is ztest mode, which is either overridden by PROG state, or we use the all 1's value (which isn't valid from hw standpoint) to signal that it needs to be computed at draw time, which fortunately fits in with the bitmask approach. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	c938101bb5	freedreno: Move FD_MESA_DEBUG cases out of draw_vbo If the debug options are enabled, just plug in a debug version of draw_vbo with the additional checks. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	8942f4b734	freedreno: Move blend out of dirty-rsc tracking This was not doing any actual resource tracking, just updating gmem_reason. And furthermore, a6xx+ doesn't care about the bits it was setting. So move this to per-gen backend for the gens that need it, and avoid setting FD_DIRTY_RESOURCE when FD_DIRTY_BLEND is set. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	67d4bc7be4	freedreno/a6xx: Remove tex-state refcnting Now that we use a flag to trigger the tex state invalidation coming from other contexts, we can drop the refcnt'ing. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	cfd4721ee0	freedreno/drm: Make rb refcnt non-atomic Now that the one special case where multiple threads could race to ref/unref is gone, we can go back to using non-atomic refcnts. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	f91bcd2455	freedreno/a6xx: Do tex-state invalidates in same ctx If a resource invalidate is triggered by a different ctx (potentially on a different thread) simply flag that the tex state needs invalidation, but defer handling it to the ctx that owns the tex state. This will let us remove atomic refcnt'ing on the tex state, and more importantly atomic refcnt'ing on the fd_ringbuffer (as this was the one special case where rb's could be accessed from multiple threads). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	e7993d68e2	freedreno/a6xx: Multi-draw support Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	cc31997f1b	freedreno/a6xx: Split out flush_streamout() helper Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	911d67bdad	freedreno/a6xx: Drop unused return Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	c4e2e821a2	freedreno: Push num_draws down to backend Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Rob Clark	6bfee9e669	freedreno: Account for multi-draw in num_draws Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21274>	2023-02-16 19:57:13 +00:00
Daniel Schürmann	f6251b21f9	radv/rt: don't hash maxPipelineRayRecursionDepth The stack size has no effect on the generated shader anymore. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Daniel Schürmann	8e718c5b63	radv/rt: use dynamic_callable_stack_base also for static stack_sizes This patch also removes rt_pipeline->dynamic_stack_size and replaces it by checking for rt_pipeline->stack_size == -1u. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Daniel Schürmann	2649a1f272	radv/rt: introduce and set rt_pipeline->stack_size Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Daniel Schürmann	b338d59047	radv: unconditionally enable scratch for RT shaders Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Daniel Schürmann	aa362b4b6f	radv: rename shader_info->cs.uses_sbt -> shader_info->cs.is_rt_shader Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Konstantin Seurer	72d9604db0	radv: Clean up dynamic RT stack allocation Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21159>	2023-02-16 19:37:25 +00:00
Sidney Just	fc84c63e17	zink: Add missing features to the profile file Fixes: `2ea481b2f0` ("Zink: add Zink profiles file") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20920>	2023-02-16 19:11:57 +00:00
Sidney Just	60e0322092	zink: add check for samplerMirrorClampToEdge Vulkan 1.2 feature This adds a check to advertise PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE when either the extension is present or the Vulkan 1.2 feature is enabled. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20920>	2023-02-16 19:11:57 +00:00
Emma Anholt	ed62eec58b	hasvk: Fix SPIR-V warning about TF unsupported on gen7. It's supported now. Fixes: `d82826ad44` ("anv: Implement VK_EXT_transform_feedback on Gen7") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>	2023-02-16 18:11:44 +00:00
Emma Anholt	98455470ea	hasvk: Silence conformance warning in CI. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>	2023-02-16 18:11:44 +00:00
Emma Anholt	570acf5655	ci: Add a manual full and 1/10th hasvk CTS runs. These are manual since they're on a runner in my basement that sometimes can go down, but it'll be nice to have this for throwing the rare hasvk MR at. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21228>	2023-02-16 18:11:44 +00:00
Danylo Piliaiev	be976e0aa6	ci/tu: Add 1/200 pass to test for stale reg usage Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>	2023-02-16 17:43:10 +00:00
Danylo Piliaiev	86f82d4224	docs/freedreno: Add info about stale reg stomper dbg option Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>	2023-02-16 17:43:10 +00:00
Danylo Piliaiev	a66d9c815d	turnip: Add debug option to find usage of stale reg values MESA_VK_ABORT_ON_DEVICE_LOSS=1 \ TU_DEBUG_STALE_REGS_RANGE=0x00000c00,0x0000be01 \ TU_DEBUG_STALE_REGS_FLAGS=cmdbuf,renderpass \ ./app To pinpoint the reg causing a failure, the reg range can be narrowed by bisection. Some failures may be caused by a multi-reg combination; in such a case set the 'inverse' flag, which changes the meaning of the reg range to "do not stomp these regs". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21226>	2023-02-16 17:43:10 +00:00
Timur Kristóf	084d10a702	aco: Remove MTBUF zero operand. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363>	2023-02-16 17:16:34 +00:00
Timur Kristóf	afdacf4dcc	aco: Don't set scalar offset on buffer load instructions when it's zero. This helps generate slightly more optimal instructions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21363>	2023-02-16 17:16:34 +00:00
José Roberto de Souza	e050a00b9f	intel/common: Move i915 files to i915 folder Following the organization done in intel/dev and intel/vulkan. Probably due to some rebase issue we had a duplicated copyright header in intel_gem_i915.h that is being removed in here too. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21256>	2023-02-16 16:24:36 +00:00
Mike Blumenkrantz	41286f100e	vl/dri3: avoid deadlocking when polling deleted windows for events upcoming xserver releases will emit PresentConfigureNotify with this flag set when a window is destroyed, ensuring drivers don't poll infinitely and deadlock Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>	2023-02-16 15:55:47 +00:00
Mike Blumenkrantz	819cbf329a	vulkan/wsi: avoid deadlocking dri3 when polling deleted windows for events upcoming xserver releases will emit PresentConfigureNotify with this flag set when a window is destroyed, ensuring drivers don't poll infinitely and deadlock fixes #6685 cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>	2023-02-16 15:55:47 +00:00
Mike Blumenkrantz	91de576a7f	dri3: avoid deadlocking when polling deleted windows for events upcoming xserver releases will emit PresentConfigureNotify with this flag set when a window is destroyed, ensuring drivers don't poll infinitely and deadlock Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/116 cc: mesa-stable Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21339>	2023-02-16 15:55:47 +00:00
Timur Kristóf	4621ffdec1	aco: Get rid of redundant load_vmem_mubuf function. Call emit_load directly from visit_load_buffer instead. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00
Timur Kristóf	74f1b77046	radv: Move VS input lowering to new file: radv_nir_lower_vs_inputs. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00
Timur Kristóf	450e173de0	ac/llvm: Change ac_build_tbuffer_load to take format and channel type. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00
Timur Kristóf	0ae778ca59	ac/llvm: Fix ac_build_buffer_load to work with more than 4 channels. LLVM is unable to select instructions for num_channels > 4, so we work around that by manually splitting larger buffer loads. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00
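The splitting mentioned in `0ae778ca59` amounts to cutting a wide load into pieces of at most 4 channels each. A minimal sketch of that bookkeeping follows; it assumes a simple in-order split, and `struct load_chunk` and `split_buffer_load` are hypothetical names, not part of ac/llvm.

```c
#include <assert.h>
#include <stddef.h>

/* Largest channel count a single load may have in this sketch. */
#define MAX_CHANNELS_PER_LOAD 4u

/* Hypothetical description of one piece of a split load. */
struct load_chunk {
   unsigned first_channel; /* index of the first channel this piece covers */
   unsigned num_channels;  /* 1..4 channels */
};

/* Splits a load of num_channels channels into pieces of at most 4 channels,
 * in order. Writes the pieces to out (capacity max_out) and returns how
 * many were written. */
static unsigned
split_buffer_load(unsigned num_channels, struct load_chunk *out,
                  unsigned max_out)
{
   unsigned n = 0;

   assert(out != NULL);
   for (unsigned first = 0; first < num_channels;
        first += MAX_CHANNELS_PER_LOAD) {
      unsigned left = num_channels - first;

      assert(n < max_out);
      out[n].first_channel = first;
      out[n].num_channels =
         left < MAX_CHANNELS_PER_LOAD ? left : MAX_CHANNELS_PER_LOAD;
      n++;
   }
   return n;
}
```

A 7-channel load, for example, becomes one 4-channel piece starting at channel 0 and one 3-channel piece starting at channel 4; the caller would then issue one load per piece and concatenate the results.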
Timur Kristóf	a2755fc203	ac/llvm: Fix buffer_load_amd with larger than 32-bit channel sizes. LLVM is unable to select instructions for larger than 32-bit channel types. Workaround by using i32 and casting to the correct type later. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00
Timur Kristóf	b5b0ded4c1	ac/llvm: Remove "structurized" argument and instead check vindex. Change ac_build_buffer_load_common and ac_build_tbuffer_load so they use a structurized load when the vindex argument is not NULL. Adjust callers to match the new behaviour. This fixes the load_buffer_amd intrinsic with index source. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Qiang Yu <yuq825@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Konstantin Seurer <konstantin.seurer@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21358>	2023-02-16 15:29:37 +00:00


154293 Commits