In turnip we were using this a lot with the dynamic state enum, and
we're running out of space there because we need to add more and
more dynamic states that don't correspond to draw states. Make it
64-bit-safe so we don't need to rewrite everything in turnip. In the
case where the thing being operated on is 32-bit the compiler can
usually optimize it away, as can be seen with the release build size
before and after:
before:
    text    data     bss      dec     hex filename
 5404913  293592   22744  5721249  574ca1 /home/cwabbott/build/mesa-release/lib64/libvulkan_freedreno.so
13981320  498550  205000 14684870  e012c6 /home/cwabbott/build/mesa-release/lib64/dri/msm_dri.so
after:
    text    data     bss      dec     hex filename
 5404969  293592   22744  5721305  574cd9 /home/cwabbott/build/mesa-release/lib64/libvulkan_freedreno.so
13981320  498550  205000 14684870  e012c6 /home/cwabbott/build/mesa-release/lib64/dri/msm_dri.so
In the end the only change is an additional ~50 bytes of text in
turnip.
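For illustration only, a minimal sketch of what "64-bit-safe" means for
a bit helper. The names below are hypothetical and are not the helper
touched by this change; the point is just that the shift has to be done
on a 64-bit constant once the mask can be wider than 32 bits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the actual Mesa macro: test bit `b` of a mask
 * that may be wider than 32 bits.
 */
static inline bool
example_bit_is_set(uint64_t mask, unsigned b)
{
   assert(b < 64);
   /* UINT64_C(1) keeps the shift in 64 bits. Writing `1u << b` here would
    * be undefined behaviour for b >= 32.
    */
   return (mask & (UINT64_C(1) << b)) != 0;
}

/* Hypothetical helper: count the set bits by clearing the lowest set bit
 * on every iteration. Works unchanged for 32-bit inputs, which are simply
 * zero-extended.
 */
static inline unsigned
example_count_bits(uint64_t mask)
{
   unsigned count = 0;
   while (mask) {
      mask &= mask - 1;
      count++;
   }
   return count;
}
```

When the argument is known to be 32 bits wide, the compiler can usually
narrow these operations again, which matches the nearly unchanged
sizes above.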
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
This avoids a dependency on the sample count in the blend state, and
seems to work. Otherwise, we'd need to make blend dynamic if samples is
dynamic and record whether the sample mask was NULL, which is a lot more
complicated.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
Most of the time we will only be updating either the number of samples
or whether it should be disabled, not both, and we don't need to compare
both. With pipelines we were comparing both, but with dynamic
rasterization samples we want to only update disable when binding the
pipeline and only update samples when calling
vkCmdSetRasterizationSamplesEXT(). Stop optimizing the uncommon case
where both are changed when binding a pipeline, and split it into 2
parts while sharing the common part that records and emits the state
packet.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
From the spec language, it seems like this change wasn't strictly
required and is just an optimization for when minSampleShading would
be small enough to allow one sample per pixel. However,
rasterizationSamples may soon be dynamic, and I don't think we
should keep this around.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
This is a little tricky because now we always have to store the
translated logicOp in the pipeline, regardless of whether it's enabled
or not, because the enable/disable may now be dynamic even if the
logicOp is not.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
Emit GRAS_SU_CNTL, GRAS_CL_CNTL, the polygon mode, and the VRS registers
in one draw state. We're running out of draw states, and this saves a
draw state while preparing us for the rest of the rasterization state to
be dynamic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
At least in the current Vulkan spec there is no validation language
saying that it isn't valid to set this state if stippled lines aren't
supported, so it seems we have to just ignore it. Ignore it if the user
specifies a dynamic line stipple state and don't emit warnings if they
call CmdSetLineStippleEXT because zink will do this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
Vulkan allows the user to set extraneous dynamic state which then gets
ignored if a pipeline with static state is bound. We weren't
implementing this correctly for viewports because we weren't clearing
the dirty bit, but it was happening to work until changes for dynamic
depth negative-one-to-one broke
dEQP-VK.pipeline.*.depth.depth_clip_control.d32_sfloat_less_viewport_before_static.
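A rough sketch of the idea, assuming a single 64-bit dirty mask and
hypothetical names (this is not the turnip code): dynamic setters mark
state dirty, and binding a pipeline clears the dirty bits of every
state that the pipeline provides statically, so stale dynamic values
are not emitted over it at draw time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dirty bits, one per piece of dynamic state. */
#define EXAMPLE_DIRTY_VIEWPORT (UINT64_C(1) << 0)
#define EXAMPLE_DIRTY_SCISSOR  (UINT64_C(1) << 1)

struct example_cmd_state {
   uint64_t dirty; /* dynamic state that still has to be emitted */
};

/* Called from a vkCmdSet*-style entrypoint. */
static void
example_cmd_set_dynamic(struct example_cmd_state *cmd, uint64_t state_bit)
{
   assert(cmd != NULL);
   cmd->dirty |= state_bit;
}

/* Called when binding a pipeline. `static_mask` is the set of states the
 * pipeline bakes in; whatever the user set dynamically for those states
 * must be ignored, so their dirty bits are dropped.
 */
static void
example_bind_pipeline(struct example_cmd_state *cmd, uint64_t static_mask)
{
   assert(cmd != NULL);
   cmd->dirty &= ~static_mask;
}
```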
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18912>
Again sharing the same function across all Intel drivers.
There are still two additional DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM
calls, one in intel/dev and the other in perf.
The first one can't call intel_gem_get_context_param() because of the
build order of libs and the second one because it sets the size
parameter.
Will revisit those calls in the future, but this is already an improvement.
v2:
- using intel_gem_get_context_param() for the recently added query for
I915_CONTEXT_PARAM_PROTECTED_CONTENT
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>
This function was returning an int, but no meaningful errno code was
being returned. In addition, context_id is a uint32_t, which would be
a problem if i915 ever returned an id of 2147483648 or above, since
such a value does not fit in a non-negative int.
So change the return type and add a context_id pointer parameter.
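A sketch of the resulting calling convention, assuming the new return
type is a bool. The function name and the `kernel_ok`/`kernel_id`
parameters are hypothetical stand-ins for the ioctl result, not the
real helper:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the context creation helper. Success or
 * failure is reported through the return value, and the id travels
 * through the out parameter, so the full uint32_t range stays valid.
 */
static bool
example_create_context(bool kernel_ok, uint32_t kernel_id,
                       uint32_t *context_id)
{
   assert(context_id != NULL);

   if (!kernel_ok)
      return false; /* *context_id is left untouched on failure */

   *context_id = kernel_id;
   return true;
}
```

With this shape an id such as 2147483648 is returned intact, and callers
check the boolean instead of testing for a negative value.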
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18974>
Color and TS buffers are allocated separately for each etnaviv resource, so
getting the same base and TS buffer at import time is unexpected and a strong
hint at the application doing something wrong, like passing in the same GEM
handle for all planes on a GBM import. Print a warning to give the user some
feedback.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9780>
This adds support for sharing the TS buffer, which up until now has been
an internal implementation detail, with the outside world. This mainly
improves performance when a GPU compositor is present, but on i.MX8M
direct-to-display use cases benefit as well.
The impact of this change depends on the GPU generation:
- old GPUs with a single pipe won't see any difference
- GC2000 can skip the TS resolve in the client and will benefit from a
more efficient blit into the sampler compatible format when the client
buffer contains cleared tiles
- GC3000 can directly sample with TS support, so saves both write and read
memory bandwidth when the client buffer contains cleared tiles
- GC7000 with compression support can keep the client buffer in compressed
format, thus saving both read and write bandwidth even for fully filled
client buffers
- GC7000 coupled to a display unit supporting the compression format (DCSS
on i.MX8M) does not even need to uncompress the render buffer for display
so will see significant bandwidth saving even when GPU compositing is
bypassed
There is a slight complication in that the tile clear color isn't part of
the TS buffer, but is programmed into state registers in the GPU. To handle
this, externally shared TS buffers now contain a software metadata area,
where the clear color is stored by the driver, so the receiving end of the
TS buffer can retrieve the clear color from this area.
The compression format is handled in the same way by storing it in the SW
meta area. While we can derive the compression format from the color buffer
format in most cases, some users, like weston, expect that they can "upgrade"
ARGB to XRGB color formats. While this works with plain color formats, as
it's just masking a channel, the compression format differs when alpha is in
use. Receivers of the TS buffer should thus not try to infer the compression
format from the color buffer format, but instead fetch it from the SW meta.
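As a sketch of the receiving side, assuming a hypothetical metadata
layout (the field names, sizes and ordering below are illustrative and
are not the layout used by the driver): the importer copies both the
clear color and the compression format out of the SW meta area rather
than deriving them from the color buffer format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SW meta area stored alongside a shared TS buffer. */
struct example_ts_sw_meta {
   uint32_t comp_format; /* compression format chosen by the exporter */
   uint32_t reserved;
   uint64_t clear_value; /* tile clear color last set by the exporter */
};

/* Hypothetical importer-side TS state. */
struct example_ts_import {
   uint32_t comp_format;
   uint64_t clear_value;
};

/* Take both values from the meta area. The color buffer format is not
 * consulted on purpose: it may have been "upgraded" (e.g. ARGB to XRGB)
 * by the importing application.
 */
static void
example_import_ts(const struct example_ts_sw_meta *meta,
                  struct example_ts_import *out)
{
   assert(meta != NULL && out != NULL);
   out->comp_format = meta->comp_format;
   out->clear_value = meta->clear_value;
}
```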
The import/export handling of the TS buffer is modelled after the Intel iris
driver: we add a separate plane for the TS buffer and fold it into the base
resource after the import.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9780>
Unknown modifiers are currently squashed down to linear when transforming
the modifier into our internal layout representation. However, the only real
modifier that we expect to see, which isn't Vivante specific or LINEAR, is
the INVALID modifier. Treat this modifier as linear and reject any other
unexpected modifiers.
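A minimal sketch of the intended behaviour, with hypothetical function
and enum names. The modifier values are copied in so the sketch is
self-contained; the authoritative definitions are in drm_fourcc.h, and
only two of the Vivante modifiers are shown:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the drm_fourcc.h values used by this sketch. */
#define EXAMPLE_MOD_LINEAR              UINT64_C(0)
#define EXAMPLE_MOD_INVALID             UINT64_C(0x00ffffffffffffff)
#define EXAMPLE_MOD_VIVANTE(val)        ((UINT64_C(0x06) << 56) | (val))
#define EXAMPLE_MOD_VIVANTE_TILED       EXAMPLE_MOD_VIVANTE(1)
#define EXAMPLE_MOD_VIVANTE_SUPER_TILED EXAMPLE_MOD_VIVANTE(2)

/* Hypothetical internal layout representation. */
enum example_layout {
   EXAMPLE_LAYOUT_LINEAR,
   EXAMPLE_LAYOUT_TILED,
   EXAMPLE_LAYOUT_SUPER_TILED,
};

/* Returns false for modifiers we do not expect, instead of silently
 * squashing them down to linear.
 */
static bool
example_modifier_to_layout(uint64_t modifier, enum example_layout *layout)
{
   assert(layout != NULL);

   switch (modifier) {
   case EXAMPLE_MOD_LINEAR:
   case EXAMPLE_MOD_INVALID: /* no explicit modifier: treat as linear */
      *layout = EXAMPLE_LAYOUT_LINEAR;
      return true;
   case EXAMPLE_MOD_VIVANTE_TILED:
      *layout = EXAMPLE_LAYOUT_TILED;
      return true;
   case EXAMPLE_MOD_VIVANTE_SUPER_TILED:
      *layout = EXAMPLE_LAYOUT_SUPER_TILED;
      return true;
   default:
      return false; /* unexpected modifier: reject the import */
   }
}
```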
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9780>
Some display engines are able to resolve fast clear and/or compression
on the fly and need access to the TS buffer to do so. As they might
have restrictions on which memory they can access, allocate the TS
buffer memory from the KMS side when the resource should be SCANOUT
capable.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9780>