st/mesa now exposes KHR_blend_equation_advanced_coherent and
EXT_shader_framebuffer_fetch if the new capability is supported.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
TGSI's FBFETCH instruction currently only supports reading from a single
render target, but NIR intrinsics can support multiple render targets.
radeonsi can only support fetching from RT 0, but other drivers may be
able to support fetching from any render target.
To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply
PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?"
to an integer number of render targets which can be fetched.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Corresponding to GL_ARB_fragment_shader_interlock and
GL_NV_fragment_shader_interlock. Currently, only the NIR paths
support this functionality, but someone could conceivably add it
to TGSI too.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The _LEVELS assumes that the max is always power of two. For V3D 4.2, we
can support up to 7680 non-power-of-two MSAA textures, which will let X11
support dual 4k displays on newer hardware.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
this can be used by drivers which support the extension to indicate support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This patch adds cropping flags for H264 in pipe_h264_enc_pic_control.
Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
PIPE_CAP_TGSI_SKIP_SHRINK_IO_ARRAYS is added to indicate whether the TGSI
pass to shrink IO arrays should be skipped to enforce the originally declared array
sizes and locations instead.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
To enable NV_compute_shader_derivatives, which allows derivatives (and
texture lookups with implicit derivatives) in compute shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We need indexed queries to retrieve the geom shader info.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Adds a new cap to allow drivers to expose higher shading language
versions in GLES contexts, to avoid having to report an artificially
low version for the benefit of GL contexts.
The motivation is to expose EXT_gpu_shader5 even though a driver may
not support all the features needed for the corresponding GL extension
(ARB_gpu_shader5).
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The glMemoryBarrier() function makes shader memory stores ordered with
respect to things specified by the given bits. Until now, st/mesa has
ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT,
saying that drivers should implicitly perform the needed flushing.
This seems like a pretty big assumption to make. Instead, this commit
opts to translate them to new PIPE_BARRIER bits, and adjusts existing
drivers to continue ignoring them (preserving the current behavior).
The i965 driver performs actions on these memory barriers. Shader
memory stores go through a "data cache" which is separate from the
render cache and other read caches (like the texture cache). All
memory barriers need to flush the data cache (to ensure shader memory
stores are visible), and possibly invalidate read caches (to ensure
stale data is no longer visible). The driver implicitly flushes for
most caches, but not for data cache, since ARB_shader_image_load_store
introduced MemoryBarrier() precisely to order these explicitly.
I would like to follow i965's approach in iris, flushing the data cache
on any MemoryBarrier() call, so I need st/mesa to actually call the
pipe->memory_barrier() callback.
Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate
and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on
the iris driver.
Roland said this looks reasonable to him.
Reviewed-by: Eric Anholt <eric@anholt.net>
The OpenMAX state tracker will use this.
RadeonSI is adapted to use pipe_grid_info::last_block instead of its
internal state.
Acked-by: Leo Liu <leo.liu@amd.com>
Some NVIDIA hardware can accept 128 fragment shader input components,
but only have up to 124 varying-interpolated input components. We add a
new cap to express this cleanly. For most drivers, this will have the
same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.
Fixes KHR-GL45.limits.max_fragment_input_components
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: rebased, improved docs/commit message]
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Iris would like to use compact arrays for tesslevels and clip/cull
distances. radeonsi will likely want to switch to these at some point,
since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready
for them just yet.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Add a new cap that indicates whether the drivers supports
enabling/disabling the conversion from linear space to sRGB
for a framebuffer attachment. In Driver terms that this CAP indicates
whether the driver can switcht between a linear and and a sRGB surface
format for draw destinations witout changing the sourface itself.
v2: rename CAP to DEST_SURFACE_SRGB_CONTROL to reflect its
purpouse better (pointed out by Ilia Mirkin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This fixes the drisw paths to use the new shm2 interface, so that
we don't trigger the X server overflow checks when the x offset is non-zero.
This just hides the versioning in drisw, and either passes the src_x
or adds the offset fixup for the fallback path.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
In the Intel backend, it makes the most sense to treat gl_TessLevelInner
and gl_TessLevelOuter as ordinary shader inputs. For Radeon, it makes
more sense to treat them as system values which get special handling.
We already have a compiler option for this, but the Iris driver will
need a capability bit so we can set it appropriately.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one. Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to. Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.
I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state. Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM. The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel. One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases. All
doable, but it ends up being more code than I'd like.
st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values. This was originally patterned after the D3D1x API. Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.
Today, st/mesa simply queries all 11 values, and returns a single value.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit. A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total 121 counter reads. That's not ideal.
This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter. Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.
We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.
Thanks to Roland, Ilia, and Marek for helping me sort this out.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gallium handles pipeline statistics queries as a single query
(PIPE_QUERY_PIPELINE_STATISTICS) which returns a struct with 11 values.
Sometimes it's useful to refer to each of those values individually,
rather than as a group. To avoid hardcoding numbers, we define a new
enum for each value. Here, the name and enum value correspond to the
index in the struct pipe_query_data_pipeline_statistics result.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
ATOMFADD is a little special -- make drivers have to specify it
explicitly.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is supported by at least NVIDIA hardware, and exposeable via GL
extensions.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This new pipe cap and the new nr_samples field in pipe_surface lets a
state tracker bind a render target with a different sample count than
the resource. This allows for implementing
EXT_multisampled_render_to_texture and
EXT_multisampled_render_to_texture2.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.
This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.
I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.
As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).
Cc: Leo Liu <leo.liu@amd.com>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This format is needed to support EXT_texture_sRGB_R8. THe patch adds a new
format enum, the format entries in Gallium and and svga, the mapping between
sRGB and linear formats, and tests.
v2: - add mapping to linear format for PIPE_FORMATR_R8_SRGB
v3: - Add texture format to svga format table since otherwise building
mesa will fail when this driver is enabled. It was not tested
whether the extension actually works.
v4: - svga: remove the SVGA specific format definitions and table entries
and only add correct the location of PIPE_FORMAT_R8_SRGB in the
format_conversion_table (Ilia Mirkin)
- Split patch (1/2) to separate Gallium part and mesa/st part.
(Roland Scheidegger)
- Trim the commit message to only contain the relevant parts from the
split.
v5: - svga: correct location of PIPE_FORMAT_SRGB_R8 (Ilia Mirkin)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
They're not required to be the same as the access flag on the image
unit. For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.
v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
- Reduce both access and shader_access to uint16_t to avoid making
the pipe_image_view structure larger.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This patch fixes the following Piglit test:
spec@egl_mesa_configless_context@basic
It also fixes few test in a virgl guest.
v2: Evaluate the value of no_config (Ilia)
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Introduce a new capability for the maximum value of
pipe_vertex_element::src_offset. Initially just every driver
backend returns the value previously set from _mesa_init_constants.
So this shall end up in no functional change.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
State trackers will not use the new param directly, but will instead use
a helper in MakeCurrent that does the right thing.
v2: rework the interface
Reviewed-by: Brian Paul <brianp@vmware.com>