The entire VUE map is computed based on the slots_valid bitfield;
calling brw_compute_vue_map on the same bitfield will return the
same result. So we can simply compare those.
struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is
much cheaper and should work just as well.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
These were only for legacy userclipping, which we no longer support
in geometry shaders.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
According to the GLSL 1.50 specification, page 76:
"The shader must also set all values in gl_ClipDistance that have been
enabled via the OpenGL API, or results are undefined."
With this patch, we only enable clip distance writes when the shader
actually writes them. We no longer force a value to be written when
clip planes are enabled in the API. This could mean the first varying
slot would be used as clip distances - I believe it should be the safe
kind of undefined behavior.
Empirically, it doesn't seem to cause a problem.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
The legacy userclip fields are only used for the vertex shader, and at
that point there's only program_string_id and the tex struct, which are
common to all keys. So there's no need for a "VUE" key base class.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This avoids a downcast of key, which won't exist in the base class soon.
I'm not a huge fan of this patch, but given that we're currently using
inheritance, this seems like the "right" way to do it. The alternative
is to make key a void pointer in the parent class and continue
downcasting.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
I'm about to remove the base class for VS/GS/HS/DS program keys, at
which point we won't be able to use key->tex anymore. Instead, we'll
need to store a direct pointer (like we do in the FS backend).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This is now only used for the vertex shader, so it makes sense to get it
out of any paths run by the geometry shader.
Instead of passing the gl_clip_plane array into the run() method (which
is shared among all subclasses), we add it as a vec4_vs_visitor
constructor parameter. This eliminates the bogus NULL parameter in the
GS case.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
There are two uses of this flag.
The primary use is checking whether we need to emit code to convert
legacy gl_ClipVertex/gl_Position clipping to clip distances. In this
case, we also have to upload the clip planes as uniforms, which means
setting nr_userclip_plane_consts to a positive value. Checking if it's
> 0 works for detecting this case.
Gen4-5 also wants to know whether we're doing clipping at all, so it can
emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0]
is set to a real register suffices for this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
We only support geometry shaders in core profiles, where gl_ClipVertex
doesn't exist. Presumably the even older behavior of clipping to
gl_Position isn't supported either. In fact, GLSL 1.50 page 76 claims:
"The shader must also set all values in gl_ClipDistance that have been
enabled via the OpenGL API, or results are undefined."
So we don't need to handle legacy clipping in geometry shaders. I think
Paul added this back when we were considering supporting the old
GL_ARB_geometry_shader4 extension.
This removes a non-orthagonal state dependency on GS compilation.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages. It's kind of
awkward given that we use it for all stages.
This avoids having to include brw_fs.h in geometry shader code in order
to access this function.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Patch modifies existing shader source and replace functionality to work
with environment variables rather than enable dumping on compile time.
Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid
collisions.
Functionality is controlled via two environment variables:
MESA_SHADER_DUMP_PATH - path where shader sources are dumped
MESA_SHADER_READ_PATH - path where replacement shaders are read
v2: cleanups, add strerror if fopen fails, put all functionality
inside HAVE_SHA1 since sha1 is required
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
commit 472ef9a02f introduced code to
change the types of SEL and MOV instructions for moves that simply
"copy bits around". It didn't account for type conversion moves,
however. So it would happily turn this:
mov(8) vgrf6:D, -vgrf5:D
mov(8) vgrf7:F, vgrf6:UD
into this:
mov(8) vgrf6:D, -vgrf5:D
mov(8) vgrf7:D, -vgrf5:D
which erroneously drops the conversion to float.
Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
As far as I can tell, the behavior is preserved from the previous generations.
Before we set a single bit to tell the FS whether or not we'll be using an input
coverage mask. Now we have some options which are implementing various
extensions. These bits are used for the various conservative rasterization
mechanisms (for collision detection, binning, and whatever else).
I believe that the behavior is preserved because the problem which conservative
rasterization is attempting to fix would go away with the "NORMAL" mode (at the
cost of performance, I believe).
This patch serves as documentation of the change by creating the enums, as well
as giving some of the history with the links here so that the next person who
comes along and looks at it doesn't spend as long as I had to in order to
determine if there is an issue or not.
Previously, this algorithm had been done in software, and this can still be used
as long as we don't export an extension stating otherwise.
References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt
References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This must be done before exporting a buffer as dmabuf fds, because
we lose track of who is using it and can't trust the reference counter.
Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This is probably the most called util function. It does almost nothing,
yet it can consume 10% of the CPU on the profile. This drops it down to 5%.
Reviewed-by: Brian Paul <brianp@vmware.com>
There doesn't seem any reason to start from 4.
Start from 1 instead (0 is left reserved to catch uninitialized atoms).
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Similarly to scissor states, we can use single atom to track all viewport
states. This will allow to simplify dirty atom handling later.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
During review of the "r600g: make all scissor states use single atom" patch
Marek Olšák noticed that scissor disable workaround should be applied on
all scissor states and not just first one, so let's do so.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
As suggested by Marek Olšák, we can use single atom to track all scissor
states. This will allow to simplify dirty atom handling later.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
It's legal to call glTexSubImage with zero values for the width,
height or depth. Previously this was breaking the PBO access
validation because it tries to work out the last pixel accessed by
getting the pixel at height-1 and depth-1 which would end up with
bogus values.
This was causing GL errors to be generated during the Piglit
texsubimage test, although the test was passing anyway.
v2: Also check for width == 0. Don't validate the start pointer if any
of the dimensions are zero.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.
So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes. Instead, we need to look at the
enabled slots.
This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots). We then
count normal attributes once, and count the double-size attributes
a second time.
Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a.
No Piglit changes on llvmpipe (which actually supports dvecs).
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Previously we would allow glUniformMatrix4fv on a dmat4 and
glUniformMatrix4dv on a mat4. Both are illegal. That later also
overwrites the storage for the mat4 and causes bad things to happen.
Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
All of the other state upload functions are static because the only use
is in the brw_tracked_state structure.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
All of the other state upload functions are static because the only use
is in the brw_tracked_state structure.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Because the compiler already has enough things to complain about.
grep -rl 'const static' src/ | while read f
do
sed --in-place -e 's/const static/static const/g' $f
done
brw_eu_emit.c: In function 'brw_reg_type_to_hw_type':
brw_eu_emit.c:98:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
const static int imm_hw_types[] = {
^
brw_eu_emit.c:120:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
const static int hw_types[] = {
^
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
brw_upload_cs_push_constants was based on gen6_upload_push_constants.
v2:
* Add FINISHME comments about more efficient ways to push uniforms
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Check for a valid framebuffer cbuf pointer before accessing its
associated surface.
Fix piglit test fbo-drawbuffers-none.
Reviewed-by: Brian Paul <brianp@vmware.com>
Commit b9ba8492 removes an unneeded pipe_surface_release() from
st_render_texture(). This implies a surface can now be reused for a
render buffer. Currently, when we render to a texture, we mark the
surface as dirty. But in svga_mark_surface_dirty(), if the surface
is already marked as dirty, it does not increment the texture age.
Any view to this texture might not be updated properly then.
With this patch, the texture age is incremented regardless of whether
the surface is already marked as dirty or not.
Fix bug 1499181.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Commit b9ba8492 removes an unneeded pipe_surface_release() from
st_render_texture() and exposes a bug in the backed surface view
creation. Currently a backed surface view for a conflicted surface view
is created at framebuffer emit time. But if shader sampler views are changed
but framebuffer surface views remain unchanged, emit_framebuffer() will not
be called and conflicted surface views will not be detected.
To fix this, also check for conflicted surface views when setting sampler
views. If there is any conflicted surface views, enable the
framebuffer dirty bit so that the framebuffer emit code has a chance to
create a backed surface view for the conflicted surface view.
Fix cinebench-r11-test regression.
Reviewed-by: Brian Paul <brianp@vmware.com>
The lowered code reads from the destination, which isn't possible from
message registers.
Fixes the following dEQP tests on SNB:
dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment
dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
This is a squash commit of roughly two years of development work.
Authors include:
Brian Paul
Charmaine Lee
Thomas Hellstrom
Jakob Bornecrantz
Sinclair Yeh
Mingcheng Chen
Kai Ninomiya
MengLin Wu
The driver supports OpenGL 3.3.
Signed-off-by: Brian Paul <brianp@vmware.com>