Queries which use more than one MP counters was misconfigured and
computing the final result was also wrong because sources need to
be configured on different hardware counters instead.
According to the blob, computing the result is now as follows:
FOR i..n
val += ctr[i] * pow(2, i)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
On Fermi, we have one domain of 8 MP counters while we have
two domains of 4 MP counters on Kepler.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Sequence fields are located at MP[i] + 0x20 in the buffer object.
This is used to check if result is available for MP[i].
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Writing 0x408000 to 0x419e00 (like on Kepler) has no effect on Fermi
because we only have one domain of 8 counters. Instead, we have to
write 0x80000000.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Writing 0x1fcb to 0x419eac is definitely not related to MP counters and
has no effect on Fermi (although this enables MP counters on Kepler).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
The way we configure MP performance counters is going to pretty
different between Fermi and Kepler. Having two separate functions
is much better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Add new GALLIUM_HUD queries for:
num-shaders
num-resources
num-state-objects
num-validations
map-buffer-time
num-surface-views
num-resources-mapped
num-flushes
Most of this patch was originally written by Neha. Additional clean-ups
and num-flushes counter added by Brian Paul.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Silences 5 warnings of the type:
state_tracker/st_cb_program.c: In function 'st_new_program':
state_tracker/st_cb_program.c:108:7: warning: passing argument 1 of
'_mesa_init_gl_program' from incompatible pointer type [enabled by default]
return _mesa_init_gl_program(&prog->Base, target, id);
^
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
This reverts commit 0de5e0f3fb.
Michel Dänzer spotted two piglit regressions from the change. I suspect
that removing the FLUSH_VERTICES() actually exposed a bug elsewhere but
I don't have time to hunt down the root issue at this time.
has_shader_storage_buffer_objects() returns true also if the OpenGL
context is 4.30 or ES 3.1.
Previously, we were saying that all atomic*() GLSL builtin functions
for SSBOs were not available when OpenGL ES 3.1 context was in use.
Fixes 48 dEQP-GLES31 tests:
dEQP-GLES31.functional.ssbo.atomic.*
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Otherwise there are problems when user overrides version and application
such as Piglit wants to detect used api with glGetString(GL_VERSION).
This makes it currently impossible to run glslparsertest tests for
OpenGL ES when using version override.
Below is example when using MESA_GLES_VERSION_OVERRIDE=3.1.
Before:
"3.1 Mesa 11.1.0-devel (git-24a1a15)"
After:
"OpenGL ES 3.1 Mesa 11.1.0-devel (git-78042ff)"
v2: only include api prefix for OpenGL ES (Boyan Ding)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Before d31f98a272 and 56e2bdbca3 we had a sigle index space for UBOs
and SSBOs, so NumBufferInterfaceBlocks would contain the combined number of
blocks, not just one kind. This means that for shader programs using both
UBOs and SSBOs, we were setting num_ssbos and num_ubos to a larger number than
we should. Since the above commits we have separate index spaces for each
so we can just get the right numbers.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
(yes, we want PRI?64, but we want the x version rather than the u
version)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
nir_variable_create already inserts it in the right list for us so
inserting it again causes a linked list corruption.
Reviewed-by: Matt Turner <mattst88@gmail.com>
This has the better name to use. Aparently, sh->Name is usually 0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
For glBlendFunc and glBlendFuncSeparate(), the _UsesDualSrc flag
will be the same for all buffers, so no need to compute it N times.
Reviewed-by: Eric Anholt <eric@anholt.net>
A redundant call to glBlendFuncSeparateiARB() is more likely than getting
invalid values, so do the no-op check first.
Reviewed-by: Eric Anholt <eric@anholt.net>
Streamline the checking for no state change in _mesa_BlendFuncSeparate()
(and _mesa_BlendFunc()). If _BlendFuncPerBuffer is false, we only need
to check the 0th buffer state. Move argument validation after the no-op
check.
I'm looking at an app that issues about 1000 redundant glBlendFunc()
calls per frame!
Reviewed-by: Eric Anholt <eric@anholt.net>
We can skip to the end of _mesa_update_state_locked() if only the
_NEW_LINE flag is set since none of the derived state depends on it
(just like _NEW_CURRENT_ATTRIB). Note that we still call the
ctx->Driver.UpdateState() function, of course.
v2: use bitmask-based test, per Eric.
Reviewed-by: Eric Anholt <eric@anholt.net>
Changing the matrix mode alone has no effect on rendering and does
not need to trigger a flush or state validation.
Reviewed-by: Eric Anholt <eric@anholt.net>
Rather than accepting a void pointer, only to down and up cast around
it, convert the function to take the base (struct gl_program) pointer.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
V3: use a check_*_allowed style function for requirements checking
rather than has_* which doesn't encapsulate the error message
V2: add missing 's' to the extension name in error messages
and add decimal place in version string
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
This adds support for setting up the UniformBlock structures for AoA
and also adds support for resizing AoA blocks with a packed layout.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Add support for setting the max access of an unsized member
of an interface array of arrays.
For example ifc[j][k].foo[i] where foo is unsized.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>