The variable 'i' is a value in [0, MAT_ATTRIB_MAX-1] so subtracting
VERT_ATTRIB_GENERIC0 gave a bogus value and we executed the default
switch clause for all loop iterations.
This doesn't fix any known issues but was clearly incorrect.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This is the result of applying several rules:
From OpenGL 4.3 spec, section 7.6.2.2 "Standard Uniform Block Layout":
"2. If the member is a two- or four-component vector with components
consuming N basic machine units, the base alignment is 2N or 4N,
respectively."
[...]
"4. If the member is an array of scalars or vectors, the base alignment
and array stride are set to match the base alignment of a single array
element, according to rules (1), (2), and (3), and rounded up to the
base alignment of a vec4."
[...]
"7. If the member is a row-major matrix with C columns and R rows, the
matrix is stored identically to an array of R row vectors with C
components each, according to rule (4)."
[...]
"When using the std430 storage layout, shader storage blocks will be
laid out in buffer storage identically to uniform and shader storage
blocks using the std140 layout, except that the base alignment and
stride of arrays of scalars and vectors in rule 4 and of structures in
rule 9 are not rounded up a multiple of the base alignment of a vec4."
In summary: vec2 has a base alignment of 2*N, a row-major mat2xY is
stored like an array of Y row vectors with 2 components each. Because
of std430 storage layout, the base alignment of the array of vectors
is not rounded up to vec4, so it is still 2*N.
Fixes 15 dEQP tests:
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_lowp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_mediump_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_type.std430.row_major_highp_mat2x4
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.single_basic_array.std430.row_major_mat2x4
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x3
dEQP-GLES31.functional.ssbo.layout.instance_array_basic_type.std430.row_major_mat2x4
v2:
- Add spec quote in both commit log and code (Timothy)
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Previously, ATTR was indexed by VERT_ATTRIB_* slots; at the end of
compilation, assign_vs_urb_setup() translated those into GRF units,
and converted ATTR to HW_REGs.
This patch moves the transslation earlier, making ATTR work in terms of
GRF units from the beginning. assign_vs_urb_setup() simply has to add
the number of payload registers and push constants to obtain the final
hardware GRF number. (We can't do this earlier as those values aren't
known.)
ATTR still supports reg_offset; however, it's simply added to reg.
It's not clear whether this is valuable or not.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The act of ensuring that there is space can cause a flush to happen,
which will emit the current screen fence. If that is the fence we're
trying to wait on, then it will have been emitted as a result of doing
the PUSH_SPACE. Don't attempt to emit it a second time.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Fixes: 8053c9208f (nouveau: avoid emitting new fences unnecessarily)
Cc: mesa-stable@lists.freedesktop.org
Fixes:
ES3-CTS.shaders.negative.constant_sequence
spec/glsl-es-3.00/compiler/global-initializer/from-sequence.vert
spec/glsl-es-3.00/compiler/global-initializer/from-sequence.frag
v2: Fix a couple copy-and-paste mistake in the spec quotations.
Suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
This will be used in the next patch to enforce some language sematics.
v2: Fix inverted logic in
ast_function_expression::has_sequence_subexpression. The method
originally had a different name and a different meaning. I fixed the
logic in ast_to_hir.cpp, but I only changed the names in
ast_function.cpp.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
v2: Combine this check with the existing const and uniform checks. This
change depends on the previous patch (glsl: Only set
ir_variable::constant_value for const-decorated variables).
Fixes:
ES2-CTS.shaders.negative.initialize
ES3-CTS.shaders.negative.initialize
spec/glsl-es-1.00/compiler/global-initializer/from-attribute.vert
spec/glsl-es-1.00/compiler/global-initializer/from-uniform.vert
spec/glsl-es-1.00/compiler/global-initializer/from-uniform.frag
spec/glsl-es-1.00/compiler/global-initializer/from-global.vert
spec/glsl-es-1.00/compiler/global-initializer/from-global.frag
spec/glsl-es-1.00/compiler/global-initializer/from-varying.frag
spec/glsl-es-3.00/compiler/global-initializer/from-uniform.vert
spec/glsl-es-3.00/compiler/global-initializer/from-uniform.frag
spec/glsl-es-3.00/compiler/global-initializer/from-in.vert
spec/glsl-es-3.00/compiler/global-initializer/from-in.frag
spec/glsl-es-3.00/compiler/global-initializer/from-global.vert
spec/glsl-es-3.00/compiler/global-initializer/from-global.frag
Note: spec/glsl-es-3.00/compiler/global-initializer/from-sequence.*
still fail because the result of a sequence operator is still considered
to be a constant expression.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92304
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> [v1]
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Right now we're also setting for uniforms, and that doesn't seem to hurt
things. The next patch will make general global variables in GLSL ES,
and those definitely should not have constant_value set!
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
This is the way layout(binding=xxx) works from GLSL. The old method
just happened to work (and significantly predated support for
layout(binding=xxx)), but future changes will break this.
v2: Remove some stale comments. Suggested by Matt and Chris Forbes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
In d4a24745 (August 2012), Paul made functions calls not be constant
expressions in GLSL ES 1.00. Since this feature was added in desktop
GLSL 1.20, we believed that it was added in GLSL ES 3.00. That turns
out to be completely wrong. Built-in functions have always been allowed
as constant expressions in GLSL ES, and the patch adds the (many) spec
quotations to prove it.
While we never previously encountered this, a later patch enforces a GLSL
ES 1.00 rule that global variable initializers must be constant
expressions. Without this fix, several dEQP tests fail.
Fixes:
tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.frag
tests/spec/glsl-es-1.00/compiler/const-initializer/from-function.vert
tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.frag
tests/spec/glsl-es-1.00/compiler/const-initializer/from-sequence-in-function.vert
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.0 10.1 10.2 10.3 10.4 10.5 10.6 11.0" <mesa-stable@lists.freedesktop.org>
Yes, I know we don't maintain stable branches that far back, but that
*is* how far back this bug goes!
Vertex attributes of different categories (constant/per-instance/
per-vertex) go into different buffers for translation, and this is now
properly reflected in the vertex buffers passed to the driver.
Fixes e.g. piglit's point-vertex-id divisor test.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Also fix style / wrong indentation along the way and make the messages
more uniform.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
GLSL Spec 4.20.8, 4.3 Storage Qualifiers:
"Initializers in global declarations may only be used in declarations of
global variables with no storage qualifier, with a const qualifier or
with a uniform qualifier."
We do this for input variables, but not for output variables. AMD and NVIDIA
proprietary drivers don't allow this either.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
For the VS and FS stages that use ARB_vertex_program or
ARB_fragment_program we don't have a shader program, however,
when debuging is enabled, we call brw_dump_ir like this:
brw_dump_ir("vertex", prog, &vs->base, &vp->program.Base);
where vs will be NULL (since prog is NULL).
As pointed out by Chris, this &vs->base is not really a dereference,
it simply computes a new address that just happens to be 0x0 because
the offset of base in brw_shader is 0. Then brw_dump_ir will see a
NULL pointer and not do anything. This is why this does not crash at
the moment. However, this does not look very safe (it would crash
for any location of base that is not the first in brw_shader), so
patch it to prevent a potential (even if unlikely) problem in the
future.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
The initial glGetUniformdv support didn't cover all the
casting cases that are apparantly legal, and cts seems to
test for them.
I've updated the piglit test to cover these cases now.
v2: fix indentation - it's all broken in this file (Ilia)
fix src/dst index tracking in light of fp64 support (Ilia)
cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Curro added this in commit 3ee2daf23d (before the vec4/NIR backend was
added) but it was missed in the new NIR backend. Add it there as well.
instructions in affected programs: 1857 -> 1810 (-2.53%)
helped: 15
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
We still have to push everything out, might as well kick earlier and
flip pushbufs when we know we'll need it. This resolves some issues with
the new policy of making sure that we always leave a bit of room at the
end for fences.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
Right now we emit on every kick, but this is only necessary if something
will ever be able to observe that the fence completed. If there are no
refs, leave the fence alone and emit it another day.
This also happens to work around an issue for the kick handler -- a kick
can be a result of e.g. nouveau_bo_wait or explicit kick, or it can be
due to lack of space in the pushbuf. We want the emit to happen in the
current batch, so we want there to always be enough space. However an
explicit kick could take the reserved space for the implicitly-triggered
kick's fence emission if it happened right after. With the new mechanism,
hopefully there's no way to cause two fences to be emitted into the same
reserved space.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: 47d11990b (nouveau: make sure there's always room to emit a fence)
Cc: mesa-stable@lists.freedesktop.org
In theory, GF110+ should also support NVC8_COMPUTE_CLASS but, in practice,
a ILLEGAL_CLASS dmesg fail appears when using it.
This fixes compute support and MP performance counters on GF110.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
For scalar VS, I'll need this in brw_fs.cpp as well. It seems silly to
redeclare it in three places.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Previously, we used nir_lower_io with the scalar type_size function,
which mapped VERT_ATTRIB_* locations to...some numbers. Then, in
fs_visitor::nir_setup_inputs(), we created temporaries indexed by
those numbers, and emitted MOVs from the actual ATTR registers to
those temporaries. Virtually all of these were copy propagated away,
but it's still ugly.
This patch reworks our input lowering to produce NIR lower_input
intrinsics that properly index into the ATTR file, so we can access
it directly.
No changes in shader-db.
v2: Fix unreachable() message (Ken), update commit message (Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
nr_attributes is used to compute first_non_payload_grf, which is the
first register we're allowed to use for ordinary register allocation.
The hardware requires us to read at least one pair of values, but we're
completely free to overwrite that garbage register with whatever we like.
Instead of altering nr_attributes, we should alter urb_read_length, which
only affects the amount we ask the VF to read. This should save us a
register in trivial cases (which admittedly isn't very useful).
While we're at it, improve the explanation in the comments.
v2: Actually do what I said (caught by Ilia).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Both the vec4 and scalar VS backends had virtually identical URB entry
size and read length calculations. We can move those up a level to
backend-agnostic code and reuse it for both.
Unfortunately, the backends need to know nr_attributes to compute
first_non_payload_grf, so I had to store that in prog_data. We could
use urb_read_length, but that's nr_attributes rounded up to a multiple
of two, so doing so would waste a register in some cases.
There's more code to be removed in the vec4 backend, but that will
come in a follow-on patch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Switch statements introduce a bogus loop with an unconditional break at
the end of the loop, just before the while...so the while is unreachable
and has no immediate dominator.
v2: With less exuberance
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The (gen < 9) check in brw_clear() was too broad. It disabled all types
of fast color clears:
a. singlesample rep clears
b. singlesample MCS fast clears
c. multisample MCS fast clears
The MCS clears are still buggy, but the rep clear works well. So let's
enable it.
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Fast color clears are disabled for gen9 (see the checks in
brw_meta_fast_clear), so there is no reason to allocate the MCS and
track its clear/resolve state.
Reviewed-by: Neil Roberts <neil@linux.intel.com>