Apparently I inverted the sense of this flag back when we didn't have
piglit testing. Fixes terrible rendering in minetest, HL2, CS:Source, and
CS.
Fixes: 0369dd9077 ("freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
After the previous change, PASS 1 can be trivially pulled out of the
loop.
With PASS 1 removed, the loop can be unrolled, and a lot of code can be
deleted (from the unrolls). This saves a couple lines of code, and it
makes the function a little easier to follow.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Things that are not dynamically indexed must be added last. This is
necessary so that values that are both statically indexed (or used
directly) and dynamically indexed will only be added once. With the
above change, if the constant 47 is used as a literal in an instruction
and in an array that is dynamically indexed, it will be added to
`Parameters` twice. On (really old) GPUs that store constants and other
parameters in the same storage, this can cause some valid programs to
exceed the storage limits. I don't know about R300 or NV30, but R200
was limited to something like 256 vec4s. This applies to constants,
state parameters, and local parameters (the assembly shader version of
uniforms).
The problem this causes here is that the final parameter layout created
in `_mesa_layout_parameters` may have more parameters than the input
layout. The fundamental assumption of that routine (and documented as
an assumption of `copy_indirect_accessed_array`) is that the input size
and the output size will be the same.
The affected shader had something like the following. This is a common
pattern for ARB assembly shaders generated by NVIDIA's cgc compiler. As
far as I can tell, the majority of applications that use ARB assembly
shaders either use cgc or use some sort of DX9 cross-compiler that
generates similar patterns.
    PARAM c[141] = { program.local[0..133],
                     { 255, 0.1, 3, 1 },
                     { 0.5, 2, 0.15915491, 0.25 },
                     { 0, 0.5, 1, -1 },
                     { 24.980801, -24.980801, -60.145809, 60.145809 },
                     { 85.453789, -85.453789, -64.939346, 64.939346 },
                     { 19.73921, -19.73921, -9, 0.75 },
                     { -999999 } };
The shader contains instructions like
    MUL R0.x, R0, c[135].y;
and
    DP4 R2.z, c[A0.x + 6], R1;
Starting with b9bff76b63, the constants at the end of `c` would get
added to `Parameters` twice. The first time they are added due to
instructions that directly access the array (e.g., the `c[135].y`
above). The second time is because they are part of an array that is
dynamically indexed. As a result, the final layout of Parameters
(calculated by `_mesa_layout_parameters`) is 7 elements larger than the
input layout.
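The ordering problem can be illustrated with a toy model. This is only a
sketch: `toy_param_list`, `toy_add_array`, `toy_add_constant` and
`toy_layout_size` are hypothetical names, not Mesa's parameter list API, and
scalars stand in for vec4 parameters. It assumes that a dynamically indexed
array is always appended whole, while a directly used value reuses an existing
slot when one is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PARAMS 64

/* Toy stand-in for a parameter list; NOT Mesa's real data structure. */
struct toy_param_list {
   float values[TOY_MAX_PARAMS];
   size_t count;
};

/* Append a whole array unconditionally. Dynamically indexed storage has to
 * stay contiguous and in order, so no existing slot can be reused.
 * Returns the index of the first element. */
static size_t
toy_add_array(struct toy_param_list *list, const float *vals, size_t n)
{
   assert(list->count + n <= TOY_MAX_PARAMS);
   size_t base = list->count;
   for (size_t i = 0; i < n; i++)
      list->values[list->count++] = vals[i];
   return base;
}

/* Add one directly used value, reusing a matching slot if there is one. */
static size_t
toy_add_constant(struct toy_param_list *list, float val)
{
   for (size_t i = 0; i < list->count; i++) {
      if (list->values[i] == val)
         return i;
   }
   assert(list->count < TOY_MAX_PARAMS);
   list->values[list->count] = val;
   return list->count++;
}

/* Lay out one dynamically indexed array plus one direct use of the value
 * 47, in either order, and return the resulting number of parameters. */
static size_t
toy_layout_size(bool constants_first)
{
   static const float array[] = { 1.0f, 47.0f, 3.0f };
   struct toy_param_list list = { .count = 0 };

   if (constants_first) {
      toy_add_constant(&list, 47.0f);   /* slot 0 */
      toy_add_array(&list, array, 3);   /* slots 1..3, 47 duplicated */
   } else {
      toy_add_array(&list, array, 3);   /* slots 0..2 */
      toy_add_constant(&list, 47.0f);   /* found in slot 1, nothing added */
   }
   return list.count;
}
```

In this model `toy_layout_size(true)` yields 4 entries and
`toy_layout_size(false)` yields 3, which mirrors why the things that are not
dynamically indexed have to be added last.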
Since bcc61a01d4 fixed the allocation size of `ParameterValues`,
`copy_indirect_accessed_array` will now write past the end of the array.
This eventually results in a crash in `free`. Thankfully Valgrind was
able to help find the real source of the problem.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: b9bff76b63 ("mesa: put constants before state vars for ARB programs")
Closes: #4505
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Ran into this while trying to rework fbconfig setup. Due to a bug I
ended up trying to allocate a PIPE_FORMAT_NONE framebuffer, which failed
like you'd hope, but we weren't converting that failure into an error in
st_api_make_current. Instead we'd treat it like binding no drawable to
the context, which is really not what was asked for, so let's go ahead
and make this an error.
Reviewed-by: Eric Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9956>
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.
Passes tests that run under:
dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*
Rebased and modified commit from Jonathan Marek.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
When float16 is enabled, this allows a number of float16 tests to
pass.
When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set, all operations that would
generate +-infinity generate +-MAX_HALF_FLOAT instead.
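The effect of that bit can be modelled roughly as follows. This is a sketch
of the described behaviour only, not driver or hardware code:
`toy_f16_result_no_inf` and `TOY_MAX_HALF_FLOAT` are hypothetical names, and
the half-float result is carried in a 32-bit float for simplicity. 65504 is
the largest finite half-float value.

```c
#include <assert.h>
#include <math.h>

/* Largest finite IEEE 754 binary16 value. */
#define TOY_MAX_HALF_FLOAT 65504.0f

/* Model of a half-float result with the "no infinity" behaviour enabled:
 * an infinite result is replaced by the largest finite half float of the
 * same sign; every other result passes through unchanged. */
static float
toy_f16_result_no_inf(float result)
{
   if (isinf(result))
      return result > 0.0f ? TOY_MAX_HALF_FLOAT : -TOY_MAX_HALF_FLOAT;
   return result;
}
```

Tests that expect a true infinity (for example from an overflowing
operation) would see 65504 or -65504 under this model.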
Fixes some tests from:
dEQP-VK.spirv_assembly.instruction.*.float16.*
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp16.*
E.g.:
dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
NIR has shifts defined as:
opcode("*shr", 0, tuint, [0, 0], [tuint, tuint32], False, ...
However, in ir3 we have to ensure that both operands of a shift
instruction have the same bit size.
Hopefully, in the future, the additional COV for constants will
be optimized away.
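A rough C model of the idea, for a 16-bit value shifted by a 32-bit count.
The function names are hypothetical and this is not ir3 code; it assumes the
shift count is masked to the value's bit size, and it models the COV as a
plain narrowing conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the COV: narrow the 32-bit shift count to 16 bits. */
static uint16_t
toy_cov_u32_to_u16(uint32_t x)
{
   return (uint16_t)x;
}

/* A shift whose two operands have the same bit size, as ir3 requires.
 * The count is masked to the low 4 bits (assumed wrap-around semantics). */
static uint16_t
toy_shr16(uint16_t value, uint16_t count)
{
   return (uint16_t)(value >> (count & 15));
}

/* Shift with a NIR-style signature: 16-bit value, 32-bit count.
 * The count is converted first so both operands match. */
static uint16_t
toy_lower_ushr16(uint16_t value, uint32_t count32)
{
   uint16_t count16 = toy_cov_u32_to_u16(count32);
   return toy_shr16(value, count16);
}
```

When the count is a constant, the conversion could in principle be folded
into the constant itself, which is the optimization hoped for above.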
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
cat1 instructions round to zero by default.
When fp16 is enabled this will fix:
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_frag
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_vert
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage
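The difference between round-to-zero and round-to-nearest-even for an
fp32 to fp16 conversion can be sketched in software. This illustrates the
two rounding modes only and is not how ir3 or the hardware implements the
conversion; `toy_f32_to_f16_normal` is a hypothetical helper that, by
assumption, only accepts finite inputs whose result is a normal half float.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Convert an fp32 value to fp16 bits using either round-to-zero
 * (truncation) or round-to-nearest-even. Only inputs in the normal
 * half-float exponent range are supported. */
static uint16_t
toy_f32_to_f16_normal(float f, bool round_to_nearest_even)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));

   uint32_t sign = (bits >> 16) & 0x8000u;
   /* Rebias the exponent from fp32 (127) to fp16 (15). */
   int32_t exp = (int32_t)((bits >> 23) & 0xff) - 127 + 15;
   uint32_t mant = bits & 0x7fffffu;

   assert(exp >= 1 && exp <= 30);

   /* Dropping the low 13 mantissa bits is round-to-zero. */
   uint32_t half = ((uint32_t)exp << 10) | (mant >> 13);

   if (round_to_nearest_even) {
      uint32_t dropped = mant & 0x1fffu;
      /* Round up above the halfway point, or at exactly halfway when the
       * kept mantissa is odd. A carry into the exponent is intended. */
      if (dropped > 0x1000u || (dropped == 0x1000u && (half & 1u)))
         half++;
   }

   return (uint16_t)(sign | half);
}
```

For example, 1.000732421875 (1 + 3 * 2^-12) lies above the midpoint between
the half floats 0x3c00 and 0x3c01, so truncation gives 0x3c00 while
round-to-nearest-even gives 0x3c01.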
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>