In the old hand-writen implementation of atan2, the calculation of
atan(y/x) was performed conditionally in the "then" block of the
outermost if statement. I believe I accidentally lifted this out
into unconditional code when converting to IR builder.
For reference, the original hand-written IR is visible in commit
722eff674b.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
MinimumArrayElement carries the same meaning for BDW and SKL.
Suggested by Jason.
No regressions in dEQP-VK.pipeline.image.view_type.cube_array.*
Fixes a number of cube tests, including cube_array_base_slice
and cube_base_slice tests.
This makes it act like the address mode is set to TEXCOORDMODE_CUBE
whenever this sampler is combined with a cube surface. This *should* be
what we need for Vulkan. Interestingly, the PRM contains a programming
note for this field that says simply, "This field must be set to
CUBECTRLMODE_PROGRAMMED". However, emprical evidence suggests that it does
what the PRM says it does and OVERRIDE is just fine.
These fields are ignored for non-cube surfaces. For cube surfaces
these fields should be enabled when using TEXCOORDMODE_CLAMP and
TEXCOORDMODE_CUBE.
TODO: Determine if these are the only two modes used in Vulkan.
OpenGL's dual color blending feature was specified so that an
implementation could support both multiple render targets (MRT) and
dual source blending. Fragment shader outputs specify both "location"
(the render target number) and "index" (either color 0 or 1).
I believe DirectX only has the notion of "location" - if using dual
color blending, location 0 or 1 will specify the operands. If not,
then location means the render target index. The two features can't
be used together.
As such, some applications mistakenly try to use <loc = 0, index = 0>
and <loc = 1, index = 0> in a shader used for dual color blending with
a single render target, rather than the correct <loc = 0, index = 0>
and <loc = 0, index = 1>.
In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug.
Unigine is aware of the problem, and quickly developed a fix, but has
not bothered to change the download link on their website to a working
copy in over a year. People were still using the broken version and
complaining. We tried working around this by disabling dual color
blending, but that apparently hurts performance, and people were once
again unhappy.
On i965, dual source blending is achieved by using different framebuffer
write messages than normal rendering. So, we have to compile different
code for the two cases. We're not being pedantic: we actually have to
know in order to function.
Normally, dual source blending is detectable in the shader: if a shader
has an output with index = 1, then it's meant for blending, not MRT.
With the broken inputs, they're indistinguishable, so we can only tell
by looking at the current GL state.
This patch implements a new drirc workaround:
export dual_color_blend_by_location=true
which makes the i965 driver detect when OpenGL state is configured for
dual source blending, and recompile the fragment shader to use the right
messages. In that case, we allow either location = 1 or index = 1 to
specify the second source for the blending equations.
It also re-enables GL_ARB_blend_func_extended for Unigine.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
v2: After more discussion with hw teams, the kernel already contains the
optimal settings allowing us to use all CUs.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Some fields of the surface state template were dependent on the
surface type, which is dependent on the usage of the image view, which
wasn't known until the bottom of the function after the template had
been constructed. This caused failures in all image load/store CTS
tests using cubemaps. Refactor the surface state calculation into a
function that is called once for each required usage.
We don't want to do this in the long-run but it's needed for passing the
NoContraction tests at the moment. Eventually, we want to plumb this
through NIR properly.
This was originally removed here:
commit 031d350132
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Tue Aug 25 16:59:12 2015 -0700
i965/vs: Unify URB entry size/read length calculations between backends.
Then added back:
commit bd198b9f0a
Author: Kenneth Graunke <kenneth@whitecape.org>
Date: Fri Aug 14 16:01:33 2015 -0700
i965/vs: Simplify fs_visitor's ATTR file.
Note that the authorship dates are out of order, but the above reflects the
order of the commit dates.
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.
By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).
Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.
v2: take discard by alpha test into account as well
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Updates the _mesa_has_geometry_shaders function to also look
for OpenGL ES 3.1 contexts that has OES_geometry_shader enabled.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>