Previously, the command buffer implementation was split between
anv_cmd_buffer.c and anv_cmd_emit.c. However, this naming convention was
confusing because none of the Vulkan entrypoints for anv_cmd_buffer were
actually in anv_cmd_buffer.c. This changes it so that anv_cmd_buffer.c is
what you think it is and the internals are in anv_batch_chain.c.
if we get a request to take the count from feedback, but there
is no buffer to take it from, just draw as if we got 0 vertices
so nothing.
This fixes this assert killing the ogl conform, and a piglit
test I've sent.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Previously, the caller of emit_state_base_address was doing this. However,
putting it directly in emit_state_base_address means that we'll never
forget the flush at the cost of one PIPE_CONTROL at the top every batch
(that should do nothing since the kernel just flushed for us).
This is more generic and doesn't imply that it emits MI_BATCH_BUFFER_END.
While we're at it, we'll move NOOP adding from bo_finish to
end_batch_buffer.
This removes the need for multiple functions designed to validate an array
subscript and replaces them with a call to a single function.
The change also means that validation is now only done once and the index
is retrived at the same time, as a result the getUniformLocation code can
be simplified saving an extra hash table lookup (and yet another
validation call).
This chage also fixes some tests in:
ES31-CTS.program_interface_query.uniform
V3: rebase on subroutines, and move the resource index array == 0
check into _mesa_GetProgramResourceIndex() to simplify things further
V2: Fix bounds checks for program input/output, split unrelated comment fix
and _mesa_get_uniform_location() removal into their own patch.
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Previously it could end up using the “SKL early” device on BXT
depending on the revision number. This would probably break things
because for example has_llc would be wrong.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This fixes the remaining failing tests in:
ES31-CTS.program_interface_query.uniform-types
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This enables GL4.1 for radeonsi, and updates the
docs in the correct places.
v2: enable only for llvm 3.7 which has fixes in place.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is the final piece for ARB_gpu_shader5,
The code is based on the r600 code from Glenn Kennard,
and myself.
While developing this, I'm not 100% sure of all the calculations
made in the GS registers, this is why the max_stream is worked
out there and used to limit the changes in registers. Otherwise
my initial attempts either regressed GS texelFetch tests
or primitive-id-restart. The current code has no regressions
in piglit.
This commit doesn't enable ARB_gpu_shader5, since that just
bumps the glsl level to 4.00, so I'll just do a separate patch
for 4.10.
v1.1: fix bug introduced in rebase.
v2: Address Marek's review comments,
remove my llvm stream code for simpler C,
move gsvs_ring and gs_next_vertex to arrays.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of walking the list of batch and surface buffers, we simply keep
track of all known batch and surface buffers as we build the command
buffer. Then we use this new list to construct the validate list.
The algorighm we used previously required us to call add_bo in a particular
order in order to guarantee that we get the initial batch buffer as the
last element in the validate list. The new algorighm does a recursive walk
over the buffers and then re-orders the list. This should be much more
robust as we start to add circular dependancies in the relocations.
Fixes following compiler warning:
brw_cs.cpp:386:27: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Before, we were doing this thing where we had one big relocation list for
the whole command buffer and each subbuffer took a chunk out of it. Now,
we store the actual relocation list in the anv_batch_bo. This comes at the
cost of more small allocations but makes a lot of things simpler.
Previously anv_batch.relocs was an actual relocation list. However, this
is limiting if the implementation of the batch wants to change the
relocation list as the batch progresses.
This is easily accomplished by moving simd16 3src to GEN9_FEATURES.
v2: small cleanup to make it more similar to GEN8_FEATURES
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This update fixes cases where a 48-bit address field was split into
two parts:
__gen_address_type MemoryAddress;
uint32_t MemoryAddressHigh;
which cases this pack code to be generated:
dw[1] =
__gen_combine_address(data, &dw[1], values->MemoryAddress, dw1);
dw[2] =
__gen_field(values->MemoryAddressHigh, 0, 15) |
0;
which breaks for addresses above 4G.
This update also fixes arrays of structs in commands and structs, for
example, we now have:
struct GEN8_BLEND_STATE_ENTRY Entry[8];
and the pack functions now write all dwords in the packet, making
valgrind happy.
Finally, we would try to pack 64 bits of blend state into a uint32_t -
that's also fixed now.
There are a couple of unrelated changes in t_vb_lighttmp.h that I hope
you'll excuse -- there's a block of code that's duplicated modulo a few
trivial differences that I took the liberty of fixing.