Unfortunately, this requires some non-trivial changes to the driver. Now
that the primitive restart index isn't given explicitly by the client, we
always use ~0 for everything like D3D does. Unfortunately, our hardware is
awesome and a 32-bit version of ~0 doesn't match any 16-bit values. This
means, we have to set it to either UINT16_MAX or UINT32_MAX depending on
the size of the index type. Since we get the index type from
CmdBindIndexBuffer and the rest of the VF packet from the pipeline, we need
to lazy-emit the VF packet.
This allows drivers to report queries in units of microseconds and
have the HUD display "us" (microseconds), "ms" (milliseconds) or "s"
(seconds) on the graph.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Instead of using a boolean 'is bytes' value, use the pipe_driver_query_type
enum type. This will let is add support for time values in the next patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This was probably disabled due to a combination of several bugs in the
generator code (fixed earlier in this series) and a misunderstanding
of the hardware spec. The documentation for most control flow
instructions mentions among other restrictions:
"Instruction compression is not allowed."
This however doesn't have any implications on 16 wide not being
supported, because none of the control flow instructions have
multi-register operands (control flow instructions are not compressed
on more recent hardware either, except maybe SNB's IF with inline
compare). In fact Gen4-5 had 16-wide control flow masks and stacks,
and the spec mentions in several places that control flow instructions
push and pop 16 channels worth of data -- Otherwise there doesn't seem
to be any indication that it shouldn't work.
Causes no piglit regressions, and gives the following shader-db
results on ILK:
total instructions in shared programs: 4711384 -> 4711384 (0.00%)
instructions in affected programs: 0 -> 0
helped: 0
HURT: 0
GAINED: 1215
LOST: 0
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
From the hardware docs for the DO instruction:
"Execution size is ignored for this instruction."
My observation on ILK hardware contradicts the spec though, channels
over the execution size of a DO instruction won't enter the loop, and
channels over the execution size of a WHILE instruction will exit the
loop after the first iteration -- The latter is consistent with the
spec though, there's no claim about the execution size being ignored
for the WHILE instruction so it's not completely unexpected that it
has an influence on the evaluation of EMask.
The execute_size argument of brw_DO() shouldn't have any effect on
Gen6 and newer hardware. On Gen4-5 WHILE instructions inherit the
execution size from the matching DO, so this patch should fix them
too. The execution size of BREAK and CONT instructions was already
being set correctly.
Fixes some 50 piglit tests on Gen4-5 when forced to run shaders with
conditional and loop instructions 16-wide,
e.g. shaders/glsl-fs-continue-inside-do-while.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The hardware docs don't mention explicitly what these fields should
be, but I've verified experimentally on ILK that using a GRF as
destination causes the register to be corrupted when the execution
size of an ENDIF instruction is higher than 8 -- and because the
destination we were using was g0, eventually a hang.
Fixes some 150 piglit tests on Gen4-5 when forced to run shaders with
if conditionals 16-wide, e.g. shaders/glsl-fs-sampler-numbering-3.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This was missing and was causing the driver to not work with
execlists. Presumably we get a different initial hw context with
execlists enabled, that has sample mask 0 initially.
Set this to 0xffff for now. When we add MS support, we need to take the
value from VkPipelineMsStateCreateInfo::sampleMask.
The new headers use stdbool for enable/disable fields which
implicitly converts expressions like (flags & 8) to 0 or 1.
Also handles MBO (must-be-one) fields by setting them to one,
corrects a bspec typo (_3DPRIM_LISTSTRIP_ADJ -> LINESTRIP) and
makes a few enum values less clashy.
This eliminates the error prone logic in si_shader_vs recalculating this
value.
It also fixes TGSI_SEMANTIC_CLIPDIST outputs incorrectly not being
counted for VS exports. They need to be counted because they are passed
to the pixel shader as parameters as well.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91193
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Convert the table from the direct mapping
VkImageViewType -> SurfaceType
into a mapping to an info struct
VkImageViewType -> struct anv_image_view_info