This moves a bunch of non-draw dependent calcs into the pipeline code,
to reduce CPU overheads in the draw path.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This realigns this code with the radeonsi version and fixes
the indirect case to work properly.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reduce size of radv_pipeline.c and improve code isolation. More
code can probably moved but it's a start.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
When I added gfx9 I did it wrong, this fixes it.
Fixes: 5247b311e9 "radv/gfx9: fix set predication packet."
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.
This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.
This increases the fps on Sascha Willems deferred.
Tonga: 580fps->670fps on a Tonga PRO card.
Polaris 730->850fps
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This doesn't get used yet, it just adds support to various PKT3
emissions to enable it later.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The register header (and radeonsi comment) states V_411_SRC_ADDR_TC_L2
is for CIK+ only, so let's assert on earlier ASICs.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This seems to matter here in a profile, without this we spend a lot
more time exiting this function with no flush bits.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
No sense checking each bit separately in the common case of none
being set.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
This adds some rb+ support, as on GFX9 we have to disable
it as per radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
GFX9 needs to write event EOP to a fence buffer, allocate some
space for this, and just write an ever increasing number to it,
this isn't exactly what radeonsi does, but it seems to work.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just adds support for initialising some GFX9 registers,
and handles the different init for the VGT reuse reg.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This adds support to the CP dma code for GFX9, ported from
radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just adds the strings and includes the gfx9 register defs
in some files that we need them in.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This reworks this code to be like radeonsi, which will make it
easier to add GFX9 support to it in the future.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In prep for GFX9 refactor some of the eop event writing code
out.
This changes behaviour, but aligns with what radeonsi does,
it does double emits on CIK/VI, whereas previously it only
did this on CIK.
v2: bump the size checks.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just adds the chip in the right places.
We don't set the partial_vs_wave workaround, as radeonsi
doesn't, but have to confirm it's not required.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Gives me approximately a 2% perf increase in bot dota2 & talos.
Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
We want the guardband_x/y to be the largerst scalars such that each
viewport scaled by that amount is still a subrange of [-32767, 32767].
The old code has a couple of issues:
1) It used scissor instead of viewport_scissor, potentially taking into
account a viewport that is too small and therefore selecting a scale
that is too large.
2) Merging the viewports isn't ideal, as for example viewports with
boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k,
while their union [0, 1001] only allows a scale of ~32.
The new code just determines the guardband per viewport and takes the minimum.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
y is vert, x is horiz.
Noticed in visual inspection compared to radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This ports the VGT_VERTEX_REUSE register settings
for Polaris GPUs from radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The logic was different than radeonsi, fix it up before adding
tess support.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This just keeps popping up minor problems and regressions we should
revisit in a more sustainable manner later.
This also reverts:
Revert "radv: query cmds should mark a cmd buffer as having draws."
Revert "radv: also fixup event emission to not get culled."
This reverts commit d1640e7932.
This reverts commit 8b47b97215.
This reverts commit b4b19afebe.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This ports the remains of the workarounds from radeonsi for
the non-TESS cases. It should provide equivalent workarounds
for hawaii and bonarie.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This puts the common gfx state for the device into an
indirect buffer, and just calls out to it, on CIK and above.
This is taken from what radeonsi does.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is just prep work for the following patch to use
a common gfx init indirect buffer.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>