Rhys Perry
ded9cb904f
anv: Enable nir_opt_access
...
This commit will enable pass for searching readonly / writeonly
access when it's missing.
We don't support shaderStorageImageReadWithoutFormat
and the optimization pass causes those shaders to
take the write-only path which does support formatless.
Following games are affected with positive result:
- Wolfenstein: Youngblood
- Wolfenstein II: The New Colossus https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138
- Rage 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791
- The Surge 2 https://gitlab.freedesktop.org/mesa/mesa/-/issues/5805
- Metro Exodus https://gitlab.freedesktop.org/mesa/mesa/-/issues/4703
- DOOM Eternal https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3138,https://gitlab.freedesktop.org/mesa/mesa/-/issues/5791,https://gitlab.freedesktop.org/mesa/mesa/-/issues/4273
Signed-off-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15082 >
2022-02-23 13:11:12 +00:00
Alyssa Rosenzweig
abb7f04674
panfrost: Inline pan_emit_sfbd_tiler
...
Easier to read, the common code was already common.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
910d4f8245
panfrost: Remove pan_emit_fbd thunking
...
Use a common interface.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
8dc7757754
panfrost: Remove unrelated comment
...
Not sure what this was supposed to describe, but it's not the code here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
099d61c95d
panfrost: Use txl instead of tex in the blitter
...
We always blit from a particular level, so it's a waste to compute the LOD.
This corresponds to a simple texture instruction with implement 0 LOD, which is
the optimal texturing path on Bifrost -- it maps to TEXS_2D but does not require
helper invocations.
Functional change on Bifrost: Blit shaders no longer set .computed_lod or
shader_contains_barrier.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
5b1a00c565
panfrost: Inline pan_blit_emit_dcd
...
Easier to follow the logic without having a million arguments passed around.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
c9784c9512
panfrost: Decouple tiler job and DCD emit
...
We can share the "emit quad" logic, even though the DCDs differ.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
a13d87c484
panfrost: Annotate slow clears as such
...
We should realistically be using the clear shaders from PanVK once they're moved
to common.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
1eb3dbafdb
panfrost: Set defaults for deprecated DCD fields
...
There are always set to true. Don't pollute the driver code with them, make
their existence a local detail to pre-Valhall XML and that's it.
Functional change: "four components per vertex" is now set on vertex job DCDs.
This should be a no-op.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
bd3d7e33b6
panfrost: Use pan_shader_prepare_rsd in blitter
...
This reduces code duplication and will ease Valhall porting. Functional changes
on v7:
* Shader contains barrier is now set (perf loss, fixed later in series)
* Shader register allocation is now set (perf win)
* Point sprite inverted, no-op for blit shaders
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Alyssa Rosenzweig
6fc81f163e
pan/mdg: Fix partial execution mode names
...
cont -> skip, last -> kill, and fix the special case handling. It's just an
enum. Makes the disassembly easier to read and closer to Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15123 >
2022-02-23 12:56:30 +00:00
Danylo Piliaiev
7e703e4428
turnip: Always use GMEM for feedback loops in autotuner
...
For ordinary feedback loops GMEM is a lot faster than sysmem since
we don't set SINGLE_PRIM mode.
For feedback loops with ordered rasterization GMEM should also be
faster.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106 >
2022-02-23 11:31:59 +00:00
Danylo Piliaiev
ebc23ac963
turnip: Implement VK_ARM_rasterization_order_attachment_access
...
Trivially implemented by using A6XX_GRAS_SC_CNTL_SINGLE_PRIM_MODE.
This extension is useful for emulators e.g. AetherSX2 PS2 emulator and
could drastically improve performance when blending is emulated.
Relevant tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.*
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106 >
2022-02-23 11:31:59 +00:00
Danylo Piliaiev
d6c89e1e4a
turnip: Merge LRZ and DEPTH_PLANE draw states
...
They were emitted at the same time. Frees 1 draw state for us to use.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106 >
2022-02-23 11:31:59 +00:00
Danylo Piliaiev
dab34bd5c8
turnip: Use LATE_Z when there might be depth/stencil feedback loop
...
Otherwise a shader invocation would read the value which should have
been set AFTER this shader invocation.
Fixes tests:
dEQP-VK.rasterization.rasterization_order_attachment_access.depth.samples_1.multi_draw_barriers
dEQP-VK.rasterization.rasterization_order_attachment_access.stencil.samples_1.multi_draw_barriers
Fixes: 71595a189a
("tu: Fix feedback loops in sysmem mode")
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15106 >
2022-02-23 11:31:59 +00:00
Paulo Zanoni
d10fd5b7c9
iris: fix register spilling on compute shaders on XeHP
...
XeHP scratch space is handled differently. Commit ae18e1e707
implemented support for it, but handled it differently between render
and compute shaders: it calculates scratch_addr differently and
doesn't pin the buffer on compute. Make it work on compute shaders by
calling pin_scratch_space() from iris_compute_walker(), which fixes
both the address and the pinning.
This commit can be verified by the two-year-old-but-still-unreviewed
Piglit MR 234. You can also verify this by running a very simple
compute shader with INTEL_DEBUG=spill_fs.
References: https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/234
Fixes: ae18e1e707 ("iris: Add support for scratch on XeHP")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15070 >
2022-02-22 22:16:57 +00:00
Kenneth Graunke
c46d3acf0e
anv: Raise vertex input bindings and attributes limits slightly
...
This raises our vertex input bindings limit from 28 to 31, and our
vertex input attribute limit from 28 to 29. We could theoretically
go higher, but it will take additional work.
The 3DSTATE_VERTEX_BUFFERS and 3DSTATE_VERTEX_ELEMENTS limits are 33
vertex buffers, and 34 vertex elements. But we need up to two vertex
elements for system values (FirstVertex, BaseVertex, BaseInstance,
DrawID), and we currently use two vertex bindings for those.
There is another hidden limit: our compiler backend only supports the
push model for VS inputs currently. 3DSTATE_VS only allows URB Read
Lengths between [0, 15], which is measured in pairs of inputs, which
means we can theoretically push no more than 32 vertex elements. This
is no artifical limit either, as a vec4 element takes up 4 registers
in the payload, and 32 * 4 = 128, the entire size of our register file.
Plus, the VS Thread payload needs at least g0 and g1 for other things,
so we can really only push 31.
We can theoretically support one additional binding, by combining our
two SGV bindings into a single upload. In order to support additional
vertex elements, we would need to add support to the backend compiler
for the pull model for VS inputs.
References: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5917
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14991 >
2022-02-22 21:31:06 +00:00
Mike Blumenkrantz
dabba7d726
zink: ci updates
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067 >
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz
3029000389
zink: remove zink_descriptor_util_init_null_set()
...
no longer used
Reviewed-by: Dave Airlie <airlied@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067 >
2022-02-22 21:16:55 +00:00
Mike Blumenkrantz
7266182be0
zink: allow null descriptor set layouts
...
I got confused while writing this somehow because of the null descriptor
feature, which enables drivers to consume a null descriptor, which has no
relation to a descriptor layout containing no descriptors
failing to accurately use zero descriptors can put layouts over the maximum
per-stage limits, which causes tests to crash
fixes (lavapipe):
KHR-GL46.shading_language_420pack.binding_uniform_block_array
KHR-GL46.multi_bind.dispatch_bind_buffers_base
Reviewed-by: Dave Airlie <airlied@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15067 >
2022-02-22 21:16:55 +00:00
Timur Kristóf
3759a16d8a
ac/nir/ngg: Fix mixed up primitive ID after culling.
...
When NGG culling is enabled, make sure that the correct
primitive ID is exported by each lane.
Fixes: e97f0463a8 "ac/nir: Implement NGG deferred attribute culling in NIR."
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6050
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15055 >
2022-02-22 18:15:24 +00:00
Mike Blumenkrantz
c063d8ff64
zink: prune ci lists
...
I don't know why I thought running GL3.2 and GL4.6 was a good idea,
but it wasn't
Acked-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15065 >
2022-02-22 18:02:00 +00:00
Emma Anholt
59bc17d57a
turnip: Request no implicit sync when we have no implicit-sync WSI BOs.
...
I chose to implement this as a global flag in the device, because
otherwise we would end up with extra draw overhead trying to avoid it in
the implicit-sync WSI case, and you're probably going to end up needing
implicit sync anyway because you used one of the BOs in any of the
submitted cmdbufs. To do better than this, we would probably want a
skip-implicit-sync flag on the BOs in the BO list, rather than global on
the submit.
Reports about venus on turnip say that this flag reduces worst-case
QueueSubmit time in a game workload from ~10ms to ~4ms.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14838 >
2022-02-22 17:36:05 +00:00
Samuel Pitoiset
83ee08f6d1
radv: fix build on BSD
...
Just disable inotify for BDS systems.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6060
Fixes: c50557d961 ("radv: allow applications to dynamically change RADV_FORCE_VRS")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15105 >
2022-02-22 17:16:21 +00:00
Alyssa Rosenzweig
2e86767370
pan/bi: Add BIFROST_MESA_DEBUG=nosb option
...
To disable the new scoreboarding optimizations when debugging.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
c81c022e66
pan/bi: Implement basic scoreboarding pass
...
Extend our existing bi_scoreboard infrastructure with a simple data flow
analysis pass that calculates which dependency slots need waiting. We
still lack a heuristic for selecting dependency slots.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
8f25d88d90
pan/bi: Print scoreboarding state
...
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
6ad9a7f650
pan/bi: Add scoreboard state to IR
...
To a limited degree, scoreboarding must be global, so add the data
structures for tracking this to the IR.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
91c02893d8
pan/bi: Clean up nits in liveness analysis
...
Fix minor silly things.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
734a8bdc5d
pan/bi: Use bi_exit_block
...
The "generic" one is a vestige of Midgard.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
75406a561f
pan/bi: Add bi_{start, exit}_block helpers
...
Useful for data flow analysis.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
e5423bb129
pan/bi: Do not cull post-RA staging writes
...
Bifrost post-RA dead code elimination can cull the destinations of
regular ALU instructions, by weakening from a register write to a
temporary write. However, there is no way to suppress staging writes, so
culling the destinations will result in invalid code generation.
Fixes a regression in
dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_static_vertex
with scoreboarding. The root cause there is the backend dead code
elimination not being sufficiently aggressive in the presence of control
flow. Usually this does not matter, since the backend optimizations are
intended to be local with global optimizations happening in NIR.
Unfortunately, our implementation of IDVS hits this hard. That will need
to be optimized (probably by specializing IDVS shaders in NIR instead of
the backend). In the mean time, let's fix the actual bug affecting
scoreboarding.
No shader-db changes.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
87d46f40c8
pan/bi: Cull DTSEL_IMM dests in post-RA DCE
...
They are useless (given the semantics of DTSEL_IMM) and complicate
scoreboarding. Just remove them in the pass that removes all the other silly
register destinations.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Alyssa Rosenzweig
956b969616
pan/bi: Clarify requirement for barriers
...
Barriers need to wait on all outstanding messages. This is more of an API
requirement than a hardware requirement, but it's still an invariant the
scoreboarding pass must respect.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14298 >
2022-02-22 16:57:30 +00:00
Adam Jackson
2eb644e470
mesa: Enable GL_NV_pack_subimage
...
This just legalizes a few of the pixelstore pack parameters in GLES2
that are already legal in desktop and GLES3. glamor takes advantage of
this in the GetImage and software-fallback paths.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14977 >
2022-02-22 10:45:28 -05:00
Alyssa Rosenzweig
606ac8d61e
pan/bi: Enable nir_opt_shrink_vectors
...
total instructions in shared programs: 1939513 -> 1935815 (-0.19%)
instructions in affected programs: 809066 -> 805368 (-0.46%)
helped: 3195
HURT: 865
helped stats (abs) min: 1.0 max: 15.0 x̄: 1.99 x̃: 1
helped stats (rel) min: 0.10% max: 25.00% x̄: 2.26% x̃: 1.28%
HURT stats (abs) min: 1.0 max: 22.0 x̄: 3.09 x̃: 2
HURT stats (rel) min: 0.10% max: 83.33% x̄: 2.67% x̃: 1.39%
95% mean confidence interval for instructions value: -1.00 -0.82
95% mean confidence interval for instructions %-change: -1.34% -1.08%
Instructions are helped.
total tuples in shared programs: 1523194 -> 1521789 (-0.09%)
tuples in affected programs: 745526 -> 744121 (-0.19%)
helped: 2947
HURT: 1844
helped stats (abs) min: 1.0 max: 18.0 x̄: 2.06 x̃: 1
helped stats (rel) min: 0.15% max: 25.00% x̄: 2.65% x̃: 1.59%
HURT stats (abs) min: 1.0 max: 29.0 x̄: 2.54 x̃: 1
HURT stats (rel) min: 0.09% max: 40.00% x̄: 2.32% x̃: 1.52%
95% mean confidence interval for tuples value: -0.39 -0.20
95% mean confidence interval for tuples %-change: -0.85% -0.62%
Tuples are helped.
total clauses in shared programs: 329158 -> 325350 (-1.16%)
clauses in affected programs: 111654 -> 107846 (-3.41%)
helped: 2787
HURT: 498
helped stats (abs) min: 1.0 max: 17.0 x̄: 1.57 x̃: 1
helped stats (rel) min: 0.76% max: 40.00% x̄: 6.92% x̃: 5.26%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.14 x̃: 1
HURT stats (rel) min: 0.87% max: 50.00% x̄: 4.73% x̃: 3.77%
95% mean confidence interval for clauses value: -1.21 -1.10
95% mean confidence interval for clauses %-change: -5.39% -4.93%
Clauses are helped.
total cycles in shared programs: 172084.50 -> 166827.62 (-3.05%)
cycles in affected programs: 74698.83 -> 69441.96 (-7.04%)
helped: 3706
HURT: 568
helped stats (abs) min: 0.041665999999999315 max: 19.0 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.24% max: 75.00% x̄: 9.48% x̃: 6.90%
HURT stats (abs) min: 0.041665999999999315 max: 1.0 x̄: 0.15 x̃: 0
HURT stats (rel) min: 0.25% max: 50.00% x̄: 2.21% x̃: 1.42%
95% mean confidence interval for cycles value: -1.28 -1.18
95% mean confidence interval for cycles %-change: -8.18% -7.67%
Cycles are helped.
total arith in shared programs: 57145.04 -> 57211.37 (0.12%)
arith in affected programs: 27595.12 -> 27661.46 (0.24%)
helped: 1933
HURT: 2259
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.09 x̃: 0
helped stats (rel) min: 0.16% max: 33.33% x̄: 2.74% x̃: 1.52%
HURT stats (abs) min: 0.04166399999999726 max: 1.3333329999999997 x̄: 0.11 x̃: 0
HURT stats (rel) min: 0.10% max: 100.00% x̄: 2.79% x̃: 1.62%
95% mean confidence interval for arith value: 0.01 0.02
95% mean confidence interval for arith %-change: 0.07% 0.40%
Arith are HURT.
total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0
total vary in shared programs: 11157.75 -> 10222 (-8.39%)
vary in affected programs: 5643 -> 4707.25 (-16.58%)
helped: 3196
HURT: 0
helped stats (abs) min: 0.125 max: 1.875 x̄: 0.29 x̃: 0
helped stats (rel) min: 2.78% max: 75.00% x̄: 18.49% x̃: 15.00%
95% mean confidence interval for vary value: -0.30 -0.29
95% mean confidence interval for vary %-change: -18.88% -18.11%
Vary are helped.
total ldst in shared programs: 146420 -> 140270 (-4.20%)
ldst in affected programs: 66027 -> 59877 (-9.31%)
helped: 2942
HURT: 10
helped stats (abs) min: 1.0 max: 19.0 x̄: 2.09 x̃: 2
helped stats (rel) min: 0.90% max: 100.00% x̄: 16.81% x̃: 8.33%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 2.22% max: 50.00% x̄: 13.03% x̃: 3.33%
95% mean confidence interval for ldst value: -2.15 -2.02
95% mean confidence interval for ldst %-change: -17.53% -15.89%
Ldst are helped.
total quadwords in shared programs: 1398329 -> 1392117 (-0.44%)
quadwords in affected programs: 704641 -> 698429 (-0.88%)
helped: 3677
HURT: 1299
helped stats (abs) min: 1.0 max: 26.0 x̄: 2.51 x̃: 1
helped stats (rel) min: 0.10% max: 26.92% x̄: 2.64% x̃: 1.89%
HURT stats (abs) min: 1.0 max: 20.0 x̄: 2.31 x̃: 1
HURT stats (rel) min: 0.11% max: 44.44% x̄: 2.34% x̃: 1.55%
95% mean confidence interval for quadwords value: -1.34 -1.16
95% mean confidence interval for quadwords %-change: -1.44% -1.25%
Quadwords are helped.
total threads in shared programs: 35234 -> 35311 (0.22%)
threads in affected programs: 119 -> 196 (64.71%)
helped: 91
HURT: 14
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.60 0.87
95% mean confidence interval for threads %-change: 70.08% 89.92%
Threads are helped.
total loops in shared programs: 125 -> 125 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 149 -> 144 (-3.36%)
spills in affected programs: 22 -> 17 (-22.73%)
helped: 1
HURT: 0
total fills in shared programs: 966 -> 956 (-1.04%)
fills in affected programs: 44 -> 34 (-22.73%)
helped: 1
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090 >
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig
e0e63c2a8e
pan/bi: Specialize IDVS in NIR
...
It's a bit more code, but it's needed to chew through control flow since we
don't have a backend version of dead_cf. Results are really good, meaning I
really screwed this up the first time around (hence the cc mesa-stable).
total instructions in shared programs: 1963576 -> 1939513 (-1.23%)
instructions in affected programs: 671053 -> 646990 (-3.59%)
helped: 4436
HURT: 729
helped stats (abs) min: 1.0 max: 43.0 x̄: 5.75 x̃: 6
helped stats (rel) min: 0.21% max: 100.00% x̄: 6.47% x̃: 5.17%
HURT stats (abs) min: 1.0 max: 22.0 x̄: 2.01 x̃: 1
HURT stats (rel) min: 0.50% max: 50.00% x̄: 10.45% x̃: 9.09%
95% mean confidence interval for instructions value: -4.77 -4.55
95% mean confidence interval for instructions %-change: -4.36% -3.80%
Instructions are helped.
total tuples in shared programs: 1533335 -> 1523194 (-0.66%)
tuples in affected programs: 483167 -> 473026 (-2.10%)
helped: 3414
HURT: 1288
helped stats (abs) min: 1.0 max: 20.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.27% max: 100.00% x̄: 4.87% x̃: 3.03%
HURT stats (abs) min: 1.0 max: 19.0 x̄: 2.02 x̃: 1
HURT stats (rel) min: 0.24% max: 38.10% x̄: 8.10% x̃: 5.88%
95% mean confidence interval for tuples value: -2.28 -2.03
95% mean confidence interval for tuples %-change: -1.62% -1.02%
Tuples are helped.
total clauses in shared programs: 351432 -> 329158 (-6.34%)
clauses in affected programs: 142237 -> 119963 (-15.66%)
helped: 5328
HURT: 3
helped stats (abs) min: 1.0 max: 43.0 x̄: 4.18 x̃: 4
helped stats (rel) min: 0.74% max: 100.00% x̄: 19.44% x̃: 17.24%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 9.09% max: 12.50% x̄: 10.90% x̃: 11.11%
95% mean confidence interval for clauses value: -4.25 -4.11
95% mean confidence interval for clauses %-change: -19.72% -19.12%
Clauses are helped.
total cycles in shared programs: 202830.92 -> 172084.50 (-15.16%)
cycles in affected programs: 117078.42 -> 86332 (-26.26%)
helped: 5450
HURT: 1
helped stats (abs) min: 0.083333 max: 49.0 x̄: 5.64 x̃: 5
helped stats (rel) min: 1.42% max: 100.00% x̄: 27.94% x̃: 25.64%
HURT stats (abs) min: 0.25 max: 0.25 x̄: 0.25 x̃: 0
HURT stats (rel) min: 2.46% max: 2.46% x̄: 2.46% x̃: 2.46%
95% mean confidence interval for cycles value: -5.74 -5.54
95% mean confidence interval for cycles %-change: -28.30% -27.58%
Cycles are helped.
total arith in shared programs: 57274.29 -> 57145.04 (-0.23%)
arith in affected programs: 16418.33 -> 16289.08 (-0.79%)
helped: 2442
HURT: 1784
helped stats (abs) min: 0.041665999999999315 max: 0.75 x̄: 0.14 x̃: 0
helped stats (rel) min: 0.23% max: 100.00% x̄: 5.51% x̃: 2.87%
HURT stats (abs) min: 0.041665999999999315 max: 0.9166670000000003 x̄: 0.12 x̃: 0
HURT stats (rel) min: 0.00% max: 100.00% x̄: 25.13% x̃: 9.09%
95% mean confidence interval for arith value: -0.04 -0.03
95% mean confidence interval for arith %-change: 6.61% 8.24%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).
total texture in shared programs: 12857 -> 12857 (0.00%)
texture in affected programs: 0 -> 0
helped: 0
HURT: 0
total vary in shared programs: 11157.75 -> 11157.75 (0.00%)
vary in affected programs: 0 -> 0
helped: 0
HURT: 0
total ldst in shared programs: 177208 -> 146420 (-17.37%)
ldst in affected programs: 117098 -> 86310 (-26.29%)
helped: 5447
HURT: 0
helped stats (abs) min: 1.0 max: 49.0 x̄: 5.65 x̃: 5
helped stats (rel) min: 1.92% max: 100.00% x̄: 27.91% x̃: 25.64%
95% mean confidence interval for ldst value: -5.75 -5.55
95% mean confidence interval for ldst %-change: -28.27% -27.56%
Ldst are helped.
total quadwords in shared programs: 1436507 -> 1398329 (-2.66%)
quadwords in affected programs: 515101 -> 476923 (-7.41%)
helped: 5150
HURT: 111
helped stats (abs) min: 1.0 max: 39.0 x̄: 7.46 x̃: 6
helped stats (rel) min: 0.17% max: 100.00% x̄: 10.02% x̃: 8.24%
HURT stats (abs) min: 1.0 max: 9.0 x̄: 2.01 x̃: 1
HURT stats (rel) min: 0.43% max: 21.62% x̄: 3.57% x̃: 1.94%
95% mean confidence interval for quadwords value: -7.41 -7.11
95% mean confidence interval for quadwords %-change: -9.98% -9.49%
Quadwords are helped.
total threads in shared programs: 35025 -> 35228 (0.58%)
threads in affected programs: 218 -> 421 (93.12%)
helped: 208
HURT: 5
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.91 0.99
95% mean confidence interval for threads %-change: 93.40% 99.55%
Threads are helped.
total loops in shared programs: 128 -> 125 (-2.34%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
total spills in shared programs: 158 -> 149 (-5.70%)
spills in affected programs: 15 -> 6 (-60.00%)
helped: 9
HURT: 0
total fills in shared programs: 1133 -> 966 (-14.74%)
fills in affected programs: 197 -> 30 (-84.77%)
helped: 9
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090 >
2022-02-22 15:21:09 +00:00
Alyssa Rosenzweig
3c1021cd1e
panvk: Use more reliable assert for UBO pushing
...
The important thing isn't the number of words pushed, it's that there are no
UBOs required for us to upload. Check that instead.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15090 >
2022-02-22 15:21:09 +00:00
Georg Lehmann
d223b7f096
radv, aco: Add u_foreach_bit to .clang-format.
...
Signed-off-by: Georg Lehmann <dadschoorse@gmail.com >
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15083 >
2022-02-22 14:57:29 +00:00
Xaver Hugl
a6be12fdad
gbm: improve documentation about the lifetime of resources
...
Signed-off-by: Xaver Hugl <xaver.hugl@gmail.com >
Reviewed-by: Simon Ser <contact@emersion.fr >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10906 >
2022-02-22 14:42:52 +01:00
Marek Olšák
62074cb4ac
ac: update shadowed registers
...
based on PAL
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
e74929bfef
radeonsi: move Arcturus code outside the gfx9 branch
...
preparation for a future commit
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
c740fd18ba
ac/llvm: replace structured by vindex != NULL in ac_build_buffer_store_common
...
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
* - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
* - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.
so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
1038382baf
ac/llvm: replace structured by vindex != NULL in ac_build_tbuffer_store
...
"raw" (IDXEN=0) and "structured" (IDXEN=1) do bounds checking differently.
From `si_make_buffer_descriptor`:
* - For VMEM and inst.IDXEN == 0 or STRIDE == 0, it's in byte units.
* - For VMEM and inst.IDXEN == 1 and STRIDE != 0, it's in units of STRIDE.
so there is a difference between setting vindex = i32_0 and vindex = NULL.
Instead of having the `structured` flag, we can just check if vindex is NULL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
c8e2c6faf6
radeonsi: use SET_SH_REG_INDEX with index=3 for registers containing CU_EN
...
This matches PAL and RADV behavior. It's for preemption.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
79a7ab642a
ac/surface: add more elements to meta equations because HTILE can use them
...
according to gfx10SwizzlePattern.h
Fixes: 9fabbf2150 - ac/surface: copy the HTILE equations to the surface
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
9a28f79f7b
ac/surface/tests: fix missing NUM_PKRS extraction in test_modifier
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
12e00be09b
radeonsi: apply the LLVM discard bug workaround to LLVM 13 only
...
It was fixed in LLVM 14.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Marek Olšák
21f169b2fb
ac,radeonsi: rework and optimize how TMPRING_SIZE is set
...
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00
Yogesh Mohan Marimuthu
3c7df183a3
radeonsi: prepare clamp, alpha test before mrtz prepare
...
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15098 >
2022-02-22 11:41:04 +00:00