Emma Anholt
d024eb6fab
glsl: Remove stale lower_instructions comments.
...
Should have been in 3a42e92a4f ("glsl: Drop the dead MOD_TO_FLOOR path.")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823 >
2022-06-07 02:38:42 +00:00
Emma Anholt
8c4b88ee48
gallium+glsl: Remove EmitNoSat/PIPE_CAP_VERTEX_SHADER_SATURATE
...
The drivers not setting it were:
- nv30, which gets lowering using NIR's lower_fsat flag.
- r300, which gets lowering using NIR's lower_fsat flag.
- a2xx, which has was getting it optimized back to fsat anyway.
This drops the check for the cap from gallium nine. While nine does have
a non-nir path, I think it's safe to assume that if you have SM3
texturing, you can do fsat.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823 >
2022-06-07 02:38:42 +00:00
Emma Anholt
7a8e3c80fd
nouveau/nv30: Make sure fsat is lowered in the VS.
...
GLSL lowers fsat to clamps based on PIPE_CAP_VERTEX_SHADER_SATURATE
(EmitNoSat), but nir is happy to optimize that back to fsat unless you
tell it not to.
Noticed by inspection while looking at deleting EmitNoSat.
Fixes: ca1ec72726 ("nv30/40: Switch to using NIR-to-TGSI by default.")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com >
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com >
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16823 >
2022-06-07 02:38:42 +00:00
Qiang Yu
61c500ee9b
radeonsi: replace llvm ls/hs interface lds ops with nir lowered ones
...
Use ac nir lower pass to generate these lds load/store ops explicitly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Timur Kristóf
f7f2770e72
ac/nir: Add remappability to tess and ESGS I/O lowering passes.
...
This will be used for radeonsi to map common I/O location to fixed
slots agreed by different shader stages.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
666dbbf1a3
ac/nir: skip gl_Layer/gl_ViewportIndex write for LS
...
This is from radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
87dfff3e6b
radeonsi: add tcs_vgpr_only_inputs parameter to si_get_nir_shader
...
Will be used later.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
47dd3525fb
radeonsi: implement load_lshs_vertex_stride abi
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
6a95452ddf
ac/nir: use nir_intrinsic_load_lshs_vertex_stride_amd
...
For radeonsi which pass this value by argument.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
33b4b923ee
nir: add nir_intrinsic_load_lshs_vertex_stride_amd
...
For loading LS-HS vertex stride by shader argument in radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:14 +00:00
Qiang Yu
e35ff669b5
ac/llvm: get back nir_intrinsic_load_tess_rel_patch_id_amd
...
radeonsi will use it. This can be removed again after radeonsi support
radv_nir_lower_abi like lower pass.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16418 >
2022-06-07 01:40:13 +00:00
Timothy Arceri
4237932685
glsl: tidy up link_varyings_and_uniforms()
...
All uniform linking is now done via nir based linker not via this
code so we drop that from its name. We also drop a bunch of unused
parameters.
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16880 >
2022-06-07 01:11:19 +00:00
Timothy Arceri
f00be793e4
glsl: drop extra optimise swizzles call
...
As per the comment this was meant to tidy things up after varying
linking but varying linking has been moved into a nir based linker
so this extra call is no longer needed.
This optimisation pass is still called in the regular glsl ir
optimisation loop.
No shader-db change on Iris (BDW).
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16880 >
2022-06-07 01:11:19 +00:00
Emma Anholt
7af5929b54
turnip: Move tile loads back into the draw CS.
...
Now that we don't need to know if HW binning actually will get used or
not, we can just emit the tile loads into the start of the draw CS.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Danylo Piliaiev
ecabd3b5a9
turnip: Allow nested CP_COND_REG_EXEC
...
This ends up being needed for moving tile loads into the draw cs.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
a92fad45e9
turnip: Allow load/store skipping in vkCmdClearAttachments().
...
We have to use a 3D draw to make it possible (so it goes through the
binner's visibility calcs), but hopefully the increased overhead for apps
with non-skippable rendering balances against skipping in others.
The real motivation is to get draw-time state out of tile load setup.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
b8619ef343
turnip: Refactor a bit of subpass attachment processing.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
83ae4a5ed4
turnip: Include 3d-based CmdClearAttachments() in binning visibility.
...
It means the clear's draw can get skipped when it doesn't intersect with
the tile.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
48403628a2
turnip: Refactor a bit of repeated code for subpass setup.
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
5b119c0148
ci/turnip: Add a little forced touch-testing of XFB with no binning requested.
...
This is just a couple of seconds of runtime.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
046438b7a4
turnip: Use fb->binning_possible to decide on conditional tile load/stores.
...
When !fb->binning but fb->binning_possible, we can just set the VSC
per-tile visibility reg to all visible in the "whoops, we'd rather not bin
but we had to anyway for XFB" case. This gets that EndRenderPass state out
of tile_load_cs/store_cs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
6c37b4ded1
turnip: Move binning decisions from FB usage time to FB creation time.
...
This is mostly about helping me understand which choices are constant for the object as opposed to runtime decisions.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
ceeaac340a
turnip: Refactor a bit of tu6_emit_tile_select().
...
Reduce redundant code, make the used SET_VISIBILITY_OVERRIDE value clearer.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
2cad0dd03b
turnip: Don't bother creating tile_load/store_cs for sysmem rendering.
...
They won't get called, so don't bother.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16826 >
2022-06-07 00:00:28 +00:00
Emma Anholt
f69aa01c4e
ci/i915: Update manual piglit job expectations.
...
These shaders are near the instruction count limit, and something changed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16896 >
2022-06-06 21:48:11 +00:00
Emma Anholt
5d0f36d826
ci/i915: Merge the piglit and deqp runs.
...
One less button to click.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16896 >
2022-06-06 21:48:11 +00:00
Nagappa Koppad, Basanagouda
a99e85db9e
iris:Duplicate DRM fd internally instead of reuse.
...
Scenario we want to avoid is double close of DRM fd in iris driver.
Signed-off-by: Nagappa Koppad, Basanagouda <basanagouda.nagappa.koppad@intel.com >
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6620
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com >
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16886 >
2022-06-06 20:04:28 +00:00
Alyssa Rosenzweig
feb9020039
panfrost: Enable Mali-G57
...
Everything required for conformant OpenGL ES 3.1 support on Valhall (v9) is now
upstream -- all that's left is to enable implementations! Add the GPU ID for the
Mali-G57 implemented in the MediaTek MT8192 system-on-chip.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16890 >
2022-06-06 19:30:15 +00:00
Rhys Perry
40db52488b
aco: consider fma with multiplication by power-of-two unfused
...
fossil-db (Sienna Cichlid):
Totals from 700 (0.43% of 162353) affected shaders:
MaxWaves: 18986 -> 18990 (+0.02%)
Instrs: 546475 -> 539729 (-1.23%); split: -1.24%, +0.00%
CodeSize: 2823716 -> 2808504 (-0.54%); split: -0.55%, +0.01%
VGPRs: 25304 -> 25288 (-0.06%)
Latency: 2180102 -> 2168187 (-0.55%); split: -0.55%, +0.01%
InvThroughput: 466223 -> 457326 (-1.91%)
VClause: 6768 -> 6797 (+0.43%); split: -0.01%, +0.44%
SClause: 12235 -> 12237 (+0.02%); split: -0.22%, +0.24%
Copies: 34498 -> 34479 (-0.06%); split: -0.21%, +0.15%
PreVGPRs: 20968 -> 20958 (-0.05%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15862 >
2022-06-06 19:06:01 +00:00
Qiang Yu
6489af145c
mesa: enable HardwareAcceleratedSelect
...
Could be enabled/disabled by MESA_HW_ACCEL_SELECT.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
e8658adaa8
virgl: return -1 for PIPE_CAP_ACCELERATED
...
There's no way currently in virgl to determine whether it's running
above CPU or GPU. This info will be used to disable HW SELECT.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
1b3fd8b3d2
zink: reset PIPE_CAP_ACCELERATED when cpu soft rendering
...
This field can be used to disable some unsupport/unproper hardware
acceleration. Reset it when zink is runing on cpu rendering.
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
9b22ab4167
mesa/st: implement hardware accelerated GL_SELECT
...
Use an internal geometry shader to handle input primitives. Do full
accurate culling and clipping in the shader and output hit result and
min/max depth to a SSBO for final being written to select buffer.
With multiple result slots in SSBO we can left multiple draws on the
fly and wait them done when buffer is full or exit GL_SELECT mode.
This provides quicker selection response compared to software based
solution. Tested on Discovery Studio 2020: some complex model needs
1~2s selection response time originally, now it's almost selected
immidiately.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
19f3737262
mesa: pass select result buffer offset as attribute/varying
...
Will be used by geometry shader to store hit result.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
c41ac0682e
mesa: add HWSelectModeBeginEnd dispatch table
...
Used when in glBegin/End section and HW GL_RENDER mode.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
8373248cf0
mesa: set CurrentServerDispatch too when glBegin/End
...
When glthread not enabled, CurrentClientDispatch and CurrentServerDispatch
should be same. This does not cause problems before because OutsideBeginEnd
and BeginEnd have same BeginEnd entries, so when
CurrentServerDispatch==OutsideBeginEnd
CurrentClientDispatch==BeginEnd
will call into same BeginEnd _mesa_* functions.
But we'll add another dispatch table to replace BeginEnd when HW GL_SELECT
mode, so this needs to be fixed. Otherwise some function like _mesa_Rectf
which always call with CurrentServerDispatch will go into wrong entries.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
90b34c9184
mapi: add api setup header for hw select mode
...
Used by GL_SELECT mode dispatch table setup.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
f890b49c29
mesa/vbo: enclose none-vertex functions with HW_SELECT_MODE
...
For constructing dispatch table used in GL_SELECT mode. Every vertex
inserted need to also insert a name stack offset attribute.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
d231f95591
mesa: add hw select name stack code path
...
HW code path will not flush vertex whenever name stack change.
It will save the current name stack and write to select buffer
only when no space left or exit select mode.
This let us submit multi draws from different name stack at
once instead of submit draws for a single name stack then
wait it finish before submit next one.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
429c7fbaa1
mesa: refine name stack code to prepare for hw select
...
No functional change, just pack existing software based implementation into
the HardwareAcceleratedSelect switch, will add hardware implementation in
next commit.
ctx->Select.NameStackDepth is sure to be <=MAX_NAME_STACK_DEPTH, so removed
the overflow check in _mesa_LoadName.
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Sgined-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
b3ba33b6f1
mesa: add _mesa_bufferobj_get_subdata
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
2224d6c35d
mesa: add hardware accelerated select constant
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
ff8ae4e589
nir/builder: add load/store array variable helper functions
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
1ef734cde6
mesa/vbo: remove unused vbo_context->binding
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Qiang Yu
feea8fed44
mesa/program: fix nir output reg overflow
...
outputs_written is uint64_t, should count max reg number
by util_last_bit64(). Otherwise the following access will
overflow the allocated array with a smaller size.
cc: mesa-stable
Reviewed-by: Marek Olšák <marek.olsak@amd.com >
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com >
Signed-off-by: Qiang Yu <yuq825@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15765 >
2022-06-06 18:23:49 +00:00
Alyssa Rosenzweig
28801cfba0
pan/va: Unit test constant lowering pass
...
Like other optimizations, breaking this pass may not affect functional
correctness. It's also dead simple to unit test the pass, so we have no excuse
not to. Add unit tests for the functionality we currently support, since we just
extended it and want to make sure everything still works.
This includes tests for use of modifiers to get more small constants. There are
lots of subtle gotchas there, so let's add lots of unit tests to make sure we
got it right.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862 >
2022-06-06 18:10:24 +00:00
Alyssa Rosenzweig
9cfafbb09b
pan/va: Try widening small constants
...
Many small integers are availabled as small constants, but the table of small
constants is tightly packed. Zero and sign extensions are usually required to
access small integers. When packing constants, try zero/sign extension for
unsigned/signed integer instructions respectively.
total instructions in shared programs: 2716912 -> 2707795 (-0.34%)
instructions in affected programs: 1045609 -> 1036492 (-0.87%)
helped: 4460
HURT: 125
helped stats (abs) min: 1.0 max: 58.0 x̄: 2.14 x̃: 1
helped stats (rel) min: 0.14% max: 23.85% x̄: 1.35% x̃: 0.88%
HURT stats (abs) min: 1.0 max: 68.0 x̄: 3.41 x̃: 1
HURT stats (rel) min: 0.34% max: 3.88% x̄: 0.93% x̃: 0.70%
95% mean confidence interval for instructions value: -2.09 -1.89
95% mean confidence interval for instructions %-change: -1.33% -1.25%
Instructions are helped.
total cycles in shared programs: 141984.06 -> 141932.42 (-0.04%)
cycles in affected programs: 552.08 -> 500.44 (-9.35%)
helped: 18
HURT: 0
helped stats (abs) min: 0.015625 max: 11.0 x̄: 2.87 x̃: 0
helped stats (rel) min: 0.50% max: 19.64% x̄: 5.36% x̃: 1.53%
95% mean confidence interval for cycles value: -5.17 -0.56
95% mean confidence interval for cycles %-change: -9.28% -1.44%
Cycles are helped.
total cvt in shared programs: 13805.05 -> 13663.39 (-1.03%)
cvt in affected programs: 6127.45 -> 5985.80 (-2.31%)
helped: 4460
HURT: 125
helped stats (abs) min: 0.015625 max: 0.90625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.35% max: 50.00% x̄: 5.19% x̃: 4.00%
HURT stats (abs) min: 0.015625 max: 1.0625 x̄: 0.05 x̃: 0
HURT stats (rel) min: 0.77% max: 9.30% x̄: 3.40% x̃: 2.78%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -5.10% -4.81%
Cvt are helped.
total ls in shared programs: 129545 -> 129494 (-0.04%)
ls in affected programs: 495 -> 444 (-10.30%)
helped: 6
HURT: 0
helped stats (abs) min: 2.0 max: 11.0 x̄: 8.50 x̃: 11
helped stats (rel) min: 1.49% max: 19.64% x̄: 13.95% x̃: 19.64%
95% mean confidence interval for ls value: -12.68 -4.32
95% mean confidence interval for ls %-change: -23.23% -4.67%
Ls are helped.
total quadwords in shared programs: 1476416 -> 1469824 (-0.45%)
quadwords in affected programs: 121208 -> 114616 (-5.44%)
helped: 820
HURT: 16
helped stats (abs) min: 8.0 max: 32.0 x̄: 8.28 x̃: 8
helped stats (rel) min: 1.39% max: 50.00% x̄: 11.00% x̃: 10.00%
HURT stats (abs) min: 8.0 max: 32.0 x̄: 12.50 x̃: 8
HURT stats (rel) min: 1.38% max: 10.00% x̄: 6.19% x̃: 7.14%
95% mean confidence interval for quadwords value: -8.14 -7.63
95% mean confidence interval for quadwords %-change: -11.20% -10.15%
Quadwords are helped.
total threads in shared programs: 53633 -> 53663 (0.06%)
threads in affected programs: 39 -> 69 (76.92%)
helped: 33
HURT: 3
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.64 1.02
95% mean confidence interval for threads %-change: 73.27% 101.73%
Threads are helped.
total spills in shared programs: 154 -> 103 (-33.12%)
spills in affected programs: 75 -> 24 (-68.00%)
helped: 6
HURT: 0
total fills in shared programs: 656 -> 656 (0.00%)
fills in affected programs: 148 -> 148 (0.00%)
helped: 2
HURT: 4
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862 >
2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig
72146051d5
pan/va: Try negating small constants when lowering
...
If a constant is used with a floating point instruction with a floating-point
negate modifier, we can use the modifier to negate constants in the table for
free. Each floating point in the table is positive, so this is required for
negative small constants.
total instructions in shared programs: 2728438 -> 2716912 (-0.42%)
instructions in affected programs: 1418220 -> 1406694 (-0.81%)
helped: 6053
HURT: 94
helped stats (abs) min: 1.0 max: 43.0 x̄: 1.94 x̃: 1
helped stats (rel) min: 0.06% max: 18.18% x̄: 1.34% x̃: 0.84%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 2.34 x̃: 2
HURT stats (rel) min: 0.09% max: 21.43% x̄: 1.87% x̃: 0.91%
95% mean confidence interval for instructions value: -1.93 -1.82
95% mean confidence interval for instructions %-change: -1.34% -1.25%
Instructions are helped.
total cycles in shared programs: 142103 -> 141984.06 (-0.08%)
cycles in affected programs: 766.70 -> 647.77 (-15.51%)
helped: 97
HURT: 0
helped stats (abs) min: 0.015625 max: 40.0 x̄: 1.23 x̃: 0
helped stats (rel) min: 0.27% max: 41.24% x̄: 3.63% x̃: 2.08%
95% mean confidence interval for cycles value: -2.41 -0.04
95% mean confidence interval for cycles %-change: -4.68% -2.57%
Cycles are helped.
total cvt in shared programs: 13983.34 -> 13805.05 (-1.28%)
cvt in affected programs: 7952.45 -> 7774.16 (-2.24%)
helped: 6049
HURT: 98
helped stats (abs) min: 0.015625 max: 0.359375 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.25% max: 100.00% x̄: 4.74% x̃: 2.52%
HURT stats (abs) min: 0.015625 max: 0.078125 x̄: 0.04 x̃: 0
HURT stats (rel) min: 0.17% max: 100.00% x̄: 5.48% x̃: 2.54%
95% mean confidence interval for cvt value: -0.03 -0.03
95% mean confidence interval for cvt %-change: -4.83% -4.32%
Cvt are helped.
total ls in shared programs: 129660 -> 129545 (-0.09%)
ls in affected programs: 601 -> 486 (-19.13%)
helped: 7
HURT: 0
helped stats (abs) min: 3.0 max: 40.0 x̄: 16.43 x̃: 8
helped stats (rel) min: 2.88% max: 41.24% x̄: 17.41% x̃: 12.50%
95% mean confidence interval for ls value: -31.42 -1.44
95% mean confidence interval for ls %-change: -29.25% -5.58%
Ls are helped.
total quadwords in shared programs: 1482728 -> 1476416 (-0.43%)
quadwords in affected programs: 131200 -> 124888 (-4.81%)
helped: 798
HURT: 15
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.06 x̃: 8
helped stats (rel) min: 0.34% max: 50.00% x̄: 10.15% x̃: 6.67%
HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel) min: 1.49% max: 100.00% x̄: 11.25% x̃: 2.78%
95% mean confidence interval for quadwords value: -7.92 -7.60
95% mean confidence interval for quadwords %-change: -10.52% -8.99%
Quadwords are helped.
total threads in shared programs: 53585 -> 53633 (0.09%)
threads in affected programs: 51 -> 99 (94.12%)
helped: 49
HURT: 1
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.88 1.04
95% mean confidence interval for threads %-change: 90.97% 103.03%
Threads are helped.
total spills in shared programs: 125 -> 154 (23.20%)
spills in affected programs: 75 -> 104 (38.67%)
helped: 3
HURT: 4
total fills in shared programs: 800 -> 656 (-18.00%)
fills in affected programs: 476 -> 332 (-30.25%)
helped: 7
HURT: 0
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862 >
2022-06-06 18:10:23 +00:00
Alyssa Rosenzweig
cecfa0c44a
pan/va: Record which instructions are signed
...
We need to distinguish signed integer instructions from unsigned integer
instructions, to distinguish sign-extension and zero-extension of sources.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16862 >
2022-06-06 18:10:23 +00:00
Rhys Perry
f4c02d9116
aco: fix SMEM load_global with VGPR address and non-zero offset
...
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com >
Fixes: 3e9517c757 ("aco: implement _amd global access intrinsics")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16775 >
2022-06-06 17:47:59 +00:00