Rhys Perry
|
de7c6950b3
|
aco: fix sub-dword opsel/sdwa checks
These should all check if the operand has a regclass. The opsel check
should also be skipped post-RA.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5504>
|
2020-06-17 10:57:17 +00:00 |
|
Rhys Perry
|
1e791e51a6
|
aco: fix validation error from vgpr spill/restore code
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5504>
|
2020-06-17 10:57:17 +00:00 |
|
Daniel Schürmann
|
8006feda09
|
aco: don't allow SGPRs on logical phis
aco_validate() is called after phi lowering, now.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5496>
|
2020-06-16 14:46:19 +01:00 |
|
Daniel Schürmann
|
0e47fe3fa2
|
aco: reorder calls to aco_validate() and cleanup aco_compile_shader()
The first call of aco_validate should happen after phi lowering.
Otherwise, subdword restrictions might be violated
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5496>
|
2020-06-16 14:46:19 +01:00 |
|
Rob Clark
|
603d3c2991
|
radv: don't set num_components for non-vectorized intrinsics
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
|
2020-06-16 02:48:18 +00:00 |
|
Rhys Perry
|
a02e7f6799
|
aco: fix encoding of certain s_setreg_imm32_b32 instructions
If the mode is too small, the operand will be an inline constant and the
literal dword won't be written.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
82c265a514
|
aco: improve check for moving temporaries out of fixed definitions
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
e9578e3033
|
aco: allow GFX9 partial writes with instructions which use opsel
Some instructions such as v_mad_f16 can do partial writes on GFX9.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
82de70d06e
|
aco: add more opcodes to can_swap_operands
fossil-db (Navi, fp16 enabled):
Totals from 310 (0.24% of 127638) affected shaders:
CodeSize: 1290508 -> 1289716 (-0.06%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Samuel Pitoiset
|
3c1b55962e
|
aco: allow to swap operands for some 16-bit float instructions
No fossil-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
575b431c80
|
aco: validate sub-dword pseudo instructions
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
d16a7190a3
|
aco: optimize 16-bit and 64-bit float comparisons
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
22d7122739
|
aco: copy-propagate constants through p_extract_vector/p_split_vector
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4388 -> 4392 (+0.09%)
VMEM: 465 -> 458 (-1.51%)
Copies: 54 -> 55 (+1.85%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
3d6f67950d
|
aco: improve 8/16-bit constants
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4540 -> 4388 (-3.35%)
Instrs: 861 -> 830 (-3.60%)
Cycles: 3444 -> 3320 (-3.60%)
VMEM: 489 -> 465 (-4.91%)
SMEM: 107 -> 110 (+2.80%)
SClause: 31 -> 30 (-3.23%)
Copies: 58 -> 54 (-6.90%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
4784111abc
|
aco: use 32-bit inline constants for 16-bit integer instructions
See https://reviews.llvm.org/D81841
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
dd23345567
|
aco: fix half_pi constant for 16-bit fsin/fcos
This worked because the optimizer didn't consider that the 16-bit
instruction would interpret the inline constant differently. This will
change in the next commit.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
9b69ed0bb9
|
aco: improve sub-dword check for sgpr/constant propagation
p_create_vector can have sub-dword operands with a v1 definition.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
1210e0bd62
|
aco: create 16-bit input and output modifiers
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4552 -> 4540 (-0.26%)
Instrs: 863 -> 861 (-0.23%)
Cycles: 3452 -> 3444 (-0.23%)
VMEM: 490 -> 489 (-0.20%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
f5a5674178
|
aco: update comment about preserving fp16/fp64 denormals
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
7f511efa16
|
aco: create 16-bit mad/fma
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4868 -> 4552 (-6.49%)
Instrs: 956 -> 863 (-9.73%)
Cycles: 3824 -> 3452 (-9.73%)
VMEM: 504 -> 490 (-2.78%)
SMEM: 109 -> 107 (-1.83%)
VClause: 19 -> 20 (+5.26%)
Copies: 54 -> 58 (+7.41%)
PreVGPRs: 43 -> 41 (-4.65%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
1b10764e50
|
aco: try to use fma instead of mad when denormals are enabled
v_mad_f32 doesn't support denormals but v_fma_f32 does.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
6cb42cdd8f
|
aco: create mads when signed zeros should be preserved
This check was added because I thought v_mad_f32 didn't preserve the
signess of zero, but I can't reproduce that and this isn't mentioned
anywhere in LLVM.
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
1b6a319c15
|
aco: add and set precise flag
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
a8f800a836
|
aco: use p_as_uniform in emit_vop1_instruction
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
b6d9e45f47
|
aco: improve code for f2{i,u}{8,16}
Use sub-dword definitions so that the RA can use SDWA
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Rhys Perry
|
34d481fd1f
|
aco: use num_opcodes instead of last_opcode
No fossil-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
|
2020-06-15 18:24:22 +00:00 |
|
Samuel Pitoiset
|
013d096d15
|
ac: add ac_choose_spi_color_formats() to common code
It's similar between RADV and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5436>
|
2020-06-15 08:16:07 +02:00 |
|
Erik Faye-Lund
|
cd01d0dee1
|
radv: update internal reference
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4630>
|
2020-06-13 10:42:01 +00:00 |
|
Daniel Schürmann
|
1f98d8c804
|
aco: fix shared subdword loads
Shared subdword loads don't need byte alignment as they are split
into multiple loads if necessary.
Fixes: 5cde4989d3 ('aco: remove unnecessary split- and create_vector instructions for subdword loads')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5441>
|
2020-06-12 13:56:12 +00:00 |
|
Samuel Pitoiset
|
54e7ae569b
|
radv/llvm: implement radv_enable_mrt_output_nan_fixup workaround
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
|
2020-06-12 14:43:58 +02:00 |
|
Samuel Pitoiset
|
7b44f549b3
|
aco: implement radv_enable_mrt_output_nan_fixup workaround
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
|
2020-06-12 14:43:57 +02:00 |
|
Samuel Pitoiset
|
6f21995f98
|
radv: add new drirc option radv_enable_mrt_output_nan_fixup
To replace NaN from FS with zeros to fix game bugs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
|
2020-06-12 14:43:31 +02:00 |
|
Samuel Pitoiset
|
07aefe8065
|
radv: set DB_SHADER_CONTROL.CONSERVATIVE_Z_EXPORT correctly
Use the SPIR-V execution modes if set.
Cc: 20.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5404>
|
2020-06-12 08:22:47 +00:00 |
|
Mauro Rossi
|
7afdc549f4
|
android: aco: add aco_ir.cpp to Makefile.sources
Fixes the following building errors:
FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
ld.lld: error: undefined symbol: aco::can_use_SDWA(chip_class, std::__1::unique_ptr<aco::Instruction, aco::instr_deleter_functor> const&)
...
ld.lld: error: undefined symbol: aco::can_use_opsel(chip_class, aco_opcode, int, bool)
...
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: d9cfb8ad ("aco: validate instructions reading/writing upper halves/bytes")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5425>
|
2020-06-11 20:00:16 +02:00 |
|
Marek Olšák
|
0b3e344212
|
ac/surface: don't free dcc_retile_map on failure
because the hash table now owns it.
Fixes: bd553f0546 - ac/surface: cache DCC retile maps (v2)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
|
2020-06-11 10:01:57 +00:00 |
|
Marek Olšák
|
56f2a77a41
|
ac/surface: enable DCC for the first level in the mip tail on gfx10
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
|
2020-06-11 10:01:57 +00:00 |
|
Marek Olšák
|
7406ea37e6
|
ac/surface: require that gfx8 doesn't have DCC in order to be displayable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
|
2020-06-11 10:01:57 +00:00 |
|
Marek Olšák
|
374f6d568f
|
ac/surface: don't set is_displayable if displayable DCC is missing
If flags.display isn't set, then displayable DCC will not be computed, so
is_displayable will always be false.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
|
2020-06-11 10:01:57 +00:00 |
|
Marek Olšák
|
0fcf55329b
|
amd/addrlib: fix the C++ one definition rule violation
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1854
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5414>
|
2020-06-11 05:11:50 -04:00 |
|
Marek Olšák
|
bd553f0546
|
ac/surface: cache DCC retile maps (v2)
This reduces overhead when resizing windows or when allocating
similar image sizes over and over again.
v2: optimize the memory footprint of the cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
|
2020-06-10 15:35:46 +00:00 |
|
Marek Olšák
|
4cf674c8f7
|
ac/surface: add a wrapper structure to hold ADDR_HANDLE
and more things in the future.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
|
2020-06-10 15:35:46 +00:00 |
|
Marek Olšák
|
e6996d6fbd
|
amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
|
2020-06-10 15:35:46 +00:00 |
|
Marek Olšák
|
a99f4d5382
|
amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call
This decreases the DCC retile map overhead from 23% to 18%.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
|
2020-06-10 15:35:46 +00:00 |
|
Marek Olšák
|
a1b9eb62f6
|
ac/surface: don't recompute the DCC retile map for imported textures
The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.
This brings it down to 23%.
The dcc_retile_use_uint16 setting has to be derived from DCC sizes.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
|
2020-06-10 15:35:46 +00:00 |
|
Rhys Perry
|
1b2e1163b2
|
aco: fix moving sub-dword values out of a register for a fixed definition
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|
Rhys Perry
|
edf863d1d2
|
aco: use Info::definition_size instead of definition's regclass
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|
Rhys Perry
|
207c35cbe8
|
aco: add Info::{operand_size,definition_size}
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|
Rhys Perry
|
62ea429a99
|
aco: prefer 4-byte aligned definitions
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|
Rhys Perry
|
56345b8c61
|
aco: allow reading/writing upper halves/bytes when possible
Use SDWA, opsel or a different opcode to achieve this.
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|
Rhys Perry
|
98060ba0f0
|
aco: p_extract_vector in 64-bit u2f16/i2f16
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
|
2020-06-10 15:05:11 +00:00 |
|