AlexIndustrial/mesa

Author	SHA1	Message	Date
Rhys Perry	bf0af80045	aco: improve VcmpxPermlaneHazard workaround According to LLVM, we only need to care about VOPC which writes exec. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17697>	2022-08-08 13:59:17 +00:00
Rhys Perry	5912c7d3fa	aco: only add vscnt wait when visiting VMEM/DS This prevents issues where we insert a s_waitcnt_vscnt(0) at the start of a block or very end of the shader because we're joining two blocks (for example, one with has_VMEM=true and the other with has_branch_after_DS=true). fossil-db (navi10): Totals from 2441 (1.51% of 161220) affected shaders: Instrs: 1383964 -> 1384094 (+0.01%); split: -0.07%, +0.08% CodeSize: 7438212 -> 7438760 (+0.01%); split: -0.05%, +0.06% Latency: 13780665 -> 13679664 (-0.73%); split: -1.53%, +0.80% InvThroughput: 2950835 -> 2921511 (-0.99%); split: -1.06%, +0.07% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17697>	2022-08-08 13:59:17 +00:00
Rhys Perry	52156d6b26	aco: set has_VMEM,has_DS=false after a branch fossil-db (navi10): Totals from 161 (0.10% of 161220) affected shaders: Instrs: 206726 -> 207179 (+0.22%); split: -0.02%, +0.24% CodeSize: 1114152 -> 1116032 (+0.17%); split: -0.01%, +0.18% Latency: 2119380 -> 2147403 (+1.32%); split: -0.16%, +1.48% InvThroughput: 462960 -> 461922 (-0.22%); split: -0.42%, +0.19% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17697>	2022-08-08 13:59:17 +00:00
Rhys Perry	b17e59a03b	aco: fix LdsBranchVmemWARHazard with 2+ branch chains For example, "DS -> branch -> VMEM -> branch -> DS". fossil-db (navi10): Totals from 639 (0.40% of 161220) affected shaders: Instrs: 629090 -> 628254 (-0.13%); split: -0.19%, +0.06% CodeSize: 3410164 -> 3406748 (-0.10%); split: -0.14%, +0.04% Latency: 7834755 -> 7821011 (-0.18%); split: -0.70%, +0.52% InvThroughput: 1369698 -> 1374495 (+0.35%); split: -0.12%, +0.47% A lot of the fossil-db changes are noise. threekingdoms.8db138826c386a62.1.foz/0b222ed175eebad0 is an example of a shader that actually has this issue. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c037ba1bb7` ("aco/gfx10: Mitigate LdsBranchVmemWARHazard.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17697>	2022-08-08 13:59:17 +00:00
Marek Olšák	39800f0fa3	amd: change chip_class naming to "enum amd_gfx_level gfx_level" This aligns the naming with PAL. Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16469>	2022-05-13 14:56:22 -04:00
Tony Wasserka	0812d440c7	aco: Use std::vector for the underlying container of std::stack By default, std::stack uses std::deque to allocate its elements, which has poor cache efficiency. std::vector makes appending elements more expensive (due to potential reallocations), but in the changed contexts the element count should always be low anyway. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11925>	2021-10-01 09:39:13 +00:00
Christian Gmeiner	3d65cea6ee	util/bitset: s/BITSET_SET_RANGE/BITSET_SET_RANGE_INSIDE_WORD Prep work for the next commit. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11321>	2021-09-21 20:25:31 +00:00
Rhys Perry	f24f62f4ad	aco/nops: fix handle_raw_hazard_internal when visiting the current block Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12720>	2021-09-17 16:43:00 +00:00
Rhys Perry	a8cc911aaf	aco/nops: add State Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12720>	2021-09-17 16:43:00 +00:00
Rhys Perry	bdf7eed045	aco/nops: create handle_raw_hazard_instr helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12720>	2021-09-17 16:43:00 +00:00
Rhys Perry	bd07118b56	aco/nops: use up-to-date mask_size fossil-db (Pitcairn): Totals from 6 (0.00% of 129702) affected shaders: CodeSize: 8760 -> 8736 (-0.27%) Instrs: 1714 -> 1708 (-0.35%) Latency: 12325 -> 12302 (-0.19%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12720>	2021-09-17 16:43:00 +00:00
Tony Wasserka	66e51dc474	aco: Remove use of deprecated Operand constructors This migration was done with libclang-based automatic tooling, which performed these replacements: * Operand(uint8_t) -> Operand::c8 * Operand(uint16_t) -> Operand::c16 * Operand(uint32_t, false) -> Operand::c32 * Operand(uint32_t, bool) -> Operand::c32_or_c64 * Operand(uint64_t) -> Operand::c64 * Operand(0) -> Operand::zero(num_bytes) Casts that were previously used for constructor selection have automatically been removed (e.g. Operand((uint16_t)1) -> Operand::c16(1)). Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11653>	2021-07-13 17:43:26 +00:00
Daniel Schürmann	1e2639026f	aco: Format. Manually adjusted some comments for more intuitive line breaks. Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11258>	2021-07-12 21:27:31 +00:00
Daniel Schürmann	59fdaa1985	aco: reorder and cleanup #includes Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>	2021-07-12 12:09:31 +00:00
Rhys Perry	502a073552	aco: fix NSA following writelane No fossil-db changes on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c353895c92` ("aco: use non-sequential addressing") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>	2021-03-17 12:31:05 +00:00
Rhys Perry	194f3e4c69	aco: fix NSA MIMG followed by MUBUF/MTBUF No fossil-db changes on GFX10. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Fixes: `c353895c92` ("aco: use non-sequential addressing") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9187>	2021-03-17 12:31:05 +00:00
Tony Wasserka	97c97781f6	aco: Fix vector::reserve() being called with the wrong size The container is moved from before and hence returns size 0. To get the correct value, the new instruction container must be used instead. This was flagged by clang-tidy. The fixed call still triggers the corresponding diagnostic, hence this change silences it by adding a redundant clear() after move. Fixes: `7f1b537304` ("aco: add new NOP insertion pass for GFX6-9") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>	2021-03-08 10:44:20 +01:00
Rhys Perry	3d4c13f3b8	aco: add DeviceInfo Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8761>	2021-02-15 13:44:22 +00:00
Rhys Perry	e115b01948	aco: return references in instruction cast methods Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>	2021-01-22 14:12:33 +00:00
Rhys Perry	1d245cd18b	aco: use format-check methods Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>	2021-01-22 14:12:32 +00:00
Rhys Perry	70dbcfa1c9	aco: use instruction cast methods Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>	2021-01-22 14:12:32 +00:00
Rhys Perry	c68fba9bba	aco: update bug workarounds for GFX10_3 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5546>	2020-08-04 20:39:33 +01:00
Rhys Perry	f302ef3853	aco: use s_waitcnt_depctr to mitigate VMEMtoScalarWriteHazard Apparently this is potentially faster than v_nop: https://reviews.llvm.org/D83872 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>	2020-07-18 00:14:12 +00:00
Rhys Perry	bcf94bb933	aco: properly recognize that s_waitcnt mitigates VMEMtoScalarWriteHazard fossil-db (Navi): Totals from 555 (0.41% of 135946) affected shaders: CodeSize: 1005716 -> 1003400 (-0.23%) Instrs: 195326 -> 194744 (-0.30%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5923>	2020-07-18 00:14:12 +00:00
Samuel Pitoiset	2f424c83e0	aco: only break SMEM clauses if XNACK is enabled (mostly APUs) According to LLVM, it seems only required for APUs like RAVEN, but we still ensure that SMEM stores are in their own clause. pipeline-db (VEGA10): Totals from affected shaders: SGPRS: 1775364 -> 1775364 (0.00 %) VGPRS: 1287176 -> 1287176 (0.00 %) Spilled SGPRs: 725 -> 725 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 65386620 -> 65107460 (-0.43 %) bytes Max Waves: 287099 -> 287099 (0.00 %) pipeline-db (POLARIS10): Totals from affected shaders: SGPRS: 1797743 -> 1797743 (0.00 %) VGPRS: 1271108 -> 1271108 (0.00 %) Spilled SGPRs: 730 -> 730 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 64046244 -> 63782324 (-0.41 %) bytes Max Waves: 254875 -> 254875 (0.00 %) This only affects GFX6-GFX9 chips because the compiler uses a different pass for GFX10. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4349> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4349>	2020-04-01 17:50:31 +00:00
Rhys Perry	c6e0c062da	aco: improve control flow handling in GFX6-9 NOP pass Fixes Detroit: Become Human hang. Also affects World of Warships. pipeline-db (Tahiti): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) pipeline-db (Polaris): Totals from affected shaders: SGPRS: 17168 -> 17168 (0.00 %) VGPRS: 11296 -> 11296 (0.00 %) Spilled SGPRs: 1870 -> 1870 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1472628 -> 1473292 (0.05 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 628 -> 628 (0.00 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 17168 -> 17168 (0.00 %) VGPRS: 11296 -> 11296 (0.00 %) Spilled SGPRs: 1870 -> 1870 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1409716 -> 1410380 (0.05 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) Max Waves is lower than it should be because of a null winsys bug. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>	2020-03-05 19:37:24 +00:00
Rhys Perry	47b7f104a0	aco: consider non-hazard writes in handle_raw_hazard_internal I think this helps GFX6 in particular because code like this is common: s_add_i32 s4, 0x60, s3 s_mov_b32 s5, 0 s_load_dwordx4 s[4:7], s[4:5], 0x0 s_buffer_load_dword s4, s[4:7], 0xcc pipeline-db (Tahiti): Totals from affected shaders: SGPRS: 1923878 -> 1923878 (0.00 %) VGPRS: 1528964 -> 1528964 (0.00 %) Spilled SGPRs: 476 -> 476 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 88723604 -> 88528880 (-0.22 %) bytes LDS: 241 -> 241 (0.00 %) blocks Max Waves: 145402 -> 145402 (0.00 %) pipeline-db (Polaris): Totals from affected shaders: SGPRS: 428128 -> 428128 (0.00 %) VGPRS: 353092 -> 353092 (0.00 %) Spilled SGPRs: 119251 -> 119251 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 57580468 -> 57563964 (-0.03 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 11631 -> 11631 (0.00 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 425016 -> 425016 (0.00 %) VGPRS: 349588 -> 349588 (0.00 %) Spilled SGPRs: 117835 -> 117835 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 54890792 -> 54874432 (-0.03 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 54 -> 54 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>	2020-03-05 19:37:24 +00:00
Rhys Perry	38743577f8	aco: improve get_wait_states() pipeline-db (Tahiti): Totals from affected shaders: SGPRS: 21208 -> 21208 (0.00 %) VGPRS: 22388 -> 22388 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3278596 -> 3277004 (-0.05 %) bytes LDS: 19 -> 19 (0.00 %) blocks Max Waves: 238 -> 238 (0.00 %) pipeline-db (Polaris): Totals from affected shaders: SGPRS: 64 -> 64 (0.00 %) VGPRS: 96 -> 96 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5200 -> 5192 (-0.15 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 10 -> 10 (0.00 %) pipeline-db (Vega): Totals from affected shaders: SGPRS: 0 -> 0 (0.00 %) VGPRS: 0 -> 0 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 0 -> 0 (0.00 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 0 -> 0 (0.00 %) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>	2020-03-05 19:37:24 +00:00
Rhys Perry	7f1b537304	aco: add new NOP insertion pass for GFX6-9 This new pass is more similar to the GFX10 pass and should be able to handle control flow better. No pipeline-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4004>	2020-03-05 19:37:24 +00:00
Daniel Schürmann	71440ba0f5	aco: reorder VMEM operands in ACO IR For all VMEM instructions, the resource constant is now in operands[0]. For MIMG instructions, the sampler shares operands[1] with write data in case this instruction writes memory. Moving the VADDR to be the last operand for MIMG is the first step to support Navi NSA encoding. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>	2020-01-29 18:45:23 +00:00
Rhys Perry	e6c90e4af9	aco: fix WaR check for >64-bit FLAT/GLOBAL instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `5986e0019` ('aco: improve WAR hazard workaround with >64bit stores') Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3541>	2020-01-27 14:50:37 +00:00
Samuel Pitoiset	1ac49ba908	aco: fix a hazard with v_interp_* and v_{read,readfirst}lane_* on GFX6 It's required to insert 1 wait state if the dst VGPR of any v_interp_* is followed by a read with v_readfirstlane or v_readlane to fix GPU hangs on GFX6. Note that v_writelane_* is apparently not affected. This hazard isn't documented anywhere but AMD confirmed it. This fixes a GPU hang with the texturemipmapgen Sascha demo on GFX6. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3533>	2020-01-24 18:34:27 +00:00
Timur Kristóf	c787b8d2a1	aco/gfx10: Fix VcmpxExecWARHazard mitigation. The SOPP instruction shouldn't have a definition, and its block should be set to -1 in order to prevent it from being recognized as a branch. Also fix a typo in the readme. Fixes: `d6dfce02d0` Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3552>	2020-01-24 16:21:08 +00:00
Rhys Perry	70f63c1988	aco: improve support for s_sendmsg In particular, the messages needed for GS. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2421>	2020-01-24 13:35:07 +00:00
Daniel Schürmann	6a586a6006	aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Daniel Schürmann	6fc9ddfef8	aco: recognize SI/CI SMRD hazards Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-12-07 11:23:11 +01:00
Rhys Perry	5986e00194	aco: improve WAR hazard workaround with >64bit stores Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Rhys Perry	a9fc81b098	aco: add v_nop in between exec write and VMEM/DS/FLAT LLVM and the proprietary compiler seem to do this. Fixes: `b01847bd9` ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.") Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-11-29 17:46:01 +00:00
Daniel Schürmann	8657eede8a	aco: check if SALU instructions are preceded by exec when calculating WQM needs Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>	2019-11-14 17:27:10 +01:00
Rhys Perry	d4684a294b	aco: a couple loop handling fixes for GFX10 hazard pass It was joining from the wrong blocks and block.kind is a bitmask instead of an enum. Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>	2019-10-30 18:13:53 +00:00
Timur Kristóf	c580f134ae	aco: Refactor hazard mitigations, separate pass for GFX10. GFX10 hazards require a different approach compared to previous generations, for example it doesn't need s_nop, and most hazards can't be solved by adding NOPs at all. Also, they are not resolved by branch instructions. This commit reorganizes aco_insert_NOPs so that there is now a separate pass for GFX10. The new GFX10 pass also respects the control flow of the shader. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	b01847bd94	aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard. This commit refines the VMEMtoScalarWriteHazard mitigation, based upon a closer look at what LLVM does. Also changes the code to match the structure of the other hazard mitigations. * The hazard is not only triggered by VMEM, FLAT and GLOBAL but also SCRATCH and DS instructions. * The SMEM/SALU instructions only cause a hazard when they write a register that the VMEM/etc. are reading. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	c037ba1bb7	aco/gfx10: Mitigate LdsBranchVmemWARHazard. There is a hazard when there is a branch between a VMEM/GLOBAL/SCRATCH instruction and a DS instruction. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	09d676d81a	aco/gfx10: Mitigate SMEMtoVectorWriteHazard. There is a hazard that happens when an SMEM instruction reads an SGPR and then a VALU instruction writes that same SGPR. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	d6dfce02d0	aco/gfx10: Mitigate VcmpxExecWARHazard. There is a hazard when a non-VALU instruction reads the EXEC mask and then a VALU instruction writes the EXEC mask. This commit adds a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Timur Kristóf	e5a8616973	aco/gfx10: Mitigate VcmpxPermlaneHazard. Any permlane instruction that follows any VOPC instruction can cause a hazard. This commit implements a workaround that avoids the problem. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-25 10:10:42 +02:00
Rhys Perry	6a6bef59b0	aco: Initial work to avoid GFX10 hazards. Currently just breaks up SMEM groups and fixes FeatureVMEMtoScalarWriteHazard (name from LLVM). Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-By: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>	2019-10-10 09:57:53 +02:00
Daniel Schürmann	93c8ebfa78	aco: Initial commit of independent AMD compiler ACO (short for AMD Compiler) is a new compiler backend with the goal to replace LLVM for Radeon hardware for the RADV driver. ACO currently supports only VS, PS and CS on VI and Vega. There are some optimizations missing because of unmerged NIR changes which may decrease performance. Full commit history can be found at https://github.com/daniel-schuermann/mesa/commits/backend Co-authored-by: Daniel Schürmann <daniel@schuermann.dev> Co-authored-by: Rhys Perry <pendingchaos02@gmail.com> Co-authored-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Co-authored-by: Connor Abbott <cwabbott0@gmail.com> Co-authored-by: Michael Schellenberger Costa <mschellenbergercosta@googlemail.com> Co-authored-by: Timur Kristóf <timur.kristof@gmail.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-09-19 12:10:00 +02:00

48 Commits
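
Commit 0812d440c7 above switches the container underneath std::stack from the default std::deque to std::vector. The following is a minimal sketch of that idea only, not code from ACO: the alias VectorStack, the function visit_order, and the toy control-flow graph are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <stack>
#include <vector>

// Hypothetical alias (not an ACO name): a stack backed by std::vector
// instead of the default std::deque, so elements are stored contiguously.
template <typename T>
using VectorStack = std::stack<T, std::vector<T>>;

// Walks a small, made-up control-flow graph depth-first and returns the
// order in which block indices are popped from the worklist.
// successors[i] lists the successor block indices of block i.
std::vector<unsigned>
visit_order(const std::vector<std::vector<unsigned>>& successors, unsigned entry)
{
   assert(entry < successors.size());

   std::vector<bool> seen(successors.size(), false);
   std::vector<unsigned> order;
   VectorStack<unsigned> worklist;

   worklist.push(entry);
   seen[entry] = true;

   while (!worklist.empty()) {
      unsigned block = worklist.top();
      worklist.pop();
      order.push_back(block);

      for (unsigned succ : successors[block]) {
         if (!seen[succ]) {
            seen[succ] = true;
            worklist.push(succ); // may reallocate, which is cheap while the stack stays small
         }
      }
   }
   return order;
}
```

The std::stack interface (push, top, pop, empty) is the same for either container, so only the type declaration changes. As the commit message notes, the trade-off is that appending to a std::vector can reallocate, which is acceptable when the element count stays low.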