AlexIndustrial/mesa

Author	SHA1	Message	Date
Alyssa Rosenzweig	f8b69ebdc2	hk: drop assert works fine without. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	ece3bd74db	agx: make imad+ishl rules actually work total instructions in shared programs: 2750211 -> 2750184 (<.01%) instructions in affected programs: 50499 -> 50472 (-0.05%) helped: 27 HURT: 0 Instructions are helped. total alu in shared programs: 2273669 -> 2273642 (<.01%) alu in affected programs: 29874 -> 29847 (-0.09%) helped: 27 HURT: 0 Alu are helped. total fscib in shared programs: 2271986 -> 2271959 (<.01%) fscib in affected programs: 29874 -> 29847 (-0.09%) helped: 27 HURT: 0 Fscib are helped. total bytes in shared programs: 21475184 -> 21474968 (<.01%) bytes in affected programs: 371574 -> 371358 (-0.06%) helped: 27 HURT: 0 Bytes are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	f737470736	agx: fuse iadd+large shift into imad total instructions in shared programs: 2750352 -> 2750211 (<.01%) instructions in affected programs: 86944 -> 86803 (-0.16%) helped: 32 HURT: 18 Instructions are helped. total alu in shared programs: 2273810 -> 2273669 (<.01%) alu in affected programs: 76720 -> 76579 (-0.18%) helped: 32 HURT: 18 Alu are helped. total fscib in shared programs: 2272127 -> 2271986 (<.01%) fscib in affected programs: 76720 -> 76579 (-0.18%) helped: 32 HURT: 18 Fscib are helped. total bytes in shared programs: 21476424 -> 21475184 (<.01%) bytes in affected programs: 649884 -> 648644 (-0.19%) helped: 33 HURT: 18 Bytes are helped. total regs in shared programs: 865114 -> 865090 (<.01%) regs in affected programs: 525 -> 501 (-4.57%) helped: 3 HURT: 0 total uniforms in shared programs: 2120792 -> 2120848 (<.01%) uniforms in affected programs: 414 -> 470 (13.53%) helped: 0 HURT: 8 Uniforms are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	c9e42073a1	agx: optimize signext imad improves clpeak short. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Asahi Lina	cf0261980a	hk: Enable missing swapchainMaintenance1 support This was inconsistent with claiming the extension is supported, and that trips up GTK4. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	d449800e46	hk: don't advertise impossible modifiers fixes dEQP-VK.drm_format_modifiers.bound_to_dma_buf.a2b10g10r10_sint_pack32,Crash Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Asahi Lina	e5d61631fe	hk: Fix DRM modifier selection for compressed surfaces We have to reject DRM_FORMAT_MOD_APPLE_TWIDDLED_COMPRESSED for surfaces which are too small. Since the modifier is for all planes, that means that for multiplane images we need to test all planes for compression support. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Asahi Lina	da1601a4ec	hk: Add virtio implicit sync support Since we can't know what BOs are written easily, just sync against all external BOs. This should go away once we have proper fence passing support so we can do implicit sync passing in muvm-x11bridge. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Mary Guillemard	1a621a6967	agx: Add support for EGL_NV_context_priority_realtime Signed-off-by: Mary Guillemard <mary@mary.zone> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	ddc6d9e984	agx: fix atomics in tess count shaders Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	2c7635ab63	agx: add tests for sign/zero-extend propagate Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:02 +00:00
Alyssa Rosenzweig	6d56c8bc02	agx: fold zext into int sources Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Alyssa Rosenzweig	200d0794e2	agx: optimize signext+iadd Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Alyssa Rosenzweig	cfe0a9acec	agx: add pseudo for signext easier to optimize Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Alyssa Rosenzweig	8de339c0d8	agx: change int conversion test it's not useful as is but we can salvage Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Asahi Lina	85c5a25ec3	asahi: In-place decompress shared resources for feedback loops Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Asahi Lina	f04387a415	asahi: Introduce batch->feedback to disable compression in PBE Used for RTs that have feedback with in-place decompression. Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Asahi Lina	9288a3a583	asahi: Extract agx_decompress_inplace() Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Asahi Lina	f28a1b3fcf	asahi: Add PIPE_BIND_SHARED to imported resources Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Asahi Lina	59501af723	asahi: Add pipe bind flags to resource debug Signed-off-by: Asahi Lina <lina@asahilina.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32081>	2024-11-11 14:33:01 +00:00
Zan Dobersek	e17038cc88	fd/pps: provide derived counters on a7xx Provide various derived counters that can be reported by the freedreno perfetto producer on a7xx devices. Specific to a7xx is the split of counters for some countables between the rendering and visibility bins. Such counters have to be configured separately inside the appropriate perfcounter group, which then enables the derived counter to use the separate counter values in its measured metrics. Not all possible derived counters are enabled because the perfcounter groups cannot handle as many counters as would be necessary. There's also disabled derived counters that would require counters from the VBIF group which isn't exposed for now due to its more complex way of enabling the relevant counters. Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Acked-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29677>	2024-11-11 13:39:40 +00:00
Zan Dobersek	fae4a23ab1	fd/pps: specify counter group for each countable For each countable that's being set up, the specific counter group is now also required. This way on a7xx it will be possible to differentiate between countables that have the same name but can be used through counter groups for rendering bin or for visibility bin (e.g. CP and BV_CP). Signed-off-by: Zan Dobersek <zdobersek@igalia.com> Acked-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29677>	2024-11-11 13:39:39 +00:00
Danylo Piliaiev	21359417ba	ir3/parser: Print the line where parsing error occurred Super useful with rddecompiler, otherwise it's impossible to determine which instruction failed to parse. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31954>	2024-11-11 11:38:17 +00:00
Samuel Pitoiset	30d9166d80	radv: dump the trap handler shader with RADV_DEBUG=dump_trap_handler Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>	2024-11-11 09:34:05 +00:00
Samuel Pitoiset	4d50691ae9	radv: remove unused parameter to radv_fill_nir_compiler_options() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32031>	2024-11-11 09:34:05 +00:00
Konstantin Seurer	e3cf6290e0	radv: Add RADV_DEBUG=nirdebuginfo Annotates the shader with source locations into the nir shader. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	cf447c5da1	nir: Do not gather source locations for phis Phi instructions are expected to be the first instructions in a block. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	f2c204daf0	nir: Add a first_line parameter to gather_debug_info Useful when the file contains multiple shaders. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	736c8c6f23	radv: Dump nir shaders before compiling It will allow adding source locations that point to the nir_string to the shader. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	aaf65d6219	radv: Store debug info inside radv_shader Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:14 +00:00
Konstantin Seurer	54c22656b8	radv: Add a helper for accessing the shader binary Use pointers into the blob instead of hardcoding the layout everywhere. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:13 +00:00
Konstantin Seurer	69ebba82d4	aco: Pass debug information to the driver Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:13 +00:00
Konstantin Seurer	f8ef1afec8	aco: Handle nir_debug_info_instr Propagated debug info using p_debug_info and Program::debug_info. Offsets into the shader binary are gathered during assembly. This will be useful for mapping back the disassembled shader to nir, glsl or spirv. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:13 +00:00
Konstantin Seurer	7dd9840128	amd: Add ac_shader_debug_info This is very similar to nir_debug_info_instr but it can exist outside of a nir shader. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29298>	2024-11-11 08:39:13 +00:00
Konstantin	4d09cd7fa5	nir/lower_non_uniform_access: Group accesses using the same resource Avoids emitting the waterfall loop for every access if they use the same resource: waterfall_loop { access } waterfall_loop { access } -> waterfall_loop { access access } Totals from 276 (0.33% of 84770) affected shaders: MaxWaves: 3360 -> 3356 (-0.12%) Instrs: 3759927 -> 3730650 (-0.78%) CodeSize: 21125784 -> 20899580 (-1.07%) VGPRs: 23096 -> 23104 (+0.03%) Latency: 35593716 -> 35315455 (-0.78%); split: -0.78%, +0.00% InvThroughput: 7353071 -> 7297309 (-0.76%); split: -0.76%, +0.00% VClause: 120983 -> 118579 (-1.99%) SClause: 113073 -> 110671 (-2.12%) Copies: 358272 -> 348686 (-2.68%) Branches: 166706 -> 159500 (-4.32%) PreSGPRs: 18598 -> 18596 (-0.01%) PreVGPRs: 21417 -> 21424 (+0.03%); split: -0.01%, +0.04% VALU: 2354862 -> 2350053 (-0.20%) SALU: 582291 -> 567638 (-2.52%) SMEM: 139875 -> 137473 (-1.72%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Konstantin Seurer	c5e40a60f8	radv: Lower non-uniform access after vectorization Scalar access can make nir_lower_non_uniform_access emit a lot of waterfall loops. Totals from 83 (0.10% of 84770) affected shaders: Instrs: 2747926 -> 2745959 (-0.07%); split: -0.07%, +0.00% CodeSize: 15022460 -> 14998240 (-0.16%); split: -0.16%, +0.00% Latency: 18602932 -> 18404976 (-1.06%); split: -1.18%, +0.12% InvThroughput: 4500730 -> 4450364 (-1.12%); split: -1.18%, +0.06% VClause: 93651 -> 91848 (-1.93%); split: -1.93%, +0.00% SClause: 63672 -> 63595 (-0.12%); split: -0.13%, +0.00% Copies: 229377 -> 229896 (+0.23%); split: -0.04%, +0.27% Branches: 107630 -> 107627 (-0.00%); split: -0.01%, +0.00% PreSGPRs: 5247 -> 5253 (+0.11%) PreVGPRs: 5911 -> 5903 (-0.14%); split: -0.29%, +0.15% VALU: 1761158 -> 1761540 (+0.02%); split: -0.01%, +0.03% SALU: 419743 -> 419783 (+0.01%); split: -0.01%, +0.02% VMEM: 152142 -> 150208 (-1.27%) SMEM: 80251 -> 80244 (-0.01%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Konstantin Seurer	d44f74896e	nir: Add missing access flags to print_access Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Konstantin Seurer	01ca436263	util: Fix some brackets in util_dynarray_.*_ptr Fixes a compiler error when directly accessing members of the returned pointer. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/30509>	2024-11-11 07:53:13 +00:00
Visan, Tiberiu	d379a3a428	amd/vpelib: remove luma offset (#459) [WHY] Shader and VPE does not apply brightness adjs in the same manner [HOW] Removed luma offset added in VPE [TESTING] Tested on real time video rendering Co-authored-by: Tiberiu Visan <tvisan@amd.com> Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Reviewed-by: Navid Assadian <Navid.Assadian@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>	2024-11-11 13:00:54 +08:00
Visan, Tiberiu	2172ab2c2a	amd/vpelib: patch to match shader (#456) [WHY] Shader and VPE had different behavior while adjusting the brightness [HOW] Apply the same normalization factor [TESTING] Tested on real video outputs Co-authored-by: Tiberiu Visan <tvisan@amd.com> Reviewed-by: Jesse Agate <Jesse.Agate@amd.com> Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>	2024-11-11 13:00:44 +08:00
Leder, Brendan Steve	891c4694ba	amd/vpelib: Refactor OCSC and update missing check Missing check for 601 in limited format check, updated that. Refactored OCSC to use specific limited depths. Cleaned up general color processing. Co-authored-by: Brendan <breleder@amd.com> Reviewed-by: Jesse Agate <Jesse.Agate@amd.com> Reviewed-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Acked-by: Chenyu Chen <Chen-Yu.Chen@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32075>	2024-11-11 13:00:29 +08:00
Martin Roukala (né Peres)	dc1fe83aa5	zink/ci: document new-ish vangogh flakes Signed-off-by: Martin Roukala (né Peres) <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32071>	2024-11-10 07:21:41 +02:00
Marek Olšák	1299f5c50a	gallium/radeon: import libdrm_radeon source code, drop the dependency Only radeon_surface.h/c is used from libdrm and radeon_drm.h is imported too. This code doesn't change anymore. We don't need the dependency. Acked-by: Pavel Ondračka <pavel.ondracka@gmail.com> Acked-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31827>	2024-11-10 00:52:18 +00:00
Russell Greene	ae9d365686	perfetto: fix macos compile On macos, <sys/types.h> does not declare clockid_t, but it's instead in <time.h>, which also includes <sys/types.h> on Linux, so just include <time.h> on all UNIX platforms. Fixes: `a871eabc` Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12064 Tested-by: Vinson Lee <vlee@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31881>	2024-11-09 09:23:22 +00:00
Deborah Brouwer	e368623fff	freedreno/ci: add prefix for a630-vk-asan tests Currently a630-vk-asan has separate files for its expected failures and skips, but by using the deqp-runner prefix option, the job can use the common a630 expectation files. This simplifies `a630-vk-asan` without any substantive changes to the ci job. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31970>	2024-11-09 08:15:36 +00:00
Alyssa Rosenzweig	0a81434adf	agx: rewrite address mode lowering AGX load/stores support a single family of addressing modes: 64-bit base + sign/zero-extend(32-bit) << (format shift + optional shift) This is a base-index addressing mode, where the index is minimally in elements (never bytes, unless we're doing 8-bit load/stores). Both base and the resulting address must be aligned to the format size; the mandatory shift means that alignment of base is equivalent to alignment of the final address, which is taken care of by lower_mem_access_bit_size anyhow. The other key thing to note is that this is a 64-bit shift, after the sign- or zero-extension of the 32-bit index. That means that AGX does NOT implement 64-bit base + sign/zero-extend(32-bit << shift) This has sweeping implications. For addressing math from C-based languages (including OpenCL C), the AGX mode is more helpful, since we tend to get 64-bit shifts instead of 32-bit shifts. However, for addressing math coming from GLSL, the AGX mode is rather annoying since we know UBOs/SSBOs are at most 4GB so nir_lower_io & friends are all 32-bit byte indexing. It's tricky to teach them to do otherwise, and would not be optimal either since 64-bit adds&shifts are usually much more expensive than 32-bit on AGX except for when fused into the load/store. So we don't want 32-bit NIR, since then we can't use the hardware addressing mode at all. We also don't want 64-bit NIR, since then we have excessive 64-bit math resulting from deep deref chains from complex struct/array cases. Instead, we want a middle ground: 32-bit operations that are guaranteed not to overflow 32-bit and can therefore be losslessly promoted to 64-bit. We can make that no-overflow guarantee as a consequence of the maximum UBO/SSBO size, and indeed Mesa relies on this already all over the place. So, in this series, we use relaxed amul opcodes for addressing math. Then, we rewrite our address mode pattern matching to fuse AGX address modes. 
The actual pattern matching is rewritten. The old code was brittle handwritten nir_scalar chasing, based on a faulty model of the hardware (with the 32-bit shift). We delete it all, it's broken. In the new approach, we add some NIR pseudo-opcodes for address math (ulea_agx/ilea_agx) which we pattern match with NIR algebraic rules. Then the chasing required to fuse LEA's into load/stores is trivial because we never go deeper than 1 level. After fusing, we then lower the leftover lea/amul opcodes and let regular nir_opt_algebraic take it from here. We do need to be very careful around pass order to make sure things like load/store vectorization still happen. Some passes are shuffled in this commit to make this work. We also need to clean up amul before fusing since we specifically do not have nir_opt_algebraic do so - the entire point of the pseudo-opcodes is to make nir_opt_algebraic ignore the opcodes until we've had a chance to fuse. If we simply used the .nuw bit on iadd/imul, nir_opt_algebraic would "optimize" things and lose the bit and then we would fail to fuse addressing modes, which is a much more expensive failure case than anything nir_opt_algebraic can do for us. I don't know what the "optimal" pass order for AGX would look like at this point, but what we have here is good enough for now and is a net positive for shader-db. That all ends up being much less code and much simpler code, while fixing the soundness holes in the old code, and also optimizing a significantly richer set of addressing calculations. Now we don't just optimize GL/VK modes, but also CL. This is crucial even for GL/VK performance, since we rely on CL via libagx even in graphics shaders. Terraintessellation is up 10% to ~310fps, which is quite nice. The following stats are for the end of the series together, including this change + libagx change + the NIR changes building up to this... but not including the SSBO vectorizer stats or the IC modelling fix. 
In other words, these are the stats for "rewriting address mode handling". This is on OpenGL, and since the old code was targeted at GL, anything that's not a loss is good enough - we need this for the soundness fix regardless. total instructions in shared programs: 2751356 -> 2750518 (-0.03%) instructions in affected programs: 372143 -> 371305 (-0.23%) helped: 715 HURT: 75 Instructions are helped. total alu in shared programs: 2279559 -> 2278721 (-0.04%) alu in affected programs: 304170 -> 303332 (-0.28%) helped: 715 HURT: 75 Alu are helped. total fscib in shared programs: 2277843 -> 2277008 (-0.04%) fscib in affected programs: 304167 -> 303332 (-0.27%) helped: 715 HURT: 75 Fscib are helped. total ic in shared programs: 632686 -> 621886 (-1.71%) ic in affected programs: 113078 -> 102278 (-9.55%) helped: 1159 HURT: 82 Ic are helped. total bytes in shared programs: 21489034 -> 21477530 (-0.05%) bytes in affected programs: 3018456 -> 3006952 (-0.38%) helped: 751 HURT: 107 Bytes are helped. total regs in shared programs: 865148 -> 865114 (<.01%) regs in affected programs: 1603 -> 1569 (-2.12%) helped: 10 HURT: 9 Inconclusive result (value mean confidence interval includes 0). total uniforms in shared programs: 2120735 -> 2120792 (<.01%) uniforms in affected programs: 22752 -> 22809 (0.25%) helped: 76 HURT: 49 Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 27613312 -> 27613504 (<.01%) threads in affected programs: 1536 -> 1728 (12.50%) helped: 3 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
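Illustrative note on 0a81434adf: the commit message distinguishes a 64-bit shift applied after extending the 32-bit index (what it says AGX implements) from a 32-bit shift applied before extension (what it says AGX does not implement). The following is a minimal sketch of that arithmetic difference in plain C. All function names are hypothetical and are not part of Mesa or the AGX compiler; the sketch assumes `shift` is below 32 and models only the address arithmetic, not the format shift or alignment rules.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not Mesa API. */

/* Mode described as supported: zero-extend the 32-bit index to 64 bits,
 * then shift in 64 bits, so high bits of the scaled index survive. */
static uint64_t addr_zext_then_shift(uint64_t base, uint32_t index, unsigned shift)
{
   return base + ((uint64_t)index << shift);
}

/* Signed variant: sign-extend first, then shift in 64 bits. The shift is
 * done on the unsigned representation to avoid undefined behaviour. */
static uint64_t addr_sext_then_shift(uint64_t base, int32_t index, unsigned shift)
{
   return base + ((uint64_t)(int64_t)index << shift);
}

/* Mode described as NOT implemented: shift in 32 bits (which can wrap),
 * then zero-extend the wrapped result. */
static uint64_t addr_shift_then_zext(uint64_t base, uint32_t index, unsigned shift)
{
   return base + (uint64_t)(uint32_t)(index << shift);
}

/* The two orderings agree exactly when the 32-bit shift does not overflow,
 * which is the no-overflow guarantee the commit message relies on. */
void check_address_modes(void)
{
   const uint64_t base = 0x1000;

   /* Small index: no 32-bit overflow, both orderings match. */
   assert(addr_zext_then_shift(base, 0x100u, 2) ==
          addr_shift_then_zext(base, 0x100u, 2));

   /* Large index: the 32-bit shift wraps to 0, the 64-bit shift does not. */
   assert(addr_zext_then_shift(base, 0x40000000u, 2) == 0x100001000ull);
   assert(addr_shift_then_zext(base, 0x40000000u, 2) == 0x1000ull);

   /* Negative index with sign extension steps backwards from base. */
   assert(addr_sext_then_shift(base, -1, 2) == 0xFFCull);
}
```

Calling `check_address_modes()` exercises one case where the orderings agree and one where they diverge; the divergent case is the kind of overflow that the maximum UBO/SSBO size is said to rule out.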
Alyssa Rosenzweig	d466ccc6bd	libagx: promote math to use AGX address mode we want to fit into the 64 + ext() << #n pattern to let us fuse address arithmetic into our loads, so rework some libagx addressing to better match that Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	77ce91e99b	hk: reduce max SSBO size Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	01d2aa1d53	agx: fix bfeil timing Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00
Alyssa Rosenzweig	db8d467ec6	agx: model IC dispatch Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31964>	2024-11-08 21:15:42 -04:00

186467 Commits