AlexIndustrial/mesa

Author	SHA1	Message	Date
Ian Romanick	e1bb53bb3c	nir/algebraic: Optimize some trivial bfi In fossil-db, one big compute shader on Hogwarts Legacy is helped for spills and fills. It has a lot of instances of bfi(0x3f, a, a). On Tiger Lake and Skylake, a compute shader in Unicom that has a single instance of this pattern is hurt for spills and fills. I think this is just due to non-determinism in the register allocation algorithm. shader-db: All Intel platforms had similar results. (Lunar Lake shown) total instructions in shared programs: 16992643 -> 16992548 (<.01%) instructions in affected programs: 17533 -> 17438 (-0.54%) helped: 33 / HURT: 0 total cycles in shared programs: 914313986 -> 914316238 (<.01%) cycles in affected programs: 3734544 -> 3736796 (0.06%) helped: 26 / HURT: 6 fossil-db: Lunar Lake, Meteor Lake, DG2, and Ice Lake had similar results. (Lunar Lake shown) Totals: Instrs: 208952780 -> 208952537 (-0.00%) Send messages: 10934879 -> 10934875 (-0.00%) Cycle count: 30988230904 -> 30988228660 (-0.00%); split: -0.00%, +0.00% Spill count: 534864 -> 534843 (-0.00%) Fill count: 667081 -> 667068 (-0.00%) Max live registers: 65686656 -> 65686624 (-0.00%) Non SSA regs after NIR: 244185358 -> 244185335 (-0.00%) Totals from 3 (0.00% of 704834) affected shaders: Instrs: 4708 -> 4465 (-5.16%) Send messages: 234 -> 230 (-1.71%) Cycle count: 264382 -> 262138 (-0.85%); split: -0.88%, +0.03% Spill count: 91 -> 70 (-23.08%) Fill count: 73 -> 60 (-17.81%) Max live registers: 647 -> 615 (-4.95%) Non SSA regs after NIR: 3957 -> 3934 (-0.58%) Tiger Lake Totals: Instrs: 230516919 -> 230515185 (-0.00%); split: -0.00%, +0.00% Send messages: 12657684 -> 12657680 (-0.00%) Cycle count: 23060318600 -> 23060279758 (-0.00%); split: -0.00%, +0.00% Spill count: 548462 -> 548446 (-0.00%); split: -0.00%, +0.00% Fill count: 582304 -> 582294 (-0.00%); split: -0.00%, +0.00% Scratch Memory Size: 19538944 -> 19539968 (+0.01%) Max live registers: 41713622 -> 41713593 (-0.00%) Non SSA regs after NIR: 260667939 -> 
260667712 (-0.00%); split: -0.00%, +0.00% Totals from 174 (0.02% of 794323) affected shaders: Instrs: 158346 -> 156612 (-1.10%); split: -1.13%, +0.04% Send messages: 14330 -> 14326 (-0.03%) Cycle count: 24859875 -> 24821033 (-0.16%); split: -0.32%, +0.16% Spill count: 183 -> 167 (-8.74%); split: -9.29%, +0.55% Fill count: 284 -> 274 (-3.52%); split: -7.39%, +3.87% Scratch Memory Size: 9216 -> 10240 (+11.11%) Max live registers: 12587 -> 12558 (-0.23%) Non SSA regs after NIR: 164466 -> 164239 (-0.14%); split: -0.16%, +0.02% Skylake Totals: Instrs: 158904982 -> 158903764 (-0.00%) Send messages: 8490500 -> 8490496 (-0.00%) Cycle count: 19732284279 -> 19732345496 (+0.00%); split: -0.00%, +0.00% Spill count: 519127 -> 519115 (-0.00%) Fill count: 594283 -> 594290 (+0.00%); split: -0.00%, +0.00% Max live registers: 33708764 -> 33708739 (-0.00%) Non SSA regs after NIR: 169377234 -> 169377007 (-0.00%); split: -0.00%, +0.00% Totals from 174 (0.03% of 648725) affected shaders: Instrs: 160391 -> 159173 (-0.76%) Send messages: 14354 -> 14350 (-0.03%) Cycle count: 24776486 -> 24837703 (+0.25%); split: -0.07%, +0.32% Spill count: 332 -> 320 (-3.61%) Fill count: 587 -> 594 (+1.19%); split: -0.17%, +1.36% Max live registers: 12709 -> 12684 (-0.20%) Non SSA regs after NIR: 166557 -> 166330 (-0.14%); split: -0.16%, +0.02% Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32493>	2024-12-05 21:39:07 +00:00
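The `bfi(0x3f, a, a)` pattern named in the commit above can be checked with a small reference model. This is a sketch only: `bfi32` is a hypothetical helper, and its semantics (shift the insert value up to the lowest set bit of the mask, then merge it into the base) are an assumption about how the opcode behaves, not code taken from the commit. Under that assumption, a mask whose lowest bit is set needs no shift, so inserting `a` into `a` returns `a` unchanged.

```c
#include <stdint.h>

/* Hypothetical reference model of a 32-bit bitfield insert.
 * Assumed semantics: align `insert` with the lowest set bit of `mask`,
 * then take masked bits from `insert` and the remaining bits from `base`. */
static uint32_t bfi32(uint32_t mask, uint32_t insert, uint32_t base)
{
   if (mask == 0)
      return base;

   /* Shift insert left once for every trailing zero bit in mask. */
   for (uint32_t tmp = mask; !(tmp & 1); tmp >>= 1)
      insert <<= 1;

   return (base & ~mask) | (insert & mask);
}
```

With these semantics, `bfi32(0x3f, a, a)` reduces to `(a & ~0x3f) | (a & 0x3f)`, which is `a` for every input.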
José Roberto de Souza	04bdbeec31	intel/dev/xe: Fix access to eu_per_dss_mask DRM_XE_TOPO_EU_PER_DSS and DRM_XE_TOPO_SIMD16_EU_PER_DSS can be any number of bytes long but it was assuming it was always 4 bytes long. That was not an issue because the Xe KMD returns 4 bytes even if it only needs 1 or 2 bytes, but it is a problem with our HW simulator, which was returning 2 bytes. Fixes: `a24d93aa89` ("intel/dev: Query and compute hardware topology for Xe") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32307>	2024-12-05 20:30:44 +00:00
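The class of bug described in the commit above (reading a variable-length mask as if it were always 4 bytes) can be avoided by assembling the value byte by byte. The following is a minimal sketch, not the driver's code: `read_mask_bytes` is a hypothetical helper, and the little-endian byte order is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: build a bitmask from however many bytes were
 * returned, instead of dereferencing the buffer as a uint32_t.
 * Assumption: byte 0 holds the least significant bits. */
static uint32_t read_mask_bytes(const uint8_t *bytes, size_t num_bytes)
{
   uint32_t mask = 0;
   /* A 32-bit result cannot hold more than 4 bytes, so clamp the length. */
   size_t n = num_bytes < sizeof(mask) ? num_bytes : sizeof(mask);

   for (size_t i = 0; i < n; i++)
      mask |= (uint32_t)bytes[i] << (8 * i);

   return mask;
}
```

A 2-byte reply such as `{0xff, 0x01}` then yields `0x1ff` without touching memory past the end of the buffer.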
Lionel Landwerlin	371b7a9b0d	anv: set pipeline flags correct for imported libs Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `3d49cdb71e` ("anv: implement VK_EXT_graphics_pipeline_library") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32507>	2024-12-05 19:53:34 +00:00
Lionel Landwerlin	6e396b400a	anv: fix missing bindings valid dynamic state change check Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `9ddd296cd3` ("anv: implement VK_EXT_vertex_input_dynamic_state") Reviewed-by: Ivan Briano <ivan.briano@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32507>	2024-12-05 19:53:34 +00:00
Adam Jackson	266dfb15c1	docs/envvars: Combine WGL sections Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32316>	2024-12-05 19:46:38 +00:00
Adam Jackson	f447e31daa	docs/envvars: Remove mention of IRIS_ENABLE_CLOVER This went away when clover dropped nir driver support. Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32316>	2024-12-05 19:46:38 +00:00
Eric R. Smith	a2f96667e2	mesa: update more drivers to handle pipe_blit_info swizzle_enable Handle swizzling by falling through to the software path. Swizzle should be rarely enabled, so this shouldn't affect performance in most cases. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31378>	2024-12-05 18:27:37 +00:00
Eric R. Smith	3da4a404ae	aux: add support for dumping the swizzle in pipe_blit_info Just some additional debug code for the new blit swizzle feature. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31378>	2024-12-05 18:27:37 +00:00
Eric R. Smith	b81aefcc19	mesa: when blitting between formats clear any unused components If the state tracker chooses to implement one format with a more general one (e.g. GL_ALPHA implemented with GL_RGBA) we end up in a situation where some components should be ignored. Readpix handles this correctly, but blit does not, which means that if we blit between different formats we can end up writing garbage into some components. Work around this by adding an explicit swizzle to the pipe_blit_info struct, which can re-arrange elements and/or put 0 or 1 into appropriate channels, and use this to set the appropriate values into unused channels via the sampler view. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/31378>	2024-12-05 18:27:37 +00:00
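The swizzle idea in the blit commits above can be illustrated on a single 8-bit RGBA texel. This is a sketch under stated assumptions: the `swz` enum and `apply_swizzle` are hypothetical names invented here, not the `pipe_blit_info` fields, and the commit applies the swizzle through the sampler view rather than on the CPU.

```c
#include <stdint.h>

/* Hypothetical swizzle selectors: pick a source channel, or force 0 or 1. */
enum swz { SWZ_X, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Apply a per-channel swizzle to one 8-bit-per-channel texel. */
static void apply_swizzle(const uint8_t src[4], const enum swz swizzle[4],
                          uint8_t dst[4])
{
   for (int i = 0; i < 4; i++) {
      switch (swizzle[i]) {
      case SWZ_0:
         dst[i] = 0;
         break;
      case SWZ_1:
         dst[i] = 255; /* 1.0 in 8-bit UNORM */
         break;
      default:
         dst[i] = src[swizzle[i]]; /* SWZ_X..SWZ_W index the source */
         break;
      }
   }
}
```

For an alpha-only format stored as RGBA, a swizzle of `{SWZ_0, SWZ_0, SWZ_0, SWZ_W}` keeps alpha and writes zeros, rather than garbage, into the unused color channels.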
Erik Faye-Lund	9f69f7a66d	panvk: free preload-shaders after compiling These shaders are created using nir_builder_init_simple_shader(), which allocates using a NULL ralloc-parent, so ralloc_free should be the right function to free them with. Fixes: `0bc3502ca3` ("panvk: Implement a custom FB preload logic") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32486>	2024-12-05 17:45:16 +00:00
Erik Faye-Lund	43738a9a94	vulkan/meta: plug a couple of memory leaks We create NIR shaders here, and we need to free them when we're done with them as well. These shaders are created using nir_builder_init_simple_shader(), which allocates using a NULL ralloc-parent, so ralloc_free should be the right function to free them with. Fixes: `514c10344e` ("vulkan/meta: Add a concept of rect pipelines") Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32486>	2024-12-05 17:45:16 +00:00
Tomeu Vizoso	3aad0afc30	teflon/tests: Also use the cache for models in the test suite To speed things up now that we have more models under testing. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:27 +00:00
Tomeu Vizoso	74239aeb77	teflon/tests: Add support for models with float inputs and outputs Ended up deciding to drop C++ collections and use instead C pointers because the template use was starting to get ridiculous. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:27 +00:00
Tomeu Vizoso	f21d8af43a	teflon: Don't crash when a tensor isn't quantized We don't support yet hardware that can deal with floats, but it is better not to crash. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:27 +00:00
Tomeu Vizoso	a548b17b4e	teflon: Rename model tests so they aren't skipped by gtest-runner The regular expression engine in gtest-runner was matching more tests than we wanted, so we weren't testing all we thought. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:26 +00:00
Tomeu Vizoso	1e117478d4	teflon: Support tests with inputs with less than 4 dims Needed in models such as YOLOX. Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:26 +00:00
Tomeu Vizoso	140150083e	teflon: Add tests for the YOLOX model The model was generated from: https://github.com/Megvii-BaseDetection/YOLOXa (Apache License 2.0) Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32485>	2024-12-05 17:02:26 +00:00
David Rosca	8d3d35bf05	frontends/va: Add support for VA_SURFACE_ATTRIB_MEM_TYPE_DRM_PRIME_3 Reviewed-by: Leo Liu <leo.liu@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32113>	2024-12-05 16:34:09 +00:00
Lionel Landwerlin	80c0d2718c	anv: report formats supported by the common bvh framework Enables DXR 1.1 with vkd3d-proton Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com> Reviewed-by: Kevin Chuang <kaiwenjon23@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32487>	2024-12-05 15:54:10 +00:00
Eric R. Smith	aba90c1523	panfrost: check afbc status in panfrost_query_compression_modifiers In panfrost_query_compression_modifiers we need to ignore AFBC modifiers if the device does not support AFBC. In order to avoid duplicating code, we do this by calling panfrost_walk_dmabuf_modifiers with a flag that indicates we do not want AFRC modifiers. Reviewed-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32406>	2024-12-05 14:54:09 +00:00
Marek Olšák	dfc2f054b6	radeonsi/ci: update navi31 failures Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	ed4606a062	radeonsi/ci: remove --slow The tests were split or reduced in glcts. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	c0ccae84a7	radeonsi/ci: remove most flakes and some skips, update navi31 failures Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	af618dd907	radeonsi/ci: stop using a global flakes list, only use a per-chip flakes list We need to start treating flakes as fails and they are likely different between chips. I removed the gfx9 flakes file and renamed the original flakes file to gfx6-tahiti-flakes.csv, but it would be better to add a new flakes file for each generation we test. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	3ff8111fc6	radeonsi/ci: handle glinfo errors better Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	738a501e92	radeonsi: don't compute total_direct_count in si_draw if it's unused Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	ed372d4b7c	radeonsi: try to fix Navi14 regression in debug builds Assertion failure: ../src/gallium/drivers/radeonsi/si_state_shaders.cpp:1369: unsigned int si_get_input_prim(const si_shader_selector, const si_shader_key, bool): Assertion `gs->stage == MESA_SHADER_VERTEX' failed. Fixes: `7e959864b2` ("radeonsi: enable NGG culling for non-monolithic TES and GS") Tested-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	a3c293cdcd	radeonsi: revert to always returning true for load_cull_any_enabled_amd Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Marek Olšák	511a637a5c	radeonsi: pass cull face state via user SGPRs for shader culling The culling code always computes the determinant for culling zero-area triangles, so passing the state via user SGPRs doesn't really add much shader code to justify having shader variants for front/back face culling that uses the same determinant. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32288>	2024-12-05 12:07:06 +00:00
Alyssa Rosenzweig	ca9bf43d0b	nir,asahi: make argument alignment configurable this is more flexible. Mali needs 32-bit alignment, for example. I added an option struct in case we need to make this a callback or something later. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32398>	2024-12-05 10:58:51 +00:00
Alyssa Rosenzweig	0d77e91ca3	nir/opt_load_store_vectorize: match amul like imul for AGX, we preserve amul all the way until fusing address modes in order to be able to fuse effectively. so the load/store vectorizer wouldn't vectorize before fusing. however, after fusing we get fused intrinsics which are tricky to teach the vectorizer about as their semantics are pretty subtle. so we can't vectorize after, either. the easiest solution is to teach the vectorize about amul, which can always be replaced by imul for our pattern matches. this fixes certain cases of vectorization in OpenCL kernels on asahi. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32398>	2024-12-05 10:58:51 +00:00
Alyssa Rosenzweig	77d4ed0a01	nir/opt_algebraic: optimize sign bit manipulation libclc loves to generate the iand(0x7fffffff) pattern. ior/ixor patterns are added for completeness. Shaves 4 instructions off libclc vec4 normalize. v2: Loop over the bit sizes (Georg). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1] Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32398>	2024-12-05 10:58:51 +00:00
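The sign-bit patterns mentioned in the commit above rest on the IEEE 754 layout of a 32-bit float, where bit 31 is the sign. The sketch below shows the equivalences that presumably motivate the optimization; the helper names are hypothetical, and the exact replacement rules in the commit are not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back, without aliasing UB. */
static uint32_t float_bits(float f)
{
   uint32_t u;
   memcpy(&u, &f, sizeof(u));
   return u;
}

static float bits_float(uint32_t u)
{
   float f;
   memcpy(&f, &u, sizeof(f));
   return f;
}

/* iand with 0x7fffffff clears the sign bit: same result as fabs. */
static float abs_via_iand(float f)
{
   return bits_float(float_bits(f) & 0x7fffffffu);
}

/* ixor with 0x80000000 flips the sign bit: same result as negation. */
static float neg_via_ixor(float f)
{
   return bits_float(float_bits(f) ^ 0x80000000u);
}

/* ior with 0x80000000 sets the sign bit: same result as -fabs. */
static float neg_abs_via_ior(float f)
{
   return bits_float(float_bits(f) | 0x80000000u);
}
```

These identities hold at the bit level for every input, including zeros and infinities, because only the sign bit is touched.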
Alyssa Rosenzweig	be049e1c14	nir/search_helpers: handle bcsel in is_only_used_as_float this lets algebraic see through chains of instructions. v2: Limit recursion depth (Georg). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Marek Olšák <marek.olsak@amd.com> [v1] Reviewed-by: Georg Lehmann <dadschoorse@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32398>	2024-12-05 10:58:51 +00:00
Pavel Ondračka	ecc4d5da67	i915/ci: update CI expectations Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32494>	2024-12-05 09:35:43 +00:00
Boris Brezillon	19231c7ae3	pan: s/NIR_PASS_V/NIR_PASS/ Move away from NIR_PASS_V() like other drivers have done long ago. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>	2024-12-05 08:49:45 +00:00
Boris Brezillon	b47cf63cca	panvk: s/NIR_PASS_V/NIR_PASS/ Move away from NIR_PASS_V() like other drivers have done long ago. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>	2024-12-05 08:49:45 +00:00
Boris Brezillon	7e78aa73dd	panfrost: Use nir_shader_intrinsics_pass() for the line_smooth lowering pass We have a helper function to iterate only on intrinsics, so let's use it. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>	2024-12-05 08:49:45 +00:00
Boris Brezillon	34beb93635	panfrost: s/NIR_PASS_V/NIR_PASS/ Move away from NIR_PASS_V() like other drivers have done long ago. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>	2024-12-05 08:49:45 +00:00
Boris Brezillon	98e3c1e6fb	nir: Let nir_lower_texcoord_replace_late() report progress Useful if we want to wrap this pass with a NIR_PASS() to enforce validation. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32480>	2024-12-05 08:49:45 +00:00
Samuel Pitoiset	ea112cf84d	ci: update VKCTS main to a9f7069b9a5ba94715a175cb1818ed504add0107 This contains many more tests for Vulkan 1.4, but the Vulkan loader probably needs an update too. This should only affect RADV which is the only user for VKCTS main. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32475>	2024-12-05 08:06:23 +00:00
Samuel Pitoiset	8b755840fc	radv: fix initializing HTILE when the image has VRS rates VRS rates should only be preserved for clears, otherwise the HTILE buffer should be cleared completely. This fixes some failures/flakes in CI. Fixes: `8197d744f5` ("radv: Do not overwrite VRS rates when doing fast clears") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32463>	2024-12-05 07:34:58 +00:00
Samuel Pitoiset	e73fdac9a6	radv: enable DGC IES for compute with ESO This was supposed to be enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32484>	2024-12-05 07:06:17 +00:00
Simon Perretta	e26a383ee8	pco: fix x86 build Use inttypes.h when printing variables whose format specifier changes across different archs. Fixes: `37d47913` Fixes: `e67e4452` Closes: #12238 Signed-off-by: Simon Perretta <simon.perretta@imgtec.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32492>	2024-12-05 00:50:16 +00:00
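The build fix above relies on the `inttypes.h` format macros, which expand to the right conversion specifier for each fixed-width type on every architecture. A minimal sketch follows; `format_u64` is a hypothetical helper for illustration and is not taken from the pco sources.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print a 64-bit value portably. Writing "%lu" here
 * would be wrong wherever uint64_t is unsigned long long (e.g. 32-bit x86);
 * PRIu64 expands to the correct specifier for the target. */
static int format_u64(char *buf, size_t size, uint64_t value)
{
   return snprintf(buf, size, "%" PRIu64, value);
}
```

The same approach applies to the other fixed-width types, for example `PRIx32` for hexadecimal `uint32_t` output.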
Dylan Baker	43bdc84831	docs: update calendar for 24.3.1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32491>	2024-12-05 00:43:50 +00:00
Dylan Baker	fd0da8eb80	docs: Add SHA sums for 24.3.1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32491>	2024-12-05 00:43:50 +00:00
Dylan Baker	a3715349fd	docs: add release notes for 24.3.1 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32491>	2024-12-05 00:43:50 +00:00
Ian Romanick	0754a18621	brw/copy: Allow copy prop into src1 of broadcast This is the selector, and it must always be a uniform UD, so there's no reason to not propagate into it. No shader-db change on any Intel platform. fossil-db: All Intel platforms had similar results. (Lunar Lake shown) Totals: Instrs: 220507131 -> 220507127 (-0.00%) Cycle count: 31607052398 -> 31607053364 (+0.00%); split: -0.00%, +0.00% Totals from 5 (0.00% of 702410) affected shaders: Instrs: 995 -> 991 (-0.40%) Cycle count: 86392 -> 87358 (+1.12%); split: -0.07%, +1.19% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32097>	2024-12-05 00:15:27 +00:00
Ian Romanick	662339a2ff	brw/build: Use SIMD8 temporaries in emit_uniformize The fossil-db results are very different from v1. This is now mostly helpful on older platforms. v2: When optimizing BROADCAST or FIND_LIVE_CHANNEL to a simple MOV, adjust the exec_size to match the size allocated for the destination register. Fixes EU validation failures in some piglit OpenCL tests (e.g., atomic_add-global-return.cl). v3: Use component_size() in emit_uniformize and BROADCAST to properly account for UQ vs UD destination. This doesn't matter for emit_uniformize because the type is always UD, but it is technically more correct. v4: Update trace checksums. Now amly expects the same checksum as several other platforms. v5: Use xbld.dispatch_width() in the builder for when scalar_group() eventually becomes SIMD1. Suggested by Lionel. shader-db: Lunar Lake, Meteor Lake, DG2, and Tiger Lake had similar results. (Lunar Lake shown) total instructions in shared programs: 18091701 -> 18091586 (<.01%) instructions in affected programs: 29616 -> 29501 (-0.39%) helped: 28 / HURT: 18 total cycles in shared programs: 919250494 -> 919123828 (-0.01%) cycles in affected programs: 12201102 -> 12074436 (-1.04%) helped: 124 / HURT: 108 LOST: 0 GAINED: 1 Ice Lake and Skylake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20480808 -> 20480624 (<.01%) instructions in affected programs: 58465 -> 58281 (-0.31%) helped: 61 / HURT: 20 total cycles in shared programs: 874860168 -> 874960312 (0.01%) cycles in affected programs: 18240986 -> 18341130 (0.55%) helped: 113 / HURT: 158 total spills in shared programs: 4557 -> 4555 (-0.04%) spills in affected programs: 93 -> 91 (-2.15%) helped: 1 / HURT: 0 total fills in shared programs: 5247 -> 5243 (-0.08%) fills in affected programs: 224 -> 220 (-1.79%) helped: 1 / HURT: 0 fossil-db: Lunar Lake Totals: Instrs: 220486064 -> 220486959 (+0.00%); split: -0.00%, +0.00% Subgroup size: 14102592 -> 14102624 (+0.00%) Cycle count: 31602733838 -> 31604733270 (+0.01%); split: -0.01%, +0.02% Max live registers: 65371025 -> 65355084 (-0.02%) Totals from 12130 (1.73% of 702392) affected shaders: Instrs: 5162700 -> 5163595 (+0.02%); split: -0.06%, +0.08% Subgroup size: 388128 -> 388160 (+0.01%) Cycle count: 751721956 -> 753721388 (+0.27%); split: -0.54%, +0.81% Max live registers: 1538550 -> 1522609 (-1.04%) Meteor Lake and DG2 had similar results. 
(Meteor Lake shown) Totals: Instrs: 241601142 -> 241599114 (-0.00%); split: -0.00%, +0.00% Subgroup size: 9631168 -> 9631216 (+0.00%) Cycle count: 25101781573 -> 25097909570 (-0.02%); split: -0.03%, +0.01% Max live registers: 41540611 -> 41514296 (-0.06%) Max dispatch width: 6993456 -> 7000928 (+0.11%); split: +0.15%, -0.05% Totals from 16852 (2.11% of 796880) affected shaders: Instrs: 6303937 -> 6301909 (-0.03%); split: -0.11%, +0.07% Subgroup size: 323592 -> 323640 (+0.01%) Cycle count: 625455880 -> 621583877 (-0.62%); split: -1.20%, +0.58% Max live registers: 1072491 -> 1046176 (-2.45%) Max dispatch width: 76672 -> 84144 (+9.75%); split: +14.04%, -4.30% Tiger Lake Totals: Instrs: 235190395 -> 235193286 (+0.00%); split: -0.00%, +0.00% Cycle count: 23130855720 -> 23128936334 (-0.01%); split: -0.02%, +0.01% Max live registers: 41644106 -> 41620052 (-0.06%) Max dispatch width: 6959160 -> 6981512 (+0.32%); split: +0.34%, -0.02% Totals from 15102 (1.90% of 793371) affected shaders: Instrs: 5771042 -> 5773933 (+0.05%); split: -0.06%, +0.11% Cycle count: 371062226 -> 369142840 (-0.52%); split: -1.04%, +0.52% Max live registers: 989858 -> 965804 (-2.43%) Max dispatch width: 61344 -> 83696 (+36.44%); split: +38.42%, -1.98% Ice Lake and Skylake had similar results. 
(Ice Lake shown) Totals: Instrs: 236063150 -> 236063242 (+0.00%); split: -0.00%, +0.00% Cycle count: 24516187174 -> 24516027518 (-0.00%); split: -0.00%, +0.00% Spill count: 567071 -> 567049 (-0.00%) Fill count: 701323 -> 701273 (-0.01%) Max live registers: 41914047 -> 41913281 (-0.00%) Max dispatch width: 7042608 -> 7042736 (+0.00%); split: +0.00%, -0.00% Totals from 3904 (0.49% of 798473) affected shaders: Instrs: 2809690 -> 2809782 (+0.00%); split: -0.02%, +0.03% Cycle count: 182114259 -> 181954603 (-0.09%); split: -0.34%, +0.25% Spill count: 1696 -> 1674 (-1.30%) Fill count: 2523 -> 2473 (-1.98%) Max live registers: 341695 -> 340929 (-0.22%) Max dispatch width: 32752 -> 32880 (+0.39%); split: +0.44%, -0.05% Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32097>	2024-12-05 00:15:27 +00:00
Ian Romanick	d2b266187d	brw: Use resize_sources several more places Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32097>	2024-12-05 00:15:27 +00:00
Ian Romanick	12d1886b87	brw/lower: Don't "fix" regioning of broadcast The next two commits modify the destination regioning in a way that, while still correct, triggers assertion failures if we try to fix the regioning here. Broadcast gets lowered in brw_eu_emit. For the purposes of region restrictions, let's assume that the final code emission will do the right thing. Doing a bunch of shuffling here is only going to make a mess of things. No shader-db or fossil-db changes on any Intel platform. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32097>	2024-12-05 00:15:27 +00:00

198819 Commits