AlexIndustrial/mesa

Author	SHA1	Message	Date
Alejandro Piñeiro	ce98967274	v3dv: define a default attribute values with float type We are providing a BO with the default attribute values for the GL_SHADER_STATE_RECORD, that contains 16 vec4. Such default value for each vec4 is (0, 0, 0, 1). As the attribute format could be int or float, the "1" value needs to take into account the attribute format. But in practice, the most common case is all floats. So we create one default attribute values BO assuming that all attributes will be floats, store it at v3dv_device, and only create a new one if an int format type is defined. That reduces the number of BOs needed. Note that we could still try to reduce the number of BOs used by the pipelines if we created a bigger BO and just played with the offsets. But as mentioned, that's not the usual case, and it would add extra complexity, so it is not a priority right now. This makes the following test pass when disabling the pipeline cache support: dEQP-VK.api.object_management.max_concurrent.graphics_pipeline Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9845>	2021-03-26 15:00:05 +00:00
Iago Toral Quiroga	e790c20403	broadcom/compiler: try to fill up delay slots after a thrsw The way we handle thrsw instructions is that we try to merge them back into previously scheduled instructions to fill up their delay slots. This is generally safe, because the thrsw won't happen until after the delay slots, so we are not really changing the execution order of the instructions and we just need to make sure we don't violate a few specific restrictions. If we have not managed to fill up all delay slots after doing this, then we emit as many NOPs as needed to fill them. This is to ensure that we don't schedule an instruction that needs to execute after the thread switch before the thread switch happens. However, doing this can lead to inefficient code, since sometimes the instructions we schedule after a thrsw are independent of the thrsw and could be safely executed in its delay slots. This change removes the fixed NOP emission after a thrsw to fill delay slots and instead adds code to ensure that our instruction scheduling is aware of when it is scheduling instructions in the delay slots of a previous thrsw to avoid selecting conflicting instructions. The only case where we still emit fixed NOPs is for the thread end that we emit to terminate the program after scheduling all instructions because we can't end the instruction stream before the thread end is properly executed. total instructions in shared programs: 13691004 -> 13648140 (-0.31%) instructions in affected programs: 4345951 -> 4303087 (-0.99%) helped: 19645 HURT: 652 Instructions are helped. total max-temps in shared programs: 2319317 -> 2318687 (-0.03%) max-temps in affected programs: 10510 -> 9880 (-5.99%) helped: 532 HURT: 9 Max-temps are helped. total sfu-stalls in shared programs: 31752 -> 32354 (1.90%) sfu-stalls in affected programs: 840 -> 1442 (71.67%) helped: 7 HURT: 467 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 13722756 -> 13680494 (-0.31%) inst-and-stalls in affected programs: 4335590 -> 4293328 (-0.97%) helped: 19453 HURT: 758 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>	2021-03-26 07:13:07 +00:00
Iago Toral Quiroga	f68f209e39	broadcom/compiler: add a v3d_qpu_writes_accum helper We have helpers to check if an instruction writes to specific accumulators. This one will check if it writes any of the general purpose accumulators, which will come in handy in a follow-up patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9825>	2021-03-26 07:13:07 +00:00
Iago Toral Quiroga	22a979be65	broadcom/compiler: convert add to mul when possible to allow merge Integer add/sub can be implemented as either an add or a mul instruction but we always emit them as add instructions at VIR level. We can use this flexibility to improve our QPU scheduling so we can be more effective at instruction merging by converting these to mul instructions when we are attempting to merge them with another add instruction. total instructions in shared programs: 13721549 -> 13691004 (-0.22%) instructions in affected programs: 3340493 -> 3309948 (-0.91%) helped: 12805 HURT: 1656 Instructions are helped. total max-temps in shared programs: 2319528 -> 2319317 (<.01%) max-temps in affected programs: 5285 -> 5074 (-3.99%) helped: 195 HURT: 3 Max-temps are helped. total sfu-stalls in shared programs: 31616 -> 31752 (0.43%) sfu-stalls in affected programs: 469 -> 605 (29.00%) helped: 52 HURT: 161 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 13753165 -> 13722756 (-0.22%) inst-and-stalls in affected programs: 3340383 -> 3309974 (-0.91%) helped: 12782 HURT: 1666 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9769>	2021-03-25 09:51:42 +00:00
Alejandro Piñeiro	bdf93f4e3b	v3dv/cmd_buffer: return early for draw commands if there is nothing to draw So for example, on v3dv_CmdDrawIndexed we can return early if instanceCount is 0. This fixes failures when using the simulator with tests with the following pattern: dEQP-VK.draw.instanced.draw_indexed_vk_primitive_topology* Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9820>	2021-03-25 09:38:04 +00:00
Iago Toral Quiroga	bb201733ac	v3dv/pipeline_cache: fix assert Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Fixes: `e354c5280` ('v3dv/pipeline: try to get the shader variant directly from the cache') Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9824>	2021-03-25 09:25:27 +00:00
Eric Anholt	3cc390bf7d	broadcom: Disable CLIF dumping when libexpat isn't available. Given what a niche developer tool CLIF dumps are, no sense requiring libexpat just for that. Acked-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9764>	2021-03-24 17:25:07 +00:00
Alejandro Piñeiro	74785346b4	v3dv: Add support for the on-disk shader cache Quoting Jason's commit message (`afa8f5892`), that also applies here: "The Vulkan API provides a mechanism for applications to cache their own shaders and manage on-disk pipeline caching themselves. Generally, this is what I would recommend to application developers and I've resisted implementing driver-side transparent caching in the Vulkan driver for a long time. However, not all applications do this and, for some use-cases, it's just not practical." Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	cf71280d74	v3dv/device: avoid unused-result warning with asprintf Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	2bee6ffec3	v3dv/pipeline: compute sha1 for no-op fragment shaders correctly We should use the nir shader, as with internal vkShaderModule, instead of just the name. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	9a4099858b	v3dv/pipeline: don't create a variant if compilation failed Also return the proper Vulkan result for this case, which is somewhat tricky. Technically Create[Graphics/Compute]Pipeline only allows OOM errors. So for this case, the only alternative is the generic VK_ERROR_UNKNOWN, even if we know the cause of the error. From spec: "VK_ERROR_UNKNOWN will be returned by an implementation when an unexpected error occurs that cannot be attributed to valid behavior of the application and implementation. Under these conditions, it may be returned from any command returning a VkResult" Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	e354c52801	v3dv/pipeline: try to get the shader variant directly from the cache Until now we were always doing a two-step cache lookup, as we were using the NIR shaders to fill up the key to look up the compiled shaders. But since we were already generating the sha1 key with the original SPIR-V shader (or its internal NIR representation) any info we were collecting from NIR is already implicit in the original shader, so we can avoid using the NIR in most cases. Because the v3d_key that is used to compile a shader is populated with data coming directly from the NIR shader or produced during NIR lowerings, we can't use it directly as part of the pipeline cache entry. We could split them, but that would be confusing, so we add a new struct, v3dv_pipeline_key, used specifically to search for the compiled shaders on the pipeline cache. v3d_key would still be used to compile the shaders. As we are using the same sha1 key for all compiled shaders in a pipeline, we can also group all of them in the same cache entry, so we don't need a lookup for each stage. This also allows caching pipeline data shared by all the stages (like the descriptor maps). While we are here, we also create a single BO to store the assembly for all the pipeline stages. Finally, we remove the link to the variant on the pipeline stage struct, to avoid the confusion of having two links to the same data. This mostly means that we stop using the pipeline stage structures after the pipeline is created, so we can free them. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	6afb8a9fec	v3dv/pipeline: use broadcom_shader_stage as pipeline/variant stage type So we could avoid using gl_shader_stage plus a is_coord boolean, that only applies to VERTEX. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	0b98f20310	v3dv: define broadcom shader stages Mostly the same as the main Mesa gl_shader_stage, but including the coordinate shader. This allows looping over all the available stages (for example if we need to free them, compute the max spill size, etc). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	d7f4038374	v3dv/pipeline: remove v3d_key from shader_variant and pipeline stage We stopped re-using them after pipeline creation long ago, so let's reduce the size of both structs, and avoid serialize/deserialize for the variant case. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	b8c73c512a	v3dv/pipeline: remove compiled_variant_count field We are not really compiling several variants, or at least not on the same pipeline. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	ebb2297a91	v3dv/pipeline: move topology to pipeline So now we only store it once per pipeline, instead of once per pipeline stage. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	dd72c99d77	v3dv/pipeline: use driver_location_map instead of nir utilities If we were able to get a shader variant from the pipeline cache, we will not have the nir shader available. Note that this is what we were doing on the driver before the nir io helpers were available. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	b71fd5587e	broadcom/compiler: add driver_location_map at vs prog data This maps the nir shader data.location to its final data.driver_location. In general we are using the driver location as index (like vattr_sizes on the same struct), so having this map is useful if what we have is the data.location, and we don't have available the original nir shader. v2: use memset instead of for loop, and nir_foreach_shader_in_variable instead of nir_foreach_variable_with_modes (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	2be0c36775	broadcom/compiler: add local_size in v3d_compute_prog_data As we plan to try to get directly the compiled variant from the cache, it would be possible to not have available the nir shaders, so we add this info on prog data. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	ab252d73a9	v3dv/pipeline: remove pipeline->use_push_constants In the past we used this boolean for several things, but it is really superfluous right now. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	f276efb2f8	v3dv/pipeline: remove pregenerate_variant Right now we were not pre-generating several variants, but we decided to keep this method, just in case we need that idea back. This ended up being a bad idea. Several months have passed without that need, so having that method just adds confusion. Also, if we need to add multiple variants in the future, perhaps we would need to do it differently, so let's not template in advance. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Alejandro Piñeiro	098816fc9a	v3dv/pipeline_cache: add more details when dumping debug info We tweak some of the individual messages a little, and add a new option to dump the stats when the pipeline is destroyed. As we are here, we also tweak the names of the global options to make them clearer. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9403>	2021-03-22 17:10:47 +00:00
Mike Blumenkrantz	ad241b15a9	vk: consolidate dynamic descriptor binding sorting this code was duplicated across several drivers Reviewed-by: Adam Jackson <ajax@redhat.com> turnip changes Reviewed-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9480>	2021-03-22 16:51:55 +00:00
Iago Toral Quiroga	cbe24a0e9c	broadcom/compiler: use nir_lower_undef_to_zero total instructions in shared programs: 13731663 -> 13721549 (-0.07%) instructions in affected programs: 98242 -> 88128 (-10.29%) helped: 191 HURT: 131 Instructions are helped. total threads in shared programs: 412272 -> 412296 (<.01%) threads in affected programs: 24 -> 48 (100.00%) helped: 12 HURT: 0 Threads are helped. total uniforms in shared programs: 3780693 -> 3779137 (-0.04%) uniforms in affected programs: 10564 -> 9008 (-14.73%) helped: 114 HURT: 7 Uniforms are helped. total max-temps in shared programs: 2319942 -> 2319528 (-0.02%) max-temps in affected programs: 4191 -> 3777 (-9.88%) helped: 113 HURT: 22 Max-temps are helped. total sfu-stalls in shared programs: 31584 -> 31616 (0.10%) sfu-stalls in affected programs: 217 -> 249 (14.75%) helped: 51 HURT: 54 Inconclusive result (value mean confidence interval includes 0). total inst-and-stalls in shared programs: 13763247 -> 13753165 (-0.07%) inst-and-stalls in affected programs: 98719 -> 88637 (-10.21%) helped: 187 HURT: 134 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9681>	2021-03-22 12:17:13 +00:00
Iago Toral Quiroga	1c987f5db3	broadcom/compiler: optimize constant vfpack total instructions in shared programs: 13733627 -> 13731663 (-0.01%) instructions in affected programs: 174140 -> 172176 (-1.13%) helped: 1597 HURT: 310 Instructions are helped. total uniforms in shared programs: 3784601 -> 3780693 (-0.10%) uniforms in affected programs: 58678 -> 54770 (-6.66%) helped: 2886 HURT: 3 Uniforms are helped. total max-temps in shared programs: 2322714 -> 2319942 (-0.12%) max-temps in affected programs: 15729 -> 12957 (-17.62%) helped: 2189 HURT: 1 Max-temps are helped. total spills in shared programs: 6010 -> 6012 (0.03%) spills in affected programs: 61 -> 63 (3.28%) helped: 0 HURT: 1 total fills in shared programs: 13494 -> 13497 (0.02%) fills in affected programs: 89 -> 92 (3.37%) helped: 0 HURT: 1 total sfu-stalls in shared programs: 31521 -> 31584 (0.20%) sfu-stalls in affected programs: 328 -> 391 (19.21%) helped: 30 HURT: 94 Inconclusive result (%-change mean confidence interval includes 0). total inst-and-stalls in shared programs: 13765148 -> 13763247 (-0.01%) inst-and-stalls in affected programs: 174237 -> 172336 (-1.09%) helped: 1551 HURT: 316 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9681>	2021-03-22 12:17:13 +00:00
Iago Toral Quiroga	b189409a46	broadcom/compiler: handle implicit uniform loads when optimizing constant alu Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9681>	2021-03-22 12:17:13 +00:00
Juan A. Suarez Romero	8ab6d2b4c4	ci/broadcom: use new piglit runner Switch from the old piglit to the new piglit-runner executor. Acked-by: Andres Gomez <agomez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9680>	2021-03-18 16:31:45 +00:00
Juan A. Suarez Romero	727eadc76a	ci/vc4/v3d: run piglit testsuite against Xorg This increases the coverage by adding tests that require an X server to run. Also update the expected results, and skip some flaky tests. Reviewed-by: Andres Gomez <agomez@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9562>	2021-03-17 11:40:58 +00:00
Juan A. Suarez Romero	dc2e3d6ff2	ci/v3dv: add flaky test in the skip list Reviewed-by: Andres Gomez <agomez@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9562>	2021-03-17 11:40:58 +00:00
Alejandro Piñeiro	ec4c79c2b0	v3dv: avoid some maybe-uninitialized warnings Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9640>	2021-03-17 10:05:07 +00:00
Alejandro Piñeiro	c373b24369	v3dv/descriptor_set: don't free individual set if not allowed If we have a host_memory_base pointer it means that we are handling the pool memory as a whole, so each set is a sub-slice of the memory pool. In this case we can't just free the individual set. In other words, VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT is not set. Note that at that point we were able to sub-allocate a set, and we are failing to sub-allocate the pool bo for the descriptor bo. So technically we could try to return that set to the pool (so undo the change on host_memory_ptr before), which I assume was my intention when (wrongly) calling vk_free there. But in practice, at that point we are already in an out-of-descriptor-pool situation, so in the end it doesn't even matter. This fixes a double free crash with the UE4 VehicleGame demo. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9640>	2021-03-17 10:05:07 +00:00
Iago Toral Quiroga	aefac60741	broadcom/compiler: use nir_lower_wrmasks to simplify TMU general stores This pass splits writemasks with non-consecutive bits into multiple store operations, ensuring that each store only has consecutive writemask bits set. We can use this to simplify writemask handling in our backend, removing a loop solely intended to handle this case. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9619>	2021-03-17 09:35:19 +00:00
Iago Toral Quiroga	51a263530f	broadcom/compiler: use nir_opt_load_store_vectorize This will make it so we pack consecutive scalar operations into a vector operation, reducing the amount of load/store operations in the NIR program. Our backend can handle vector load/stores, and doing so may be more efficient since we don't need to set up individual load/stores all the time. A pathological case is: dEQP-VK.spirv_assembly.instruction.compute.opcopymemory.array which goes from 862 instructions to only 573 by converting all scalar SSBO load/store operations to vec4 operations. total instructions in shared programs: 13752607 -> 13733627 (-0.14%) instructions in affected programs: 367117 -> 348137 (-5.17%) helped: 1168 HURT: 371 Instructions are helped. total threads in shared programs: 412230 -> 412272 (0.01%) threads in affected programs: 54 -> 96 (77.78%) helped: 23 HURT: 2 Threads are helped. total uniforms in shared programs: 3790248 -> 3784601 (-0.15%) uniforms in affected programs: 57417 -> 51770 (-9.84%) helped: 1420 HURT: 19 Uniforms are helped. total max-temps in shared programs: 2322170 -> 2322714 (0.02%) max-temps in affected programs: 14353 -> 14897 (3.79%) helped: 185 HURT: 306 Max-temps are HURT. total spills in shared programs: 5940 -> 6010 (1.18%) spills in affected programs: 65 -> 135 (107.69%) helped: 0 HURT: 11 total fills in shared programs: 13372 -> 13494 (0.91%) fills in affected programs: 75 -> 197 (162.67%) helped: 0 HURT: 11 total sfu-stalls in shared programs: 31505 -> 31521 (0.05%) sfu-stalls in affected programs: 751 -> 767 (2.13%) helped: 210 HURT: 246 Inconclusive result (value mean confidence interval includes 0). total inst-and-stalls in shared programs: 13784112 -> 13765148 (-0.14%) inst-and-stalls in affected programs: 360283 -> 341319 (-5.26%) helped: 1125 HURT: 366 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9619>	2021-03-17 09:35:19 +00:00
Iago Toral Quiroga	3db322f305	broadcom/compiler: fix end of tmu sequence detection TMUWT always terminates a TMU sequence. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9619>	2021-03-17 09:35:19 +00:00
Iago Toral Quiroga	1e4abf1fe3	vulkan/util: call glsl_type_singleton_init_or_ref from vk_instance_init v2: link libvulkan_util with libglsl so it can find the glsl singleton symbols. v3: link with libcompiler instead of libglsl (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Dylan Baker <dylan.c.baker@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> for the v3dv bits. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> for the turnip bits. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for the radv bits. Acked-by: Dave Airlie <airlied@redhat.com> for the lvp bits. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9457>	2021-03-17 08:15:36 +01:00
Lukas Feller	164a51c80f	v3dv: fix stride in buffer copy Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9402>	2021-03-17 06:42:34 +00:00
Lukas Feller	99a11f25b2	v3dv: fix assertion in job_compute_frame_tiling Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9402>	2021-03-17 06:42:34 +00:00
Mike Blumenkrantz	07c9dc54dd	v3dv: use common interfaces for shader modules squashed changes from Alejandro Piñeiro <apinheiro@igalia.com>: Add call to vk_object_base_init on internal shader_module: we have some cases where internally we have some shader modules that we don't create through CreateShaderModule, so in this case we need to manually call base_init. Not sure why this wasn't needed before. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9508>	2021-03-15 21:47:44 +00:00
Iago Toral Quiroga	177dcd4b68	broadcom/compiler: be more flexible scheduling TMU writes V3D 4.x allows more flexibility, so take advantage of that. Generally, we can reorder any writes in the same sequence, so long as they are not the sequence terminator (which must always be last, since it is the one triggering the operation), and TMUD writes, since these must be ordered with respect to each other. total instructions in shared programs: 13735183 -> 13731927 (-0.02%) instructions in affected programs: 903057 -> 899801 (-0.36%) helped: 2358 HURT: 746 Instructions are helped. total max-temps in shared programs: 2322020 -> 2322009 (<.01%) max-temps in affected programs: 619 -> 608 (-1.78%) helped: 19 HURT: 11 Inconclusive result (value mean confidence interval includes 0). total sfu-stalls in shared programs: 31494 -> 31489 (-0.02%) sfu-stalls in affected programs: 182 -> 177 (-2.75%) helped: 40 HURT: 40 Inconclusive result (value mean confidence interval includes 0). total inst-and-stalls in shared programs: 13766677 -> 13763416 (-0.02%) inst-and-stalls in affected programs: 901343 -> 898082 (-0.36%) helped: 2349 HURT: 746 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9555>	2021-03-15 08:03:28 +01:00
Iago Toral Quiroga	87ed614c47	broadcom/compiler: flag wrtmuc with a read dependency on last_tmu_config Instead of using a write dependency. We use last_tmu_config to ensure ordering of instructions participating in different TMU sequences. To this end, all sequence terminators flag a write dependency on last_tmu_config, but wrtmuc is not a sequence terminator, so we can be more flexible by flagging it as a read dependency. This prevents it from being moved into a previous sequence (since it cannot be moved past the previous sequence terminator due to the read dependency), but it allows it to be reordered with instructions in the same sequence, which allows us to pair it up more effectively. Particularly, it allows pairing up a wrtmuc with the sequence terminator of the same sequence, turning code like this: nop ; mov tmut, r0 ; thrsw; wrtmuc (tex[0].p0 | 0x3) nop ; nop ; wrtmuc (tex[0].p1 | 0x0) nop ; mov tmus, r1 Into this: nop ; mov tmut, r0 ; thrsw; wrtmuc (tex[0].p0 | 0x3) nop ; mov tmus, r1 ; wrtmuc (tex[0].p1 | 0x0) total instructions in shared programs: 13755738 -> 13735183 (-0.15%) instructions in affected programs: 2510921 -> 2490366 (-0.82%) helped: 10963 HURT: 485 Instructions are helped. total max-temps in shared programs: 2322828 -> 2322020 (-0.03%) max-temps in affected programs: 11303 -> 10495 (-7.15%) helped: 608 HURT: 19 Max-temps are helped. total sfu-stalls in shared programs: 31545 -> 31494 (-0.16%) sfu-stalls in affected programs: 235 -> 184 (-21.70%) helped: 62 HURT: 11 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13787283 -> 13766677 (-0.15%) inst-and-stalls in affected programs: 2525187 -> 2504581 (-0.82%) helped: 10989 HURT: 477 Inst-and-stalls are helped. v2: add a comment explaining the read dependency (Piñeiro). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9555>	2021-03-15 08:03:28 +01:00
Juan A. Suarez Romero	3f1c375581	ci/broadcom: allow custom kernels So far, testing VC4 and V3D/V3DV requires the CI runners to have access to a Raspberry Pi 3/4 kernel, and the corresponding modules and bootloader files. If a different kernel must be used, it means touching the runners to provide them. This commit adds the option to define a URL pointing to a (compressed) tarball containing such files, without having to deal with the runners. This link is provided through the `BM_BOOTFS` job variable. The tarball must contain two directories in the root: a `/boot` directory (containing the kernel, DTBs and bootloader files), and a `/lib/modules` (or `/usr/lib/modules`) with the kernel modules. Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9527>	2021-03-12 11:03:17 +00:00
Jason Ekstrand	4fb6c051c9	anv: Move vk_format helpers to common code The Android ones we put in anv_android.c. Maybe one day we'll want a vk_android.h to put some common Android stuff but, for now, let's keep it contained to ANV's android code. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Iago Toral Quiroga	8525cb1c53	v3dv: call util_cpu_detect() when initializing the instance Fixes this assert in debug builds: in __GI___assert_fail (assertion=0x7ffff731f66b "util_cpu_caps.nr_cpus >= 1", file=0x7ffff731f650 "../src/util/u_cpu_detect.h", line=116, function=0x7ffff7323280 <__PRETTY_FUNCTION__.11654> "util_get_cpu_caps") at assert.c:101 in util_get_cpu_caps () at ../src/util/u_cpu_detect.h:116 in _mesa_float_to_float16_rtz (val=0) at ../src/util/half_float.h:93 in util_format_r16g16b16a16_float_pack_rgba_float (dst_row=0x7fffffffbdc0 "", dst_stride=0, src_row=0x7fffffffbf90, src_stride=0, width=1, height=1) at src/util/format/u_format_table.c:13459 in util_format_pack_rgba (format=PIPE_FORMAT_R16G16B16A16_FLOAT, dst=0x7fffffffbdc0, src=0x7fffffffbf90, w=1) at ../src/util/format/u_format.h:1525 in util_pack_color (rgba=0x7fffffffbf90, format=PIPE_FORMAT_R16G16B16A16_FLOAT, uc=0x7fffffffbdc0) at ../src/gallium/auxiliary/util/u_pack_color.h:432 in v3dv_get_hw_clear_color (color=0x7fffffffbf90, internal_type=6, internal_size=8, hw_color=0x7fffffffbf10) at ../src/broadcom/vulkan/v3dv_cmd_buffer.c:1241 v2: move call from physical device to instance init. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9408>	2021-03-10 11:44:01 +01:00
Iago Toral Quiroga	c057a1211b	broadcom/compiler: disallow ldunif during ldvary sequences if possible This restores many of the hurt shaders from the previous patch at the expense of re-adding ldvary tracking in the scheduler. total instructions in shared programs: 13760415 -> 13755738 (-0.03%) instructions in affected programs: 1207560 -> 1202883 (-0.39%) helped: 5080 HURT: 1731 Instructions are helped. total max-temps in shared programs: 2322991 -> 2322828 (<.01%) max-temps in affected programs: 5063 -> 4900 (-3.22%) helped: 229 HURT: 108 Max-temps are helped. total sfu-stalls in shared programs: 31827 -> 31545 (-0.89%) sfu-stalls in affected programs: 478 -> 196 (-59.00%) helped: 304 HURT: 21 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13792242 -> 13787283 (-0.04%) inst-and-stalls in affected programs: 1220856 -> 1215897 (-0.41%) helped: 5162 HURT: 1697 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	947e9e42cc	broadcom/compiler: simplify ldvary pipelining We get optimal ldvary pipelining by doing the following: 1) Carefully merge a paired ldvary into the previous instruction when possible. 2) When the above succeeds, flag the ldvary as scheduled immediately so we can merge one of its children into the current instruction. 3) When scheduling ldvary sequences, only pick up instructions that are part of the sequence to avoid picking up something that prevents successful pipelining. This patch skips 3), accepting some hurt shaders in exchange for better scheduling flexibility during ldvary sequences. Besides eliminating most of the code dedicated to special handling of ldvary sequences, this also usually allows us to produce better code by merging instructions that are unrelated to ldvary sequences into the ldvary sequences, which is particularly effective to fill up the gaps produced when scheduling the first and last ldvary sequences as well as the gaps produced by flat and noperspective varyings sequences that don't have both mul and add instructions. Notice that there are some hurt shaders, because sometimes the extra scheduler flexibility can lead to picking up instructions that will break a sequence without compensating for that, typically an ldunif that prevents us from doing the fixup for a follow-up ldvary. We will try to correct some of these cases with the next patch. total instructions in shared programs: 13786037 -> 13760415 (-0.19%) instructions in affected programs: 3201387 -> 3175765 (-0.80%) helped: 16155 HURT: 4146 Instructions are helped. total max-temps in shared programs: 2324834 -> 2322991 (-0.08%) max-temps in affected programs: 22160 -> 20317 (-8.32%) helped: 1340 HURT: 103 Max-temps are helped. total sfu-stalls in shared programs: 30685 -> 31827 (3.72%) sfu-stalls in affected programs: 782 -> 1924 (146.04%) helped: 253 HURT: 1416 Inconclusive result. total inst-and-stalls in shared programs: 13816722 -> 13792242 (-0.18%) inst-and-stalls in affected programs: 3171642 -> 3147162 (-0.77%) helped: 15331 HURT: 4179 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	d37241bdc4	broadcom/compiler: move code block around These checks depend on prev_inst being set, so move them down below with all the other checks with the same requirement. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	8bcda472a0	broadcom/compiler: add an additional sanity check assert to the ldvary fixup Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Jason Ekstrand	e20e85f01e	nir: Make nir_ssa_def_rewrite_uses_after take an SSA value This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after() with an SSA def, and rewrites all the users as needed. Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	117668b811	nir: Make nir_ssa_def_rewrite_uses take an SSA value This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses() with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites all the users as needed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00

1379 Commits