AlexIndustrial/mesa

Author	SHA1	Message	Date
Caio Marcelo de Oliveira Filho	1021abab07	intel/tools: Fix aub_file initialization in intel_dump_gpu The `device` can be set earlier either by a command line or a by intercepting an ioctl call to get the I915_PARAM_CHIPSET_ID done by the application early. In both cases `aub_file` and `devinfo` would not be initialized. Fix by splitting the conditions - `device == 0`: use the FD to get both device and devinfo. - Or `devinfo.gen == 0`: use `device` to initialize it. And separatedly, initialize aub_file the first time it is needed. Fixes: `d594d2a052` ("intel/tools: use device info initializer") Acked-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 19:18:26 -07:00
Rafael Antognolli	7bc022b4bb	anv/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced. If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Don't need to set the mask - it's mbo (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	ad513fd386	intel: Get information about pixel pipes subslices. v2: Use 1 instead of 1UL (Ken).	2019-08-12 16:19:08 -07:00
Rafael Antognolli	32344dc581	intel/gen_decoder: Decode SLICE_HASH_TABLE.	2019-08-12 16:19:08 -07:00
Rafael Antognolli	e1cb71c182	intel/genxml: Update 3D_MODE and add SLICE_HASH_TABLE. Add these fields and the 3DSTATE_SLICE_TABLE_STATE_POINTERS instruction so we can properly configure the slice and subslice hashing on ICL+ v2: Make 'Mask' field a mbo (Ken).	2019-08-12 16:19:08 -07:00
Jason Ekstrand	d787a2d05e	anv: Implement VK_KHR_pipeline_executable_properties Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	67cb55ad11	anv: Add a ralloc context to anv_pipeline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	fec4bdff40	anv: Force a full re-compile when CAPTURE_INTERNAL_REPRESENTATION_TEXT is set Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	651fbbf9b8	anv/pipeline: Split setting up per-stage keys into its own loop Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	78f3dfb4a2	anv: Record shader compile stats in the pipeline cache We're going to want these to be available regardless of caching. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	2af380d20f	anv/pipeline: Stash generated code in the pipeline stage Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	8d3cbd0393	intel/fs: Add SLM size to brw_cs_prog_data We don't need it for state setup but it's a useful statistic we want to pass on to developers. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Jason Ekstrand	134607760a	intel/compiler: Fill a compiler statistics struct This commit is all annoying plumbing work which just adds support for a new brw_compile_stats struct. This struct provides a binary driver readable form of the same statistics we dump out to stderr when we INTEL_DEBUG is set with a shader stage. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-08-12 22:56:07 +00:00
Paulo Zanoni	866bb775de	intel/fs: add 64 bit integer multiplication lowering While NIR's lower_imul64() solves the case of 64 bit integer multiplications generated early, we don't have a way to lower such instructions when they are generated by our own backend, such as the scan/reduce intrinsics. We'll need this soon, so implement it now. An easy way to test this is to simply disable nir_lower_imul64 to let those operations reach the backend. v2: - Fix Q/UQ copy/paste errors (Caio). - Transform an 'if' into 'else if' (Caio). - Add an extra comment to clarify the need for 64b = 32b * 32b (Caio). - Make private functions private (Caio). v3: - Remove ambiguity with 'b' and 'd' variables (Caio). - Allocate potentially less regs for the dwords (Caio). Cc: Jason Ekstrand <jason.ekstrand@intel.com> Cc: Matt Turner <matt.turner@intel.com> Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Paulo Zanoni	9217cf3b5e	intel/compiler: invert the logic of lower_integer_multiplication() Invert the logic of how progress is handled: remove the continue statements and mark progress inside the places where it actually happens. We're going to add a new lowering that also looks for BRW_OPCODE_MUL, so inverting the logic here makes the resulting code much easier to follow. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Paulo Zanoni	6ba4717924	intel/compiler: don't instantiate a builder for each instruction Don't instantiate a builder for each instruction during lower_integer_multiplication(). Instantiate one only when needed. On the other hand, these unneeded builders don't seem to cost much to init, so I don't expect any significant difference in performance: this is mostly about code organization. Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Paulo Zanoni	75b3868dcc	intel/compiler: extract subfunctions of lower_integer_multiplication() The lower_integer_multiplication() function is already a little too big. I want to add more to it, so let's reorganize the existing code first. Let's start with just extracting the current code to subfunctions. Later we'll change them a little more. v2: Make private functions private (Caio). v3: Fix typo (Caio). Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>	2019-08-12 15:16:23 -07:00
Rhys Perry	7740149852	nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-08-12 22:01:30 +00:00
Francisco Jerez	c2fe7a0fb8	anv/gen9: Optimize slice and subslice load balancing behavior. See "i965/gen9: Optimize slice and subslice load balancing behavior." for the rationale. According to Jason, improves Aztec Ruins performance by 2.7%. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1) v2: Undo CPU performance micro-optimization done in i965 and iris due to lack of data justifying it on anv. Use cmd_buffer_apply_pipe_flushes wrapper instead of emitting pipe control command directly. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-12 14:40:21 -07:00
Francisco Jerez	03cba9f5d9	intel/genxml: Add GT_MODE hashing defs for Gen9. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-12 13:17:58 -07:00
Jason Ekstrand	14c96a6300	anv: Implement VK_EXT_subgroup_size_control version 2 The version bump adds a proper features struct. Fixes: `d10de25309` "anv: Implement VK_EXT_subgroup_size_control" Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-12 14:56:33 +00:00
Eric Engestrom	e4aa0fc63a	anv: add missing `break` Fixes: `f6e7de41d7` ("anv: Implement VK_EXT_line_rasterization") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 23:34:31 +01:00
Lionel Landwerlin	cefb4341b7	anv: drop unused code We stopped using this when we moved to Jason's mi_builder. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 17:01:38 +03:00
Tapani Pälli	5e38db0c47	anv/android: disable shared representable image support explicitly Android 9 loader conditionally advertises VK_KHR_shared_presentable_image extension based on this property and it looks like it does not initialize the struct before query. Pragmas are added to ignore warnings with Android specific structure types in same manner as commit `8d386e6eef` did. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-09 08:53:54 +03:00
Greg V	ac1561088d	intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: `134e750e16` ("i965: extract performance query metrics")	2019-08-08 21:44:33 +01:00
Greg V	c00ee00031	i965/tiled_memcpy: avoid creating bswap32 if it exists as a macro (e.g. on FreeBSD) Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	7b520dc74f	anv: add MAP_POPULATE fallback define for portability FreeBSD does not have MAP_POPULATE Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Greg V	2be3f16600	anv: remove unused Linux-specific include Fixes: `4201cc2dd3` ("anv: Implement VK_KHX_external_semaphore_fd") Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-08-08 21:44:33 +01:00
Rhys Perry	c52c54a746	anv,i965,iris: deduplicate setting of total_shared v5: add patch Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Rhys Perry	024a46a407	anv: use derefs for shared memory access vkpipeline-db for my Skylake GPU: total instructions in shared programs: 8847602 -> 8847896 (<.01%) instructions in affected programs: 10165 -> 10459 (2.89%) helped: 8 HURT: 2 total cycles in shared programs: 1606273555 -> 1606251634 (<.01%) cycles in affected programs: 2201803 -> 2179882 (-1.00%) helped: 7 HURT: 3 The shaders with more instructions is due to a loop over a shared array in Three Kingdoms being unrolled (and creating a lot of nested ifs). Not sure if that's good or bad. One of the shaders with worse cycles is only worse by 0.04% and the other two are the shaders with loops unrolled. v2: add patch v4: don't set spirv_options.shared_addr_format v4: move comment concerning the shared address format used and NULL v4: add vkpipeline-db results v5: rename to nir_lower_vars_to_explicit_types v5: move setting of total_shared to outside brw_compile_cs v6: set shared_addr_format v6: formatting changes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> (v5) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 12:10:39 -05:00
Tapani Pälli	aba57b11ee	anv: support GetSwapchainGrallocUsage2ANDROID for Android New function supports gralloc1 usage flags that get set separately for producer and consumer. As we still need to support old method too, let's share common code and use android_convertGralloc0To1Usage helper. Bump the VK_ANDROID_native_buffer version to indicate support for the new call. Changes were tested on Android Celadon P with Basemark GPU and various Sascha Willems Vulkan demos. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-08-08 05:08:01 +00:00
Mark Janes	61c54a8878	intel/perf: fix debug typo Misspelling was seen with INTEL_DEBUG=perfmon. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	2df1ab4d48	intel/perf: make gen_perf_query_object private Encapsulate the details of this structure within the perf implemenation. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	deea3798b6	intel/perf: make perf context private Encapsulate the details of this data structure. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	1f4f421ce0	intel/perf: print debug information INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs to a dump routine that can be called from i965. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	a663c8c26e	intel/perf: make internal methods private Now that all references from i965 have been moved to perf, we can make internal methods private again. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	be8b466cff	intel/perf: make oa_sample_buffers private All references to this data structure have been moved inside the perf subsystem. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	f2a049b4e3	intel/perf: expose method to create query By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	9f5c160d82	intel/perf: move initialization of pipeline statistics metrics to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	9f84efb452	intel/perf: move get_query_data into gen_perf This refactor moves several helper functions for get_query_data as well: - accumulate_oa_reports - read_gt_frequency - get_pipeline_stats_data - get_oa_counter_data Functions which are no longer referenced in brw_performance_query.c have been removed. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	73eccdc4a5	intel/perf: move delete_query to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	8c9eac1234	intel/perf: move is_query_ready to gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	a9be292722	intel/perf: move wait_query to perf The following methods have duplicate implementation of read_oa_samples_until in brw_performance_query.c: - read_oa_samples_for_query - read_oa_samples_until They ar still referenced by other methods in the file and will be removed on the subsequent commit. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	3c8ed58486	intel/perf: create a vtable entry for bo_busy Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	6fed756388	intel/perf: create a vtable entry for bo_wait_rendering Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	511bb15d4b	intel/perf: create a vtable entry for batch_references Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	3ecb23092e	intel/perf: refactor gen_perf_end_query into gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:56 -07:00
Mark Janes	018f9b81e5	intel/perf: refactor gen_perf_begin_query into gen_perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	52d3db9ab6	intel/perf: move perf-related state into gen_perf_context To move more operations into intel/perf, several state items are needed. Save references to that state in the perf_ctxt, rather than passing them in for every operation. This commit includes an initializer for gen_perf_context, to set those references and also encapsulate the initialization of the sample buffer state. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00
Mark Janes	df18acee78	intel/perf: create a vtable entries for buffer object map/unmap These operations are needed to refactor subsequent methods into perf Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-07 21:33:55 -07:00

1 2 3 4 5 ...

4522 Commits