AlexIndustrial/mesa

Author	SHA1	Message	Date
Jason Ekstrand	d3386e73c5	intel/nir: Lower array-deref-of-vector UBO and SSBO loads This fixes a serious performance issue with DXVK: https://github.com/doitsujin/dxvk/issues/937 This was caused by a recent change meant to improve performance on RADV, which backfired on ANV and killed performance for some apps: https://github.com/doitsujin/dxvk/commit/e5a06d3f4a103a54cd4eb51970fedee405d1d698 Throwing in this bit of lowering lets us come along and CSE those UBO loads (or copy-prop for SSBO load) and get one load where we previously would have gotten several. VkPipeline-db results on Kaby Lake: total instructions in shared programs: 5115361 -> 5073185 (-0.82%) instructions in affected programs: 1754333 -> 1712157 (-2.40%) helped: 5331 HURT: 63 total cycles in shared programs: 2544501169 -> 2481144545 (-2.49%) cycles in affected programs: 2531058653 -> 2467702029 (-2.50%) helped: 9202 HURT: 4323 total loops in shared programs: 3340 -> 3331 (-0.27%) loops in affected programs: 9 -> 0 helped: 9 HURT: 0 total spills in shared programs: 3246 -> 3053 (-5.95%) spills in affected programs: 384 -> 191 (-50.26%) helped: 10 HURT: 5 total fills in shared programs: 4626 -> 4452 (-3.76%) fills in affected programs: 439 -> 265 (-39.64%) helped: 10 HURT: 5 All of the shaders with hurt spilling were in Rise of the Tomb Raider which also had shaders solidly helped in the spilling department. Not shown in those results (because I've not had success dumping the shaders) is Witcher 3 where this reduces spilling and improves over-all perf by around 20-25%. There were no shader-db changes. Apparently, this just isn't a pattern that happens in OpenGL. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Cc: "19.0" mesa-stable@lists.freedesktop.org	2019-03-15 23:10:27 -05:00
Jason Ekstrand	be2990d8fb	i965: Stop setting LowerBufferInterfaceBlocks Instead, we do UBO and SSBO deref lowering in NIR after we've given it a chance to optimize SSBO access. Shader-db results on Kaby Lake: total instructions in shared programs: 15235775 -> 15235484 (<.01%) instructions in affected programs: 14992 -> 14701 (-1.94%) helped: 19 HURT: 20 total cycles in shared programs: 339220331 -> 339027307 (-0.06%) cycles in affected programs: 79831981 -> 79638957 (-0.24%) helped: 540 HURT: 602 total loops in shared programs: 4402 -> 4348 (-1.23%) loops in affected programs: 186 -> 132 (-29.03%) helped: 27 HURT: 0 total spills in shared programs: 23261 -> 23234 (-0.12%) spills in affected programs: 38 -> 11 (-71.05%) helped: 1 HURT: 0 total fills in shared programs: 31442 -> 31371 (-0.23%) fills in affected programs: 98 -> 27 (-72.45%) helped: 1 HURT: 0 LOST: 12 GAINED: 12 Most of the help and hurt in instruction counts was just churn caused by re-ordering of optimizations and the fact that the NIR deref lowering code is emitting slightly different instructions. Nothing was hurt by more than three instructions and most things weren't helped by more than four. The primary exception to this is one Car Chase shader: shaders/non-free/gfxbench4/carchase/341.shader_test CS SIMD32: 1144 -> 821 (-28.23%) There is also one compute shader in Manhattan 3.1 and a fragment shader in the UE4 Shooter Game demo that now get a loop partially unrolled. Those showed up in the results as hurt instructions but were manually removed to get the results above.
The lost/gained was a dozen Car Chase shaders that went from SIMD8 to SIMD16 thanks to improved register pressure: shaders/non-free/gfxbench4/carchase/366.shader_test CS shaders/non-free/gfxbench4/carchase/368.shader_test CS shaders/non-free/gfxbench4/carchase/370.shader_test CS shaders/non-free/gfxbench4/carchase/372.shader_test CS shaders/non-free/gfxbench4/carchase/376.shader_test CS shaders/non-free/gfxbench4/carchase/378.shader_test CS shaders/non-free/gfxbench4/carchase/380.shader_test CS shaders/non-free/gfxbench4/carchase/382.shader_test CS shaders/non-free/gfxbench4/carchase/384.shader_test CS shaders/non-free/gfxbench4/carchase/388.shader_test CS shaders/non-free/gfxbench4/carchase/4.shader_test CS shaders/non-free/gfxbench4/carchase/6.shader_test CS Given how much it appeared to be improved, I ran Car Chase on my laptop. Unfortunately, I wasn't able to see any measurable improvement. It might be helped by 1-2% but it's in the noise. It does render correctly as far as I can tell so the improvement is legitimate. All of the loops that got deleted were in dolphin uber shaders. I've had no opportunity to test them for correctness or performance. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	c8d42c8cf6	nir: Rename nir_address_format_vk_index_offset to not be vk It's just a 32-bit index and offset. We're going to want to use it in GL as well so stop talking about Vulkan. Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-15 01:02:19 +00:00
Jason Ekstrand	162286eb75	anv: Only set 3DSTATE_PS::VectorMaskEnable on gen8+ We don't set it on HSW and earlier in i965 and disabling it appears to make derivatives somewhat more reliable. Acked-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-14 12:22:20 -05:00
Plamena Manolova	19ab082001	i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9 ARB_fragment_shader_interlock depends on memory fences to ensure fragment ordering and this ordering guarantee is only supported from GEN9 onwards. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980 Fixes: `939312702e` "i965: Add ARB_fragment_shader_interlock support." Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-14 13:04:12 +00:00
Jason Ekstrand	489bf2de23	anv/pass: Flag the need for a RT flush for resolve attachments Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: mesa-stable@lists.freedesktop.org	2019-03-13 17:58:27 -05:00
Jason Ekstrand	13099d4490	anv: Stop using VK_TRUE/FALSE We've been fairly inconsistent about this so we should really choose whether we're going to use VK_TRUE/FALSE or the C boolean values. The Vulkan #defines are set to 1 and 0 respectively so it's the same value as C gives you when you cast a boolean expression to an integer. Since there are several places where we set a VkBool32 to a C logical expression, let's just embrace C booleans and stop using the VK defines. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-13 17:58:27 -05:00
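The equivalence that commit `13099d4490` relies on can be sketched in a few lines of C. This is only an illustration: the `VkBool32` typedef and the two macros below are local stand-ins that mirror the Vulkan headers so the snippet builds on its own, and `feature_supported` is a hypothetical helper, not a function from anv.

```c
#include <stdint.h>

/* Local stand-ins mirroring the Vulkan headers (assumption: defined here
 * only so the sketch compiles without vulkan_core.h). */
typedef uint32_t VkBool32;
#define VK_TRUE  1
#define VK_FALSE 0

/* Hypothetical feature query, not taken from anv. A C logical expression
 * evaluates to the int 1 or 0, so converting it to VkBool32 yields the
 * same values as VK_TRUE and VK_FALSE. */
VkBool32
feature_supported(int gen)
{
   return gen >= 8;
}
```

With these stand-ins, `feature_supported(9)` compares equal to `VK_TRUE` and `feature_supported(7)` compares equal to `VK_FALSE`, which is why either spelling can be used interchangeably.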
Caio Marcelo de Oliveira Filho	65e8761474	intel/nir: Combine store_derefs to improve code from SPIR-V Due to lack of write mask in SPIR-V store, generators may produce multiple stores to the same vector but using different array derefs. Use the combining store pass to clean this up. For example, layout(binding = 3) buffer block { vec4 v; }; void main() { v.x = 11; v.y = 22; } after going to SPIR-V and NIR, ends up with two store_derefs to v[0] and v[1] vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */ vec2 32 ssa_6 = deref_array &(*ssa_4)[0] (ssbo float) /* &((block *)ssa_2)->field0[0] */ intrinsic store_deref (ssa_6, ssa_7) (1, 0) /* wrmask=x */ /* access=0 */ vec1 32 ssa_13 = load_const (0x00000001 /* 0.000000 */) vec2 32 ssa_14 = deref_array &(*ssa_4)[1] (ssbo float) /* &((block *)ssa_2)->field0[1] */ intrinsic store_deref (ssa_14, ssa_15) (1, 0) /* wrmask=x */ /* access=0 */ producing two different sends instructions on skl. The combining pass transforms the snippet above into vec2 32 ssa_4 = deref_struct &ssa_3->field0 (ssbo vec4) /* &((block *)ssa_2)->field0 */ vec4 32 ssa_18 = vec4 ssa_7, ssa_15, ssa_16, ssa_17 intrinsic store_deref (ssa_4, ssa_18) (3, 0) /* wrmask=xy */ /* access=0 */ producing a single sends instruction. v2: Move this from spirv_to_nir into the general optimization pass for intel compiler. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
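The idea behind commit `65e8761474` can be illustrated with a toy model in C. This is a minimal sketch, not the NIR pass: `struct vec4_store` and `combine_component_store` are hypothetical names, and the model only tracks plain float values rather than SSA definitions. Each per-component store is folded into one pending vector store whose write mask accumulates the components written.

```c
/* Toy model of a pending vector store: four component values plus a
 * write mask with bit i set when component i has been written.
 * Hypothetical type, not a NIR structure. */
struct vec4_store {
   float value[4];
   unsigned wrmask;
};

/* Fold a scalar store to component `comp` into the pending vector store
 * instead of emitting a separate store for it. */
void
combine_component_store(struct vec4_store *pending, unsigned comp, float v)
{
   pending->value[comp] = v;
   pending->wrmask |= 1u << comp;
}
```

Applying this to the commit's example (`v.x = 11; v.y = 22;`) leaves a single pending store with mask 3, which corresponds to the `wrmask=xy` shown in the combined `store_deref`.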
Caio Marcelo de Oliveira Filho	10dfb0011e	intel/nir: Combine store_derefs after vectorizing IO Shader-db results for skl: total instructions in shared programs: 15232903 -> 15224781 (-0.05%) instructions in affected programs: 61246 -> 53124 (-13.26%) helped: 221 HURT: 0 total cycles in shared programs: 371440470 -> 371398018 (-0.01%) cycles in affected programs: 281363 -> 238911 (-15.09%) helped: 221 HURT: 0 Results for bdw are very similar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
Caio Marcelo de Oliveira Filho	822a8865e4	nir: Add a pass to combine store_derefs to same vector v2: (all from Jason) Reuse existing function for the end of the block combinations. Check the SSA values are coming from the right place in tests. Document the case when the store to array_deref is reused. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-13 08:39:16 -07:00
Kenneth Graunke	3570d15b6d	intel/fs: Fix opt_peephole_csel to not throw away saturates. We were not copying the saturate bit from the original instruction to the new replacement instruction. This caused major misrendering in DiRT Rally on iris, where comparisons leading to discards failed due to the missing saturate, causing lots of extra garbage pixels to be drawn in text rendering, trees, and so on. This did not show up on i965 because st/nir performs a more aggressive version of nir_opt_peephole_select, yielding more b32csel operations. Fixes: `52c7df1643` i965/fs: Merge CMP and SEL into CSEL on Gen8+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 20:11:55 -07:00
Jason Ekstrand	c056609c43	anv: Ignore VkRenderPassInputAttachmentAspectCreateInfo We don't care about the information but there's no sense in throwing a debug warning about it. It's harmless but annoying to users. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109984 Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>	2019-03-12 21:06:39 -05:00
Danylo Piliaiev	9c80be956f	anv: Fix destroying descriptor sets when pool gets reset pool->next and pool->free_list were reset before their usage in anv_descriptor_pool_free_set Fixes: `775aabdd` "anv: destroy descriptor sets when pool gets reset" Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-12 17:09:37 +00:00
Jason Ekstrand	6d5d89d25a	intel/nir: Vectorize all IO The IO scalarization pass that we run to help with linking ends up turning some shader I/O such as that for tessellation and geometry shaders into many scalar URB operations rather than one vector one. To alleviate this, we now vectorize the I/O once again. This fixes a 10% performance regression in the GfxBench tessellation test that was caused by scalarizing. Shader-db results on Kaby Lake: total instructions in shared programs: 15224023 -> 15220871 (-0.02%) instructions in affected programs: 342009 -> 338857 (-0.92%) helped: 1236 HURT: 443 total spills in shared programs: 23471 -> 23465 (-0.03%) spills in affected programs: 6 -> 0 helped: 1 HURT: 0 total fills in shared programs: 31770 -> 31766 (-0.01%) fills in affected programs: 4 -> 0 helped: 1 HURT: 0 Cycles was just a lot of churn due to moves being in different places. Most of the pure churn in instructions was +/- one or two instructions in fragment shaders. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107510 Fixes: `4434591bf5` "intel/nir: Call nir_lower_io_to_scalar_early" Fixes: `8d8222461f` "intel/nir: Enable nir_opt_find_array_copies" Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-03-12 15:34:06 +00:00
Tapani Pälli	bef354321b	anv: revert "anv: release memory allocated by glsl types during spirv_to_nir" This reverts commit `47fc359822`. The reason is that the patch did not take into account the situation where we might have both OpenGL and Vulkan using glsl_types at the same time. Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-12 14:12:36 +02:00
Juan A. Suarez Romero	775aabdd01	anv: destroy descriptor sets when pool gets reset As stated in Vulkan spec: "Resetting a descriptor pool recycles all of the resources from all of the descriptor sets allocated from the descriptor pool back to the descriptor pool, and the descriptor sets are implicitly freed." This fixes dEQP-VK.api.descriptor_pool.* Fixes: `14f6275c92` "anv/descriptor_set: add reference counting for..." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-03-11 20:40:31 -05:00
Tapani Pälli	47fc359822	anv: release memory allocated by glsl types during spirv_to_nir Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-11 13:14:45 +02:00
Tapani Pälli	7bb34ecff9	anv: release memory allocated by bo_heap when descriptor pool is destroyed Fixes following leak: ==21853== 32 bytes in 1 blocks are definitely lost in loss record 2 of 20 ==21853== at 0x483AB1A: calloc (vg_replace_malloc.c:762) ==21853== by 0x4C4DD7F: util_vma_heap_free (vma.c:221) ==21853== by 0x4C4D647: util_vma_heap_init (vma.c:46) ==21853== by 0x4957B9F: anv_CreateDescriptorPool (anv_descriptor_set.c:578) Fixes: `c520f4dec9` ("anv: Add a concept of a descriptor buffer") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 08:13:27 +02:00
Tapani Pälli	105002bd2d	anv: destroy descriptor sets when pool gets destroyed Patch maintains a list of sets in the pool and destroys possible remaining sets when pool is destroyed. As stated in Vulkan spec: "When a pool is destroyed, all descriptor sets allocated from the pool are implicitly freed and become invalid." This fixes memory leaks spotted with valgrind: ==19622== 96 bytes in 1 blocks are definitely lost in loss record 2 of 3 ==19622== at 0x483880B: malloc (vg_replace_malloc.c:309) ==19622== by 0x495B67E: default_alloc_func (anv_device.c:547) ==19622== by 0x4955E05: vk_alloc (vk_alloc.h:36) ==19622== by 0x4956A8F: anv_multialloc_alloc (anv_private.h:538) ==19622== by 0x4956A8F: anv_CreateDescriptorSetLayout (anv_descriptor_set.c:217) Fixes: `14f6275c92` ("anv/descriptor_set: add reference counting for descriptor set layouts") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 08:13:01 +02:00
Timothy Arceri	051b4064da	anv: add support for dumping shader info via VK_EXT_debug_report This information will be used by the vkpipeline-db tool. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 16:16:04 +11:00
Jason Ekstrand	8fdee457a4	anv/pipeline: Move lower_explicit_io much later Now that nir_opt_copy_prop_vars can properly handle array derefs on vectors, it's safe to move UBO and SSBO lowering to late in the pipeline. This should allow NIR to actually start optimizing SSBO access. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-08 22:03:34 -06:00
Jason Ekstrand	179d254cba	intel/nir: Move lower_mem_access_bit_sizes to postprocess_nir It doesn't really matter where this pass goes as long as it's after we call nir_lower_explicit_io and before we go into the back-end. Putting it brw_postprocess_nir lets us move nir_lower_explicit_io significantly later in the pipeline. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-08 22:03:14 -06:00
Brian Paul	0de83bacf0	intel/compiler: silence uninitialized variable warning in opt_vector_float() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-08 10:23:11 -07:00
Brian Paul	b5ea56e411	intel/decoders: silence uninitialized variable warnings in gen_print_batch() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-08 10:23:11 -07:00
Alejandro Piñeiro	cf0b2ad486	nir/xfb: adding varyings on nir_xfb_info and gather_info In order to be used for OpenGL (right now for ARB_gl_spirv). This commit adds two new structures: * nir_xfb_varying_info: that identifies each individual varying. For each one, we need to know the type, buffer and xfb_offset * nir_xfb_buffer_info: as now for each buffer, in addition to the stride, we need to know how many varyings are assigned to it. For this patch, the only case where num_outputs != num_varyings is with the case of doubles, that for dvec3/4 could require more than one output. There are more cases though (like aoa), that will be handled on following patches. v2: updated after new nir general XFB support introduced for "anv: Add support for VK_EXT_transform_feedback" v3: compute num_varyings beforehand for allocating, instead of relying on num_outputs as approximate value (Timothy Arceri) Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-08 15:00:50 +01:00
Lionel Landwerlin	7271808df8	intel/error2aub: support older style engine names Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	a036eac029	intel/error2aub: deal with GuC log buffer When GuC is enabled, the error state will contain a "global" buffer for the GuC log buffer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	c619ea945d	intel/error2aub: add a verbose option Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	ca0161f890	intel/error2aub: write GGTT buffers into the aub file Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	9b5dc2124f	intel/error2aub: store engine last ring buffer head/tail pointers Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	cdab19fa57	intel/error2aub: annotate buffer with their address space Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	630a72827a	intel/error2aub: parse other buffer types We don't write them in the aub file yet. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	c0ea043888	intel/error2aub: strengthen batchbuffer identifier marker Found out that some base64 data matched the '---' identifier. We can avoid this by adding the surrounding spaces. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
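The marker fix described in commit `c0ea043888` can be sketched as a small C helper. This is an illustration under assumptions: `is_buffer_marker` is a hypothetical name, and the sample lines used to exercise it are invented, not copied from a real error state. The point is only that searching for the dashes with their surrounding spaces no longer matches dashes that happen to occur inside encoded data.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: report whether a line carries the buffer
 * identifier. Matching " --- " (with the surrounding spaces) rather than
 * a bare "---" avoids false hits on runs of dashes inside encoded data. */
bool
is_buffer_marker(const char *line)
{
   return strstr(line, " --- ") != NULL;
}
```

A line such as `"rcs0 --- gtt_offset = 0x00001000"` is accepted, while a data line such as `":abc---def"` is rejected even though it contains three consecutive dashes.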
Lionel Landwerlin	650e6e5d33	intel/error2aub: identify buffers by engine Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Lionel Landwerlin	a07f5262f0	intel/error2aub: build a list of BOs before writing them The error state contains several kinds of BOs, including the context image which we will want to write in a later commit. Because it can come later in the error state than the user buffers and because we need to write it first in the aub file, we have to first build a list of BOs and then write them in the appropriate order. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-08 11:01:14 +00:00
Jason Ekstrand	1664de5924	nir/builder: Add a build_deref_array_imm helper Unlike most of the cases in which we do this by hand, the new helper properly handles non-32-bit pointers. Reviewed-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-03-07 21:20:30 +00:00
Kenneth Graunke	4787bc944a	isl: Add a swizzle parameter to isl_buffer_fill_state() This is necessary for legacy texture buffer object formats, where we'll need to use a swizzle to fake e.g. luminance. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-07 11:39:27 -08:00
Lionel Landwerlin	0b3871bc7f	intel/aub_write: factorize context image/pphwsp/ring creation We allocate GGTT entries and physical addresses as we create engines rather than having a fixed layout. Context images now receive a parameter argument which is used to set up pml4 & ring buffer addresses. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	c1a2c72e76	intel/aub_write: turn context images arrays into functions We'll make them more parameterized in a later commit. As this is just a transitional commit, we allow ourselves to leak the context images allocated in get_context_init(). We'll fix this in the next commit. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	8e14c9b7db	intel/aub_write: store the physical page allocator in struct We want to use this allocator in the next commit for GGTT pages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	0343a3b42b	intel/aub_write: log mmio writes Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	6ef46972d9	intel/aub_write: switch to use i915_drm engine classes Prepare aub write to deal with multiple engine instances. We don't pass the instance number yet; this could be done in the future by having a two-dimensional array of struct engine. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	8a81f5c255	intel/aub_write: break execlist write in 2 We want to reuse the execlist submission, but won't need the ring buffer update. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:32 +00:00
Lionel Landwerlin	69ee5bde4e	intel/aub_write: write header in init Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	01443f34b4	intel/aub_write: split comment section from HW setup In the future we'll want error2aub to reuse the context image saved by i915 instead of the default one we write in intel_dump_gpu. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	2b42adff14	intel/aub_read: reuse defines from gen_context Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	bf93084f44	intel/decoders: limit number of decoded batchbuffers IGT has a test to hang the GPU that works by having a batch buffer jump back into itself, triggering an infinite loop on the command stream. As our implementation of the decoding is "perfectly" mimicking the hardware, our decoder also "hangs". This change limits the number of batch buffers we'll decode before we bail to 100. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
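The guard described in commit `bf93084f44` can be sketched as a bounded walk over chained batch buffers. This is a simplified model, not the decoder's code: `struct batch`, `decode_chain`, and the macro name `MAX_DECODED_BATCHES` are hypothetical, and only the limit of 100 comes from the commit message.

```c
#include <stddef.h>

/* Limit taken from the commit message; the macro name is hypothetical. */
#define MAX_DECODED_BATCHES 100

/* Toy batch buffer: `next` models a jump to another batch buffer and is
 * NULL when the batch ends without chaining. */
struct batch {
   const struct batch *next;
};

/* Walk a chain of batch buffers and return how many were visited.
 * The counter bounds the walk, so a batch that jumps back into itself
 * cannot keep the decoder looping forever. */
unsigned
decode_chain(const struct batch *b)
{
   unsigned n_decoded = 0;

   while (b != NULL && n_decoded < MAX_DECODED_BATCHES) {
      n_decoded++;
      b = b->next;
   }
   return n_decoded;
}
```

For a batch whose `next` points at itself, as in the IGT hang test the commit mentions, the walk stops after 100 buffers instead of never returning.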
Lionel Landwerlin	acb50d6b1f	intel/decoders: handle decoding MI_BBS from ring An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Lionel Landwerlin	ec526d6ba0	intel/decoders: add address space indicator to get BOs Some commands like MI_BATCH_BUFFER_START have this indicator. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>	2019-03-07 15:08:31 +00:00
Tapani Pälli	4900c0cff4	anv: call blob_finish when done with it Fixes leaks from anv_device_upload_nir: ==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24 ==7345== at 0x4C2ED78: malloc (vg_replace_malloc.c:308) ==7345== by 0x4C31393: realloc (vg_replace_malloc.c:836) ==7345== by 0x54E0848: grow_to_fit (blob.c:67) ==7345== by 0x54E0BE5: blob_reserve_bytes (blob.c:166) ==7345== by 0x54E0C7C: blob_reserve_intptr (blob.c:186) ==7345== by 0x54704A7: nir_serialize (nir_serialize.c:1091) ==7345== by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>	2019-03-07 07:39:48 +02:00

3976 Commits