AlexIndustrial/mesa

Author	SHA1	Message	Date
Ian Romanick	f2c69f8306	anv: Enable KHR_shader_integer_dot_product For now, only mark the 4x8BitPacked variants as accelerated. Applications are unlikely to use the "add with saturate" opcodes from VK_INTEL_shader_integer_functions2, so, technically, all of the AccumulatingSaturating variants "[provide] a performance advantage over user-provided code composed from elementary instructions..." on all Intel platforms. If we encounter an application that cares, we can do things differently then. Ditto for the non-packed 8Bit, 4-element vector variants. v2: Don't memset props as this also zeros sType and pNext. Noticed by Georg Lehmann in !12617. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>	2021-08-31 19:57:21 +00:00
Lionel Landwerlin	838c0e5eef	intel/fs: fix framebuffer reads We're missing some restrictions on those messages. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5292 Cc: mesa-stable Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12615>	2021-08-31 11:52:39 +00:00
Ian Romanick	a8d0c0af86	intel/fs: Remove type-based restriction for cmod propagation to saturated operations Previously, we misunderstood how conditional modifiers and saturate interacted. We thought the condition was evaluated before the saturate was applied. For the floating point cases, we went to some heroics to modify the condition to maintain the same results. For integer cases, it was not clear that this could even work. We had no use cases and no test cases, so we just disallowed everything. Now we understand that the condition is evaluated after the saturate. Earlier commits in this series removed the various floating point heroics. It is easier to just delete the code that prevents some cases that just work. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	5ad88fd499	intel/fs: Remove after parameter from test_saturate_prop Originally this was part of "intel/fs: Remove condition-based restriction for cmod propagation to saturated operations". With some additional changes to that commit, it caused a lot of extra churn in the unit tests. I felt that made it harder to see the actual changes in the unit tests, so I split it out. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	e6373923a7	intel/fs: Remove condition-based restriction for cmod propagation to saturated operations I don't know why the float_saturate_l_mov test was #if'ed out, but it passes... so this commit enables it. No shader-db or fossil-db changes. In a previous iteration of this MR, this commit helped ~200 shaders in shader-db. Now all of those same shaders are helped by "intel/fs: cmod propagate from MOV with any condition". All of these shaders come from Mad Max. After initial translation from NIR to assembly, these shaders contain patterns like: mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.sat(8) g90<1>F g90<8,8,1>F ... cmp.nz.f0(8) null<1>F g90<8,8,1>F 0 /* 0F */ An initial pass of cmod propagation converts this to mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.sat.XX.f0(8) g90<1>F g90<8,8,1>F Without this commit, XX is G. With this commit, XX is NZ. Saturate propagation moves the saturate: mul.sat(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.XX.f0(8) g90<1>F g90<8,8,1>F Without this commit (but with "intel/fs: cmod propagate from MOV with any condition"), the G gets propagated: mul.sat.g.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ With this commit (with or without "intel/fs: cmod propagate from MOV with any condition"), the NZ gets propagated: mul.sat.nz.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	47f0cdc449	intel/fs: cmod propagate from MOV with any condition There were tests related to propagating conditional modifiers from a MOV to an instruction with a .SAT modifier for a very long time, but they were #if'ed out. There are restrictions later in the function that limit the kinds of MOV instructions that can propagate. This avoids the dangers of type-converting MOVs that may generate flags in different ways. v2: Update the added comment to look more like the existing comment. That makes the small differences between the two cases more obvious. Noticed by Marcin. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19827127 -> 19826924 (<.01%) instructions in affected programs: 62024 -> 61821 (-0.33%) helped: 201 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36% 95% mean confidence interval for instructions value: -1.02 -1.00 95% mean confidence interval for instructions %-change: -0.36% -0.34% Instructions are helped. total cycles in shared programs: 954655879 -> 954655356 (<.01%) cycles in affected programs: 1212877 -> 1212354 (-0.04%) helped: 155 HURT: 6 helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4 helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07% HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 8 HURT stats (rel) min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15% 95% mean confidence interval for cycles value: -3.60 -2.90 95% mean confidence interval for cycles %-change: -0.07% -0.05% Cycles are helped. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	a9120eccff	intel/compiler: Move type_is_unsigned_int to brw_reg_type.h ...and rename it to brw_reg_type_is_unsigned_integer. It is now next to brw_reg_type_is_floating_point and brw_reg_type_is_integer. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	b23432c540	intel/fs: Fix a cmod prop bug when the source type of a mov doesn't match the dest type of scan_inst We were previously operating with the mindset "a MOV is just a compare with zero." As a result, we were trying to share as much code between the MOV path and the CMP path as possible. However, MOV instructions can perform type conversions that affect the result of the comparison. There was some code added to better handle this for cases like and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD mov.nz.f0(16) null<1>F g31<8,8,1>D The flaw in these changed special cases is that it allowed things like or(8) dest:D src0:D src1:D mov.nz(8) null:D dest:F Because both destinations were integer types, the propagation was allowed. The source type of the MOV and the destination type of the OR do not match, so type conversion rules have to be accounted for. My solution was to just split the MOV and non-MOV paths with completely separate checks. The "else" path in this commit is basically the old code with the BRW_OPCODE_MOV special case removed. The new MOV code further splits into "destination of scan_inst is float" and "destination of scan_inst is integer" paths. For each case I enumerate the rules that I believe apply. For the integer path, only the "Z or NZ" rules are listed as only NZ is currently allowed (hence the conditional_mod assertion in that path). A later commit relaxes this and adds the rule. The new rules slightly relax one of the previous rules. Previously the sizes of the MOV destination and the MOV source had to be the same. In some cases now the sizes can be different by the following conditions: - Floating point to integer conversions are not allowed. - If the conversion is integer to floating point, the size of the floating point value does not matter as it will not affect the comparison result. - If the conversion is float to float, the size of the destination must be greater than or equal to the size of the source. 
- If the conversion is integer to integer, the size of the destination must be greater than or equal to the size of the source. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	0797388dc2	intel/fs: Add many cmod propagation tests involving MOV instructions Of particular interest are the tests where the MOV performs a type conversion. If the restriction on conditional modifier for a MOV is ever relaxed, some of these cases must still be disallowed. v2: s/NZ/Z/ in one of the comments. Noticed by Marcin. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	4f6c5da025	intel/fs: Remove redundant inst->opcode checks in cmod prop This foreach_inst_in_block_reverse_starting_from loop only applies to CMP, MOV, and AND. AND instructions break out of the loop before this point. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	3afefb0818	intel/fs: Refactor some cmod propagation tests This will simplify some later changes to these tests. v2: Combine test_positive_saturate_prop and test_negative_saturate_prop into a single function. Suggested by Marcin. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Marcin Ślusarz	e0533ebf16	intel/compiler: INT DIV function does not support source modifiers BSpec says that for all generations. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5281 CC: mesa-stable Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12518>	2021-08-26 07:51:44 +00:00
Nanley Chery	565f9105b7	anv/image: Don't assert that HiZ can be added HiZ isn't yet enabled for Tile4/64. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	bd516c6581	intel/isl: Disable I915_FORMAT_MOD_Y_TILED on XeHP+ XeHP lacks support for Y-tiling. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	af40104e7d	intel: Add underscores to HALIGN and VALIGN enums The HALIGN enums for XeHP already have underscores. Make the other HALIGN and VALIGN enums conform. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	3d1f6342c0	intel: Update surface states for XeHP alignments Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	fbde743b07	intel/isl: Use a switch for HALIGN/VALIGN encoding Avoid using a sparse and relatively large array for HALIGN encoding. Additionally, this provides validation of the input alignment values. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	2e464e69b9	intel/isl: Fix halign/valign of uncompressed views We're going to start asserting for valid halign/valign values during surface state emission. Pre-SKL, isl_surf_get_uncompressed_surf creates uncompressed surfaces with invalid halign/valign values - 1x1. Fix this by replacing the call to isl_surf_get_image_surf with isl_surf_init, passing in the uncompressed format up-front. As we're no longer using isl_surf_get_image_surf, we also need to get the x and y offset of the image ourselves. Instead of getting a sample-based offset, then converting to elements later on, we use isl_surf_get_image_offset_B_tile_el to get the offset in elements up-front. With the above two changes, the generic code after the else-block is no longer needed for the single-layer-view code path. We move it and specialize it to the if-block (which is executed SKL+ and handles multi-layer views). Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	c7bcbc950c	intel/blorp: Fix Gfx7 stencil surface state valign Stencil on Gfx7 has a vertical alignment element of 8, but the largest its surface state can express is 4. Apply the Gfx6 solution of changing the alignment in blorp_surf_retile_w_to_y. Cc: mesa-stable Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	1f62cddaf5	intel/blorp: Fix faked RGB image alignment on XeHP On XeHP, NPOT and POT formatted surfaces will use different image alignment units when emitting surface states. When BLORP fakes an RGB image as RED, update the image alignment to prevent assert failures when emitting surface states. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	79ad9cda48	intel: Support Tile4/64 in surface states Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	dd9ae2dc7b	intel: Support Tile4/64 in depth/stencil state Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	f54de77c3a	intel/isl: Update tiling filter functions for XeHP Enable the XeHP-specific tilings and restrict them to that platform. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	0ab2fa18e4	intel/isl: Use an allow-list in gfx6_filter_tiling Try to avoid having to update isl_gfx6_filter_tiling when new tilings are added for new platforms. Note that the allow-list uses ISL_TILING_ANY_Y_MASK and thus assumes that no new Y-tilings will be added in the future. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	602f597bc1	intel/isl: Drop ISL_SURF_USAGE_DISPLAY_*_BIT We haven't used these since their introduction. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	ac37d7801c	intel/isl: Drop extra assert on array_pitch_el_rows ISL already asserts that the variable is a multiple of the tile height via isl_assert_div. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	4309012774	intel/isl: Size Tile64 surfaces with 4 dimensions In order to size Tile64 surfaces correctly, make sure that the total physical extent is arrayed. The code should handle 3D surfaces as well, but is untested for now. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	8fd7678241	intel/isl: Update image alignments on XeHP Implement the new XeHP alignment rules for surface layout. RENDER_SURFACE_STATE objects still need updating, but that's left for a separate commit. Rework: * Nanley: Include Sagar's VALIGN fix for D16_UNORM. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	0bcfa2d8fb	intel/isl: Define ISL_TILING_4/64 for XeHP XeHP defines new tiling formats, Tile4 and Tile64. They are needed in order to support depth/stencil surfaces and multisampling. Create new ISL enums and define some initial tiling information in order to enable them later on. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Nanley Chery	44ef425ce8	intel/isl: Add msaa_layout param to isl_tiling_get_info The additional parameter will be used by Tile64. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Jason Ekstrand	e307d46eab	intel/isl: Add more parameters to isl_tiling_get_info They are not used yet but the layout of Yf and Ys tiles are dependent on these parameters. While we're here, better document the function. Rework: * Nanley: Update crocus. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12132>	2021-08-25 22:39:30 +00:00
Ian Romanick	0f809dbf40	intel/compiler: Basic support for DP4A instruction v2: Very significant rebase on changes to previous commits. Specifically, brw_fs_nir.cpp changes were pretty much rewritten from scratch after changing the NIR opcode names and types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Jason Ekstrand	31fdd26d01	intel/compiler: Add unified barrier support for CS Program CS barrier message fields for producers/consumers. Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Jordan Justen	6a950bab0c	intel/compiler: Add unified barrier support for TCS Program the producers/consumer fields for TCS Barrier messages. Producer and consumer fields are set to number of TCS threads. Ref: Bspec 54006 for Barrier Data Payload Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Jordan Justen	b4055a020f	intel/compiler: Regroup TCS barrier code paths Rearrange if/else fragments to unify case for Gen11 or later platforms. This will help the code look cleaner for adding unified barrier support to TCS. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11963>	2021-08-24 01:31:48 +00:00
Nanley Chery	2944f49610	intel: Parse INTEL_NO_HW for devinfo construction This commit does several things: * Unify code common to several drivers by evaluating INTEL_NO_HW within intel_get_device_info_from_fd (suggested by Jordan). * For drivers that keep a copy of the intel_device_info struct, a separate copy of the no_hw field is now unnecessary. Remove them. * Minimize kernel queries when INTEL_NO_HW is true. This is done for code simplification, but we may find reason to undo this later on. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>	2021-08-24 00:12:47 +00:00
Nanley Chery	7d59a66e3a	intel: Use env_var_as_boolean for INTEL_NO_HW The prior method of checking the result of getenv() for NULL would cause the feature to be enabled for INTEL_NO_HW=0. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>	2021-08-24 00:12:47 +00:00
Nanley Chery	4003f2d48d	anv: Optimize genX(cmd_buffer_emit_gfx12_depth_wa) Only emit the workaround as needed. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>	2021-08-20 17:50:35 +00:00
Nanley Chery	2ae70329f5	intel: Move the D16 workarounds out of ISL Implement the workarounds in anv and iris instead. Before this commit, ISL unconditionally modified workaround registers while filling out depth stencil state. To account for this, drivers unconditionally stalled prior to emitting depth stencil packets. This hurt performance. By having the drivers perform the workarounds, they can choose when to modify the relevant registers. The drivers now avoid emitting the workaround for NULL depth buffers. This reduces stalls and leads to better performance. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits) Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>	2021-08-20 17:50:35 +00:00
Nanley Chery	14b3732b84	anv: Add genX(cmd_buffer_emit_gfx12_depth_wa) This will replace the workaround built into ISL. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>	2021-08-20 17:50:35 +00:00
Jason Ekstrand	a6a449837b	anv: Set CONTEXT_PARAM_RECOVERABLE to false We want the kernel to ban our context immediately instead of foolhardily attempting to recover. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: mesa-stable@lists.freedesktop.org Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12476>	2021-08-19 19:37:03 +00:00
Ian Romanick	5ce3bfcdf3	intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9. See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76. With the previous optimizations in place, this change seems to improve the quality of the generated code. Comparing a couple Vulkan CTS tests on Skylake had the following results. dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag: SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends dEQP-VK.spirv_assembly.type.vec3.i8.max_frag: SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Ian Romanick	f0a8a9816a	nir: intel/compiler: Add and use nir_op_pack_32_4x8_split A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results in a lot of shifts and MOVs. When that pattern can be recognized, the individual 8-bit components can be packed much more efficiently. v2: Rebase on `b4369de27f` ("nir/lower_packing: use shader_instructions_pass") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Ian Romanick	7c83aa0518	intel/fs: Emit better code for u2u of extract Emitting the instructions one by one results in two MOV instructions that won't be propagated. By handling both instructions at once, a single MOV is emitted. For example, on Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag: SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 41 instructions. 1 loops. 3804 cycles. 0:0 spills:fills, 5 sends Without "intel/fs: Allow copy propagation between MOVs of mixed sizes," the improvement is still 8 instructions, but there are more instructions to begin with: SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 44 instructions. 1 loops. 3944 cycles. 0:0 spills:fills, 5 sends Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Ian Romanick	e3f502e007	intel/fs: Allow copy propagation between MOVs of mixed sizes This eliminates some spurious, size-converting moves. For example, on Ice Lake this helps dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag: SIMD8 shader: 52 instructions. 1 loops. 4164 cycles. 0:0 spills:fills, 5 sends SIMD8 shader: 49 instructions. 1 loops. 4044 cycles. 0:0 spills:fills, 5 sends Unfortunately, this doesn't clean everything up. Here's a subset of the "before" assembly: send(8) g11<1>UW g2<0,1,0>UD 0x02106e02 dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q }; mov(8) g7<4>UB g11<8,8,1>UD { align1 1Q }; mov(8) g12<1>UB g7<32,8,4>UB { align1 1Q }; send(8) g13<1>UW g2<0,1,0>UD 0x02106e03 dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q }; mov(8) g15<1>UW g12<8,8,1>UB { align1 1Q }; mov(8) g8<4>UB g13<8,8,1>UD { align1 1Q }; mov(8) g14<1>UB g8<32,8,4>UB { align1 1Q }; mov(8) g16<1>UW g14<8,8,1>UB { align1 1Q }; xor(8) g17<1>UW g15<8,8,1>UW g16<8,8,1>UW { align1 1Q }; And here's the same subset of the "after" assembly: send(8) g11<1>UW g2<0,1,0>UD 0x02106e02 dp data 1 MsgDesc: ( untyped surface read, Surface = 2, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q }; mov(8) g7<4>UB g11<8,8,1>UD { align1 1Q }; send(8) g13<1>UW g2<0,1,0>UD 0x02106e03 dp data 1 MsgDesc: ( untyped surface read, Surface = 3, SIMD8, Mask = 0xe) mlen 1 rlen 1 { align1 1Q }; mov(8) g15<1>UW g7<32,8,4>UB { align1 1Q }; mov(8) g8<4>UB g13<8,8,1>UD { align1 1Q }; mov(8) g16<1>UW g8<32,8,4>UB { align1 1Q }; xor(8) g17<1>UW g15<8,8,1>UW g16<8,8,1>UW { align1 1Q }; There are a lot of regioning and type restrictions in fs_visitor::try_copy_propagate, and I'm a little nervous about messing with them too much. 
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Suggested-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Ian Romanick	f9665040f1	intel/compiler: Document and assert some aspects of 8-bit integer lowering In the vec4 compiler, 8-bit types should never exist. In the scalar compiler, 8-bit types should only ever be able to exist on Gfx ver 8 and 9. Some instructions are handled in non-obvious ways. Hopefully this will save the next person some time. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Jordan Justen	7faad66ab0	intel/pci-ids: Re-enable DG1 and add SG1 Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584>	2021-08-18 17:35:41 +00:00
Sagar Ghuge	57bfd7122f	anv: Fix VK_EXT_memory_budget to consider VRAM if available Instead of calling the OS query, re-run anv_update_meminfo to get the latest from either the kernel memory info API or the OS as appropriate. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5173 Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>	2021-08-18 17:13:00 +00:00
Jason Ekstrand	758662759d	anv: compute available memory in anv_init_meminfo We can now detect EXT_memory_budget support based on whether or not we have non-zero available system memory. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>	2021-08-18 17:13:00 +00:00
Jason Ekstrand	5c79c545e3	anv: Rework init_meminfo Instead of making LMEM the special case, unify the two paths by setting up a fake drm_i915_query_memory_regions struct and filling it out based on OS queries. The important functional change here is that we now pass system memory through the same GTT size and 3/4 filter that we were using with the OS queries. This should make behavior consistent on integrated GPUs regardless of whether or not we have the memory region query API. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>	2021-08-18 17:13:00 +00:00
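
Commit 7d59a66e3a above notes that checking the result of getenv() for NULL enables the feature even for INTEL_NO_HW=0. The difference can be sketched as follows. This is a minimal illustration under stated assumptions: `flag_set_naive` and `flag_set_boolean` are hypothetical names, and the second one is only a simplified stand-in for a boolean parser, not Mesa's `env_var_as_boolean`, whose accepted spellings may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Presence-only check: any value, including "0", counts as enabled
 * because only "is the variable set at all" is tested. */
static bool flag_set_naive(const char *value)
{
   return value != NULL;
}

/* Hypothetical stand-in for a boolean parser: recognizes a few
 * spellings and falls back to the default for anything else. */
static bool flag_set_boolean(const char *value, bool default_value)
{
   if (value == NULL)
      return default_value;
   if (strcmp(value, "1") == 0 || strcmp(value, "true") == 0)
      return true;
   if (strcmp(value, "0") == 0 || strcmp(value, "false") == 0)
      return false;
   return default_value;
}
```

In a driver the argument would come from `getenv("INTEL_NO_HW")`. For the value "0", the presence-only helper reports the feature as enabled, while the parsing helper reports it as disabled.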

7023 Commits
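
The MOV type-conversion conditions listed in commit b23432c540 can be read as a small predicate. The sketch below encodes only those four bullet points. `mov_conversion_allowed` and its parameters are hypothetical names chosen for illustration, not code from the Mesa tree, and the actual pass applies further restrictions that are not modeled here.

```c
#include <stdbool.h>

/* Returns true when a MOV from a source of the given kind and size (in
 * bytes) to a destination of the given kind and size satisfies the
 * size/type conditions listed in the commit message. */
static bool mov_conversion_allowed(bool src_is_float, unsigned src_size,
                                   bool dst_is_float, unsigned dst_size)
{
   /* Floating point to integer conversions are not allowed. */
   if (src_is_float && !dst_is_float)
      return false;

   /* Integer to floating point: the size of the float does not matter. */
   if (!src_is_float && dst_is_float)
      return true;

   /* Float to float or integer to integer: the destination must be at
    * least as large as the source. */
   return dst_size >= src_size;
}
```

For example, a 4-byte integer source with a 2-byte float destination passes this check, while a 4-byte integer source with a 2-byte integer destination does not.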