AlexIndustrial/mesa

Author	SHA1	Message	Date
Jason Ekstrand	8fd2f2c276	intel/fs: Implement quad_swap_horizontal with a swizzle on gen7 This fixes dEQP-VK.subgroups.quad.compute.subgroupquadswaphorizontal_* on all gen7 platforms. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 22:38:19 +00:00
Jason Ekstrand	499d760c6e	intel/fs: Use ALIGN16 instructions for all derivatives on gen <= 7 The issue here was discovered by a set of Vulkan CTS tests: dEQP-VK.glsl.derivate..dynamic_ These tests use ballot ops to construct a branch condition that takes the same path for each 2x2 quad but may not be uniform across the whole subgroup. They then test that derivatives work and give the correct value even when executed inside such a branch. Because the derivative isn't executed in uniform control-flow and the values coming into the derivative aren't smooth (or worse, linear), they nicely catch bugs that aren't uncovered by simpler derivative tests. Unfortunately, these tests require Vulkan and the equivalent GL test would require the GL_ARB_shader_ballot extension which requires int64. Because the requirements for these tests are so high, it's not easy to test on older hardware and the bug is only proven to exist on gen7; gen4-6 are a conjecture. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 22:38:19 +00:00
Matt Turner	46a3ea06be	i965/fs: Print the scheduler mode. Line wrap some awfully long lines while we are here. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-07-30 14:35:43 -07:00
Matt Turner	dabb5d4bee	i965/fs: Add a shader_stats struct. It'll grow further, and we'd like to avoid adding an additional parameter to fs_generator() for each new piece of data. v2 (idr): Rebase on 17 months. Track a visitor instead of a cfg. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-30 14:35:43 -07:00
Jason Ekstrand	4bb6e6817e	intel: Use a system value for gl_FragCoord It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Jason Ekstrand	e401303597	intel/fs: Remove calculate_urb_setup from fs_visitor Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 23:30:26 +00:00
Lionel Landwerlin	c6196f7025	anv: implement VK_EXT_index_type_uint8 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-29 21:26:07 +00:00
Eric Engestrom	8486dbb066	intel/mi: only resolve to a temp register if source isn't in memory aka. fix a s/\|\|/&&/ typo Fixes: `74063ee61a` ("intel/mi: Add a new gen_mi_store_if() helper.") Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-29 13:35:42 -07:00
Jason Ekstrand	99d04a5bd6	anv: Don't claim support for 24 and 48-bit formats on IVB Cc: mesa-stable@lists.freedesktop.org	2019-07-29 11:34:30 -05:00
Jason Ekstrand	7c1b39cf18	isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW On Haswell, the format works but it doesn't properly do an sRGB decode. It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this format so this only affects Vulkan on HSW. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Eric Engestrom <eric@engestrom.ch>	2019-07-29 11:34:18 -05:00
Eric Engestrom	ef57fb2350	intel: replace large stack buffer with heap allocation For now, this keeps the "100 bytes" allocation; we can try to figure out the correct size as a follow up. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-29 13:58:57 +01:00
Eric Engestrom	d2de5b6ba2	anv+tu+radv: delete unusable dev_icd.json As per previous commit, Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is run, so delete them. Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> # for anv Reviewed-by: Eric Anholt <eric@anholt.net> # for tu Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> # for radv	2019-07-26 14:47:53 +00:00
Jason Ekstrand	295e5a17da	anv: Disable transform feedback on gen7 It's totally implementable, it's just that the plumbing is a bit different and we never hooked it up. Don't advertise a broken feature. Fixes: `36ee2fd61c` "anv: Implement the basic form of VK_EXT_transform_feedback"	2019-07-25 14:58:14 -05:00
Kenneth Graunke	fe08aa67a8	intel/mi: Add a unit test for gen_mi_store_if(). This tests that predicated stores work. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	74063ee61a	intel/mi: Add a new gen_mi_store_if() helper. This performs predicated MI_STORE_REGISTER_MEM commands, assuming that the condition is already loaded into MI_PREDICATE_DATA. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	27b5817b6c	intel/mi: Add gen_mi_nz() and gen_mi_z() helpers. These provide comparisons against zero. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	4e16b838ba	intel/mi: Add a gen_mi_ior() to go with gen_mi_iand() Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Kenneth Graunke	79b8e3c260	intel/mi: Optimize away LOAD_REGISTER_REG from a register to itself We might want to resolve something to be in a particular register, so we can access it outside of the gen_mi framework...but it may already be in that register, at which point there's no work to do. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-25 18:42:55 +00:00
Jason Ekstrand	9d2aa67c47	anv: Disable subgroup arithmetic on gen7 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-25 16:43:16 +00:00
Arcady Goldmints-Orlov	832cedfdee	anv: report HOST_ALLOCATION as supported for images Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as supported for images. It was being shown supported for buffers, but not images. Fixes: `69cc6272fb` ("anv: Implement VK_EXT_external_memory_host") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-25 09:01:26 -05:00
Daniel Schürmann	e272fdd508	nir,intel: lower if (cond) demote() to new intrinsic demote_if(cond) This will effectively enable the optimization in anv. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-24 13:02:18 -05:00
Kenneth Graunke	517005b4cf	i965: Use NIR to lower legacy userclipping. This allows us to drop legacy userclip plane handling in both the vec4 and FS backends, and simplifies a few interfaces. v2 (Jason Ekstrand): - Move brw_nir_lower_legacy_clipping to brw_nir_uniforms.cpp because it's i965-specific. - Handle adding the params in brw_nir_lower_legacy_clipping - Call brw_nir_lower_legacy_clipping from brw_codegen_vs_prog Co-authored-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-07-24 18:00:13 +00:00
Jason Ekstrand	d10de25309	anv: Implement VK_EXT_subgroup_size_control Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	bcef32d49b	anv/pipeline: Plumb pipeline shader stage create flags Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	2a236c76f8	intel/compiler: Allow for required subgroup sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	4397eb91c1	intel/compiler: Allow for varying subgroup sizes Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	c84b8eeeac	intel/compiler: Be more conservative about subgroup sizes in GL The rules for gl_SubgroupSize in Vulkan require that it be a constant that can be queried through the API. However, all GL requires is that it's a uniform. Instead of always claiming that the subgroup size in the shader is 32 in GL like we have to do for Vulkan, claim 8 for geometry stages, the maximum for fragment shaders, and the actual size for compute. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	1981460af2	intel/compiler: Lower gl_SubgroupSize in postprocess_nir Instead of lowering the subgroup size so early, wait until we have more information. In particular, we're going to want different subgroup sizes from different stages depending on the API. We also defer lowering of subgroup masks because the ge/gt masks require the subgroup size to generate a subgroup mask. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Jason Ekstrand	f62227f2b7	intel/nir: Make brw_nir_apply_sampler_key more generic Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-07-24 12:55:40 -05:00
Andrii Simiklit	fa2fc68de1	intel/compiler: don't use a keyword struct for a class fs_reg warning: struct 'fs_reg' was previously declared as a class Fixes: `e64be391` ("intel/compiler: generalize the combine constants pass") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>	2019-07-24 13:26:42 +00:00
Rafael Antognolli	1f4cbc9a06	intel/genxml: Add new test for subgroups. Make sure that a <group> tag within another <group> tag works just fine. v2: rename 'halfbyte' to 'byte' to match the size (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	fe5ae96d66	intel/genxml: Add basic infra for encoding/decoding unit tests. Adding option to print quiet. v2: Add license header. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	e25ebe2ec9	intel/gen_decoder: Decode <group> inside <group>. Now we can decode a <group> tag inside another <group> tag, and properly print its indices and content. v2: Use push/pop stack to fields, groups and iters (Lionel). v3: Add assert(iter->level < DECODE_MAX_ARRAY_DEPTH) (Lionel). Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	f670c2e1ff	intel/gen_decoder: Add the concept of array "levels". We currently only support one level, which is the basic level of a <group> tag. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	618d054283	intel/gen_decoder: Add array field. We currently use the group->next pointer to iterate through the <group> tags. This changes them to be a type of field, so we can descend into them while iterating, and then go back to the original position. Will be useful when we want to decode <group>'s inside <group>'s, and when there are more <field>'s after a <group> tag. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	21bdd51942	intel/gen_decoder: Rename internally "group" to "array". A gen_group (group in most of the code) can be of several types: - instruction - struct - register - group (?!?) The <group> tag actually represents an array of elements. So at least in our code, let's call it an array to avoid confusion with gen_group. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	69506cbb74	intel/gen_decoder: Add gen_spec_load_filename() function. Refactor the code from gen_spec_load_from_path() into a separate function, that can be used with a xml file that doesn't fit the genX.xml filename format. Will be used soon for implementing unit tests for gen_decoder. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Rafael Antognolli	1f2b22a6bd	intel/gen_decoder: Fix parsing of small genxml file. When using gen_spec_load_from_path, only abort decoding if the read length is 0. Previously, we were aborting if finding an EOF, even if something was read from the file. Also only kill the decoded file if no commands or structs were found, and print a message in such case. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-23 17:45:19 +00:00
Sagar Ghuge	806e5a37ed	anv: Implement VK_KHR_imageless_framebuffer v2: Pass pointer instead of struct instance (Lionel) v3: 1) Fix small nits (Jason) 2) Add way to detect anv_framebuffer doesn't have attachments (Jason) 3) Get rid of unnecessary pNext chain walk (Jason) 4) Keep framebuffer instance in anv_cmd_state (Jason) v4: 1) Dump attachments from cmd_buffer (Jason) v5: 1) Fix condition check and add assertion (Lionel) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-23 10:01:45 -07:00
Lionel Landwerlin	772a5f9814	anv: fix use of comma operator This doesn't fix any bug at the moment because the next statement is 'true' which happens to be APIMODE_D3D, but if that changes it could. The fixes tag is as far as I could go but the error predates it (2016 is probably far enough). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `8db6f2e6eb` ("anv/pipeline: Roll genX_pipeline_util.h into genX_pipeline.c") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-23 15:54:48 +00:00
Jason Ekstrand	fa63fad333	intel/fs: Stop stack allocating large arrays Normally, we haven't worried too much about stack sizes as Linux tends to be fairly friendly towards large stacks. However, when running DXVK apps under wine, we're suddenly subject to Windows' more stringent stack limitations and can run out of space more easily. In particular, some of the shaders in Elite Dangerous: Horizons have quite a few registers and the arrays in split_virtual_grfs are large enough to blow a 1 MiB stack leading to crashes during shader compilation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108662 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Matt Turner <mattst88@gmail.com> Cc: mesa-stable@lists.freedesktop.org	2019-07-22 16:16:39 -05:00
Caio Marcelo de Oliveira Filho	0345aeeb40	intel/compiler: Use nir_opt_conditional_discard anv vkpipeline-db results for SKL: total instructions in shared programs: 3622461 -> 3611281 (-0.31%) instructions in affected programs: 396452 -> 385272 (-2.82%) helped: 2062 HURT: 1 total cycles in shared programs: 1458144669 -> 1458105320 (<.01%) cycles in affected programs: 4171830 -> 4132481 (-0.94%) helped: 1874 HURT: 180 total loops in shared programs: 2437 -> 2437 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 8745 -> 8748 (0.03%) spills in affected programs: 8 -> 11 (37.50%) helped: 1 HURT: 1 total fills in shared programs: 23392 -> 23395 (0.01%) fills in affected programs: 8 -> 11 (37.50%) helped: 1 HURT: 1 LOST: 0 GAINED: 1 No changes to shader-db on i965 or iris. The glsl compiler already does a similar optimization. Improvement suggested by Daniel Schürmann. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-22 09:33:48 -07:00
Eric Engestrom	dffeaa55dd	util: use standard name for snprintf() Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-19 22:39:38 +01:00
Lionel Landwerlin	3adc32df92	anv: fix format mapping for depth/stencil formats anv_format is supposed to have a pointer back to the associated VkFormat, we were missing this for depth/stencil formats. This doesn't fix anything afaict, but will be needed for future changes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `465de47bad` ("anv: associate vulkan formats with aspects") Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-18 09:40:01 +03:00
Lionel Landwerlin	ce4c5474af	anv: report timestampComputeAndGraphics true Spec says : "timestampComputeAndGraphics specifies support for timestamps on all graphics and compute queues. If this limit is set to VK_TRUE, all queues that advertise the VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT in the VkQueueFamilyProperties::queueFlags support VkQueueFamilyProperties::timestampValidBits of at least 36." On gen7+ this should be true (we only have 32bits of timestamp on gen6 and below). Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `802f00219a` ("anv/device: Update features and limits") Reported-by: Timothy Strelchun <timothy.strelchun@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-17 22:46:58 +00:00
Jason Ekstrand	7ceec21b76	intel/fs: Use a strided MOV instead of a conversion for load_* destinations In many cases, the compiler can just copy-prop the strided MOV whereas the conversion is a bit trickier. This cuts 5% of the instructions off of one particular Vulkan CTS test which does lots of load_ssbo. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Jason Ekstrand	68a4c796d5	intel/fs: Properly stride NULL replacement regs in DCE This fixes some validation errors generated by certain D->W conversions but is likely not a full solution. Calculating an actual register stride is a far more complex problem in general and should probably be handled by the brw_fs_generator. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-07-17 18:44:35 +00:00
Caio Marcelo de Oliveira Filho	f07f516c56	anv: Increase state allocation size limit to 2MB When running on ICL the dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 needs more than 1M for the shader, so bump it. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-07-16 14:17:52 -07:00
Jason Ekstrand	110669c85c	st,i965: Stop looping on 64-bit lowering Now that the 64-bit lowering passes do a complete lowering in one go, we don't need to loop anymore. We do, however, have to ensure that int64 lowering happens after double lowering because double lowering can produce int64 ops. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-07-16 16:05:16 +00:00
Jason Ekstrand	6a441151c2	anv: Account for dynamic stencil write disables in the PMA fix In `6ce8592836` we started looking at the dynamic stencil state and disabling stencil writes when the stencil mask is zero. Unfortunately, we never updated the PMA fix code accordingly so 3DSTATE_WM_DEPTH_STENCIL and the PMA fix were getting out-of-sync causing hangs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109203 Fixes: `6ce8592836` "anv: Disable stencil writes when both write..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2019-07-16 15:12:45 +00:00


4419 Commits