AlexIndustrial/mesa

Author	SHA1	Message	Date
Timothy Arceri	ed61530121	glsl: reserve parameter storage on cache restore Since we know how big the list will be we can allocate the storage upfront. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	1183eb487f	glsl: don't try to load/store buffer object values in the cache Also add an assert to catch buffer overflows. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	cad1a9bfde	glsl: don't reprocess or clear UBOs on cache fallback Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	01d1e5a7ad	glsl: skip more uniform initialisation when doing fallback linking We already pull these values from the metadata cache so no need to recreate them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	794f7326bc	glsl: don't lose uniform values when falling back to full compile Here we skip the recreation of uniform storage if we are relinking after a cache miss. This is important because uniform values may have already been set by the application and we don't want to reset them. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	0e9991f957	glsl: don't reference shader prog data during cache fallback We already have a reference. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	2f19accc5e	mesa/glsl: add cache_fallback flag to gl_shader_program_data This will allow us to skip certain things when falling back to a full recompile on a cache miss such as avoiding reinitialising uniforms. In this change we use it to avoid reading the program metadata from the cache and skipping linking during a fallback. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	e3adde023b	glsl: add api and glsl version to hash generation for shaders Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	dc0c0c176d	glsl: cache uniform values These may be lowered constant arrays or uniform values that we set before linking so we need to cache the actual uniform values. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	49f3439089	glsl: make uniform values helper available for use elsewhere Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	bb16cf805d	glsl: cache some more image metadata Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:43 +11:00
Timothy Arceri	a3ff840d05	glsl: add support for caching atomic buffers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3d15d814c0	glsl: add shader cache support for buffer blocks Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	6761259958	glsl: store subroutine remap table in shader cache V2: use new helpers to store/restore table entries. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	787535fb11	glsl: add support for caching subroutines Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	0057de58f9	glsl: add support for caching shaders with xfb qualifiers For now this disables the shader cache when transform feedback is enabled via the GL API as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3bbfee3cd3	glsl: add shader cache support for samplers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	c4cff5f402	glsl: add basic support for resource list to shader cache This initially adds support for simple uniforms and varyings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	3c45d8f464	glsl: fix uniform remap table cache when explicit locations used V2: don't store pointers use an enum instead to flag what should be restored. Also do the work in a helper that we will later use for the subroutine remap table. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Carl Worth	a01973a784	glsl: Serialize three additional hash tables with program metadata The three additional tables are AttributeBindings, FragDataBindings, and FragDataIndexBindings. The first table (AttributeBindings) was identified as missing by trying to test the shader cache with a program that called glGetAttribLocation. Many thanks to Tapani Pälli <tapani.palli@intel.com>, as it was review of related work that he had done previously that pointed me to the necessity to also save and restore FragDataBindings and FragDataIndexBindings. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	e5bb4a0b0f	glsl: use correct shader source in case of cache fallback The scenario is: glShaderSource glCompileShader <-- deferred due to cache hit of shader glShaderSource <-- with new source code glAttachShader glLinkProgram <-- no cache hit for program At this point we need to compile the original source when we fall back. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	8771940682	glsl: make use of on disk shader cache The hash key for glsl metadata is a hash of the hashes of each GLSL source string. This commit uses the put_key/get_key support in the cache to put the SHA-1 hash of the source string for each successfully compiled shader into the cache. This allows for early, optimistic returns from glCompileShader (if the identical source string had been successfully compiled in the past), in the hope that the final, linked shader will be found in the cache. This is based on the initial patch by Carl. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Timothy Arceri	34ca0fce22	glsl: add initial implementation of shader cache This uses disk_cache.c to write out a serialization of various state that's required in order to successfully load and use a binary written out by a driver's backend; this state is referred to as "metadata" throughout the implementation. This initial version is intended to work with all stages besides compute. This patch is based on the initial work done by Carl. V2: extend the file's doxygen comment to cover some of the design decisions. V3: - skip cache for fixed function shaders - add int64 support - fix glsl IR program parameter caching/restore and cache the parameter values which are used by gallium backends. - use new link status enum V4: - add compute program support Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-02-17 11:18:42 +11:00
Dave Airlie	03f4982c68	nir: handle some 64-bit integer conversions These are enough for the spir-v generator to handle UConvert and SConvert operations, and fix the 4 tests in CTS. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:13:21 +10:00
Dave Airlie	adb9555794	nir: handle 64-bit integer types in glsl->nir type conversion. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:13:14 +10:00
Dave Airlie	14167080e2	spirv: handle SpvOpUConvert in proper place. This was falling into the quantizetof16 path. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:59 +10:00
Dave Airlie	2d0b145902	spirv: add support for Int64 capability This just adds the support at the spirv->nir level for the Int64 cap. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:13 +10:00
Dave Airlie	48ebdbecc5	spirv/nir: add support for int64 This adds the spirv->nir conversion for int64 types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:11:05 +10:00
Dave Airlie	7593f2ac1b	nir/types: add C accessors for 64-bit integer types. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-02-16 14:10:45 +10:00
Bas Nieuwenhuizen	501a4c0d73	spirv: Add support for SpvCapabilityStorageImageReadWithoutFormat. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-02-15 21:18:18 +01:00
Kenneth Graunke	a3e4fa5495	glsl: Handle packed_type == ivec4[] in lower_packed_varyings(). For GS input arrays, we may turn a packed_type of ivec4 into an array of ivec4s. We still want flat qualification. Found by inspection. Not known to help anything. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-02-14 14:47:40 -08:00
Alex Smith	94d48b7f9f	spirv: Add support for SpvCapabilityStorageImageWriteWithoutFormat Allow that capability if the driver indicates that it is supported, and flag whether images are read-only/write-only in the nir_variable (based on the NonReadable and NonWritable decorations), which drivers may need to implement this. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 08:16:52 -08:00
Iago Toral Quiroga	5c6eaa1421	nir/spirv: do not require a format with images that are not sampled As soon as we support shaderStorageImageWriteWithoutFormat we can see write-only images (sampled == 2) that don't have a format specified. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-14 08:16:52 -08:00
Anuj Phogat	5e2909e732	mesa: Add EXT_frag_depth bits and enable it on all drivers Passes the newly added piglit test for this extension on i965. V2: Fix comments by Ilia. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>	2017-02-13 16:08:40 -08:00
Kenneth Graunke	57dc6d80a0	glsl: Drop resize-to-MaxPatchVertices hack. TCS and TES inputs without an array size are implicitly sized to gl_MaxPatchVertices. But TCS outputs are apparently not: "If no size is specified, it will be taken from the output patch size (gl_VerticesOut) declared in the shader." Fixes dEQP-GLES31.functional.program_interface_query.program_output. array_size.separable_tess_ctrl.var. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:25 -08:00
Kenneth Graunke	e99df398f1	glsl: Update a comment about link errors for TCS && !TES. OpenGL ES actually has spec text to prohibit this. It's just OpenGL that's confusing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>	2017-02-12 21:09:21 -08:00
Jose Maria Casanova Crespo	5bc222ebaf	glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1 From GLSL ES 3.10 spec, section 4.1.9 "Arrays": "If an array is declared as the last member of a shader storage block and the size is not specified at compile-time, it is sized at run-time. In all other cases, arrays are sized only at compile-time." In desktop GLSL it is allowed to have unsized-arrays that are not last, as long as we can determine that they are implicitly sized, which is detected at link-time. With this patch Mesa reports a compilation error as glslang does with the following shader: buffer SSBO { vec4 data[]; vec4 moreData;}; void main (void) { } Fixes: dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader Cc: "17.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-10 23:14:12 -08:00
Matt Turner	d7a0486a9e	glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=... Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed an OpenGL 3.3 compatibility profile context (with various unimplemented features and bugs), but still refused to compile shaders with #version 330 compatibility This patch simply adds a small bit of plumbing to let that through. Of course the same caveats apply: compatibility profile is still not supported (and will not be supported), so there are no guarantees that anything will work. Tested-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-02-09 15:14:43 +00:00
Samuel Iglesias Gonsálvez	824e1bb078	nir: add opcode to perform int64 to bool conversions Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-09 10:18:34 +01:00
Timothy Arceri	0bf21519b7	glsl: add param to force shader recompile This will be used to skip checking the cache and force a recompile. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 12:22:56 +11:00
Timothy Arceri	a3fd8bb8c5	st/mesa/i965: create link status enum For the on-disk shader cache we want to be able to differentiate between a program that was linked and one that was loaded from cache. V2: - don't return the new enum directly to the application when queried, instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome corruptions when using cache. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-09 12:22:56 +11:00
Jason Ekstrand	1de3cd8a34	spirv: Add more asserts in vtn_vector_construct Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99465	2017-02-07 08:08:06 -08:00
Marc Di Luzio	21efe2528c	glsl: correct compute shader checks for memoryBarrier functions As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-02-06 21:12:33 -08:00
Lionel Landwerlin	875b15eec4	spirv: add SPV_KHR_shader_draw_parameters support Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:33 +00:00
Lionel Landwerlin	bd46040162	compiler: add missing enums for debug Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 15:08:30 +00:00
Francisco Jerez	11e9ebbf15	nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	013d40d1ce	glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞). Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:33:33 -08:00
Francisco Jerez	7215375c44	nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. See "glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity." for the rationale, but note that the instruction count benefit discussed there is somewhat less important for the SPIRV implementation, because the current code already emitted no control flow instructions -- Still this saves us one hardware instruction per scalar component on Intel SKL hardware. Fixes the following Vulkan CTS tests on Intel hardware: dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4 Note that most of the test-cases above expect IEEE-compliant handling of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so except for the last two the test-cases above weren't expected to pass yet. The reason they do is that the i965 back-end implementation of the NIR fmin and fmax instructions is not quite GLSL-compliant (it complies with IEEE 754 recommendations though), because fmin/fmax of a NaN and a non-NaN argument currently always return the non-NaN argument, which causes atan() to flush NaN to one and return the expected value. The front-end should probably not be relying on this behavior for correctness though because other back-ends are likely to behave differently -- A follow-up patch will handle the atan2(±∞, ±∞) corner cases explicitly. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:33:27 -08:00
Francisco Jerez	e9ffd12827	glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above a certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% fewer instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2017-01-31 10:32:45 -08:00
Francisco Jerez	7ec3af3f8f	glsl/ir_builder: Add rcp builder. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>	2017-01-31 10:32:43 -08:00
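
Note on commit 8771940682: the message describes the metadata key as "a hash of the hashes of each GLSL source string". A minimal sketch of that idea follows, in Python rather than Mesa's C. The function names, the hex encoding of the inner digests, and the ordering of the sources are assumptions made for illustration and are not taken from the Mesa source. It also leaves out the API and GLSL version that commit e3adde023b adds to the hash.

```python
import hashlib
from typing import Sequence


def shader_source_key(source: str) -> str:
    """SHA-1 hex digest of a single GLSL source string (hypothetical helper)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest()


def program_metadata_key(sources: Sequence[str]) -> str:
    """Hash of the per-shader hashes (hypothetical helper).

    Assumption: the per-shader digests are fed to the outer hash in the
    order given, so reordering the shaders changes the key.
    """
    outer = hashlib.sha1()
    for src in sources:
        outer.update(shader_source_key(src).encode("ascii"))
    return outer.hexdigest()


if __name__ == "__main__":
    vs = "#version 330\nvoid main() { gl_Position = vec4(0.0); }\n"
    fs = "#version 330\nout vec4 c;\nvoid main() { c = vec4(1.0); }\n"
    print(program_metadata_key([vs, fs]))
```

With a scheme like this, the per-shader key can be looked up at glCompileShader time (the optimistic early return the commit mentions), while the combined key is only known once every attached shader has been hashed.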


1570 Commits