AlexIndustrial/mesa

Author	SHA1	Message	Date
Neil Roberts	34d4b3e367	glsl: Set default precision on record members Record types have their own slot to store the precision for each member in glsl_struct_field. Previously if the member didn’t have an explicit precision qualifier this was being left as GLSL_PRECISION_NONE. This patch makes it take into account the type’s default precision qualifier like it does for regular variables in apply_type_qualifier_to_variable. This has the additional benefit of correctly reporting an error when a float type is used in a struct without declaring the default type. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	235425771c	glsl/linker: Make precision matching optional in intrastage_match This function is confusingly also used to match interstage interfaces as well as intrastage. In the interstage case it needs to avoid comparing the precisions. This patch adds a parameter to specify whether to take the precision into account or not so that it can be used for both cases. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	19b27a8569	glsl/linker: Don’t check precision for shader interface On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. Section 4.3.10 of the GLSL ES 3.00 spec: “The type of vertex outputs and fragment inputs with the same name must match, otherwise the link command will fail. The precision does not need to match.” Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Neil Roberts	230d1e8d86	compiler/types: Making comparing record precision optional On GLES, the interface between vertex and fragment shaders doesn’t need to have matching precision. This adds an extra argument to glsl_types::record_compare to disable the precision comparison. This will later be used for the shader interface check. In order to make this work this patch also adds a helper function to recursively compare types while ignoring the precision. v2: Call record_compare from within compare_no_precision to avoid duplicating code (Eric Anholt). Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 09:29:53 +02:00
Iago Toral Quiroga	2a2501247b	nir: detect more dynamically uniform expressions Shader-db results for v3d: total instructions in shared programs: 9132728 -> 9119238 (-0.15%) instructions in affected programs: 596886 -> 583396 (-2.26%) helped: 1118 HURT: 224 total threads in shared programs: 234298 -> 234308 (<.01%) threads in affected programs: 10 -> 20 (100.00%) helped: 5 HURT: 0 total uniforms in shared programs: 3022949 -> 3022622 (-0.01%) uniforms in affected programs: 29163 -> 28836 (-1.12%) helped: 108 HURT: 37 total max-temps in shared programs: 1328030 -> 1327762 (-0.02%) max-temps in affected programs: 10097 -> 9829 (-2.65%) helped: 263 HURT: 15 total spills in shared programs: 3793 -> 3777 (-0.42%) spills in affected programs: 432 -> 416 (-3.70%) helped: 16 HURT: 0 total fills in shared programs: 4380 -> 4266 (-2.60%) fills in affected programs: 828 -> 714 (-13.77%) helped: 16 HURT: 0 Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-14 08:00:52 +02:00
Connor Abbott	37b92b0ae6	nir: Don't manually index intrinsic index enum This fixes a rebase fail in `ea51275e07`, and prevents it from happening again. There's no reason to do this manually. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-13 17:10:41 +02:00
Daniel Schürmann	7a858f274c	spirv/nir: add support for AMD_shader_ballot and Groups capability This commit also renames existing AMD capabilities: - gcn_shader -> amd_gcn_shader - trinary_minmax -> amd_trinary_minmax Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	ea51275e07	nir: add intrinsics for AMD_shader_ballot Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	1b89ebeede	nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_ballot extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Daniel Schürmann	de56ebadce	nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability This capability is required for the VK_EXT_shader_subgroup_vote extension. Reviewed-by: Connor Abbott <cwabbott0@gmail.com>	2019-06-13 12:44:23 +00:00
Caio Marcelo de Oliveira Filho	2cb5907508	glsl: Check order and uniqueness of interlock functions With this commit all remaining compilation tests in Piglit for ARB_fragment_shader_interlock will pass. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:32 -07:00
Caio Marcelo de Oliveira Filho	b7c9fc72fd	glsl: Make interlock builtins follow same compiler rules as barriers Generalize the barrier code to provide correct error messages for other builtins. Fixes most of piglit compilation tests for ARB_fragment_shader_interlock. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>	2019-06-10 14:29:26 -07:00
Eduardo Lima Mitev	fb2169040a	nir/opt_algebraic: Fix rules for imadsh_mix16 The rules added in patch `3addd7c` are inverted: It should be: (al * bh) << 16 + c instead of: (ah * bl) << 16 + c Fixes a number of regressions under dEQP-GLES31.functional.draw_indirect.compute_interop.large.* on Freedreno. Reviewed-by: Rob Clark <robdclark@gmail.com>	2019-06-10 22:27:46 +02:00
Eric Engestrom	440fe0eb43	nir: fix s/&&/\|\|/ typo Fixes: `cd73b6174b` "nir/lower_to_source_mods: Stop turning add, sat, and neg into mov" Signed-off-by: Eric Engestrom <eric.engestrom@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-07 16:06:25 +01:00
Eduardo Lima Mitev	3addd7c8d9	nir_algebraic: Add basic optimizations for umul_low and imadsh_mix16 For umul_low (al * bl), zero is returned if the low 16-bits word of either source is zero. for imadsh_mix16 (ah * bl << 16 + c), c is returned if either 'ah' or 'bl' is zero. A couple of nir_search_helpers are added: is_upper_half_zero() returns true if the highest word of all components of an integer NIR alu src are zero. is_lower_half_zero() returns true if the lowest word of all components of an integer nir alu src are zero. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Eduardo Lima Mitev	c27b3758fa	nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes 'umul_low' is the low 32-bits of unsigned integer multiply. It maps directly to ir3's MULL_U. 'imadsh_mix16' is multiply add with shift and mix, an ir3 specific instruction that maps directly to ir3's IMADSH_M16. Both are necessary for the lowering of integer multiplication on Freedreno, which will be introduced later in this series. Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-07 08:45:05 +02:00
Jason Ekstrand	7a18ce0b91	glsl/loop_analysis: Don't search for NULL variables in the hash table Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-06-06 00:27:53 +00:00
Jason Ekstrand	d96878a66a	nir/propagate_invariant: Don't add NULL vars to the hash table Fixes: `8410cf66d` "nir/propagate_invariant: Skip unknown vars" Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2019-06-06 00:27:53 +00:00
Kenneth Graunke	c7d1b52a2c	nir: Combine lower_fmod16/32 back into a single lower_fmod. We originally had a single lower_fmod option. In commit `2ab2d2e5`, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit `ca31df6f`. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which need lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Kenneth Graunke	edd45af9ba	nir: Drop lower_fmod64 option. nir_lower_doubles offers a wide variety of fp64 lowering, including lowering fmod@64. The version there also better handles imprecisions due to lowered frcp@64. Let's consolidate on one version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-06-05 16:45:12 -07:00
Jason Ekstrand	fe2fc30cb5	nir: Don't replace the nir_shader when NIR_TEST_SERIALIZE=1 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-06-05 20:07:28 +00:00
Jason Ekstrand	9eba6d9a88	nir: Don't replace the nir_shader when NIR_TEST_CLONE=1 Instead, we add a new helper which stomps one nir_shader and replaces it with another. The new helper effectively just changes which pointer gets used for the base nir_shader. It should be 99% as good at testing cloning but without requiring that everything handle having the shader swapped out from under it constantly. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Rob Clark <robdclark@chromium.org>	2019-06-05 20:07:28 +00:00
Alyssa Rosenzweig	d2d3cc66cf	nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a) This pattern was noticed in glmark's jellyfish scene. v2: Add inexact qualifier due to NaN behaviour. Minimal shader-db changes (slightly helped). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Elie Tournier <tournier.elie@gmail.com>	2019-06-04 19:57:19 +00:00
Caio Marcelo de Oliveira Filho	d482a8f680	spirv: Update the OpenCL.std.h header This corresponds to commit 8b911bd2ba37677037b38c9bd286c7c05701bcda on GitHub. We previously tweaked OpenCL.std.h from upstream to be included in C code. Now upstream header can be included, however the symbol names are slightly different (include an OpenCLstd_ prefix), so this patch also fixes vtn_opencl.c to use those. Reviewed-by: Karol Herbst <kherbst@redhat.com>	2019-06-04 12:12:51 -07:00
Jason Ekstrand	5176805471	spirv: Implement SPV_EXT_fragment_shader_interlock Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Jason Ekstrand	b5aa76b1df	spirv: Update the headers from latest Khronos master This corresponds to 8b911bd2ba37677037b38c9bd286c7c05701bcda in https://github.com/KhronosGroup/SPIRV-Headers. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-06-04 17:30:51 +00:00
Caio Marcelo de Oliveira Filho	61de825e11	spirv: Like Uniform, do nothing for UniformId Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	b4eff83180	spirv: Implement SpvOpCopyLogical This is the same as SpvOpCopyObject but without the type checking, which is how vtn_composite_copy works, so we just need to hook the operation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	81586e9f53	spirv: Generalize OpSelect SPIR-V 1.4 supports OpSelect over any composite type, and also allows scalar boolean condition for vector types -- a case which we already handled to support old GLSLang. Added a helper function to recursively perform nir_bcsel, that makes easier to support structs. v2: Replace asserts() with vtn_fail_if(). (Jason) v3: Simplify Condition and Result types verifications. (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	17630291e5	spirv: Move OpSelect handling to a function This will make a later change easier to review. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:20:54 -07:00
Caio Marcelo de Oliveira Filho	ea0e89859c	nir/vars_to_ssa: Handle UNDEF_NODE in more places Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110832 Fixes: `911ea2c66f` "nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 17:09:22 -07:00
Caio Marcelo de Oliveira Filho	1f8546ba2f	spirv: Implement OpPtrEqual, OpPtrNotEqual and OpPtrDiff Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	ca164ab495	nir: Add functions to subtract and compare addresses v2: Fix comparing addresses from formats that have more than one component by using nir_ball_iequal(). (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Caio Marcelo de Oliveira Filho	09cc3389b9	nir: Add nir_ball_iequal() helper Similar to nir_bany_inequal(). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-06-03 13:45:09 -07:00
Jonathan Marek	91672becc3	nir: copy intrinsic type when lowering load input/uniform and store output Fixes: `c1275052` "nir: add type information to load uniform/input and store output intrinsics" Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Tested-by: Erico Nunes <nunes.erico@gmail.com> Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>	2019-06-03 12:46:14 +00:00
Caio Marcelo de Oliveira Filho	75590604a9	nir: Return nir_type_invalid for non-numeric base types Now that the type gathering function look at instructions that might have other types, return invalid type instead of crashing. That invalid will be properly ignored later. Fixes: `c12750527b` "nir: add type information to load uniform/input and store output intrinsics" Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 16:27:03 -07:00
Jonathan Marek	f387c2b238	nir: remove bool lowering from lower_int_to_float Removes the bool_to_float logic from the int_to_float pass, so that both can be used separately. By having separate passes we have better validation and it makes it possible to use with the lower_ftrunc option (int lowering generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should probably be reworked). For now we always expect lower_bool to come after lower_int. Also fixes f2i32 to become ftrunc and adds u2f/f2u cases. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	f6579ee204	nir: fix lower_{int,bool}_to_float for new mov opcode It is treated like the vecN instructions which also have no type. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	f889180ee1	nir: add lower_bitshift option Add a "lower_bitshift" option, which disables optimizations introducing bitshifts and lowers ishl by constant to a multiply, so that we don't have to deal with bitshifts in int_to_float lowering. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	887c2a6092	nir: fix gather_ssa_types Consts and undefs can be used as different types (common with "0" constant) so don't copy types from consts/undefs, only to them. It doesn't entirely solve the problem that the type given to the const could be wrong , but now the only realistic case is with "0" which is the same when casted to float, so it doesn't matter for lower_int_to_float. The other change is to get type information for load input/uniform and store output, and use that to get correct results. Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	c12750527b	nir: add type information to load uniform/input and store output intrinsics This type information will be used by gather_ssa_types to get usable results Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Jonathan Marek	6016df211f	nir: improvements to native_integers removal Improvements related to the patch that removed native_integers: * In glsl_to_nir, special cases for i2f,u2f,etc are no longer needed * In prog_to_nir, use sge/slt and let lower_scmp lower it if needed Signed-off-by: Jonathan Marek <jonathan@marek.ca> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 21:35:26 +00:00
Connor Abbott	3bd0733011	nir/instr_set: Use _mesa_set_search_or_add() Before this change, we were searching for each instruction twice, once when checking if it exists and once when figuring out where to insert it. By using the new function, we can do everything we need to do in one operation. Compilation time numbers for my shader-db database: Difference at 95.0% confidence -4.04706 +/- 0.669508 -0.922142% +/- 0.151948% (Student's t, pooled s = 0.95824) Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Jason Ekstrand <jason@jlekstrand.net>	2019-05-31 19:13:59 +02:00
Ian Romanick	3ee2e84c60	nir: Rematerialize compare instructions On some architectures, Boolean values used to control conditional branches or condtional selection must be propagated into a flag. This generally means that a stored Boolean value must be compared with zero. Rather than force the generation of extra compares with zero, re-emit the original comparison instruction. This can save register pressure by not needing to store the Boolean value. There are several possible ares for future improvement to this pass: 1. Be more conservative. If both sources to the comparison instruction are non-constants, it may be better for register pressure to emit the extra compare. The current shader-db results on Intel GPUs (next commit) lead me to believe that this is not currently a problem. 2. Be less conservative. Currently the pass requires that all users of the comparison match the pattern. The idea is that after the pass is complete, no instruction will use the resulting Boolean value. The only uses will be of the flag value. It may be beneficial to relax this requirement in some cases. 3. Be less conservative. Also try to rematerialize comparisons used for discard_if intrinsics. After changing the way the Intel compiler generates cod e for discard_if (see MR!935), I tried implementing this already. The changes were pretty small. Instructions were helped in 19 shaders, but, overall, cycles were hurt. A commit "nir: Rematerialize comparisons for nir_intrinsic_discard_if too" is on my fd.o cgit. 4. Copy the preceeding ALU instruction. If the comparison is a comparison with zero, and it is the only user of a particular ALU instruction (e.g., (a+b) != 0.0), it may be a further improvment to also copy the preceeding ALU instruction. On Intel GPUs, this may enable cmod propagation to make additional progress. v2: Use much simpler method to get the prev_block for an if-statement. Suggested by Tim. Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Ian Romanick	336eab0630	nir: Add a shallow clone function for nir_alu_instr Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Matt Turner <mattst88@gmail.com>	2019-05-31 08:47:03 -07:00
Bas Nieuwenhuizen	e24a7840f6	nir: Actually propagate progress in nir_opt_move_load_ubo. Found with Jasons new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950). Fixes: `af355aaa07` "nir: add nir_opt_move_load_ubo() optimization pass" Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-05-31 07:45:43 +00:00
Jason Ekstrand	f1cb3348f1	nir/split_vars: Properly bail in the presence of complex derefs Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	cc59503b16	nir/vars_to_ssa: Properly ignore variables with complex derefs Because the core principle of the vars_to_ssa pass is that it globally (within a function) looks at all of the uses of a never-indirected path and does a full into-SSA on that path, it can't handle a path which has any chance of having aliasing. If a function_temp variable has a cast or anything else which may cause aliasing, we have to assume that all paths to that variable may alias and ignore the entire variable. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	911ea2c66f	nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer We're about to change the meaning of get_deref_node returning NULL so we need a non-NULL value to mean properly undefined. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00
Jason Ekstrand	e84194686d	nir/deref: Add a has_complex_use helper This lets passes easily detect derefs which have uses that fall outside the standard load/store/copy pattern so they can bail appropriately. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-05-31 01:08:03 +00:00

1 2 3 4 5 ...

3786 Commits