If you have a struct, the var's base driver location is not the last
driver location that will be accessed in that var. We have a shader
struct member with this number for us, already. Fixes overflows in:
dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4670>
When I was writing drm-shim, I was focused on the v3d kmsro case -- use my
intel device as the kmsro display device and add on a simulator-based v3d
device that we could render with. But for the noop backends we use for
shader-db, it's a lot more useful to just overwrite the first render node
in the system so that you don't have to pass a -d <how many render nodes I
already have in my system> argument.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4664>
Back in a2xx, HW pitches were in pixels, so storing that was reasonable.
Ever since then, the HW wants pitches in bytes, and we have only one
instance of using pitch in pixels in the code (a3xx sysmem path).
Flip things around so that only a2xx has to worry about the cpp for
looking at pitches.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4558>
Instead of copying the operands of the other p_create_vector and labelling
the definition with label_vec, copy the operands and label it with
label_temp so that it can be copy-propagated.
This was found while removing a redundant copy in load_input_from_temps()
which removed duplicate p_create_vector instructions.
shader-db (Navi):
Totals from 139 (0.11% of 127638) affected shaders:
VGPRs: 8472 -> 7948 (-6.19%)
CodeSize: 514592 -> 512368 (-0.43%)
MaxWaves: 1089 -> 1195 (+9.73%)
Instrs: 100214 -> 99658 (-0.55%)
Cycles: 400856 -> 398632 (-0.55%)
VMEM: 15545 -> 15338 (-1.33%)
Copies: 5140 -> 4584 (-10.82%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>
For copies like v[7:8] = v[8:9], what currently happens is:
- do_copy() will skip the second dword
- the uses of the second dword will be reduced to 0
- the copy operation will be removed from the map
and v8 will never be set to v9.
So just decrease the uses of other operations after splitting or removing
the current operation, so: "v8 = v9" will be split off, it's uses reduced
and then the new copy will be done in the next iteration.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4686>
The macro "_WINVER" does nothing, the macro definitions that matter for
windows API version selection are "_WIN32_WINNT" and "WINVER".
The header "sdkddkver.h" (which is included from thousands of
different windows-headers) defines "WINVER" to the same value as
"_WIN32_WINNT" of only the latter is defined, which explains why this
works right now. But we shouldn't depend on that kind of luck, and
instead define the right maco.
Fixes: 3aee462781 ("meson: add windows compiler checks and libraries")
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4681>
nir_opt_load_store_vectorize has an is_strided_vector() function that
looks for types with weird explicit strides. It does so by comparing
the explicit stride against the type-size-derived typical stride.
This had a subtle bug. Simple vector types (vec2/3/4) have no explicit
stride, so glsl_get_explicit_stride() returns 0. This never matches the
typical stride for a vector, so is_strided_vector() would return true
for basically any vector type, causing the vectorizer to bail.
I found this by looking at a compute shader with scalar SSBO loads at
offsets 0x220, 0x224, 0x228, 0x22c. nir_opt_load_store_vectorize would
properly vectorize the first two into a vec2 load, but would refuse to
extend it to a vec3 and ultimately vec4 load because is_strided_vector()
saw a vec2 and freaked out.
Neither ACO nor ANV do load/store vectorization before lowering derefs,
so this shouldn't affect them. However, I'd like to fix this bug to
avoid the trap for anyone who decides to in the future. In a branch
where anv used this lowering, this cut an additional 38% of the send
messages in the shader by properly vectorizing more things.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4255>
In case of transform feedback query, it writes two integer values,
which one is for primitives written and another is for primitives
generated.
To handle this, the second member of the struct slot_value is worth
to be presented not as a padding.
In addition, we also need to modify get/copy_result to access both
values.
This patch is the prep work for the transform feedback query support.
Tested with
dEQP-VK.pipeline.timestamp.*
dEQP-VK.query_pool.occlusion_query.*
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Brian Ho <brian@brkho.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4604>
VEC4_OPCODE_PICK_HIGH_32BIT performs 32-bit UD access on a 64-bit DF
value. abs and negate make sense on DF, but break entirely when
trying to access pieces of the value as unsigned integer dwords.
Fixes an fsign Piglit test on Ivybridge:
tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-neg-abs
It had regressed when I removed nir_lower_to_source_modifiers, as that
caused us to start generating different code which provoked this bug.
Fixes: b7c47c4f7c ("intel/compiler: Drop nir_lower_to_source_mods() and related handling.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2817
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4691>