Mike Blumenkrantz
fcf58e75d0
lavapipe: heap-allocate rendering_state struct
...
this thing is like 28k now, which is just way too big to have on the stack
Reviewed-by: Dave Airlie <airlied@redhat.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15233 >
2022-03-07 03:56:46 +00:00
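A minimal C sketch of the stack-to-heap change described in fcf58e75d0. The names `big_state` and `execute_commands` and the 28 KiB payload are hypothetical stand-ins, not lavapipe's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large per-execution state struct (~28 KiB). */
struct big_state {
   unsigned char payload[28 * 1024];
   int commands_run;
};

/* Returns the number of commands processed, or -1 if allocation fails. */
int execute_commands(int num_commands)
{
   /* Before: `struct big_state state = {0};` would place ~28 KiB on the
    * stack. After: allocate it zero-initialized on the heap instead. */
   struct big_state *state = calloc(1, sizeof(*state));
   if (!state)
      return -1;

   for (int i = 0; i < num_commands; i++)
      state->commands_run++;

   int result = state->commands_run;
   free(state);
   return result;
}
```

The cost is one allocation per call and an extra failure path, in exchange for a stack frame that no longer depends on the size of the struct.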
Mike Blumenkrantz
c82dcdf598
gallivm: avoid division by zero when computing cube face
...
this is illegal and produces NaNs which blow up the sample instr
cc: mesa-stable
fixes (llvmpipe and zink):
KHR-GL45.incomplete_texture_access.sampler
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_pointer.render.struct_in_array.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.array_in_struct.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.basic.samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.basic_struct.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.nested_structs_arrays.sampler2D_samplerCube_vertex
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_both
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_fragment
dEQP-GLES31.functional.program_uniform.by_value.render.struct_in_array.sampler2D_samplerCube_vertex
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15246 >
2022-03-06 02:44:17 +00:00
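A scalar C sketch of the division-by-zero hazard fixed in c82dcdf598. It is not the gallivm code, which emits vectorized LLVM IR. The names `cube_coord` and `cube_project` are hypothetical, and substituting 1.0 for a zero major axis is an assumed guard chosen for illustration.

```c
#include <assert.h>

/* Hypothetical result type: selected face plus normalized [0, 1] coords. */
struct cube_coord {
   int face;
   float s, t;
};

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Selects the cube face from the major axis of (x, y, z) and projects the
 * other two components onto it, following the usual OpenGL face table. */
struct cube_coord cube_project(float x, float y, float z)
{
   float ax = absf(x), ay = absf(y), az = absf(z);
   struct cube_coord c;
   float ma, sc, tc;

   if (ax >= ay && ax >= az) {
      c.face = x >= 0.0f ? 0 : 1;
      ma = ax; sc = x >= 0.0f ? -z : z; tc = -y;
   } else if (ay >= az) {
      c.face = y >= 0.0f ? 2 : 3;
      ma = ay; sc = x; tc = y >= 0.0f ? z : -z;
   } else {
      c.face = z >= 0.0f ? 4 : 5;
      ma = az; sc = z >= 0.0f ? x : -x; tc = -y;
   }

   /* An all-zero direction gives ma == 0, and 0/0 would produce NaN.
    * Assumed guard: divide by 1 instead so the result stays finite. */
   if (ma == 0.0f)
      ma = 1.0f;

   c.s = 0.5f * (sc / ma + 1.0f);
   c.t = 0.5f * (tc / ma + 1.0f);
   return c;
}
```

With the guard, a zero direction vector maps to the center of face 0 rather than to NaN coordinates.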
Mike Blumenkrantz
cf9454bb2a
gallivm: fix debug prints for halfs
...
Reviewed-by: Roland Scheidegger <sroland@vmware.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15246 >
2022-03-06 02:44:17 +00:00
Icecream95
ba18799ca1
pan/bi: Don't assign slots for the blend second source
...
Another instruction might write to the second source, and then an
INSTR_INVALID_ENC fault will be raised because the tuple will write to
and read from the register at the same time.
Fixes: 795638767d ("pan/bi: Use fused dual source blending")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
Icecream95
66a604efb5
pan/bi: Skip pseudo sources in ISA.xml
...
The second staging register source for the +BLEND instruction should
not be packed nor disassembled, so skip it when include_pseudo is not
set.
Fixes: 795638767d ("pan/bi: Use fused dual source blending")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
Icecream95
9d4441c71a
panfrost: Fix ubo_mask calculation
...
BITSET_MASK returns ~0 when given an input of zero, when we need it to
return 0 instead.
Fixes shaders with only sysvals but no UBOs when push constants are
disabled.
This breaks when 31 or 32 UBOs are used, but PAN_MAX_CONST_BUFFERS is
currently set to 16.
Fixes: c246af0dd8 ("panfrost: Only upload UBOs when needed")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
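A small C sketch of the mask pitfall described in 9d4441c71a. Both helpers are hypothetical and written for illustration; they are not Mesa's macros. `WRAPPING_MASK` reproduces the behaviour the message describes (all ones for an input of zero), and `low_bits_mask` is one way to get 0 instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask helper with the pitfall: any count that is a multiple
 * of 32, including zero, yields all ones. */
#define WRAPPING_MASK(n) \
   (((n) % 32 == 0) ? ~0u : ((1u << ((n) % 32)) - 1u))

/* Hypothetical replacement: returns 0 for n == 0. Only valid for n < 32,
 * because shifting a 32-bit value by 32 or more is undefined in C. */
uint32_t low_bits_mask(unsigned n)
{
   return (1u << n) - 1u;
}
```

For a shader with zero UBOs, `WRAPPING_MASK(0)` marks every slot as used while `low_bits_mask(0)` marks none. The shift-based form trades the zero case for an upper bound on the count.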
Icecream95
0b232b8659
panfrost: Improve comment for emit_fragment_job
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
Icecream95
24101d944b
pan/bi: Add documentation for bifrost_nir_lower_store_component
...
Taken from the commit that introduced the function,
95458c4033 ("pan/bi: Lower stores with component != 0").
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
Icecream95
42caddcf6b
pan/bi: Make disassembler build reproducibly
...
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:55:00 -05:00
Icecream95
d6c431c2e3
panfrost: Re-emit descriptors after resource shadowing
...
This could be made slightly more efficient by only setting the dirty
state that is needed, but eventually you reach a point where it's
cheaper to re-emit everything than work out what can or can't be kept.
Fixes rendering issues in Duckstation.
Fixes: cd2c1ef9da ("panfrost: Dirty track textures/samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:54:58 -05:00
Icecream95
b164ee0d7b
panfrost: Set dirty state in set_shader_buffers
...
Otherwise the pointer (which is uploaded as a sysval) won't be updated
when a new SSBO is bound.
Fixes: c34b760b9f ("panfrost: Dirty track constant buffers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:50:09 -05:00
Icecream95
cb8c47b15e
pan/bi: Check dependencies of both destinations of instructions
...
TEXC can have two destinations, and the value of neither can be used in
the same bundle, so extend the existing check to iterate over both
destinations.
Fixes artefacts in the game "LIMBO".
Fixes: a303076c1a ("pan/bi: Add bi_instr_schedulable predicate")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:50:09 -05:00
Icecream95
9e714f7455
pan/bi: Add interference between destinations
...
Trying to write to overlapping register ranges from a single
instruction is undefined behaviour, so add interference between the
nodes to avoid this.
Hit in a dual-texture instruction in LIMBO.
Fixes: 9146bafbb4 ("pan/bi: Add dual texture fusing pass")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:50:09 -05:00
Icecream95
198cb4a77a
panfrost: Disable point size upper limit clamping
...
The hardware already clamps this, so there is no need to do it in the
shader.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:50:09 -05:00
Icecream95
66684339d5
panfrost: Update point size limits to match hardware behaviour
...
Found while reverse-engineering the tiler heap format.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:50:09 -05:00
Icecream95
d54efebf04
panfrost: Set PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
...
Fixes arb-provoking-vertex-render Piglit test.
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:47:24 -05:00
Icecream95
948300da27
pan/mdg: Use util_logbase2 instead of C99 log2
...
log2 operates on doubles; we only need the integer util/ function.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15250 >
2022-03-05 14:27:44 -05:00
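A sketch of an integer base-2 logarithm of the kind 948300da27 switches to. The function `ilog2_floor` is a hypothetical stand-in written for illustration, not the body of `util_logbase2`.

```c
#include <assert.h>

/* Hypothetical integer floor(log2(n)). Returns 0 for n == 0 and n == 1.
 * No floating-point conversion is involved. */
unsigned ilog2_floor(unsigned n)
{
   unsigned result = 0;

   /* Count how many times n can be halved before it reaches zero. */
   while (n >>= 1)
      result++;

   return result;
}
```

Staying in integer arithmetic avoids the int-to-double round trip and any dependency on the math library for what is a bit-position query.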
Ilia Mirkin
e42a8a5b92
a4xx: add emission of compute state, and compute dispatch
...
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794 >
2022-03-05 03:21:05 -05:00
Ilia Mirkin
63bba1dc6c
a4xx: add logic to emit image/ssbo state
...
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794 >
2022-03-05 03:21:05 -05:00
Ilia Mirkin
aac7028b58
freedreno/ir3: support a4xx compute differences
...
Mainly the workgroup id comes injected via consts by the hardware (or
CP), and we must make room for it, otherwise the driver won't know where
to put it.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794 >
2022-03-05 03:21:05 -05:00
Ilia Mirkin
6fb5e64ead
freedreno/ir3: support a4xx in load/store buffer/image emission
...
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14794 >
2022-03-05 03:21:05 -05:00
Rob Clark
e9cd4fba6f
freedreno/perfetto+fdperf: Set SYSPROF param
...
No need to check error return and deal with older kernels. Older
kernels won't have this param but their default behavior allows for
systemwide perfcntr collection.
Signed-off-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236 >
2022-03-04 16:06:34 -08:00
Rob Clark
af4b7f74b2
freedreno/drm: Add SYSPROF param
...
Add a new param for putting the kernel in system-profiling mode and add
the corresponding fd_pipe_set_param() mechanism.
Signed-off-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236 >
2022-03-04 16:06:34 -08:00
Rob Clark
f925794b16
freedreno: Update uapi header
...
Signed-off-by: Rob Clark <robdclark@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15236 >
2022-03-04 16:05:10 -08:00
Rob Clark
d2e498b6a5
egl+libsync: Add helper to complain about invalid fence fd's
...
Debugging fd lifetime issues can be hard. Add a helper for debug builds
to print out an error if an fd is not a fence fd, and sprinkle it around.
Signed-off-by: Rob Clark <robdclark@chromium.org >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094 >
2022-03-04 22:16:20 +00:00
Rob Clark
1e25f3b282
android: Push in-fence-fd down to driver
...
Rather than immediately stall on the CPU in SwapBuffers() if the
in-fence for the dequeued buffer is not yet signaled, push it down
to the driver.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6048
Signed-off-by: Rob Clark <robdclark@chromium.org >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094 >
2022-03-04 22:16:20 +00:00
Rob Clark
dfac374220
gallium/dri: Extend image extension to support in-fence
...
Extend dri so that an in-fence-fd can be plumbed through to the driver.
Signed-off-by: Rob Clark <robdclark@chromium.org >
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15094 >
2022-03-04 22:16:20 +00:00
Samuel Pitoiset
af2951dde8
radv/ci: update list of expected failures
...
Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3
which fails on all generations.
It looks like CTS should relax tolerance slightly.
Co-authored-by: Charlie Turner <cturner@igalia.com >
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234 >
2022-03-04 18:43:18 +01:00
Samuel Pitoiset
51c6fdf708
radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask
...
They randomly hang on Navi10 and randomly fail on Sienna Cichlid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234 >
2022-03-04 18:43:16 +01:00
Juan A. Suarez Romero
7ffee7f1ab
v3d: rebind sampler view if resource changed the BO
...
When discarding the whole resource to create a new one, if this resource
is used by a sampler view, a rebind must be done to use the new
resource.
But this must be done when setting the sampler views, because we don't
have access to those samplers before.
v2:
- Pack shader state on setting sampler views (Iago)
- Use a serial ID to know when to rebind sampler views (Juan)
v3:
- Move check to caller (Iago)
- Keep rebind sampler view on BO change (Iago)
v4:
- Rename "serial_bo" to "serial_id" (Iago)
- Add comments (Iago)
Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com >
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171 >
2022-03-04 17:20:28 +00:00
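A sketch of the serial-ID idea mentioned in the v2 notes of 7ffee7f1ab: the resource bumps a counter whenever its BO is replaced, and the sampler view compares its cached copy when it is set. All struct and function names here are hypothetical and much simpler than the v3d driver's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resource: serial_id changes whenever the backing BO does. */
struct resource {
   unsigned serial_id;
};

/* Hypothetical sampler view caching the serial it was last bound against. */
struct sampler_view {
   const struct resource *rsc;
   unsigned serial_id;
   int rebinds;
};

/* Called when the whole resource is discarded and a new BO is created. */
void resource_replace_bo(struct resource *rsc)
{
   rsc->serial_id++;
}

/* Called when sampler views are set. Returns true if a rebind happened. */
bool sampler_view_validate(struct sampler_view *view)
{
   if (view->serial_id == view->rsc->serial_id)
      return false;

   /* The BO changed underneath the view: refresh the cached state. */
   view->serial_id = view->rsc->serial_id;
   view->rebinds++;
   return true;
}
```

The comparison is deferred to the point where sampler views are set because, as the message notes, the views are not reachable at the time the resource is discarded.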
Alyssa Rosenzweig
7bda838c56
panfrost: Push twice as many uniforms
...
The limit for Bifrost is twice as high as previously thought -- the limit is 64
*slots* of FAU, not 64 words. Each slot is 2 words. We can push twice as much,
saving a considerable number of cycles in some cases.
total instructions in shared programs: 2454260 -> 2431502 (-0.93%)
instructions in affected programs: 845176 -> 822418 (-2.69%)
helped: 3376
HURT: 304
helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6
helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11%
HURT stats (abs) min: 1.0 max: 60.0 x̄: 13.06 x̃: 8
HURT stats (rel) min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52%
95% mean confidence interval for instructions value: -6.50 -5.87
95% mean confidence interval for instructions %-change: -3.75% -3.43%
Instructions are helped.
total tuples in shared programs: 1963383 -> 1951560 (-0.60%)
tuples in affected programs: 638622 -> 626799 (-1.85%)
helped: 2959
HURT: 573
helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4
helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12%
HURT stats (abs) min: 1.0 max: 50.0 x̄: 8.35 x̃: 6
HURT stats (rel) min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92%
95% mean confidence interval for tuples value: -3.61 -3.08
95% mean confidence interval for tuples %-change: -2.18% -1.85%
Tuples are helped.
total clauses in shared programs: 387817 -> 365111 (-5.85%)
clauses in affected programs: 135527 -> 112821 (-16.75%)
helped: 3489
HURT: 25
helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5
helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87%
HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.56 x̃: 1
HURT stats (rel) min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67%
95% mean confidence interval for clauses value: -6.67 -6.26
95% mean confidence interval for clauses %-change: -17.65% -16.96%
Clauses are helped.
total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%)
cycles in affected programs: 84035.50 -> 50947.33 (-39.37%)
helped: 3547
HURT: 136
helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8
helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84%
HURT stats (abs) min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0
HURT stats (rel) min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61%
95% mean confidence interval for cycles value: -9.26 -8.71
95% mean confidence interval for cycles %-change: -35.34% -34.11%
Cycles are helped.
total arith in shared programs: 74918.46 -> 75022.62 (0.14%)
arith in affected programs: 22471.04 -> 22575.21 (0.46%)
helped: 1571
HURT: 1492
helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0
helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96%
HURT stats (abs) min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0
HURT stats (rel) min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37%
95% mean confidence interval for arith value: 0.02 0.05
95% mean confidence interval for arith %-change: 1.08% 1.56%
Arith are HURT.
total ldst in shared programs: 174812 -> 137889 (-21.12%)
ldst in affected programs: 81319 -> 44396 (-45.41%)
helped: 3722
HURT: 0
helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8
helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75%
95% mean confidence interval for ldst value: -10.20 -9.64
95% mean confidence interval for ldst %-change: -47.97% -46.39%
Ldst are helped.
total quadwords in shared programs: 1757124 -> 1714130 (-2.45%)
quadwords in affected programs: 584065 -> 541071 (-7.36%)
helped: 3474
HURT: 173
helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9
helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33%
HURT stats (abs) min: 1.0 max: 26.0 x̄: 5.76 x̃: 4
HURT stats (rel) min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63%
95% mean confidence interval for quadwords value: -12.21 -11.37
95% mean confidence interval for quadwords %-change: -8.36% -7.95%
Quadwords are helped.
total threads in shared programs: 52898 -> 53142 (0.46%)
threads in affected programs: 262 -> 506 (93.13%)
helped: 250
HURT: 6
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.92 0.99
95% mean confidence interval for threads %-change: 93.69% 99.28%
Threads are helped.
total spills in shared programs: 161 -> 107 (-33.54%)
spills in affected programs: 54 -> 0
helped: 27
HURT: 0
total fills in shared programs: 1386 -> 796 (-42.57%)
fills in affected programs: 590 -> 0
helped: 27
HURT: 0
Fixes: d4dccea0ba ("panfrost: Add UBO push data structure")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239 >
2022-03-04 15:22:04 +00:00
Alyssa Rosenzweig
e7cfe18099
pan/bi: Run CSE after lowering FAU
...
Lowering FAU can add moves from uniforms. If a uniform is moved out to a
register multiple times in a basic block, these moves can be CSE'd, saving
instructions at the cost of register pressure.
854 shaders in my shader-db are helped on cycle count (average 2.94% reduction
in cycles). Only 9 shaders have hurt thread count, and there is no change in
spills or fills. Overall, this seems to be a win.
Prevents instruction count regressions from the next commit.
total instructions in shared programs: 2454423 -> 2444690 (-0.40%)
instructions in affected programs: 386274 -> 376541 (-2.52%)
helped: 2105
HURT: 0
helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2
helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92%
95% mean confidence interval for instructions value: -4.91 -4.33
95% mean confidence interval for instructions %-change: -3.83% -3.45%
Instructions are helped.
total tuples in shared programs: 1963534 -> 1957106 (-0.33%)
tuples in affected programs: 233562 -> 227134 (-2.75%)
helped: 1491
HURT: 117
helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2
helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59%
HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.61 x̃: 1
HURT stats (rel) min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05%
95% mean confidence interval for tuples value: -4.28 -3.71
95% mean confidence interval for tuples %-change: -4.20% -3.73%
Tuples are helped.
total clauses in shared programs: 387848 -> 387079 (-0.20%)
clauses in affected programs: 13718 -> 12949 (-5.61%)
helped: 583
HURT: 60
helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1
helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67%
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00%
95% mean confidence interval for clauses value: -1.29 -1.10
95% mean confidence interval for clauses %-change: -7.57% -6.58%
Clauses are helped.
total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%)
cycles in affected programs: 6241.79 -> 6058.50 (-2.94%)
helped: 952
HURT: 98
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.20 x̃: 0
helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38%
HURT stats (abs) min: 0.041665999999999315 max: 0.16666700000000034 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43%
95% mean confidence interval for cycles value: -0.19 -0.16
95% mean confidence interval for cycles %-change: -3.80% -3.24%
Cycles are helped.
total arith in shared programs: 74924.00 -> 74660.12 (-0.35%)
arith in affected programs: 9303.67 -> 9039.79 (-2.84%)
helped: 1513
HURT: 118
helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.18 x̃: 0
helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67%
HURT stats (abs) min: 0.041665999999999315 max: 0.16666800000000137 x̄: 0.07 x̃: 0
HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37%
95% mean confidence interval for arith value: -0.17 -0.15
95% mean confidence interval for arith %-change: -4.48% -3.98%
Arith are helped.
total quadwords in shared programs: 1757254 -> 1751978 (-0.30%)
quadwords in affected programs: 197399 -> 192123 (-2.67%)
helped: 1464
HURT: 110
helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2
helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52%
HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.71 x̃: 1
HURT stats (rel) min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93%
95% mean confidence interval for quadwords value: -3.58 -3.13
95% mean confidence interval for quadwords %-change: -3.97% -3.53%
Quadwords are helped.
total threads in shared programs: 52899 -> 52890 (-0.02%)
threads in affected programs: 18 -> 9 (-50.00%)
helped: 0
HURT: 9
HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239 >
2022-03-04 15:22:04 +00:00
Henry Goffin
c8f644ec44
frontends/va: ignore incoming frame_num from VA picture parameters
...
The Gallium pipe video "frame_num" variable is internally used as a
counter of elapsed reference frames since the last IDR. The incoming
frame_num field from VA picture parameters is not equivalent; the VA
value may wrap to zero prematurely, as it is a 16-bit struct field with
a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1.
This change improves "infinite GOP" single-client live streaming, where
it is reasonable for the server to desire an endless series of P-frames
without IDR. Without this change, it is difficult/impossible for an
application to encode a P- or B-frame after the VA frame_num field wraps
around to zero, depending on the backend encoder implementation.
This change has no effect on existing applications that always signal an
IDR frame and reset the VA frame_num to zero before it wraps around. For
example, the FFmpeg vaapi encoder ignores the VA documentation and sends
an un-wrapped VA frame_num, which results in identical computation of
the internal frame_num (as long as each GOP is less than 65536 frames).
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768
Reviewed-by: Thong Thai <thong.thai@amd.com >
patch revision 3: correctly avoid incrementing frame_num when the encoded
frame is not a reference, per h264 spec and ffmpeg behavior
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332 >
2022-03-04 14:17:20 +00:00
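A sketch of the counting scheme described in c8f644ec44: an internal counter of reference frames since the last IDR, kept separately from the wrapped value bounded by 2^(log2_max_frame_num_minus4 + 4) - 1. The struct and function names are hypothetical and not taken from the VA frontend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoder-side counter of reference frames since the last IDR. */
struct frame_counter {
   uint32_t frame_num;
};

/* Returns the internal frame_num for the current frame. The counter resets
 * on IDR and only advances when the current frame is a reference. */
uint32_t frame_counter_next(struct frame_counter *c, bool is_idr,
                            bool is_reference)
{
   if (is_idr)
      c->frame_num = 0;

   uint32_t current = c->frame_num;
   if (is_reference)
      c->frame_num++;
   return current;
}

/* Wraps an unbounded count into the documented range
 * [0, 2^(log2_max_frame_num_minus4 + 4) - 1]. */
uint32_t wrapped_frame_num(uint32_t n, unsigned log2_max_frame_num_minus4)
{
   return n & ((1u << (log2_max_frame_num_minus4 + 4)) - 1u);
}
```

Because the internal counter never wraps on its own, a long run of P-frames without an IDR keeps counting past the point where the wrapped value returns to zero.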
Rhys Perry
d28b6b6856
aco: rework removal of jumps over branches
...
Only allow this in situations where we know it's safe. In particular, this
stops removal of unconditional branches like with
block_kind_continue_or_break.
Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang.
fossil-db (Sienna Cichlid):
Totals from 34 (0.02% of 162293) affected shaders:
Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08%
CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06%
Latency: 3467316 -> 3467652 (+0.01%)
InvThroughput: 3085493 -> 3085578 (+0.00%)
Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com >
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev >
Fixes: f030b75b7d ("aco: relax condition to remove branches in case of few instructions")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214 >
2022-03-04 12:32:36 +00:00
Samuel Pitoiset
059f870d74
ac/nir: implement nir_op_pack_{uint,sint}_2x16
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231 >
2022-03-04 08:06:56 +00:00
Samuel Pitoiset
9b113f1b6c
aco: implement nir_op_pack_{uint,sint}_2x16
...
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231 >
2022-03-04 08:06:56 +00:00
Samuel Pitoiset
6532307555
nir: introduce nir_pack_{sint,uint}_2x16 instructions
...
These instructions have AMD hardware equivalents and they will be used
to lower fragment shader outputs in NIR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com >
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com >
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231 >
2022-03-04 08:06:56 +00:00
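A scalar C sketch of what packing two 32-bit integers into 2x16 bits can look like. It assumes each component is clamped to the 16-bit range before packing; the commit messages above do not spell out the semantics, so treat the clamping as an assumption. The `_sketch` function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics: clamp each unsigned input to [0, 0xffff], then place
 * x in the low half and y in the high half of the result. */
uint32_t pack_uint_2x16_sketch(uint32_t x, uint32_t y)
{
   uint32_t lo = x > 0xffffu ? 0xffffu : x;
   uint32_t hi = y > 0xffffu ? 0xffffu : y;
   return lo | (hi << 16);
}

/* Assumed semantics: clamp each signed input to [-32768, 32767], then pack
 * the two's-complement 16-bit patterns the same way. */
uint32_t pack_sint_2x16_sketch(int32_t x, int32_t y)
{
   int32_t lo = x < -32768 ? -32768 : (x > 32767 ? 32767 : x);
   int32_t hi = y < -32768 ? -32768 : (y > 32767 ? 32767 : y);
   return ((uint32_t)lo & 0xffffu) | ((uint32_t)hi << 16);
}
```

Expressing the pack as a single operation lets a backend map it to one hardware instruction instead of a sequence of clamps, masks and shifts.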
Xiaohui Gu
4d81c60e11
iris: Mark a dirty update when vs_needs_sgvs_element value changed
...
Check the vs_needs_sgvs_element value when updating the vertex
element dirty state in iris_update_compiled_vs. This fixes a
rendering error in the Android game "Genshin Impact".
Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com >
Reviewed-by: Tapani Pälli <tapani.palli@intel.com >
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142 >
2022-03-04 05:41:38 +00:00
Yiwei Zhang
aaa25cda0b
venus: add VK_EXT_image_robustness support
...
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Yiwei Zhang
ba212bf888
venus: add VK_EXT_provoking_vertex support
...
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Yiwei Zhang
33ba61b059
venus: add VK_EXT_line_rasterization support
...
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Yiwei Zhang
58182eb096
venus: update to latest venus protocol
...
Added support for the following extensions:
- VK_EXT_line_rasterization
- VK_EXT_provoking_vertex
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Yiwei Zhang
20efd9eff3
venus: group extensions promoted to 1.3
...
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Yiwei Zhang
fe3815b7fa
venus: clean up physical device features and properties
...
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205 >
2022-03-04 01:04:13 +00:00
Daniel Schürmann
ca4595e01a
nir/opt_shrink_vectors: update docstring
...
in order to reflect the various recent improvements.
Reviewed-by: Emma Anholt <emma@anholt.net >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468 >
2022-03-04 00:18:58 +00:00
Daniel Schürmann
405829cd85
nir/opt_shrink_vectors: remove duplicate components from vecN
...
vecN instructions which are only used by other ALU instructions
will now get duplicate channels removed.
i915g:
total instructions in shared programs: 396309 -> 396294 (<.01%)
instructions in affected programs: 186 -> 171 (-8.06%)
r300:
total instructions in shared programs: 1165059 -> 1164354 (-0.06%)
instructions in affected programs: 35884 -> 35179 (-1.96%)
total temps in shared programs: 165497 -> 165326 (-0.10%)
temps in affected programs: 2990 -> 2819 (-5.72%)
softpipe:
total instructions in shared programs: 2860028 -> 2859084 (-0.03%)
instructions in affected programs: 55539 -> 54595 (-1.70%)
total temps in shared programs: 516939 -> 516546 (-0.08%)
temps in affected programs: 6623 -> 6230 (-5.93%)
Acked-by: Emma Anholt <emma@anholt.net >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468 >
2022-03-04 00:18:58 +00:00
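A sketch of the channel deduplication described in 405829cd85, on plain arrays rather than NIR instructions. The function `dedup_channels` and its integer "source ids" are hypothetical; the real pass works on NIR SSA values and rewrites the swizzles of the ALU users.

```c
#include <assert.h>

/* Removes duplicate channels from src[0..n). Unique values are written to
 * out (capacity n), and swizzle[i] receives the index in out that the
 * original channel i now maps to. Returns the new channel count. */
unsigned dedup_channels(const int *src, unsigned n, int *out,
                        unsigned *swizzle)
{
   unsigned count = 0;

   for (unsigned i = 0; i < n; i++) {
      unsigned j;

      /* Look for an earlier channel holding the same value. */
      for (j = 0; j < count; j++) {
         if (out[j] == src[i])
            break;
      }

      /* Not seen before: keep it as a new channel. */
      if (j == count)
         out[count++] = src[i];

      swizzle[i] = j;
   }

   return count;
}
```

For a vec4 built from the sources (7, 9, 7, 9), this yields a vec2 of (7, 9) and the remap 0, 1, 0, 1, which users would apply to their swizzles.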
Daniel Schürmann
e5963478c2
nir/opt_shrink_vectors: shrink load_const properly
...
This patch enables removal of arbitrary channels in
load_const instructions, if they are either unused or
duplicates of other channels and only used by ALU.
Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3)
VGPRs: 21832 -> 21544 (-1.32%)
CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01%
Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00%
Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15%
InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01%
VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09%
SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17%
Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02%
PreSGPRs: 20439 -> 20437 (-0.01%)
PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32%
i915g:
total instructions in shared programs: 396471 -> 396309 (-0.04%)
instructions in affected programs: 6408 -> 6246 (-2.53%)
total const in shared programs: 56458 -> 56422 (-0.06%)
const in affected programs: 407 -> 371 (-8.85%)
LOST: shaders/closed/steam/trine-2/fp-3.shader_test FS
r300:
total instructions in shared programs: 1164421 -> 1165059 (0.05%)
instructions in affected programs: 143981 -> 144619 (0.44%)
total temps in shared programs: 165488 -> 165497 (<.01%)
temps in affected programs: 318 -> 327 (2.83%)
total consts in shared programs: 922140 -> 921952 (-0.02%)
consts in affected programs: 12438 -> 12250 (-1.51%)
softpipe:
total instructions in shared programs: 2859978 -> 2860028 (<.01%)
instructions in affected programs: 183355 -> 183405 (0.03%)
total temps in shared programs: 517071 -> 516939 (-0.03%)
temps in affected programs: 1416 -> 1284 (-9.32%)
total imm in shared programs: 103601 -> 102767 (-0.81%)
imm in affected programs: 3928 -> 3094 (-21.23%)
Acked-by: Emma Anholt <emma@anholt.net >
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468 >
2022-03-04 00:18:58 +00:00
Dave Airlie
a10b5d7086
crocus: change the line width workaround for gfx4/5
...
This fixes piglit line-flat-clip-color and the hud fps counter.
Fixes: 6b7a68b7c2 ("crocus: add missing line smooth bits.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229 >
2022-03-04 00:06:28 +00:00
Chia-I Wu
bbbbf39559
venus: abort when stuck
...
This gives
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 4096
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 8192
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 12288
MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 16384
MESA-VIRTIO: debug: aborting
Aborted
which should be more friendly than printing the messages forever.
On my i7-7820HQ, this aborts after roughly 4+8+16+32=60 seconds.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15200 >
2022-03-03 21:48:13 +00:00
Daniel Schürmann
ccf4bcd162
aco/ra: don't immediately assign a register for p_branch
...
These now get assigned after handling phis.
Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3)
CodeSize: 5519744 -> 5515308 (-0.08%)
Instrs: 1063045 -> 1061936 (-0.10%)
Latency: 11880452 -> 11875904 (-0.04%)
InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00%
Copies: 86908 -> 85799 (-1.28%)
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com >
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432 >
2022-03-03 20:21:08 +00:00