Ian Romanick
7078105592
soft-fp64/fadd: Just let the subtraction happen when the result will be zero
...
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 815717 -> 815491 (-0.03%)
instructions in affected programs: 735489 -> 735263 (-0.03%)
helped: 39
HURT: 34
helped stats (abs) min: 2 max: 192 x̄: 20.79 x̃: 12
helped stats (rel) min: 0.01% max: 0.46% x̄: 0.26% x̃: 0.28%
HURT stats (abs) min: 1 max: 65 x̄: 17.21 x̃: 11
HURT stats (rel) min: <.01% max: 1.11% x̄: 0.35% x̃: 0.19%
95% mean confidence interval for instructions value: -10.40 4.21
95% mean confidence interval for instructions %-change: -0.07% 0.13%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 6820707 -> 6813681 (-0.10%)
cycles in affected programs: 6388725 -> 6381699 (-0.11%)
helped: 51
HURT: 23
helped stats (abs) min: 3 max: 1837 x̄: 184.76 x̃: 120
helped stats (rel) min: <.01% max: 0.48% x̄: 0.25% x̃: 0.25%
HURT stats (abs) min: 18 max: 216 x̄: 104.22 x̃: 98
HURT stats (rel) min: 0.06% max: 0.73% x̄: 0.31% x̃: 0.11%
95% mean confidence interval for cycles value: -154.67 -35.22
95% mean confidence interval for cycles %-change: -0.15% <.01%
Inconclusive result (%-change mean confidence interval includes 0).
total spills in shared programs: 702 -> 703 (0.14%)
spills in affected programs: 702 -> 703 (0.14%)
helped: 0
HURT: 1
total fills in shared programs: 1497 -> 1499 (0.13%)
fills in affected programs: 1497 -> 1499 (0.13%)
helped: 0
HURT: 1
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
cae36fa217
soft-fp64/fadd: Pick zero or non-zero result based on subtraction result
...
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 817327 -> 815717 (-0.20%)
instructions in affected programs: 755504 -> 753894 (-0.21%)
helped: 73
HURT: 1
helped stats (abs) min: 1 max: 159 x̄: 22.12 x̃: 14
helped stats (rel) min: 0.05% max: 0.40% x̄: 0.22% x̃: 0.23%
HURT stats (abs) min: 5 max: 5 x̄: 5.00 x̃: 5
HURT stats (rel) min: 0.07% max: 0.07% x̄: 0.07% x̃: 0.07%
95% mean confidence interval for instructions value: -27.27 -16.24
95% mean confidence interval for instructions %-change: -0.24% -0.20%
Instructions are helped.
total cycles in shared programs: 6822826 -> 6820707 (-0.03%)
cycles in affected programs: 6390844 -> 6388725 (-0.03%)
helped: 71
HURT: 3
helped stats (abs) min: 2 max: 537 x̄: 30.72 x̃: 18
helped stats (rel) min: <.01% max: 0.08% x̄: 0.03% x̃: 0.03%
HURT stats (abs) min: 10 max: 32 x̄: 20.67 x̃: 20
HURT stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for cycles value: -43.41 -13.86
95% mean confidence interval for cycles %-change: -0.04% -0.03%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
70be98f17a
soft-fp64/fadd: Massively split the live range of zFrac0 and zFrac1
...
The main purpose of this commit is to prepare for "soft-fp64/fadd: Move
common code out of both branches of an if-statement".
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 822766 -> 817327 (-0.66%)
instructions in affected programs: 760943 -> 755504 (-0.71%)
helped: 74
HURT: 0
helped stats (abs) min: 8 max: 515 x̄: 73.50 x̃: 51
helped stats (rel) min: 0.58% max: 1.10% x̄: 0.77% x̃: 0.73%
95% mean confidence interval for instructions value: -91.17 -55.83
95% mean confidence interval for instructions %-change: -0.81% -0.74%
Instructions are helped.
total cycles in shared programs: 6816791 -> 6822826 (0.09%)
cycles in affected programs: 6384809 -> 6390844 (0.09%)
helped: 0
HURT: 74
HURT stats (abs) min: 6 max: 1179 x̄: 81.55 x̃: 50
HURT stats (rel) min: 0.02% max: 0.17% x̄: 0.09% x̃: 0.09%
95% mean confidence interval for cycles value: 48.99 114.12
95% mean confidence interval for cycles %-change: 0.09% 0.10%
Cycles are HURT.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
73fa3a1ca4
soft-fp64/fadd: Instead of tracking "b < a", track sign of the difference
...
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 824403 -> 822766 (-0.20%)
instructions in affected programs: 756260 -> 754623 (-0.22%)
helped: 68
HURT: 1
helped stats (abs) min: 1 max: 118 x̄: 26.26 x̃: 18
helped stats (rel) min: 0.02% max: 0.97% x̄: 0.31% x̃: 0.23%
HURT stats (abs) min: 149 max: 149 x̄: 149.00 x̃: 149
HURT stats (rel) min: 0.17% max: 0.17% x̄: 0.17% x̃: 0.17%
95% mean confidence interval for instructions value: -31.94 -15.51
95% mean confidence interval for instructions %-change: -0.37% -0.23%
Instructions are helped.
total cycles in shared programs: 6828935 -> 6816791 (-0.18%)
cycles in affected programs: 6385191 -> 6373047 (-0.19%)
helped: 73
HURT: 0
helped stats (abs) min: 2 max: 852 x̄: 166.36 x̃: 120
helped stats (rel) min: <.01% max: 0.80% x̄: 0.22% x̃: 0.17%
95% mean confidence interval for cycles value: -210.80 -121.91
95% mean confidence interval for cycles %-change: -0.27% -0.17%
Cycles are helped.
total fills in shared programs: 1442 -> 1497 (3.81%)
fills in affected programs: 1442 -> 1497 (3.81%)
helped: 0
HURT: 1
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
5b07f542e5
soft-fp64: Optimize __fmin64 and __fmax64 by using different evaluation order [v2]
...
v2: Go to extra effort to avoid flow control inserted to implement
short-circuit evaluation rules.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 797779 -> 796849 (-0.12%)
instructions in affected programs: 3499 -> 2569 (-26.58%)
helped: 21
HURT: 0
helped stats (abs) min: 8 max: 112 x̄: 44.29 x̃: 44
helped stats (rel) min: 16.09% max: 33.15% x̄: 25.72% x̃: 24.62%
95% mean confidence interval for instructions value: -55.94 -32.63
95% mean confidence interval for instructions %-change: -28.14% -23.30%
Instructions are helped.
total cycles in shared programs: 6601355 -> 6588351 (-0.20%)
cycles in affected programs: 25376 -> 12372 (-51.25%)
helped: 21
HURT: 0
helped stats (abs) min: 156 max: 1410 x̄: 619.24 x̃: 526
helped stats (rel) min: 42.39% max: 53.98% x̄: 50.12% x̃: 50.75%
95% mean confidence interval for cycles value: -776.58 -461.89
95% mean confidence interval for cycles %-change: -51.57% -48.67%
Cycles are helped.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
617a69107e
soft-fp64/ffloor: Simplify the >= 0 comparison
...
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 797951 -> 797779 (-0.02%)
instructions in affected programs: 126482 -> 126310 (-0.14%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 11.47 x̃: 10
helped stats (rel) min: <.01% max: 0.60% x̄: 0.28% x̃: 0.29%
95% mean confidence interval for instructions value: -14.79 -8.14
95% mean confidence interval for instructions %-change: -0.40% -0.16%
Instructions are helped.
total cycles in shared programs: 6601437 -> 6601355 (<.01%)
cycles in affected programs: 1089336 -> 1089254 (<.01%)
helped: 15
HURT: 0
helped stats (abs) min: 2 max: 12 x̄: 5.47 x̃: 6
helped stats (rel) min: <.01% max: 0.04% x̄: 0.01% x̃: 0.01%
95% mean confidence interval for cycles value: -7.06 -3.87
95% mean confidence interval for cycles %-change: -0.02% <.01%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
abf28d6a70
soft-fp64: Relax the way NaN is propagated
...
Also reassociate a couple expressions to encourage some CSE.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 813599 -> 797951 (-1.92%)
instructions in affected programs: 796110 -> 780462 (-1.97%)
helped: 92
HURT: 0
helped stats (abs) min: 3 max: 5198 x̄: 170.09 x̃: 83
helped stats (rel) min: 0.36% max: 5.50% x̄: 1.57% x̃: 1.40%
95% mean confidence interval for instructions value: -282.42 -57.75
95% mean confidence interval for instructions %-change: -1.71% -1.42%
Instructions are helped.
total cycles in shared programs: 6687128 -> 6601437 (-1.28%)
cycles in affected programs: 6582246 -> 6496555 (-1.30%)
helped: 92
HURT: 0
helped stats (abs) min: 36 max: 14442 x̄: 931.42 x̃: 592
helped stats (rel) min: 0.45% max: 3.16% x̄: 1.44% x̃: 1.23%
95% mean confidence interval for cycles value: -1257.58 -605.27
95% mean confidence interval for cycles %-change: -1.58% -1.30%
Cycles are helped.
total spills in shared programs: 759 -> 702 (-7.51%)
spills in affected programs: 759 -> 702 (-7.51%)
helped: 3
HURT: 0
total fills in shared programs: 2412 -> 1442 (-40.22%)
fills in affected programs: 2412 -> 1442 (-40.22%)
helped: 3
HURT: 0
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
8178fa8876
soft-fp64/fsat: Micro-optimize x >= 1 test
...
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841590 -> 841332 (-0.03%)
instructions in affected programs: 121957 -> 121699 (-0.21%)
helped: 7
HURT: 0
helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
95% mean confidence interval for instructions value: -49.73 -23.98
95% mean confidence interval for instructions %-change: -0.29% -0.16%
Instructions are helped.
total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
cycles in affected programs: 1038569 -> 1035708 (-0.28%)
helped: 7
HURT: 0
helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
95% mean confidence interval for cycles value: -571.72 -245.70
95% mean confidence interval for cycles %-change: -0.38% -0.19%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
b6f58b4709
soft-fp64/fsat: Micro-optimize x < 0 test
...
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841647 -> 841590 (<.01%)
instructions in affected programs: 122014 -> 121957 (-0.05%)
helped: 7
HURT: 0
helped stats (abs) min: 3 max: 12 x̄: 8.14 x̃: 9
helped stats (rel) min: 0.04% max: 0.07% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -11.23 -5.06
95% mean confidence interval for instructions %-change: -0.06% -0.03%
Instructions are helped.
total cycles in shared programs: 6926904 -> 6926828 (<.01%)
cycles in affected programs: 1038645 -> 1038569 (<.01%)
helped: 7
HURT: 0
helped stats (abs) min: 4 max: 16 x̄: 10.86 x̃: 12
helped stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -14.97 -6.74
95% mean confidence interval for cycles %-change: -0.01% <.01%
Cycles are helped.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
7673dcbd21
soft-fp64/fsat: Correctly handle NaN
...
fsat is defined as min(max(a, 0.0), 1.0), and IEEE defines both min and
max to return the non-NaN value when one value is NaN. Based on this,
fsat should definitely return 0.0 for NaN.
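The reasoning above can be checked with a small C model. This is only a sketch, not the soft-fp64 code: max_non_nan, min_non_nan, and fsat_model are hypothetical helper names, and they model the "return the non-NaN operand" behavior described in the text.

```c
#include <assert.h>
#include <math.h>   /* used only for the NAN macro */

/* Model of an IEEE-style max: if exactly one operand is NaN,
 * return the other operand. */
double max_non_nan(double a, double b)
{
    if (a != a) return b;   /* a is NaN */
    if (b != b) return a;   /* b is NaN */
    return a > b ? a : b;
}

/* Model of an IEEE-style min with the same NaN rule. */
double min_non_nan(double a, double b)
{
    if (a != a) return b;   /* a is NaN */
    if (b != b) return a;   /* b is NaN */
    return a < b ? a : b;
}

/* fsat defined as min(max(a, 0.0), 1.0). */
double fsat_model(double a)
{
    return min_non_nan(max_non_nan(a, 0.0), 1.0);
}

/* Usage: NaN saturates to 0.0; other values clamp to [0, 1]. */
void fsat_model_examples(void)
{
    assert(fsat_model(NAN) == 0.0);
    assert(fsat_model(-2.5) == 0.0);
    assert(fsat_model(0.25) == 0.25);
    assert(fsat_model(7.0) == 1.0);
}
```

With NaN as input, the inner max yields 0.0 and the outer min leaves it unchanged, which is why 0.0 is the expected result.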
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841666 -> 841647 (<.01%)
instructions in affected programs: 122033 -> 122014 (-0.02%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.71 x̃: 3
helped stats (rel) min: 0.01% max: 0.02% x̄: 0.02% x̃: 0.01%
95% mean confidence interval for instructions value: -3.74 -1.69
95% mean confidence interval for instructions %-change: -0.02% -0.01%
Instructions are helped.
total cycles in shared programs: 6927246 -> 6926904 (<.01%)
cycles in affected programs: 1038987 -> 1038645 (-0.03%)
helped: 7
HURT: 0
helped stats (abs) min: 18 max: 72 x̄: 48.86 x̃: 54
helped stats (rel) min: 0.03% max: 0.05% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -67.38 -30.33
95% mean confidence interval for cycles %-change: -0.05% -0.02%
Cycles are helped.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: a42163cbbc ("compiler: Add lowering support for 64-bit saturate operations to software")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2585
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
b421c0466d
soft-fp64/flt: Perform checks in a different order
...
The change to nir_opt_algebraic cleans up a pattern that was never
produced before the rest of this commit was added.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 843005 -> 841666 (-0.16%)
instructions in affected programs: 460655 -> 459316 (-0.29%)
helped: 64
HURT: 17
helped stats (abs) min: 1 max: 72 x̄: 21.72 x̃: 20
helped stats (rel) min: 0.01% max: 28.07% x̄: 12.67% x̃: 16.07%
HURT stats (abs) min: 1 max: 7 x̄: 3.00 x̃: 2
HURT stats (rel) min: 0.01% max: 0.04% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -20.87 -12.19
95% mean confidence interval for instructions %-change: -12.35% -7.66%
Instructions are helped.
total cycles in shared programs: 6944998 -> 6927246 (-0.26%)
cycles in affected programs: 3891872 -> 3874120 (-0.46%)
helped: 71
HURT: 10
helped stats (abs) min: 2 max: 772 x̄: 254.21 x̃: 156
helped stats (rel) min: <.01% max: 66.44% x̄: 21.72% x̃: 18.40%
HURT stats (abs) min: 18 max: 69 x̄: 29.70 x̃: 20
HURT stats (rel) min: 0.02% max: 0.04% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -270.82 -167.50
95% mean confidence interval for cycles %-change: -24.41% -13.65%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
f6992bf624
soft-fp64/fneg: Don't treat NaN specially
...
__fabs64 doesn't do anything special, and the value is still NaN
regardless of the value of the MSB. In a strict sense, it's possible
that both functions should set the "signal" bit.
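As a rough illustration of why no special case is needed, the following C sketch negates a double by flipping only the most significant bit of its 64-bit encoding. The names fneg64_bits and is_nan_bits are hypothetical and not taken from the soft-fp64 source.

```c
#include <assert.h>
#include <stdint.h>

/* Negate by flipping the sign bit; no NaN check is performed. */
uint64_t fneg64_bits(uint64_t a)
{
    return a ^ 0x8000000000000000ull;
}

/* A binary64 value is NaN when the exponent field is all ones and the
 * fraction is non-zero. The sign bit plays no part in this test. */
int is_nan_bits(uint64_t a)
{
    uint64_t exponent = (a >> 52) & 0x7ffu;
    uint64_t fraction = a & 0x000fffffffffffffull;
    return exponent == 0x7ffu && fraction != 0;
}

/* Usage: a NaN stays a NaN after negation, and 1.0 becomes -1.0. */
void fneg64_examples(void)
{
    assert(is_nan_bits(fneg64_bits(0x7ff8000000000000ull)));
    assert(fneg64_bits(0x3ff0000000000000ull) == 0xbff0000000000000ull);
}
```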
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 844558 -> 843005 (-0.18%)
instructions in affected programs: 725975 -> 724422 (-0.21%)
helped: 53
HURT: 4
helped stats (abs) min: 1 max: 313 x̄: 29.87 x̃: 21
helped stats (rel) min: 0.01% max: 0.94% x̄: 0.30% x̃: 0.22%
HURT stats (abs) min: 4 max: 11 x̄: 7.50 x̃: 7
HURT stats (rel) min: 0.03% max: 0.09% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -39.02 -15.47
95% mean confidence interval for instructions %-change: -0.34% -0.21%
Instructions are helped.
total cycles in shared programs: 6962024 -> 6944998 (-0.24%)
cycles in affected programs: 6185470 -> 6168444 (-0.28%)
helped: 59
HURT: 0
helped stats (abs) min: 64 max: 2863 x̄: 288.58 x̃: 208
helped stats (rel) min: 0.11% max: 0.87% x̄: 0.33% x̃: 0.27%
95% mean confidence interval for cycles value: -387.15 -190.00
95% mean confidence interval for cycles %-change: -0.38% -0.28%
Cycles are helped.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
de4acd8816
soft-fp64: Store sign value as 0 or 0x80000000
...
...instead of 0 or 1. In many places the sign bit is extracted, then later
put back in the same position. This saves some left-shift operations.
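A minimal C sketch of the two representations, assuming the sign lives in bit 31 of the high 32-bit word. All function names here are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Sign stored as 0 or 1: extracting needs a right shift, and
 * repacking needs a left shift. */
uint32_t sign_as_bit(uint32_t hi)
{
    return hi >> 31;
}

uint32_t pack_from_bit(uint32_t sign, uint32_t rest)
{
    return (sign << 31) | rest;
}

/* Sign stored as 0 or 0x80000000: extracting is a mask, and
 * repacking is a plain OR with no shift. */
uint32_t sign_in_place(uint32_t hi)
{
    return hi & 0x80000000u;
}

uint32_t pack_in_place(uint32_t sign, uint32_t rest)
{
    return sign | rest;
}

/* Usage: both round trips rebuild the same high word. */
void sign_examples(void)
{
    uint32_t hi = 0xbff00000u;          /* high word of -1.0 */
    uint32_t rest = hi & 0x7fffffffu;
    assert(pack_from_bit(sign_as_bit(hi), rest) == hi);
    assert(pack_in_place(sign_in_place(hi), rest) == hi);
}
```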
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848106 -> 844558 (-0.42%)
instructions in affected programs: 833480 -> 829932 (-0.43%)
helped: 106
HURT: 1
helped stats (abs) min: 1 max: 995 x̄: 33.48 x̃: 12
helped stats (rel) min: 0.15% max: 2.20% x̄: 0.60% x̃: 0.35%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for instructions value: -51.88 -14.43
95% mean confidence interval for instructions %-change: -0.71% -0.47%
Instructions are helped.
total cycles in shared programs: 6969125 -> 6962024 (-0.10%)
cycles in affected programs: 6717689 -> 6710588 (-0.11%)
helped: 78
HURT: 7
helped stats (abs) min: 2 max: 2083 x̄: 110.27 x̃: 56
helped stats (rel) min: <.01% max: 0.30% x̄: 0.11% x̃: 0.11%
HURT stats (abs) min: 2 max: 1340 x̄: 214.29 x̃: 4
HURT stats (rel) min: 0.01% max: 0.71% x̄: 0.13% x̃: 0.02%
95% mean confidence interval for cycles value: -144.02 -23.06
95% mean confidence interval for cycles %-change: -0.12% -0.07%
Cycles are helped.
total spills in shared programs: 814 -> 759 (-6.76%)
spills in affected programs: 814 -> 759 (-6.76%)
helped: 2
HURT: 1
total fills in shared programs: 2488 -> 2412 (-3.05%)
fills in affected programs: 2488 -> 2412 (-3.05%)
helped: 2
HURT: 1
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
598e2fc6a1
soft-fp64: Pick a single idiom for treating sign value as a Boolean
...
Replace all of the bool(qSign) with qSign != 0u. Remove unnecessary
parentheses from around most of the existing qSign != 0u.
This dramatically simplifies the next commit.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848109 -> 848106 (<.01%)
instructions in affected programs: 53 -> 50 (-5.66%)
helped: 1
HURT: 0
total cycles in shared programs: 6969145 -> 6969125 (<.01%)
cycles in affected programs: 396 -> 376 (-5.05%)
helped: 1
HURT: 0
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
325a21f5eb
soft-fp64: Simplify __countLeadingZeros32 function
...
findMSB returns -1 for an input of zero, so 31 - findMSB(a) is
sufficient on its own.
There's only one user of findMSB in shader-db, and it does not match
this pattern.
TODO: Add a pattern in the backend code generator that emits 31 -
nir_op_ufind_msb(a) as if it were nir_op_uclz. That should save a couple
instructions.
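The identity can be sketched in C. find_msb_u32 is a hypothetical stand-in for GLSL findMSB on an unsigned input, written as a loop purely for illustration; the point is that the zero case needs no extra branch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for findMSB on an unsigned value: index of the highest
 * set bit, or -1 when the input is zero. */
int find_msb_u32(uint32_t a)
{
    int msb = -1;
    while (a != 0u) {
        a >>= 1;
        msb++;
    }
    return msb;
}

/* 31 - findMSB(a) covers every input, including zero (31 - (-1) = 32). */
int count_leading_zeros32(uint32_t a)
{
    return 31 - find_msb_u32(a);
}

/* Usage: boundary cases. */
void clz_examples(void)
{
    assert(count_leading_zeros32(0u) == 32);
    assert(count_leading_zeros32(1u) == 31);
    assert(count_leading_zeros32(0x80000000u) == 0);
}
```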
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 859509 -> 848109 (-1.33%)
instructions in affected programs: 841058 -> 829658 (-1.36%)
helped: 97
HURT: 0
helped stats (abs) min: 3 max: 1161 x̄: 117.53 x̃: 72
helped stats (rel) min: 0.98% max: 6.74% x̄: 1.70% x̃: 1.35%
95% mean confidence interval for instructions value: -147.21 -87.84
95% mean confidence interval for instructions %-change: -1.94% -1.46%
Instructions are helped.
total cycles in shared programs: 7072275 -> 6969145 (-1.46%)
cycles in affected programs: 6955767 -> 6852637 (-1.48%)
helped: 97
HURT: 0
helped stats (abs) min: 32 max: 10900 x̄: 1063.20 x̃: 560
helped stats (rel) min: 1.18% max: 7.58% x̄: 1.84% x̃: 1.45%
95% mean confidence interval for cycles value: -1339.43 -786.96
95% mean confidence interval for cycles %-change: -2.11% -1.57%
Cycles are helped.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
812230fd94
soft-fp64: Don't open-code umulExtended
...
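For readers unfamiliar with the operation, here is a hedged C sketch contrasting an open-coded 32x32 -> 64-bit multiply built from 16-bit partial products with a direct widening multiply. The open-coded form shown is a generic textbook version and is only an assumption about what "open-coded" looks like; it is not copied from the soft-fp64 source, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Open-coded: four 16x16 partial products plus carry handling. */
void umul_extended_open_coded(uint32_t a, uint32_t b,
                              uint32_t *msb, uint32_t *lsb)
{
    uint32_t a_lo = a & 0xffffu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xffffu, b_hi = b >> 16;

    uint32_t lo_lo = a_lo * b_lo;
    uint32_t lo_hi = a_lo * b_hi;
    uint32_t hi_lo = a_hi * b_lo;
    uint32_t hi_hi = a_hi * b_hi;

    /* Sum everything that lands in bits 16..31, keeping the carry. */
    uint32_t cross = (lo_lo >> 16) + (lo_hi & 0xffffu) + (hi_lo & 0xffffu);

    *lsb = (cross << 16) | (lo_lo & 0xffffu);
    *msb = hi_hi + (lo_hi >> 16) + (hi_lo >> 16) + (cross >> 16);
}

/* Direct: a single widening multiply, analogous to umulExtended. */
void umul_extended_direct(uint32_t a, uint32_t b,
                          uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)a * (uint64_t)b;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

/* Usage: both versions agree on the largest inputs. */
void umul_examples(void)
{
    uint32_t m1, l1, m2, l2;
    umul_extended_open_coded(0xffffffffu, 0xffffffffu, &m1, &l1);
    umul_extended_direct(0xffffffffu, 0xffffffffu, &m2, &l2);
    assert(m1 == m2 && l1 == l2);
}
```

The direct form gives the backend one operation to lower, whereas the open-coded form spells out many multiplies, shifts, and adds.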
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 928859 -> 859509 (-7.47%)
instructions in affected programs: 866293 -> 796943 (-8.01%)
helped: 76
HURT: 0
helped stats (abs) min: 75 max: 8042 x̄: 912.50 x̃: 688
helped stats (rel) min: 5.35% max: 21.02% x̄: 10.35% x̃: 7.58%
95% mean confidence interval for instructions value: -1138.37 -686.63
95% mean confidence interval for instructions %-change: -11.69% -9.00%
Instructions are helped.
total cycles in shared programs: 7272912 -> 7072275 (-2.76%)
cycles in affected programs: 6763486 -> 6562849 (-2.97%)
helped: 76
HURT: 0
helped stats (abs) min: 214 max: 30136 x̄: 2639.96 x̃: 1923
helped stats (rel) min: 1.75% max: 9.20% x̄: 4.04% x̃: 2.41%
95% mean confidence interval for cycles value: -3455.29 -1824.63
95% mean confidence interval for cycles %-change: -4.69% -3.39%
Cycles are helped.
total spills in shared programs: 817 -> 814 (-0.37%)
spills in affected programs: 791 -> 788 (-0.38%)
helped: 2
HURT: 0
total fills in shared programs: 2438 -> 2488 (2.05%)
fills in affected programs: 2392 -> 2442 (2.09%)
helped: 0
HURT: 2
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
d1e0227ef1
soft-fp64/b2f: Reimplement using bitwise logic ops
...
This doesn't help a lot of shaders, but it helps those few a LOT.
This could also be implemented using bcsel. That version is very
slightly worse because the generated SEL instruction wants to have two
immediate sources, so one of them usually needs an extra MOV instruction
to load.
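A rough C sketch of a bitwise Boolean-to-double conversion. It assumes the Boolean arrives as 0 or 1 and is first widened to an all-zeros or all-ones mask; that detail and the name b2f64_bits are assumptions for illustration, not a transcription of the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the binary64 encoding of 1.0 for a true input and 0.0 for a
 * false one, using only negate, AND, and shift (no select). */
uint64_t b2f64_bits(int b)
{
    /* 0 -> 0x00000000, non-zero -> 0xffffffff */
    uint32_t mask = 0u - (uint32_t)(b != 0);

    /* The high word of 1.0 is 0x3ff00000; the low word is always zero. */
    uint32_t hi = mask & 0x3ff00000u;

    return (uint64_t)hi << 32;
}

/* Usage: true maps to 1.0, false maps to +0.0. */
void b2f64_examples(void)
{
    assert(b2f64_bits(1) == 0x3ff0000000000000ull);
    assert(b2f64_bits(0) == 0x0000000000000000ull);
}
```

The AND needs only one immediate operand, which matches the observation above that a select with two immediate sources tends to cost an extra MOV.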
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 929619 -> 928859 (-0.08%)
instructions in affected programs: 1651 -> 891 (-46.03%)
helped: 8
HURT: 0
helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95
helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66%
95% mean confidence interval for instructions value: -132.97 -57.03
95% mean confidence interval for instructions %-change: -62.28% -37.49%
Instructions are helped.
total cycles in shared programs: 7280180 -> 7272912 (-0.10%)
cycles in affected programs: 12960 -> 5692 (-56.08%)
helped: 8
HURT: 0
helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910
helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15%
95% mean confidence interval for cycles value: -1274.03 -542.97
95% mean confidence interval for cycles %-change: -70.06% -48.41%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
4e3d69ad07
nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan
...
The pattern is added to opt_algebraic because, for example, comparisons
with constant 0.0 will produce (a1 < 0).
Even with a pass that optimized Boolean expressions, I think this would
be very difficult to automatically recognize and optimize.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 933054 -> 929619 (-0.37%)
instructions in affected programs: 784041 -> 780606 (-0.44%)
helped: 59
HURT: 0
helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44
helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46%
95% mean confidence interval for instructions value: -70.80 -45.64
95% mean confidence interval for instructions %-change: -0.92% -0.53%
Instructions are helped.
total cycles in shared programs: 7304712 -> 7280180 (-0.34%)
cycles in affected programs: 7176260 -> 7151728 (-0.34%)
helped: 92
HURT: 0
helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166
helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22%
95% mean confidence interval for cycles value: -333.05 -200.26
95% mean confidence interval for cycles %-change: -0.54% -0.31%
Cycles are helped.
Regular shader-db changes:
No changes on any Intel platform.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
e0cefc5a23
nir/algebraic: Constant reassociation for bitwise operations too
...
Like 5886cd79a0, but for iand, ior, and ixor.
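The algebra behind constant reassociation can be illustrated in C. The function names are hypothetical; the point is only that two constants applied in sequence fold into one, because iand, ior, and ixor are associative.

```c
#include <assert.h>
#include <stdint.h>

/* Before: two operations, each with its own constant. */
uint32_t and_before(uint32_t a, uint32_t c1, uint32_t c2) { return (a & c1) & c2; }
uint32_t or_before(uint32_t a, uint32_t c1, uint32_t c2)  { return (a | c1) | c2; }
uint32_t xor_before(uint32_t a, uint32_t c1, uint32_t c2) { return (a ^ c1) ^ c2; }

/* After: the constants are grouped, so (c1 op c2) can be folded at
 * compile time and only one operation on 'a' remains. */
uint32_t and_after(uint32_t a, uint32_t c1, uint32_t c2) { return a & (c1 & c2); }
uint32_t or_after(uint32_t a, uint32_t c1, uint32_t c2)  { return a | (c1 | c2); }
uint32_t xor_after(uint32_t a, uint32_t c1, uint32_t c2) { return a ^ (c1 ^ c2); }

/* Usage: both forms agree for an arbitrary input. */
void reassociation_examples(void)
{
    uint32_t a = 0xdeadbeefu;
    assert(and_before(a, 0x00ffff00u, 0x0000fff0u) == and_after(a, 0x00ffff00u, 0x0000fff0u));
    assert(or_before(a, 0x0000000fu, 0x000000f0u) == or_after(a, 0x0000000fu, 0x000000f0u));
    assert(xor_before(a, 0x00000f0fu, 0x000000ffu) == xor_after(a, 0x00000f0fu, 0x000000ffu));
}
```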
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake
total instructions in shared programs: 903108 -> 902830 (-0.03%)
instructions in affected programs: 654910 -> 654632 (-0.04%)
helped: 31
HURT: 5
helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7
helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04%
HURT stats (abs) min: 1 max: 10 x̄: 3.80 x̃: 3
HURT stats (rel) min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02%
95% mean confidence interval for instructions value: -10.55 -4.89
95% mean confidence interval for instructions %-change: -0.07% -0.03%
Instructions are helped.
total cycles in shared programs: 7059681 -> 7058006 (-0.02%)
cycles in affected programs: 5081309 -> 5079634 (-0.03%)
helped: 33
HURT: 12
helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18
helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05%
HURT stats (abs) min: 1 max: 288 x̄: 27.92 x̃: 2
HURT stats (rel) min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02%
95% mean confidence interval for cycles value: -68.32 -6.12
95% mean confidence interval for cycles %-change: -0.28% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).
Ice Lake
total instructions in shared programs: 895384 -> 895159 (-0.03%)
instructions in affected programs: 658678 -> 658453 (-0.03%)
helped: 37
HURT: 0
helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4
helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for instructions value: -7.46 -4.70
95% mean confidence interval for instructions %-change: -0.04% -0.03%
Instructions are helped.
total cycles in shared programs: 7092224 -> 7091195 (-0.01%)
cycles in affected programs: 5221666 -> 5220637 (-0.02%)
helped: 35
HURT: 11
helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12
helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05%
HURT stats (abs) min: 2 max: 432 x̄: 44.73 x̃: 5
HURT stats (rel) min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02%
95% mean confidence interval for cycles value: -49.00 4.26
95% mean confidence interval for cycles %-change: -0.27% 0.03%
Inconclusive result (value mean confidence interval includes 0).
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611408 -> 17611398 (<.01%)
instructions in affected programs: 1648 -> 1638 (-0.61%)
helped: 2
HURT: 0
total cycles in shared programs: 338366148 -> 338366124 (<.01%)
cycles in affected programs: 124048 -> 124024 (-0.02%)
helped: 2
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
1d36af9338
nir/algebraic: Generalize some and-of-shift-right patterns [v2]
...
Generalizes some of the patterns from 76289fbfa8 and 905ff86198. In
particular, some of the soft-fp64 code generates (a & 0x7fffffff) << 1
when constant 0.0 is compared (flt or feq).
v2: Reduce the set of added patterns to those that actually help
something. This reduces the size of the state transition tables by
about 29k. Suggested by Jason. Remove the existing patterns that this
commit subsumes.
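The soft-fp64 expression mentioned above, (a & 0x7fffffff) << 1, shows why such mask-and-shift combinations are redundant: the only bit the mask clears is the one the shift discards. A small C check, with hypothetical function names:

```c
#include <assert.h>
#include <stdint.h>

/* As generated: clear bit 31, then shift it out anyway. */
uint32_t shl1_masked(uint32_t a)
{
    return (a & 0x7fffffffu) << 1;
}

/* Simplified: the mask has no effect on the 32-bit result. */
uint32_t shl1_plain(uint32_t a)
{
    return a << 1;
}

/* Usage: the two forms agree whether or not bit 31 is set. */
void shift_mask_examples(void)
{
    assert(shl1_masked(0x80000001u) == shl1_plain(0x80000001u));
    assert(shl1_masked(0x7fffffffu) == shl1_plain(0x7fffffffu));
    assert(shl1_masked(0xffffffffu) == 0xfffffffeu);
}
```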
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake
total instructions in shared programs: 903171 -> 903108 (<.01%)
instructions in affected programs: 635903 -> 635840 (<.01%)
helped: 25
HURT: 11
helped stats (abs) min: 1 max: 16 x̄: 5.04 x̃: 3
helped stats (rel) min: <.01% max: 0.15% x̄: 0.04% x̃: 0.03%
HURT stats (abs) min: 2 max: 14 x̄: 5.73 x̃: 5
HURT stats (rel) min: <.01% max: 0.11% x̄: 0.04% x̃: 0.02%
95% mean confidence interval for instructions value: -3.91 0.41
95% mean confidence interval for instructions %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 7059527 -> 7059681 (<.01%)
cycles in affected programs: 5249401 -> 5249555 (<.01%)
helped: 41
HURT: 9
helped stats (abs) min: 2 max: 76 x̄: 11.90 x̃: 10
helped stats (rel) min: <.01% max: 11.86% x̄: 0.99% x̃: 0.01%
HURT stats (abs) min: 2 max: 380 x̄: 71.33 x̃: 12
HURT stats (rel) min: <.01% max: 0.22% x̄: 0.04% x̃: 0.01%
95% mean confidence interval for cycles value: -14.93 21.09
95% mean confidence interval for cycles %-change: -1.40% -0.20%
Inconclusive result (value mean confidence interval includes 0).
Ice Lake
total instructions in shared programs: 895506 -> 895384 (-0.01%)
instructions in affected programs: 658800 -> 658678 (-0.02%)
helped: 37
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.30 x̃: 2
helped stats (rel) min: <.01% max: 0.03% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -4.00 -2.59
95% mean confidence interval for instructions %-change: -0.02% -0.02%
Instructions are helped.
total cycles in shared programs: 7092748 -> 7092224 (<.01%)
cycles in affected programs: 5272008 -> 5271484 (<.01%)
helped: 36
HURT: 14
helped stats (abs) min: 2 max: 440 x̄: 21.67 x̃: 8
helped stats (rel) min: <.01% max: 11.86% x̄: 1.12% x̃: 0.02%
HURT stats (abs) min: 2 max: 122 x̄: 18.29 x̃: 6
HURT stats (rel) min: <.01% max: 0.07% x̄: 0.01% x̃: <.01%
95% mean confidence interval for cycles value: -29.24 8.28
95% mean confidence interval for cycles %-change: -1.40% -0.21%
Inconclusive result (value mean confidence interval includes 0).
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611489 -> 17611408 (<.01%)
instructions in affected programs: 21188 -> 21107 (-0.38%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 3.78 x̃: 3
helped stats (rel) min: 0.03% max: 5.82% x̄: 1.13% x̃: 0.85%
HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel) min: 0.60% max: 0.60% x̄: 0.60% x̃: 0.60%
95% mean confidence interval for instructions value: -5.27 -1.48
95% mean confidence interval for instructions %-change: -1.70% -0.42%
Instructions are helped.
total cycles in shared programs: 338418502 -> 338366148 (-0.02%)
cycles in affected programs: 2289052 -> 2236698 (-2.29%)
helped: 18
HURT: 3
helped stats (abs) min: 4 max: 18000 x̄: 2909.67 x̃: 38
helped stats (rel) min: 0.09% max: 4.07% x̄: 0.96% x̃: 0.43%
HURT stats (abs) min: 2 max: 14 x̄: 6.67 x̃: 4
HURT stats (rel) min: 0.22% max: 1.13% x̄: 0.66% x̃: 0.64%
95% mean confidence interval for cycles value: -5204.00 217.91
95% mean confidence interval for cycles %-change: -1.31% -0.14%
Inconclusive result (value mean confidence interval includes 0).
Ivy Bridge
total instructions in shared programs: 11875617 -> 11875615 (<.01%)
instructions in affected programs: 1339 -> 1337 (-0.15%)
helped: 2
HURT: 0
No changes on any earlier Intel platforms.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
d6d63aec18
nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)
...
Like 70f9e2589e. Also scrub the unnecessary size qualifier in both
replacement patterns.
This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.
Perhaps the patterns that generate umin should be conditioned on
something, but I'm not sure what. lower_bitops might cover the cases
that matter, but it seems ugly.
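A minimal sketch of why the rewrite in the subject line is safe, assuming a
32-bit integer model in which a true comparison yields all ones. The helper
names (ine, ior, before, after) only mirror the opcode names; they are
illustrative and not Mesa code.

```python
from itertools import product

MASK32 = 0xFFFFFFFF  # model values as 32-bit unsigned integers (assumption)


def ine(a, b):
    """Integer not-equal: all ones for true, zero for false (assumed encoding)."""
    return MASK32 if a != b else 0


def ior(a, b):
    """Bitwise OR truncated to 32 bits."""
    return (a | b) & MASK32


def before(a, b):
    # Original pattern: two comparisons joined by an OR.
    return ior(ine(a, 0), ine(b, 0))


def after(a, b):
    # Replacement pattern: one OR followed by a single comparison.
    return ine(ior(a, b), 0)


samples = [0, 1, 2, 0x80000000, MASK32]
equivalent = all(before(a, b) == after(a, b)
                 for a, b in product(samples, repeat=2))
```

The OR of two values is non-zero exactly when at least one of them is
non-zero, so the replacement saves one comparison.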
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 936505 -> 933388 (-0.33%)
instructions in affected programs: 925719 -> 922602 (-0.34%)
helped: 154
HURT: 1
helped stats (abs) min: 1 max: 211 x̄: 35.45 x̃: 16
helped stats (rel) min: 0.34% max: 9.30% x̄: 2.28% x̃: 0.96%
HURT stats (abs) min: 2342 max: 2342 x̄: 2342.00 x̃: 2342
HURT stats (rel) min: 2.28% max: 2.28% x̄: 2.28% x̃: 2.28%
95% mean confidence interval for instructions value: -51.21 10.99
95% mean confidence interval for instructions %-change: -2.61% -1.89%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 7323502 -> 7306184 (-0.24%)
cycles in affected programs: 7220376 -> 7203058 (-0.24%)
helped: 126
HURT: 1
helped stats (abs) min: 2 max: 946 x̄: 159.10 x̃: 95
helped stats (rel) min: 0.01% max: 9.62% x̄: 0.80% x̃: 0.37%
HURT stats (abs) min: 2728 max: 2728 x̄: 2728.00 x̃: 2728
HURT stats (rel) min: 0.37% max: 0.37% x̄: 0.37% x̃: 0.37%
95% mean confidence interval for cycles value: -192.07 -80.66
95% mean confidence interval for cycles %-change: -1.07% -0.51%
Cycles are helped.
total spills in shared programs: 635 -> 817 (28.66%)
spills in affected programs: 635 -> 817 (28.66%)
helped: 0
HURT: 3
total fills in shared programs: 2065 -> 2438 (18.06%)
fills in affected programs: 2019 -> 2392 (18.47%)
helped: 0
HURT: 2
Regular shader-db results:
All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611506 -> 17611489 (<.01%)
instructions in affected programs: 33442 -> 33425 (-0.05%)
helped: 32
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 1.69 x̃: 1
helped stats (rel) min: 0.08% max: 1.90% x̄: 0.27% x̃: 0.11%
HURT stats (abs) min: 1 max: 15 x̄: 6.17 x̃: 5
HURT stats (rel) min: 0.09% max: 1.50% x̄: 0.65% x̃: 0.55%
95% mean confidence interval for instructions value: -1.70 0.80
95% mean confidence interval for instructions %-change: -0.30% 0.05%
Inconclusive result (value mean confidence interval includes 0).
total cycles in shared programs: 338419218 -> 338418502 (<.01%)
cycles in affected programs: 385795 -> 385079 (-0.19%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 192 x̄: 24.57 x̃: 16
helped stats (rel) min: 0.04% max: 2.09% x̄: 0.33% x̃: 0.22%
HURT stats (abs) min: 64 max: 164 x̄: 105.33 x̃: 88
HURT stats (rel) min: 0.77% max: 1.58% x̄: 1.09% x̃: 0.93%
95% mean confidence interval for cycles value: -29.76 -2.06
95% mean confidence interval for cycles %-change: -0.40% -0.07%
Cycles are helped.
Ivy Bridge and Sandy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11875620 -> 11875617 (<.01%)
instructions in affected programs: 421 -> 418 (-0.71%)
helped: 2
HURT: 0
total cycles in shared programs: 178245336 -> 178245326 (<.01%)
cycles in affected programs: 3425 -> 3415 (-0.29%)
helped: 2
HURT: 0
No changes on Gen4 or Gen5.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Ian Romanick
88eb8f190b
nir/algebraic: Simplify logic to detect sign of an integer
...
This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.
v2: Fix a typo in a comment. Noticed by Matt. Copy the correct fp64
shader-db results to the commit message. I realized that I had
accidentally used the results from the next commit.
Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:
Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 906235 -> 906149 (<.01%)
instructions in affected programs: 353966 -> 353880 (-0.02%)
helped: 31
HURT: 2
helped stats (abs) min: 1 max: 8 x̄: 3.03 x̃: 3
helped stats (rel) min: 0.01% max: 1.59% x̄: 0.10% x̃: 0.04%
HURT stats (abs) min: 3 max: 5 x̄: 4.00 x̃: 4
HURT stats (rel) min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -3.51 -1.70
95% mean confidence interval for instructions %-change: -0.19% <.01%
Inconclusive result (%-change mean confidence interval includes 0).
total cycles in shared programs: 7076552 -> 7076173 (<.01%)
cycles in affected programs: 2878361 -> 2877982 (-0.01%)
helped: 37
HURT: 2
helped stats (abs) min: 2 max: 48 x̄: 10.81 x̃: 6
helped stats (rel) min: <.01% max: 2.17% x̄: 0.47% x̃: 0.01%
HURT stats (abs) min: 1 max: 20 x̄: 10.50 x̃: 10
HURT stats (rel) min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -13.96 -5.48
95% mean confidence interval for cycles %-change: -0.72% -0.16%
Cycles are helped.
total fills in shared programs: 2064 -> 2065 (0.05%)
fills in affected programs: 45 -> 46 (2.22%)
helped: 0
HURT: 1
Regular shader-db results:
All Gen7+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611530 -> 17611506 (<.01%)
instructions in affected programs: 5934 -> 5910 (-0.40%)
helped: 10
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 2
helped stats (rel) min: 0.14% max: 1.24% x̄: 0.47% x̃: 0.34%
95% mean confidence interval for instructions value: -3.53 -1.27
95% mean confidence interval for instructions %-change: -0.78% -0.17%
Instructions are helped.
total cycles in shared programs: 338419178 -> 338419218 (<.01%)
cycles in affected programs: 19244 -> 19284 (0.21%)
helped: 4
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.05% max: 0.11% x̄: 0.08% x̃: 0.08%
HURT stats (abs) min: 26 max: 26 x̄: 26.00 x̃: 26
HURT stats (rel) min: 1.20% max: 1.20% x̄: 1.20% x̃: 1.20%
95% mean confidence interval for cycles value: -9.08 22.41
95% mean confidence interval for cycles %-change: -0.35% 1.04%
Inconclusive result (value mean confidence interval includes 0).
No changes on any earlier Intel platform.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>
2020-03-18 20:36:29 +00:00
Jose Fonseca
f6dad10d04
meson: Avoid duplicate symbols.
...
All the stubs in src/compiler/glsl/glcpp/pp_standalone_scaffolding.c
are duplicate symbols. They should only be used as replacements for
Mesa functions when building the glcpp and glsl standalone compilers,
but in fact they are getting linked with Mesa.
This change fixes this by moving the standalone stubs to a
libglcpp_standalone target that's only linked with the glcpp/glsl
tools.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>
2020-03-16 11:52:26 +00:00
Tapani Pälli
5910c938a2
nir/glsl: gather bitmask of images used by program
...
In a similar fashion as commit f5c7df4dc9 does for textures.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
2020-03-16 10:34:21 +00:00
Danylo Piliaiev
1305b93274
glsl: do not crash if string literal is used outside of #include/#line
...
Fixes: 67b32190f3
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2619
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
2020-03-13 11:49:06 +02:00
Eric Anholt
d0a52432b1
glsl/tests: Fix waiting for disk_cache_put() to finish.
...
We were wasting 4s on waiting for expected-not-to-appear files to show
up on every test. Using timeouts in test code is error-prone anyway,
as our shared runners may be busy on other jobs.
Fixes: 50989f87e6 ("util/disk_cache: use a thread queue to write to shader cache")
Link: https://gitlab.freedesktop.org/mesa/mesa/issues/2505
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
2020-03-12 19:47:23 +00:00
Eric Anholt
e178bca5cc
glsl/tests: Catch mkdir errors to help explain when they happen.
...
A recent pipeline
(https://gitlab.freedesktop.org/Venemo/mesa/-/jobs/1893357) failed
with what looks like an intermittent error related to making files for
the cache test inside of the core of the cache. Given some of the
errors, it looks like maybe a mkdir failed, so log those errors
earlier so we can debug what's going on next time.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
2020-03-12 19:47:23 +00:00
Caio Marcelo de Oliveira Filho
bf432cd831
nir: Add pass to combine adjacent scoped memory barriers
...
SPIR-V generates very granular barriers; however, HW and backends might
not necessarily take advantage of those. This pass provides a general
mechanism to combine such barriers.
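A rough sketch of what combining adjacent barriers could look like. The merge
rule used here (widest scope, union of semantics and modes) is an assumption
made for illustration, and Scope, Semantics, Modes, Barrier and
combine_adjacent are hypothetical names, not the NIR API.

```python
from dataclasses import dataclass
from enum import IntEnum, IntFlag


class Scope(IntEnum):
    # Hypothetical scopes; a wider scope has a larger numeric value.
    INVOCATION = 1
    SUBGROUP = 2
    WORKGROUP = 3
    DEVICE = 4


class Semantics(IntFlag):
    ACQUIRE = 1
    RELEASE = 2


class Modes(IntFlag):
    SSBO = 1
    SHARED = 2


@dataclass(frozen=True)
class Barrier:
    scope: Scope
    semantics: Semantics
    modes: Modes


def combine(a, b):
    """Merge two barriers into one that is at least as strong as both."""
    return Barrier(max(a.scope, b.scope),
                   a.semantics | b.semantics,
                   a.modes | b.modes)


def combine_adjacent(instrs):
    """Fold each run of back-to-back barriers into a single barrier."""
    out = []
    for ins in instrs:
        if isinstance(ins, Barrier) and out and isinstance(out[-1], Barrier):
            out[-1] = combine(out[-1], ins)
        else:
            out.append(ins)
    return out


program = [
    Barrier(Scope.SUBGROUP, Semantics.ACQUIRE, Modes.SSBO),
    Barrier(Scope.WORKGROUP, Semantics.RELEASE, Modes.SHARED),
    "load",  # any non-barrier instruction ends the run
    Barrier(Scope.DEVICE, Semantics.ACQUIRE, Modes.SSBO),
]
result = combine_adjacent(program)
```

In this sketch the first two barriers collapse into one workgroup-scoped
barrier, while the barrier after the load is left alone.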
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
2020-03-12 19:21:36 +00:00
Caio Marcelo de Oliveira Filho
d31a8ed8fd
nir: Reorder nir_scopes so wider scope has larger numeric value
...
Makes code comparing and combining scopes slightly more readable.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
2020-03-12 19:21:36 +00:00
Caio Marcelo de Oliveira Filho
67fc88fbb9
nir: Don't skip a bit in nir_memory_semantics
...
There was another enum entry in the draft versions of
nir_memory_semantics, but when it got dropped the entries were not
updated.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
2020-03-12 19:21:36 +00:00
Juan A. Suarez Romero
90550b2a3e
nir/algebraic: coalesce fmod lowering
...
As the fmod lowering for 16/32/64 bits does the same thing, let's merge
all of them into a single case.
Fixes dEQP-VK.glsl.builtin.precision_double.mod.compute.* on ACO.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
2020-03-12 16:42:52 +00:00
Juan A. Suarez Romero
acd0dd3b4b
nir/lower_double_ops: relax lower mod()
...
Currently when lowering mod() we add an extra instruction so if
mod(a,b) == b then 0 is returned instead of b, as mathematically
mod(a,b) is in the interval [0, b).
But the Vulkan spec has relaxed this restriction, and allows the result to
be in the interval [0, b].
For the OpenGL case, while the spec does not allow this behaviour, due
to the allowed precision errors we can end up having the same result, so
from a practical point of view, this behaviour is allowed (see
https://github.com/KhronosGroup/VK-GL-CTS/issues/51).
This commit takes this into account to remove the extra instruction
required to return 0 instead.
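A small illustration of how rounding can push the result onto b. It assumes
the textbook formulation mod(a, b) = a - b * floor(a / b) evaluated in
double precision; the exact instruction sequence emitted by the lowering is
not shown here, and lowered_mod and clamped_mod are hypothetical names.

```python
import math


def lowered_mod(a, b):
    """Textbook mod lowering (assumed form): a - b * floor(a / b)."""
    return a - b * math.floor(a / b)


def clamped_mod(a, b):
    """Same lowering plus the extra step that maps a result of b back to 0."""
    r = lowered_mod(a, b)
    return 0.0 if r == b else r


a, b = -1e-17, 1.0
# floor(a / b) is -1, so the exact result is 1 - 1e-17, which is not
# representable as a double and rounds to 1.0, i.e. exactly b.
relaxed = lowered_mod(a, b)
strict = clamped_mod(a, b)
```

Here relaxed lands on b, inside the closed interval [0, b], while strict
pays for one more comparison and select to stay inside [0, b).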
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
2020-03-12 16:42:52 +00:00
Timur Kristóf
ec16535b49
nir: Add ability to lower non-const quad broadcasts to const ones.
...
Some hardware doesn't support subgroup shuffle, and on such hardware
it makes no sense to lower quad broadcasts to shuffle. Instead, let's
lower them to four const quad broadcasts, paired with bcsel instructions.
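The lowering described above can be simulated roughly as follows. This is a
sketch under stated assumptions: invocations are numbered so that each group
of four consecutive ones forms a quad, and quad_broadcast_const, bcsel and
quad_broadcast_dynamic are hypothetical stand-ins rather than NIR builder
calls.

```python
def quad_broadcast_const(values, invocation, lane):
    """Constant-index quad broadcast: read lane 0..3 of this invocation's quad."""
    quad_base = invocation & ~3  # first invocation of the quad
    return values[quad_base + lane]


def bcsel(cond, a, b):
    """Select a when cond is true, otherwise b."""
    return a if cond else b


def quad_broadcast_dynamic(values, invocation, index):
    """Non-constant broadcast built from four constant ones plus selects."""
    result = quad_broadcast_const(values, invocation, 3)
    for lane in (2, 1, 0):
        candidate = quad_broadcast_const(values, invocation, lane)
        result = bcsel(index == lane, candidate, result)
    return result


values = list(range(100, 108))  # two quads of per-invocation values
picked = [quad_broadcast_dynamic(values, inv, inv % 4) for inv in range(8)]
```

All four constant broadcasts are evaluated unconditionally, and the selects
pick the one that matches the run-time index, so no shuffle is needed.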
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
2020-03-12 13:16:07 +00:00
Rhys Perry
4d0203aa83
glsl/list: use uintptr_t for exec_node_data()'s subtraction
...
This fixes UBSan warnings when foreach_list_typed_safe() passes NULL:
pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffa8
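The address in the warning is what a NULL base minus a field offset wraps to.
Doing the subtraction on uintptr_t makes that wrap well-defined unsigned
arithmetic instead of out-of-bounds pointer arithmetic. A sketch of the
arithmetic, where the offset 0x58 is an assumption picked to match the
address above and node_data is a hypothetical model, not the macro itself:

```python
MASK64 = (1 << 64) - 1  # model a 64-bit uintptr_t
FIELD_OFFSET = 0x58     # hypothetical offsetof(type, field)


def node_data(node_addr, field_offset):
    """Container-of on integers: subtraction wraps modulo 2**64."""
    return (node_addr - field_offset) & MASK64


wrapped = node_data(0, FIELD_OFFSET)  # NULL node, as in the warning
normal = node_data(0x1000, FIELD_OFFSET)
```

With a NULL node the result is 0xffffffffffffffa8, the same value UBSan
reported; with a real node address it is simply the start of the containing
object.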
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4157>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4157>
2020-03-12 12:09:07 +00:00
Rob Clark
3535797e8c
nir/print: show variable precision
...
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
2020-03-10 16:01:39 +00:00
Neil Roberts
83e20139db
glsl/opt_minmax: Add support for float16
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
e3cc81e86c
glsl/lower_instructions: Handle fp16 for FDIV_TO_MUL_RCP
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Hyunjun Ko
4fcac46cbd
glsl/lower_instructions: Handle fp16 for MOD_TO_FLOOR
...
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
6c1c2b779a
glsl/lower_instructions: Use float16 constants when appropriate
...
When lowering instructions that involve floating-point constants, pick
the appropriate type for the constant so that it will also work with
float16 parameters.
v2: Use float16_t constructor instead of helper function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
2b39bb4fc0
glsl/validate: Allow float16 in the expression tree
...
v2. [Hyunjun Ko (zzoon@igalia.com)] squashed 3 commits
into one commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
198d4a535b
glsl: Add type queries for fp16+float and fp16+float+double
...
Following the is_integer_32_64() convention, add is_float_16_32() and
is_float_16_32_64() for these commonly tested combinations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Hyunjun Ko
ad27eb28d9
glsl: Handle fp16 unary operations when lowering matrix operations
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
1b8edffaa5
glsl: Add ir_unop_f2fmp
...
This is the same as ir_unop_f2f16 except that it comes with a promise
that it is safe to optimise it out if the result is immediately
converted back to float32 again. Normally this would be a lossy
operation but it is safe to do if the conversion was generated as part
of the precision lowering pass.
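To see why a float32 -> float16 -> float32 round trip cannot normally be
removed, here is a sketch using Python's struct half-precision format. The
helpers to_f32 and to_f16 are hypothetical stand-ins for the conversions, not
the GLSL IR opcodes.

```python
import struct


def to_f32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack('<f', struct.pack('<f', x))[0]


def to_f16(x):
    """Round a Python float to the nearest float16 value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]


x = to_f32(0.1)
round_trip = to_f32(to_f16(x))  # narrow, then widen again
lossy = round_trip != x

exact = to_f32(to_f16(0.5)) == 0.5  # 0.5 is representable in float16
```

For most inputs the narrowing discards bits, so dropping the pair would
change the result. The f2fmp variant marks conversions where dropping them
is acceptable anyway.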
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
5d6b007da8
glsl: Add b2f16 and f162b conversion operations
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
6b9f6caf06
glsl: Add IR conversion ops for 16-bit float types
...
Adds ir_unop_f162f and ir_unop_f2f16.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
878a35db9d
glsl: Expand fp16 to float before constant expression evaluation
...
This way the generated constant folding code doesn't need to
understand fp16. All operations have to be expanded to full float for
evaluation on the CPU, so we might as well do it up front. As far as
GLSL is concerned, fp16 isn't a separate type from float, so
everything we're supposed to support for float we need to do for fp16.
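A sketch of the idea: widen the fp16 operands once, reuse the ordinary float
folding, and narrow the result at the end. Python floats are doubles rather
than float32, so this only approximates the real evaluation, and to_fp16 and
fold_fp16 are hypothetical helper names.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest float16 value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]


def fold_fp16(op, *operands):
    """Fold a constant expression on fp16 operands using float arithmetic."""
    widened = [float(v) for v in operands]  # expand fp16 -> float up front
    return to_fp16(op(*widened))            # evaluate, then narrow once


def add(a, b):
    return a + b


exact_sum = fold_fp16(add, 1.5, 2.25)      # 3.75 fits in float16
rounded_sum = fold_fp16(add, 2048.0, 0.5)  # float16 spacing is 2 here
```

The folding function itself never needs a float16 code path; only the final
narrowing knows about the smaller type.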
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
505428f20b
glsl: Implement constant propagation for fp16
...
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
83afebf359
glsl: Add fp16 case for ir_triop_lrp optimization
...
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Neil Roberts
668ab9f19d
glsl: Add support for float16 types in the IR tree
...
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00
Kristian H. Kristensen
4068d6baff
glsl: Add ir_constant constructor for fp16
...
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
2020-03-09 16:31:08 +00:00