From 4f7cae5e61cb8397bf8cef55ad110d8aedf866e3 Mon Sep 17 00:00:00 2001
From: Alyssa Rosenzweig
Date: Fri, 27 Jun 2025 15:38:26 -0400
Subject: [PATCH] nir/opt_algebraic: add trichotomy identity
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/35802 we will
significantly rework geometry shaders & transform feedback. In the new approach,
transform feedback is executed as part of the hardware vertex shader, meaning
the vertex shader needs to write out all the "copies" of the same value into
different parts of the XFB buffer.

In the general case of a GS writing triangle strips, we get 0-3 copies. This is
good and lets us parallelize XFB better with GS.

In the case of a VS alone with XFB, we insert a passthrough GS. In that special
case, we can only get at most 1 copy, so if we can prove the length of the
output strip is 3 we can delete 2/3 of the shader.

Anyway, the only thing preventing NIR from doing that optimization is failing to
see through some conditionals, fixed by optimizing with the law of trichotomy.

We could add other variants of this pattern (signed vs unsigned, iand vs
ior/ixor) if we expect anything else to hit this other than my boutique use
case.

Signed-off-by: Alyssa Rosenzweig
Reviewed-by: Marek Olšák
Reviewed-by: Mary Guillemard
Part-of:
---
 src/compiler/nir/nir_opt_algebraic.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 4d6884861fd..15718f09df1 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -1074,6 +1074,10 @@ optimizations.extend([
    (('iand', ('uge(is_used_once)', a, b), ('uge', a, c)), ('uge', a, ('umax', b, c))),
    (('iand', ('uge(is_used_once)', a, c), ('uge', b, c)), ('uge', ('umin', a, b), c)),
 
+   # Law of trichotomy. This pattern is load-bearing on AGX for optimizing
+   # emulated transform feedback.
+   (('iand', ('uge', a, b), ('ult', a, b)), False),
+
    # A number of shaders contain a pattern like a.x < 0.0 || a.x > 1.0 || a.y
    # < 0.0, || a.y > 1.0 || ... These patterns rearrange and replace in a
    # single step. Doing just the replacement can lead to an infinite loop as