According to HSD 14016252163, if a compute shader uses the sample
operation, using Morton walk order and setting the thread group batch
size to 4 is expected to increase sampler cache hit rates by increasing
sample address locality within a subslice.

Rework:
 * Caio: "||" => "&&" for type checking in instr_uses_sampler()
 * Jordan: Use nir's foreach macros rather than
   nir_shader_lower_instructions()

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32430>