Previously, we walked all the instructions in the block once for each
first-rpt instruction in that block. Instead, on the first query per
block, build a set of all the rpts in the block so that the remaining
queries are O(1) lookups.
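
The caching pattern described above can be sketched roughly as follows.
This is a minimal standalone illustration, not the code from this
change: `struct instr`, `struct block`, `MAX_ID`, `build_rpt_set()`,
`block_has_rpt()` and the `walks` counter are all hypothetical names,
and a plain boolean array stands in for whatever set type the real code
uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ID 64 /* hypothetical upper bound on instruction ids */

/* Hypothetical stand-ins; not the driver's real IR types. */
struct instr {
   unsigned id;
   bool is_rpt; /* assumed flag: instruction is a rpt */
};

struct block {
   const struct instr *instrs;
   size_t count;
   bool rpt_set_valid;   /* has the set been built for this block? */
   bool rpt_set[MAX_ID]; /* membership set keyed by instruction id */
   unsigned walks;       /* counts full walks, for illustration only */
};

/* Walk the block once and record every rpt in the set. */
static void
build_rpt_set(struct block *blk)
{
   memset(blk->rpt_set, 0, sizeof(blk->rpt_set));
   for (size_t i = 0; i < blk->count; i++) {
      const struct instr *in = &blk->instrs[i];
      assert(in->id < MAX_ID);
      if (in->is_rpt)
         blk->rpt_set[in->id] = true;
   }
   blk->rpt_set_valid = true;
   blk->walks++;
}

/* The first query per block pays for the walk; later queries are O(1). */
static bool
block_has_rpt(struct block *blk, unsigned id)
{
   if (!blk->rpt_set_valid)
      build_rpt_set(blk);
   return id < MAX_ID && blk->rpt_set[id];
}
```

In this sketch the set would have to be invalidated (by clearing
`rpt_set_valid`) if the block's instructions changed after the first
query; whether that matters for the real pass is not stated here.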
shader-db runtime for deadspace3 -7.60909% +/- 2.28996% (n=10) on a
debugoptimized build.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/37625>