2cc6da85d6
Haven't looked at what code this exactly generates but URem can't be fast. Instead of using two URem only use one and replace the second one with select/add (this is what the corresponding aos code already does).