So I boiled the test down to a bare minimum: a benchmark consisting of nothing but the loop. I had to do something within the loop, so I stored the number in a variable.
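A minimal sketch of what such a pair might look like (the sub names and bodies here are my reconstruction, skipping index 4 either by testing inside the loop or by splitting the range; the original used Benchmark's `-10` for 10 CPU-seconds per version, while this sketch uses a small fixed count to keep it quick):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $x;

# Variant 1: skip index 4 by testing on every iteration.
sub test4 {
    for my $i (0 .. 8) {
        next if $i == 4;
        $x = $i;    # do *something* in the loop: store the number
    }
}

# Variant 2: skip index 4 by splitting the range, so no test is needed.
sub range4 {
    for my $i (0 .. 3, 5 .. 8) {
        $x = $i;
    }
}

# Compare iterations per second for the two variants.
cmpthese(100_000, { test4 => \&test4, range4 => \&range4 });
```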
As the names and the code suggest, I actually had nine test and nine range instances, one for each value in 0..8. While there was a little variation within each set, the difference between the sets was significant! Using a split range was 60% faster than testing within the loop: the test achieved 376923 .. 390809 iterations per second, compared to 596918 .. 606710 for the split range, when each version was run for 10 seconds. Testing just test4 vs range4 for a minute gave similar results:
How about if the loops are larger? If we go from 0 to 80 or 0 to 800 instead of just 0 to 8, the test version should fare worse, since it repeats the comparison on every iteration, while the split range has no extra work to do.
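A sketch of how that scaling check might be run (the limits and the iteration count here are my own choices for illustration, with the skipped index fixed at 4):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $x;

# Rerun the same comparison while the loop's upper limit
# grows by powers of ten.
for my $limit (8, 80, 800) {
    print "\nupper limit: $limit\n";
    cmpthese(10_000, {
        test  => sub { for my $i (0 .. $limit) { next if $i == 4; $x = $i } },
        range => sub { for my $i (0 .. 3, 5 .. $limit) { $x = $i } },
    });
}
```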
And in fact the relative speed improves linearly as the loop size grows by powers of ten, from an upper limit of 8 up to 8,000, after which the improvement decreases ... I guess that's log(N). But by 80,000 the benchmark is only doing 50 or 100 reps a second, so its accuracy may be fading.
So the question stands: in isolated testing, the split range is definitely faster than the test within the loop. Why, then, does the test version perform better in the otherwise identical program? After all, all that was happening within the loop was hash and array dereferencing, and an integer comparison.
But the test wrapper,
Of course, the real lesson is that profiling programs that take 1/10 of a second overall is a waste of time.