
More Perl 5 - Comparing "split range" vs "test in loop"

Although the difference I found between the split range and the test-within-loop implementations was small, it still seems wrong to me.

So I boiled the test down to the bare minimum. I created a benchmark consisting of nothing but the loop. I had to do something within the loop, so I stored the number in a variable.


test8 => sub {
    for ( 0 .. 8 ) {
        next if $_ == 8;
        my $j = $_;
    }
},
range0 => sub {
    for ( 0 .. -1, 1 .. 8 ) {
        my $j = $_;
    }
},
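The two entries above are fragments of a larger hash. Here is a minimal sketch of how a complete script might look, assuming the entries are passed to cmpthese from the Benchmark module (the output tables in this post are in that format). The helpers make_test, make_range and build_cases are hypothetical names, not the original code. They use closures with variable bounds rather than the hard-coded literal ranges shown above, so the timings they produce may differ from mine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $MAX = 8;    # upper limit of the loop, as in the post

# Hypothetical helper: loop over 0..$MAX, skipping $skip with a test in the loop.
sub make_test {
    my ($skip) = @_;
    return sub {
        for ( 0 .. $MAX ) {
            next if $_ == $skip;
            my $j = $_;
        }
    };
}

# Hypothetical helper: loop over the two ranges on either side of $skip.
sub make_range {
    my ($skip) = @_;
    return sub {
        for ( 0 .. $skip - 1, $skip + 1 .. $MAX ) {
            my $j = $_;
        }
    };
}

# Build test0..test8 and range0..range8.
sub build_cases {
    my %cases;
    for my $skip ( 0 .. $MAX ) {
        $cases{"test$skip"}  = make_test($skip);
        $cases{"range$skip"} = make_range($skip);
    }
    return \%cases;
}

my $cases   = build_cases();
my $seconds = 1;    # assumption for a quick run; the post used 10 seconds per version

# Compare a single pair; pass $cases itself to compare all 18 versions.
cmpthese( -$seconds, { test4 => $cases->{test4}, range4 => $cases->{range4} } );
```

A negative first argument tells cmpthese to run each version for at least that many CPU seconds rather than for a fixed number of iterations.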



As the names and the code suggest, I actually had 9 test and 9 range instances, one for each value in 0..8. While there was a little variation within each set, the difference between the sets was significant! Using a split range was roughly 55% to 60% faster than testing within the loop. The test versions achieved 376923 .. 390809 iterations per second, compared to 596918 .. 606710 for the split range, when run for 10 seconds per version. Testing just test4 vs range4 for a minute gave similar results:


           Rate test4 range4
test4  386996/s    --   -35%
range4 593520/s   53%     --



How about if the loops are larger? If we go from 0 to 80 or 0 to 800 instead of just 0 to 8, the test should get worse, since it's repeating the test over and over, while the split range has no extra work to do.


0-8,000
        Rate test4 range4
test4  540/s    --   -39%
range4 886/s   64%     --

0-80,000
         Rate test4 range4
test4  55.4/s    --   -38%
range4 88.7/s   60%     --



And in fact the relative speed improves linearly as the loop size grows by powers of ten, from an upper limit of 8 to 8,000, after which the improvement decreases ... I guess that's log(N). But by 80,000 it's only doing 50 to 100 reps a second, so accuracy may be fading.

So in isolated testing, the split range is definitely better than the test within the loop. The question is, why does the test perform better in the otherwise identical program? After all, all that was happening within the loop was hash and array de-referencing, and an integer comparison.
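To make the comparison across loop sizes concrete, here is a small sketch that recomputes the relative speed from the rates in the tables above. The rates are copied from this post; the helper name percent_faster is just for illustration.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the tables in the post.
my @results = (
    { limit => 8,      test => 386996, range => 593520 },
    { limit => 8_000,  test => 540,    range => 886 },
    { limit => 80_000, test => 55.4,   range => 88.7 },
);

# Percentage by which $fast exceeds $slow, rounded to a whole percent.
sub percent_faster {
    my ( $slow, $fast ) = @_;
    return sprintf '%.0f', 100 * ( $fast / $slow - 1 );
}

for my $r (@results) {
    printf "0..%-6d range is %s%% faster than test\n",
        $r->{limit}, percent_faster( $r->{test}, $r->{range} );
}
```

This gives 53%, 64% and 60% for upper limits of 8, 8,000 and 80,000, matching the percentages in the tables.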


return # collision
if $val == $self->{grid}[$row][$c];



Using -d:FProf, I see that BruteForceTest.pl took 0.030 seconds to run a total of 5450 row, column and block tests, while BruteForceExplicitList.pl took 0.026 seconds for 5450 calls to the unified test. Using the timings from the first test, above, the explicit list loop should have taken 9 ms while the repeated test should have taken 14 ms. Presumably the remaining 16 ms or so is the hash and array de-referencing and the comparison.

The test wrapper, cell_value_ok, used to be a trivial call to the row, column and block tests, but now it constructs a few arrays. That must be why it has gone from 0.006 ms to 0.015 ms.

Of course the real lesson is that profiling programs that take 1/10 second overall is a waste of time.
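The 9 ms and 14 ms estimates can be reproduced with a few lines of arithmetic. This sketch assumes each of the 5450 calls runs one complete 0..8 loop, and it uses the test4 and range4 rates from the one-minute benchmark above; the helper name loop_ms is hypothetical.

```perl
use strict;
use warnings;

my $calls = 5450;    # tests reported by the profiler in the post

# Rates (complete loops per second) from the test4 vs range4 table.
my %rate = ( test => 386996, range => 593520 );

# Total run times from the post, in milliseconds.
my %total_ms = ( test => 30, range => 26 );

# Expected loop overhead in ms, assuming one complete loop per call.
sub loop_ms {
    my ( $calls, $rate ) = @_;
    return 1000 * $calls / $rate;
}

for my $kind (qw(test range)) {
    my $loop = loop_ms( $calls, $rate{$kind} );
    printf "%-5s loop %.0f ms, remainder %.0f ms\n",
        $kind, $loop, $total_ms{$kind} - $loop;
}
```

With these inputs the loop overhead comes to about 14 ms for the test version and 9 ms for the split range, leaving roughly 16 ms and 17 ms for everything else.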
