Sudoku in Perl5 #2 - Optimizing the Perl5 implementation

The test in BruteForceTest irritated me .. iterating across the whole row, testing each cell to see if it was the cell we didn't want to test against. And same for each row of the column, and each cell of the block. All those repeated tests must add up to a lot of wasted time, right?

So I came up with BruteForceRange, which doesn't iterate over the whole row, only the portions to the left and right of the current cell. In the column, it iterates over the cells above and below ... the block test is more complicated, we'll get to it.

The program is mostly the same as BruteForceTest. The first difference is the name changes:


package BruteForceRange;

Instead of specifying 0..8 for the row or column for loop ...


    for my $c ( 0 .. 8 ) {
        next CELL    # don't test against self.
            if $c == $col;

the new version specifies a portion to the left, and a portion to the right. This is possible because ranges in Perl must increment, a left boundary greater than the right one is an empty range.


     for my $c ( 0..$col-1,$col+1..8 ) {

We no longer need a test, let alone 8 tests that fail. Test_col() is similar, iterating over the rows above and below, but not the actual one we're testing against. test_block() is more complicated. Instead of the former nested block:


      for my $r ( $baserow .. $baserow + 2 ) {
         CELL:
           for my $c ( $basecol .. $basecol + 2 ) {
               next CELL        # don't test against self.
                   if $r == $row and $c == $col;

We still have to iterate over all the rows. If we're not in the test row, scan all the columns, but in the test row, scan only the rows above and below the test column. The ternary operator : test ? value-if-test-was-true : value-if-test-was-false is the compact solution. Using a two-part range within the ternary operator generates an error, unless you add parentheses to delimit the list ... or maybe it's to clarify the bounds of the if-true portion of the operator.


       for my $r ( $baserow .. $baserow + 2 ) {
         CELL:
             for my $c ( $r == $row
                       ? ($basecol..$col-1,$col+1..$basecol + 2)
                       : $basecol .. $basecol + 2 ) {

If the ternary operator combines with the split range freaks you out, an alternative would be to calculate the range between the loops and store the lements in an array:


       for my $r ( $baserow .. $baserow + 2 ) {
         CELL:
             my @columns = ( $r == $row
                           ? ($basecol..$col-1,$col+1..$basecol + 2)
                           : $basecol .. $basecol + 2
                           );
             for my $c ( @columns ) {

I won't show you the driver program I used while testing and verifying the module, it's just like the other one only with slightly different names. More interesting is a program to compare the results:


use BruteForceRange;
use BruteForceTest;
use Benchmark qw/:all/;

my $ARGS = { input => 
         '010056207'
         . '000700005'
         . '000300498'
         . '000200380'
         . '006000700'
         . '051007000'
         . '684002000'
         . '100003000'
         . '305460020'
};

cmpthese( -60, {
    BruteForceTest  => sub { BruteForceTest->new( $ARGS )->solve; },
    BruteForceRange => sub { BruteForceRange->new( $ARGS )->solve; },
      }
    );

new() returns an object for the specified puzzle, which is then told to solve itself. Since we're not going to look at the results, but only solve the problem over and over to see how fast it is, that's the simplest arrangement, and keeps the cmpthese call short. If the first argument to cmpthese is a positive number, it's a count of how many runs to perform; a negative number is the number of seconds to spend running the specified block ... in this case, each version is tested over and over for 15 seconds.

The results? It's a tie! In fact, the superfluous tests are a fraction faster that than the split range, 27.7 reps/second compared to 27.4 reps/second. That's on a 2.66 GHz Mac Pro tower with one Nehalem ( i7 ) processor and 12 GB of 1066 MHz DD3 memory sitting around doing nothing.


bash-3.2$ perl5.10 compare.pl
                  Rate BruteForceRange  BruteForceTest
BruteForceRange 27.4/s              --             -1%
BruteForceTest  27.7/s              1%              --
bash-3.2$

If you aren't Happy
Which Dwarf Are You?

Search This Blog