Adventures in Autovivification

Having recently started a new job, I was exposed to old code with multi-step tests against autovivification in multi-level hashes. You get used to the code you have seen, but in a new environment it's irritating and jarring.
Moose generally does not have the problem: class structure is pre-declared, values are accessed through accessor functions rather than directly, and responsibility is delegated down to attributes, avoiding long chains. On the other hand, Moose has its own overhead, so hand-rolled objects and bare structures still have their uses.
If you don't protect against autovivification, then misspelling a key, or referencing keys which haven't been instantiated in this instance, causes every intermediate level along that path to spring into existence.
 #!/usr/bin/perl
 use warnings;
 use strict;
 use Data::Dump 'dump';
 use 5.024;

 my $var = {key1 => {key2 => {key3 => 'a'}}};
 say dump $var;

 if ( $var->{key1}{key2}{key3b}[13]{foobarbaz} ) {
     say 'Miracles DO happen!';
 }
 say dump $var;

This is ugly, and could potentially have more serious consequences.
 { key1 => { key2 => { key3 => "a" } } }
 {
     key1 => {
         key2 => {
             key3  => "a",
             key3b => [
                          undef, undef, undef, undef,
                          undef, undef, undef, undef,
                          undef, undef, undef, undef,
                          undef, {},
                         ],
                    },
                },
 }
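
The damage is not limited to truth tests. As a minimal sketch in core Perl (the variable names $found, $vivified and $array_len are mine, purely for illustration), even a plain exists test on the same chain creates the intermediate levels, because only the last step is actually tested:

```perl
use strict;
use warnings;

my $var = { key1 => { key2 => { key3 => 'a' } } };

# exists only tests the final key; every step before it is
# dereferenced, and so gets created if it was missing.
my $found = exists $var->{key1}{key2}{key3b}[13]{foobarbaz} ? 1 : 0;

# Inspect the damage without causing any more of it.
my $vivified  = exists $var->{key1}{key2}{key3b} ? 1 : 0;
my $array_len = $vivified ? scalar @{ $var->{key1}{key2}{key3b} } : 0;

print "found=$found vivified=$vivified length=$array_len\n";
# prints: found=0 vivified=1 length=14
```

The key was not found, yet key3b now holds a 14-element array whose last element is an empty hash, matching the dump above.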

Manual Solutions

The manual solution is to assume the existence only of that portion which you can rely on, and then check stage by stage. I suppose in this case I should be verifying that key3b actually is an array, before determining the number of elements it contains, and whether element 13 stores a defined value. And perhaps the other elements should be tested for hash-y-ness. But generally you know what an element is, the question is whether it exists or not.
 my $var = {key1 => {key2=> {key3  => 'a'}}};
 say dump $var;

 if (   $var->{key1}{key2}{key3b}
    &&  13 < scalar @{ $var->{key1}{key2}{key3b} }
    &&  $var->{key1}{key2}{key3b}[13]
    &&  $var->{key1}{key2}{key3b}[13]{foobarbaz} ) {
     say 'Miracles DO happen!';
 }
 say dump $var;

It does lead to a clean data structure.
 { key1 => { key2 => { key3 => "a" } } }
 { key1 => { key2 => { key3 => "a" } } }
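
The stricter variant I alluded to, which also confirms that key3b really is an array and that element 13 really is a hash before looking inside them, might be sketched with ref along these lines. This is only an illustration in core Perl; the $hit flag is a hypothetical stand-in for the say.

```perl
use strict;
use warnings;

my $var = { key1 => { key2 => { key3 => 'a' } } };
my $hit = 0;    # hypothetical flag, set when the whole chain checks out

# ref() returns '' for undef and for non-references, so each
# test fails quietly instead of dereferencing something missing.
if (   ref( $var->{key1}{key2}{key3b} ) eq 'ARRAY'
    && 13 < @{ $var->{key1}{key2}{key3b} }
    && ref( $var->{key1}{key2}{key3b}[13] ) eq 'HASH'
    && $var->{key1}{key2}{key3b}[13]{foobarbaz} ) {
    $hit = 1;
}

print "hit=$hit keys=", join( ',', sort keys %{ $var->{key1}{key2} } ), "\n";
# prints: hit=0 keys=key3
```

The structure stays clean here too, at the cost of even longer tests.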

But there's a two-fold price:
  • many tests AND-ed together 
  • many stages of element reference.
Each of these increases the possibility of an error, and complicates the task of reading and understanding the code.
One improvement is to create a brief alias to the portion of the structure you can rely on. Then the individual tests are shorter, clearer, and less likely to contain an error. And short tests make it easier to understand the set of tests.
 my $var = {key1 => {key2=>{key3  => 'a'}}};
 say dump $var;

 my $alias = $var->{key1}{key2};

 if (   $alias->{key3b}
    &&  13 < @{ $alias->{key3b} } &&  $alias->{key3b}[13]
    &&  $alias->{key3b}[13]{foobarbaz} ) {
     say 'Miracles DO happen!';
 }
 say dump $var;

Creating a variable is wasteful and distracting. In the early 80s the programming language Turing had a keyword to achieve this; I believe it was alias. Perl already provides a very short alias that comes naturally to Perl programmers: $_.
You might use a for loop with a single element, to alias $_ to the reliable portion.
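
A minimal sketch of the single-element for loop, in core Perl, might look like this (the $hit flag is a hypothetical stand-in for the say):

```perl
use strict;
use warnings;

my $var = { key1 => { key2 => { key3 => 'a' } } };
my $hit = 0;    # hypothetical flag, set when the whole chain checks out

# One pass through the loop, with $_ aliased to the reliable portion.
for ( $var->{key1}{key2} ) {
    if (   $_->{key3b}
        && 13 < @{ $_->{key3b} }
        && $_->{key3b}[13]
        && $_->{key3b}[13]{foobarbaz} ) {
        $hit = 1;
    }
}

print "hit=$hit keys=", join( ',', sort keys %{ $var->{key1}{key2} } ), "\n";
# prints: hit=0 keys=key3
```

Because for aliases $_ to the expression rather than copying it, as far as I know the expression in the parentheses is treated as modifiable, so it is safest to name only a level that is already known to exist.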
Or, although smartmatch itself can lead to problems, using given to assign $_ avoids the question of why there's a for loop, especially one with a single element. given is designed to assign $_; that's all it does. It is Perl's alias.
 no warnings 'experimental::smartmatch';

 my $var = {key1 => {key2=>{key3  => 'a'}}};
 say dump $var;

 given ( $var->{key1}{key2} ) {
     if (   $_->{key3b}
        &&  13 < @{ $_->{key3b} } &&  $_->{key3b}[13]
        &&  $_->{key3b}[13]{foobarbaz} ) {
         say 'Miracles DO happen!';
     }
 }
 say dump $var;

Let's Use a Module

But none of these resolves the problem of the many tests.
The autovivification module comes to the rescue!
Faster than a speeding bullet! Able to leap tall buildings in a single bound! …
Oddly, the way you use the autovivification module is to say no autovivification. After all, you want to prevent unintended autovivification, and how confusing would it be if you said use autovivification to disable autovivification?
 no autovivification;

 my $var = {key1 => {key2=>{key3  => 'a'}}};
 say dump $var;

 if ( $var->{key1}{key2}{key3b}[13]{foobarbaz} ) {
     say 'Miracles DO happen!';
 }
 say dump $var;

The result is what you want. The module does all the boring stuff for us, and prevents non-existent keys from suddenly being autovivified.
 { key1 => { key2 => { key3 => "a" } } }
 { key1 => { key2 => { key3 => "a" } } }

Besides protecting during a fetch, the default configuration also protects during an exists test.
 # old way
 say "It's there"
     if exists $var->{key1}{key2}{key3b}
         && 13 < @{ $var->{key1}{key2}{key3b} }
         && exists $var->{key1}{key2}{key3b}[13]
         && exists $var->{key1}{key2}{key3b}[13]{foobarbaz};

 # new way
 say "It's there"
     if exists $var->{key1}{key2}{key3b}[13]{foobarbaz};

... as well as during attempted deletes.
 # old way
 delete $var->{key1}{key2}{key3b}[13]{foobarbaz}
    if exists $var->{key1}{key2}{key3b}
        && 13 < @{ $var->{key1}{key2}{key3b}}
        && exists $var->{key1}{key2}{key3b}[13]
        && exists $var->{key1}{key2}{key3b}[13]{foobarbaz};

 # new way
 delete $var->{key1}{key2}{key3b}[13]{foobarbaz};

autovivification.pm can also prevent accidental assignment to unintended keys.
You can specify on the no autovivification line (which actually invokes the module's unimport routine rather than import) that it should also protect against storing values through levels which do not already exist, by adding store to the list: no autovivification qw<store>. If, for example, you are using hash-based data structures or traditional hash-based objects, you can create the struct or object with all the keys it will need. Inside the constructor you can invoke use autovivification qw<store> to lift the restriction; the pragma is lexically scoped, so protection returns when that scope ends.
 { key1 => { key2 => { key3 => "a" } } }
 Can't vivify reference at 06_store.pl line 16.

Once the object exists, values of existing keys can be modified, but attempting to assign through a level that does not yet exist, such as a new nested key, generates a fatal runtime error.
But, frankly, I don't mind taking responsibility for creating and modifying structures, so long as I'm spared the task of checking component existence.

Alternatives


The sanity.pm module, a competitor for Modern::Perl, common::sense and similar boilerplate condensers, claims to provide autovivification protection.

 use sanity qw(Modern::Perl NO:autovivification);
 use Data::Dump 'dump';

 my $var = {key1 => {key2=>{key3  => 'a'}}};
 say dump $var;

 if ( $var->{key1}{key2}{key3b}[13]{foobarbaz} ) {
     say 'Miracles DO happen!';
 }
 say dump $var;

Unfortunately, I couldn't get it to work properly. In any case, it uses the external autovivification module, so it only provides a benefit if you like its boilerplate condensation. But it is nice to not have to type
 use warnings;
 use strict;
 use 5.024;
 # etc
