Recipe 5.7. Hashes with Multiple Values Per Key

Problem

You want to store more than one value for each key.

Solution

Store an array reference in $hash{$key}, and put the values into that array.
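
As a minimal sketch of that idea (the key name and values here are made-up sample data, not part of the recipe), pushing onto the dereferenced slot creates the array the first time a key is used, and dereferencing the slot later returns every value stored under it:

```perl
use strict;
use warnings;

# Hypothetical sample data; the key and values are illustrative only.
my %hash;

# push autovivifies the array reference the first time a key is seen
push @{ $hash{fruit} }, "apple";
push @{ $hash{fruit} }, "banana";

# Dereference the slot to get every value stored under the key
my @values = @{ $hash{fruit} };
print "fruit: @values\n";
```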

Discussion

You can store only scalar values in a hash. References, however, are scalars. Making $hash{$key} a reference to an array that holds the values for $key therefore solves the problem of storing multiple values under one key. The normal hash operations (insertion, deletion, iteration, and testing for existence) can now be written in terms of array operations like push, splice, and foreach.

This code shows simple insertion into the hash. It processes the output of who(1) on Unix machines and prints a terse listing of users and the ttys they're logged in on:

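%ttys = ();

# Read the output of who(1); the first two fields are the user and the tty.
open(WHO, "who|")                   or die "can't open who: $!";
while (<WHO>) {
    ($user, $tty) = split;
    push( @{$ttys{$user}}, $tty );  # creates the array on first use
}
close(WHO);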

foreach $user (sort keys %ttys) {
    print "$user: @{$ttys{$user}}\n";
}

The heart of the code is the push line, the multihash version of $ttys{$user} = $tty. We interpolate all the tty names in the print line with @{$ttys{$user}}. We'd loop over the anonymous array if, for instance, we wanted to print the owner of each tty:

foreach $user (sort keys %ttys) {
    print "$user: ", scalar( @{$ttys{$user}} ), " ttys.\n";
    foreach $tty (sort @{$ttys{$user}}) {
        @stat = stat("/dev/$tty");
        # Use a separate variable so the loop variable $user is not clobbered.
        $owner = @stat ? ( getpwuid($stat[4]) )[0] : "(not available)";
        print "\t$tty (owned by $owner)\n";
    }
}

The exists function can have two meanings: "Is there at least one value for this key?" and "Does this value exist for this key?" Implementing the second sense requires searching the array for the value. The delete function and the first sense of exists are interrelated: if we can guarantee that no anonymous array is ever empty, we can use the built-in exists. We ensure that no anonymous array is ever empty by checking for that situation after deleting an element:

sub multihash_delete {
    my ($hash, $key, $value) = @_;
    my $i;

    return unless ref( $hash->{$key} );
    for ($i = 0; $i < @{ $hash->{$key} }; $i++) {
        if ($hash->{$key}->[$i] eq $value) {
            splice( @{$hash->{$key}}, $i, 1);
            last;
        }
    }

    delete $hash->{$key} unless @{$hash->{$key}};
}
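
The second sense of exists, "Does this value exist for this key?", is described above but not shown. One way it might be sketched is a linear search over the array. The name multihash_exists and the sample data are assumptions made for illustration, not part of the recipe:

```perl
use strict;
use warnings;

# Hypothetical helper: true if $value is stored under $key.
sub multihash_exists {
    my ($hash, $key, $value) = @_;

    # Check the key first so the lookup below cannot autovivify it.
    return 0 unless exists $hash->{$key};

    foreach my $item (@{ $hash->{$key} }) {
        return 1 if $item eq $value;
    }
    return 0;
}

# Made-up sample data in the same shape as %ttys.
my %ttys = ( alice => [ "tty1", "pts/0" ] );

print "alice on pts/0: ", (multihash_exists(\%ttys, "alice", "pts/0") ? "yes" : "no"), "\n";
print "bob on tty1: ",    (multihash_exists(\%ttys, "bob",   "tty1")  ? "yes" : "no"), "\n";
```

Testing the key with the built-in exists before dereferencing keeps the search from creating an empty array for a missing key, which would break the guarantee that multihash_delete maintains.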

An alternative approach to multivalued hashes, implemented as tied normal hashes, is given in Chapter 13, Classes, Objects, and Ties.
