Recipe 3.11. Program: hopdelta

3.11. Program: hopdelta

Have you ever wondered why it took so long for someone's mail to get to you? With postal mail, you can't trace how long each intervening post office let your letter gather dust in their back office. But with electronic mail, you can. The message carries in its header Received: lines showing when each intervening mail transport agent along the way got the message.

The dates in the headers are hard to read. You have to read them backwards, bottom to top. They are written in many varied formats, depending on the whim of each transport agent. Worst of all, each date is written in its own local time zone. It's hard to eyeball "Tue, 26 May 1998 23:57:38 -0400" and "Wed, 27 May 1998 05:04:03 +0100" and realize these two dates are only 6 minutes and 25 seconds apart.

The ParseDate and DateCalc functions in the Date::Manip module from CPAN can help this:

use Date::Manip qw(ParseDate DateCalc);
$d1 = ParseDate("Tue, 26 May 1998 23:57:38 -0400");
$d2 = ParseDate("Wed, 27 May 1998 05:04:03 +0100");
print DateCalc($d1, $d2);
+0:0:0:0:0:6:25

That's a nice format for a program to read, but it's still not what the casual reader wants to see. The hopdelta program, shown in Example 3.1, takes a mailer header and tries to analyze the deltas (difference) between each hop (mail stop). Its output is shown in the local time zone.

Example 3.1: hopdelta

#!/usr/bin/perl
# hopdelta - feed mail header, produce lines
#            showing delay at each hop.
use strict;
use Date::Manip qw (ParseDate UnixDate);

# print header; this should really use format/write due to
# printf complexities
printf "%-20.20s %-20.20s %-20.20s   %s\n", 
        "Sender", "Recipient", "Time", "Delta";

$/ = '';                # paragraph mode
$_ = <>;                # read header
s/\n\s+/ /g;            # join continuation lines

# calculate when and where this started
my($start_from) = /^From.*\@([^\s>]*)/m;
my($start_date) = /^Date:\s+(.*)/m;
my $then = getdate($start_date);
printf "%-20.20s %-20.20s %s\n", 'Start', $start_from, fmtdate($then);

my $prevfrom = $start_from;

# now process the headers lines from the bottom up
for (reverse split(/\n/)) {
    my ($delta, $now, $from, $by, $when);
    next unless /^Received:/;
    s/\bon (.*?) (id.*)/; $1/s;         # qmail header, I think
    unless (($when) = /;\s+(.*)$/) {    # where the date falls
        warn "bad received line: $_";
        next;
    }
    ($from) = /from\s+(\S+)/;           
    ($from) = /\((.*?)\)/ unless $from; # some put it here
    $from =~ s/\)$//;                   # someone was too greedy
    ($by)   = /by\s+(\S+\.\S+)/;        # who sent it on this hop

    # now random mungings to get their string parsable
    for ($when) {
        s/ (for|via) .*$//;
        s/([+-]\d\d\d\d) \(\S+\)/$1/;
        s/id \S+;\s*//;
    }
    next unless $now = getdate($when);          # convert to Epoch
    $delta = $now - $then;

    printf "%-20.20s %-20.20s %s  ", $from, $by, fmtdate($now);
    $prevfrom = $by;
    puttime($delta);
    $then = $now;               
}

exit;

# convert random date strings into Epoch seconds
sub getdate {
    my $string     =  shift;
    $string        =~ s/\s+\(.*\)\s*$//;        # remove nonstd tz
    my $date       =  ParseDate($string);
    my $epoch_secs =  UnixDate($date,"%s");
    return $epoch_secs;
}

# convert Epoch seconds into a particular date string
sub fmtdate {
    my $epoch = shift;
    my($sec,$min,$hour,$mday,$mon,$year) = localtime($epoch);
    return sprintf "%02d:%02d:%02d %04d/%02d/%02d",
        $hour, $min, $sec,
        $year + 1900, $mon + 1, $mday,
}

# take seconds and print in pleasant-to-read format
sub puttime {
    my($seconds) = shift;
    my($days, $hours, $minutes);

    $days    = pull_count($seconds, 24 * 60 * 60);
    $hours   = pull_count($seconds, 60 * 60);
    $minutes = pull_count($seconds, 60);

    put_field('s', $seconds);
    put_field('m', $minutes);
    put_field('h', $hours);
    put_field('d', $days);

    print "\n";
}

# usage: $count = pull_count(seconds, amount)
# remove from seconds the amount quantity, altering caller's version.
# return the integral number of those amounts so removed.
sub pull_count {
    my($answer) = int($_[0] / $_[1]);
    $_[0] -= $answer * $_[1];
    return $answer;
}

# usage: put_field(char, number)
# output number field in 3-place decimal format, with trailing char
# suppress output unless char is 's' for seconds
sub put_field {
    my ($char, $number) = @_;
    printf " %3d%s", $number, $char if $number || $char eq 's';
}

Sender               Recipient            Time                   Delta
Start                wall.org             09:17:12 1998/05/23
wall.org             mail.brainstorm.net  09:20:56 1998/05/23    44s   3m
mail.brainstorm.net  jhereg.perl.com      09:20:58 1998/05/23     2s