All the Perl that's Practical to Extract and Report

Wednesday January 01, 2003
08:05 PM

iCab log files

[ #9709 ]

The iCab web browser can keep a log of all requests and responses (although not the response message body). I use iCab when I need to look at a web transaction because I cannot figure out what I am missing, which is usually a form element or referer.

The log looks like this snippet (to see it in real life, turn on the log file in the Network>Connection/Log preference setting).

Thread #1 (1/1/03, 1:43):
 
Connecting to www.example.com  Port: 443
GET /script.pl?foo=bar HTTP/1.1
Host: www.example.com
Accept-Language: en
Connection: close
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/xbm, image/png, text/html, text/plain, */*
User-Agent: Mozilla/4.5 (compatible; iCab 2.8.2; Macintosh; U; PPC)
 
Thread #1 (1/1/03, 1:43):
 
Response: 200
Server: Microsoft-IIS/5.0
Date: Wed, 01 Jan 2003 09:42:36 GMT
Connection: close
Content-type: text/html

A logical web transaction can take several request-response cycles, though, especially on image-heavy sites. iCab does not wait for one request to finish before it starts the next, so the requests and responses from different threads are interleaved in the log. To make this easier to read, I created a short script that parses the file and files each request and response under its thread number. I am working on making the script into a module, too.

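#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;
use HTTP::Headers;

# thread number => { request => {...}, response => {...} }
# a later block on the same thread and side replaces the earlier one
my %Log = ();
my $Hash;   # the block currently being built
my $Side;   # 'request' or 'response' for the current block

# file the block in progress under its thread number and side
sub store_record
    {
    $Log{ $Hash->{thread} }{$Side} = $Hash if defined $Hash;
    }

while( <> )
    {
    chomp;

    if( /^Thread\s+#(\d+)\s+\((.*?),\s+(.*?)\):\s*$/ )
        {
        # a new block starts, so save the previous one first
        store_record();

        # assume a response until a "Connecting to" line says otherwise
        $Side = 'response';
        $Hash = {
            thread => $1,
            date   => $2,
            'time' => $3,
            header => HTTP::Headers->new(),
            };
        }
    elsif( not defined $Hash )
        {
        # skip anything that comes before the first "Thread" line
        next;
        }
    elsif( m|^Exception/Error\s+(-?\d+)\s*$| )
        {
        # the error code belongs to the block in progress
        $Hash->{exception} = $1;
        }
    elsif( /^Connecting to (\S+)\s+Port:\s+(\d+)/ )
        {
        $Hash->{host} = $1;
        $Hash->{port} = $2;
        $Side = 'request';
        }
    elsif( /^\s*$/ )
        {
        next;
        }
    elsif( m|^(\S+):\s+(.*)| )
        {
        # "Response: 200" is stored here too, as a header named Response
        $Hash->{header}->push_header( $1, $2 );
        }
    elsif( m=(POST|GET|HEAD)\s+(\S+)\s+HTTP/(\d\.\d)$= )
        {
        $Hash->{method}  = $1;
        $Hash->{url}     = $2;
        $Hash->{version} = $3;
        }
    }

# the last block has no following "Thread" line to trigger the save
store_record();

print Data::Dumper::Dumper( \%Log );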
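
The Data::Dumper output is handy for debugging, but a one-line summary per thread is easier to scan. The following is only a sketch of how the %Log structure might be walked once it is built. The summarize() helper is a hypothetical name that is not part of the script, and the sample data is built by hand from the log snippet above rather than read from a real log file. It assumes each thread holds at most one request and one response.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTTP::Headers;

# Hand-built sample in the shape the parser produces:
# thread number => { request => {...}, response => {...} }
my %Log = (
    1 => {
        request => {
            thread  => 1,
            date    => '1/1/03',
            'time'  => '1:43',
            host    => 'www.example.com',
            port    => 443,
            method  => 'GET',
            url     => '/script.pl?foo=bar',
            version => '1.1',
            header  => HTTP::Headers->new( 'Accept-Language' => 'en' ),
            },
        response => {
            thread => 1,
            date   => '1/1/03',
            'time' => '1:43',
            header => HTTP::Headers->new(
                'Response'     => 200,
                'Content-type' => 'text/html',
                ),
            },
        },
    );

# summarize() is a hypothetical helper, not part of the original script.
# It returns one line per thread, pairing the request with its response.
sub summarize
    {
    my( $log ) = @_;
    my @lines;

    foreach my $thread ( sort { $a <=> $b } keys %$log )
        {
        my $req = $log->{$thread}{request}  || {};
        my $res = $log->{$thread}{response} || {};

        # iCab logs the status as "Response: 200", which the parser
        # stores like any other header
        my $status = $res->{header}
            ? $res->{header}->header( 'Response' )
            : undef;

        push @lines, sprintf "#%d %s %s%s -> %s",
            $thread,
            defined $req->{method} ? $req->{method} : '?',
            defined $req->{host}   ? $req->{host}   : '?',
            defined $req->{url}    ? $req->{url}    : '',
            defined $status        ? $status        : 'no response';
        }

    return @lines;
    }

print "$_\n" for summarize( \%Log );
```

For the sample data this prints "#1 GET www.example.com/script.pl?foo=bar -> 200". A thread with a request but no logged response shows "no response" in place of the status.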