Sunday November 30, 2003
07:52 PM

Just when you thought it was (type) safe

[ #16088 ]

I am writing a little spider application using LWP and all of that good stuff. For my particular application I need to set the Referer header, and along the way I collect the right URLs to put in it.

Since I am using LWP, URLs tend to show up as objects, but when I try to put them back into an HTTP request, things blow up:

use HTTP::Request;
use URI;
 
my $url = URI->new( 'http://www.example.com' );
 
my $request = HTTP::Request->new( GET => "http://www2.example.com" );
$request->referer( $url );

The referer() method comes from HTTP::Headers, and all it does is pass its arguments to the _header() method. Inside _header(), that $url ends up in $val, and then it has to run the gauntlet:

[HTTP::Headers, 1.43, sub _header]
    if (defined($val)) {
        my @new = ($op eq 'PUSH') ? @old : ();
        if (!ref($val)) {
            push(@new, $val);
        } elsif (ref($val) eq 'ARRAY') {
            push(@new, @$val);
        } else {
            Carp::croak("Unexpected field value $val");
        }
        $self->{$lc_field} = @new > 1 ? \@new : $new[0];
    }

The thing in $val is defined, so it makes it into the block. It is a reference, though, and not an ARRAY reference, so it falls through to the else{} and croaks. That check works for most things, because _header() is a generic method, but referer() could be a bit smarter.

[HTTP::Headers, 1.43, referer()]
sub referer           { (shift->_header('Referer',          @_))[0] }

Debugging this was a pain. The URI objects automatically stringify, so printing them just shows the string form, rather than something like "URI=HASH(0xfb748)". My usual debugger, print(), fails to pick this up.
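Here is a minimal sketch of how to see what is really in the variable. It assumes a made-up stand-in class, Local::FakeURI, that overloads stringification the way URI does, so the snippet runs with core modules only. The describe() helper is also my own invention, not part of LWP. The idea is to ask ref() or Scalar::Util::blessed() what the thing is instead of printing it, and to use overload::StrVal() when you want the raw "Class=TYPE(0x...)" form.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    # Hypothetical stand-in for URI: a blessed scalar that stringifies itself.
    package Local::FakeURI;
    use overload '""' => sub { ${ $_[0] } }, fallback => 1;
    sub new { my ( $class, $str ) = @_; return bless \$str, $class }
}

# Hypothetical helper: describe a value without triggering string overloading.
sub describe {
    my ($val) = @_;
    return 'undef'        unless defined $val;
    return 'plain string' unless ref($val);
    return blessed($val)
        ? 'object of class ' . blessed($val)
        : ref($val) . ' reference';
}

my $url = Local::FakeURI->new('http://www.example.com');

print "As a string: $url\n";                            # looks like a plain string
print "Really is:   ", describe($url), "\n";            # reveals the class
print "Raw form:    ", overload::StrVal($url), "\n";    # bypasses the overloading
```

With a real URI object the same three print statements should behave the same way, since describe() only relies on ref() and blessed().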

There are a couple of ways around this, none of them satisfying:

  • Interpolate into new strings for each use, i.e. "$url".
  • Check to see if the $url is a reference, then call the as_string method if it is.
  • Always turn things into strings, losing the ability to call methods.
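The first two workarounds can be rolled into one small helper. This is only a sketch under some assumptions: Local::FakeURI is a made-up stand-in for URI, header_value() is a hypothetical helper of my own, and strict_set() is a simplified imitation of the check quoted above from HTTP::Headers 1.43, not the real method.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

{
    # Hypothetical stand-in for URI, with the same as_string() method name.
    package Local::FakeURI;
    use overload '""' => sub { ${ $_[0] } }, fallback => 1;
    sub new       { my ( $class, $str ) = @_; return bless \$str, $class }
    sub as_string { ${ $_[0] } }
}

# Hypothetical helper: turn whatever was collected into a plain header value.
sub header_value {
    my ($val) = @_;
    return $val unless blessed($val);                     # plain scalar or unblessed ref
    return $val->as_string if $val->can('as_string');     # URI-style objects
    return "$val";                                        # fall back to interpolation
}

# Simplified imitation of the 1.43 check: plain scalars and ARRAY refs only.
sub strict_set {
    my ($val) = @_;
    die "Unexpected field value $val\n"
        if ref($val) && ref($val) ne 'ARRAY';
    return ref($val) ? [@$val] : $val;
}

my $url = Local::FakeURI->new('http://www.example.com');

if ( eval { strict_set($url); 1 } ) { print "Object accepted\n" }
else                                { print "Object rejected: $@" }

print "After header_value(): ", strict_set( header_value($url) ), "\n";
```

In the spider, the call would then look something like $request->referer( header_value($url) ), which keeps $url as an object everywhere else so its methods stay available.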

Oh well, now you know. Do not pull your hair out over this one, because I already did.
