All the Perl that's Practical to Extract and Report

jjohn (22)

jjohn
  (email not shown publicly)
http://taskboy.com/
AOL IM: taskboy3000

Perl hack/Linux buff/OSS junkie.

Journal of jjohn (22)

Thursday March 28, 2002
10:20 PM

XML::Parser::ExpatNB example

[ #3840 ]

Let's all say it together: XML::Parser Sucks! There, that was cleansing.

After much prodding of my XML buddies (hi jmac!) and an evil notion of using goto (thankfully Perl doesn't let you jump into the middle of a function), I came across a seemingly little-used XML::Parser method, parse_start, which returns a new XML::Parser::ExpatNB object (with oh so little documentation) that does EXACTLY WHAT I NEED! I need a parser that parses a stream in increments. Consider how useful this is for dealing with XML messages coming across the network that might be f'ing HUGE! This parsing method will at least give me an opportunity to chunk the data into smaller bits (save for the pathological 45TB between a single pair of <tag>s [even then, there may be options]). Anyway, this is a BEAUTIFUL, LOVERLY THING!!!!

So, here's a very goofy example of how to work with this bad boy. I'll be looking to shove this into Frontier::RPC2 in a most Eee-VEIL way. ;-)

use strict;
use warnings;
use XML::Parser;

my $p = XML::Parser->new(
             Style => 'My::Pkg',
            );

print "Reading from __DATA__\n";
my $data; # A place for my text data

# Don't be fooled: it's an object constructor
my $nb_p = $p->parse_start(data => \$data);

while(my $l = <DATA>){
  chomp($l);
  $nb_p->parse_more($l);
  if(my $s = ${$nb_p->{data}}){
    print "Back at the range, I got $s\n";
  }
}
$nb_p->parse_done; # shut this mother down

package My::Pkg;

sub Init {
  my($expat) = @_;

  print "Hello!\n";
}

sub Start {
  my($expat, $tag, %attrs) = @_;
  ${$expat->{data}} = undef;
  print "Start: $tag\n";
}

sub Char {
  my($expat, $text) = @_;
  ${$expat->{data}} = undef;
  return if $text =~ /^\s*$/;

  # Append rather than assign: expat may deliver one run of text
  # through several Char calls
  $expat->{char_bag} .= $text;
}

sub End {
  my($expat, $tag) = @_;
  print "End: $tag\n";
  ${$expat->{data}} = $expat->{char_bag};

  # clean up
  $expat->{char_bag} = '';
  return;
}

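# __END__ rather than __DATA__ on purpose: in the top-level script it feeds
# main::DATA, while __DATA__ here would create My::Pkg::DATA instead.
__END__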
<?xml version="1.0" ?>
<a>
  <b>
    <c>
         <d>fiddlesticks</d>
    </c>
  </b>
</a>
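
Since the whole point is XML arriving over the network, here is a minimal sketch of the same idea with fixed-size chunks instead of lines. Everything specific in it is an assumption for illustration: the in-memory filehandle stands in for a socket, the <msg>/<item> document and the 16-byte chunk size are made up, and it uses the Handlers option rather than a Style package just to keep it in one piece.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical input: an in-memory filehandle stands in for a socket.
my $xml = '<?xml version="1.0"?><msg><item>one</item><item>two</item></msg>';
open my $fh, '<', \$xml or die "Cannot open in-memory handle: $!";

my @items;     # text of each completed <item>, in document order
my $buf = '';  # text collected since the last Start or End event

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub { $buf = '' },
        # Append: one run of text may arrive in several Char calls,
        # especially when a chunk boundary falls in the middle of it.
        Char  => sub { my ($expat, $text) = @_; $buf .= $text },
        End   => sub {
            my ($expat, $tag) = @_;
            push @items, $buf if $tag eq 'item';
            $buf = '';
        },
    },
);

# parse_start hands back the XML::Parser::ExpatNB object
my $nb = $parser->parse_start;

# Feed the parser 16 bytes at a time; boundaries may land mid-tag.
while (read($fh, my $chunk, 16)) {
    $nb->parse_more($chunk);
}
$nb->parse_done;

print "Got item: $_\n" for @items;
```

Run as-is, this should print one "Got item" line each for "one" and "two". The chunk size is arbitrary; the parser keeps its own state between parse_more calls, so the caller never has to line chunks up with tag boundaries.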

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Oh, and did I mention I hate XML::Parser too? Especially the fact that it comes in 2 incompatible versions?

    Sorry for the formatting, the lameness filter sucks!

    #!/usr/bin/perl -w
    use strict;
    use XML::Twig;

    my $p = XML::Twig->new(
                              twig_handlers =>
                                  { a => sub { my( $p, $a)= @_;