
All the Perl that's Practical to Extract and Report

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • XML::Simple can pretty much do that already. The difference is that in your example you ignore the return value from the parse and the hashref (or 'simple tree') ends up inside the handler object itself. But this is how you'd do it with XML::Simple:

        my $xs = XML::Simple->new(keyattr => {}, ...etc...);
        my $parser = XML::SAX::ParserFactory->parser( Handler => $xs );
        my $foo = $parser->parse_string( $xml );

        print $foo->{ a }; 

    • Woah, great! I guess I had an old version of XML::Simple lying around.
    • I would really like to have access to the 'simple tree' while I am filtering. The reason for this is that I am working with huge XML documents (of varying content), and I would like to be able to extract portions of it as the parse is proceeding. I imagine a Data::Dumper::Dumper call on the blessed object will reveal where the simple tree is being built up, so I can be bad and dig into the object myself. But I'd rather not do that. Any ideas?
      • You might want to look into XML::Filter::Dispatcher, which I think can do what you want (i.e., treat sections of the document as documents in their own right).

        Digging inside XML::Simple.pm wouldn't actually help (and might harm your sanity), since all the handler does is accumulate the events in an XML::Parser Tree-style structure. Only once the whole document has been parsed does it use the 'collapse' method to convert that structure to a simple tree.
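To make the two-stage idea concrete, here is a minimal sketch of what "collapsing" a Tree-style structure can look like. It is an illustration only: `toy_collapse` is a hypothetical helper written for this example, not XML::Simple's actual collapse code, and it ignores nearly all of XML::Simple's options (key folding, forced arrays and so on). The sample tree is hand-written in the layout XML::Parser's Tree style uses.

```perl
use strict;
use warnings;

# XML::Parser "Tree"-style structure for:
#   <doc a="1"><b>2</b><b>3</b><c>x</c></doc>
# Element content is [ \%attrs, tag, content, tag, content, ... ];
# a tag of 0 marks a text node whose content is the string.
my $tree = [
    doc => [
        { a => 1 },
        b => [ {}, 0 => '2' ],
        b => [ {}, 0 => '3' ],
        c => [ {}, 0 => 'x' ],
    ],
];

# Hypothetical, simplified collapse: turn one element's content
# list into a plain string or a hashref.
sub toy_collapse {
    my ($content) = @_;
    my ($attrs, @kids) = @$content;
    my %out  = %$attrs;    # attributes become hash keys
    my $text = '';

    while (@kids) {
        my ($tag, $val) = splice @kids, 0, 2;
        if ($tag eq '0') {    # text node
            $text .= $val;
            next;
        }
        my $child = toy_collapse($val);
        if (!exists $out{$tag}) {
            $out{$tag} = $child;
        }
        elsif (ref $out{$tag} eq 'ARRAY') {
            push @{ $out{$tag} }, $child;
        }
        else {
            # Second occurrence of a tag: promote to an array.
            $out{$tag} = [ $out{$tag}, $child ];
        }
    }

    # An element with only text collapses to that string.
    return $text if !%out && $text =~ /\S/;
    # Text alongside attributes or children goes under 'content'.
    $out{content} = $text if $text =~ /\S/;
    return \%out;
}

# $tree->[0] is the root tag name, $tree->[1] its content list.
my $simple = toy_collapse($tree->[1]);
```

The point of the sketch is that nothing resembling the simple tree exists until the final call: during the parse there is only the nested array structure, which is why peeking into the handler object mid-parse would not give you usable hashes.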

      • A little late... you can do this using XML::Twig: the latest version lets you use the simplify method on any element of the tree. simplify gives you the same structure as XML::Simple. And of course you can use it during the parsing, so you can deal with parts of the tree.

        Example:

        #!/usr/bin/perl -w
        use strict;
        use XML::Twig;
        use YAML;

        XML::Twig->new( twig_roots => { foo => \&foo })
                 ->parse( \*DATA);

        sub foo
          { my( $t, $foo)= @_;
            my $data= $fo

        --
        mirod
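For readers who want a complete script to start from, the following is a minimal sketch of the approach described above: `twig_roots` restricts tree building to `foo` elements, and `simplify` is called on each one as soon as it has been parsed. The sample XML, the `handle_foo` name and the `@records` array are assumptions made up for this illustration; it is not a reconstruction of the truncated example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Sample input, invented for this sketch.
my $xml = <<'XML';
<doc>
  <foo n="1"><title>alpha</title></foo>
  <skipped>not built into a twig</skipped>
  <foo n="2"><title>beta</title></foo>
</doc>
XML

my @records;

# Only <foo> elements are built; the handler runs each time
# one of them has been completely parsed.
XML::Twig->new(
    twig_roots => { foo => \&handle_foo },
)->parse($xml);

sub handle_foo {
    my ($twig, $foo) = @_;

    # simplify returns an XML::Simple-like structure for this element only.
    push @records, $foo->simplify;

    # Release what has been parsed so far, keeping memory use flat
    # on large documents.
    $twig->purge;
}

printf "collected %d foo elements\n", scalar @records;
```

Calling `purge` in the handler is what makes this workable for huge documents: each `foo` is converted and then discarded before the next one is read.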
  • Well, I emailed the people running GIA [opengov.us] about whether they had an API, but they haven't written me back. I'm sure that if they do then a module encapsulating the easiest uses of it can't be far behind, and if there isn't an API, I can't see why there wouldn't be one soon.
    --

    ------------------------------
    You are what you think.
    • Neat. Let me know if you hear anything. In the meantime I'm seriously thinking of writing a nice OO interface to Thomas [loc.gov], with WWW::Mechanize in the background. Interested? I'd like to come up with the API first, and work from there.
      • Thomas looks neat (and very tentacle-y, in the amount of information it provides...), but I don't know doodly about WWW::Mechanize yet.

        I'd definitely be interested to hear about it -- I personally try to be a political agnostic, but it doesn't always work.

        --

        ------------------------------
        You are what you think.