
All the Perl that's Practical to Extract and Report

  • Is there any reason why you could not switch from XML::XPath to XML::LibXML? The code should be portable, except for a few constants that need to be renamed, I think.

    In any case, if you have spotted improvements that can be made to the XPath engine of XML::XPath (as opposed to its DOM implementation), I would be interested to hear about them for XML::XPathEngine [cpan.org], which is a fork of that engine.

    Thanks

    --
    mirod
    • A quick switch to XML::LibXML for Test::XML::XPath doubled the run time of one of our tests. Of course, that was a single test run and there could have been other issues involved.

      The major improvements I saw for XML::XPath were replacing a lot of "shift" statements with list assignments directly from @_. There is at least one wrapper function which should be an alias, and a rather curious-looking AUTOLOAD. Also, some of the C-style for loops could be reworked.

      Other than that, I haven't lo
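
As a rough illustration of the kinds of cleanup mentioned above (list assignment from @_ instead of repeated shift, an alias instead of a forwarding wrapper, and foreach instead of a C-style loop), here is a minimal sketch. The package Example::Node and all of its method names are hypothetical and are not taken from the XML::XPath source.

```perl
use strict;
use warnings;

# Hypothetical package for illustration only; not XML::XPath code.
package Example::Node;

sub new {
    my ($class, %args) = @_;
    return bless { children => $args{children} || [] }, $class;
}

# Before: arguments unpacked one shift at a time.
sub child_at_shift {
    my $self  = shift;
    my $index = shift;
    return $self->{children}[$index];
}

# After: a single list assignment straight from @_.
sub child_at {
    my ($self, $index) = @_;
    return $self->{children}[$index];
}

# Instead of a wrapper sub that only forwards its arguments,
# install an alias in the symbol table.
{
    no warnings 'once';
    *get_child = \&child_at;
}

# foreach over the list instead of for (my $i = 0; $i < @list; $i++).
sub child_names {
    my ($self) = @_;
    my @names;
    foreach my $child (@{ $self->{children} }) {
        push @names, uc $child;
    }
    return @names;
}

package main;

my $node = Example::Node->new(children => [qw(a b c)]);
print $node->get_child(1), "\n";              # prints "b"
print join(',', $node->child_names), "\n";    # prints "A,B,C"
```

Whether any of these changes makes a measurable speed difference would need to be benchmarked; the sketch only shows the shape of the rewrites.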

      • [Sorry about the tardiness of this reply.]

        If XML::LibXML takes a lot longer than XML::XPath, there’s probably something bad going on.

        In this case, I’d guess that the “something bad” is that XML::LibXML (or rather, the underlying libxml2 library) is loading external DTDs over the network. I can parse an 18-megabyte XML instance from a file in a little under a second on my laptop, once I turn off DTD loading.

        Turning off the “run very slowly” flag can be done like this:

        use XML::LibXML;

        my $parser = XML::LibXML->new;
        $parser->load_ext_dtd(0);    # do not fetch external DTDs

        my $doc = $parser->parse_file($xml_filename);

        If you’re thinking that loading external DTDs from the network is an undesirable default, I’m inclined to agree with you.
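
For anyone who wants to rule out network access as well, the same setting can be passed to the constructor together with the no_network parser option. This is a sketch that assumes an XML::LibXML version that accepts parser options in new(); the document and the DTD URL are made up for the example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Both options are set once, when the parser is created.
my $parser = XML::LibXML->new(
    load_ext_dtd => 0,    # do not load the external DTD subset
    no_network   => 1,    # forbid network access while parsing
);

# Made-up document whose DOCTYPE points at a remote DTD.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE root SYSTEM "http://example.invalid/root.dtd">
<root><item>ok</item></root>
XML

# The DTD is never requested, so parsing does not wait on the network.
my $doc = $parser->parse_string($xml);
print $doc->documentElement->nodeName, "\n";
```

Timing the same parse with and without these options is a quick way to check whether DTD loading is what slows a test run down.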

        [Credit where credit’s due: though I have seen and dealt with this issue elsewhere, I was reminded of it just now by this mailing-list posting [marc.info] by Aristotle Pagaltzis.]

        • You are correct that something was wrong. Turns out it was a completely unrelated error. Regrettably, we've discovered that we can't just switch over to XML::LibXML because we have two different namespaces and XML::XPath was letting us ignore this issue (this was in place long before I started). We have about 237 YAML documents driving our acceptance tests and they have multiple xpaths embedded which would presumably have to be changed to deal with this :/

          • In the interest of correctness it should be pointed out that XML::XPath is in violation of the XPath spec here, and that matching namespaced elements in XPath always requires binding a prefix to the namespace.
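
To make the point above concrete, here is a small sketch of how a prefix is bound with XML::LibXML::XPathContext. The namespace URI, the prefix "o", and the sample document are all invented for the example; the prefix used in the XPath does not have to match anything in the document, only the URI does.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Made-up document that uses a default namespace.
my $xml = <<'XML';
<order xmlns="http://example.com/ns/order">
  <item>widget</item>
</order>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Bind the (arbitrary) prefix "o" to the namespace URI.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(o => 'http://example.com/ns/order');

# An unprefixed name only matches elements in no namespace.
my @unprefixed = $xpc->findnodes('//item');
# The prefixed name matches the namespaced element.
my @prefixed   = $xpc->findnodes('//o:item');

printf "unprefixed: %d, prefixed: %d\n",
    scalar @unprefixed, scalar @prefixed;    # unprefixed: 0, prefixed: 1
```

Existing unprefixed XPath expressions would still have to be rewritten with the prefix, which is the migration cost described in the parent comment.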

            • We are aware of this. We're just in a bad position with some legacy code and since it hadn't been addressed previously, we need to figure out the easiest way to move forward.