

Journal of Ovid (2709)

Tuesday August 19, 2008
07:32 AM

How Can I Write This Better?

[ #37227 ]

Using XML::LibXML, I'm trying to determine if two nodes are different. I have the following humiliating code, but it works.

sub _node_differs {
    my ($self,$left,$right) = @_;

    # toStringC14N ensures "canonization" and allows '<title/>' to match
    # '<title></title>'
    my ($left_string, $right_string) = map {
        defined($_) ? ( eval { $_->toStringC14N } || $_->toString ) : ''
    } ( $left, $right );

    if ($left_string ne $right_string) {
        return [ $left, $right ];
    }
    return;
}

What's a "proper" way of doing that? I know I'm missing something obvious.
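
The comment in `_node_differs` leans on canonicalization treating '<title/>' and '<title></title>' as the same element. A minimal sketch of that behaviour, assuming XML::LibXML is installed; the `c14n` helper and the sample markup are made up for this illustration and are not part of the module:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: parse a snippet and return the canonical (C14N)
# serialization of its root element.
sub c14n {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml( string => $xml );
    return $doc->documentElement->toStringC14N;
}

# Same element content, two different spellings of the empty title.
my $short = c14n('<book><title/></book>');
my $long  = c14n('<book><title></title></book>');

print $short eq $long ? "same\n" : "different\n";
```

A plain toString comparison of the same two snippets would normally report a difference, which is why the canonical form is tried first.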

  • Apart from the weird 'eval' (toStringC14N is defined in XML::LibXML::Node, so anything that's part of an XML::LibXML document should have it), it doesn't look "humiliating".

    You can form a quotient of all XML (text) representations by putting those that produce the same DOM (internal) representation into the same class; you want to see whether two nodes fall into the same equivalence class. Canonicalization is a way to produce a representative of the class, so comparing two canonicalized representations of DOM subt

    • The eval is there because I kept hitting nodes that do have that method defined, yet calling it gives this error:

      Failed to convert doc to string in doc->toStringC14N at /opt/csw/lib/perl/site_perl/XML/LibXML.pm line 955

      I don't know what's causing that, but the eval protects against it.

      • Looking through the source code of libxml2, it looks like it's caused by nodes that have no textual content: the xmlC14NDocDumpMemory function (used by toStringC14N) does not set its "output buffer" parameter if the buffer would be empty.

        Strange behaviour; I'd call it a bug, or at least a documentation deficiency.

        • Thanks for digging into this. I've included the URL of this discussion in our code base so people have a clue what that weird eval is for.
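
The guard discussed in this thread (try the canonical form, fall back to toString if that dies or comes back empty) can be sketched with core Perl only. `FlakyNode` and `node_string` are hypothetical names invented for this illustration; `FlakyNode` merely mimics a node whose toStringC14N throws and says nothing about how XML::LibXML is implemented:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a node whose canonical serialization fails.
package FlakyNode;
sub new          { my ($class, $xml) = @_; return bless { xml => $xml }, $class }
sub toStringC14N { die "Failed to convert doc to string\n" }
sub toString     { return $_[0]{xml} }

package main;

# Prefer the canonical form; fall back to toString when toStringC14N
# dies or returns an empty (false) value. Undefined nodes become ''.
sub node_string {
    my ($node) = @_;
    return '' unless defined $node;
    return eval { $node->toStringC14N } || $node->toString;
}

my $node = FlakyNode->new('<title/>');
print node_string($node), "\n";            # falls back to toString
print length( node_string(undef) ), "\n";  # undefined node gives ''
```

Because the fallback is keyed on a false value as well as on an exception, a node whose canonical form is legitimately empty also ends up going through toString.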