
All the Perl that's Practical to Extract and Report


runrig (3385)

runrig
  dougwNO@SPAMcpan.org

Just another perl hacker somewhere near Disneyland

I have this homenode [perlmonks.org] of little consequence on Perl Monks [perlmonks.org] that you probably have no interest in whatsoever.

I also have some modules [cpan.org] on CPAN [cpan.org] some of which are marginally [cpan.org] more [cpan.org] useful [cpan.org] than others.

Journal of runrig (3385)

Friday July 20, 2007
08:48 PM

Comparing XML docs

[ #33854 ]

I've found a workaround for one of yesterday's annoyances, at least the one where a "folder compare" function (not really a file directory) in Informatica PowerCenter doesn't really compare everything in the folders. You can export folders as XML documents, so that's what I did, and then used XML::Diff to compare the documents.

Except that some of the elements in one document are not in the same order as the other document, and although I don't care, XML::Diff does. There are some commercial tools that will compare unordered XML elements, and I ran across one that I guess was free but is no longer.
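To make the ordering problem concrete, here is a minimal sketch of running XML::Diff over two documents that differ only in sibling order. It assumes the compare(-old => ..., -new => ...) interface from the module's synopsis; the FOLDER and SOURCE element names and the inline documents are made up for illustration and are not taken from a real PowerCenter export.

```perl
use strict;
use warnings;
use XML::Diff;

# Two hypothetical exports with the same children in a different order
my $old = '<FOLDER><SOURCE NAME="customers"/><SOURCE NAME="orders"/></FOLDER>';
my $new = '<FOLDER><SOURCE NAME="orders"/><SOURCE NAME="customers"/></FOLDER>';

my $diff     = XML::Diff->new();
my $diffgram = $diff->compare(
    -old => $old,
    -new => $new,
);

# Assumption: the diffgram may come back as an XML::LibXML document or as
# a string depending on the input form, so handle both cases.
my $report = ref $diffgram ? $diffgram->toString(1) : $diffgram;
print $report, "\n";
```

Because the comparison is order-sensitive, a reordering like this would be expected to show up in the diffgram even though nothing of interest changed.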

So it was XML::Filter::Sort to the rescue (thank you grantm - and all the other XML folks), and I just sorted all the elements where I didn't care about the order, and then diffed the results. Several of the elements were just sorted by the name attribute, so I made a bunch of sorters in one go (updated code: needed "./" prefix on NAME):

use XML::Filter::Sort;

# One sorter per element type, each keyed on that element's NAME attribute
my @sorters = (
  map {
    XML::Filter::Sort->new(
      Record => $_,
      Keys => [ ['./@NAME'] ],
    )
  } @list_of_elements
);
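The sorters still have to be wired into a SAX pipeline before anything gets sorted. What follows is one way that might be done, not necessarily how the original script did it: each filter is handed the next stage as its Handler, ending in an XML::SAX::Writer that collects the output in a string. The element names and the inline document are hypothetical stand-ins for a real export.

```perl
use strict;
use warnings;
use XML::Filter::Sort;
use XML::SAX::ParserFactory;
use XML::SAX::Writer;

# Hypothetical element names; substitute the ones whose order is irrelevant
my @list_of_elements = qw(SOURCE TARGET);

my $xml = <<'XML';
<FOLDER>
  <SOURCE NAME="orders"/>
  <SOURCE NAME="customers"/>
</FOLDER>
XML

# The last stage of the pipeline writes the sorted document into a string
my $sorted_xml = '';
my $handler = XML::SAX::Writer->new(Output => \$sorted_xml);

# Build the chain back to front so each sorter feeds the stage after it
for my $element (reverse @list_of_elements) {
    $handler = XML::Filter::Sort->new(
        Record  => $element,
        Keys    => [ ['./@NAME'] ],
        Handler => $handler,
    );
}

my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string($xml);

print $sorted_xml;
```

Running both exports through the same chain and then diffing the two sorted strings leaves only the differences that are not about element order.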

I also wrote my first actual XML::SAX parser for the task of deleting some attributes where I didn't care about differences in values.
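A filter of that kind can be quite small. The sketch below is an assumed reconstruction, not the code actually used: it subclasses XML::SAX::Base and drops named attributes in start_element. The package name AttributeStripper, the Drop option, and the attribute name VERSIONNUMBER are all invented for the example.

```perl
use strict;
use warnings;

{
    package AttributeStripper;    # hypothetical name
    use parent 'XML::SAX::Base';

    sub new {
        my ($class, %args) = @_;
        my $drop = delete $args{Drop} || [];
        my $self = $class->SUPER::new(%args);
        $self->{drop} = { map { $_ => 1 } @$drop };
        return $self;
    }

    sub start_element {
        my ($self, $el) = @_;
        # Work on a copy so the parser's own event data is left alone
        my %attrs = %{ $el->{Attributes} || {} };
        for my $key (keys %attrs) {
            delete $attrs{$key} if $self->{drop}{ $attrs{$key}{LocalName} };
        }
        $self->SUPER::start_element({ %$el, Attributes => \%attrs });
    }
}

package main;
use XML::SAX::ParserFactory;
use XML::SAX::Writer;

my $out    = '';
my $writer = XML::SAX::Writer->new(Output => \$out);
my $filter = AttributeStripper->new(
    Drop    => ['VERSIONNUMBER'],    # hypothetical attribute to ignore
    Handler => $writer,
);

my $parser = XML::SAX::ParserFactory->parser(Handler => $filter);
$parser->parse_string('<FOLDER><SOURCE NAME="orders" VERSIONNUMBER="7"/></FOLDER>');

print $out, "\n";
```

Since it is an ordinary SAX filter, it can sit in the same handler chain as the XML::Filter::Sort objects.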

And some of the attributes had encoded control characters in them e.g. , and those just came out as spaces, and for the purposes of this, I didn't care, but in other situations, I might care, so I'm wondering if there's a way to preserve those. Though I hear from a reliable secondhand source that there is no reliable way to preserve them :-(

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Could you have used XML::SemanticDiff [cpan.org] instead? That seems to do what you want...
    • I looked at XML::SemanticDiff, and XML::Diff seems to suit my purposes better. XML::SemanticDiff tells you that there's a difference and where the difference is, and XML::Diff tells you all that plus what the difference is, though the output is more verbose. And I would still have to sort and filter things to see the actual differences that I want to see.