All the Perl that's Practical to Extract and Report

  • You’re defining two identity copies, one on the root element and one for everything else; why? Next, you write a template for //* which is equivalent to matching * (but I haven’t slept and it’s 6AM, so I might be making a mistake). And finally, you say making the D nodes sort together with all the others would be a lot of work, when actually it’s less work than you’re already doing. Overall:

    <xsl:stylesheet
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     

    • Huh, the order of templates is not supposed to matter, but if I switch the order of the templates above, then it doesn't work anymore. If you change the "*" template to "//*", then it works again. Which probably explains why I had "//*" in my first template. Code:

      #!/usr/bin/perl

      use strict;
      use warnings;

      use XML::LibXML;
      use XML::LibXSLT;

      my $parser = XML::LibXML->new();
      my $xslt = XML::LibXSLT->new();

      my $source = $parser->parse_string(<<'EOT');
      <?xml version="1.0" encoding="UTF-8"?>
      <FOO>
      <A NAME="a" FOO="c"/>
      <B NAME="a" FOO="c"/>
      <C NAME="a" FOO="c"/>
      <D NAME="b" FOO="c"/>
      <A NAME="b" FOO="b"/>
      <B NAME="b" FOO="b"/>
      <C NAME="b" FOO="b"/>
      <D NAME="a" FOO="b"/>
      <A NAME="c" FOO="a"/>
      <B NAME="c" FOO="a"/>
      <C NAME="c" FOO="a"/>
      <D NAME="c" FOO="a"/>
      </FOO>
      EOT

      my $style_doc = $parser->parse_string(<<'EOT');
      <?xml version="1.0" encoding="ISO-8859-1"?>

      <xsl:stylesheet
          xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
          version="1.0"
      >

      <xsl:output
          method="xml"
          encoding="iso-8859-1"
          indent="yes"
          omit-xml-declaration="no"
      />

      <xsl:template match="//*">
        <xsl:copy>

          <xsl:apply-templates select="@*"/>

          <xsl:apply-templates select="*">
            <xsl:sort select="name()"/>
            <xsl:sort select="
                self::*[not(self::D)]/@NAME
                | self::D/@FOO
            "/>
            <xsl:sort select="self::E/@FOO"/>
          </xsl:apply-templates>

        </xsl:copy>
      </xsl:template>

      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>

      </xsl:stylesheet>

      EOT

      my $stylesheet = $xslt->parse_stylesheet($style_doc);

      my $results = $stylesheet->transform($source);

      print $stylesheet->output_string($results);
      • Aha...from the docs [w3.org]:

        It is an error if this leaves more than one match. An XSLT processor may signal the error; if it does not signal the error, it must recover by choosing, from amongst the matches that are left, the one that occurs last in the stylesheet.
        • So it would probably be better to put an explicit priority on the "*" template.
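
          A minimal sketch of that, assuming the same two templates as in the script above. Per the XSLT 1.0 conflict resolution rules, "*" and "node()" both get a default priority of -0.5, which is why they tie, whereas "//*" gets 0.5. The value 1 below is an arbitrary choice; anything above -0.5 should break the tie no matter which template comes last.

          ```xml
          <!-- Identity copy for everything; the default priority of node() and @* is -0.5 -->
          <xsl:template match="node()|@*">
            <xsl:copy>
              <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
          </xsl:template>

          <!-- Sorting copy for elements; the explicit priority (1 is an arbitrary
               choice) makes it win over node() regardless of template order -->
          <xsl:template match="*" priority="1">
            <xsl:copy>
              <xsl:apply-templates select="@*"/>
              <xsl:apply-templates select="*">
                <xsl:sort select="name()"/>
                <xsl:sort select="
                    self::*[not(self::D)]/@NAME
                    | self::D/@FOO
                "/>
              </xsl:apply-templates>
            </xsl:copy>
          </xsl:template>
          ```

          With the priority spelled out, the stylesheet no longer relies on the processor recovering from the ambiguity by picking the last match.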
          • Or use something more restrictive than node() in the identity transform to avoid matching elements, eliminating the conflict in the first place. The types a node can have are element, text, comment and processing instruction. Matching any node except elements therefore translates to XPath as “text()|comment()|processing-instruction()”. Another, possibly better way to write that (certainly a shorter one) is “node()[not(self::*)]”. (The predicate “[]” constrains the node()