
All the Perl that's Practical to Extract and Report


Wednesday February 05, 2003
10:27 AM

Why Perl programmers haven't embraced XML

[ #10412 ]

Because you end up embedding too much information about the structure of your XML data in your program. And of course the XML is subject to change, especially if someone else is generating it.

And Perl programmers hate external dependencies that are stuck in code.

  • Perl and XML (Score:5, Insightful)

    by ziggy (25) on 2003.02.05 10:57 (#16735) Journal
    Your assertion does not accurately summarize my experiences with Perl and XML.

    First, lots of Perl programmers have embraced XML. There was a period of time when the only module for parsing XML was XML::Parser, plus a few half-finished attempts at doing something differently. Today, there are many polished alternatives for processing XML, including the interchangeable PerlSAX framework, which mimics SAX in Java. In fact, some ideas crop up first in Perl (or rather in Barrie Slaymaker's head) before they are proven and reimplemented elsewhere.

    Second, there's the burning question: what problem is XML trying to solve? One thing that XML has done is replace a bazillion and one one-off file formats with a single easy-to-parse framework for creating new formats. Perl handily munged one-off file formats before XML (especially text-based ones), so Perl programmers have been and still are less inclined to whip up some random XML to solve a problem.

    Third, I said a few years ago that the areas where XML is being heavily adopted are also the areas where Perl is not heavily used. That has changed somewhat since 1999 or so, but it is still largely true. It's a pain to screen scrape an HTML page with Perl, but it's more of a pain to do it in Java. That's one reason why there's more of a need to adopt XML-RPC and SOAP with Java (where it's easier to generate the stubs and descriptions) than in Perl (where WSDL is more difficult to generate).

    Fourth, Perl is less hyped than other languages and environments. People who migrate to Perl are generally less inclined to use something because it is fashionable, and tend to actively choose something because it works and solves a burning problem. A lot of XML vocabularies are really, really bad on so many levels. It's obvious to an XML adept that many vocabularies (like Mac OS X's Property List format) are in XML only to be fashionable; little to no thought was put into how the vocabulary would be actually used. I find myself still coming up with simple one-off text formats in Perl because they work better than some random poorly-designed XML vocabulary du jour.

    • Re:Perl and XML (Score:4, Interesting)

      by barries (2159) on 2003.02.05 12:30 (#16739)
      It's a pain to screen scrape an HTML page with Perl, but it's more of a pain to do it in Java.

      Matt Sergeant [perl.org], AxKit [axkit.org]'s father, cooked up a neat approach to this: use libxml2 (via XML::LibXML [cpan.org]) to parse the HTML in html and recover modes, then apply normal XML tools to it. I've not tried it, but I'd like for you to be able to do that and use XML::Filter::Dispatcher [cpan.org] to pluck out strings from the resulting XML stream using rules like:

          'string( foo/p )' => sub { print "foo/p contains '", xvalue, "'\n" },

      Anyone that wants to try this, I'll help; it's a neat use case.
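
      A minimal sketch of the parse-then-query idea, for anyone who wants a starting point. It assumes a reasonably recent XML::LibXML (one that provides load_html), and it uses XML::LibXML's own XPath methods rather than XML::Filter::Dispatcher. The sample HTML, the div id "foo", and the helper name scrape_value are all made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical, deliberately sloppy HTML: the <p> is never closed.
my $html = '<html><body><div id="foo"><p>Hello, <b>world</b></div></body></html>';

# Parse HTML leniently with libxml2, then evaluate an XPath expression
# against the resulting tree and return its string value.
sub scrape_value {
    my ($source, $xpath) = @_;

    my $doc = XML::LibXML->load_html(
        string          => $source,
        recover         => 1,    # keep going past malformed markup
        suppress_errors => 1,    # don't print parser complaints
    );

    return $doc->findvalue($xpath);
}

# The XPath lives in one string, so it could just as easily be read
# from a config file instead of being baked into the program.
my $rule = 'normalize-space(//div[@id="foo"]/p)';
print "foo/p contains '", scrape_value($html, $rule), "'\n";
```

      Keeping the expression in a plain string means the knowledge of the page's structure stays in one place; only that string has to change when the markup does.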

      I agree wholeheartedly that XML is being badly applied to many things (as in your bad grammars comment), and that it's also being misapplied to things where there are more appropriate technologies. I'm no fan of BXXP/BEEP or SOAP, for instance. (I may yet change my mind on BEEP, if the toolset supporting it makes it less impenetrable).

      - Barrie

      P.S. <blush/>. In reality, most of the ideas that crop up in my head have been disproven loooong ago. I rediscover the obvious, daily. It's like having intellectual Alzheimer's; I meet the same concepts anew each day.

      P.P.S. Anyone interested in Perl+XML should definitely check out Kip Hampton's Perl and XML articles on xml.com [xml.com]. They range from the sublime [xml.com] to the sophisticated [xml.com].

      • Matt Sergeant, AxKit's father, cooked up a neat approach to this: use libxml2 (via XML::LibXML) to parse the HTML in html and recover modes, then apply normal XML tools to it.

        Matt's mentioned this on more than one occasion. I always thought that libxslt/xsltproc was "broken" in its support for parsing HTML. I don't know how I came to that conclusion, but it must have been based on an early release of libxslt.

        Anyway, later that day, on Matt's urging, I wrote a quick little XSLT stylesheet to grep ou

    • I agree with your points in general. Maybe it's just me (chorus: IT'S JUST YOU!) but I find XML painful to work with, generally. Most of the XML "solutions" don't feel, well, Perl-y to me.

      I have come up with something that does - it's reasonably fast, rapid to code, and it works, plus it keeps most of the dependencies on what the XML is like outside the program itself: Text::Template to generate XML I need to give someone else, and XML::XPath to extract data from XML they've given to me. I was thinking abo