Matts (1087)

I work for MessageLabs in Toronto, ON, Canada. I write spam filters, MTA software, high performance network software, string matching algorithms, and other cool stuff mostly in Perl and C.

Journal of Matts (1087)

Monday October 29, 2001
09:24 AM

Perl and XML sucks?

[ #1103 ]

A couple of weeks ago Elliotte Rusty Harold, the author of several Java and XML books (including XML in a Nutshell, which I tech reviewed for him) put online Chapter 6 of his "Java and XML" book. On his site Cafe Con Leche, an XML geek site like use.perl, he asked for feedback, particularly with respect to the SAX support in other languages. What he said about Perl was, in my opinion, pure flamebait:

  • Although supporting the "Desperate Perl Hacker" was a goal of the original XML working group, Perl has always lagged other languages quite a bit when it comes to XML. The initial problem was the lack of support for Unicode, a sine qua non for XML. Today modern Perls have decent Unicode support, though you will need version 5.005_52 or later to really handle XML; and I recommend version 5.6 or later. There are several XML parsers available for Perl, though far and away the most popular is Larry Wall and Clark Cooper's XML::Parser. This is a wrapper around James Clark's Expat parser for C. At the time of this writing, support for the SAX API from Perl is still in its infancy and limited to SAX1, though this may be upgraded to SAX2 by the time you read this, and Perl 6 will probably include it as a standard library. Regardless, in my opinion Perl is not as ideal a language for processing XML as you might expect. Perl is very good at pulling out implicit structure in text documents such as tab delimited text files and comma separated values. However, XML documents tend to have very explicit structure that is easily addressed by a language like Java. Consequently the inevitable obfuscation of Perl code seems to me too high a price to pay.

This coverage bothered me a lot, and since I know Elliotte, I took it upon myself to get it put right before it goes to print. I pointed out where he was wrong and, where it was appropriate, accused him of spreading FUD.

So, today I got a reply. He's corrected some stuff, but stands by his FUD:

  • >Every language I can think of has easy ways to access complex data
    >structures. What's written there is pure FUD. Perl is no different to Java
    >in this respect.

    No, it is. Perl's built-in regular expression support is significantly different than what most other languages provide. It makes Perl a much more powerful tool for programs that need to parse text documents, certainly more powerful and convenient than Java for this use case. That's strong enough to outweigh the disadvantages of Perl for these sorts of programs. But if this is not what you're doing (and when processing XML it isn't) then you get all the disadvantages of Perl with none of the counterweighting advantages.

    > "Consequently the inevitable obfuscation of Perl code seems to me too high
    >a price to pay."
    >It's possible to write crap in any language. I imagine you see Vietnamese as
    >obfuscated too (or maybe not, but the point being that Perl is simply
    >foreign to you).

    I can think of no imperative language that lends itself to "writing crap" as easily or that's as hard to read as Perl. I keep hearing claims that Perl can be written cleanly, but every time I look at Perl code that's allegedly clean it's initially incomprehensible. I can't even follow my own Perl code if I've put it aside for a week or so, something that is definitely not true of the code I write in other languages.

I'm not even really sure how to defend this beyond what I've already said. I'd really like some help in penning a reply. Comments open.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Either he cannot read Perl, which is his problem, or he does not read enough Perl. There is plenty of comprehensible Perl code out there. The only significantly obfuscated things about my code, for example, are the occasional idiom and the poor organization. The idioms are just about knowing the language; the poor organization is a problem in any language.

    As to his other problem, he comes from the perspective that Perl has significant inherent disadvantages, and that it is a necessary evil for programs
  • Until the last few opinionated sentences, I don't think it qualifies as flamebait or FUD at all. The Perl XML modules are a mess and a PITA to install and, if Jarkko's grumblings over Unicode and just how far Perl5 is from having real 'standard' Unicode support are anything to go by, then there may be some truth in what he says. I still really haven't figured out what XML is good for either, though :)

    • I think he is missing that perl programmers tend to approach XML in a different manner to Java programmers.

      For example there are far more options in perl for parsing, creating, transforming and handling xml than in java.

      Not to mention perl's ability to munge the data itself outside of validation. Why invoke all the overhead of a tree or parser object simply to output or parse simple xml?

      For example I create Dia XML using Template Toolkit. Parsing info into an object and making the objects methods a

      @JAPH = qw(Hacker Perl Another Just);
      print reverse @JAPH;
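
To illustrate the comment above about writing simple XML without building a tree or parser object, here is a minimal sketch in core Perl. It is not the commenter's code (he mentions Template Toolkit, which is not used here); the helper names `escape_xml` and `records_to_xml` and the sample record are made up for illustration.

```perl
use strict;
use warnings;

# Escape the five characters that are special in XML text and attribute values.
sub escape_xml {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',  '<' => '&lt;', '>' => '&gt;',
        '"' => '&quot;', "'" => '&apos;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: render a list of hash records as a small XML document.
# Element names are taken as given and are NOT validated.
sub records_to_xml {
    my ($root, $item, @records) = @_;
    my $xml = "<$root>\n";
    for my $rec (@records) {
        $xml .= "  <$item>\n";
        for my $field (sort keys %$rec) {
            $xml .= sprintf "    <%s>%s</%s>\n",
                $field, escape_xml($rec->{$field}), $field;
        }
        $xml .= "  </$item>\n";
    }
    return $xml . "</$root>\n";
}

# Made-up sample data, purely for illustration.
print records_to_xml('books', 'book',
    { title => 'Cats & Dogs', author => 'A. N. Other' });
```

This only works for flat, simple output where you control the element names; anything involving namespaces, mixed content, or untrusted names is probably better served by a real XML module.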
    1. What disadvantages?
    2. What inevitable obfuscation?

    That's strong enough to outweigh the disadvantages of Perl for these sorts of programs.

    Stated as an axiom, but what evidence is there? The only thing people ever say is, "I couldn't even read my own code a week later." Try to assemble a bibliography to support the statement that Perl has disadvantages. Eliminating misunderstandings about JAPHs being similar to standard programs and Perl being less efficient than C (see my post under the code profil

    J. David works really hard, has a passion for writing good software, and knows many of the world's best Perl programmers
  • Elliotte is an interesting guy, but I've found him to be a little too opinionated at times.

    Yes, Perl has lagged a bit in terms of standardization of XML Processing modules when compared to Java. But that can be attributed to the nature of Java development vs. Perl development. Java developers tend to spend a lot of time creating one interface (e.g. SAX, DOM, JAXP) and then reimplement it a bunch of times. Technically, this is supposed to be better because the implementations are interchangeable and can