All the Perl that's Practical to Extract and Report

darobin (1316)

http://berjon.com/

Journal of darobin (1316)

Tuesday February 26, 2002
07:55 AM

Simplicity Check

[ #3129 ]

This must be the sixth or seventh time that I hear smart people call SAX complex. In the case of some techniques, I can understand why some would find them complex. But in the case of SAX it totally eludes me.

SAX requires the following knowledge to be used effectively:

  • a minimal working understanding of XML. This is required because when using SAX, one is after all processing XML, or XML-like data. By minimal I mean elements, attributes, and character data. No rocket science here.
  • sufficient understanding of Perl to create about five methods.
  • sufficient understanding of Perl to use hashes.

And that's all. Yes, there is more available in the spec. There are several Handler types that one can use when it is more convenient to dispatch various events to different classes, but you can simply use the default ('Handler') and forget about the others. Yes, there are events that can be used to express the more obscure parts of XML, but either you know what they are and thus how to use them, or you don't and can freely ignore them. I ignore them in 98% of cases. That's why the spec has been split into Basic and Advanced chapters. Most people only read the Basic part, and are happy with that.
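To make the "about five methods and some hashes" claim concrete, here is a minimal sketch of a handler. The class name ElementCounter is made up for illustration, and the events are fed in by hand rather than by a real parser, so the hashes are simplified to just the Name and Data keys.

```perl
use strict;
use warnings;

# Hypothetical handler: counts elements by name and collects character data.
package ElementCounter;

sub new {
    my ($class) = @_;
    return bless { depth => 0, seen => {}, text => '' }, $class;
}

# Each event method receives the handler itself plus one plain hash.
sub start_document { my ($self) = @_; $self->{depth} = 0 }

sub start_element {
    my ($self, $el) = @_;
    $self->{seen}{ $el->{Name} }++;   # a hash of counters, keyed by element name
    $self->{depth}++;
}

sub characters {
    my ($self, $data) = @_;
    $self->{text} .= $data->{Data};
}

sub end_element  { my ($self, $el) = @_; $self->{depth}-- }
sub end_document { my ($self) = @_; return $self->{seen} }

package main;

# Hand-fed events standing in for what a parser would send for
# <doc><item>a</item><item>b</item></doc>
my $h = ElementCounter->new;
$h->start_document({});
$h->start_element({ Name => 'doc' });
for my $chunk ('a', 'b') {
    $h->start_element({ Name => 'item' });
    $h->characters({ Data => $chunk });
    $h->end_element({ Name => 'item' });
}
$h->end_element({ Name => 'doc' });
my $counts = $h->end_document({});

print "$_: $counts->{$_}\n" for sort keys %$counts;
```

With XML::SAX installed, the same object should be usable as the Handler of a parser obtained from XML::SAX::ParserFactory; that part is left out here so the sketch runs with nothing but core Perl.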

And for the very few rough edges (choosing a parser, building long pipelines...) there are helper modules. There is also a good bunch of articles on the subject at http://www.xml.com/pub/q/perlxml that really explain things in simple terms (thanks Kip!).

So is there something that triggers dummy mode in otherwise brilliant people? Is there some magic potion that makes it simple for some of us?

I'm not ranting. I simply feel at a loss.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • It's not that I don't understand SAX. It's more that I don't like it. For the vast majority of cases, SAX provides far more behaviour than is necessary and imposes far more constraints on the programmer than is tasteful.

    You've said the same thing yourself in your post; you describe the subset of SAX that one needs in order to use it. Well, if that's all we need, what on earth is the rest of it doing there? The pipeline initiative is, at one level, about encapsulating the fundamental things about the Pipeli
    • For the vast majority of cases, SAX provides far more behaviour than is necessary and imposes far more constraints on the programmer than is tasteful.

      Can you list examples? I use SAX daily and have yet to feel constrained, tasteless, and over-behavioured.

      you describe the subset of SAX that one needs in order to use it. Well, if that's all we need, what on earth is the rest of it doing there?

      That's all most need. It's the usual 80/20 thing, except in this case it's closer to 95/5. Usin

      --

      -- Robin Berjon [berjon.com]

      • Ahem. I did go a little over the top there didn't I?

        I'm guessing that those of you who worked on implementing SAX might take it a little personally...

        I'm not sure if it is deliberate or not, but some of the evangelism I've seen about SAX seems to imply that it is the answer to every prayer when, on closer inspection, it patently is not. This does tend to make folks a little wary.
        • I'm not sure if it is deliberate or not, but some of the evangelism I've seen about SAX seems to imply that it is the answer to every prayer when, on closer inspection, it patently is not. This does tend to make folks a little wary.

          I suggest you spend a little time to crack the nut that is XML. Ignore the hype and try and see why bracketheads like us see value in focusing on processing the data and not on brute-force programming to solve a problem. It's not so much that SAX is the solution to world h

    • You've said the same thing yourself in your post; you describe the subset of SAX that one needs in order to use it. Well, if that's all we need, what on earth is the rest of it doing there?

      We have a saying in the XML world: «One program's data is another program's metadata»

      The most common use of XML is to encode data, and in those circumstances, the only thing that's needed is access to start/end elements and data. But there are programs out there that create XML, or do more involved pr

      • This, in my estimation, is the worst problem XML has created for itself.

        Indeed. I remember the days when we would have to try to convince people to use XML for problem foo and they weren't seeing the benefits. Nowadays it seems that I more and more recommend that people do not use XML for given problem bar, because they're only using a buzzword and in fact still not seeing the benefits.

        We have now reached a point where the marketing hype and alphabet soup is obscuring the truly good bits, and

        --

        -- Robin Berjon [berjon.com]

  • I have to agree with what barrie [perl.org] said elsewhere. SAX is complicated because XML is complicated, and lots of programmers dislike SAX because it is both XML based and event based, and picks up a good deal of the animus against XML by association. Tree-based APIs are less prejudiced because they're not event based and take the point of view that "XML is basically a data structure." But they still get tainted with some of the hatred against XML.

    Even hard-working XMLheads dislike SAX. The Cocoon project to

  • Can one use SAX without writing a class and methods? I.e., can one just give a set of callbacks, as one can with HTML::Parser v>=3 ?
    • Sure. So long as you call those callbacks MyPackage::start_element, MyPackage::characters, MyPackage::end_element, and then just pass in bless({}, 'MyPackage') as your handler.
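A minimal sketch of that suggestion, assuming nothing beyond core Perl: the three subs live in MyPackage, and an empty blessed hash is enough to turn them into a handler. The @events array and the hand-fed events are illustration only; a real parser would make these calls.

```perl
use strict;
use warnings;

package MyPackage;

my @events;   # file-scoped lexical, so the code in main below can read it too

# Plain subs; the handler object arrives as the first argument and is ignored.
sub start_element { my ($self, $el)   = @_; push @events, "<$el->{Name}>" }
sub characters    { my ($self, $data) = @_; push @events, $data->{Data} }
sub end_element   { my ($self, $el)   = @_; push @events, "</$el->{Name}>" }

package main;

# No constructor, no base class: an empty blessed hash is the handler.
my $handler = bless({}, 'MyPackage');

# Hand-fed events standing in for a parser reading <p>hi</p>
$handler->start_element({ Name => 'p' });
$handler->characters({ Data => 'hi' });
$handler->end_element({ Name => 'p' });

print join('', @events), "\n";
```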
        • Um. Whilst I'm still not entirely convinced about SAX, what's your problem with doing it OO? Allowing both functional and OO styles starts to lead us gently down the road towards the madness that is CGI.pm

          Of course, you could always subclass the basic XML::SAX::Filter so that you pass it callbacks on instantiation...
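The "pass it callbacks on instantiation" idea could look roughly like this. It is a standalone sketch and does not subclass any real SAX module; CallbackHandler and its constructor arguments are invented names.

```perl
use strict;
use warnings;

package CallbackHandler;   # hypothetical class, not part of any SAX distribution

sub new {
    my ($class, %callbacks) = @_;
    return bless { cb => \%callbacks }, $class;
}

# Generate one method per event name. Each method runs the matching
# callback if one was supplied and silently does nothing otherwise.
for my $event (qw(start_document end_document start_element end_element characters)) {
    no strict 'refs';
    *{"CallbackHandler::$event"} = sub {
        my ($self, $data) = @_;
        my $cb = $self->{cb}{$event} or return;
        return $cb->($data);
    };
}

package main;

my @names;
my $cbh = CallbackHandler->new(
    start_element => sub { push @names, $_[0]{Name} },
);

# Hand-fed events; only start_element has a callback, the rest are no-ops.
$cbh->start_document({});
$cbh->start_element({ Name => 'a' });
$cbh->characters({ Data => 'ignored' });
$cbh->end_element({ Name => 'a' });
$cbh->end_document({});

print "saw: @names\n";
```

The object is still an ordinary handler with ordinary methods, so anything that expects the OO style keeps working.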
          • Your analogy to CGI.pm is specious.
            • I don't think it is, at least not completely. If we are to support what can be done with methods using callbacks (notably transparent filtering and the such) then down that path quite clearly lies madness.

              If you want a SAX handler that uses callbacks, then you most definitely can. Why one would want to do that, I really don't know as:

              1. handlers don't need to subclass anything (you only subclass when you must create or propagate events, i.e. in a driver or a filter)
              2. you don't even need t
              --

              -- Robin Berjon [berjon.com]

              • If we are to support what can be done with methods using callbacks, (notably transparent filtering and the such) then down that path quite clearly lies madness.

                What part of it is madness?

                • In SAX you can chain filters to no end, without needing to care about which of them define which event handling methods. If you have Parser -> FilterA -> FilterB -> Handler, and FilterA doesn't wish to deal with, say, the start_element events, FilterB will nevertheless receive them (and so will the Handler, unless of course one of the steps catches them and refuses to propagate them). This scales to any number of Filters set up in a Pipeline, which makes such pipelines much easier to set up and

                  --

                  -- Robin Berjon [berjon.com]
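The pass-through behaviour described above can be sketched in core Perl. PassThrough is a hand-rolled stand-in for the base class a real filter would inherit from (XML::SAX::Base, as far as I know); FilterA, FilterB and Collector are names made up for the example.

```perl
use strict;
use warnings;

# Stand-in base class: forwards every event to the next handler in the chain.
package PassThrough;

sub new {
    my ($class, %args) = @_;
    return bless { Handler => $args{Handler} }, $class;
}

for my $event (qw(start_element end_element characters)) {
    no strict 'refs';
    *{"PassThrough::$event"} = sub {
        my ($self, $data) = @_;
        my $next = $self->{Handler};
        $next->$event($data) if $next && $next->can($event);
    };
}

# FilterA only cares about text; it never mentions start_element.
package FilterA;
our @ISA = ('PassThrough');
sub characters {
    my ($self, $data) = @_;
    $self->SUPER::characters({ %$data, Data => uc $data->{Data} });
}

# FilterB only cares about element starts, yet text still flows through it.
package FilterB;
our @ISA = ('PassThrough');
sub start_element {
    my ($self, $el) = @_;
    $self->SUPER::start_element({ %$el, Name => "x-$el->{Name}" });
}

# Final handler: records whatever reaches the end of the pipeline.
package Collector;
sub new {
    my ($class) = @_;
    return bless { out => [] }, $class;
}
sub start_element { my ($self, $el) = @_; push @{ $self->{out} }, "<$el->{Name}>" }
sub characters    { my ($self, $d)  = @_; push @{ $self->{out} }, $d->{Data} }
sub end_element   { my ($self, $el) = @_; push @{ $self->{out} }, "</$el->{Name}>" }

package main;

# FilterA -> FilterB -> Collector, with hand-fed events in place of a parser.
my $sink  = Collector->new;
my $chain = FilterA->new(Handler => FilterB->new(Handler => $sink));
$chain->start_element({ Name => 'p' });
$chain->characters({ Data => 'hi' });
$chain->end_element({ Name => 'p' });

print join('', @{ $sink->{out} }), "\n";
```

FilterA defines no start_element of its own, but the event still reaches FilterB and the Collector. Neither filter needs to know what the other one handles.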

    • Yes, that's what XML::Filter::Dispatcher [cpan.org] provides. It takes a pattern matching language (an XPath subset) and allows you to map matching events to subroutines or other SAX handlers.