

  • I'd be curious to have a few more details here. SAX is push-parsing (with the tiny extra that you can get a little context from the driver if it provides it), so I don't see how people could get that wrong ;-) As for pull-parsing, it is true that SAX does nothing for that. It shouldn't be too hard to come up with an API for pull-parsing; the trouble is mostly in agreeing on one. I guess that if someone presents a pull-parser system that is reasonable enough, it'll be adopted.
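
To make the push/pull distinction above concrete, here is a minimal sketch. It uses Python's standard library (`xml.sax` for push, `xml.dom.pulldom` for pull) rather than Perl, purely for illustration; the names `DOC`, `ElementNames`, `push_names` and `pull_names` are made up for this example and do not come from any SAX spec.

```python
import xml.sax
from xml.dom import pulldom

# Hypothetical sample document, used only for this illustration.
DOC = "<doc><item>one</item><item>two</item></doc>"


class ElementNames(xml.sax.ContentHandler):
    """Push style: the parser drives and calls back for each event."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        # Called by the parser; the handler never asks for the next event.
        self.names.append(name)


def push_names(text):
    handler = ElementNames()
    # parseString() runs to completion, firing callbacks along the way.
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def pull_names(text):
    """Pull style: the caller drives, asking for events when it wants them."""
    names = []
    for event, node in pulldom.parseString(text):
        if event == pulldom.START_ELEMENT:
            names.append(node.tagName)
    return names
```

Both functions see the same element names; the difference is only in who holds the loop. In `push_names` the loop lives inside the parser, while in `pull_names` the caller could stop, branch, or hand the event stream to another function between events.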

    I would think that people's


    -- Robin Berjon

    • As for the part that interests me most, how many different interpretations of SAX did you get, and how did they differ?

      A notable point of difference was between people who considered the events to be all that the spec specified, and people who considered the spec to specify events and also parse(), parse_file(), parse_string(), and their behavior. If you read SAX as obliging one to follow the behavior of parse() et al. in current parsers, then you can't easily implement something like HTML::TokeParser.

      Maybe it would be simpler if someone wise in the spirit of SAX would go and read and understand the interfaces to HTML::Parser and HTML::TokeParser, and then implement something SAXful SAXilicious SAXicilious with the same kinds of interfaces.

      One "there, I did it, look how!" is worth a dozen "Well, it's possible... "'s.

      Go! Do it!
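
For what it's worth, the easy (and unsatisfying) way to put a token-style interface on top of SAX events can be sketched as below. This is an assumption-laden illustration in Python's standard library, not Perl and not anyone's actual module: `TokenStream`, `_Collector` and the tuple shapes are hypothetical, and only the method names `get_token` and `unget_token` are borrowed from HTML::TokeParser. The sketch buffers every event before handing out the first token, so it is not a streaming pull parser.

```python
import xml.sax
from collections import deque


class _Collector(xml.sax.ContentHandler):
    """Records SAX events as simple tuples (hypothetical token shapes)."""

    def __init__(self):
        super().__init__()
        self.tokens = deque()

    def startElement(self, name, attrs):
        self.tokens.append(("S", name, dict(attrs.items())))

    def endElement(self, name):
        self.tokens.append(("E", name))

    def characters(self, content):
        # Note: a parser may deliver one text run in several chunks.
        self.tokens.append(("T", content))


class TokenStream:
    """Pull-style wrapper: the caller asks for one token at a time."""

    def __init__(self, text):
        collector = _Collector()
        # The whole parse runs to completion here, before any token is read.
        xml.sax.parseString(text.encode("utf-8"), collector)
        self._tokens = collector.tokens

    def get_token(self):
        """Return the next token, or None when the stream is exhausted."""
        return self._tokens.popleft() if self._tokens else None

    def unget_token(self, token):
        """Push a token back so the next get_token() returns it again."""
        self._tokens.appendleft(token)
```

The buffering is the whole trick, and also the weakness: a streaming version would need the parser to hand control back to the caller between events, which a run-to-completion parse() call does not offer.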

      • Ah, thanks for the clarification. That is an area that I would never have considered grey, but then I have my nose right inside it all the time, and a lot of the talk that define(s|d) SAX isn't archived, as it happened on IRC.

        Here's an attempted short breakdown of the general idea (as clear as I can make it past 4am). The parse calls (parse, parse_file etc.) are part of the SAX spec, and there's a very good reason for this and for why I very much doubt that it'll change. SAX drivers are meant to be total


        -- Robin Berjon