All the Perl that's Practical to Extract and Report

  • Must admit to being an idiot in that it took me quite a while to figure out what thingy to click to reply to this.

    We talked about this on IRC and I agreed with you. I went and looked at HTML::Parser, which I think we figured out was a good model for what we want to be able to do with WikiText. It's all in XS, so I ran away. Bad move? Me no spik C.

    Kake

  • by jcavanaugh (1007) on 2003.02.19 20:42 (#17260)
    Have you looked at the source for TWiki? It's regexp-based as well. I realize that regexp-based parsing/expanding is a primitive mechanism and painfully slow at times. However, it's also pretty darn powerful. I'm interested in your thoughts on how to make a better parser/renderer for something like TWiki a reality. --John Cavanaugh
    • Powerful? Sure, in that it can do a lot. No, in that it doesn't help the software understand the structure of the data at all. Regex is a very limited language.
    • The problem with TWiki's parser (and all the other ones) is that they all look something like this:

        $text =~ s/someformatting1/<somehtml1>$1<\/somehtml1>/;
        $text =~ s/someformatting2/<somehtml2>$1<\/somehtml2>/;
        $text =~ s/someformatting3/<somehtml3>$1<\/somehtml3>/;
        $text =~ s/someformatting4/<somehtml4>$1<\/somehtml4>/;
        $text =~ s/someformatting5/<somehtml5>$1<\/somehtml5>/;

      Which is great if you want HTML, but what if y
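
The comment above is cut off, but the contrast this thread keeps returning to (a chain of substitutions that can only emit HTML, versus an event-based parser in the spirit of HTML::Parser) can be sketched roughly as follows. This is only an illustration under stated assumptions: the two-marker markup (`*bold*`, `_italic_`), the `parse_wiki` function, and the handler names are all hypothetical, and none of it is taken from TWiki, Text::WikiFormat, or any other module mentioned here.

```perl
use strict;
use warnings;

# Hypothetical mini-markup: only *bold* and _italic_. Not real TWiki syntax.
my %TAG_FOR = ( '*' => 'bold', '_' => 'italic' );

# Split the text into markers and plain runs, and report each piece
# to the caller as a start, end, or text event.
sub parse_wiki {
    my ($text, $handler) = @_;
    my %open;    # which elements are currently open
    for my $token (split /([*_])/, $text) {
        next if $token eq '';    # split can yield empty fields
        if (my $name = $TAG_FOR{$token}) {
            my $event = $open{$name} ? 'end' : 'start';
            $open{$name} = !$open{$name};
            $handler->{$event}->($name);
        }
        else {
            $handler->{text}->($token);
        }
    }
    return;
}

# Handler set 1: render HTML (no entity escaping, to keep the sketch short).
sub to_html {
    my ($text) = @_;
    my %html = ( bold => 'b', italic => 'i' );
    my $out  = '';
    parse_wiki($text, {
        start => sub { $out .= "<$html{$_[0]}>" },
        end   => sub { $out .= "</$html{$_[0]}>" },
        text  => sub { $out .= $_[0] },
    });
    return $out;
}

# Handler set 2: same parser, but the output is plain text.
sub to_plain {
    my ($text) = @_;
    my $out = '';
    parse_wiki($text, {
        start => sub { },
        end   => sub { },
        text  => sub { $out .= $_[0] },
    });
    return $out;
}

print to_html('some *bold* and _italic_ text'), "\n";
print to_plain('some *bold* and _italic_ text'), "\n";
```

The point of the split is that the knowledge of the markup lives in one place (`parse_wiki`), while each output format is just another set of callbacks. A real wiki syntax would need nesting rules, escaping, and block-level structure that this toy ignores.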

  • Three Reasons (Score:3, Insightful)

    by chromatic (983) on 2003.02.19 21:05 (#17261) Homepage Journal
    • Writing a proper parser is hard
    • We're not as smart as you are
    • A terrible regex implementation that gets the job done pretty well today is a heckofalot better than a beautiful, perfect event-based parser that isn't here yet

    I'm only about halfway kidding.

    • I agree with all your points but the second one ;-)

      Since Text::WikiFormat::SAX doesn't work properly I'm obviously not as smart as you think I am ;-)

      But in all seriousness, this is something I hope to put right, in a similar way to trying to put right the whole XML parser nonsense. That way I can put my code where my rant is, or something like that.
  • I might almost forgive them if they implemented the same regex, but each person implementing a different regex-based "mini-language" is the worst.

    Maybe it's not too late to get gnat to include recipes on using Parse::Yapp and Parse::RecDescent in the next cookbook.