Stories, comments, journals, and other submissions on use Perl; are Copyright 1998-2006, their respective owners.
XML Regexes (Score:2)
A very good list of truisms. However, there is the subtlest of subtle flaws in this list -- all categorical statements (including this one) are false. For example:
Sometimes you do want to parse XML with regexes, but only in the most controlled of circumstances. Usually this involves munging huge quantities of data that are very rigidly formatted. If you can fully control the structure of XML inputs, and you tend to be reading inputs line-by-line (or bloc
Re:XML Regexes (Score:1)
In the case to which I allude, I assume that the code (I have not seen it) processes the XHTML line-by-line. This is a problem because the XML specification allows newline characters as valid whitespace within tags, and it becomes a big problem because the input comes from arbitrary sources.
Parsing this XHTML without a stack or state machine somewhere is problematic.
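To make the failure mode concrete, here is a minimal sketch (in Python rather than Perl, purely for illustration). The sample document, the `HREF` pattern, and the two helper functions are all hypothetical and are not the code under discussion; they only show how a tag split across lines slips past a line-by-line regex while a real parser still sees it.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input: the second <a> tag has a newline between the
# element name and its attribute, which XML permits as whitespace.
XHTML = """<div>
<a href="one.html">one</a>
<a
  href="two.html">two</a>
</div>"""

# Assumed pattern for a line-oriented scan: the whole start tag
# must sit on a single line for this to match.
HREF = re.compile(r'<a\s+href="([^"]*)"')

def hrefs_by_line(text):
    """Collect href values by applying the regex to one line at a time."""
    found = []
    for line in text.splitlines():
        found.extend(HREF.findall(line))
    return found

def hrefs_by_parser(text):
    """Collect href values using a real XML parser."""
    root = ET.fromstring(text)
    return [a.get("href") for a in root.iter("a")]

regex_hrefs = hrefs_by_line(XHTML)
parser_hrefs = hrefs_by_parser(XHTML)

print(regex_hrefs)   # the split tag is silently missed
print(parser_hrefs)  # both links are found
```

The line-oriented version never raises an error; it just returns fewer results, which is what makes this kind of bug hard to notice with input from arbitrary sources.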
Re:XML Regexes (Score:2)
Yep. No wiggle room on that. That's as hard and fast a rule as don't divide an integer by zero.
Re:XML Regexes (Score:1)
perl -e 'print "Just another Perl ${\(trickster and hacker)},";'
The Sidhekin proves that Sidhe did it!
Re:XML Regexes (Score:2)