I'm writing a new article and will figure out where to submit it. This one is about lexing without parsing (well, lexing without a grammar, to be accurate). Perl is great at munging text, but sometimes we have complicated text to analyze and no proper grammar for it. If the data is a line-oriented logfile with a very predictable format, no big deal. If the data is irregular, though, and the regular expressions are getting too complicated, lexing it into predictable tokens can make a hard problem easy to manage.
A good example of this is parsing SQL. There's no complete SQL grammar written in Perl, and the snippet I posted showed how to extract column aliases. (SQL::Statement uses SQL::Parser and doesn't handle CASE statements, so it's not a solution. Jeff Zucker welcomes patches, though.)
Disclaimer: I'm not claiming that this technique (which I learned from HOP -- Higher-Order Perl -- I might add) is the best way of solving the "parsing SQL" problem. It's merely an illustration of the technique involved. A more complicated example involves transforming math expressions.
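To make the idea concrete, here's a minimal sketch of what I mean by lexing without a grammar. This is not the snippet I posted; the token regex and the column_aliases() helper are just illustrative names. One alternation breaks the SQL into coarse tokens, and then finding aliases is a trivial scan of the token stream instead of a hairy regex over raw text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Break SQL into coarse tokens: quoted strings, numbers, words,
# and single punctuation characters. No grammar required.
sub lex_sql {
    my $sql = shift;
    my @tokens;
    while (
        $sql =~ /\G\s*( '(?:[^']|'')*'    # single-quoted string
                      | \d+(?:\.\d+)?     # number
                      | \w+               # keyword or identifier
                      | [^\s\w]           # punctuation
                      )/gx
    ) {
        push @tokens, $1;
    }
    return @tokens;
}

# Once we have tokens, "the word after AS" is a simple scan.
sub column_aliases {
    my @tokens = @_;
    my @aliases;
    for my $i ( 0 .. $#tokens - 1 ) {
        push @aliases, $tokens[ $i + 1 ] if uc $tokens[$i] eq 'AS';
    }
    return @aliases;
}

my @tokens = lex_sql(q{SELECT name AS customer, SUM(total) AS owed FROM orders});
print join( ', ', column_aliases(@tokens) ), "\n";    # prints "customer, owed"
```

The point is that the lexer doesn't need to understand SQL; it only needs to split the text into pieces that are easy to reason about.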
X \= 9 / (3 + (4+7) % ModValue) + 2 / (3+7).
I found myself needing to transform expressions like that into Prolog, respecting precedence and allowing parentheses to override it, to produce this:
ne(X, plus(div(9, mod(plus(3, plus(4, 7)), ModValue)), div(2, plus(3, 7)))).
Writing a simple lexer made it very easy to do, though I'll probably beef it up to allow constant folding.
ne(X, plus(div(9, mod(14, ModValue)), div(2, 10))).