Journal of rafael (2125)

Thursday July 25, 2002
02:26 AM

I wrote a Perl parser last night

[ #6624 ]
Yes, a Perl parser: something that takes Perl 5 source code and produces an abstract syntax tree. And it works.

Of course, this is based on an Evil Hack. The Evil Hack is to parse the output of perl -c -DTp. (You'll need a perl compiled with -DDEBUGGING for this to work.) These debug flags make perl output a trace of every tokenizer and parser action.
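A minimal sketch of how such a trace might be collected, written in Python for illustration and not taken from the actual implementation. The function names are hypothetical. The only trace prefix it recognizes is the "yydebug:" one quoted in this post; tokenizer trace lines would need their own pattern. It assumes the trace arrives on stderr.

```python
import subprocess

# Prefix quoted in the post; an assumption that all parser trace lines use it.
TRACE_PREFIX = "yydebug:"


def build_command(script_path, perl="perl"):
    """Return the argv for a syntax-check run with tokenizer/parser tracing."""
    return [perl, "-c", "-DTp", script_path]


def parser_trace_lines(stderr_text):
    """Keep only the lines that look like parser trace output."""
    return [
        line.strip()
        for line in stderr_text.splitlines()
        if line.lstrip().startswith(TRACE_PREFIX)
    ]


def collect_trace(script_path, perl="perl"):
    """Run perl and return its parser trace lines.

    Only useful with a perl built with -DDEBUGGING; otherwise the
    debug flags produce no trace and the result is an empty list.
    """
    result = subprocess.run(
        build_command(script_path, perl),
        capture_output=True,
        text=True,
        check=False,
    )
    return parser_trace_lines(result.stderr)
```

A filter like this cannot tell perl's own trace apart from a script that prints a line with the same prefix to stderr.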

So, based on the traces, I can reconstruct the workings of an LALR(1) parser that "shadows" perl's parser (you know: shifts, reduces, and reads of a new token).
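The general technique of replaying shift and reduce actions to rebuild a tree can be sketched as follows. This is an illustration in Python under stated assumptions, not the actual implementation. The event tuples are a simplified, hypothetical form of the trace, and the toy rule table stands in for perl's real grammar, which would have to supply the length of each rule's right-hand side.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


# Hypothetical rule table: rule number -> (left-hand side, right-hand side length).
RULES = {
    1: ("expr", 3),  # expr : expr '+' term
    2: ("expr", 1),  # expr : term
    3: ("term", 1),  # term : NUMBER
}


def build_tree(events):
    """Replay shift/reduce events and return the root of the resulting tree."""
    stack = []
    for event in events:
        kind = event[0]
        if kind == "shift":
            # A shifted token becomes a leaf on the stack.
            stack.append(Node(event[1]))
        elif kind == "reduce":
            # A reduction pops the rule's right-hand side and pushes its parent.
            lhs, length = RULES[event[1]]
            start = len(stack) - length
            if start < 0:
                raise ValueError("reduce with too few nodes on the stack")
            children = stack[start:]
            del stack[start:]
            stack.append(Node(lhs, children))
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    if len(stack) != 1:
        raise ValueError("trace did not reduce to a single root")
    return stack[0]


def to_sexpr(node):
    """Render a tree as a compact S-expression for inspection."""
    if not node.children:
        return node.label
    parts = [node.label] + [to_sexpr(child) for child in node.children]
    return "(" + " ".join(parts) + ")"


# Events a shift-reduce parser would emit for the toy input "1 + 2".
EVENTS = [
    ("shift", "1"), ("reduce", 3), ("reduce", 2),
    ("shift", "+"),
    ("shift", "2"), ("reduce", 3),
    ("reduce", 1),
]
```

Calling to_sexpr(build_tree(EVENTS)) gives "(expr (expr (term 1)) + (term 2))". The shadow parser never needs the parse tables themselves, only the sequence of actions and the shape of each rule.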

Drawbacks :

  • You need -DDEBUGGING.
  • Perl's tokenizer is very clever. It can produce fake (zero-length) tokens or permute some tokens in the input stream. You don't want to know how it tokenizes "abc$def".
  • For the moment I can't always get the part of the input source that's associated with some tokens. (If I find out that some information is missing, I'll patch the debug traces in the core!)
  • If your Perl script outputs something like "yydebug: after reduction, shifting from state 23 to state 79" to stderr during execution of a BEGIN block, this will confuse my parser.

Now I have to design an API for it. Basically I can trigger any callback on shifts, reduces and reads. Those sets of callbacks are conveniently packaged as, well, packages. So I was thinking about something like

  • a Perl::ShadowParser that implements the parser
  • Perl::ShadowParser::* backend plugins that provide the callbacks
  • a little program for your convenience that runs the parser on any script with any callback(s) you've provided :

    perlshadowparser -b backend1 -b backend2=option1,option2 perlscript
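One possible shape for such an API, sketched in Python for illustration only. The API is still undesigned at this point, so every class, method, and function name below is hypothetical. It shows how a -b argument such as backend2=option1,option2 could be split into a name and options, and how a parser could hand shift, reduce, and read events to each backend.

```python
def parse_backend_spec(spec):
    """Split 'name=opt1,opt2' into ('name', ['opt1', 'opt2'])."""
    name, sep, options = spec.partition("=")
    return name, (options.split(",") if sep and options else [])


class Backend:
    """Base class: a backend overrides only the callbacks it cares about."""

    def __init__(self, options=()):
        self.options = list(options)

    def on_read(self, token):
        pass

    def on_shift(self, state):
        pass

    def on_reduce(self, rule):
        pass


class CountReductions(Backend):
    """Example backend that just counts reductions."""

    def __init__(self, options=()):
        super().__init__(options)
        self.count = 0

    def on_reduce(self, rule):
        self.count += 1


class ShadowParser:
    """Forwards each parser event to every registered backend, in order."""

    EVENT_KINDS = ("read", "shift", "reduce")

    def __init__(self, backends):
        self.backends = list(backends)

    def dispatch(self, kind, payload):
        if kind not in self.EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind!r}")
        for backend in self.backends:
            getattr(backend, "on_" + kind)(payload)
```

Default no-op callbacks keep a backend small: it implements only the events it needs, and several backends can run over one parse.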

Lots of tests will be needed, too.

If you have any ideas of something cool to do with it (ideas for backends...), I'm listening.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • ...is to produce IMCC output so you can target Parrot.

    Not sure how well that would work yet, but cool nonetheless!

    • Another obvious backend is to produce Perl 6 code.

      There are many ways to implement a Perl 5 to Perl 6 translator. For example, a B module similar to B::Deparse could do a good job. But it won't work on all Perl 5 sources. (See the BUGS section in the B::Deparse manpage.) Another way is to build a completely standalone Perl 5 parser. Very difficult (it's a task for Damians). A third solution is to put a hook into Perl 5.10's parser. The fourth solution is my Evil Hack. Of course, those solutions are not mut