Idly googling for merdre, what should I see on the first page of results but this. Though, on reflection, maybe it's not so strange after all. Perl is surely one of the more 'pataphysical programming languages.
It took me most of the day to track down the bug and fix it, but it was an interesting journey. What he found is that, if you call want() from (a sub that's called from) within the guard of a loop, it crashes the second time through.
It turns out that this happened because of a subtle design flaw in Want. Perl doesn't really have any proper introspection capabilities, so modules like Want have to be cunning and take advantage of data that's around for other reasons. To decide what context a sub is called in, Want locates the part of the optree where the sub is called, and then trawls it to find the essence of the expression the sub call is in. (For example, foo() + 2 means foo is called in numeric context, whereas foo() && 2 means it's called in boolean context.)
There's no easy way (that I know of) to find the right part of the optree, but there are various bits of information around that give enough of a clue. The activation record for a sub records the last statement that was executed before the sub call, and the address the sub should return to. So I walk the optree, starting at the last statement, until I find the return address; then I know where the sub must have been called from.
The second time through a loop, however, it can happen that the last statement executed is after the return point, so the walk keeps going and going but never finds what it's looking for.
It took me a while to see how to fix it, but in the end I found a way. It so happens that loops, as well as subroutines, leave an activation record on the context stack, so the new code does this: after it's found the activation record for the sub, it keeps looking up the stack to see if there's a loop around the sub call. If there is, the optree walk starts at the beginning of the loop instead. That seems to fix it.
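Here is a toy model of the problem and the fix, just to make the shape of it concrete. It is only a sketch: the real code walks perl's internal op structures in C, whereas here the "optree" is a flat array of made-up op names, and walk_from and find_return_point are hypothetical helpers invented for the illustration. In this model a failed walk simply returns undef rather than crashing.

```perl
use strict;
use warnings;

# Toy model only: ops are array entries and "walking the optree" is a
# forward scan. The op names are illustrative, not perl's real op sequence.
my @ops = qw(nextstate enterloop call_in_guard return_point
             body_stmt unstack leaveloop);

# Scan forward from index $start looking for $target.
# Returns the index where it was found, or undef if we fall off the end.
sub walk_from {
    my ($start, $target) = @_;
    for my $i ($start .. $#ops) {
        return $i if $ops[$i] eq $target;
    }
    return;
}

# Hypothetical helper: choose the starting point the way the fix describes.
# If a loop encloses the sub call, start at the loop; otherwise start at
# the last statement executed.
sub find_return_point {
    my (%frame) = @_;
    my $start = defined $frame{loop_start} ? $frame{loop_start}
                                           : $frame{last_stmt};
    return walk_from($start, 'return_point');
}

# First time through: the last statement executed comes before the call.
my $first_pass = walk_from(0, 'return_point');

# Second time through: the last statement executed is in the loop body,
# which comes after the return point, so the forward scan misses it.
my $second_pass = walk_from(4, 'return_point');

# With the fix: a loop is on the context stack, so start there instead.
my $fixed = find_return_point(last_stmt => 4, loop_start => 1);

for my $r ([first => $first_pass], [second => $second_pass], [fixed => $fixed]) {
    my ($name, $idx) = @$r;
    printf "%-6s %s\n", $name, defined $idx ? "found at $idx" : "not found";
}
```

The point is only that the starting position matters: start too late and the return address is behind you, start at the loop and it is always ahead.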
I'm just waiting for Damian to give the all clear before I release the new version.
sub gron {
    my ($f, $total, $width) = @_;
    my $veet;
    $veet = sub {
        my ($partial, $subtotal, $n) = @_;
        my $rem = $total - $subtotal;
        if ($n+1 == $width) {
            $f->($rem, @$partial);
        }
        else {
            $veet->([$_, @$partial], $subtotal+$_, $n+1) for 0..$rem;
        }
    };
    $veet->([], 0, 0);
}
gron(sub {print($_ ? $_ : " ") for @_; print "\n"}, 3, 27);
I'm fairly sure that the idea of using a recursive closure in Perl has never crossed my mind before. Notice the disguised conses as well.
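To see what gron actually enumerates, it helps to run it with small numbers. The sketch below repeats the sub unchanged and calls it with a total of 2 and a width of 2, so the callback is handed every pair of non-negative integers that sums to 2. The @seen array and the join are additions for the illustration; they are not part of the original.

```perl
use strict;
use warnings;

# Same gron as above.
sub gron {
    my ($f, $total, $width) = @_;
    my $veet;
    $veet = sub {
        my ($partial, $subtotal, $n) = @_;
        my $rem = $total - $subtotal;
        if ($n + 1 == $width) {
            # Last slot: it must take whatever is left of the total.
            $f->($rem, @$partial);
        }
        else {
            # [$_, @$partial] is the "disguised cons": prepend to the list.
            $veet->([$_, @$partial], $subtotal + $_, $n + 1) for 0 .. $rem;
        }
    };
    $veet->([], 0, 0);
}

# Collect each tuple as a string instead of printing it directly.
my @seen;
gron(sub { push @seen, join(' ', @_) }, 2, 2);

print "$_\n" for @seen;
```

This prints the three pairs 2 0, 1 1 and 0 2, one per line. The original call with 3 and 27 does the same thing for 27-tuples summing to 3.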
O'Caml has a source filtering mechanism called camlp4. At the heart of it is an extensible replacement parser, which makes it almost trivial to change or extend the language. One of the examples in the manual adds a new loop construct in six lines of code.
Of course, camlp4 itself is written not in ordinary O'Caml but in the "revised" (formerly "righteous") syntax invented by the author of camlp4.
It's interesting that several of the "big" changes planned for Perl 6 are already features of O'Caml: extensible syntax, currying, stable multithreading.
Oh, and it's (conceivably) faster than C++.
I have started wondering about the feasibility of replacing perl's regex engine with PCRE. The regex engine is supposedly pluggable already, but it looks as though plugging in a completely different regex engine would still be non-trivial. Any thoughts?
If you have an OS X machine try this:
perl -e 'mkdir("foo\xED\xA0\x80bar") or die $!'
I've found the bug in my PCRE patch, which is partly to do with the way * repetitions are handled. But you don't actually need to use iterative repetitions any more, because you can replace iteration with recursion!
£^(<\w+/>|<(\w+)>([^<>]|(?1)|)(?3)</\2>)$£
I'll fix the bug soon...
I've also managed to prove that all context-free languages can indeed be expressed. The proof takes the form of an algorithm for turning a context-free grammar into a regex:
becomes
/(|\(((?1)|x(?2))\)(?1))/
Of course, the interesting part is proving that the algorithm really works. I plan to write it up in more detail soon.
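For anyone who wants to poke at the pattern above without a patched PCRE, perl 5.10 and later understand the same (?1) and (?2) recursion syntax. The sketch below is only an experiment, not part of the proof: the ^ and $ anchors and the list of test strings are my additions. As far as I can tell from reading the pattern, it accepts balanced parentheses where each pair may contain a run of x's followed by more balanced parentheses.

```perl
use strict;
use warnings;
use 5.010;    # (?1)-style recursion needs perl 5.10 or later

# The pattern from above, anchored at both ends (anchors added here).
# Group 1 and group 2 call each other recursively via (?1) and (?2).
my $re = qr/^(|\(((?1)|x(?2))\)(?1))$/;

# Strings chosen for illustration; the first four should be accepted.
my @accept = ('', '()', '(x)', '(xx())()');
my @reject = ('(', 'x', '(()x)');

for my $s (@accept, @reject) {
    printf "%-10s %s\n", "'$s'", ($s =~ $re ? 'matches' : 'no match');
}
```

Each group plays the role of a grammar rule, and a (?N) call plays the role of using that rule on the right-hand side, which is what lets recursion stand in for iteration.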
I've added an interesting extension to the syntax. Would this be a good idea for Perl?