The Perl prototype to shishi managed to match "abc" today. From here on, getting the regular expression engine part of the runtime together is a SMOP; ensuring that the parsing side of it works as well is slightly more tricky. It's baby steps (and I'm going to take a few more days off to speed things up) but I'm still very excited about what's possible.
Selective tainting. "Taint all filehandles, but not environment accesses." Or even, "taint this filehandle, since it's coming in from the network, but not that one, since it's the server config file and I trust it". Difficult to implement in Perl 5 without a major slowdown, I think. You'd have to split taint checking from actual tainting, for starters. But worth thinking about.
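To make the "split taint checking from actual tainting" idea a bit more concrete, here is a minimal pure-Perl sketch. It does not touch Perl's real taint mode at all: the `%taints` policy table, `read_from` and `untainted_value` are all hypothetical names invented for illustration, and values are simply tagged with a flag when they are read and checked later, at the point of use.

```perl
use strict;
use warnings;

# Hypothetical policy table: which sources taint the data read from them.
my %taints = (network => 1, config => 0);

# "Tainting": tag a value with its origin's policy at read time.
# Unknown sources are treated as tainted, to be on the safe side.
sub read_from {
    my ($source, $value) = @_;
    my $tainted = exists $taints{$source} ? $taints{$source} : 1;
    return { value => $value, tainted => $tainted ? 1 : 0 };
}

# "Taint checking": a separate step, run only where data is used unsafely.
sub untainted_value {
    my ($datum) = @_;
    die "Insecure dependency\n" if $datum->{tainted};
    return $datum->{value};
}

my $from_net  = read_from(network => "some request data");
my $from_conf = read_from(config  => "/etc/server.conf");
```

Here `untainted_value($from_conf)` hands back the string, while `untainted_value($from_net)` dies. Doing the same thing inside the interpreter is a different matter, which is where the slowdown worry comes in.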
Not because there were any problems with it, but because Chris Blackwell, the head of the record company, thought it was too painful to release. You've not felt angst until you've heard "Baby, Please Come Home" or "Some People Are Crazy".
Warning: Do not listen to this album alone.
Idea of the day: JIT the bytecode produced by the regexp engine. Shouldn't actually be too hard. Produce a bunch of C functions which perform each node, keep everything in registers, and use C magic to build a buffer stringing the assembly code for each function together.
Tomorrow morning, I'm teaching regular expressions in the OUCS public Perl course; anyone from the University can come along and be taught basic Perl. It's a six week course of three hours a week - to give you a clue about the pace of it, we spent three hours a couple of weeks ago just on arrays. We're talking absolute beginning programmers here. Still, it's good fun.
I spent the day writing the course; I'm pretty much ready, although I haven't written all the notes I need - I don't have anything on substitution, say, but I'm sufficiently prepared I can make it up on the hoof. Fun.
Games::Goban and Apache::AxKit::Language::XSP::ObjectTaglib uploaded!
#!/usr/bin/perl
use strict;
use Mail::Util qw( read_mbox );
use Mail::Internet;
use Mail::Address;
use Time::Duration;
use Date::Manip qw( ParseDate UnixDate );
for (map { Mail::Internet->new($_) } read_mbox("/home/simon/mail/pending")) {
    my $when = $_->get("Date");
    my $who = (Mail::Address->parse($_->get("From")))[0];
    $when = time - UnixDate(ParseDate($when), "%s");
    print "Mail received ",ago($when), " from ",$who->phrase,"\n";
    print "Subject: ", $_->get("Subject"),"\n";
}
Fourteen lines, six Perl modules, to summarise a mailbox. What was I thinking?
Actually, it's reasonably sane on the whole - it uses s-expressions to represent variations, and it's trivial to parse. Indeed, it's what Games::Goban will speak natively.
But I couldn't work out why I was getting weird off-by-one errors in some parts of my display code; indeed, everything outside the upper-left quadrant of the board. Then I worked it out.
This supposedly for-machine-consumption file format uses a pair of characters from 'a' to 't' to represent co-ordinates on a 19x19 board. The more alert of you will notice that 't' is not the 19th letter of the alphabet, but the 20th. That's right - it actually skips over 'i'.
Working around this makes the code messy; I think I should ditch the idea of storing the native co-ordinates, which would have been very nice, and use numerical co-ordinates instead. Ho hum.
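A minimal sketch of the numerical route, assuming the lettering is exactly as described above ('a' to 't' with 'i' left out) and that zero-based co-ordinates are wanted. The helper names `letters_to_xy` and `xy_to_letters` are made up for illustration; they are not part of Games::Goban.

```perl
use strict;
use warnings;

# Assumption: 19 letters, 'a'..'t' with 'i' omitted, as described above.
my @letters = grep { $_ ne 'i' } 'a' .. 't';

# Map each letter to its zero-based position on the board.
my %index_of;
@index_of{@letters} = (0 .. $#letters);

# Turn a two-letter pair into zero-based (x, y) co-ordinates.
sub letters_to_xy {
    my ($pair) = @_;
    die "bad co-ordinate '$pair'\n" unless $pair =~ /\A[a-hj-t]{2}\z/;
    return map { $index_of{$_} } split //, $pair;
}

# Turn zero-based (x, y) co-ordinates back into a two-letter pair.
sub xy_to_letters {
    my ($x, $y) = @_;
    for my $n ($x, $y) {
        die "co-ordinate $n is off the board\n" if $n < 0 || $n > $#letters;
    }
    return join '', @letters[$x, $y];
}
```

With the conversion done once at the edges, the display code only ever sees numbers from 0 to 18, and the skipped letter stops leaking into everything else.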