About an hour of hacking later, I had a program that emitted valid XML. The hard parts were, as always:
In the past, I'd have hacked together the table parsing by hand. But this time I elected to use HTML::TableContentParser, and it made the job a lot easier. The documentation is rather vague about the data structure you get back, but I ran the result through Data::Dumper and quickly figured out what I was working with.
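That Data::Dumper trick looks something like this. A minimal sketch, not the actual scraper: the sample HTML and the cell contents here are made up, and the nesting shown in the comments (tables → rows → cells, with the text under a `data` key) is what HTML::TableContentParser returns as I understand it.

```perl
use strict;
use warnings;
use HTML::TableContentParser;
use Data::Dumper;

# Stand-in for the downloaded page; the real program read it from disk.
my $html = <<'END_HTML';
<table>
  <tr><td>Higher-Order Perl</td><td>Dominus</td></tr>
  <tr><td>Perl Cookbook</td><td>Christiansen</td></tr>
</table>
END_HTML

my $p      = HTML::TableContentParser->new;
my $tables = $p->parse($html);    # arrayref, one element per <table>

# Dumping the structure is the quickest way to see what you got:
# each table is a hash with a 'rows' array, each row has a 'cells'
# array, and each cell's text lives under the 'data' key.
print Dumper($tables);

for my $table (@$tables) {
    for my $row (@{ $table->{rows} }) {
        my @fields = map { $_->{data} } @{ $row->{cells} };
        print join(" | ", @fields), "\n";
    }
}
```

Once the dump shows you the shape, the triple loop at the bottom is all the "table parsing" you need to write.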
I've found that a lot of my screenscraping programs have the same structure. I quickly write the code that fetches the first page and saves it to a file. I look at it to visually confirm that I'm downloading the right page. Then I use Getopt::Std to implement an option that lets me say "don't download the first page, just load it from the local file". This speeds up debugging while I'm figuring out how to parse the HTML. When I was scraping the ORA proposals database last year, I had two or three steps that I could skip if I'd already debugged that part of the code.
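The skeleton of that pattern is short enough to show. This is a sketch, not the ORA scraper itself: the URL and cache filename are hypothetical, and I've used LWP::Simple for the fetch, though any HTTP client would do. The point is the `-l` flag, which skips the download and reads the saved copy instead.

```perl
use strict;
use warnings;
use Getopt::Std;

# Hypothetical names: $URL and $CACHE stand in for whatever page
# and scratch file a real scraper would use.
my $URL   = 'http://example.com/proposals.html';
my $CACHE = 'page1.html';

our %opt;
getopts('l', \%opt);    # -l: load page 1 from the local file, don't fetch

sub get_page {
    if ($opt{l} && -e $CACHE) {
        # Debugging run: reuse the copy we already saved.
        open my $fh, '<', $CACHE or die "open $CACHE: $!";
        local $/;                  # slurp mode: read the whole file
        return <$fh>;
    }
    require LWP::Simple;           # only loaded when we really fetch
    my $html = LWP::Simple::get($URL)
        or die "can't fetch $URL";
    # Save it so the next run can use -l and skip the network.
    open my $fh, '>', $CACHE or die "write $CACHE: $!";
    print $fh $html;
    return $html;
}
```

With two or three cached stages, each one gets its own flag, and a debugging run of the parser takes a fraction of a second instead of hammering the remote server every time.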