Just a simple [cpan.org] guy, hacking Perl for fun and profit since way back in the last millenium. You may find me hanging around in the monestary [perlmonks.org].
What am I working on right now? Probably the Sprog project [sourceforge.net].
GnuPG key Fingerprint:
6CA8 2022 5006 70E9 2D66
AE3F 1AF1 A20A 4CC0 0851
Following news that chinese scientists have completed a face transplant, I commented that next time China hosts the Olympics they'll be able to have a six year old singer with a good voice and a pretty face. A colleague immediately came back with "Well they'll claim she's six".
We had the August meeting of Wellington.pm this week. Sam gave us his talk on the work he did converting the Perl source code history to git. It turned out to be a rather larger undertaking than he originally imagined. Sam was preparing this talk for YAPC::Eu but unfortunately the costs of getting there have proved prohibitive so he may just have to recast the talk as a chapter in his memoirs.
Next up, Brenda gave us a quick rundown on opportunities for networking with (open source) geeks in Wellington. In a novel twist, her presentation slides were a flickr photo set.
After Brenda's talk we had a lively discussion on strategies for dealing with an inherited codebase; code formatting styles; Perl best practices; the evils of misused subroutine prototypes; and interesting home grown config file formats.
To round off the evening, I gave a 'behind the scenes' talk on last month's HackOff competition. The focus for people on the night (and after the event) has been on decoding the datafiles for each of the problems. The focus of the talk was on how I went about encoding the data to produce the question files. There is strong interest in doing it again next year which means more work for me - but at least I had fun.
I was feeling virtuous for having finally made the time to address the RT queue for XML::SAX. After resolving 7 tickets I pushed out a new release and within about an hour I'd learned from the CPAN testers that the new release doesn't work with 5.6 (yay for CPAN testers!). The thing that's puzzling me is that I can't see how it ever worked.
Having compiled up a 'fresh' copy of Perl v5.6.2, I'm now able to see what the CPAN Testers are seeing. Here's a minimal test case:
$ perl -le 'q{TEST} =~/^[\x{0041}-\x{005A}]+$/ && print q{OK}'
Invalid [] range "}-\x" before HERE mark in regex m/^[\x{0041}-\x << HERE {005A}]+$/
The same thing works as expected on 5.8. The XML::SAX::PurePerl parser makes extensive use of this syntax (character classes which include ranges of unicode characters) to match productions such as these from the XML spec. It's been that way since 2002 and apparently working (modulo a few bugs). Indeed if I go back to the code in the previous release it passes all the tests - even with regexes that look like that.
I'm going to bed and hoping this makes more sense in the morning.
A client's web site that I support has a simple feedback form which emails the form submission to a number of business users. This form has become very popular with comment spammers despite the fact that nothing submitted via the feedback form ever ends up on the web site.
On Friday I added a simple anti-spam measure and was disappointed to discover that the emails continued to roll in over the weekend. After tracing back through various logs I discovered it wasn't my script at all! When we launched a new site design 3 months ago, I took the opportunity to consolidate a number of CGI scripts into the existing mod_perl application framework. The feedback form was tweaked to point to a new form handler URL. I left the old form handler script in place to facilitate easy rollback and assumed it would do no harm since there were no forms pointing at it. Duh!
So it appears that multiple bots have cached copies of the old form handler URL and the field names it used to expect - despite the fact that the original form disappeared 3 months ago.
Rule number 1 of web security says you can't trust the input data. In particular you can't assume the form that was posted is the one you provided. Unfortunately the comment spam continued to pass all of the old handler's validation rules, so it continued to sail through to email. Of course another key rule of web security is that your web site should not expose any code/functionality that is not essential for the running of the site. I guess I'll have to say mea culpa to that one.
My 4 year old Acer laptop died the other day. It just seems to be the hard drive but I had been thinking about replacing the machine anyway. I don't have the budget for anything flash and I needed something quick(!) so I ended up getting another Acer (Aspire 5920).
The machine came pre-installed with Windows Vista so when I powered it on I was prompted to "complete the installation process". That entailed answering some questions and waiting while updates were downloaded and of course rebooting a couple of times.
I don't actually have any desire to run Windows (except maybe for portability testing) so my next step was to download and install Ubuntu Hardy. Surprisingly, downloading a 700MB ISO, installing Linux and downloading updates took less time than "completing the Windows installation process". We're getting closer to a "just works" experience. Video (with fancy compositing effects) and wireless networking worked without any fuss at all. Audio works too but strangely only through the headphone socket, not through the built-in speakers. As it happens I generally only use the headphones so making the speakers work isn't a big priority.
Before my hard drive died, I had just started putting together an 'analysis' of how the teams in the Wellington.pm HackOff event solved each of the questions. After various setbacks, I hope to return to that task in the next couple of days. After that's done I really hope to have time to look at my embarrassingly long RT queues.
As previously mentioned, Wellington Perl Mongers hosted a 'HackOff' event this month. It was a fun evening with teams of programmers competing to solve problems quickly. The problems used in the live event are now available on the Wellington.pm site.
If you can answer all five questions in under 90 minutes you're doing better than our teams did
Once I've had some sleep I'll take a look at the code that the competitors posted and see if it's worthy of a write up.
The July meeting of Wellington.PM was last night. This month we hosted a 'Hack Off' - a bunch of teams of hackers racing to solve programming problems. The event was definitely a success and lots of people were asking "when can we do it again?"
"Team Cabbage" emerged victorious but "Team Amorphus" were close behind them.
I hope to have some more info up on the web site soon, including: the actual questions (so you can play at home); sample solutions; pictures etc. For now, I have a pretty graph.
The web page for the Wellington.PM 'HackOff' event has been getting quite a lot of traffic. Presumably much of that is people trying to solve the puzzle. Referring URLs in our server log include variations of:
http://www.google.co.nz/search?q=EFBBBF2C756FC9A5CA87CA8E642CC2A073C4B1C2A0C99F
Kids these days!
I've been working full time (and more) for most of the year on a CMS migration project. It seemed to take over my every waking hour and meant my backlog of non-work work has moved from being ridiculously long to insanely long. Anyway, we finally drew a line under it today and called it done. Hooray!
I don't imagine that migrating from one CMS to another is ever much fun. In this case it was certainly an adventure. We were moving from a proprietary CMS called ArticleManager (ArtMan) to Drupal. The version of ArtMan was quite old; was written in Perl (which had been obfuscated to protect their IP); and used a binary on-disk file rather than a database. I didn't have to worry about the Drupal side of things because our company has half a dozen people who specialise in knocking together Drupal sites and one of them would look after that side of things.
I was able to get a fairly high-fidelity export of the data using WWW::Mechanize to walk through ArtMan's article and category edit screens and pulling out the contents of the HTML form elements. I inserted the data into a Postgres database which I was then able to run lots of SQL queries over to try and decode what the various flags and statuses meant.
It was around this time that I discovered all our Drupal people were fully committed so despite my complete ignorance of PHP, the Drupal deployment became my problem too. Fortunately there were good people on hand to answer my many questions - thanks especially to Martyn.
I've worked with a few CMSs and I have yet to meet one that I like. Having said that, Drupal is probably the one I hate the least so far. The big thing that Drupal gets right is that they acknowledge everyone's requirements are different and that for all but the most trivial sites, you will need to customise the behaviour of the CMS. With this in mind Drupal provides an architecture and an API that enables you to add new functionality and change core functionality without changing the core code. The fact that the Drupal developers have achieved this using PHP is nothing short of miraculous. The API is undeniably quirky but you can't go past the fact that it works.
Another great thing about Drupal is the large number of modules that are available to drop into your installation. Some of them will even do stuff that's vaguely similar to stuff you want. I've come to the conclusion that the greatest value of these modules is that they provide sample code you can cut and paste when building your own modules to turn Drupal into exactly the system you want.
Another big win is that even though the core functions and add-on modules can be configured by pointing and clicking, they can also be configured from code. Martyn helped me set up an installation profile script which took me from nothing to fully configured in a little over a minute. Knowing you can burn down and completely recreate your development environment in minutes really helps to build confidence in the product and your ability to deploy it.
So I ended up building two custom modules to support:
Of course all that comes at a cost. Drupal performance sucks. Big time. No doubt a large part of that is all the hard work my custom code is doing and no doubt we could do magic with caching to make it suck less. But it doesn't matter because we were never going to install Drupal on our web servers anyway. We use wget to suck all the pages out into static files; run some fix-ups across it with a Perl script and rsync it up to the production server. For DR we just rsync to two servers instead of one.
At least my next assignment is Perl.