Just a simple [cpan.org] guy, hacking Perl for fun and profit since way back in the last millennium. You may find me hanging around in the monastery [perlmonks.org].
What am I working on right now? Probably the Sprog project [sourceforge.net].
GnuPG key Fingerprint:
6CA8 2022 5006 70E9 2D66
AE3F 1AF1 A20A 4CC0 0851
A client's web site that I support has a simple feedback form which emails the form submission to a number of business users. This form has become very popular with comment spammers despite the fact that nothing submitted via the feedback form ever ends up on the web site.
On Friday I added a simple anti-spam measure and was disappointed to discover that the emails continued to roll in over the weekend. After tracing back through various logs I discovered it wasn't my script at all! When we launched a new site design 3 months ago, I took the opportunity to consolidate a number of CGI scripts into the existing mod_perl application framework. The feedback form was tweaked to point to a new form handler URL. I left the old form handler script in place to facilitate easy rollback and assumed it would do no harm since there were no forms pointing at it. Duh!
So it appears that multiple bots have cached copies of the old form handler URL and the field names it used to expect - despite the fact that the original form disappeared 3 months ago.
Rule number 1 of web security says you can't trust the input data. In particular, you can't assume the form that was posted is the one you provided. Unfortunately the comment spam continued to pass all of the old handler's validation rules, so it continued to sail through to email. Of course, another key rule of web security is that your web site should not expose any code/functionality that is not essential for the running of the site. I guess I'll have to say mea culpa to that one.
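With hindsight, the fix is as obvious as the mistake: a retired handler should be replaced with a stub that refuses everything, not left lying around "just in case". A hypothetical stand-in (not the site's actual script) could be as simple as:

    #!/usr/bin/perl
    # Stub to replace a retired CGI form handler: refuse every request
    # instead of quietly processing stale form submissions.
    use strict;
    use warnings;

    print "Status: 410 Gone\r\n",
          "Content-Type: text/plain\r\n",
          "\r\n",
          "This form handler has been retired.\n";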
My 4 year old Acer laptop died the other day. It just seems to be the hard drive but I had been thinking about replacing the machine anyway. I don't have the budget for anything flash and I needed something quick(!) so I ended up getting another Acer (Aspire 5920).
The machine came pre-installed with Windows Vista so when I powered it on I was prompted to "complete the installation process". That entailed answering some questions and waiting while updates were downloaded and of course rebooting a couple of times.
I don't actually have any desire to run Windows (except maybe for portability testing) so my next step was to download and install Ubuntu Hardy. Surprisingly, downloading a 700MB ISO, installing Linux and downloading updates took less time than "completing the Windows installation process". We're getting closer to a "just works" experience. Video (with fancy compositing effects) and wireless networking worked without any fuss at all. Audio works too but strangely only through the headphone socket, not through the built-in speakers. As it happens I generally only use the headphones so making the speakers work isn't a big priority.
Before my hard drive died, I had just started putting together an 'analysis' of how the teams in the Wellington.pm HackOff event solved each of the questions. After various setbacks, I hope to return to that task in the next couple of days. After that's done I really hope to have time to look at my embarrassingly long RT queues.
As previously mentioned, Wellington Perl Mongers hosted a 'HackOff' event this month. It was a fun evening with teams of programmers competing to solve problems quickly. The problems used in the live event are now available on the Wellington.pm site.
If you can answer all five questions in under 90 minutes, you're doing better than our teams did.
Once I've had some sleep I'll take a look at the code that the competitors posted and see if it's worthy of a write up.
The July meeting of Wellington.PM was last night. This month we hosted a 'Hack Off' - a bunch of teams of hackers racing to solve programming problems. The event was definitely a success and lots of people were asking "when can we do it again?"
"Team Cabbage" emerged victorious but "Team Amorphus" were close behind them.
I hope to have some more info up on the web site soon, including: the actual questions (so you can play at home); sample solutions; pictures etc. For now, I have a pretty graph.
The web page for the Wellington.PM 'HackOff' event has been getting quite a lot of traffic. Presumably much of that is people trying to solve the puzzle. Referring URLs in our server log include variations of:
http://www.google.co.nz/search?q=EFBBBF2C756FC9A5CA87CA8E642CC2A073C4B1C2A0C99F
Kids these days!
I've been working full time (and more) for most of the year on a CMS migration project. It seemed to take over my every waking hour and meant my backlog of non-work work has moved from being ridiculously long to insanely long. Anyway, we finally drew a line under it today and called it done. Hooray!
I don't imagine that migrating from one CMS to another is ever much fun. In this case it was certainly an adventure. We were moving from a proprietary CMS called ArticleManager (ArtMan) to Drupal. The version of ArtMan we had was quite old, was written in Perl (which had been obfuscated to protect the vendor's IP), and used a binary on-disk file rather than a database. I didn't have to worry about the Drupal side of things because our company has half a dozen people who specialise in knocking together Drupal sites and one of them would look after that side of things.
I was able to get a fairly high-fidelity export of the data by using WWW::Mechanize to walk through ArtMan's article and category edit screens and pull out the contents of the HTML form elements. I inserted the data into a Postgres database, which I was then able to run lots of SQL queries over to try and decode what the various flags and statuses meant.
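For the curious, the approach looked roughly like this. It's a minimal sketch only: the URLs, credentials, form field names, table layout and ID range are all invented stand-ins, not ArtMan's real ones.

    use strict;
    use warnings;
    use WWW::Mechanize;
    use DBI;

    # Walk the CMS admin edit screens and capture each form's field
    # values into Postgres.
    my $mech = WWW::Mechanize->new(autocheck => 0);
    $mech->get('http://cms.example.com/admin/login');
    $mech->submit_form(with_fields => { user => 'admin', pass => 'secret' });

    my $dbh = DBI->connect('dbi:Pg:dbname=artman_export', '', '',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare(
        'INSERT INTO articles (artman_id, title, body, status)
         VALUES (?, ?, ?, ?)'
    );

    for my $id (1 .. 500) {    # assumed article ID range
        $mech->get("http://cms.example.com/admin/edit_article?id=$id");
        next unless $mech->success;
        my $form = $mech->form_number(1) or next;
        $sth->execute(
            $id,
            $form->value('title'),
            $form->value('body'),
            $form->value('status'),
        );
    }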
It was around this time that I discovered all our Drupal people were fully committed so despite my complete ignorance of PHP, the Drupal deployment became my problem too. Fortunately there were good people on hand to answer my many questions - thanks especially to Martyn.
I've worked with a few CMSs and I have yet to meet one that I like. Having said that, Drupal is probably the one I hate the least so far. The big thing that Drupal gets right is that they acknowledge everyone's requirements are different and that for all but the most trivial sites, you will need to customise the behaviour of the CMS. With this in mind Drupal provides an architecture and an API that enables you to add new functionality and change core functionality without changing the core code. The fact that the Drupal developers have achieved this using PHP is nothing short of miraculous. The API is undeniably quirky but you can't go past the fact that it works.
Another great thing about Drupal is the large number of modules that are available to drop into your installation. Some of them will even do stuff that's vaguely similar to stuff you want. I've come to the conclusion that the greatest value of these modules is that they provide sample code you can cut and paste when building your own modules to turn Drupal into exactly the system you want.
Another big win is that even though the core functions and add-on modules can be configured by pointing and clicking, they can also be configured from code. Martyn helped me set up an installation profile script which took me from nothing to fully configured in a little over a minute. Knowing you can burn down and completely recreate your development environment in minutes really helps to build confidence in the product and your ability to deploy it.
So I ended up building two custom modules to support:
Of course all that comes at a cost. Drupal performance sucks. Big time. No doubt a large part of that is all the hard work my custom code is doing and no doubt we could do magic with caching to make it suck less. But it doesn't matter because we were never going to install Drupal on our web servers anyway. We use wget to suck all the pages out into static files; run some fix-ups across it with a Perl script and rsync it up to the production server. For DR we just rsync to two servers instead of one.
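The pipeline is nothing exotic. A rough sketch of the idea (hostnames, paths and the single fix-up shown are placeholders, not our real setup):

    #!/usr/bin/perl
    # Mirror the internal Drupal site with wget, rewrite internal URLs
    # in the HTML, then rsync the result to production.
    use strict;
    use warnings;
    use File::Find;

    my $stage = '/var/tmp/site-mirror';

    system('wget', '--mirror', '--page-requisites', '--no-host-directories',
           "--directory-prefix=$stage", 'http://drupal.internal/') == 0
        or die "wget failed: $?\n";

    # Collect the mirrored pages and apply fix-ups in place
    my @pages;
    find(sub { push @pages, $File::Find::name if /\.html\z/ }, $stage);

    for my $file (@pages) {
        open my $in, '<', $file or die "$file: $!";
        my $html = do { local $/; <$in> };
        close $in;

        $html =~ s{http://drupal\.internal/}{/}g;   # absolute -> site-relative

        open my $out, '>', $file or die "$file: $!";
        print {$out} $html;
        close $out;
    }

    system('rsync', '-az', '--delete', "$stage/", 'www@prod1:/var/www/html/') == 0
        or die "rsync failed: $?\n";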
At least my next assignment is Perl.
A $cow_orker recently sparked a debate about conventions for naming database objects. Obviously this is a bit of a religious issue for many and we certainly uncovered a variety of opinions. One very basic question which many feel strongly about is the pluralisation of table names. I have a preference for singular but am happy to run with plural if that's the convention in an existing project.
Early in my development career I saw a colleague ridiculed for creating a database table with a pluralised name. His justification was (quite reasonably) "I called it 'widgets' because I want to store multiple widget records in it". The DBA's response was "Of course you want to store multiple records in it. If you didn't have multiple records you'd hardly go to the bother of creating a table, would you?". From this logic it comes down to a simple choice: make every table name plural; or, don't bother. I've standardised on "Don't bother".
The thing I don't get is the vast number of people who subscribe to this inseparable pair of rules: database table names must always be plural, but class names must always be singular.
It seems obvious to me that if you agree with the first statement then using the same logic you should disagree with the second. Apparently other people don't see it the same way.
It seems to me that a 'widget' table defines the characteristics of a widget record and serves as a container for such records. Similarly a 'Widget' class describes the characteristics of a widget object and serves as a template for such objects. I just don't get why so many people see these two issues in black and white as obvious opposites.
Wellington Perl Mongers had their monthly meeting last night. There was a pretty good turnout despite the cold weather outside. It probably helps that the bulk of the attendees don't actually have to go outside to get from work to the venue.
Andy gave us a talk on his foray into Perl Golf. While he had fun and (re)learnt a few things along the way, he concluded not much of it was relevant to writing maintainable code for $work. It's hard to argue with that conclusion.
I was up next with a talk on the exact cover algorithm. This was inspired by an article Eric Maki wrote for the Perl Review a while back. Eric was using it to solve/generate Sudoku while I was using it to solve a different puzzle.
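If you haven't met it before, the idea is worth a sketch. Exact cover asks for a set of rows that covers every column exactly once. Here's a naive backtracking version (Knuth's Algorithm X without the Dancing Links machinery), using Knuth's standard example matrix rather than anything from the talk:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Each row covers some columns; find rows covering every column
    # exactly once.  Expected answer for this data: B, D and F.
    my %rows = (
        A => [1, 4, 7],
        B => [1, 4],
        C => [4, 5, 7],
        D => [3, 5, 6],
        E => [2, 3, 6, 7],
        F => [2, 7],
    );

    my %universe = map { ($_ => 1) } map { @$_ } values %rows;
    my @cover = solve(\%universe, []);
    print @cover ? "Cover: @cover\n" : "No exact cover\n";

    sub solve {
        my ($uncovered, $chosen) = @_;
        return @$chosen unless %$uncovered;       # all columns covered

        # Branch on the lowest-numbered uncovered column.  (Real
        # Algorithm X picks the column with the fewest candidate rows.)
        my ($col) = sort { $a <=> $b } keys %$uncovered;

        for my $row (sort keys %rows) {
            my @cols = @{ $rows{$row} };
            next unless grep { $_ == $col } @cols;       # must cover $col
            next if grep { !$uncovered->{$_} } @cols;    # would double-cover

            my %remaining = %$uncovered;
            delete @remaining{@cols};
            my @found = solve(\%remaining, [ @$chosen, $row ]);
            return @found if @found;
        }
        return;                                   # dead end; backtrack
    }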
Last but not least, Finlay gave us a brief intro to Parse::RecDescent and a particular application of it for parsing postal addresses from $previous_job. I've never actually used PRD in any non-trivial way so this was a good refresher.
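For anyone who hasn't seen Parse::RecDescent, the flavour is something like this toy grammar. It's a deliberately simplistic stand-in; the real address grammar from the talk handled far messier input:

    use strict;
    use warnings;
    use Parse::RecDescent;

    # A toy street-address grammar: rules, regex terminals, quoted
    # literals and an action block that builds the result.
    my $grammar = q{
        address : number street_name street_type
                  { $return = "no=$item[1] name=$item[2] type=$item[3]" }
        number      : /\d+[a-zA-Z]?/
        street_name : /[A-Za-z]+/
        street_type : 'Street' | 'Road' | 'Lane' | 'Terrace'
    };

    my $parser = Parse::RecDescent->new($grammar)
        or die "Bad grammar\n";

    my $result = $parser->address('42 Wallaby Lane');
    print defined $result ? "$result\n" : "No parse\n";
    # prints: no=42 name=Wallaby type=Lane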
Our next meeting will be May 13th, so if you're going to be visiting Wellington next month, let me know.
I am the very model of a database relational,
My updates are atomic and ACIDic and transactional,
My planner aims to optimise your queries scatological,
My indexes will cope with SQL that is pathological,
My data types encompass from mundane to geographical,
My data safety record shows concern that's quite fanatical,
My cost per TPC will beat both DB2 and Oracle,
And yet the plebs persist in writing apps for bloody MySQL!