
grantm
http://www.mclean.net.nz/

Just a simple [cpan.org] guy, hacking Perl for fun and profit since way back in the last millennium. You may find me hanging around in the monastery [perlmonks.org].

What am I working on right now? Probably the Sprog project [sourceforge.net].

GnuPG key Fingerprint:
6CA8 2022 5006 70E9 2D66
AE3F 1AF1 A20A 4CC0 0851

Journal of grantm (164)

Monday July 28, 2008
03:17 AM

Crazy comment spammers

A client's web site that I support has a simple feedback form which emails the form submission to a number of business users. This form has become very popular with comment spammers despite the fact that nothing submitted via the feedback form ever ends up on the web site.

On Friday I added a simple anti-spam measure and was disappointed to discover that the emails continued to roll in over the weekend. After tracing back through various logs I discovered it wasn't my script at all! When we launched a new site design 3 months ago, I took the opportunity to consolidate a number of CGI scripts into the existing mod_perl application framework. The feedback form was tweaked to point to a new form handler URL. I left the old form handler script in place to facilitate easy rollback and assumed it would do no harm since there were no forms pointing at it. Duh!

So it appears that multiple bots have cached copies of the old form handler URL and the field names it used to expect - despite the fact that the original form disappeared 3 months ago.

Rule number 1 of web security says you can't trust the input data. In particular, you can't assume the form that was posted is the one you provided. Unfortunately the comment spam passed all of the old handler's validation rules, so it sailed straight through to email. Of course another key rule of web security is that your web site should not expose any code/functionality that is not essential for the running of the site. I guess I'll have to say mea culpa to that one.
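
For what it's worth, a typical "simple anti-spam measure" for a form like this is a hidden honeypot field that humans never see but bots happily fill in. A minimal sketch, assuming a plain CGI handler and a made-up field name:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI;

    my $q = CGI->new;

    # The real form includes a 'website' field hidden with CSS.  Humans never
    # fill it in; bots replaying a cached field list usually do.
    if (defined $q->param('website') and $q->param('website') ne '') {
        print $q->header(-status => '403 Forbidden');
        exit;
    }

    # ... normal validation and mail sending would follow here ...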

Saturday July 26, 2008
05:59 PM

Hardware fun

My 4 year old Acer laptop died the other day. It just seems to be the hard drive but I had been thinking about replacing the machine anyway. I don't have the budget for anything flash and I needed something quick(!) so I ended up getting another Acer (Aspire 5920).

The machine came pre-installed with Windows Vista so when I powered it on I was prompted to "complete the installation process". That entailed answering some questions and waiting while updates were downloaded and of course rebooting a couple of times.

I don't actually have any desire to run Windows (except maybe for portability testing) so my next step was to download and install Ubuntu Hardy. Surprisingly, downloading a 700MB ISO, installing Linux and downloading updates took less time than "completing the Windows installation process". We're getting closer to a "just works" experience. Video (with fancy compositing effects) and wireless networking worked without any fuss at all. Audio works too but strangely only through the headphone socket, not through the built-in speakers. As it happens I generally only use the headphones so making the speakers work isn't a big priority.

Before my hard drive died, I had just started putting together an 'analysis' of how the teams in the Wellington.pm HackOff event solved each of the questions. After various setbacks, I hope to return to that task in the next couple of days. After that's done I really hope to have time to look at my embarrassingly long RT queues.

Wednesday July 16, 2008
06:13 AM

HackOff questions now online

As previously mentioned, Wellington Perl Mongers hosted a 'HackOff' event this month. It was a fun evening with teams of programmers competing to solve problems quickly. The problems used in the live event are now available on the Wellington.pm site.

If you can answer all five questions in under 90 minutes you're doing better than our teams did :-)

Once I've had some sleep I'll take a look at the code that the competitors posted and see if it's worthy of a write up.

Tuesday July 15, 2008
07:49 PM

Wellington.PM Hackoff all over

The July meeting of Wellington.PM was last night. This month we hosted a 'Hack Off' - a bunch of teams of hackers racing to solve programming problems. The event was definitely a success and lots of people were asking "when can we do it again?"

"Team Cabbage" emerged victorious but "Team Amorphus" were close behind them.

I hope to have some more info up on the web site soon, including the actual questions (so you can play at home), sample solutions, pictures, etc. For now, I have a pretty graph.

Thursday June 26, 2008
05:54 PM

Go the google generation!

The web page for the Wellington.PM 'HackOff' event has been getting quite a lot of traffic. Presumably much of that is people trying to solve the puzzle. Referring URLs in our server log include variations of:

http://www.google.co.nz/search?q=EFBBBF2C756FC9A5CA87CA8E642CC2A073C4B1C2A0C99FC99F6FC+A9EC994C990C9A5C2A0C9AF64CB99756FCA87C68375C4B16C6CC7+9DCA8DC2A0C79DC9A5CA87C2A0C9B96FC99FC2A070C9B96FCA8D7+373C99064C2A0CA87C79DC9B9C994C79D73C2A0C79DC9A5CA870A&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-GB:official&client=firefox-a

Kids these days!
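
The q= parameter looks like a string of hex-encoded UTF-8 bytes. Assuming that's all it is, turning a dump like that back into text takes only a few lines of Perl (the $hex here is just the first few bytes of the real query):

    use strict;
    use warnings;
    use Encode qw(decode);

    binmode(STDOUT, ':encoding(UTF-8)');

    my $hex = 'EFBBBF2C756FC9A5CA87';         # shortened for the example

    (my $clean = $hex) =~ s/[^0-9A-Fa-f]//g;  # keep only hex digits
    my $bytes = pack 'H*', $clean;            # hex digits -> raw bytes
    my $text  = decode('UTF-8', $bytes);      # raw bytes  -> characters
    print "$text\n";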

Friday May 23, 2008
10:59 PM

Why I'm Passionate About Perl

The person who introduced me to Perl showed me that...
I ended up introducing myself to Perl. I was a sysadmin doing a lot of shell scripting and feeling like there must be a better way. In the past I'd done a fair bit of C, and before that Pascal and BASIC. None of these were really applicable and I kept reading about Perl in Usenet postings so I decided it was worth a closer look. I was too cheap to pay for a book so I just read the man pages. The fact that I was able to teach myself the language primarily from the perldoc is a testament to the accessibility of Perl.
I first started using Perl to...
My first significant Perl project was automating the installation of Solaris (1.0) on Sun workstations. My scripts managed the network boot process and made installation decisions (disk partition sizes, package selections etc) based on what hardware was found and what roles the workstation had been associated with. The system allowed us to do a 'bare metal' install by setting some flags on the server then initiating a reboot over the network - all finished and ready for the user to log in, in under 15 minutes.
I kept using Perl because...
With my background in shell and sysadmin work, Perl was a really good fit for the way I thought about problems. Later, when I moved into developing web-based applications, Perl was again a natural fit. Compared to my earlier C experience, not having to bother with compiles, makefiles and low-level memory management meant Perl was a huge productivity boost, and it also put the fun back into programming for me. The Perl culture also tends to favour pragmatic and practical solutions rather than complex ivory tower frameworks.
I can't stop thinking about Perl...
I wouldn't say that I dream in Perl, but I do use it every day both for work and for fun. I also coordinate the local Perl Mongers group, maintain some CPAN modules and try to write the odd article now and then. So Perl is regularly in my thoughts.
I'm still using Perl because...
There are a lot of really smart people using Perl and contributing to the Perl community. I have learnt a lot from them that has benefitted me professionally and personally. The fact that I am comfortable standing up and presenting to a roomful of people is a direct result of my involvement with the Perl community. Test driven development is something that pervades the Perl community and has been of immense benefit to me. Not only is it easy to write code in Perl, it's super easy to write tests too (see the short example at the end of this post).
I get other people to use Perl by...
I have fun with Perl. Sometimes fun can be contagious. The Perl Mongers groups are an excellent way for people to build their Perl skills and develop professional networking skills. I'm happy to share Perl solutions when people are trying to solve a problem, but I'm not trying to force Perl on anyone and I'm always interested in learning better ways to do something.
I also program in ... and ..., but I like Perl better since...
I mentioned that I used to program in C. I've had no need to do that for many years since Perl has met my requirements for everything from simple glue scripts to full-blown web apps and complex GUI applications. I've tried Java a few times but it seems to change so much in every major release that I find my reference books are always out of date. I also haven't been able to find the same sort of supportive community for Java as I found with Perl. And perhaps most importantly, I just don't find Java fun. Ruby on the other hand is definitely fun. It has all the best bits of Perl with a cleaner syntax and object model. There are a large number of Ruby libraries available but I have been repeatedly disappointed by encountering incomplete and abandoned solutions. The fact that Perl's CPAN modules generally integrate well with Linux packaging systems (unlike Ruby's Gem packages) is also a win for me. Perl's TAP-based testing tools are also so much simpler to use than the xUnit style favoured by the strictly OO languages. More recently I have dabbled with PHP but been frustrated by the limited syntax (eg: regexes are too hard to use), poor modularity and the insane hasharray thingies. I have also done a fair bit of work with Tcl, which was less painful than Java, but ultimately Perl is a better fit for the way I think.
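
Since TAP-based testing gets a mention above, here's how little ceremony a Perl test script needs (My::Widget and its methods are made up for the example):

    use strict;
    use warnings;
    use Test::More tests => 3;

    # My::Widget is a made-up module standing in for whatever you're testing.
    use_ok('My::Widget');

    my $w = My::Widget->new(name => 'sprocket');
    is($w->name, 'sprocket', 'constructor stores the name');
    ok($w->can('render'),    'widgets know how to render themselves');
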
Thursday May 22, 2008
04:28 AM

Coming up for air

I've been working full time (and more) for most of the year on a CMS migration project. It seemed to take over my every waking hour and meant my backlog of non-work work has moved from being ridiculously long to insanely long. Anyway, we finally drew a line under it today and called it done. Hooray!

I don't imagine that migrating from one CMS to another is ever much fun. In this case it was certainly an adventure. We were moving from a proprietary CMS called ArticleManager (ArtMan) to Drupal. The version of ArtMan was quite old; it was written in Perl (which had been obfuscated to protect their IP) and used a binary on-disk file rather than a database. I didn't have to worry about the Drupal side of things because our company has half a dozen people who specialise in knocking together Drupal sites and one of them would look after that.

I was able to get a fairly high-fidelity export of the data by using WWW::Mechanize to walk through ArtMan's article and category edit screens and pull out the contents of the HTML form elements. I inserted the data into a Postgres database, which I was then able to run lots of SQL queries over to try and decode what the various flags and statuses meant.
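
A rough sketch of that scrape-and-load approach - the URLs, login details, form and field names and the table layout are all invented for illustration:

    use strict;
    use warnings;
    use WWW::Mechanize;
    use DBI;

    my $mech = WWW::Mechanize->new();
    my $dbh  = DBI->connect('dbi:Pg:dbname=artman_export', '', '',
                            { RaiseError => 1, AutoCommit => 1 });
    my $sth  = $dbh->prepare(
        'INSERT INTO article (artman_id, title, body, status) VALUES (?, ?, ?, ?)'
    );

    $mech->get('http://cms.example.com/artman/login');
    $mech->submit_form(with_fields => { user => 'admin', pass => 'secret' });

    for my $id (1 .. 500) {
        $mech->get("http://cms.example.com/artman/article/edit?id=$id");
        my $form = $mech->form_name('article_form') or next;
        $sth->execute(
            $id,
            $form->value('title'),
            $form->value('body'),
            $form->value('status'),
        );
    }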

It was around this time that I discovered all our Drupal people were fully committed so despite my complete ignorance of PHP, the Drupal deployment became my problem too. Fortunately there were good people on hand to answer my many questions - thanks especially to Martyn.

I've worked with a few CMSs and I have yet to meet one that I like. Having said that, Drupal is probably the one I hate the least so far. The big thing that Drupal gets right is that they acknowledge everyone's requirements are different and that for all but the most trivial sites, you will need to customise the behaviour of the CMS. With this in mind Drupal provides an architecture and an API that enables you to add new functionality and change core functionality without changing the core code. The fact that the Drupal developers have achieved this using PHP is nothing short of miraculous. The API is undeniably quirky but you can't go past the fact that it works.

Another great thing about Drupal is the large number of modules that are available to drop into your installation. Some of them will even do stuff that's vaguely similar to stuff you want. I've come to the conclusion that the greatest value of these modules is that they provide sample code you can cut and paste when building your own modules to turn Drupal into exactly the system you want.

Another big win is that even though the core functions and add-on modules can be configured by pointing and clicking, they can also be configured from code. Martyn helped me set up an installation profile script which took me from nothing to fully configured in a little over a minute. Knowing you can burn down and completely recreate your development environment in minutes really helps to build confidence in the product and your ability to deploy it.

So I ended up building two custom modules to support:

  • custom content types with exactly the metadata we want
  • a category/navigation hierarchy and URL scheme that works the way we want
  • a simple system for hyperlinking between pages in such a way that the links don't break when pages are moved
  • file attachment handling that's close to what we want
  • a site design that's very close to what the designer mocked up
  • RSS feeds
  • integration with numerous small pieces of existing functionality
  • a reasonably straightforward user interface that content editors can use to manage all this stuff

Of course all that comes at a cost. Drupal performance sucks. Big time. No doubt a large part of that is all the hard work my custom code is doing and no doubt we could do magic with caching to make it suck less. But it doesn't matter because we were never going to install Drupal on our web servers anyway. We use wget to suck all the pages out into static files, run some fix-ups across them with a Perl script, and rsync the result up to the production server. For DR we just rsync to two servers instead of one.
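
A rough sketch of that publishing step (hostnames, paths and the fix-up rule are all invented):

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $staging = 'http://drupal-staging.example.com/';
    my $mirror  = '/var/tmp/site-mirror';

    # 1. Pull the whole site out of Drupal as static files.
    system('wget', '--mirror', '--no-host-directories',
           '--directory-prefix', $mirror, $staging) == 0
        or die "wget failed: $?";

    # 2. Fix up every HTML page in place (e.g. rewrite staging URLs).
    for my $file (glob "$mirror/*.html $mirror/*/*.html") {
        local @ARGV = ($file);
        local $^I   = '';                    # edit in place, no backup
        while (<>) {
            s/\Qdrupal-staging.example.com\E/www.example.com/g;
            print;
        }
    }

    # 3. Push the result to the production server(s).
    for my $host (qw(web1.example.com web2.example.com)) {
        system('rsync', '-az', '--delete', "$mirror/", "$host:/var/www/site/") == 0
            or die "rsync to $host failed: $?";
    }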

And here's the result.

At least my next assignment is Perl.

Tuesday April 22, 2008
04:19 AM

Database Naming Conventions

A $cow_orker recently sparked a debate about conventions for naming database objects. Obviously this is a bit of a religious issue for many and we certainly uncovered a variety of opinions. One very basic question which many feel strongly about is the pluralisation of table names. I have a preference for singular but am happy to run with plural if that's the convention in an existing project.

Early in my development career I saw a colleague ridiculed for creating a database table with a pluralised name. His justification was (quite reasonably) "I called it 'widgets' because I want to store multiple widget records in it". The DBA's response was "Of course you want to store multiple records in it. If you didn't have multiple records you'd hardly go to the bother of creating a table, would you?". From this logic it comes down to a simple choice: make every table name plural; or, don't bother. I've standardised on "Don't bother".

The thing I don't get is the vast number of people who subscribe to this inseparable pair of rules:

  • Database table names should always be plural
  • Object class names should always be singular

It seems obvious to me that if you agree with the first statement then using the same logic you should disagree with the second. Apparently other people don't see it the same way.

It seems to me that a 'widget' table defines the characteristics of a widget record and serves as a container for such records. Similarly a 'Widget' class describes the characteristics of a widget object and serves as a template for such objects. I just don't get why so many people see these two issues in black and white as obvious opposites.
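
To make the parallel concrete (the schema and class here are invented for illustration):

    # The table (singular) that describes widget records:
    #
    #     CREATE TABLE widget (
    #         id   serial PRIMARY KEY,
    #         name text NOT NULL
    #     );
    #
    # ...and the class (singular) that describes widget objects:

    package Widget;

    use strict;
    use warnings;

    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name} }, $class;
    }

    sub name { $_[0]->{name} }

    1;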

Tuesday April 08, 2008
05:10 PM

Wellington.pm meeting last night

Wellington Perl Mongers had their monthly meeting last night. There was a pretty good turnout despite the cold weather outside. It probably helps that the bulk of the attendees don't actually have to go outside to get from work to the venue :-)

Andy gave us a talk on his foray into Perl Golf. While he had fun and (re)learnt a few things along the way, he concluded not much of it was relevant to writing maintainable code for $work. It's hard to argue with that conclusion.

I was up next with a talk on the exact cover algorithm. This was inspired by an article Eric Maki wrote for the Perl Review a while back. Eric was using it to solve/generate Sudoku while I was using it to solve a different puzzle.
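
This isn't the code from the talk, but as a rough illustration of what an exact cover problem is, here's a naive Algorithm X style solver in Perl (the rows and columns are a made-up toy example):

    use strict;
    use warnings;

    # Exact cover: given rows that each cover some columns, pick a set of rows
    # so that every column is covered exactly once.
    # $rows maps a row name to an array ref of the columns it covers.
    sub solve_exact_cover {
        my ($rows, $columns) = @_;
        return [] unless @$columns;              # nothing left to cover - done

        my $col = $columns->[0];                 # cover the first column next
        for my $name (sort keys %$rows) {
            my %covers = map { $_ => 1 } @{ $rows->{$name} };
            next unless $covers{$col};

            # Remove the columns this row covers, and any row that clashes.
            my @cols_left = grep { !$covers{$_} } @$columns;
            my %rows_left;
            for my $other (keys %$rows) {
                next if grep { $covers{$_} } @{ $rows->{$other} };
                $rows_left{$other} = $rows->{$other};
            }

            my $rest = solve_exact_cover(\%rows_left, \@cols_left);
            return [ $name, @$rest ] if $rest;
        }
        return;                                  # dead end - backtrack
    }

    # Toy example: three rows, four columns.
    my %rows = ( A => [1, 2], B => [3, 4], C => [2, 3] );
    my $picked = solve_exact_cover(\%rows, [1 .. 4]);
    print "Cover: @$picked\n" if $picked;        # prints "Cover: A B"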

Last but not least, Finlay gave us a brief intro to Parse::RecDescent and a particular application of it for parsing postal addresses from $previous_job. I've never actually used PRD in any non-trivial way so this was a good refresher.
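
If you haven't seen Parse::RecDescent before, a toy grammar looks something like this (nothing to do with the postal address parser from the talk):

    use strict;
    use warnings;
    use Parse::RecDescent;

    my $grammar = q{
        greeting   : salutation name
                     { print "Hello to $item{name}\n" }
        salutation : 'hello' | 'hi'
        name       : /\w+/
    };

    my $parser = Parse::RecDescent->new($grammar)
        or die "Bad grammar\n";

    defined $parser->greeting('hello world')
        or warn "Input did not match\n";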

Our next meeting will be May 13th, so if you're going to be visiting Wellington next month let me know :-)

Friday March 07, 2008
03:50 AM

A Postgres Song

With apologies to Messrs Gilbert and Sullivan (and finger of blame pointed squarely at colleagues Gav and Luke) ...

I am the very model of a database relational,
My updates are atomic and ACIDic and transactional,
My planner aims to optimise your queries scatological,
My indexes will cope with SQL that is pathological

My data types encompass from mundane to geographical,
My data safety record shows concern that's quite fanatical,
My cost per TPC will beat both DB2 and Oracle,
And yet the plebs persist in writing apps for bloody MySQL!