All the Perl that's Practical to Extract and Report

Matts (1087)

I work for MessageLabs [messagelabs.com] in Toronto, ON, Canada. I write spam filters, MTA software, high performance network software, string matching algorithms, and other cool stuff mostly in Perl and C.

Journal of Matts (1087)

Wednesday June 11, 2003
10:22 AM

Scary DB fix

[ #12748 ]

OK, so we were getting nervous about spam growth, and so someone suggested that we just try a different architecture.

I was worried about this, but it turns out the redesign was really simple...

Instead of one great big Pg or SQL Server database indexing all the spam for every user, we give each user a database of their own. We go from 9 billion rows in 1 table to about 100 rows per database.

Step in, SQLite.

We simply have /a/b/c/d/user.db for every user, where /a/b/c/d is a split hash structure of the email address. Stick it all on a big fat NetApp storage system and you have perhaps the world's most scalable spam storage system.
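A rough sketch of how such a path could be derived. The post doesn't say which hash is used or how it is split, so everything here is an assumption: MD5 via Digest::MD5, the first four hex digits as the four directory levels, the lowercased address as the file name, and the helper name user_db_path and the /spam root are made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Spec;

# Hypothetical helper: map an email address to <root>/a/b/c/d/<address>.db,
# where a..d are the first four hex digits of the MD5 of the address.
sub user_db_path {
    my ($root, $email) = @_;
    my $key    = lc $email;    # assumption: addresses are treated case-insensitively
    my @levels = split //, substr(md5_hex($key), 0, 4);
    return File::Spec->catfile($root, @levels, "$key.db");
}

# Same address always lands in the same file, whatever its case.
print user_db_path('/spam', 'Someone@Example.com'), "\n";
```

With one hex digit per level this gives 16**4 = 65,536 leaf directories, so no single directory has to hold every user's file.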

Now the clever bit was that because all my database calls are abstracted, but not *too* abstracted (à la Alzabo or Class::DBI), I simply changed all the bits of code that accessed the spam database to do:

  local $self->{dbi} = $self->sqlite_db($email);

And then, like magic, all the original code still works. And because all this was in a single abstraction layer, it was about 2 hours' work, with about another hour to fix up all the tests.
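To see why that one line is enough, here is a minimal sketch of the mechanics. It is not the real code: the SpamStore class and its describe_* methods are invented, and plain strings stand in for database handles (the real sqlite_db would presumably return a DBI handle for that user's file). The point is only that local swaps the slot for the duration of the enclosing block and puts the old value back afterwards.

```perl
use strict;
use warnings;

package SpamStore;    # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    # 'central' stands in for the original shared database handle.
    return bless { dbi => 'central' }, $class;
}

# Stand-in for the real method, which would hand back a handle
# connected to that user's SQLite file.
sub sqlite_db {
    my ($self, $email) = @_;
    return "sqlite:$email";
}

# Pre-existing code: it only ever looks at $self->{dbi}.
sub describe_handle {
    my ($self) = @_;
    return "querying via $self->{dbi}";
}

# The one-line change: swap the handle for the duration of this call.
sub describe_for_user {
    my ($self, $email) = @_;
    local $self->{dbi} = $self->sqlite_db($email);
    return $self->describe_handle;
}

package main;

my $store = SpamStore->new;
print $store->describe_for_user('someone@example.com'), "\n";  # querying via sqlite:someone@example.com
print $store->describe_handle, "\n";                            # querying via central
```

Because local restores the old value when the sub returns (or dies), nothing downstream has to know the handle was ever swapped.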

So everyone - make sure you have a database abstraction layer. It's like a template system - it'll make your life easier.
