Journal of dwhite21787 (1259)

Thursday August 01, 2002
07:46 AM

Finally wrote everything about our hashing system

[ #6834 ]
Woo hoo! Got one monkey off my back, finally.

I wrote a 12-page paper about how our distributed hash harvester actually works. It's not on the web yet because it needs to go through a review process, but I hope it will be soon.

It's not the most spectacular use of Perl. Having the same piece of code running on 6 PCs, doing loosely parallel hashing and monitored by a master cron job and a couple of database tables, isn't rocket science; rockets are someone else's job.

Using sexeger was the best improvement we've made so far. It halved the time we were spending in our slowest section of code, and cut our overall run time by a third.
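Sexeger ("regexes" spelled backwards) means reversing the input string, matching a reversed pattern against it, and reversing the result back. It tends to help when the interesting part is near the end of the string. The sketch below shows the idea in Python rather than Perl. The task (finding the last run of digits) and both function names are invented for illustration; the post does not say which patterns the harvester uses.

```python
import re

def last_number_forward(text):
    # Forward approach: the engine tries every earlier digit run first
    # and fails each time before reaching the last one.
    m = re.search(r"(\d+)\D*$", text)
    return m.group(1) if m else None

def last_number_sexeger(text):
    # Sexeger approach: reverse the input so the last digit run becomes
    # the first one, match it, then reverse the capture back.
    m = re.search(r"\d+", text[::-1])
    return m.group(0)[::-1] if m else None
```

Both functions return the same answer; the second simply lets the regex engine stop at the first match instead of working its way through the whole string.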

I plan on going to YAPC::Europe::2002 and I submitted an abstract; hopefully I can give a presentation, but if not I'll certainly want to pick up as much input as possible. There's so much more I should be doing with benchmarking, hashing and ODBC modules.

More later, and hopefully a URL for the paper on the NSRL website.

  • Sounds very interesting. I wrote something here at MessageLabs that sounds similar (though I'll know for sure when I read your paper). It does distributed hashing of emails so we can monitor how many copies of email X we've seen, and also keeps a global db of known spams - sort of like Razor, but not crap ;-)

    I'd love to get a look at it.