Journal of Alias (5735)

Wednesday May 28, 2008
06:02 PM

ORLite::Mirror - Playing with other people's data

[ #36536 ]

One thing I've been wanting to do for ages is to have a muck around with the CPANTS dataset that is conveniently provided by the cpants.perl.org website.

The main thing stopping me has been the lack of an easy programmatic way to work with the data.

With the basic first implementation of ORLite done, the obvious next step to take is to enhance it to suck the SQLite data in from various places.

ORLite::Mirror is an ORLite subclass that mixes in LWP and a few other utility modules to allow the loading of the database from any arbitrary URL that LWP supports.

It supports both regular database files and compressed databases, which it recognises by a URL ending in .gz.
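To make the mechanism concrete, here is a rough sketch of the fetch-and-decompress step described above. It is not the actual ORLite::Mirror source: the helper names fetch_to and unpack_if_gzipped are made up for illustration, and the real module may cache and name files differently. It only assumes LWP::UserAgent's mirror method and the core IO::Uncompress::Gunzip module.

```perl
use strict;
use warnings;
use File::Spec;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: mirror $url into $dir and return the local path.
# mirror() only re-downloads when the remote file is newer than our copy.
sub fetch_to {
    my ($url, $dir) = @_;
    require LWP::UserAgent;    # loaded on demand, only needed for fetching
    my ($name) = $url =~ m{([^/]+)\z}
        or die "Cannot derive a file name from $url\n";
    my $path = File::Spec->catfile($dir, $name);
    my $res  = LWP::UserAgent->new->mirror($url, $path);
    # A 304 Not Modified response means the local copy is already current
    die "Failed to fetch $url: " . $res->status_line . "\n"
        unless $res->is_success or $res->code == 304;
    return $path;
}

# Hypothetical helper: if $path ends in .gz, decompress it next to the
# original and return the new path; otherwise return $path unchanged.
sub unpack_if_gzipped {
    my ($path) = @_;
    return $path unless $path =~ /\.gz\z/;
    (my $db = $path) =~ s/\.gz\z//;
    gunzip($path => $db) or die "gunzip failed: $GunzipError\n";
    return $db;
}
```

Calling unpack_if_gzipped(fetch_to($url, $dir)) would then leave you with a plain SQLite file on disk, ready to be handed to whatever builds the object model.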

What this means, of course, is that now I can load the CPANTS database without having to actually build the object model myself.

So now you can do stuff like the following...

#!/usr/bin/perl

use strict;
use warnings;

# Create an ORM model for the CPANTS database.
use ORLite::Mirror {
        url     => 'http://cpants.perl.org/static/cpants_all.db.gz',
        package => 'CPANTS',
};

# Count the rows in the author table.
my $count = CPANTS::Author->count;
print "CPANTS currently tracks $count authors\n";

# In scalar context, select returns a reference to an array of objects.
my $authors = CPANTS::Author->select('where pauseid = ?', 'ADAMK');
print "ADAMK is " . $authors->[0]->name . "\n" if @$authors;

So if anyone else just happens to control any large chunks of CPAN-related data, it would be just awesome if you could periodically (by which I mean cron) publish that dataset to some stable URL where interested people could get hold of it :)

With enough of the data exposed in this way, it should be relatively trivial to build a module that pulls down all of the different data sets and lets you write analysis algorithms that span several of them.
