Alias (5735)

http://ali.as/

Journal of Alias (5735)

Thursday January 15, 2009
08:54 AM

WE CAN HAZ (CPAN) DATA!!!

[ #38294 ]

I've been chasing an idea for a few years now without a huge amount of success.

The idea is to take the data from all the different bits of CPAN and tie it together effectively, with CPAN namespaces that fundamentally represent a data source rather than code.

My not-particularly-successful Data::Package class was an attempt to move in this direction, but never really got very far. And my repository has a couple of stillborn attempts at something like CPAN::Dataset in it.

The beginning of a new solution arrived with the idea that you can fairly easily post a copy of your data in the form of a compressed SQLite database.
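As a rough illustration of that publish-and-consume cycle, here is a minimal sketch using only Python's standard library. The table, file names, and the helpers `publish`, `fetch`, and `demo_roundtrip` are all made up for the example, and a local file stands in for the URL; this is not how any of the CPAN services actually package their data.

```python
import gzip
import os
import shutil
import sqlite3
import tempfile


def publish(db_path, gz_path):
    """Compress an SQLite file so it could be posted somewhere as a download."""
    with open(db_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)


def fetch(gz_path, local_path):
    """Unpack a downloaded copy and open it as an ordinary SQLite database."""
    with gzip.open(gz_path, "rb") as src, open(local_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return sqlite3.connect(local_path)


def demo_roundtrip():
    """Round-trip a tiny, hypothetical table through the compressed form."""
    workdir = tempfile.mkdtemp()
    original = os.path.join(workdir, "reports.sqlite")
    packed = original + ".gz"
    unpacked = os.path.join(workdir, "mirror.sqlite")

    # The data owner builds the database as usual.
    conn = sqlite3.connect(original)
    conn.execute("CREATE TABLE report (dist TEXT, status TEXT)")
    conn.executemany(
        "INSERT INTO report VALUES (?, ?)",
        [("Foo-Bar", "PASS"), ("Foo-Bar", "FAIL"), ("Baz", "PASS")],
    )
    conn.commit()
    conn.close()

    # The owner publishes it; a consumer mirrors and queries it locally.
    publish(original, packed)
    mirror = fetch(packed, unpacked)
    rows = mirror.execute(
        "SELECT dist, COUNT(*) FROM report WHERE status = 'PASS' "
        "GROUP BY dist ORDER BY dist"
    ).fetchall()
    mirror.close()
    shutil.rmtree(workdir)
    return rows
```

The point of the format is that the consumer needs nothing beyond a decompressor and SQLite itself: once unpacked, the mirror is queried exactly like a database you built yourself.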

ORLite::Mirror solved the actual functionality needed to create a class for a remote SQLite database.

But building an actual distribution for CPANTS or CPAN Testers was still not particularly economical, because you don't CONTROL that data. It's hard to justify the work of writing tests and documentation when the dataset might change underneath you and the documentation would have to change along with it.

The still-forming ORLite::Pod enhancement to ORLite seems to add the missing automation. It reduces the cost and effort of producing and maintaining a client-side ORM for someone else's data to the point where it becomes quite easy to just throw together a distribution for any dataset you can get a URL for.

To kick the process off, I've created a couple of new distributions in a new ORDB namespace (to hold these ORLite-based remote SQLite databases).

ORDB::CPANTesters is a simple single-table ORDB that lets you search the CPAN Testers database.

ORDB::CPANTS is a multi-table ORDB that lets you work with the CPANTS data.

I've uploaded these distributions, despite ORLite::Pod not being fully completed yet, so I can get a better idea of how this all works in practice.

You are welcome to download and try these distributions out. Just be warned that because the classes are code-generated from the SQLite database itself, the modules need to pull down the databases (the CPANTesters one is almost 100 MB) before they can even compile.
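To see why the database has to be on hand before the classes can exist, here is a small Python sketch of the general technique: read the schema out of an SQLite file and build one class per table from it. It is an illustration only, not ORLite's implementation; `build_classes`, `make_class`, `demo_generated`, and the `dist` table are hypothetical names invented for the example.

```python
import sqlite3


def make_class(conn, table, columns):
    """Build a class for one table, with one attribute per column."""

    def __init__(self, *values):
        for name, value in zip(columns, values):
            setattr(self, name, value)

    @classmethod
    def select(cls, where="", params=()):
        # 'where' is a trusted SQL fragment; values go through placeholders.
        sql = f'SELECT {", ".join(columns)} FROM "{table}" {where}'
        return [cls(*row) for row in conn.execute(sql, params)]

    @classmethod
    def count(cls, where="", params=()):
        sql = f'SELECT COUNT(*) FROM "{table}" {where}'
        return conn.execute(sql, params).fetchone()[0]

    members = {
        "__init__": __init__,
        "select": select,
        "count": count,
        "columns": tuple(columns),
    }
    return type(table.title(), (object,), members)


def build_classes(conn):
    """Generate one class per table by inspecting the database schema."""
    classes = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info returns one row per column; index 1 is its name.
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        classes[table] = make_class(conn, table, [row[1] for row in info])
    return classes


def demo_generated():
    """Generate classes for a made-up table and query through them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dist (name TEXT, kwalitee INTEGER)")
    conn.executemany(
        "INSERT INTO dist VALUES (?, ?)", [("Foo-Bar", 12), ("Baz", 9)]
    )
    dist_class = build_classes(conn)["dist"]
    good = dist_class.select("WHERE kwalitee > ? ORDER BY name", (10,))
    return dist_class.columns, dist_class.count(), [d.name for d in good]
```

Nothing about the `dist` class is written down anywhere in the code; its attributes come entirely from the schema. That is the convenience, and also the reason the data has to be downloaded before anything can load.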

In the meantime, what else can you think of that I can wrap a module around? I've got to get ahead of ZOFFIX on the leaderboard again somehow :)
