

Alias (5735)

http://ali.as/

Journal of Alias (5735)

Wednesday June 09, 2010
12:49 AM

The CPAN just got a whole lot heavier, and I don't know why

[ #40387 ]

According to the latest CPANDB generation run, and the associated Top 100 website at http://ali.as/top100, something big just happened to the CPAN dependency graph.

After staying relatively stable for a long time, with the 100th position coming in at around 145 dependencies and incrementing by 1 every 3 months, the "weight" of the entire Heavy 100 set of modules has jumped by 20 dependencies in the last week or so!

The 100th position is now sitting at 166 dependencies, and MojoMojo, the perennial leader of the Heavy 100, has skyrocketed to an astonishing 330 dependencies. The shape of the "Jifty Plateau" is also much less distinguishable, which suggests the change might be more than just a pure +20 across the board.

The question is: why?

Is this caused by the restoration of a broken META.yml uncovering formerly ignored dependencies? Or has someone added a new dependency somewhere important accidentally?
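
To make the second possibility concrete, here is a minimal sketch in Python (not the CPANDB code, and not a claim about how the Top 100 is computed). It assumes "weight" means the number of distinct transitive dependencies, and it uses a made-up toy graph; the module names and the `weight` helper are hypothetical. It shows how one new dependency on a widely used module raises the count of everything that sits above it.

```python
def weight(graph, module):
    """Count the distinct transitive dependencies of `module`."""
    seen = set()
    stack = list(graph.get(module, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, ()))
    return len(seen)


# Hypothetical toy graph: module -> list of direct dependencies.
before = {
    "App::A": ["Core::Util", "Web::X"],
    "App::B": ["Core::Util"],
    "Web::X": ["Core::Util"],
    "Core::Util": [],
    "Heavy::Lib": ["Dep::One", "Dep::Two"],
}

# The same graph after Core::Util gains a single new dependency.
after = {name: list(deps) for name, deps in before.items()}
after["Core::Util"].append("Heavy::Lib")

for name in ("App::A", "App::B", "Web::X"):
    print(name, weight(before, name), "->", weight(after, name))
```

In this toy graph every module that depends on Core::Util gains the same three dependencies (Heavy::Lib plus its two prerequisites), which is the kind of near-uniform shift described above.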

  • You have probably thought of this already, but are you keeping track of the top 100 lists in a database? Analyzing the differences between previous runs could lead to interesting discoveries.

    Or maybe not, since you would have to have the number of dependencies for every CPAN module...

    These are software engineering problems at a different scale than we're used to.

    • I do keep meaning to start logging the SQLite files. They're only around 30-50 MB per run. (I do a run every 1-2 weeks, because generating the index downloads about a gigabyte.)

    • Keeping track of the dependencies for each module (well, for each distribution) doesn't take an awful lot of space. CPANdeps [cpantesters.org] does it for current versions of modules (and it tracks what the dependencies are, not just the number of dependencies) in 63MB, simply by caching all the META.yml files.

      And there are only 15,000-ish distributions. You'd only need a few bytes per distribution per snapshot-time.
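
The snapshot comparison suggested in this thread could look something like the sketch below. It is only an illustration under stated assumptions: each run is reduced to a plain mapping from distribution name to dependency count, `biggest_changes` is a hypothetical helper, and the numbers are invented rather than taken from any real CPANDB run.

```python
def biggest_changes(old, new, limit=10):
    """Return (dist, old_count, new_count, delta) rows, largest change first.

    Only distributions present in both snapshots are compared.
    """
    rows = []
    for dist in old.keys() & new.keys():
        delta = new[dist] - old[dist]
        if delta:
            rows.append((dist, old[dist], new[dist], delta))
    # Sort by size of change, then by name so the order is stable.
    rows.sort(key=lambda row: (-abs(row[3]), row[0]))
    return rows[:limit]


# Invented example snapshots: distribution -> dependency count.
old_run = {"Dist-A": 40, "Dist-B": 90, "Dist-C": 12}
new_run = {"Dist-A": 61, "Dist-B": 110, "Dist-C": 12, "Dist-D": 3}

for row in biggest_changes(old_run, new_run):
    print(row)
```

Storing only the counts keeps each snapshot small: with about 15,000 distributions and, say, 4 bytes per count, the counts alone come to roughly 60 KB per run. Finding out which dependency caused a jump would need the dependency lists themselves, not just the counts.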