Monday January 23, 2006
11:28 PM

CPAN Clean Sweep 2006

[ #28444 ]

It's a new year, so it's time to reduce the size of CPAN.

A long time ago, I was really interested in measuring the growth of CPAN, and comparing that to the size of MINICPAN, which I called the Schwartz Factor.

I'm interested again because CPAN just filled up an old laptop disk and my FreeBSD machine got really concerned about where it would put files. CPAN is over 3 gigs. Yowsers! We can get under 3 if people can delete about 150 MB of old versions.

Some of that size, however, is cruft and old versions of modules. Delete those older files! You'll still find them on BackPAN, so you don't need to keep them in CPAN. :)

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • I'm wondering how much effect the previous call to clean up CPAN had. Although that message, 'Increase your Schwartz' http://use.perl.org/~brian_d_foy/journal/8314 [perl.org], is interesting to read, there are no comments on it. My guess is that the CPAN authors who leave lots of old versions around are also those who don't read use.perl journals.
  • Can we tell who the worst offenders are? Surely there's a dozen people or so who have never deleted files and released dozens or hundreds of versions of things.
    • Sure, we could just go through the author directories and count up the byte size of all of the old distros. Maybe I'll hack up that script; I need something for next month's The Perl Journal [theperlreview.com]. :)

      I'm not about shaming people though, and some people might have good reason to keep old versions around.
      • Well, it's never _about_ shaming people.

        It's about providing more information for people. When a new batch of information becomes available (like the CPANTS stuff), there will be a number of people who had no idea they were doing the wrong thing and will quite quickly move to fix it.

        And those that don't care will never see.

        Just don't make a competition out of it. :)

        If you were to list people based on "inverted Schwartz factor" then you could just list it for the good people as "3 or less", to a
  • Couldn't some kind of auto-archiving take place, where things get moved to BackPAN?

  • About 80 files gone. Does CPAN/CPANPLUS search BackPAN when installing modules? If so, maybe it's time to just make this an automatic thing?
  • It would be helpful if authors would keep at least one older version on CPAN:

    1) The diff tool at search.cpan.org doesn't work against BackPAN
    2) Sometimes unexpected bugs or incompatibilities do crop up, and in those cases, it's nice to be able to use CPAN.pm to revert rather than having to go and manually download the tarball.
    • I totally agree with this! I would even argue that only releases over a year old should be deleted. Having the last couple of versions available for forensic investigations using the search.cpan.org diff tool is very valuable.

      Disk space is cheap; 3GB is less than a single DVD. :)

  • Different subject entirely, but how are you maintaining your mirror? I used to use rsync to keep mine up-to-date, but it seems that pretty much all the mirrors I've tried rsyncing from recently are broken, so I never get back in sync.
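
The idea floated in the thread above, going through the author directories and counting up the byte size of the old distros, could be sketched roughly as follows. This is only an illustration in Python, not the script mentioned in the comments. The function names (`old_bytes_by_author`, `scan_mirror`) are made up for this example. It assumes a local mirror with the usual `authors/id/X/XY/AUTHOR/` layout, assumes archives are named like `Dist-Name-1.23.tar.gz`, and treats the most recently modified file of each distribution as the current release, which may not hold on every mirror.

```python
import os
import re
from collections import defaultdict

# Assumed naming convention: Dist-Name-1.23.tar.gz (not every upload follows it).
DIST_RE = re.compile(r"^(?P<name>.+)-v?\d[\d._]*\.(?:tar\.gz|tgz|tar\.bz2|zip)$")


def old_bytes_by_author(entries):
    """Sum the bytes used by superseded releases, per author.

    entries: iterable of (author, filename, size_in_bytes, mtime).
    Assumption: the newest file by mtime is the release worth keeping.
    """
    releases = defaultdict(list)  # (author, dist name) -> [(mtime, size)]
    for author, filename, size, mtime in entries:
        match = DIST_RE.match(filename)
        if match is None:
            continue  # names that don't parse are skipped, not guessed at
        releases[(author, match["name"])].append((mtime, size))

    totals = defaultdict(int)
    for (author, _name), files in releases.items():
        files.sort()  # oldest first; the last entry is kept
        totals[author] += sum(size for _mtime, size in files[:-1])
    return dict(totals)


def scan_mirror(root):
    """Yield (author, filename, size, mtime) for files under root/authors/id/."""
    base = os.path.join(root, "authors", "id")
    for dirpath, _dirs, files in os.walk(base):
        parts = os.path.relpath(dirpath, base).split(os.sep)
        if len(parts) < 3:
            continue  # still above the per-author directory
        author = parts[2]
        for name in files:
            info = os.stat(os.path.join(dirpath, name))
            yield author, name, info.st_size, info.st_mtime
```

Calling `old_bytes_by_author(scan_mirror("/path/to/CPAN"))` gives a dictionary that can be sorted by value to see where the reclaimable space is. Authors with nothing to delete show up with a total of 0, so the same output works for an "N or less" style listing rather than a ranking.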