Wednesday February 28, 2007
04:29 PM

Finding the big files in MiniCPAN

[ #32530 ]

The mini-CPAN, a smaller version of the Comprehensive Perl Archive Network that includes just the latest versions and excludes a few big things, is now about 700 MB on my machine. That means that it can't quite fit onto a single CD, at least without removing parts of it. That's no good.

I've been playing with GrandPerspective, a Mac OS X utility that draws a tree map of a directory so you can easily see where the big files are. Here's the map for my /MINICPAN:

Here's the image link: mini cpan tree map.

The big files represented by the tan section in the lower left are BioPerl. Most of the other big boxes are parrot in various releases, but from different authors (so maybe my minicpan script needs to recognize the multi-author situations to remove old versions). I'm not sure why these are in my minicpan:

$ find . -name "*parrot*" 2>/dev/null | xargs du -h
3.6M    ./authors/id/C/CH/CHIPS/parrot-0.4.7.tar.gz
528K    ./authors/id/J/JG/JGOFF/parrot-0.0.5.tar.gz
6.5M    ./authors/id/J/JG/JGOFF/parrot-0.0.8.1.tgz
756K    ./authors/id/J/JG/JGOFF/parrot-0_0_7.tgz
8.6M    ./authors/id/L/LT/LTOETSCH/parrot-0.1.2.tar.gz
2.6M    ./authors/id/L/LT/LTOETSCH/parrot-0.2.1.tar.gz
2.8M    ./authors/id/L/LT/LTOETSCH/parrot-0.3.1.tar.gz
2.8M    ./authors/id/L/LT/LTOETSCH/parrot-0.4.1.tar.gz
3.1M    ./authors/id/L/LT/LTOETSCH/parrot-0.4.5.tar.gz
3.7M    ./authors/id/P/PA/PARTICLE/parrot-0.4.8b.tar.gz
3.8M    ./authors/id/P/PM/PMIC/parrot-0.4.9.tar.gz
6.8M    ./authors/id/S/SF/SFINK/parrot-0.0.10.tar.gz
7.0M    ./authors/id/S/SF/SFINK/parrot-0.0.11.2.tar.gz
6.7M    ./authors/id/S/SF/SFINK/parrot-0.0.9.tar.gz
192K    ./authors/id/S/SI/SIMON/parrot-0.0.3.tar.gz
436K    ./authors/id/S/SI/SIMON/parrot-0.0.4.tar.gz
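
A rough shell sketch of how the multi-author situation could be spotted: it lists distribution names that show up under more than one author directory. The function name `multi_author_dists` is made up for illustration, and the name/version split (everything from the first `-` or `_` followed by a digit is treated as the version) is a naive assumption that will misjudge some distribution names. It is not how minicpan itself decides what to keep.

```shell
# Hypothetical helper: report distributions held by more than one author.
# Assumes the mirror layout authors/id/X/XY/AUTHOR/Name-VERSION.tar.gz (or .tgz).
multi_author_dists() {
    find "$1/authors/id" -type f \( -name '*.tar.gz' -o -name '*.tgz' \) |
    awk -F/ '{
        author = $(NF-1)                  # directory that holds the tarball
        dist = $NF                        # tarball file name
        sub(/[-_][0-9].*$/, "", dist)     # naive: drop version and extension
        if (!seen[dist, author]++) count[dist]++
    }
    END {
        # Print "number-of-authors name" for names seen under several authors
        for (d in count) if (count[d] > 1) print count[d], d
    }' |
    sort -k1,1nr -k2
}
```

Run as `multi_author_dists .` from the top of the mirror; each output line is an author count followed by a distribution name.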

I've also been thinking about the idea of a user-defined filter for minicpan so it can exclude things I know I don't want.
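
Before excluding anything, it could help to preview what a set of patterns would catch. The sketch below takes a whitespace-delimited list of regular expressions and prints the files in the mirror whose paths match any of them. The function name `preview_filters` is hypothetical, and it uses `grep -E` (POSIX extended regular expressions) rather than Perl regular expressions, so it only approximates what a real minicpan filter would do.

```shell
# Hypothetical helper: list mirror files matching any of the given patterns.
# $1 is the mirror root, $2 is a whitespace-delimited list of regexes.
preview_filters() {
    mirror=$1
    filters=$2
    # Turn "re1 re2 re3" into the single alternation "re1|re2|re3".
    pattern=$(printf '%s\n' "$filters" | tr -s ' \t' '\n' | sed '/^$/d' | paste -s -d '|' -)
    find "$mirror/authors/id" -type f | grep -E -- "$pattern" | sort
}
```

For example, `preview_filters . '/parrot- /bioperl-'` would list the parrot and bioperl tarballs that such a filter would cover.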

  • minicpan (and CPAN::Mini) have filters -- they're just not well documented in the minicpan man page. I apologize, and will try to rectify this.

    If you put "skip_perl: 1" in your .minicpanrc, it will not mirror perl, ponie, or parrot.

    If you include an entry for path_filters or module_filters, you can skip paths or modules. They're interpreted as whitespace-delimited regular expressions -- although I admit I don't use any myself, anymore, and the specifics have escaped me. I think this is about right:
    --
    rjbs