All the Perl that's Practical to Extract and Report

  • CPAN Stats (Score:4, Informative)

    by ziggy (25) on 2003.10.19 12:16 (#24974) Journal
    It's interesting to note that according to Ziggy, almost 50% of the brand new modules on CPAN were uploaded in 2003.
    First, we need to remember that there are lies, statistics and misinterpreted statistics.

    The number I came up with is from analyzing the modules list. This includes some duplication (mod_perl is listed twice; there are three distributions of AcePerl at the top of the list) and some exclusions (perl-5.8.x.tar.gz is not listed, nor is Meta).

    Analyzing the modules list cannot show the growth of CPAN, but it can be used as a first approximation of the freshness of CPAN. I did not say that ~50% of the new modules on CPAN were created since January 2003. I did say that 44% (quoting from memory) of CPAN modules listed on the modules list were created or modified at least once since January 2003. The modules list cannot show the former. It can give an indication of the latter.

    The modules list does not (in general) reflect prior distributions of current modules (either currently on CPAN or deleted by the author) that would be found by analyzing an ls-lR of CPAN or backpan. Groveling over an ls-lR dump of either will find things that aren't modules (and arguably should not be counted), while the modules list will find too little. I find the modules list to provide a good first approximation, no more, no less.
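
    As a rough illustration of the kind of count described above (the share of listed distributions created or modified since a cutoff date), here is a minimal Python sketch. The sample records, their dates, and the function name `fraction_touched_since` are all hypothetical; this is not the script used to produce the 44% figure, and it does not parse the real modules list.

```python
from datetime import date

# Hypothetical sample records: (distribution name, last-modified date).
# Names and dates are invented for illustration only.
SAMPLE = [
    ("Foo-Bar", date(2003, 3, 14)),
    ("Baz-Qux", date(2001, 7, 2)),
    ("Acme-Example", date(2003, 9, 30)),
    ("Old-Thing", date(1999, 12, 1)),
]


def fraction_touched_since(records, cutoff):
    """Return the share of records whose date is on or after cutoff."""
    if not records:
        return 0.0
    touched = sum(1 for _name, modified in records if modified >= cutoff)
    return touched / len(records)


share = fraction_touched_since(SAMPLE, date(2003, 1, 1))
print(f"{share:.0%} of listed distributions touched since the cutoff")
```

    Note that this measures freshness of what is currently listed, not growth: a distribution first uploaded in 1999 and patched once in 2003 counts the same as one created in 2003.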

    This means over time either we're creating new modules faster than we're maintaining the existing ones OR we just got a huge influx of new authors OR Ziggy's wrong.
    OR you're misinterpreting the statement "almost 50% of current module distributions on CPAN have been created or modified since January 2003".

    The difference is rather significant.

    My intent was to see if Perl was stagnating, or if people still care about Perl. Using uploads to CPAN as a proxy, I found that ~85% of what's current on CPAN (as listed in the modules list) was created or modified after the Perl 6 announcement.

    We can argue over methodologies and precise figures, but I assert that the intent is sound, and these figures serve as a good first approximation -- no more, no less.

    The conclusion that I draw from these numbers is that as a community, we have not given up on Perl 5, nor have we stopped caring about Perl 5 since the Perl 6 announcement. I didn't expect anyone to infer that people stopped supporting old modules, or that the list of CPAN authors increased geometrically.

    • OR you're misinterpreting the statement "almost 50% of current module distributions on CPAN have been created or modified since January 2003".

      OR I got my information second hand and it was misquoted. :) Sorry, entirely my mistake. Teach me not to check my sources. I hope you didn't mistake the tone of my writing to imply I was trying to one up your stats.

      The ls -lR stats and your module list stats make a hell of a lot more sense now.

    • Do that computation again, but this time leave off the Acme namespace and use the ls-lR since the modules list is incomplete and still manually maintained.

      The conclusion is obvious, if not terribly important, but statistics lie too often, especially around those who like pie charts. CPAN is the only thing that is doing well in the perl world right now...you don't need a bar graph to tell anyone that.
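
      For illustration, leaving off a namespace before counting could look like the Python sketch below. The helper name `without_namespace` and the sample names are hypothetical, and treating both `Acme::` and `Acme-` as namespace prefixes is an assumption about how the entries are spelled in whichever listing is used.

```python
def without_namespace(names, namespace):
    """Drop names equal to the namespace or starting with it plus '::' or '-'."""
    prefixes = (namespace + "::", namespace + "-")
    return [n for n in names if n != namespace and not n.startswith(prefixes)]


# Hypothetical names; "Acmeology" is kept because it only shares a prefix.
names = ["Acme::Example", "Acme-Other", "Acmeology", "Foo::Bar"]
print(without_namespace(names, "Acme"))
```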

      • Do that computation again, but this time leave off the Acme namespace

        That's a good idea. Thanks.

        use the ls-lR since the modules list is incomplete and still manually maintained.

        The ls-lR for either CPAN or BACKPAN doesn't suit my needs at the moment. Eventually, I want to do a more detailed analysis of BACKPAN, but not this week. Thanks for making that ls-lR.

        The conclusion is obvious, if not terribly important, but statistics lie too often, especially around those who like pie charts. CPAN

        • Parroting received wisdom ("CPAN is the only thing that is doing well in Perl", "You can do more with Perl because it is «more expressive»", "Perl 6 is the future of Perl") doesn't do anyone any favors.

          Parroting? You say that like we don't pay attention to CPAN. We do, but we don't publish statistics just because we run the joint. Most of all, publishing inaccurate and misleading statistics doesn't do anyone any favours. Statistics are the tool with which people lie to others as well as to thems

          • You say that like we don't pay attention to CPAN.

            No, I say that because a lot of pro-Perl sentiments are repeated endlessly without any critical analysis. If you actually read what I said, you'd see that I'm not singling out CPAN, nor am I accusing CPAN's maintainers of not paying attention.

            I also never attributed any mystical properties to an uploaded distribution, so please don't say that I did. As I said before, the only thing an upload means is that someone created or updated a file on C

            • No, but you were taking the conclusion and generating numbers rather casually to prop it up, which isn't critical analysis either.

              And I wouldn't call my earlier conclusion positive or 'pro-perl' either.