All the Perl that's Practical to Extract and Report

http://www.astray.com/

Leon Brocard (aka acme) is an orange-loving Perl eurohacker with many varied contributions to the Perl community, including the GraphViz module on the CPAN. YAPC::Europe was all his fault. He is still looking for a Perl Monger group he can start which begins with the letter 'D'.

Journal of acme (189)

Monday February 17, 2003
06:32 AM

CPANSTATS v2

[ #10623 ]
CPANSTATS has turned out pretty interesting and the results are pretty cool. However, the number of systems running it has stabilised at eighty. This is obviously because only people who read my journal have installed it. I mean, who would install a script they downloaded off the Internet? ;-)

What we need now is to achieve critical mass: get many more people to take part in the project. The simple answer would be to integrate it into CPANPLUS. A great many people use CPANPLUS (and more will in the future if it gets into the Perl core). CPANPLUS tells you when it's out of date, and the CPANSTATS results for it show that quite a few people are running the latest released version. Also, integrating would let me remove a lot of my own code by using the wonderful CPANPLUS::Backend. I'm always for deleting code...

The new version will be new and exciting, but I haven't quite decided what it should do. Currently, I report stats for every module (e.g. CGI::Fast). Would it be simpler if I reported stats for each distribution (e.g. CGI in this case) instead? If it's built into CPANPLUS, should it report stats automatically whenever you run it? Every week? Only if you explicitly tell it to? (By default, of course, it will be disabled.) Does PAUSE contain historical data so I could show the release dates? Could it use collaborative filtering to suggest modules that you might like?

My brain is murky. The project could do a lot more, but at the moment I'm not quite sure what. Do you guys have any suggestions?

  • distros (Score:4, Interesting)

    by darobin (1316) on 2003.02.17 8:22 (#17148) Homepage Journal

    I'd definitely find stats for distributions simpler, as some of them have many modules that are barely significant. I think it's the best level of granularity. Reporting whenever it's run or every week would seem fine to me, and while I agree that it ought to be off by default, I think it should strongly recommend turning it on (because it's fun!). Collaborative filtering would simply rock, if only for the gadget value. It could also give you the names of the top five people on CPAN you most want to buy a beer for.

    --

    -- Robin Berjon [berjon.com]

  • You might also see if you can get it linked from there, minimally from the outdated FAQ (#10). I
    was actually just thinking of emailing Graham about this on your behalf yesterday, except I've
    been bugging him a bit recently already.
    --
    Were that I say, pancakes?
  • Layout (Score:3, Interesting)

    by belg4mit (967) on 2003.02.17 11:39 (#17163) Homepage Journal
    You might also consider sorting by top-level namespace in lieu of straight alphabetical order,
    especially since every component of large bundles
    gets picked up.
    --
    Were that I say, pancakes?
  • Do you guys have any suggestions?

    I guess I'm in a minority, but I liked the functionality I sent you the patch for: letting the cpanstats script (with its heavy dependency demands) run on one perl but report the modules of another perl. Will this be lost with the integration into CPANPLUS, or will the CPANPLUS version still be able to probe another perl?

  • I work behind a fairly restrictive firewall, which also demands a username/password. For some weird reason, the env_proxy parameter that you pass into the code doesn't pick up the username/password environment variables (so the submission fails).

    However, PPM does work (just as a point of comparison, it uses HTTP as well, right?). Perhaps a way of sending statistics offline (do a -dryrun and mail the tar.gz somewhere?) might work for a few people who don't have access to the net all the time (and its through a

    • Thanks, I'm keeping this in mind for the next CPANSTATS project. I'll be creating lots more stats, and there will be options to send it to the server using a variety of methods.