Wednesday January 30, 2008
06:39 PM

MiniCPAN aging data in "many eyes"

[ #35526 ]

Jan Dubois suggested that I upload my MiniCPAN aging data to "many eyes", a nifty IBM data visualization project. I upload the data; you make pretty pictures of it and embed them in websites.

I've created the MiniCPAN aging data set and created a "Perl" topic hub. I don't think I need to do any more for you to play with the data. If you have your own data about Perl things, add it to the topic hub.

To make the pretty pictures, you need some Java applet fu in your browser. That doesn't work for me right now and I'm not going to worry about it at the moment. There is a feature to "share" a visualization by embedding some special HTML if you find a picture that you like.

Good luck :)

  • SIMILE Exhibit [mit.edu] is a similar project except it doesn't require Java (and probably doesn't have as many different visualizations).
  • I've added a dataset that's scraped from http://search.cpan.org/recent [cpan.org]. You can see a line graph of it here [ibm.com].

    Naturally it exhibits roughly the same curve as brian's data but with a slightly different shape and slightly more detail.

    I think we can safely say the trend is "up" :)

    • Have you mentioned anywhere that you uploaded the CPAN Testers data? I haven't been able to reach the server for a bit, but it's in the Perl topic hub. I made some bubble charts of it. Now it looks like one of those colorblind tests. I think I have the next cover for The Perl Review :)

      Soon I will import that Perl Jobs data too. I was a bit disappointed to not be able to find a way to compute virtual columns in the data or completely replace a data set with all new rows, but once I can get back onto the site
    • Nice hack! Now do you think you can extract "new modules" vs. "module updates" from that data? That would be even more interesting... :-)
      • Yes, I want to look at first time distributions too. That one is a little more tricky because I have to parse the file name (no big deal), and at the same time I want to collect data by author too. :)
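
The filename parsing mentioned in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input is a chronological list of (author, filename) pairs, the regular expression is a simplified guess at the usual "Dist-Name-1.23.tar.gz" shape (real CPAN filenames have more edge cases), and the names `DIST_RE` and `classify_uploads` are hypothetical.

```python
import re
from collections import Counter

# Simplified, assumed pattern: "<dist name>-<version>.<archive suffix>".
# Real CPAN filenames are messier; this is not a complete parser.
DIST_RE = re.compile(
    r"^(?P<dist>.+?)-v?(?P<version>\d[\d._]*)\.(?:tar\.gz|tar\.bz2|tgz|zip)$"
)


def classify_uploads(uploads):
    """Split uploads into first releases and updates, tallied per author.

    `uploads` is an iterable of (author, filename) pairs, assumed to be
    in chronological order (oldest first).
    """
    seen = set()                  # distribution names already encountered
    first_by_author = Counter()   # first-time distributions per author
    update_by_author = Counter()  # later releases per author
    for author, filename in uploads:
        match = DIST_RE.match(filename)
        if match is None:
            continue  # skip names the simplified pattern cannot parse
        dist = match.group("dist")
        if dist in seen:
            update_by_author[author] += 1
        else:
            seen.add(dist)
            first_by_author[author] += 1
    return first_by_author, update_by_author
```

Counting a distribution as "new" the first time its name appears only works if the data reaches back to the start of the archive; with a partial history, early entries would be misclassified as first releases.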
  • I've just set up a little cron job that tracks the latest updates to the CPAN, Python Cheese Shop, RubyGems and PEAR (PHP). At some point in the future we'll be able to graph daily upload stats for all four.

    It'd be nice to be able to go back in time. All available RubyGems are described in a YAML file which includes their release date - so that's easy. I couldn't find a source of historical data for PEAR or Cheese Shop. If anyone can suggest sources for that data I'll go and investigate.

    I think really w

    • I managed to reconstruct histories for the other languages, although some of the Cheese Shop figures may be the result of double counting; I'm not certain.

      The resulting graph is here [ibm.com]

      • Very nice.

        If you are able to put these figures into rolling months, the chart may provide pretty good trend information.

        /13az/
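
One way to read the "rolling months" suggestion above is a trailing sum over the last few monthly totals. The sketch below assumes daily upload counts keyed by date and a three-month window; the function names and the window size are illustrative choices, not anything taken from the posted data sets.

```python
from collections import Counter, deque
from datetime import date


def monthly_totals(daily_counts):
    """Collapse {date: count} into {(year, month): total}, sorted by month."""
    totals = Counter()
    for day, count in daily_counts.items():
        totals[(day.year, day.month)] += count
    return dict(sorted(totals.items()))


def rolling_sum(monthly, window=3):
    """Trailing sum over `window` consecutive entries of `monthly`.

    Assumes every month in the range is present; a month with no
    entries at all would silently shorten the real time span covered.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` totals
    out = {}
    for month, total in monthly.items():
        recent.append(total)
        if len(recent) == window:
            out[month] = sum(recent)
    return out
```

A rolling total smooths out day-to-day and week-to-week noise, which is why it tends to show the trend more clearly than raw daily counts.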