Journal of miyagawa (1653)

Wednesday November 07, 2007
01:49 AM

Tagging CPAN changes

[ #34850 ]

Question: Is it possible to annotate or tag each CPAN module update so that we can tell whether it contains a "security fix", a "minor bug fix", a "major API change", etc.?

Context: At work we have a repository of third-party CPAN modules that we use on Vox or TypePad. Once a module is added to the list, we manually follow its changes to decide whether we need to upgrade (e.g. fixes for major bugs, security issues, memory leaks) or should not upgrade (e.g. backward-incompatible API changes).

It generally works well but sometimes we upgrade a module without knowing that it might break our code. In that case we take a look at how hard it is to update our code to follow the module change, and if it's not that easy, we simply revert the upgrade.

So I think it would be nice if we could automatically, or at least semi-automatically, know what kind of changes an upgrade of module XXX-YYY from version M to N contains, without manually reading Changes and diffing the source code. Note that I'm not saying these audit processes are worthless, but knowing how much change an upgrade introduces would make the work a bit easier.

Here are two possible solutions:

1) Having a rough standard for indicating "minor bug fix", "security fix", "major API change" and the like in the Changes file.

I know CPAN is not a place where we can force all module authors to follow one giant "standard", but we already have some standardization in CPAN module versioning: if a release is a developer release that "normal" users shouldn't upgrade to, we add "_" to the version number so that the CPAN ecosystem ignores it. Could we introduce more conventions like this to tag each module update?
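To make the underscore convention concrete, here is a minimal sketch of how a tool could skip developer releases. The function names are made up for illustration, and the list of versions is assumed to be already in release order.

```python
def is_developer_release(version: str) -> bool:
    """Treat any version string containing "_" as a developer release."""
    return "_" in version


def latest_stable(versions):
    """Return the last non-developer version, or None if there is none.

    Assumes `versions` is already sorted in release order (oldest first).
    """
    stable = [v for v in versions if not is_developer_release(v)]
    return stable[-1] if stable else None
```

Given releases 0.14, 0.15 and 0.16_01, this would pick 0.15 and ignore the developer release.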

I realize that it's not easy, because most authors write their Changes file as free text. Some authors use more structured formats like YAML, POD or n3/RDF(!), but I don't like doing that myself. Hm, maybe YAML is acceptable.
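As a rough illustration of what a "rough standard" could look like while keeping Changes as free text, the sketch below assumes a purely hypothetical convention: each version starts a line with its number, and entries may carry bracketed tags such as [SECURITY], [BUGFIX] or [API-CHANGE]. Neither the tag names nor the layout are an existing CPAN convention.

```python
import re

# Hypothetical tag vocabulary; not an existing CPAN standard.
KNOWN_TAGS = {"SECURITY", "BUGFIX", "API-CHANGE"}

# A version heading is assumed to start at column 0, e.g. "0.03  2007-11-07".
VERSION_LINE = re.compile(r"^(\d[\w.]*)(?:\s|$)")
TAG = re.compile(r"\[([A-Z-]+)\]")


def tags_by_version(changes_text):
    """Map each version found in a Changes text to the set of known tags used."""
    result = {}
    current = None
    for line in changes_text.splitlines():
        heading = VERSION_LINE.match(line)
        if heading:
            current = heading.group(1)
            result.setdefault(current, set())
            continue
        if current is None:
            continue  # preamble before the first version heading
        for tag in TAG.findall(line):
            if tag in KNOWN_TAGS:
                result[current].add(tag)
    return result


SAMPLE = """Revision history for Foo-Bar

0.03  2007-11-07
  - [SECURITY] escape user input in render()
  - [API-CHANGE] new() now requires a hashref

0.02  2007-10-01
  - [BUGFIX] fix memory leak in parser
  - updated docs
"""

if __name__ == "__main__":
    for version, tags in tags_by_version(SAMPLE).items():
        print(version, sorted(tags))
```

Untagged entries are simply ignored, so authors who keep writing plain free text would not break anything.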

Anyway, if that doesn't sound realistic, I have another solution in mind: 2) a Wiki/del.icio.us-like website where anyone can tag any module release. It might be a more Web 2.0 way to accomplish the original purpose :)

We probably want to integrate the user authentication with PAUSE/BitCard so that we can say "this release is tagged 'minor bug fix' by the author."

Thoughts?

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • I'd like an RSS feed (or whatnot) that keeps me up to date on changes to modules on my watchlist (likely, modules I have installed or I have as a prerequisite to my releases), so a machine parser for Changes files is something I'm interested in. I'm definitely in the "free format text file" camp. I'm not really against POD, but I'm against human-unreadable formats like RDF, because IMO the Changes file is still mostly for humans to consume and mostly written by humans too. At least my releases "should" be

  • I'd also like a machine-readable Changes file.

    There's been talk [perl.org] about a Changes.yml, or maybe Changes (which has no defined format anyway) could be in YAML itself.

    The obvious question is which YAML schema to use.

    • Thanks for the link ... I saw this article but totally forgot about it.

      Well, even though a machine-readable Changes file is definitely a good step, it still won't fix this exact problem: even if you use YAML as the format, the content of each change entry (as discussed in the link) is still free text.
  • Aaron's use of RDF/N3 that you linked to is fascinating. He's integrated Dublin Core [dublincore.org] terms with DOAP [usefulinc.com] plus a "changefile" vocabulary of his own invention (although the schema for it seems yet to be written).

    The use of RDF as a change format has numerous benefits. A large amount of work has already been put into constructing metadata vocabularies in it, which saves us from reinventing a big wheel with some crufty format based on YAML (as acme pointed out [perl.org], YAML is a failed format). RDF is also expressible in diff

    • I agree that it's a great format, but for most people it's painful to write.

      I'm also interested in the way microformats solve the RDF pain in XHTML: they use a rough standard of CSS class names, link@rel and so on, so that the XHTML can be automatically translated into RDF later. Can we take a similar approach here?
  • If this ever comes true, I would also like to see requirement changes in this file, e.g. whether the distribution has a new dependency on another module, or whether the minimum perl requirement increases.
    • Yeah, but isn't that something we can programmatically generate based on META.yml changes?
      • Yes, you're right. Unless the author forgot to put the requirement into META.yml in time. And changing an existing distribution is not possible.
        • s/META.yml/Makefile.PL/ maybe, because META.yml is (and can be re-) generated from Makefile.PL REQUIRES parameters.
  • I've written a prototype, something to serve as a basis for discussion. Module-Changes should find its way to a CPAN mirror. Soon. I've also written a journal entry [perl.org] about it.
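
The idea from the requirements thread above, generating dependency changes by comparing metadata between two releases, could be sketched roughly as follows. It assumes the two `requires` mappings (module name to minimum version string) have already been loaded from each release's META.yml; the function name and the sample data are invented for illustration.

```python
def diff_requires(old, new):
    """Compare two requirement mappings and report what changed.

    `old` and `new` map module names to minimum version strings.
    Versions are compared as plain strings, not as version numbers.
    """
    added = {m: v for m, v in new.items() if m not in old}
    removed = {m: v for m, v in old.items() if m not in new}
    changed = {
        m: (old[m], new[m])
        for m in old.keys() & new.keys()
        if old[m] != new[m]
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Invented sample data for two releases of some distribution.
    old = {"perl": "5.006", "URI": "1.30"}
    new = {"perl": "5.008", "URI": "1.30", "YAML": "0"}
    print(diff_requires(old, new))
```

As noted in the thread, this only helps when the author actually declared the requirement in the metadata of both releases.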