Recently on Perl-QA there's been a debate both rather lengthy (going on for well over a week) and a touch acrimonious. It's actually been rather civilized and the acrimony has been more in the form of blunt comments, but it boils down to people still complaining about the "CPAN" problem:
The CPAN is huge and hard to evaluate
They weren't really talking about this problem. They were talking about the tools surrounding it. In particular, there were sharp disagreements about the utility of CPANTs, with many people pointing out how things can be taken out of context. For example, both Damian Conway and Mark Jason Dominus have relatively low CPANTs scores and this certainly says something about their distributions. Depending on whether you're experienced in the Perl community or new to it, you could draw drastically different conclusions as to what this "something" is.
Still, when you look at AnnoCPAN, cpanratings, CPAN RT, CPANTs, CPAN testers, the module itself, its competition, documentation, etc., you can quickly get overloaded with the bewildering array of choices (Class:: namespace, anyone?). I think, though, that these attempts to provide more information are good and people are misunderstanding the core problem.
Many people complain that the CPAN is huge and hard to evaluate, but that's not the problem. The fact that the CPAN is in the state it is in is largely because it's reflecting the real world. Let's try a little experiment: if you know nothing about content management systems, but you want the "best" one to run use.perl and your boss tells you it must not be slashcode, what do you choose? I'll wait. Tell me how long it takes you to choose the "best".
"But wait!", you protest. "What does 'best' mean?" Is it cost? Is it programming language? Is it ease of use once learnt? Is it easy of learning? Is it ease of setup? How much maintenance will it entail? Who else uses it? How long has it been around? What's its history vis-a-vis security issues? Does it rely on technologies that our IT department won't support? Do we have to change anything internally to use it? And the list goes on and on and on
You see, in the real world, when you choose software, whether you've written it yourself or not, you have a set of requirements and you have to evaluate the software against your needs. This is often very hard. CPAN, thus, mimics the real world to a certain extent.
There is an interesting difference, though. As a general rule, CPAN authors and those creating collaborating informational sites are interested in providing you with real information, not marketing spin. This can be a huge win, with the caveat that the author may not know what real information is needed.
What we really need is to incorporate one-stop shopping for these various resources to give us an ability to evaluate them in context. CPAN distributions need "tags" which people should be able to upvote or downvote based on appropriateness. Thus, even though "File::Find::Rule::XPath" might get returned by a search for "xpath", narrowing your search with "xml" would lower the likelihood of getting FFRX: even if it's been mistagged with "xml", people would downvote said tag and thus reduce its weight.
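To make the tag idea concrete, here's a rough sketch of how vote-weighted tags could affect ranking. Everything in it is my own assumption for illustration: the `Distribution` class, the `search` function, the scoring rule (net votes per tag, floored at zero, summed over the search terms) and the vote counts themselves are all made up. No such service exists in this form.

```python
from dataclasses import dataclass, field


@dataclass
class Distribution:
    """A distribution with community-voted tags (hypothetical model)."""
    name: str
    # Maps tag -> [upvotes, downvotes]
    tags: dict = field(default_factory=dict)

    def vote(self, tag, up=0, down=0):
        """Record some number of up and down votes for a tag."""
        counts = self.tags.setdefault(tag, [0, 0])
        counts[0] += up
        counts[1] += down

    def tag_weight(self, tag):
        """Net votes for a tag, floored at zero so a heavily
        downvoted tag simply stops contributing to the score."""
        up, down = self.tags.get(tag, (0, 0))
        return max(up - down, 0)


def search(dists, terms):
    """Return distribution names ranked by summed tag weight.

    Distributions with no positive weight for any term are dropped.
    """
    scored = []
    for dist in dists:
        score = sum(dist.tag_weight(term) for term in terms)
        if score > 0:
            scored.append((score, dist.name))
    # Highest score first; ties broken alphabetically for stable output
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Illustrative vote counts only; these are invented numbers.
ffrx = Distribution("File::Find::Rule::XPath")
ffrx.vote("xpath", up=5)
ffrx.vote("xml", up=1, down=6)   # mistagged, then downvoted

libxml = Distribution("XML::LibXML")
libxml.vote("xpath", up=4)
libxml.vote("xml", up=10)

dists = [ffrx, libxml]
by_xpath = search(dists, ["xpath"])
by_xpath_xml = search(dists, ["xpath", "xml"])
```

With these invented numbers, FFRX comes first for "xpath" alone but falls behind once "xml" is added, because its downvoted "xml" tag contributes nothing. A real system would need a smarter scoring rule than this one, but the principle is the same.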
With that, we need solid APIs for the other services so that the consumer, once presented with a set of choices, can consider how well a module is maintained, last release date, bug count, annotations, etc. Which information is important? How the heck can I know? It's often subjective, anyway, and you and I could reasonably disagree. Give 'em all the information we easily can, regardless of our personal opinion of its worth.
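As a sketch of what "solid APIs" would buy us, here's one way an aggregator might pull everything into a single report. The fetcher functions below are stand-ins I've invented; they don't correspond to any real service's API, the distribution name is a placeholder, and the values they return are dummy data. The point is only the shape: one callable per source, everything merged under one roof, and a failing source reported instead of hidden.

```python
def aggregate(dist_name, sources):
    """Collect whatever each source can tell us about a distribution.

    `sources` maps a label to a callable taking the distribution name.
    Nothing is filtered or ranked here; the reader decides what matters.
    """
    report = {"distribution": dist_name}
    for label, fetch in sources.items():
        try:
            report[label] = fetch(dist_name)
        except Exception as exc:
            # Keep going if one service is down, but say so in the report
            report[label] = {"error": str(exc)}
    return report


# Hypothetical fetchers returning dummy data, standing in for real services.
def fetch_ratings(name):
    return {"average": 4.0, "reviews": 2}


def fetch_bugs(name):
    return {"open": 3}


def fetch_test_results(name):
    raise ConnectionError("service unavailable")


sources = {
    "ratings": fetch_ratings,
    "bugs": fetch_bugs,
    "testers": fetch_test_results,
}

report = aggregate("Some::Module", sources)
```

Adding another source is just another entry in `sources`, which is about as much coupling as I'd want between the front end and the people running the individual tools.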
CPAN itself is probably not the best place for this (though by default "search.cpan.org" is a poor man's version of what I envision). Instead, CPAN should remain the canonical repository, with a front-end tool on top which allows intelligent searching, filtering, listing of relevant articles and aggregation of all other relevant data that we can think of. It should merge the available information in one spot to ease the burden on the poor author.
This can be done, but whether it will be done is another story. I think, however, that this is what we need. I don't want more restrictions on the CPAN. I don't want to tell people developing external informational tools how they should do their stuff. I want one-stop shopping for most of this information, and I want to decide for myself what's important. This would ease one of the biggest burdens we have.