The whole META.yml thing is really something of a cluster-fuck. In retrospect it was probably a bad idea to base our key piece of machine-to-machine metadata on a brand new format that was still in flux and didn't have a proper specification.
Problems still persist today. Most of them follow the theme of "In theory, theory is the same as practice, but in practice it isn't".
While in theory, META.yml is YAML, in practice, it isn't.
I've listed some of the issues below, in general moving from theory problems to practical problems.
1. YAML (still) doesn't exist yet.
The "specification" is still a pre-release. YAML is still effectively not "done", although the situation is far better now than in the past. Even as late as the 1.1 specification, the specification was based on an implementation (in the case of 1.1 I believe it was based on libyaml, or YAML::XS as we know it).
2. The META.yml specification specifically mandates incorrect (albeit accidentally valid) YAML.
The META.yml specification clearly specifies a YAML "header" in the form
The first line of a META.yml file should be a valid YAML document header like "--- #YAML:1.0".
The problem here is that this isn't a "full" YAML header, it's just a document header with a comment. That YAML fragment hasn't been legal since way back when YAML was first invented.
According to the YAML specification, the above fragment should actually look like this.
We can also see that META.yml is based on the "1.0" specification of YAML (which, of course, was basically a "final draft alpha"), and which is now deprecated to the point of being basically useless. YAML has been in this non-final final-draft phase for 5 years.
And of course, even the 1.0 specification of YAML didn't use #YAML:1.0, it specifies the use of "%YAML:1.0" (which is of course different to the current "final draft").
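To make the distinction concrete, here is a minimal sketch of how a tool might classify the first line of a META.yml file. The function name and the category labels are hypothetical, invented for illustration, and the check only looks at the line's prefix; it is not a YAML parser and does not validate anything against any version of the specification.

```python
def classify_header(first_line: str) -> str:
    """Classify the first line of a META.yml-style file (illustrative only).

    Returns one of four hypothetical labels:
      "directive"           - line starts with a %YAML directive
      "marker-with-comment" - a '---' document marker followed by a # comment
      "marker"              - a bare '---' document marker
      "none"                - anything else
    """
    line = first_line.rstrip("\r\n")
    if line.startswith("%YAML"):
        return "directive"
    if line.startswith("---"):
        tail = line[3:]
        # A comment has to be separated from the marker by whitespace.
        if tail[:1].isspace() and tail.lstrip().startswith("#"):
            return "marker-with-comment"
        if tail.strip() == "":
            return "marker"
    return "none"


if __name__ == "__main__":
    # The header mandated by the META.yml specification is only a
    # document marker plus a comment, not a version directive.
    print(classify_header("--- #YAML:1.0"))
    print(classify_header("%YAML:1.0"))
```

Under these assumptions the mandated "--- #YAML:1.0" line is reported as a marker with a comment, which is the point made above: a parser is free to ignore the "version" entirely.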
As an aside, it's this odd #YAML syntax that YAML::Syck "corrects" by silently and destructively modifying the original parse string. But that's another story (look for the nosyck test-skips in the YAML::Tiny test suite). I probably should just file a bug for that.
3. There are no YAML parsers
To my knowledge, nobody has managed to write a specification-compliant YAML parser yet. libyaml doesn't count, because the (1.1?) specification is based on it, rather than the other way around.
This isn't for lack of trying of course. Ingy alone has managed to release four different YAML parsers on the CPAN.
I've done one and a half myself (YAML::Tiny + Parse::CPAN::Meta), and I was only willing to have a shot at it by stripping out most of the junk to derive a "YAML Tiny" specification.
The one saving grace here is that all the YAML parsers have at least managed to maintain the same API for non-streaming reading and writing, so they are largely interchangeable.
4. There is no YAML support in the Perl core.
The configure_requires fix for the CPAN toolchain dependency defect requires a META.yml parser in the core before it can be considered comprehensively "complete".
This is the main strategic reason behind the creation of YAML::Tiny: to deal with small and straightforward YAML'ish files by defining a minimum usable subset and just parsing that, without buying into all the bigger formatting, specification and other problems.
To my knowledge, none of the "YAML Parsers" meet the standards of the P5P team for inclusion in the Perl core, leaving the YAML::Tiny-based Parse::CPAN::Meta to fill the gap.
And of course, worse, once 5.10.1 is out and Parse::CPAN::Meta is in the core, it then becomes the "official" way to parse META.yml, which locks META.yml in to being based on the "YAML Tiny" specification, rather than the full YAML specification (whether the META.yml specification likes it or not).
5. YAML is (debatably) not a superset of JSON
The main people who consider JSON to be a subset of YAML are the YAML team.
6. META.yml doesn't really support Unicode
I've been saying for years that the back-compatibility period we should be targeting is about 10 years. While nobody (including myself about half the time) wants to crystallise this as policy, it does more or less turn out to be correct. We recently dropped back-compatibility for 5.005, and from perlhist you can see the release date of 5.005_03, the member of the 5.005 series that was more or less the primary target of back-compatibility.
We now target 5.6.1 (or thereabouts) which was released around 8 years ago, and with 5.6 usage still at around 10% or so, I can imagine we'll hold at this level for another year or so before we move on to 5.8.1.
Since Unicode wasn't truly rock solid until 5.8.5 (2004-Jul-19), this still leaves us with a few years yet until Unicode is truly universal. Based on past trends, I'd expect to see Unicode become truly compulsory around 2011/12.
Until then, this leaves us with a problem: META.yml files containing Unicode will need to be readable by Perl versions (and toolchains) without proper Unicode support.
So what the hell do we do?
Personally, I'd like to see a two-phase approach.
In the first phase, we just accept that META.yml has issues and work around them. Clarify the META.yml specification to be based on the YAML Tiny subset of YAML, without support for Unicode characters (or at least discouraging them).
In the second phase, some time around Perl 5.10.3 or 5.12 or wherever, make Perl 5.8.5 the oldest supported version of Perl, and make a new META.json based on the actual JSON spec. Write a JSON::Tiny (or equivalent) and put it into the core.
Have PAUSE generate a synthetic META.json file based on META.yml for every existing distribution without support for it, and store it outside the tarball (in the same way we store README outside the tarball).
And, in the meantime, upgrade CPAN(PLUS).pm to support server-pushed minimum CPAN client checking. Which is to say, add support for the clients being able to ask the repository what the minimum-supported Perl/CPAN(PLUS) versions are, so that older clients know they need to upgrade not only the toolchains, but the clients themselves.
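As a rough illustration of that last idea, the check on the client side could be as small as the sketch below. Everything in it is an assumption made up for the example: the function names, the component names "perl" and "client", and the shape of the data the repository would publish. No such feature or format exists in CPAN(PLUS).pm as described here. It also assumes plain dotted-integer versions such as 5.8.5; decimal-style CPAN version numbers would need a different comparison.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted-integer version like '5.8.5' into (5, 8, 5)."""
    return tuple(int(part) for part in text.split("."))


def needs_upgrade(installed: dict, minimums: dict) -> list:
    """Return the sorted names of components older than the advertised minimum.

    installed: hypothetical map of component name -> version the client has
    minimums:  hypothetical map of component name -> minimum the repository
               says it supports
    A component the client does not report at all is treated as outdated.
    """
    outdated = []
    for name, minimum in minimums.items():
        have = installed.get(name)
        if have is None or parse_version(have) < parse_version(minimum):
            outdated.append(name)
    return sorted(outdated)


if __name__ == "__main__":
    # Hypothetical data: what the repository advertises vs. what is installed.
    advertised = {"perl": "5.8.5", "client": "1.9.0"}
    local = {"perl": "5.6.1", "client": "2.0.0"}
    for name in needs_upgrade(local, advertised):
        print(f"{name} is older than the repository's minimum; please upgrade")
```

The design point is only that the minimums come from the repository rather than being baked into the client, so an old client can find out it is too old without first having to be upgraded.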