One of the biggest complaints/problems with Moose has always been it's compile time cost. Well with the newest release of Moose (0.33) and Class::MOP (0.49) we have made a major step towards reducing that. With judicious use of XS (konobi++ for that) and some caching in key parts of Class::MOP we have managed to cut Moose load time almost in half, here are the stats from my machine (MacBook Pro w/ 2.4 GHz Intel Core 2 Duo).
Moose 0.32 & Class::MOP 0.48 => 0.46 real 0.43 user 0.02 sys
Moose 0.33 & Class::MOP 0.49 => 0.27 real 0.22 user 0.02 sys
This change also brings allowed us to take advantage of some 5.10 specific improvements as well. In 5.8.* we use the PL_sub_generation interpreter global to determine when to invalidate our method cache, which was one of the big parts of the speed win. In 5.8.* this is incremented every time a package is changed, which means we were probably invalidating our cache even when we didn't need to. Now in 5.10 we are using the mro::get_pkg_gen function, which provides the same feature but increments on a per-package basis, which means less incorrect cache invalidation.
All in all, not bad for a few hours of work by the folks on #moose.
NOTE: I got my stats wrong in the ChangeLogs, I called it a ~45% speed increase, but it is really 45% less slow.