After seeing the goodies in Tim Bunce's OSCON talk on NYTProf 3, I couldn't resist trying it out, even though half the tests fail in the repository at the moment (especially since half his demos inevitably involve Perl::Critic or PPI and I spotted some curiosities in his video).
I've happy to report that because of the increased detail in the area of xsubs and built ins, I was able to fairly quickly find a situation in which excessive calls via overloading to a centralised xsub boolean true was still consuming far more time than I'd expected.
A few defined() wrappers later to speed up the worst offenders, I managed to pick up somewhere in the vicinity of 1-2% speedup on my sample loads (which are pretty similar to the loads of Perl::Critic). And since I've spent far more time than I care to admit optimising PPI already, anything I can measure as an entire percent or higher is a pretty decent win for me.
PPI 1.206 with just these and no functional changes will be uploaded shortly, and I'm hoping to find some time to tackle some other promising-looking avenues relating to unexpectedly expensive regular expressions shortly.
I also ran the profiling on Perl 5.10.1 RC1 and I'm happy to report that the improvements to UNIVERSAL::isa do seem to make it less expensive (by maybe somewhere between 10-20%). This should provide for another percent or two total speed up on typical loads.