
All the Perl that's Practical to Extract and Report

use Perl

chromatic (983)

http://wgz.org/chromatic/


Journal of chromatic (983)

Wednesday May 06, 2009
04:20 PM

Silly Little Parrot Optimizations

[ #38931 ]

I've spent the past three days profiling Parrot and Rakudo startup times. Christoph Otto and Vasily Chekalkin did some great work on a roadmap item I added a while back -- specifically removing thousands of exported symbols from our shared library. (The more symbols you export, the longer dynamic linking takes.)

As every successful optimization changes the performance profile of your application, I've found some interesting bottlenecks. Some of them are even amusing, in the forehead-slapping "You're such a geek if you find this funny" way.

For example, my most recent trace showed that calculating the correct method resolution order in Rakudo classes created a lot of PMCs. In particular, a section of the algorithm removes an entry from an array by index. (In Parrot vtable terms, this is a delete_keyed_int operation.) The array in question is always a ResizablePMCArray (RPA), roughly akin to your standard Perl 5 array. For some reason, RPA had no specific implementation of delete_keyed_int, which takes a primitive integer value and removes the PMC at that index in the array.

Instead, the RPA fell back to the default implementation of that operation. It takes the primitive integer and constructs a PMC Key from it in a boxing operation. Then it performs the original PMC's delete_keyed operation, which takes a Key PMC and removes the PMC identified by that key in the array.

RPA defines that operation. The first thing it does is to extract a primitive integer value from the Key.

I added a local delete_keyed_int definition and rewrote delete_keyed in terms of delete_keyed_int. Rakudo now starts up 1.34% faster and allocates 9.38% fewer PMCs -- and all of those PMCs are PMCs with very short lifespans that would never survive the first garbage collection run. Avoiding even that is a performance improvement.
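
The round trip described above can be sketched as a toy model. This is Python rather than Parrot's C, and every name in it except delete_keyed and delete_keyed_int (Key as a plain class, DefaultPMC, SlowArray, FastArray, count_key_allocations) is hypothetical, chosen only for illustration. It is not Parrot's actual code; it only counts how many throwaway Key objects each arrangement allocates.

```python
class Key:
    """Boxed integer key; a stand-in for a Parrot Key PMC (illustrative only)."""
    allocations = 0  # counts every Key ever constructed

    def __init__(self, value):
        Key.allocations += 1
        self.value = value


class DefaultPMC:
    """Hypothetical base class holding the generic fallback."""

    def delete_keyed_int(self, index):
        # Fallback: box the primitive integer into a Key,
        # then defer to the keyed variant.
        self.delete_keyed(Key(index))


class SlowArray(DefaultPMC):
    """Defines only delete_keyed, so delete_keyed_int uses the fallback."""

    def __init__(self, items):
        self.items = list(items)

    def delete_keyed(self, key):
        # Immediately unbox the integer that the fallback just boxed.
        del self.items[key.value]


class FastArray(SlowArray):
    """Defines delete_keyed_int directly; delete_keyed delegates to it."""

    def delete_keyed_int(self, index):
        del self.items[index]  # no Key is allocated on this path

    def delete_keyed(self, key):
        self.delete_keyed_int(key.value)


def count_key_allocations(array_cls, deletions=1000):
    """Delete `deletions` elements by integer index.

    Returns (number of Key objects allocated, number of items left).
    """
    Key.allocations = 0
    arr = array_cls(range(deletions))
    for _ in range(deletions):
        arr.delete_keyed_int(0)
    return Key.allocations, len(arr.items)
```

In this model, deleting by integer index through SlowArray allocates one Key per call, while FastArray allocates none. Both end up with the same array contents, which is the point: the boxed Key carried no information that the integer did not already have.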

I estimate that Rakudo starts up nearly 40% faster now than it did when I started on Sunday night. We can get it faster yet.

  • I've spent the past three days profiling Parrot [parrot.org] and Rakudo [rakudo.org] startup times. Christoph Otto and Vasily Chekalkin did some great work on a roadmap item I added a while back -- specifically removing thousands of exported symbols from our shared library. (The more symbols you export, the longer dynamic linking takes.)

    How do you, roughly speaking, go about removing thousands of symbols from a library?

    You said that, due to the local delete_keyed_int, Rakudo starts up 1.34% faster; and you said that Rakudo starts

    • How do you, roughly speaking, go about removing thousands of symbols from a library?

      We removed their external visibility. (Apologies if you know all of this; I think it might be interesting to other readers.)

      When you build a shared library in C, some of the functions and variables are usable by programs which use the shared library -- that's the point of building a shared library. On some platforms you must explicitly mark these functions as externally visible. Windows is an offender here. On most Uni