Stories
Slash Boxes
Comments
NOTE: use Perl; is on undef hiatus. You can read content, but you can't post it. More info will be forthcoming forthcomingly.

All the Perl that's Practical to Extract and Report

use Perl Log In

Log In

[ Create a new account ]

chromatic (983)

chromatic
  (email not shown publicly)
http://wgz.org/chromatic/

Blog Information [technorati.com] Profile for chr0matic [technorati.com]

Journal of chromatic (983)

Friday November 21, 2008
03:22 AM

Automatic Variables in a GCd Language

[ #37919 ]

One of the persistent bottlenecks in Parrot is that its naive, simple-to-describe stop-the-world mark-and-sweep garbage collector can be expensive. Even though it's mostly O(2n), there's one characteristic of almost every program which makes n far larger than you might want it to be: most GCable elements have short-to-very-short lifespans. That's even true in Perl.

Cleverer GCs use tricks such as memory partitioning or generations to handle this problem. When you need more memory, you first try to reclaim the youngest GCable elements, on the theory that most of them have very short lifespans. Everything that survives that collection gets promoted to an older generation or copied to an older partition which you sweep less frequently. Mark and sweep, at least as currently implemented in Parrot, always walks the entire live set of GCable elements every time the GC runs to completion. Even if a PMC anchored in the root set has survived the last thousand GC runs, it still gets marked during every run.

We haven't fixed that yet. (Replacing a garbage collector in a running system not exactly designed for pluggability of garbage collectors is not an easy task.) Andrew Whitworth's proof of concept for Google's Summer of Code 2008 gave us a lot of good information to improve our GC, but that task is still ahead of us.

If you've programmed in C or C++, you're familiar with automatic variables. These get allocated on the stack -- int x; char y;. If you've ever returned a pointer to a stack-allocated value, you've had fun debugging the results. (Parrot r32065 fixed an amusing bug related to this in Parrot.)

Heap variables you allocate yourself with malloc/calloc and must explicitly free yourself. The advantage here is that you can pass pointers to them with impunity, no matter where you are in the call graph with regard to the stack location where you allocated them.

The memory characteristics of individual programs vary, but it's likely that a program uses many more stack variables than it does heap variables. (I use the terms in a metaphorical sense. What I really mean is "Variables that don't escape the current compilation unit versus variables that do", but I don't want to talk about escape analysis.)

I don't know of many garbage collectors which can distinguish between stack-style GCables and heap-style GCables -- again, not without doing clever tricks with escape analysis. This is one place where I do wish sometimes we had a tiny kernel with the VM bootstrapped atop; the semantics of C just don't let you express or analyze the memory characteristics of the program in any way that lends itself to automatic lifetime tracking.

It's also difficult to reason from the code in general about the expected lifetime of a GCable element. Every PMC or STRING returned from a C function has effectively escaped, and we can't predict what any future consumer of that API will do with the results.

Like many problems which are unsolveable in general, there are specific exceptions. Sometimes you can read the code and verify that you have what would be in C a stack variable -- even if it's a GCable entity.

I added a pair of functions in Parrot r32313 called temporary_pmc_new and temporary_pmc_free. The documentation has quite a few caveats that I won't explain (figuring that they're scary enough that anyone who doesn't understand their implications should not use them). Basically, they're a way of creating a new PMC for the singular case where you know it's specific and entire lifespan.

I mention this to say that I teased a nice 5.14% speedup out of our multidispatch algorithm today by turning two PMCs into temporary PMCs. Parrot r32963 shows how easy the patch was to write, once I had identified the hotspot and verified that this was an appropriate optimization.

The ultimate solution is to improve our garbage collector so that it can recycle lots of young garbage very quickly, but I'll take a 5% optimization for now if I can spend two or three minutes coding and ten minutes reviewing profiling runs.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
 Full
 Abbreviated
 Hidden
More | Login | Reply
Loading... please wait.
  • Somehow, this reminds me of C++'s auto_ptr. But I guess having a "auto_ptr"-like PMC doesn't improve matters, as that itself would be a GCable entity.
    • Too much refcounting and you're back to Perl5...

      Instead, I imagine an upgrade-on-assign feature where the VM turns stack PMCs into heap PMCs the first time they're assigned to a field of another heap PMC. If they never get assigned, then you know it's safe to free them when the stack is unwound. That would be really hard to implement, I suspect.

      • Yes and no. To make that work correctly, we'd need some sort of macro for assignment, where if you're not using it, it's obvious you're doing something wrong. (I wonder how much work it would be to teach extension writers how to use it correctly.)

        That doesn't necessarily solve the "What if I return this from a C function?" question.

        That also requires us to figure out some way to move PMC headers, where we can allocate a new header from a heap-style pool and copy the contents of the old header into the new

        • To answer your last question, your assignment macro would have to look at both source and destination. If one is stack and one is heap, upgrade the stack one to heap. Even uglier, and a performance hit on every assignment to boot...

          So never mind. :-)

        • To make that work correctly, we'd need some sort of macro for assignment

          Or STM.

          • It's Monday morning after a four day weekend here, so I might not be at full brainpower, but I can't immediately see how that would work. (I don't doubt that it could; I'd just like a description, if you have the time.) Thanks!

            • Well, you would know what memory had changed during a transaction, right? You could just look at it and check any pointers to find out whether anything needs to be promoted from stack to heap.

              It’s almost certainly not practical. It just struck me that STM could solve the problem in a 100% reliable way (since no special incantation such as an assignment macro would be required). That would push the GC up a whole bunch of layers from the metal of the VM, which might not even be workable. I dunno. It jus

  • Nice work. I always enjoy reading about your adventures.

    It seems like it might be straightforward to add a flag to PIR to let authors mark stack variables. But that seems a little riskier because we'd need to be sure the free happened in all exit paths, including exception handling.

    • It's riskier than that, because it's very difficult in PIR to be sure that the stack variable doesn't get stashed or cached somewhere any time you use it with an opcode.

      I can know that for medium-sized programs, because I'm familiar with PIR, with the opcodes, and with the Parrot backing code, but I don't want to add features that are easy to misuse even if you've read and understood the internals.

      • Good point. I tend to think of PIR as analogous to machine code, but it's obviously much higher level than that.