All the Perl that's Practical to Extract and Report

Journal of chromatic (983)

Wednesday October 22, 2008
11:46 PM

Maybe We're Doing Concurrency Wrong in VMs

[ #37716 ]

I want to have multiple units of work scheduled and running in parallel with well-defined separation but specific and well-declared points of communication between them. I want preemption, memory protection, an abstract view of the underlying machinery and hardware, statistics, tuning, and a sane extension mechanism. Why does that sound familiar? Maybe it reminds me of something else you might find in computer science.

  • What are Operating Systems?

    - Stevan

    • Nailed it in one. I haven't figured out how an optimizer and a JIT work with this approach, however.

      • Then you haven't looked at the AS/400 operating system.

        It works like this. There is a virtual machine that sits on the regular machine. The operating system sits inside the virtual machine, as do the programs. When programs run that have not been optimized, they get recompiled to machine code, but this action and its result are not visible from inside the virtual machine.

        • Thanks for the reference. I think it might be different though because I don't want to replace the operating system, and because I'm considering concurrency pervasive even at the expression level.

          • While that is a cool abstraction, I think that for real performance, that is the wrong direction to go. There is a non-trivial minimum overhead to having a new process, and at some point at the expression level you'll find that parallelizing introduces more overhead than you're saving.

            For parallelizing "simple" code what you really, really would like to do is to turn it into vector operations that can operate in parallel on blocks of data. And then you want to try to offload that to a specialized vector p

            • There is a non-trivial minimum overhead to having a new process...

              At the OS level, sure -- you need a new process struct, new page tables (even if only to mark pages as COW), a new stack pointer, a new register backing store, and new process-local storage.

              You don't need most of that in a VM, and you don't have that strict kernel/userspace divide. If you're clever, you can even use a smart generational garbage collector where most objects are free.

              That strategy is not appropriate for general purpose code...

              • In the AS/400 operating system they don't need a lot of that stuff either. :-) Seriously, why not have every process use the exact same address space when there is no possible way to address a memory location that you haven't been given access to?

                But that said, you still do need to have overhead. At a minimum you need to keep track of all of the "processes", be able to schedule them, and remember when 2 parallel pieces share a variable versus not sharing a variable. (If you parallelize a loop, and a lexi

              • The Haskell people tried parallelism at the expression level, which works even better for them because of side effects containment, but they found just what Ben said. The problem is that strong data dependencies between nearby locations permeate code at the micro scale while introducing potential for debilitating overhead. The algorithm really has to be designed with parallelism in mind, as Ben said; what the language can do is provide a very easy way to declare a desire for parallelism explicitly while hid

                • Did the Haskell people write a paper about that? I'd like to read it. (You can solve more of the data dependency problem if you use a register machine and use a smart register allocator to identify dependencies... though the algorithm to calculate expected call graphs through a compilation unit may be somewhat hairy.)

                  • I’m afraid I do not know. I got that from a talk I watched about concurrency in Haskell. Unfortunately I do not remember where I saw the link, or when or where it was given, or by whom, and I hardly remember any of the details of the concurrency sugar they created and why that design was chosen. These assertions about no free concurrency lunch (which I think the speaker briefly went into based on an attendee question, not as part of the prepared talk) were the only bit that really stuck with me.
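
For anyone who wants to poke at the overhead argument in this thread, here is a minimal Python sketch. It is not anyone's actual VM design; the function names (`square`, `run_serial`, `run_per_expression`, `run_chunked`, `timed`) are made up for illustration, and a thread pool is only a stand-in for whatever lightweight task mechanism a VM might provide. It compares one task per tiny expression against coarse chunks of the same work. CPython threads do not run Python bytecode in parallel, so this shows scheduling cost only, not speedup.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Stand-in for a "simple" expression: almost no work per call.
    return x * x


def run_serial(values):
    # Baseline: a plain loop with no scheduling at all.
    return [square(v) for v in values]


def run_per_expression(values, workers=4):
    # One scheduled task per expression. Each task carries queueing and
    # bookkeeping cost that is large compared with the multiply itself.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))


def run_chunked(values, workers=4):
    # One scheduled task per block of data, so the per-task cost is paid
    # only a handful of times (roughly once per worker).
    size = max(1, len(values) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(run_serial, chunks)
    return [y for part in parts for y in part]


def timed(fn, values):
    # Return the result together with the elapsed wall-clock seconds.
    start = time.perf_counter()
    result = fn(values)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    data = list(range(100_000))
    for fn in (run_serial, run_per_expression, run_chunked):
        _, seconds = timed(fn, data)
        print(f"{fn.__name__}: {seconds:.4f}s")
```

Timings vary by machine, so none are quoted here. The chunked version is closer in spirit to the "operate on blocks of data" suggestion above than to expression-level scheduling.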