
All the Perl that's Practical to Extract and Report


Alias (5735)

http://ali.as/

Journal of Alias (5735)

Friday March 12, 2010
12:26 AM

Making threads suck less in Padre

[ #40241 ]

Yes, threads suck. And in several different ways. Slow, memory-bloating, etc etc etc.

However, threads work.

Installing them from the CPAN on Windows, Mac and Linux works.

Using them as they are intended to be used works on Windows, Mac and Linux.

Integration with Wx works on Windows, Mac and Linux.

And using them to saturate a 4 CPU core development machine works, without having to run external processes alongside the threads.

And they work well enough that nobody has yet had an itch strong enough to step up and replace our task management code with something else that can support more than one CPU (and since our task system is derived from Process.pm, it is specifically designed to make it easy to use alternative backends).

So threads work, but they suck. Or at least, interpreter-copy sucks.

And in an IDE scenario, where your process needs to load 50-100 meg of code just to drive the cooler IDE functions, they suck even harder.

Fortunately, Padre has a rather forgiving attitude to things sucking.

We would rather have someone commit something that works and sucks, than not have it committed at all. And Padre is full of all kinds of features that work but suck. And slowly, gradually, they suck less.

The most recent couple of releases have come with stability warnings due to the arrival of our second-generation "Slave Driver" threading model.

The Slave Driver mechanism attempts to specifically contain and reduce the problems associated with threads.

During startup, we load the minimum number of modules required to conduct communication across threads, and then immediately spawn off a master thread which will remain unused while our main thread continues onward and loads up as much code as it likes.

Later on, when we need to do background work, this master thread is then further cloned into a slave thread which will bloat out incrementally, loading only the code it needs to execute the task.
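The spawn-early, clone-later idea described above can be sketched roughly as follows. This is a minimal illustration and not Padre's actual code: the names `master_loop` and `run_task`, the use of Thread::Queue for cross-thread communication, and Digest::MD5 as the "code the task needs" are all assumptions made for the example, and POSIX merely stands in for the heavy code the main thread loads later.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical queues: main -> master for requests, slaves -> main for results.
my $requests = Thread::Queue->new;
my $results  = Thread::Queue->new;

# Runs inside a slave thread and loads only what this task needs.
sub run_task {
    my ($text) = @_;
    require Digest::MD5;    # loaded in the slave, not in the master
    $results->enqueue(Digest::MD5::md5_hex($text));
}

# Runs inside the master thread: clone a slave for every request.
sub master_loop {
    my @slaves;
    while (defined(my $text = $requests->dequeue)) {
        # The slave is cloned from the lean master interpreter,
        # not from the (by now much larger) main thread.
        push @slaves, threads->create(\&run_task, $text);
    }
    $_->join for @slaves;
    return scalar @slaves;
}

# 1. Spawn the master while very little code has been loaded.
my $master = threads->create(\&master_loop);

# 2. The main thread carries on and loads whatever it likes.
require POSIX;    # stand-in for Wx and the rest of the IDE

# 3. Later, hand background work to the master.
$requests->enqueue('hello', 'world');
$requests->enqueue(undef);    # undef tells the master to shut down
my $spawned = $master->join;

my @digests = sort map { $results->dequeue } 1 .. $spawned;
print "$_\n" for @digests;
```

Because the interpreter is copied at the moment a thread is created, anything the main thread loads after step 1 never reaches the master or its slaves.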

While the slave driver mechanism itself landed a few weeks ago, the final step of pushing the master spawn point up into the start-up code just landed today.

The result of this is a change in our per-thread cost from 34 meg per thread to about 20 meg per thread (for a total reduction in memory in a low-usage case from 90 meg to 60 meg).

If not for the fact that loading Wx is an all or nothing proposition, we could probably cut this in half again. And while a collection of half a dozen 35 meg threads (about the maximum we are likely to need to saturate 4 CPU cores) in a thread pool is a nasty amount of RAM, even for an IDE, half a dozen 10 meg threads is a lot closer to a tolerable memory cost.

And if we can drop this further to around 5 meg, we get close to the memory cost of a forking/process model, which is the only other parallelism model available in the short term that supports many cores transparently.
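To put the pool figures above side by side, here is a small worked calculation. It assumes the cost is simply the number of threads times the per-thread figure, ignoring the main thread and any memory the operating system manages to share; `pool_cost` is a hypothetical helper, not part of Padre.

```perl
use strict;
use warnings;

# Total memory (in meg) for a pool of identical threads.
sub pool_cost {
    my ($threads, $meg_per_thread) = @_;
    return $threads * $meg_per_thread;
}

my $pool_size = 6;    # "half a dozen" threads
for my $meg (35, 20, 10, 5) {
    printf "%d threads x %2d meg = %3d meg\n",
        $pool_size, $meg, pool_cost($pool_size, $meg);
}
```

Under that assumption the pool goes from 210 meg at 35 meg per thread to 60 meg at 10 meg per thread, and to 30 meg at 5 meg per thread.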

  • "If not for the fact that loading Wx is an all or nothing proposition"

    There is a patch to change that in a wxPerl branch; it should work but needs some real-world testing. I think Steffen was waiting for the Padre changes above before testing it.

    • Well, it was actually a bit different. I implemented the SlaveDriver bit because it's a precursor to being able to use the wx-threading-only branch. I got sidetracked fixing the problems the new approach created for Padre. (Andrew Bramble did a lot of the fixing, actually!) Then, I got sucked into other work.

  • "And if we can drop this further to around 5 meg, we get close to the memory cost of a forking/process model, which is the only other parallelism model available in the short term that supports many cores transparently."

    Yes, but it's not really available because it's not portable to win32. Besides: A lot of the memory you save with the fork() copy-on-write may eventually be copied when accessed by Perl. Pseudo-code:


    my $foo = 1.;
    fork();
    # Right after fork(), the $foo SV is still shared (copy-on-write)
    if ($foo == 1) {}
    # Merely reading $foo can write to the SV, so it is likely not shared any more
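The effect behind this can be observed without forking at all. The sketch below is a variant of the pseudo-code above, not the commenter's own code: it assumes a string scalar `"1"` (rather than `1.`), and uses the core B module to read the scalar's internal flag bits before and after a numeric comparison. `sv_flags` is a hypothetical helper name.

```perl
use strict;
use warnings;
use B ();

# Return the raw internal flag bits of the scalar behind a reference.
sub sv_flags { return B::svref_2object($_[0])->FLAGS }

my $foo    = "1";               # a string scalar (assumed variant)
my $before = sv_flags(\$foo);
if ($foo == 1) { }              # looks like a pure read
my $after  = sv_flags(\$foo);   # perl has cached the numeric value in the SV

printf "flags before: 0x%08x\n", $before;
printf "flags after:  0x%08x\n", $after;
print "The SV was modified by reading it\n" if $before != $after;
```

If the flags change, the comparison wrote to the scalar's memory, which in a forked child would force the operating system to copy the page holding it.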

  • PS: ithreads aren't slow per se. It's just the communication between them that sucks. Just like communication between processes...

  • I have been prototyping a new threading system that's modeled after Erlang: threads share nothing with each other and can only communicate through a channel. I think that would be a better fit for background processes. You can find a rather old version on CPAN. I'm planning to push a largely rewritten version to GitHub as soon as it's remotely functional again.