Matts (1087)

I work for MessageLabs in Toronto, ON, Canada. I write spam filters, MTA software, high performance network software, string matching algorithms, and other cool stuff mostly in Perl and C.

Journal of Matts (1087)

Thursday April 09, 2009
12:57 PM

Perl on LLVM

[ #38786 ]

There was recently some talk on p5p about getting perl up and running on LLVM. This followed the recent excitement in the Python crowd about the Unladen Swallow project and, to a lesser extent, the MacRuby Experimental Branch.

So, following that post, I decided to see how easy or hard it was to get to the first stage: getting perl compiled and running with clang, LLVM's gcc-like C compiler.

It wasn't too hard (a lot of compiling). After I got everything running I first ran perlbench, which looked reasonably promising:

                         gcc    llvm
                         ---    ----
arith/mixed              100      86
arith/trig               100      86
array/copy               100     101
array/foreach            100      92
array/index              100      93
array/pop                100      96
array/shift              100      95
array/sort-num           100      89
array/sort               100     101
call/0arg                100     102
call/1arg                100      89
call/2arg                100      75
call/9arg                100      89
call/empty               100      87
call/fib                 100      90
call/method              100      98
call/wantarray           100      89
hash/copy                100      95
hash/each                100      94
hash/foreach-sort        100      97
hash/foreach             100      91
hash/get                 100      91
hash/set                 100      89
loop/for-c               100      86
loop/for-range-const     100     111
loop/for-range           100     116
loop/getline             100      96
loop/while-my            100      94
loop/while               100      96
re/const                 100      86
re/w                     100      89
startup/fewmod           100      95
startup/lotsofsub        100      93
startup/noprog           100     101
string/base64            100      89
string/htmlparser        100      92
string/index-const       100      81
string/index-var         100     108
string/ipol              100     103
string/tr                100      86
AVERAGE                  100      93
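
As a rough cross-check on the AVERAGE row, here is a minimal Python sketch (not part of the original benchmark run) that parses rows laid out like the table above and reports the arithmetic mean, lowest, and highest score in the llvm column. The function name `summarize` is hypothetical, only a four-row excerpt of the table is used, and the idea that perlbench's AVERAGE is a plain arithmetic mean is an assumption, although it does agree with the figures shown.

```python
def summarize(table_text):
    """Parse '<name> <gcc> <llvm>' rows and summarize the llvm column."""
    scores = {}
    for line in table_text.strip().splitlines():
        parts = line.split()
        # Skip headers, rules, and anything else without two numeric scores.
        if len(parts) != 3 or not parts[1].isdigit() or not parts[2].isdigit():
            continue
        # The AVERAGE row is a summary, not a benchmark, so leave it out.
        if parts[0] == "AVERAGE":
            continue
        scores[parts[0]] = int(parts[2])
    mean = sum(scores.values()) / len(scores)
    lowest = min(scores, key=scores.get)
    highest = max(scores, key=scores.get)
    return mean, lowest, highest


# Four-row excerpt of the table above (gcc is the 100 baseline).
EXCERPT = """
arith/mixed              100      86
array/copy               100     101
call/2arg                100      75
loop/for-range           100     116
"""

mean, lowest, highest = summarize(EXCERPT)
print(f"mean={mean:.1f} lowest={lowest} highest={highest}")
```

Feeding in all 40 rows instead of the excerpt gives a mean of about 93.4, which is consistent with the 93 reported in the AVERAGE row.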

So the next step was to try some more real-world code. I took 41k non-spam mails and ran SpamAssassin on them (using the mass-check tool) with no network tests enabled, and with HTML::Parser also compiled with LLVM (and with gcc for the gcc run).

Results of the timings:

real    40m56.599s
user    64m44.586s
sys     0m59.644s

real    45m38.831s
user    71m14.218s
sys     1m20.882s

So rather less promising.
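
To put a number on the gap between the two runs, here is a small Python sketch (my own arithmetic, not something from the original test setup) that converts the `time` values above to seconds and prints the relative change from the first run to the second. The helper names `to_seconds` and `percent_change` are hypothetical, and the runs are simply taken in the order they are listed.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '40m56.599s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def percent_change(first, second):
    """Relative change from `first` to `second`, in percent."""
    return (to_seconds(second) / to_seconds(first) - 1.0) * 100.0


# The two runs, in the order they appear above.
first_run = {"real": "40m56.599s", "user": "64m44.586s", "sys": "0m59.644s"}
second_run = {"real": "45m38.831s", "user": "71m14.218s", "sys": "1m20.882s"}

for key in ("real", "user", "sys"):
    change = percent_change(first_run[key], second_run[key])
    print(f"{key}: {change:+.1f}%")
```

By this arithmetic the second run took roughly 11% more wall-clock time and about 10% more user CPU time than the first.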

Still, an interesting start. See the original link for information on where it needs to go from here. I think this might have a lot of mileage if the actual internals were ported to LLVM-style code. If someone is interested in picking up this project, and maybe being paid for it, please get in touch.

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • I deliberately tried the non-benchmark of running the core regression tests, because I was curious about process startup as well as steady-state costs, and found something similar: current clang is slower than current gcc. I tried to install gcc-llvm from macports, in the hope that it would be a fairer comparison, but the port failed to compile, so I've not (yet) been able to see whether the gcc optimiser via LLVM is better than gcc without the LLVM "overhead", and hence isolate the effects of

    • I also wonder about gcc's new -fcreate-profile/-fuse-profile options. Might be worth a try... But ultimately I think a project like using LLVM has more legs than that one-off optimisation.

  • When I used llvm-gcc I had similar results using non-LLVM linkage.

    Make sure clang emits llvm bytecode so that link time optimizations (which are the most effective) can be run.

    For me, results went from roughly 80-90% of gcc to about 110% on average, IIRC.

    See my earlier post on LLVM, from around June 2008 I think.

  • SpamAssassin speed as an end-result may not be so stellar anyway, if you're using sa-compile. I'm very curious to hear progress though!
  • Can you give us the Configure arguments you used?

    • Hmm, possibly... But I went through it interactively... I just edited the for gcc (with -Os optimisation) and replaced all instances of "cc" with "clang" and it "just worked".