

Journal of korpenkraxar (9237)

Friday May 29, 2009
05:58 PM

How to compare and benchmark Perl5 and Perl6 performance?

[ #39051 ]
Hi all!

I am a big Perl fan and a biologist who writes and uses Perl 5 daily for my studies of animal evolution. Perl is really big in the bioinformatics field, but I guess you all already knew that. Looking at the novel features of Perl 6, it seems that I and others in my field would enjoy using this version of the language even more :-)

Today I finally got around to downloading and compiling Rakudo Perl 6 and Parrot, which to my pleasant surprise was just as simple as detailed here http://rakudo.org/how-to-get-rakudo/ on my Arch Linux x86_64 box. Your mileage may vary depending on OS, but most Linux distributions make it very easy to install the prerequisites you need to pull this off. You need things like the GNU C compiler, git, subversion and Perl 5, all of which are easily tracked down in your distro's package manager.

To make things clear before I go on: I am not a computer scientist or a professional, full-time programmer. I simply use Perl because it lets me solve lots of data parsing, mining and analysis problems very quickly and close to how I think about the problem. That said, I am of course happy for all the performance optimizations I can get while running my Perl scripts and programs. Speed of execution matters too, just like speed of development, and Perl 5 strikes a very nice balance for the stuff I am doing.

Using Perl 6 as a programming language for day-to-day use in solving problems relevant to me also requires the Parrot interpreter/compiler to produce efficient code. I am unlikely to enjoy the nifty language novelties in Perl 6 if the code is horribly slow compared to what I am used to.

I decided to compare something really simple in Perl 5 and Perl 6 and just increment a number and put that number in the last cell of a growing array.

This is the Perl 5 version:


#!/usr/bin/perl -w

my $i = 0;
my @numbers;

until ( $i == 100000 ) {
    $numbers[$i] = $i;
    $i++;
}


Perl5: ~0.07s to complete, uses 26KB RAM at completion.

This is the corresponding Perl 6 version (bear with me, I do not know much Perl 6 yet so I really just translated the above code with minimal changes):


use v6;

my Int $i = 0;
my @numbers;

until ( $i == 100000 ) {
    @numbers[$i] = $i;
    $i++;
}


Rakudo Perl6: ~1m14s to complete, uses 1.4GB RAM at completion.

In this simple comparison, the Perl 5 implementation is >1000 times faster than Perl 6, which also uses >50000 times more memory than Perl 5.

Ouch.

I do not (want to?) believe this difference is due to the way the new language is implemented and running in Parrot per se, but expect it to be due to a lack of optimization, the presence of bugs and leaks, and perhaps my own ignorance of Perl 6. However, if this arbitrary code snippet is representative of the kind of performance regressions facing people curious to try out the Rakudo implementation of Perl 6, it may be difficult to attract early adopters outside the core of the community for a while.

The problem is, how can I as a "real-world user" best track the development of Perl 6 from a performance perspective? Are there official or semi-official benchmark scripts around already that we could use?
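One rough way to track timings over successive builds is a small harness that runs the same snippet several times and reports the wall-clock time of each run. The following is only a minimal Perl 5 sketch using the core Time::HiRes module; the helper names `time_runs` and `fill_array` are made up for illustration and are not part of any existing benchmark suite.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: run a code reference $runs times and
# return the wall-clock seconds taken by each run.
sub time_runs {
    my ( $code, $runs ) = @_;
    my @seconds;
    for ( 1 .. $runs ) {
        my $start = [gettimeofday];
        $code->();
        push @seconds, tv_interval($start);    # elapsed since $start
    }
    return @seconds;
}

# The array-filling loop from above, wrapped in a sub so it can be timed.
sub fill_array {
    my ($limit) = @_;
    my @numbers;
    my $i = 0;
    until ( $i == $limit ) {
        $numbers[$i] = $i;
        $i++;
    }
    return scalar @numbers;
}

my @times = time_runs( sub { fill_array(100_000) }, 5 );
my ($best) = sort { $a <=> $b } @times;
printf "best of %d runs: %.4fs\n", scalar @times, $best;
```

The same `time_runs` helper could in principle wrap a `system` call that launches a Perl 6 script, so that both implementations are timed by one tool; the exact command to invoke Rakudo depends on how it was built and is left out here.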

I have a suggestion, which is to try implementing at least some of the benchmarks over at the Computer Language Benchmarks Game http://shootout.alioth.debian.org/ in Perl 6. Is there someone around here who is interested in trying this out? Perhaps we could make a challenge out of it or something :-)
  • That's about the same kind of thing I saw when I tried it out in January.

  • Note that I upped the number of iterations to 1,000,000. Does Rakudo have inline C capability? For something that is numeric intensive, you might want a sharper tool.

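    #include <stdio.h>
    int main(void) {
        int i = 0;
        int limit = 1000000;
        int array[limit];   /* variable-length array on the stack */

        /* fill every element, including the last one */
        while (i < limit) {
            //printf("I is %d", i);
            array[i] = i;
            i++;
        }
        return 0;
    }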
    phred@harpua ~ $ time ./a.out

    real    0m0.015s
    user

    • I think if you'll reread the article, you'll find that he's a Perl fan, and wants to use Perl 6, but is concerned about what is an obvious performance problem in this part of the code at this time.

      The original poster is not asking for the best way to solve the problem of counting up to 100,000.

      --
      xoa

  • For the most part, Rakudo development has been focused more on speed than features. We're also somewhat hampered by the fact that Parrot doesn't have many good profiling tools so that we can figure out where the bottlenecks are. Those are supposed to be in place for the next major release in July.

    This post generated a lot of very fruitful discussion on #parrot today. Many people speculated about why the loop might be as slow as it is, and offered suggestions about improving Rakudo's code generation. Wha

    • For the most part, Rakudo development has been focused more on speed than features.

      Oops, I obviously meant to write "...focused more on features than speed."

      Pm

    • Wow! I am very happy to see my post not only being read by the right people and not immediately discarded as "Perl6 bashing", but actually resulting in an immediate identification of parts of the problem.

      I actually hesitated to publish this at all, thinking that the problem was solely on my part. I'm glad I knew better :-)
    • As mentioned previously, I'm reporting back with our status. In the course of investigating this program, we discovered that postfix:<++> was in fact very badly implemented, making it far more expensive than it needed to be. Fixing this ended up requiring quite a few internal changes to Rakudo, and I've discovered even more things we want to fix/avoid when running Rakudo in Parrot.

      That said, here's where things stand now. I used the following code as a benchmark (basically same as original, cut the

      • It seems you are indeed right about the postfix being an important bottleneck. It never even occurred to me to change that to some other way of incrementing $i, but this is how it looks for me doing 10000 iterations:

        $i++; # 7.5s

        $i += 1; # 3.5s

        $i = $i + 1; # 3.5s

        In all cases it uses the expected ~130MB RAM, so at least there does not seem to be a significant leak specific to one of these expressions.
  • I have a suggestion, which is to try implementing at least some of the benchmarks over at the Computer Language Benchmarks Game http://shootout.alioth.debian.org/ in Perl 6. Is there someone around here who is interested in trying this out? Perhaps we could make a challenge out of it or something :-)

    I'd really like to see this happen. There's already a shootout/ directory in the perl6-examples repository [1] -- I'll gladly give commitbits to anyone who wants to work on this or anything else in that r

    • I'll see what I can do.

      Some of the benchmarks are inspired by bioinformatics and I might have a go at these since I am familiar with these kinds of problems (I guess this is a fairly typical stance for "non-programmer" programmers, it being far easier to solve tricky problems if they relate to other concepts you are familiar with, such as biology and DNA data in my case).

      However, it will need to wait for a few months since I am hoping to finish my thesis in July :-)

      I'll definitely continue loo
    • I am interested in coding benchmarks for rakudo as well.
    • I would be very interested in seeing regular outputs and graphs from such a thing, and would put some effort into writing code as tuits become available.

      One thing to consider is having more than one way to do it -- i.e. including multiple versions of a benchmark in regular runs. This would show when things like "$var += 1" and "$var++" have performance gaps or convergence.
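The "more than one way to do it" idea from the last comment can be sketched in Perl 5 with the core Benchmark module, which can time several variants of the same loop and print a comparison table. This is only an illustration under assumptions: the helper `make_variants` and the iteration counts are invented here, and it measures Perl 5 rather than Rakudo, so the numbers say nothing about Perl 6 itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Hypothetical helper: build one closure per increment style.
# Each closure fills an array of $limit elements and returns
# the number of elements it stored.
sub make_variants {
    my ($limit) = @_;
    return (
        '$i++' => sub {
            my $i = 0;
            my @n;
            until ( $i == $limit ) { $n[$i] = $i; $i++ }
            return scalar @n;
        },
        '$i += 1' => sub {
            my $i = 0;
            my @n;
            until ( $i == $limit ) { $n[$i] = $i; $i += 1 }
            return scalar @n;
        },
        '$i = $i + 1' => sub {
            my $i = 0;
            my @n;
            until ( $i == $limit ) { $n[$i] = $i; $i = $i + 1 }
            return scalar @n;
        },
    );
}

my %variants = make_variants(100_000);

# Run each variant 50 times without per-variant output,
# then print a table comparing their rates.
my $results = timethese( 50, \%variants, 'none' );
cmpthese($results);
```

Storing the output of such a script for each build would show over time whether the gap between the variants is narrowing or widening.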