
All the Perl that's Practical to Extract and Report

use Perl


Journal of chromatic (983)

Saturday April 19, 2008
12:32 PM

Rakudo Copy on Wrong

[ #36199 ]

Late last night, PerlJam posted in #parrot a small Perl 6 program which gave the wrong answer in Rakudo:

my $foo = 'fred';
say $foo;
$foo--;
my $bar = 'fred';
say $bar;

The correct output is obviously:

fred
fred

Rakudo gave:

fred
frec

PerlJam and Infinoid both correctly diagnosed the problem as a COW (copy-on-write) problem. What's that, and why does it matter?

The Rakudo compiler turns this code into PIR code. PIR is the native high-level language of Parrot. Inside Parrot, the PIR compiler (IMCC) turns PIR into Parrot bytecode (PBC). As part of that process, IMCC identifies constant string literals and treats them specially.

Like the Perl 6 code, the PIR code produced by Rakudo contains the string literal fred twice. The Parrot bytecode (PBC) produced by IMCC doesn't; it refers to a single internal data structure twice.
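To make the sharing concrete, here is a minimal sketch of a constant table that stores each distinct literal once. It is not IMCC's actual code; ConstTable, intern_constant, and MAX_CONSTANTS are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONSTANTS 64  /* hypothetical fixed capacity */

/* Hypothetical constant table: each distinct literal is stored once. */
typedef struct {
    const char *entries[MAX_CONSTANTS];
    size_t      count;
} ConstTable;

/* Return the slot for `literal`, reusing an existing slot when the same
   text has already been seen. */
static size_t intern_constant(ConstTable *table, const char *literal) {
    size_t i;

    for (i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i], literal) == 0)
            return i;  /* same text: hand back the existing slot */
    }
    assert(table->count < MAX_CONSTANTS);
    table->entries[table->count] = literal;
    return table->count++;
}
```

With a table like this, both occurrences of fred resolve to the same slot, which is why a change to one can become visible through the other.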

This is usually the right approach. In this case, where the literal string appears twice and is only four characters long, there's little benefit, but in a complex program, you can save a lot of memory and time with judicious caching.

Of course, sometimes people want to change these strings, and the strings are mutable. That's where the COW comes in. It's like memory handling on a decent operating system: you only make a copy of the memory at the last possible point, when you know you're about to modify it. Parrot strings support this, so if you use Parrot operations directly, you don't even have to know that COW exists. It just works.
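As a rough illustration of the idea (not Parrot's string implementation), a copy-on-write string can be sketched with a reference-counted buffer. Every name here (Buffer, CowString, cow_new, cow_share, cow_unshare, cow_set_char) is an assumption made up for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical shared buffer with a reference count. */
typedef struct {
    char  *bytes;
    size_t len;
    int    refs;
} Buffer;

typedef struct {
    Buffer *buf;
} CowString;

static CowString cow_new(const char *text) {
    CowString s;

    s.buf = malloc(sizeof *s.buf);
    if (s.buf == NULL) abort();
    s.buf->len = strlen(text);
    s.buf->bytes = malloc(s.buf->len + 1);
    if (s.buf->bytes == NULL) abort();
    memcpy(s.buf->bytes, text, s.buf->len + 1);
    s.buf->refs = 1;
    return s;
}

/* Cheap "copy": share the buffer and bump the count. */
static CowString cow_share(CowString src) {
    src.buf->refs++;
    return src;
}

/* Give `s` a private buffer if anyone else still shares the current one. */
static void cow_unshare(CowString *s) {
    Buffer *old = s->buf;

    if (old->refs == 1)
        return;               /* already the sole owner */
    old->refs--;
    *s = cow_new(old->bytes); /* the copy happens only now */
}

/* Every write goes through cow_unshare first. */
static void cow_set_char(CowString *s, size_t i, char c) {
    assert(i < s->buf->len);
    cow_unshare(s);
    s->buf->bytes[i] = c;
}
```

The point of the design is that callers only ever write through cow_set_char, so the copy is invisible to them. Code that writes to buf->bytes directly skips the check.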

The problem was that the string modification took place outside of Parrot, in a custom Perl6Str PMC. Think of a PMC like a class which represents internal data structures, and you're most of the way to understanding them. The Perl6Str PMC has two operations, increment and decrement, which do exactly what you'd expect to strings at the C level. This means that they modify the C string directly.

Because this occurs at the C level (working directly on C pointers), Parrot doesn't have a chance to perform the copy-on-write operation on the string, and modifying one string modifies every other string which refers to the same string literal.
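The failure mode can be reproduced in a few lines of plain C. This is a sketch of the general problem, not the Perl6Str PMC source; StrHandle and naive_decrement are hypothetical names, and the decrement is simplified to lowering the last character.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical handle: several handles may point at the same bytes. */
typedef struct {
    char *bytes;
} StrHandle;

/* Naive decrement: lowers the last character in place through the raw
   pointer, with no check for other handles sharing the bytes.
   (A real string decrement would also handle borrowing, e.g. at 'a'.) */
static void naive_decrement(StrHandle *s) {
    size_t len = strlen(s->bytes);

    assert(len > 0);
    s->bytes[len - 1]--;
}
```

If two handles are built over the same storage holding fred, calling naive_decrement on one of them leaves both reading frec, which matches the wrong output above.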

My first solution was to call the Parrot string function to perform the copy (because there's a write coming up) directly, but that made too much code move around (C89 and declarations before code, grr). Instead, I made a two-line macro which does an in-place copy and assign, and only two lines of code had to change to do the right thing. Now the code prints, as it should:
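The actual macro isn't shown here, so the following is only a guess at its general shape: a single statement that swaps the handle's bytes for a private copy before the write. COPY_IN_PLACE, private_copy, StrHandle, and safe_decrement are all hypothetical names, and the sketch copies unconditionally rather than checking whether the bytes are shared.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle: several handles may point at the same bytes. */
typedef struct {
    char *bytes;
} StrHandle;

/* Hypothetical helper: return a private duplicate of `text`.
   The caller owns the result and must free it. */
static char *private_copy(const char *text) {
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);

    if (copy == NULL) abort();
    memcpy(copy, text, n);
    return copy;
}

/* Hypothetical stand-in for the macro: copy, then assign the copy back
   to the same handle. It expands to one expression statement, so it can
   sit after the declarations in a C89 function. */
#define COPY_IN_PLACE(handle) \
    ((handle)->bytes = private_copy((handle)->bytes))

static void safe_decrement(StrHandle *s) {
    size_t len;

    COPY_IN_PLACE(s);     /* detach from any shared bytes first */
    len = strlen(s->bytes);
    assert(len > 0);
    s->bytes[len - 1]--;  /* now only this handle's copy changes */
}
```

With two handles over the same fred storage, safe_decrement on one yields frec for that handle and leaves the other as fred.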

fred
fred

(I spent more time writing this entry than I did fixing the problem.)

  • "My first solution was to call the Parrot string function to perform the copy (because there's a write coming up) directly, but that made too much code move around (C89 and declarations before code, grr)."
    Why C89? Seems silly now.
    • I'd love to be able to use C99, but some of our target platforms lack C99-conforming compilers. As I understand it, we're sometimes fortunate that Visual Studio supports C89.