Journal of nicholas (3034)

Wednesday July 01, 2009
04:19 AM

Full circle

[ #39201 ]

A subroutine-threaded core stores each opcode as a separate C-level function. Each op in the sequence is called in turn, and then the op returns to the runcore. That is two branch instructions to dispatch each op, compared to only one for a direct-threaded core. However, recent benchmarks I have seen in Parrot show that the subroutine core actually performs faster than the direct-threaded core does. This is because modern microprocessors have lots of hardware dedicated to predicting and optimizing control flow in call/return situations, because that is one of the most common idioms in modern software. This is a nonintuitive situation where more machine code instructions actually execute faster than fewer instructions. Parrot's default "slow" core ("-R slow") and the so-called "fast" core ("-R fast") use this technique (actually, these cores aren't exactly "subroutine-threaded", but it's close). From the numbers I have seen, the fast core is the fastest in Parrot. Here's how it works, basically:

for (pc = program_start; pc < program_end; pc++) {
    functable[*pc](interp, args);
}

Reminds me a lot of

while ((PL_op = CALL_FPTR(PL_op->op_ppaddr)(aTHX))) {

What goes around comes around, although Whiteknight's blog gives a lot of useful detail on why it has come around again, and why things were different in the middle.
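To make the resemblance concrete, here is a minimal sketch of both loop shapes in a toy VM. Everything in it is hypothetical and purely illustrative (Interp, Op, functable, run_table, run_chain, and the two ops are my own names, not Parrot's or Perl's actual definitions): the first loop indexes a function table by opcode and lets the loop advance pc, while the second stores a function pointer in each op and lets the op itself hand back the next op to run, which is roughly the shape of the Perl 5 line above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy interpreter state; not Parrot's or Perl's real struct. */
typedef struct Interp { long acc; } Interp;

/* Style 1: opcode numbers index a table of C functions, and the loop
 * itself steps the program counter. */
typedef void (*op_fn)(Interp *);
static void op_inc(Interp *in) { in->acc += 1; }
static void op_dbl(Interp *in) { in->acc *= 2; }
enum { OP_INC, OP_DBL, OP_COUNT };
static const op_fn functable[OP_COUNT] = { op_inc, op_dbl };

long run_table(const unsigned char *pc, const unsigned char *end) {
    Interp in = { 0 };
    for (; pc < end; pc++) {
        assert(*pc < OP_COUNT);   /* guard against a bad opcode */
        functable[*pc](&in);      /* call the op, then return to the loop */
    }
    return in.acc;
}

/* Style 2: each op carries its own function pointer, and the function
 * returns the next op to execute (NULL ends the run). */
typedef struct Op Op;
typedef Op *(*pp_fn)(Interp *, Op *);
struct Op { pp_fn ppaddr; Op *next; };

static Op *pp_inc(Interp *in, Op *self) { in->acc += 1; return self->next; }
static Op *pp_dbl(Interp *in, Op *self) { in->acc *= 2; return self->next; }

long run_chain(Op *op) {
    Interp in = { 0 };
    while (op != NULL)
        op = op->ppaddr(&in, op); /* call the op; it picks its successor */
    return in.acc;
}

/* Both demos run the same program: inc, inc, dbl. */
long demo_table(void) {
    static const unsigned char program[] = { OP_INC, OP_INC, OP_DBL };
    return run_table(program, program + sizeof program);
}

long demo_chain(void) {
    Op third  = { pp_dbl, NULL };
    Op second = { pp_inc, &third };
    Op first  = { pp_inc, &second };
    return run_chain(&first);
}
```

Either way, every op costs one indirect call and one return, which is exactly the call/return pattern the quoted passage says modern hardware predicts well. The difference is only in who decides what runs next: the loop in the first style, the op in the second.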
