The original design was not only rather heavyweight, it was also incomplete and a bit blurry. It was lacking support for MMD argument signatures, return value context, typechecking, and so on.
I'll try to describe some of the reasons why we changed the calling conventions to use a new abstract scheme. The major change is that all argument passing is now done by dedicated opcodes, which allows any later implementation and adaptation under the hood without changing the ABI of the Parrot VM.
A function call:
was translated to these argument-related opcodes (function lookup and call opcodes omitted for brevity):
set I0, 1 # prototyped call
set I1, 1 # 1 INT argument
set I2, 0 # 0 STR arguments
set I3, 1 # 1 PMC argument
set I4, 0 # 0 NUM arguments
set P5, Px # get PMC argument
set I5, Iy # get INT argument
set S0, "foo" # function name
That is a lot of opcodes to dispatch, but it takes almost no time with the JIT runtime. Fine so far. Another function call:
produced exactly the same call register setup - these two function calls cannot be told apart when it comes to multi method dispatch.
Now the first argument setup just translates to:
set_args "(0b10,0)", Px, Iy
and the second is:
set_args "(0,0b10)", Ix, Py
Thus we not only saved 7 opcode dispatches per function call, we also got clear type information about the caller's arguments, including the argument order. You don't have to set these type bits yourself; the assembler does it according to the passed arguments, so it is enough to write:
set_args "(0,0)", Ix, Py
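The difference between the two setups can be modelled in a few lines of Python. This is only a sketch, not Parrot code: the function names are hypothetical, and the type codes (INT = 0, STR = 0b1, PMC = 0b10, NUM = 0b11) are simply read off the signature strings shown in this document.

```python
# Type codes as they appear in the signature strings of this document.
TYPE_BITS = {"I": 0b00, "S": 0b01, "P": 0b10, "N": 0b11}

def old_scheme_counts(args):
    """Old setup (model): only a per-type argument count reaches the callee."""
    kinds = [name[0] for name in args]
    return {kind: kinds.count(kind) for kind in "ISPN"}

def signature(args):
    """New setup (model): one type code per argument, in call order."""
    return tuple(TYPE_BITS[name[0]] for name in args)

first = ["Px", "Iy"]    # PMC first, then INT
second = ["Ix", "Py"]   # INT first, then PMC

# Under the old scheme both calls look the same to the callee.
print(old_scheme_counts(first) == old_scheme_counts(second))  # True

# Under the new scheme the argument order is part of the signature.
print(signature(first), signature(second))  # (2, 0) (0, 2)
```

The per-type counts lose the ordering, which is exactly the information multi method dispatch needs.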
In the old scheme you could happily call a function with:
which was defined as:
set Nx, N5 # get 1st NUM param into n
set Sy, S5 # get 1st STR param into s
Because arguments were passed in fixed registers, the function would have picked up whatever happened to be in registers N5 and S5 and would run - probably not for long, though. A possible "solution" would have been to force all compilers and Parrot hackers to emit code that first verifies the passed arguments. That's of course another bunch of opcodes, bulky and error-prone. Now the function defines precisely what it expects:
get_params "(0b11,0b1)", Nx, Sy
Again the type bits are filled in by the assembler. But during the call sequence, the argument passing code can verify the types (and counts) of arguments and parameters. Conversions to and from PMC parameters are specified and done automatically. Mismatches are reported by an exception.
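The kind of check this makes possible can be sketched in Python. This is a deliberately simplified model, not the actual Parrot implementation: the names `check_call` and `SignatureMismatch` are hypothetical, and the conversion rule used here (only to and from PMC) is an assumption taken from the sentence above; the real rules may allow more.

```python
# Type codes as used in the signature strings of this document.
INT, STR, PMC, NUM = 0b00, 0b01, 0b10, 0b11

class SignatureMismatch(Exception):
    """Hypothetical stand-in for the exception raised on a bad call."""

def check_call(arg_sig, param_sig):
    """Compare caller and callee signatures; return one action per slot."""
    if len(arg_sig) != len(param_sig):
        raise SignatureMismatch(
            f"expected {len(param_sig)} arguments, got {len(arg_sig)}")
    actions = []
    for pos, (arg, param) in enumerate(zip(arg_sig, param_sig)):
        if arg == param:
            actions.append("copy")    # same type, pass through
        elif arg == PMC:
            actions.append("unbox")   # PMC argument to native parameter
        elif param == PMC:
            actions.append("box")     # native argument to PMC parameter
        else:
            raise SignatureMismatch(f"type mismatch at position {pos}")
    return actions

callee = (NUM, STR)                     # get_params "(0b11,0b1)", Nx, Sy
print(check_call((NUM, STR), callee))   # ['copy', 'copy']
print(check_call((PMC, STR), callee))   # ['unbox', 'copy']
```

A call with the wrong number of arguments, or with an INT where a STR is expected, raises instead of silently reading stale registers.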
Implicit register usage
The central mechanism of a function call in the old scheme was just the plain argumentless opcode:
It would pick up whatever happens to be in P1 and use it as the continuation of the call. P2 was defined to be the invocant, if it's a method call. And so on - and it calls whatever is in P0. That's fine per se, if all code writers and compilers strictly follow this convention and don't forget to NULLify registers that shouldn't be used for the call, but it's a major PITA for the assembler, which has to track the control flow for proper register allocation: is the invoke a function call, a yield, a return from a function? Well, it's not defined; it could be anything. Quite a few lines inside imcc try to track down the usage of invoke opcodes to do the right thing. You can imagine that this does not contribute to clear code.
The old call scheme demanded that the invocant be passed out-of-band in P2. It's also only available, by a special interpinfo call, in functions declared as methods. This doesn't really match our major target languages, where the invocant just happens to be the first param of a method.
Calling a function as a method or vice versa would have needed to shift PMC arguments down or up to get everything into the registers that the callee expects.
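To make the shifting concrete, here is a small Python model of the old register layout. It is a sketch under assumptions: the helper names are hypothetical, and the only facts taken from this document are that the invocant lives in P2 and that PMC arguments start at P5.

```python
def method_call_layout(invocant, pmc_args):
    """Old scheme (model): invocant in P2, PMC arguments from P5 upward."""
    regs = {"P2": invocant}
    for i, arg in enumerate(pmc_args):
        regs[f"P{5 + i}"] = arg
    return regs

def function_call_layout(pmc_args):
    """Old scheme (model): a plain function sees PMC arguments from P5 upward."""
    return {f"P{5 + i}": arg for i, arg in enumerate(pmc_args)}

def method_to_function(regs):
    """Re-pack a method-style layout for a callee that takes the
    invocant as its first PMC parameter: every argument moves up one."""
    numbers = sorted(int(name[1:]) for name in regs if name != "P2")
    args = [regs[f"P{n}"] for n in numbers]
    return function_call_layout([regs["P2"]] + args)

as_method = method_call_layout("obj", ["a", "b"])
print(as_method)                      # {'P2': 'obj', 'P5': 'a', 'P6': 'b'}
print(method_to_function(as_method))  # {'P5': 'obj', 'P6': 'a', 'P7': 'b'}
```

Every PMC argument changes register just because the callee treats the invocant as an ordinary first parameter.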
Return value context
The old scheme had no provision for specifying what and how many return results the caller expects. Now the get_results opcode is emitted before the actual function call, so that a function return has a chance to return what the caller wants.
Future and optimization
In the old scheme the lower 16 registers of each kind were volatile (each function return could set these registers). This implies that you usually have to move registers from the preserved area into the lower half; during the call sequence registers are moved into the callee's lower half, from where another round through all parameters would have placed everything in the preserved area. These are three passes over all arguments, hard to avoid in the general case.
The old call scheme reserved 4*16 registers just for function calls and returns. This accounts for 320 bytes (on a 32-bit machine) that have to be allocated per call to pass e.g. just one word argument to a one-liner function or an attribute accessor method.
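The 320-byte figure can be reproduced with a quick calculation. The per-register sizes below are an assumption that matches the stated total: INT, STR and PMC registers each take one 4-byte word on a 32-bit machine, and NUM registers hold an 8-byte float.

```python
REGISTERS_PER_KIND = 16  # the volatile lower half of each register kind

# Assumed size in bytes of one register of each kind on a 32-bit machine.
SLOT_BYTES = {"INT": 4, "STR": 4, "PMC": 4, "NUM": 8}

total = sum(REGISTERS_PER_KIND * size for size in SLOT_BYTES.values())
print(total)  # 320
```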
OK, we are not doing optimization now - that's fine. But the old calling scheme would have prevented all the future optimizations that will be needed. You can't do any optimizations later when the call scheme is carved in stone and reserves half of the register resources for itself.