[OpenSCAD] Programming in Functional OpenSCAD

doug moen doug at moens.org
Wed Jan 31 16:19:05 EST 2018


In order to implement generalized tail recursion optimization, you need to
add a compilation pass that compiles the code into an executable form that
is different from the original parse tree. This executable form is a
sequence of instructions. These are traditionally byte codes, but you could
also use a linked list of instruction objects, or an array of pointers to
instruction objects. This is like compiling into abstract machine code.

The interpreter uses a separate stack data structure to store argument
values and local variables for each function call. A linked list of
dynamically allocated stack frame objects is one possibility.

The interpreter is a state machine with a set of registers. You need at
least: a current instruction pointer IP, a current stack frame pointer FP.
After each instruction is executed, these registers are updated. The IP is
updated to the next instruction to be executed. The FP is updated after a
call or return.

Instructions need operands: some way to access the data that is to be
operated on. In a parse tree interpreter, like what OpenSCAD uses, a tree
node like "+" contains child nodes for the left and right operands of "+".
However, in this style of interpreter, an expression like 'A+B' is compiled
into:
  evaluate A, store the result
  evaluate B, store the result
  fetch the A result, fetch the B result, add them, store the result

There are two basic styles that can be used for fetching and storing data:

1. One is a "stack machine", where you are pushing and popping intermediate
results on the stack. In this style, 'A+B' compiles into:
    * push A on the stack
    * push B on the stack
    * pop 2 values from the stack, add them, push the result on the stack

2. Another style is a "register machine", where there is an array of slots
indexed by integers that hold values. Each operation has slot indexes
indicating which slots it should load and store data from. Register
machines are currently more popular. It seems you can generate faster code,
and they aren't any more difficult to implement. Even with a stack machine,
you need a slot array to store local variables.

A?B:C expressions are compiled into an instruction sequence that contains
conditional and unconditional jumps. Like machine code.

In Curv, my stack is a linked list of dynamically allocated stack frame
objects. Each frame contains an array of slots. The number of slots for a
particular function call frame is determined by the compiler. A function
object contains the number of slots needed for a call to that function.

To compile a function call, you first emit instructions to evaluate the
arguments (which are either pushed on the stack, or stored in one or more
slots), then you emit a function-call instruction, which allocates stack
space for the function call, possibly copies the arguments into the new
stack frame, depending on your design, then it sets the FP to point to the
new stack frame and it sets the IP to point to the first instruction of the
function. The old IP needs to be stored somewhere, eg in the return address
field of the new stack frame, so that it can be restored by the return
instruction.

To return from a function call, you execute a return instruction. This
takes the return value and places it somewhere where it can be found by the
function-call instruction that called the current function. You need to
restore the FP to its previous value (pop the current frame off the stack),
and set the IP to the next instruction following the original function
call. Deallocate the current frame, which is no longer needed.

There is also a tail-call instruction. In the general case, you can perform
a tail call to the same function, or to a different function. Suppose that
your stack is a linked list of dynamically allocated frame objects. A
general tail call instruction will create and initialize the new stack
frame, just like regular function call. But then it will set the return
address field of the new stack frame to the same value as the return
address of the current stack frame. Then it destroys the current stack
frame, replacing it with the new stack frame.

Now, your compiler needs a strategy for figuring out when it can safely
replace a regular function-call instruction with a tail-call instruction.

* The body of a function is in tail-position.
* If a let expression is in tail-position, then the body of the let is also
in tail-position.
* If A?B:C is in tail position, then B and C are also in tail-position.
* These rules for marking an expression as "in tail-position" are applied
recursively to the parse tree of a function body.
* If a function call is in tail position, then it can be compiled using a
tail-call instruction.

One caveat. In the style of language interpreter I'm familiar with,
variable and argument references are lexically scoped. In Curv, I compile
variable and argument references into slot indexes, which index into the
slot array of the current stack frame. So the scope of a variable name is
fixed at compile time, before the program is evaluated. This is probably
why the Curv interpreter is so much faster than the OpenSCAD interpreter:
array indexing is faster than hash table lookup. By contrast, OpenSCAD is
dynamically scoped. Variable and argument references are names, and these
names are looked up dynamically during evaluation, which leads to semantics
that are different from lexical scoping. The same variable name node in the
parse tree may have different scopes in different calls to the same
function or module. So you'll have to figure out how to preserve these
dynamic scoping semantics if you implement this style of compiler, and I'm
not sure there are any references to help you, because the dynamic scoping
semantics of OpenSCAD seem utterly unique to me.

On 31 January 2018 at 14:57, Torsten Paul <Torsten.Paul at gmx.de> wrote:

> The tail recursion optimization is only applied in two very specific
>> circumstances. It is not implemented in a general way.
>>
>> Do you have any pointers to documentation about how to implement tail
> recursion or how it's implemented in some language?
>
> I was searching for that some time ago and there's lots of discussion
> about how to use tail recursion but not so much about how to actually
> implement it.
>
> ciao,
>   Torsten.
>
>
> _______________________________________________
> OpenSCAD mailing list
> Discuss at lists.openscad.org
> http://lists.openscad.org/mailman/listinfo/discuss_lists.openscad.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openscad.org/pipermail/discuss_lists.openscad.org/attachments/20180131/a398ff6b/attachment-0002.html>


More information about the Discuss mailing list