grammar book

Sunday, February 29, 2004

macros

Oh by the way, I implemented macro's pretty much exactly as Petrofsky had explained. The macro rewriting bit was fairly straightforward. I put in support for the most basic syntax-rules construct. Basically, it looks like this,

(syntax-rules () (<pattern variables>) <template>)

pattern variables must be just symbols that act as the parameters for the macro. template is an S-expression that is expanded by the macro transformer. The empty list right after syntax-rules is for literals, but I didn't bother implementing it.

Also notice that I didn't have the macro keyword appear in the macro itself. I could have, but I don't quite see the point of doing so. I ignored the whole pattern matching aspect of macros and just focusing on the transcription side.

So, when a macro is used its template is expanded and pushed onto the expression stack to get evaluated. Expansion is done by first creating map from pattern variables to the macro arguments. Then the expansion S-expression is built by recursively walking the template S-expression. If a number is encountered then it is put directly into the expansion (since numbers are immutable). A new list is created if a list is encountered and its contents are the expanded contents of the template's list. Finally, if a symbol is encountered which is mapped to an argument then the argument is put into the expansion, otherwise a new symbol with the same name is created and put into the expansion.

Actually, this isn't the whole story. The new symbol that is created is labelled in such a way that it can be distinguished as coming from that particular template transcription. This is the basis for the hygenic resolution of variable bindings. I introduced the class Label to capture the concept of a unique identifing mark on a symbol which makes it unique to that transcription AND provides enough information to look up the symbol in the environment of the macro transcription.

A new Label object is created each time a macro template is expanded, and it is applied to each generated symbol by the transformer. When the transformer emits a new symbol (it only does this, remember, when the template symbol is not a pattern variable) it labels the new symbol with all of the labels already applied to the template symbol plus the label for the current template expansion. To handle this, I modified Symbol objects to contain a list of Labels. Applying a Label to a Symbol means adding it to the end of this list.

The list of Labels then describes the history of macro expansions a symbol has undergone, in the order in which they happened. A Label itself contains two things, a unique integer across all instances, and an Environment. The unique integer per Label allows us to describe a Symbol as the symbol's name and the list of Label integers (just like Petrofsky's notation (ex. cons#1#2). In fact, this is exactly the sort of string I used to get the hash code for a Symbol so that symbols with different template expansion histories are treated as distinct symbols (the meaning of 'hygenic').

The only other change to the interpreter I made was to how a Symbol resolves its binding. Previously a Symbol.evaluate() would check with the VM's current environment for a binding. If one couldn't be found then it the symbol would be unbound. But the lexical scoping introduced by hygenic macros meant that symbol lookup should happen for each of the template transcription environments, from the most recent back. Furthermore, the symbol at the time of each macro template transcription, if it was in the environment at the time, wouldn't have label of that transcription. The upshot is that evaluating a Symbol involves first looking in the current environment, and then looking in each Label's environment, from the list of that Symbol's Labels, each time stripping off a label from the symbol.

# posted 9:46 AM.

Sunday, February 22, 2004

tokens, snow, and ink

ran out of tokens for utordial on thursday evening. still haven't arranged to get more, which means for the moment i'm disconnected. and then of course, i've been in pittsburg township for two days. talked a lot with paul about scheme, lambda calculus in general, and macros.

anyhow, the cs labs are a crappy (but surprisingly packed) place to be on a sunny afternoon, so this'll be short. (1) syntactic closures are fun, but when trying to work with i realized it's really hard to decided what bits of an expression need to be closed without looking at the semantics of the expression, and even then parameters can still be incorrectly captured. (2) renaming (explicitly: in such a way that the user could never do, or implicitly: like the serial code notation) seems like the most promising way to go about. well, i tried it out on several examples using odd hacks to introduce the new lexical scoping rules. works on paper. started implementing it last night (i'm not trying to do the whole define-syntax thing, just to handle a single pattern and an template).

# posted 2:05 PM.

Thursday, February 19, 2004

syntactic closures

looking through library.readscheme.org -- yum. just read through the paper, Syntactic Closures by Alan Bawden and Jonathan Rees. This was sort of what I was thinking when I mentioned macro expansions being somewhat like closures. The idea being that a syntactic closure is a paramaterized expression -- the parameters are resolved (read: evaluated) in the environment where the expression is used, and the others in the environment in which the expression was created. so, exactly like a regular closure except you don't call it explicitly with its parameters, you just use as usual. (which makes me think that syntactic closures are a more general type of closure).

so, are syntactic closures everything i'd need to do something like the hygenic macro thang? to be honest, my head is slightly spinning with all of this. i probably need to sit down and work through some examples using this method again. when i tried it before (last week sometime) I arranged things so that the expression returned by each macro transformer was a syntactic closure. this worked for one macro tranformation, but failed when i had nested macros (i.e. a macro composed of another macro). this happened because expressions in the inner macro were captured by its closure, even though they should have refered to something in the outer macro. that is, this happens when a macro's parameter was 'replicated' (well, this is what Petrofsky calls it) by the inner macro and then put in an expression which assigned it a different value:

(let ((x 1)) (let-syntax ((foo (syntax-rules () ((_ y) (let-syntax ((bar (syntax-rules () ((_) (let ((x 2)) y))))) (bar)))))) (foo x)))

i suspect i was doing something wacky...

anyhow, Bawden and Rees refer to hygenic macros as the "other complete solution", and give a nice short description of it (aside: so i take it guess it's proper to refer to /syntactic closures/ as a solution to the problem of writing well-behaved macros, not as a solution to writing /hygenic macros/).

blurp.

quasiquotation

just for fun i decided to implement simple quasiquotation, just backquote and comma. doing it showed me a little bit more about how wonky using two stacks can be. wonky in that i have to use both the RTN stack and the EXP stack to hold arguments to a function. The EXP stack holds unevaled arguments, the RTN evaled ones. I had to do this with the second and third operands for if, and then again with the template argument to quasiquote since the quasiquote needs the template to find out the bits that need to be evaled (i.e. the appropriately unquoted things), and then again to piece together the quoted expression.

scaling Mount Oregano

oh and by the way, i think i might be in love with scheme (at least for the moment). oregano and autumn still beat it out for the top positions, but scheme is gaining. it has just pushed its way passed pressing my hand across a carpet and corn bread.

# posted 11:10 AM.

Tuesday, February 17, 2004

literal expressions and external representation

part 4.1.2 of the spec discusses the quote operator. It says," (quote <datum>) evalutates to <datum>. <Datum> may be any external representation of a Scheme object." Hrm, so what's an external representation? It's a "sequence of characters", like "28" or "(8 13)". These sequences of characters "may be written in a program to obtain the corresponding object", as in the use of quote above.

so the question from before was something like, if you evaluate a quoted expression do you get back a something other than the expression itself -- a 'literal expression' type thing? according to the above, the answer is /yes/, what you get back is the external representation of the thing.

oddness. so, when does an external representation become an internal one? interpreter I - IV converted external representations to internal ones at the time of parsing.. (that's just what parsing was for). Evaluating a quoted expression, in those interpreters, has the effect of returning the expression itself, not the sequence of characters that make up the expression.

Ah, so maybe the spec is being murky again (that is, maybe my mind is being murky again as i read the spec) because look here, in the bit about external representations it says this, "Note that the sequence of characters "(+ 2 6)" is /not/ an external representation of the integer 8, even though it /is/ an expression evaluating to the integer 9 ..." Ah ha, so they /are/ calling the sequence of characters an expression, which means that if they say a quoted expression returns an external representation then they mean that it's exactly the same as returning the expression itself?

But wait, the above quotation from the spec continues, "... rather, it is an external representation of a three-element list, ..." Blurp.

so, my question remains,

when does an external representation become an internal one?

macros

Reading Petrofsky's article on syntax-rules and much more from the spec, it's making some sense. So, here are the two key bits from the spec on "hygienic" and "referentially transparent":

If a macro transformer inserts a binding for an identifier (variable or keyword), the identifier will in effect be renamed throughout the scope to avoid conflicts with other identifiers...
If a macro transformer inserts a free reference to an identifier, the reference refers to the binding that was visible where the transformer was specified, regardless of any local bindings that may surround the use of the macro

Together #1, and #2, ensure that identifiers the macro transformer spits out act like the identifiers in the macro spec, and not like any similarily named identifiers that occur around the /use/ of the macro. That is, the above rules allow you to treat the identifiers used in a macro expansion as distinct identifiers even if they share the same /external representation/ as identifiers in the region of the macro use (that's #1), and the scope rules used to resolve an expansion-produced identifier are such that if a binding to it (the distinct identifier from #1) isn't found in the current region, we look next in the region of the macro spec that produced it (that's #2).

He paused for a moment's reflection.
"Tricky," he said finally.

# posted 11:00 AM.

Sunday, February 15, 2004

the one-stack interpreter

last time the question was, how does kamin design his interpreter so that it only uses one stack? the answer: he does it by not using a stack at all! well, at least not explicitly. in the code for the lisp and scheme interpreter (and probably apl and others) expressions from the user are processed into a big parse tree, and then evaluated in place, with lots of recursion. this is exactly how version I through III of our interpreter worked.

But he does have a chapter at the end of his book, chapter 10, where he talks about modifying the lisp interpreter to use a single stack, compiler stylz. that is, like interpreter IV, expressions to be evaluated are pushed on to a stack. unlike interpreter IV, variable references are handled by using offsets within the stack. So if x is the 2nd formal parameter of some function, then references to it are resolved by looking at the (BLAH + 1)th thing on the stack. 'nuff said.

quasiquotation

it's a way to quote an S-Expression except for certain parts which are to be replaced by their evaluations. quasiquotation is introduced by the backtick, `, and the bits of the S-Expression to 'unquote' are prefixed by a comma. example:

-> (define x 100) -> `x x

-> `,x 100

-> `(,x) (100)

According to the spec, "Quasiquote forms may be nested. Substitutions are made only for unquoted components appearing at the same nesting level as the outermost backquote. The nesting level increases by one inside each successive quasiquotation, and decreases by one inside each unquotation."

the backquote/comma notation is exactly equivalent to (quasiquotation <S-Expression>) and (unquote <S-Expression>). it appears that the 'unquote' symbol is only used as a delimiter, not as a variable bound to a function. For instance, (quasiquotation (unquote unquote)) in guile leads to an unbound symbol 'unquote' error.

macros

The spec seems a little hard to understand when it describes how exactly to transform a macro.. it uses the words "hygenic" and "referentially transparent" in ways that make feel I ought to already know what they mean. So I found the helpful, and well-referenced newsgroup posting, An Advanced Syntax-Rules Primer for the Mildly Insane.

# posted 9:30 AM.

Wednesday, February 11, 2004

geez. So now I have an audience. enjoy the font.

interpreter progress

okay, well, things went well with the stack based implementation. as greg notes, it made things *this* much easier.

so i used two stacks to handle evaluation of expressions, one stack holding the expressions to be evaluated, and the other holding return values from previously evaluated expressions. apparently Kamin uses only one, how does he do it?

quoted expressions

i met with miles yesterday's yesterday and we talked over how we each put our interpreters together. two things came up,

if you evaluate a quoted expression do you get back a special type of literal expression?

for instance, does 'dog evalute to the regular symbol dog or to a special type of literal symbol dog? what did miles mean by literal? well, i think he meant that there was a difference between variables and |literal symbols| in that if a variable was evaluated it returns whatever it's bound to whereas if a literal symbol is evaluated it returns itself. so the evaluation of the evaluation of 'dog should return the literal symbol dog according to miles? i dunno.. sounds confusing. maybe it'll make more sense when i get into macros and quasiquotation.

next up

next up is getting macros working. i read over the appropriate bits in R5RSand it seems clear enough. a macro definition is like a closure, but its evaluation results in an expression to be evaluated, and that expression comes from transforming the macro template. i was more than a little confused by greg's description of templates yesterday because he was introducing quasiquotation at the same time -- which makes sense since i imagine that dealing with quasiquotation inside a marco template brings out all the questions of how macro templates are transformed. well, more on this later when i step through the spec in more detail.

i'll also read up on macros in C. something i've only barely messed with to use wxWindows. greg was pointing out some weirdness that happens with C macros... something about how the macro parameters and local variables with the same name can cause unexpected, but very unintuitive, things to happen.

# posted 4:16 PM.

Thursday, February 05, 2004

So how might I implement the stack-based evaluation? Well...

Representation and implementation

So assume for now that we'll stick with the existing hierarchy:

        SExpression
            Symbol
            Number
            SList
            Function

to represent scheme s-expressions in memory. Assume also that we have a way of building these up from a string representation.
Let's follow along as the different types of S-Expressions get evaluated.
Inside the REPL we have something like this:

        SExpression expression = parser.getNextExpression();
        SExpression result = virtualMachine.getReturnValue(expression);
        System.out.println(result.toString);

getReturnValue() is

            expressionStack.push(expression); 
            evaluate();
            return returnValueStack.pop();

evaluate() is

        while(expressionStack.isNotEmpty())
            expressionStack.pop().evaluate(this);

So that handles the general stuff, in specific:
Example:
-> 123

this is parsed into a Number.

        Number.evaluate(VirtualMachine)
           virtualMachine.returnValueStack.push(this);

Example:
-> +

this is parsed into a Symbol.

        Symbol.evaluate(VirtualMachine)
            SExpression value = virtualMachine.lookup(this);
            if value == null
                throw EvaluationException("Unbound variable: ${name}");
            virtualMachine.returnValueStack.push(value);

        virtualMachine.lookup(Symbol)
            for each environment in environmentStack from top to bottom
                SExpression value = environment.lookup(symbol);
                if value is not null,
                    return it
            if we haven't found anything return null.

        environment.lookup(Symbol)
            return bindings.get(symbol)

Example:
-> '+
this is parsed into a SList: (quote +)
that is, an SList which contains the Symbol 'quote', followed by the symbol '+'.

        SList.evaluate(VirtualMachine)
            SExpression first = this.first();
            SExpression rest  = this.rest();  // guaranteed to be a list
            if first is null,
                throw new EvaluationException("missing or extra expression");
            virtualMachine.expressionStack.push(new ApplyFunction());
            virtualMachine.expressionStack.push(first);
            virtualMachine.returnValueStack.push(rest);

        ApplyFunction.evaluate(VirtualMachine)
            SExpression function  = virtualMachine.returnValueStack.pop();
            SExpression arguments = virtualMachine.returnValueStack.pop();
            if function is not an Function then
                throw new EvaluationException("wrong type to apply");
            if arguments is not a SList then
                throw new RuntimeException("stack outta wack.");
            ((Function)function).setup(virtualMachine, (SList)arguments)

        QuoteFunction.setup(VirtualMachine, SList arguments)
            if arguments.size() != 1 then
                throw new EvaluationException("missing or extra expression");
            virtualMachine.expressionStack.push(this);
            virtualMachine.returnValueStack.push(arguments.index(0));
            
        QuoteFunction.evaluate(VirtualMachine)
            SExpression result = virtualMachine.returnValueStack.pop();
            if result is null then
                throw new RuntimeException("stack outta wack");
            virtualMachine.returnValueStack.push(result);

Example:
-> (+ 2 3)

       AdditionFunction.setup(VirtualMachine, SList arguments);
            if arguments.size() is not 2 then 
                throw new EvaluationException("wrong number of arguments to
                ${name}: ${arguments}")
            virtualMachine.expressionStack.push(this); 
            virtualMachine.expressionStack.push(arguments.index(1)); 
            virtualMachine.expressionStack.push(arguments.index(0)); 

        AdditionFunction.evaluate(VirtualMachine)
            SExpression operand1 = virtualMachine.returnValueStack.pop(); 
            SExpression operand2 = virtualMachine.returnValueStack.pop(); 
            if operand1 or operand2 are not Numbers then
                throw new EvaluationException("wrong argument type for
                ${name}");
            // i'm leaving out the type casting in the next step
            Number result = new Number(operand1.value() + operand2.value());
            virtualMachine.returnValueStack.push(result);

# posted 2:08 AM.

Wednesday, February 04, 2004

There are some problems with evaluating the parse tree on its own. The problems come in when trying to deal with call/cc. In order to handle them I decided to have the S-Expressions in the parse tree operate on external stacks, rather than calling each other recursively. Here'r are my thoughts which I wrote up earlier on the topic:

Expression Evaluation

Evaluation of (call/cc f) needs to pass to f something that represents the current state of execution -- i.e. whatever the interpreter was planning to do with the return value of (call/cc f). With the other versions of the interpreter the evaluation of an SExpression is handled entirely by recursive double dispatch between an expression, its subexpressions, and the vm. Knowning what the interpreter is going to do with an expression involves knowing the stack of calling SExpression.evaluate() functions that has led up to the call to evalute the current expression. And then, knowing the call-stack isn't quite enough because several evaluations can happen in a single evaluate() method, and so t's not clear where in the method to return to.
Idea: explicitly keep track of the stack of expressions to evaluate. That is, have the VM evaluate expressions on a stack, and have each SExpression.evaluate() not be allowed to evaluate any other SExpressions, but only push and pop SExpressions from the stack.

Example: 1

        EXP STACK: [ 1 ]  ; push the expression to evaluate on the stack
        RTN STACK: [ ]

pop the top and inspect it. It's a number. How do we evaluate numbers? Well, they just evaluate to themselves. So, push the result of evaluating the number onto the return-value stack.

        EXP STACK:  [ ] 
        RTN STACK:  [ 1 ]

Example: (+ 2 3)

        EXP STACK:  [ (+ 2 3) ]  ; push expression to evaluate 
        RTN STACK:  [ ]

pop it off and inspect it. It's a list. How do we evalute lists? Well, we first evaluate the 'car' of the list, and then apply the return value as a function, to the rest of the list.

        EXP STACK:  [ + ]  ; the 'car' of the list
                    [ << apply-function >> ] ; a function application
        RTN STACK:  [ (2 3) ]

So now we evaluate the top of the stack, '+', which is a symbol. How do we evaluate symbols? We look them up.

        EXP STACK:  [ << apply-function >> ] 
        RTN STACK:  [ << addition-function >> ] 
                    [ (2 3) ]

Now we evaluate a function, the apply-function function. This pops an SExpression from the RTN STACK, verifies that it is a Function, and then applies it to list of the arguments on the RTN stack. Well, what does 'applying' the addition-function do first? Well, it first has to set itself up for evaluation. So it checks that it has the appropriate number arguments (which it does) and then evalutes each of its arguments. Only after all of this does the function actually get evaluated. So, first things first:

        EXP STACK:  [ 3 ]  ; push argument 2
                    [ 2 ]  ; push argument 1
                    [ << addition-function>> ] ; push myself.
        RTN STACK:  [ ]

So now we evaluate the arguments, both numbers:

        EXP STACK:  [ << addition-function>> ] 
        RTN STACK:  [ 2 ]
                    [ 3 ]

Now we evalute the addition-function. It pops its arguments off the return stack, checks that they are both numbers, and adds them up.

        EXP STACK: []
        RTN STACK: [ 5 ]

Nothing's left on the EXP STACK, so we're done evaluating the expression. The return value is on the RTN STACK.
Example: (if T 'true 'false)

        EXP STACK:  [ (if T 'true 'false) ] 
        RTN STACK:  [ ] 

        EXP STACK:  [ if ] 
                    [ << apply-function >> ] 
        RTN STACK:  [ (T 'true 'false) ] 

        EXP STACK:  [ << apply-function >> ] 
        RTN STACK:  [ << if-function >> ] 
                    [ (T 'true 'false)  ]

        EXP STACK:  [ T ] 
                    [ << if-function >> ]
        RTN STACK:  [ 'true  ] 
                    [ 'false ] 

        EXP STACK:  [ << if-function >> ]
        RTN STACK:  [ T ]
                    [ 'true  ] 
                    [ 'false ] 
       
        EXP STACK:  [ 'true ]
        RTN STACL:  [ ] 

        ...

        EXP STACK:  [ ]
        RTN STACK:  [ true ]

Q1: What about functions with an arbitrary number of arguments?

A1.1: The function could push a sentinel on the RTN stack, so that the function just consumes expressions from the return stack until it reaches the sentinel.
A1.2: The function could push the number of things to read as the first thing on the RTN stack.
A1.3: The function instance could hold state as to the number of expressions it should read.

A1.3 means having to create a new instance of the function each time it's evaluated (so as to keep states distinct). Boo.
A1.1 means having to create a new SExpression that's just has the job of being a sentinel. Boo.
A1.2 No internal state, no new SExpressions, just good, clean fun. Way to go!

Closures

So a closure is a function and an environment in which to evaluate the function. When a closure is evaluated it swaps in its environment frame stack for the current stack, executes it's function, and swaps back in the old environment.
Example: ((lambda (x) (+ 2 x)) 1)

    
        EXP STACK:  [ ((lambda (x) (+ 2 x)) 1) ] 
        RTN STACK:  [ ]
        ENV STACK:  [ p1: { } ] 
        
        ...

        EXP STACK:  [ << (lambda (x) (+ 2 x)), p1: {} >>  ] 
        RTN STACK:  [ 1 ]
        ENV STACK:  [ p1: {} ]

So at this point we have the closure on the top of the expression stack, it's argument on the RTN stack, and the current environment (labelled 'p1') on the environment stack. Evaluating the closure does a few things:

replaces the environment stack with its own. Since it's the same, there's no difference.
creates a new environment frame on the stack to hold the bindings to its formal parameters (p2).
puts a function on stack to return the environment stack to what it was.
puts it's body on the stack to be executed.

        EXP STACK:  [ (+ 2 x) ] 
                    [ << replace-env-stack p1 >> ] 
        RTN STACK:  [ ]
        ENV STACK:  [ p2: { x = 1 } ] 
                    [ p1: {} ]

Q: Aren't we storing state in the 'replace-env-stack' function?
A: Yes, we are.
Q: Do you have a good reason for doing it here, and not above with apply-function?
A: Hrm.. well, what's the alternative:

   
                EXP STACK:  [ (+ 2 x) ] 
                            [ << replace-env-stack >> ] 
                RTN STACK:  [ p1 ]
                ENV STACK:  [ p2: { x = 1 } ] 
                            [ p1: {} ]

So that means that we'd treat environments as SExpressions. I don't like that. I can't even make sense of what that would me. So forget it.
Moving on.

        ...

        EXP STACK:  [ << replace-env-stack p1 >> ] 
        RTN STACK:  [ 3 ]
        ENV STACK:  [ p2: { x = 1 } ] 
                    [ p1: {} ] 

        EXP STACK:  [ ] 
        RTN STACK:  [ 3 ]
        ENV STACK:  [ p1: {} ]

Example: (set add (lambda (x) (lambda (y) (+ x y))))

        EXP STACK:  [ (set add (lambda (x) (lambda (y) (+ x y)))) ] 
        RTN STACK:  [ 3 ]
        ENV STACK:  [ p1: {} ] 
    
        ... 

        EXP STACK:  [ (lambda (x) (lambda (y) (+ x y))) ] 
                    [ << set-function >> ] 
        RTN STACK:  [ add ]
        ENV STACK:  [ p1: {} ] 

        ... 

        EXP STACK:  [ << set-function >> ] 
        RTN STACK:  [ << (lambda (x) (lambda (y) (+ x y))), p1 >> ]
                    [ add ]
        ENV STACK:  [ p1: {} ] 
       
        EXP STACK:  [ ] 
        RTN STACK:  [ ] 
        ENV STACK:  [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ]

Example: continuing from above, (set add6 (add 6))

        EXP STACK:  [ (set add6 (add 6)) ] 
        RTN STACK:  [ ] 
        ENV STACK:  [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 
        
        ... 

        EXP STACK:  [ (add 6) ] 
                    [ << set-function >> ] 
        RTN STACK:  [ add6 ] 
        ENV STACK:  [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 

        ... 

        EXP STACK:  [ << apply-function >>  ] 
                    [ << set-function >> ] 
        RTN STACK:  [ << (lambda (x) (lambda (y) (+ x y))), p1 >> ]
                    [ 6 ]  
                    [ add6 ] 
        ENV STACK:  [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 
    
        ... 

        EXP STACK:  [ (lambda (y) (+ x y))  ] 
                    [ << replace-env-stack p1 >> ] 
                    [ << set-function >> ] 
        RTN STACK:  [ add6 ] 
        ENV STACK:  [ p2: { x = 6 } ] 
                    [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 

        ... 


        EXP STACK:  [ << replace-env-stack p1 >> ] 
                    [ << set-function >> ] 
        RTN STACK:  [ << (lambda (y) (+ x y)), p2 >> ] 
                    [ add6 ] 
        ENV STACK:  [ p2: { x = 6 } ] 
                    [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 


        EXP STACK:  [ << set-function >> ] 
        RTN STACK:  [ << (lambda (y) (+ x y)), p2 : { x = 6 } >> ] 
                    [ add6 ] 
        ENV STACK:  [ p1: { add = << (lambda (x) (lambda (y) (+ x y))), p1 >> } ] 

        EXP STACK:  [ ] 
        RTN STACK:  [ ] 
        ENV STACK:  [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ]

Example: continuing from above, (add6 1)

        EXP STACK:  [ (add6 1) ] 
        RTN STACK:  [ ] 
        ENV STACK:  [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ] 
        
        ... 

        EXP STACK:  [ << apply-function >> ] 
        RTN STACK:  [ << (lambda (y) (+ x y)), p2: { x = 6 } >> ]
                    [ 1 ] 
        ENV STACK:  [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ] 
    
        EXP STACK:  [ (+ x y) ] 
                    [ << replace-env-stack p1 >> ] 
        RTN STACK:  [ ]
        ENV STACK:  [ p3: { y = 1 } ]
                    [ p2: { x = 6 } ] 
                    [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ] 

        ...

        EXP STACK:  [ << replace-env-stack p1 >> ] 
        RTN STACK:  [ 7 ]
        ENV STACK:  [ p3: { y = 1 } ]
                    [ p2: { x = 6 } ] 
                    [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ] 


        EXP STACK:  [ ] 
        RTN STACK:  [ 7 ]
        ENV STACK:  [ p1: {  add = << (lambda (x) (lambda (y) (+ x y))), p1 >>,
                            add6 = << (lambda (y) (+ x y)), p2: { x = 6 } >>    } ]

Continuations

Let's get some terminology going. Let the 'state' of the interpreter at any given moment consist of the EXP, RTN, and ENV stacks.

A continuation is a 1-ary function which *replaces* the current state of the interpreter with its own, and pushes its argument on the RTN stack.

Example:
Assume we have the current state:

        EXP STACK:  [ ] 
        RTN STACK:  [ ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ]

In other words, we have a 1-ary function 'f' which just applies it's parameter to the number 1. So (f number?) returns T.

Now evaluate (call/cc f)

        EXP STACK:  [ (call/cc f) ] 
        RTN STACK:  [ ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ] 

        EXP STACK:  [ call/cc ] 
                    [ << apply-function >> ]
        RTN STACK:  [ (f) ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ] 

        EXP STACK:  [ << apply-function >> ]
        RTN STACK:  [ << call/cc >> ]  
                    [ (f) ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ]

So call/cc is a function which which creates a continuation and passes that to its argument. Let's label the current state when call/cc is executing 's1'. s1 looks like this:

            EXP STACK:  [ ]  
            RTN STACK:  [ ]  
            ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ]

So the next step is to apply 'f' to the continuation.

        EXP STACK:  [ f ]
                    [ << apply-function >> ] 
        RTN STACK:  [ << continuation-function, s1 >> ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ] 

        ... 

        EXP STACK:  [ << apply-function >> ] 
        RTN STACK:  [ << (lambda (cc) (cc 1)), p1 >>  ]
                    [ << continuation-function, s1 >> ]
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ] 

        EXP STACK:  [ (cc 1) ]
        RTN STACK:  [ << replace-env-stack, p1 >>  ]
        ENV STACK:  [ p2: { cc = << continuation-function, s1 >> ]  
                    [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ] 

        ... 

        EXP STACK:  [ <> ]
        RTN STACK:  [ << continuation-function, s1 >> ]
                    [ (1) ] 
                    [ << replace-env-stack, p1 >>  ]
        ENV STACK:  [ p2: { cc = << continuation-function, s1 >> ]  
                    [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ]

So now, applying the continuation to 1, we replace the current state with s1, and push 1 onto the return stack.

        EXP STACK:  [ ]  
        RTN STACK:  [ 1 ]  
        ENV STACK:  [ p1: { f = << (lambda (cc) (cc 1)), p1 >> } ]

And we're done.

# posted 1:53 AM.

Monday, February 02, 2004

So the tokens -> parse tree -> double dispatch evaluation of the parse tree thing did work. I put together a lisp interpreter, and then modified the scope rules for scheme (the Kamin's lisp only has two scopes: local and global, but his scheme has more: every lambda expression is a closure). What are closures? Well, let me paste in here the text of a letter I wrote to Paul on this subject. Paul asked me to explain the following:

-> (define add (lambda (z) (lambda (z) (+ z 6))))
-> (define add6 (add 1))
-> (add6 3)
9

I responded with this:

In scheme everything is a so-called S-Expression. (symbolic expression). There are three or four types of S-Expressions: numbers, symbols, lists and closures (read: functions). a number is , well, you get it. a symbol is anything that isn't a parenthesis or a number, and a list is defined recursively as
( S-Expression * )
Each of these things can be evaluated. A number evaluates to itself, a symbol evaluates to whatever S-Expression it's bound to in the current, or any enclosing, scope (more on that later). A list can also be evaluated. It's like a function call. The first expression in the list is evaluated, and it must return a function. So to evaluate (+ 2 3) we first evaluate the '+' symbol to get the 2-ary function implementing addition. The rest of the list is passed into the function, and the function is evaluated.
So 123 evalutes to the number 123, (+ 2 3) evaluates to 5, + evaluates to the function implementing addition, and evaluating () is an error (since there is no function to execute.
The single quote, ', is actually a short form. It works like this, ' is short for (quote ). The 'quote' function is 1-ary function which just returns its argument. So 'a is short for (quote a) which evaluates to the symbol "a". '123 is (quote 123) which evalutes to the number 123, '() evaluates to (), ''() evaluates to (quote ()), etc..
Now, the function
(define add (lambda (z) (lambda (z) (+ z 6))))
So you know what a 'define' does.. it evaluates its second argument and binds that to the symbol that is its first argument. In this case we evaluate
(lambda (z) (....))
and bind it to 'add'. well.. (lambda (z) (...)) evaluates to a closure, which for the moment we can think of as a function. It's a 1-ary function with parameter 'z' and body (lambda (z) (+ z 6)). Forget the fact that the body is another lambda expression, we'll deal with that in a moment, it could just as well be something like (print z) or (+ z 1).
So now if i were to type 'add' at the prompt i should get whatever's bound to the symbol 'add' which is the 1-ary function I described above. But your question is, what does the function bound to 'add' actually do. Well, if I type (add 1) I invoke the function add with the argument '1'. What does that give us? Well, when we invoke a function we return the evaluation of the function's body, where we evaluate the body in a new scope in which we bind each parameter of the function to the evaluation of its respective argument. In our case, we are going to evaluate (lambda (z) (+ z 6)) in the scope where 'z' is bound to 1.
so what does (lambda (z) (+ z 6)), when 'z' is '1', return? first, forget about 'z' being bound to '1', and you'll see that the lambda expression evaluates to a 1-ary function with the body (+ z 6).. in other words, it's a function which adds 6 to its single argument. got it? the fact that 'z' is bound to '1' in the scope actually plays no role in the behaviour of this particular function because this function overrides the scope's "z" with the parameter with the same name. It's kind of like this in Java:
class add {
    int z = 1;

    int add6 (int z)
    {
        return z + 6;
    }       
}
If you created an instance of 'add' and called add6(3) you'd get 9.
(I'm still not even sure what the difference between a list and a function application is...
well, hopefully this makes a little more sense now. a list is what i defined above, a function application is the evaluation of a list, where the first element evaluates to a function. So (+ 2 3) is a list, but when we try to evaluate it we first eval + to get the addition function and so on.. we could also write an equivalent list ((lambda (x y) (+ x y)) 2 3). You see, this is the same thing as (+ 2 3). When we evaluate it we evaluate the first element (lambda (x y) (+ x y)) to get a 2-ary function that adds its arguments and then we apply it to the two arguments 2 and 3.
for instance, is (7 3) a valid expression in scheme? Or do you have to say ('7 '3) for it to make sense? Or '(7 3)??)
Well, this should make sense now. (7 3) is a list. If you tried to evaluate it you'd first try to evaluate 7 and get the number 7, which isn't a function so you'd get an error.
('7 '3) is also a list, which when evaluated would give you an error because '7 evaluates to 7 -- not a function. '(7 3) is a list too, namely the list (quote (7 3)) which evalutes to the list (7 3).
Also, what does Miles mean by <<(lambda (x) e), {p}>>? Is he just referring to an ordered pair that corresponds to one of your internal data structures? If so, then maybe I don't need to understand that.
Okay, so this is what I was glossing over when I said closures where just functions. A lambda expression (lambda (...) (...)) evaluates to a closure, which is the function implementing the lambda expression AS WELL as the current scope p (sometimes refered to as the environment). The idea is that when a closure is evaluated you evaluate it in the scope p.
So, let's say you do this:
-> (define a 2)                ; bind 2 to the symbol a
Then the current environment is {a <- 2}.
-> (define plus-two (lambda (x) (+ 2 a))
so 'plus-two' is bound to a closure which contains the lambda function and the current environment. You can see this play out if I was to do this:
-> (plus-two 2)
4
-> (define a 4)
-> (plus-two 2)
6
get it? So above when we defined add6 we actually created the closure:
    << (lambda (z) (+ z 6)), {z <- 1} >>
But applying the lambda expression to some number we first bind the argument to the parameter 'z' which overrides the things in the closure's environment.. like the example I gave with the method parameter overriding the class member.
So check this:
-> (define add (lambda (z) (lambda (x) (+ z x))))
-> (define add7 (add 7))
-> (add7 1)
8
There I changed the parameter for the inner lambda expression so that it uses a different name, which doesn't override the closure's environment binding.

# posted 1:42 AM.

grammar book

Sunday, February 29, 2004

macros

Sunday, February 22, 2004

tokens, snow, and ink

Thursday, February 19, 2004

syntactic closures

quasiquotation

scaling Mount Oregano

Tuesday, February 17, 2004

literal expressions and external representation

macros

Sunday, February 15, 2004

the one-stack interpreter

quasiquotation

macros

Wednesday, February 11, 2004

interpreter progress

quoted expressions

next up

Thursday, February 05, 2004

Representation and implementation

Wednesday, February 04, 2004

Expression Evaluation

Closures

Continuations

Monday, February 02, 2004

Links

archives