Gliese 1337: vau

Showing posts with label vau. Show all posts

Saturday, May 12, 2012

Even More Concurrency; Or, What You Can't Find By Searching Google Scholar

(Continuing the series of Schrodinger's Equation of Software, First Class Everything: Loops, Tail-Calls, and Continuations, and Automatic Concurrency)

Our last version of cps_map_eval gave us concurrent evaluation of arguments, and without even needing any locks or other kinds of synchronization. That's pretty impressive... except that it only works as long as a) every argument returns at least once and b) no argument returns more than once before all have returned at least once.
Those may not seem like very strict restrictions, except that the whole point of continuations is to break them. Consider this program:


((vau () % (+ (return 2) 2)))

With the last version of cps_map_eval, it will cause an error, because when the first argument thread has terminated, it will not have filled in its slot- it jumped to a different continuation. The parent thread will try to evaluate (+ None 2) and throw an exception, when what we really wanted was for it to not run at all! A similar problem can be seen in this program:


(+  ((vau () % (par (return 1) (return 2))))
    (do-long-computation))

      In this case, no arguments fail to return. But the first argument activates its continuation twice, and from different threads. If both of those continuation calls execute before (do-long-computation) completes, the first one will fill in the argument slot and disappear as intended, but the second will go ahead and execute the function body with an incomplete argument list, because we have assumed erroneously that a continuation can only be called a second time after the function body has begun executing. This is false as soon as argument expressions can themselves contain multiple threads, and even more false if you allow mutation so that external threads might get hold of the continuation.
      I searched far and wide for existing research on combining continuations and concurrency. But it seems that on this particular topic, nothing exists. Fortress does not have continuations, and no experimental LISP dialect has automatic concurrency! There is a lot of interesting work on what continuations mean in concurrent systems, and different ways to deal with them when you have explicit language constructs for parallelism like spawn or pcall, but none of it is easily applicable to a situation where all of the parallelism is implicit. There are some very interesting ideas out there, like this work on subcontinuations and this on inter-thread communication with ports, which I may come back to in later posts, but for now it looks like I, too, can be on the cutting edge of functional programming research.
      What we want is not for all argument threads to terminate before executing the function body, but for all argument slots to be filled, whether it's by the same threads that were started for that purpose or not. In order to do that, we'll have to introduce a counter for the number of argument positions that's decremented whenever an argument slot is filled; when the counter reaches zero, then we can run the function body. A version of cps_map_eval that does that looks like this:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    counter = [arglen]
    finished = Event()
    flock = Lock()
    argv = [None]*arglen

    def assign_val(i,val):
        argv[i] = val
        flock.acquire()
        counter[0] -= 1
        flock.release()
        if counter[0] == 0:
            finished.set()

    def reassign(i,val):
        finished.wait()
        new_argv = argv[:]
        new_argv[i] = val
        return k(new_argv)

    def arg_thread(i,ax):
        eval(ax,v,ArgK( lambda val: assign_val(i,val),
                        lambda val: reassign(i,val)))

    threads = [Thread(target=arg_thread,args=(i,ax))
                for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()

    def arg_k(val):
        argv[-1] = val
        flock.acquire()
        counter[0] -= 1
        flock.release()
        if counter[0] == 0:
            finished.set()
        else:
            finished.wait()
        for t in threads: t.join()
        return k(argv)

    return Tail(x[-1],v,
                ArgK(arg_k,lambda val: reassign(-1,val)))

Again, we're using a 1-element array because Python won't let us mutate variables in closure scopes, but will let us alter the contents of collections. Initial continuations fill in values and decrement the counter; non-initial continuations and the parent thread that will execute the function body wait until all of the slots are filled (every initial continuation has been called). When all of the slots are filled, any waiting threads are unblocked and the function body can be executed for as many times as continuations containing it have been activated. This almost works. Except for one rather important problem. This program will never terminate:


((vau () % (+ (return 1) 1)))

The argument thread will send 1 to the outer continuation, never giving anything to (+ [] 1), and the parent thread will hang forever waiting for that slot to be filled. So, we can't allow the parent thread to block. To fix this, we don't privilege the parent thread. It becomes just another argument evaluator, whose sole speciality is the ability to call join() to cleanup all the other threads eventually. Instead, every initial continuation checks to see whether it filled in the last argument slot, and if so, it evaluates the function body. This means that the thread that began a computation may not be the same thread that completes it- the continuation can get picked up by a different thread, and the parent may end up swapped out for one of its children. This version of cps_map_eval takes care of that:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    counter = [arglen]
    finished = Event()
    flock = Lock()
    argv = [None]*arglen

    def assign_val(i,val):
        argv[i] = val
        flock.acquire()
        counter[0] -= 1
        flock.release()
        if counter[0] == 0:
            finished.set()
            return k(argv)

    def reassign(i,val):
        finished.wait()
        new_argv = argv[:]
        new_argv[i] = val
        return k(new_argv)

    def arg_thread(i,ax):
        eval(ax,v,ArgK( lambda val: assign_val(i,val),
                        lambda val: reassign(i,val)))

    threads = [Thread(target=arg_thread,args=(i,ax))
                 for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()

    def arg_k(val):
        r = assign_val(-1,val)
        for t in threads: t.join()
        return r

    return Tail(x[-1],v,
            ArgK(arg_k, lambda val: reassign(-1,val)))

We're getting closer. All of the previous examples will now behave properly. But this one won't:


((vau () % (+ (return 1)
              ((vau () % (par (return 1)
                              (return 2)))))))

      The first argument slot is never filled. The first thread to return to the second argument slot fills in its value and terminates. And the second thread to return to the second argument slot then waits for the first one to be filled. And it will wait forever, that one blocked thread preventing the program from terminating.
      We could consider having some kind of thread garbage collector that periodically checks to see if all existing threads are blocked and if so terminates the program, or cleans up threads that can never be unblocked because the necessary continuations aren't held by anyone anymore. But what we really want is to simply not block in the first place. The choice of the name Event for the notification objects we've been using from Python's threading library so far is apt: we'd like something like an event loop that can store up computations and then execute them when an event (filling all the argument slots) occurs, or not if that event never occurs.
      We can achieve the same effect without actually building an event loop, though. Everything can still live nicely inside of cps_map_eval, and it looks like this:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    counter = [arglen]
    flock = Lock()
    argv = [None]*arglen
    blocked = set()

    def assign_val(i,val):
        argv[i] = val
        flock.acquire()
        counter[0] -= 1
        if counter[0] == 0:
            flock.release()
            for t in blocked: t.start()
            r = k(argv)
            for t in blocked: t.join()
            blocked.clear()
            return r
        flock.release()

    def reassign(i,val):
        new_argv = argv[:]
        new_argv[i] = val
        return k(new_argv)

    def reactivate(i,val):
        flock.acquire()
        if counter[0] == 0:
            flock.release()
            return reassign(i,val)
        blocked.add(Thread(target=reassign,args=(i,val)))
        flock.release()

    def arg_thread(i,ax):
        eval(ax,v,ArgK( lambda val: assign_val(i,val),
                        lambda val: reactivate(i,val)))

    threads = [Thread(target=arg_thread,args=(i,ax))
                 for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()

    def arg_k(val):
        r = assign_val(-1,val)
        for t in threads: t.join()
        return r

    return Tail(x[-1],v,
            ArgK(arg_k, lambda val: reactivate(-1,val)))

If a continuation call cannot immediately continue, it's dumped into a blocked set and the thread terminates. If the argument slots are ever all filled, that set is checked, and everything in it is re-activated. If they aren't, the set just gets garbage collected when the necessary continuations are no longer held by anybody else. Note that thread join()s only occur after a parent thread has finished all possible useful work, and cannot do anything but return a final value to the top-level eval loop; thus, they do not inhibit concurrency. We could even get along without them, and let Python deal with the cleanup, but if we were using raw POSIX pthreads that wouldn't be an option, so we might as well cleanup our garbage. We did end up having to use 1 lock, to ensure atomic access to the counter, but that's pretty darn small; most of the time, it will never be contended. Note that we need to keep the lock around additions to the blocked set, just in case, but we don't need the lock when iterating over it; that's because we only get to that point if the counter has just been set to zero, and if the counter is zero, we know that no other thread will ever access the blocked set again.
So there it is. Automatic concurrency that plays nice with continuations, without any need to modify the core vau evaluator.

As usual, working code can be found at https://github.com/gliese1337/schrodinger-lisp/.

Thursday, May 3, 2012

Automatic Concurrency

(Continuing the series of Schrodinger's Equation of Software and First Class Everything: Loops, Tail-Calls, and Continuations)

So far, we've had a language with de-jure concurrent evaluation of lambda arguments, but de-facto left-to-right evaluation order. A modification to cps_map_eval can make our language genuinely concurrent.
First, we'll drastically simplify the continuations of each argument evaluation, so they don't explicitly chain together anymore. Instead, we'll just evaluate them all in a loop. If the loop already finished, and all of the arguments have been previously evaluated, the continuation is the same as before: reassign the current argument and re-call the continuation of the whole argument list; otherwise, the continuation is implicitly the next iteration of the loop.


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    argv = [None]*arglen
    done = False
    
    def arg_thread(i,ax):
        def assign_val(val):
            if done:
                new_argv = argv[:]
                new_argv[i] = val
                return k(new_argv)
            else:
                argv[i] = val
        eval(ax,v,assign_val)

    for i, ax in enumerate(x):
        arg_thread(i,ax)
    
    done = True
    return k(argv)

This is much simpler than our previous code, so why didn't we use it? Well, evaluating arguments in an explicit loop means that each call to eval is no longer a tail call. That's not a huge deal, because it will only grow the stack with recursive calls to eval for as many times as you type in nested function calls in argument positions in your source code; but we don't need to grow the stack at all, and doing so gains us nothing by itself since this still results in implicit left-to-right sequential evaluation. However, notice that the extra function used to create a closure for the value of the index for each argument evaluation is named arg_thread- with the code in this form, we can arrange to execute each argument evaluation in its own concurrent thread.


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    argv = [None]*arglen
    done = False
    
    def arg_thread(i,ax):
        def assign_val(val):
            if done:
                new_argv = argv[:]
                new_argv[i] = val
                return k(new_argv)
            else:
                argv[i] = val
        eval(ax,v,assign_val)

    threads = [Thread(target=arg_thread,args=(i,ax))
                for i, ax in enumerate(x)]

    for t in threads: t.start()
    for t in threads: t.join()
    
    done = True
    return k(argv)

This isn't quite true parallelism because of Python's Global Interpreter Lock. But it is true concurrency, and with a better underlying implementation of threads would result in automatic paralellism. The existing thread starts up one new thread for each argument to be evaluated, waits for all of them to terminate, and then returns to the main eval loop. Each new thread makes one call to eval to start up its own evaluation loop. However, this implementation creates one more thread than necessary: the main thread sits idle while all of the arguments are evaluating. This is especially heinous if you only have 1 argument; why start a new thread just to do one sequential evaluation? That can be fixed by changing just a couple of lines so that the last argument in the list is evaluated in the current thread:


    threads = [Thread(target=arg_thread,args=(i,ax))
                for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()
    arg_thread(arglen-1,x[-1]) #make use of the current thread
    for t in threads: t.join()

But now we've re-introduced a recursive call to eval (inside of arg_thread)! Let's go ahead and CPS that away:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    argv = [None]*arglen
    done = [False]
    
    def arg_thread(i,ax):
        def assign_val(val):
            if done[0]:
                new_argv = argv[:]
                new_argv[i] = val
                return k(new_argv)
            else:
                argv[i] = val
        eval(ax,v,assign_val)

    threads = [Thread(target=arg_thread,args=(i,ax))
                for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()
    
    def arg_k(val):
        if done[0]:
            new_argv = argv[:]
            new_argv[-1] = val
            return k(new_argv)
        else:
            argv[-1] = val
            for t in threads: t.join()
            done[0] = True
            return k(argv)
    
    return Tail(x[-1],v,arg_k)

This has some ugly code duplication. We can get rid of that and eliminate all of the conditional branches at the same time. The new version of cps_map_eval looks like this:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])

    argv = [None]*arglen
    
    def assign_val(i,val):
        argv[i] = val

    def reassign(i,val):
        new_argv = argv[:]
        new_argv[i] = val
        return k(new_argv)

    def arg_thread(i,ax):
        eval(ax,v,ArgK( lambda val: assign_val(i,val),
                        lambda val: reassign(i,val)))

    threads = [Thread(target=arg_thread,args=(i,ax))
                for i, ax in enumerate(x[:-1])]

    for t in threads: t.start()
    
    def arg_k(val):
        argv[-1] = val
        for t in threads: t.join()
        return k(argv)

    return Tail(x[-1],v,
                ArgK(arg_k,lambda val: reassign(-1,val)))

ArgK is a callable wrapper class that takes a function to run the first time it's called and a function to run every other time it's called. This eliminates a lot of nesting and allows reassign to be shared everywhere. The definition of ArgK looks like this:


### self-modifying continuations for argument evaluation

class ArgK():
    def __init__(self,first,rest):
        def k(val):
            self.k = rest
            return first(val)
        self.k = k

    def __call__(self,val):
        return self.k(val)

And it contains no conditional branches.
Now that we have a mechanism for concurrent evaluation of arguments, we can use that to build a concurrent analog to the sequential "begin" construction. The simplest way to do it is to simply pass concurrent expressions as arguments to "list"; a simple vau expression to do that and resturn the result of the first expression looks like this:


(define par (vau (a) % (car (eval % (cons list a)))))

It's rather inefficient to create a list if you're going to throw most of it away, though. Just as we could make the built-in sequence function much simpler and more efficient than the sequential version of cps_map_eval, we can make a much better built-in concurrency construction. We'll have it parallel the semantics of "begin" by returning the value of the last listed expression.


def par(k,v,*x):
    """
    Evaluates arguments in parallel, returning the
    last listed value like sequence does.
    Ensures that all parallel threads terminate
    before continuing
    """
    if len(x) == 0: return k(None)
    final = [None]

    def call_k(val):
        return k(final[0])

    def par_thread(ax):
        eval(ax,v,ArgK(lambda val: None,call_k))
    
    threads = [Thread(target=par_thread,args=(ax,))
                for ax in x[:-1]]
    for t in threads: t.start()

    def par_k(val):
        for t in threads: t.join()
        final[0] = val
        return k(val)
    return Tail(x[-1],v,ArgK(par_k,call_k))

The par function starts up a new thread with its own eval loop for every argument except the last, throws away the results of all those other threads, and saves the results of evaluating its last argument. And what if you don't want to throw away the results of all those other threads? Just call "list", and it will evaluate all of it's arguments concurrently! You can test it with this sample program which evaluates a bunch of expressions of varying complexity and prints the results in the order that they are completed:


(define mul (lambda (a b) (* a b)))
(begin
    (print (mul (+ 1 2) 3))
    (print (* 4 5))
    (print (+ 10 12))
    (print 7))
(par
    (print (mul (+ 1 2) 3))
    (print (* 4 5))
    (print (+ 10 12))
    (print 7))
(print
    (list
        (print (mul (+ 1 2) 3))
        (print (* 4 5))
        (print (+ 10 12))
        (print 7)))

As usual, working code can be found at https://github.com/gliese1337/schrodinger-lisp/.

Saturday, April 28, 2012

First-Class Everything: Loops, Tail Calls, and Continuations

      From where we left off in "Schrodinger's Equation of Software", we had first-class functions, first-class syntax, and first-class environments. Since we can build anything out of lambdas, and we can build lambdas out of vau and wrap, our language ought to be complete, right? Well, yes, if we actually had a perfect implementation of the mathematical ideal of vau calculus.
      Unfortunately, the limitations of Python's stack mean that we can't use recursive loops. There are two ways to solve this: we can add another built-in function just to handle loops, or we can separate the interpreted stack from the Python stack and make tail calls work.
      The simplest way to make tail calls work is to use trampolining. So, we'll define a special interpreter data structure for a tail call that encapsulates everything we'll need to execute the function call (i.e., all of the arguments to eval), and return that instead of the value of the function call whenever we have a function call in tail position.


class Tail():
    def __init__(self,expr,env):
        self.expr = expr
        self.env = env

    def __iter__(self):
        yield self.expr
        yield self.env

Then, we modfy eval to check if it got a value or a tail call instruction every time it executes a procedure. If it did get a tail call, it replaces its arguments with the data in the tail call object and loops:


def eval(x, env):
    "Evaluate an expression in an environment."
    while True:
        if isa(x, Symbol):          # variable reference
            return env[x]
        elif isa(x, list):          # (proc exp*)
            proc = eval(x[0], env)
            if hasattr(proc, '__call__'):
                val = proc(env,*x[1:])
                if isa(val, Tail):
                    x, env = val
                else:
                    return val
            else:
                raise ValueError("%s = %s is not a procedure" %
                                  (to_string(x[0]),to_string(proc)))
        else: return x

Note that eval should *never* return a tail call object; it's not a value in the language, it's just bookkeeping for the interpreter. That works pretty well; if you write proper tail-recursive loops, they'll run just fine. But we can do better. For one thing, there's still a recursive call to eval in eval. For another, not all calls to eval in the builtin functions are tail calls. E.g., here's the definition of "if":


    'if': lambda v,z,t,f: Tail((t if eval(z,v) else f), v)

The middle call to eval can't be replaced by Tail. The basic binary operators can't be fixed at all, and these sequences of eval -> procedure -> eval can still result in blowing up the stack. Ideally, we want to eliminate any recursive calls to eval, so no matter what happens to the interpreted language stack, Python's stack doesn't grow. We could make sure that *all* function calls are in tail position by transforming our interpreted programs into CPS- but that would add a whole lot complexity to our interpreter to do the transformation. We can get the same effect by noting that the continuation of an interpreted expression is exactly the same as the continuation of the interpreter when it's interpreting that expression; therefore, we just have to CPS the interpreter. And since our eval function is tiny, that's no trouble at all!
The CPSed eval function looks like this:


def eval(x, env, k=lambda x:x):
    "Evaluate an expression in an environment."
    val = None
    while True:
        if isa(x, Symbol):          # variable reference
            val = k(env[x])
        elif isa(x, list):          # (proc exp*)
            def capture_args():
                cx, cv, ck = x[1:], env, k
                def try_call(proc):
                    if hasattr(proc, '__call__'):
                        return proc(ck,cv,*cx)
                    raise ValueError("%s = %s is not a procedure" %
                                      (to_string(cx[0]),to_string(proc)))
                return try_call
            x, k = x[0], capture_args()
            continue
        else:
            val = k(x)
        if not isa(val, Tail):
            return val
        (x,env,k) = val

Of course, we also have to re-write all of our builtin functions, which are now called with an extra continuation parameter, in CPS, but we'll get to that later. Now, the k parameter to eval isn't a real continuation, because Python doesn't have them; it's a callback that actually does return a value, but the value it returns is the value of executing the continuation. At the top level, when calling eval from outside, there's no continuation to process values, so the callback is just the identity function. Once again, eval can never return a Tail, nor can it return a continuation- but continuations can return Tails (in fact, they usually will), so we have to make sure that we check that whenever we get a value via a call to k.
Function application is a little bit special; we build our own continuation in the eval function that will actually do the procedure call, using a Python closure to save the necessary state, then loop to evaluate the procedure itself. We also avoid the overhead of constructing and destructing a Tail object since we already have all the bits that we need right there, and the environment stays the same. Of course, we also have to update the definition of a Tail, since we've increased the number of parameters to eval:


class Tail():
    def __init__(self,expr,env,k):
        self.expr = expr
        self.env = env
        self.k = k

    def __iter__(self):
        yield self.expr
        yield self.env
        yielf self.k

To help support recursion, we'll update the way that closure __call__s are handled a little bit:


def __call__(self, k, call_env, *args):
    new_env = Env(zip(self.vars, args), self.clos_env)
    new_env[self.sym] = call_env
    if not 'self' in args: new_env['self'] = self #safe recursion
    return Tail(self.body, new_env, k)

Now, every procedure gets access to a special "extra parameter" called "self" that refers to the currently executing function, unless it's overwritten by an explicit parameter of the same name. This ensures that recursion will still work even if the function is re-assigned to a different name, or if it's anonymous, without having to bother with the Y-combinator. Now we can write a simple infinite loop:


((vau (a) %
    (begin
        (define va (eval % a))
        (print va)
        (self (+ va 1))))
    0)

And it works! A never-ending stream of increasing numbers printed to the console, and Python doesn't complain about blowing up the stack.
Back to CPSing the built-in functions. Only a few of them are actually interesting: define, begin, list, and wrap. Define is interesting because it has side-effects: it modifies the environment. Other side-effectful functions (like set! and print) are essentially identical in structure:


def defvar(k,v,var,e):
    def def_k(val):
        v[var] = val
        return k(val)
    return Tail(e,v,def_k)

It just builds a continuation that performs its side-effects before passing the return value onto the external continuation.
Begin has to build a chain of continuations that will evaluate successive expressions in an arbitrarily long list until it gets to the end and passes the value of that expression to the external continuation. It looks like this:


def sequence(k,v,*x):
    if len(x) == 0: return k(None)
    return Tail(x[0],v,k if len(x) == 1 else lambda vx: sequence(k,v,*x[1:]))

The apparent recursive call to sequence does not actually risk exploding the stack, becuase it's not called from here; it's only called as the value of k somewhere in eval. Thus, it only grows the stack by two frames: one for the lambda that throws away intermediate values, and one for the next call to sequence, which will return without actually calling itself. The implementation of "cond" is similar.
List and wrap initially seem similar to begin. However, they have to actually save intermediate results, and that makes a big difference. List is implemented with this function to map evaluation loops over a list of argument expressions:


def cps_map_eval(k,v,*x):
    vx = []
    arglen = len(x)
    def map_loop(i):
        if i == arglen: return k(vx)
        else:
            def assign_val(vmx):
                vx.append(vmx)
                return map_loop(i+1)
            return Tail(x[i],v,assign_val)
    return map_loop(0)

It builds a chain of continuations that tack the value of an expression onto the end of the mapped argument list and then evaluates the next one. Wrap uses that same function to evaluate arguments to a wrapped procedure:


def wrap(k,v,p):
    return Tail(p,v,
        lambda vp:
            k(lambda ck,cv,*x:
                cps_map_eval(lambda vx: vp(ck,cv,*vx),cv,*x)))

      Since, after CPSing, every call to eval in the builtin functions is transformed into constructing a Tail object, CPSing our interpreter has also nicely eliminated the dependency between the definition of the global environment and the interpreter itself.
      Now that we have proper tail calls and a continuation chain that's independent of the Python stack, we're just about ready to implement first-class continuations! There is, however, one little problem: if we give interpreted programs access to their continuations, anything that tries to activate a continuation into a function argument list will break cps_map_eval. The continuation will tack new values onto the old argument list, re-evaluating any arguments after the one whose continuation was called, and try to continue to the next procedure with too many arguments!
      To fix this, we need to ensure two things: 1) argument evaluation order is arbitrary and unimportant; 2) calls to continuations that return to a function argument slot fill in the value of that slot, and only that slot. The new version of cps_map_eval that accomplishes those goals looks like this:


def cps_map_eval(k,v,*x):
    """
    Evaluates the elements of an argument list,
    creating continuations that will assign values
    to the correct indices in the evaluated list.
    """
    arglen = len(x)
    if arglen == 0: return k([])
    argv = [None]*arglen
    done = [False]
    def map_loop(i):
        if i == arglen:
            done[0] = True
            return k(argv)
        else:
            def assign_val(vmx):
                if not done[0]:     #on the first time through,
                    argv[i] = vmx   #evaluate the next argument in the list
                    return map_loop(i+1)
                else: #if this is a continuation call,
                    new_argv = argv[:]   #copy the other argument values
                    new_argv[i] = vmx    #and just overwrite this one
                    return k(new_argv)
            return Tail(x[i],v,assign_val)
    return map_loop(0)

Note that "done" is a single-element list rather than a simple boolean variable due to Python's scoping rules- closures can't mutate variables in higher scopes, but they can mutate the contents of containers. With just a little more work, this could be further modified to ensure that argument evaluation occurs in parallel.
Now that our continuations are safe, we need to decide how to give the interpreted language access to them. It seems a little disingenuous to make calls to continuations look just like calls to functions, but since we turned all of our syntax into function calls, we don't really have any special syntax that we could use that would look different. So, we'll just go with that. We'll wrap up interpreter callbacks in a special callable Continuation object for the interpreted language:


class Continuation():
    def __init__(self,k):
        self.k = k

    def __call__(self, call_k, call_env, *args):
        if len(args) != 1:
            raise Exception("Continuations take exactly 1 argument.")
        return self.k(args[0])

We could just use a Python closure for that, but making it a class of it's own results in nicer debugging output. When called, it throws out the current environment and continuation, and activates the saved callback instead. This version of Continuation acts like a vau expression- it doesn't evaluate its argument! We could use wrap on a continuation value, but it's probably better if explicit arguments to continuations behave the same way as implicit return values, so here's an alternate version that evaluates the continuation's argument:


class Continuation():
    def __init__(self,k):
       self.k = k

    def __call__(self, call_k, call_env, *args):
        if len(args) != 1:
            raise Exception("Continuations take exactly 1 argument.")
        return Tail(args[0],call_env,self.k)

This even looks more like what it actually does: returns to another iteration of the eval loop, but replacing the current continuation with the saved continuation.
We still need to provide a way for programs to get references to continuations. Rather than adding an extra built-in form like call/cc to bind continuations to the environment, we'll just make another minor edit to how closures are __call__ed:


def __call__(self, k, call_env, *args):
    new_env = Env(zip(self.vars, args), self.clos_env)
    new_env[self.sym] = call_env
    if not 'self' in args: new_env['self'] = self #safe recursion
    if not 'return' in args: new_env['return'] = Continuation(k)
    return Tail(self.body, new_env, k)

Every function gets a reference to the current continuation at its call site as an extra over-writable argument, just like "self", which it can return or pass on to other function calls.
Here are a couple of simple programs to verify that it really works:


(print ((vau () %
    (begin
        (+ 1 2)
        (return 7)
        19))))

Prints 7, not 19, because the current continuation that would return 19 is thrown out and replaced.


(define c nil)
(define mul (lambda (a b) (* a b)))
(print
    (mul 
        ((vau () % (begin (set! c return) 2)))
        (+ 2 3)))
(c 3)
(c (+ 1 2))

Prints 10, 15, 15, restarting the calculation with a different value for the first argument each time c is called.

As before, working code can be found at https://github.com/gliese1337/schrodinger-lisp/.

Thursday, April 26, 2012

Schrodinger's Equation of Software

After reading Michael Nielsen's "Lisp as the Maxwell's equations of software", I thought "that eval function is huge; there has got to be a way to simplify it". Additionally, I've long been annoyed by the fact that many of Lisp's syntactic forms and function calls look exactly the same; this leads to enormous frustration when you try to do something like (map and list-of-bools) and are informed by the compiler that "and" is not a function- it's a macro. If it looks like a function, it should act like a function. Ideally, it should actually be a function; the language should not treat it specially.

So, I started playing around trying to eliminate as many special forms as possible from the interpreter. Here's my modification of the tiddlylisp eval function, stripped down with as much as possible moved into the global environment, and with a little extra error handling thrown in:

def eval(x, env):
    def eval(x, env):
    "Evaluate an expression in an environment."
    if isa(x, Symbol):  # variable reference
        return env.find(x)[x]
    elif not isa(x, list):  # constant literal
        return x
    elif x[0] == 'quote' or x[0] == 'q': # (quote exp), or (q exp)
        (_, exp) = x
        return exp
    elif x[0] == 'if':  # (if test conseq alt)
        (_, test, conseq, alt) = x
        return eval((conseq if eval(test, env) else alt), env)
    elif x[0] == 'cond':  # (cond (p1 e1) ... (pn en))
        for (p, e) in x[1:]:
            if eval(p, env):
                return eval(e, env)
        raise ValueError("No Branch Evaluates to True")
    elif x[0] == 'set!':  # (set! var exp)
        (_, var, exp) = x
        env.find(var)[var] = eval(exp, env)
    elif x[0] == 'define':  # (define var exp)
        (_, var, exp) = x
        env[var] = eval(exp, env)
    elif x[0] == 'lambda':  # (lambda (var*) exp)
        (_, vars, exp) = x
        return lambda *args: eval(exp, Env(zip(vars, args), env))
    elif x[0] == 'begin':  # (begin exp*)
        for exp in x[1:]:
            val = eval(exp, env)
        return val
    else:    # (proc exp*)
        exps = [eval(exp, env) for exp in x]
        proc = exps.pop(0)
        if hasattr(proc, '__call__'): return proc(*exps)
        else: raise ValueError("%s is not a procedure" % to_string(x[0]))

The global environment looks like this:

global_env = Env({
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
    '/': operator.div,
    '>': operator.gt,
    '<': operator.lt,
    '>=': operator.ge,
    '<=': operator.le,
    '=': operator.eq,
    'eq?': lambda x,y:(not isa(x, list)) and (x == y),
    'cons': lambda x,y:[x]+y,
    'car': lambda x:x[0],
    'cdr': lambda x:x[1:],
    'list': lambda *x:list(x),
    'append':operator.add,
    'len': len,
    'symbol?': lambda x:isa(x,Symbol),
    'list?': lambda x:isa(x,list),
    'atom?': lambda x:not isa(x,list),
    'exit': exit,
    'True': True,
    'False': False
})

The only things that have to be coded into the eval function are basic syntactic forms that need to be able to control the evaluation of their arguments or directly access the environment. If we implemented macros, most of those could be extracted from eval and implemented as macro definitions in the global environment (in exchange for complicating eval with macro expansion). But, there's another way to simplify eval, without ever needing macros: rather than evaluating all of the arguments to a function, pass the unevaluated expressions along with a reference to the dynamic calling environment and let the function decide which ones it actually needs evaluated. A function that behaves that way is called a vau expression, and using it allows us to eliminate all special syntactic forms from the language; even the definition of a vau expression itself can be implemented as a vau expression!

The interpreter for vau expressions looks like this:

def eval(x, env):
    "Evaluate an expression in an environment."
    if isa(x, Symbol):    # variable reference
        return env[x]
    elif isa(x, list):    # procedure call
        proc = eval(x[0], env)
        if hasattr(proc, '__call__'):
            return proc(env,*x[1:])
        raise ValueError("%s = %s is not a procedure" %
                        (to_string(x[0]),to_string(proc)))
    return x # literal constant

Much simpler! And this is the global environment:

def vau(clos_env, vars, call_env_sym, body):
    def closure(call_env, *args):
        new_env = Env(zip(self.vars, args), self.clos_env)
        new_env[self.sym] = call_env
        return eval(self.body, new_env)
    return closure

def defvar(v,var,e):
    val = eval(e, v)
    v[var] = val
    return val
 
def setvar(v,var,e):
    val = eval(e, v)
    env.find(var)[var] = val
    return val

def cond(v,*x):
    for (p, e) in x:
        if eval(p, v):
            return eval(e, v)
    raise ValueError("No Branch Evaluates to True")

def sequence(v,*x):
    for e in x[:-1]: eval(e, v)
    return eval(x[-1], v)
 
def wrap(v,p):
    p = eval(p,v)
    return lambda v,*x: p(v,*[eval(expr,v) for expr in x])

global_env = Env({
    '+': lambda v,x,y:eval(x,v) + eval(y,v),
    '-': lambda v,x,y:eval(x,v) - eval(y,v),
    '*': lambda v,x,y:eval(x,v) * eval(y,v),
    '/': lambda v,x,y:eval(x,v) / eval(y,v),
    '>': lambda v,x,y:eval(x,v) > eval(y,v),
    '<': lambda v,x,y:eval(x,v) < eval(y,v),
    '>=': lambda v,x,y:eval(x,v) >= eval(y,v),
    '<=': lambda v,x,y:eval(x,v) <= eval(y,v),
    '=': lambda v,x,y:eval(x,v) == eval(y,v),
    'eq?': lambda v,x,y:
    (lambda vx,vy: (not isa(vx, list)) and (vx == vy))(eval(x,v),eval(y,v)),

    'cons': lambda v,x,y:[eval(x,v)]+eval(y,v),
    'car': lambda v,x:eval(x,v)[0],
    'cdr': lambda v,x:eval(x,v)[1:],
    'list': lambda v,*x:[eval(expr, v) for expr in x],
    'append': lambda v,x,y:eval(x,v)+eval(y,v),
    'len':  lambda v,x:len(eval(x,v)),
    'symbol?': lambda v,x:isa(eval(x,v),Symbol),
    'list?': lambda v,x:isa(eval(x,v),list),
    'atom?': lambda v,x:not isa(eval(x,v), list),
    'True':  True,
    'False': False,
    'if':  lambda v,z,t,f: eval((t if eval(z,v) else f), v),
    'cond':  cond,
    'define': defvar,
    'set!':  setvar,
    'vau':  vau,
    'begin': sequence,
    'eval': lambda v,e,x: eval(eval(x,v),eval(e,v)),
    'wrap': wrap
})

The definition of the environment is now more complex because every function has to handle the evaluation of its own arguments. The vau expression that defines new vau expressions takes a list of names of arguments and a body expression, just like a lambda would, but has an additional parameter for the symbol to which the calling environment should be bound. Now, if you look carefully, you may be wondering "where's quote? where's lambda? don't we need those?" Well, no, we don't; quoting is the default behavior anyway, since vaus don't evaluate arguments, and anything that we can do with lambdas we can do with vaus, always exercising explicit control over evaluation. But in order to save typing, we can write quote and lambda as standard library functions. They look like this:

(define q (vau (e) % e))
(define lambda (vau (args body) %
    (wrap (eval % (list vau args (q $) body)))))

In order to define lambda, we needed an extra built-in function called "wrap"; wrap just takes a vau expression and returns a new vau expression that automatically evaluates all of its arguments. Lambda uses eval to produce a vau expression that takes the same list of argument names and the same body as the lambda expression did with a throwaway environment binding, and then wraps it to ensure that all of its arguments will be evaluated.

All that wrap does is essentially to map eval over the argument list. We can write map in vau expressions:

(define null? (vau (x) % (= (eval % x) (q ()))))
(define map (vau (f l) % 
    (begin
        (define fv (eval % f))
        (define lv (eval % l))
        (if (null? lv) (q ())
            (cons (fv (car lv)) (map fv (cdr lv)))))))

So, can we write lambda using an explicit map, and no wrap? Well, not very easily. An attempt is something like this:

(define curry (vau (f a) %
    (begin
        (define fv (eval % f))
        (define av (eval % a))
        (vau (b) & (fv av (eval & b))))))

(define lambda (vau (args body) %
    (vau (eval % args) &
        ((eval & (list vau args (q $) body))
            (map (curry eval &) (eval & args)))))

But the problem with this is that a call to (vau ...) does not evaluate its arguments! So, rather than lambda returning a vau expression that takes the same arguments as the lambda expression, it will end up returning a vau expression that takes arguments named "eval", "%", and "args". If you fix that by evalling the outer vau expression, then you'll have to quote the inner evalled vau expression, which unbinds its eventual evaluation from the environment in which lambda was called. Providing wrap as an additional primitive saves the day.

Wrap also allows us to simplify the definition of our global environment a bit:

global_env = Env({
    '+': wrap(None,operator.add),
    '-': wrap(None,operator.sub),
    '*': wrap(None,operator.mul),
    '/': wrap(None,operator.div),
    '>': wrap(None,operator.gt),
    '<': wrap(None,operator.lt),
    '>=': wrap(None,operator.ge),
    '<=': wrap(None,operator.le),
    '=': wrap(None,operator.eq),
    ...
})

Although this formulation is somewhat less efficient. One could also simply write all of the built-in functions in forms that do not evaluate their arguments, assuming that they will only get primitive values and not expressions passed it, and then define the more general argument-evaluating forms as standard library functions using wrap.

Vau & wrap actually provide a more fundamental abstraction than lambda does, in the sense that they can be used to build lambdas and some other extra stuff. So, if the Lisp interpreter is like software's Maxwell equations, I see the vau-based interpreter as a bit like software's Schroedinger equation, with the special built-in global environment functions (like the definition of vau itself, or the define function, or the if function) filling the roles of the different fundamental particles that obey it, which can be composed to form all matter. Maybe vau expressions are like complex-valued quantum wave functions, while wraps are like the resulting probability distributions.

Vau essentially makes syntax and environments into first-class language citizens, while wrap allows us to drop into the more concise world of implicit evaluation when we want to. Making syntax with explicit control of evaluation into a first-class citizen means that we don't need macros anymore; notice that our definition of lambda used its arguments as components in constructing a new syntax tree and then evaluated that, just as macros are used for compile-time syntax transformations. Anything that can be accomplished with compile-time macros can be accomplished at run-time with a vau expression instead. (Of course, implementing compile-time macros may still be desirable for efficiency reasons.)

Since synactic forms and regular functions are no longer distinct, there's no need for keywords; anything can be used as a variable name, and anything can be shadowed or overwritten. This may seem like a very dangerous thing to allow, as any JavaScript hacker could tell you, but if you're paranoid about other code overwriting important functions in your dynamic environment, you can always write your code in a closure containing a reference to the unmodified global environment, and evaluate everything in that.

Since we have explicit control over when and how arguments are evaluated, we also no longer have to worry about laziness vs. strictness, or call-by-value vs. call-by-name vs. call-by-reference. Those distinctions only arise when the details of evaluation are abstracted away behind a lambda expression. Want strict evaluation? Write your function to evaluate all its arguments up front. Want lazy evaluation? Just evaluate arguments when you actually need them, and cache the results in a mutable local variable. Want call-by-name? Evaluate arguments when you need them and don't bother with caching.

With first-class environments, we can also build namespaces without built-in language support; referencing a symbol in a namespace is just a special case of evaluating an expression in an environment. This allows for cool things like returning an environment from a deeply nested nested closure, and using it to evaluate a function definition elsewhere, thus permitting flattening out of deeply nested closures without losing access to the static environment where the nested closure was originally defined. It's just like defining a class in, say, C++ with a bunch of private instance variables, and then defining all of the method implementations elsewhere, un-nested, with a little annotation that says that they belong to that environment.

For more examples of vau expressions, here's the tiddlylisp square root program from Michael's article re-written into vau form:

(define sqrt (vau (x) % (sqrt-iter 1.0 (eval % x))))

(define sqrt-iter 
    (vau (guess x) % 
        (begin
            (define gv (eval % guess))
            (define xv (eval % x))
            (if (good-enough? gv xv) gv
                (sqrt-iter (improve gv xv) xv)))))

(define good-enough? 
    (vau (guess x) % 
        (< (abs (- (eval % x) (square (eval % guess)))) 0.00001)))

(define improve (vau (guess x) %
    (begin
        (:= gv (eval % guess))
        (average gv (/ (eval % x) gv)))))

(define abs (vau (x) % (begin (define xv (eval % x)) (if (< 0 xv) xv (- 0 xv)))))
(define square (vau (x) % (begin (define xv (eval % x)) (* xv xv))))
(define average (vau (x y) % (* 0.5 (+ (eval % x) (eval % y)))))

Now maybe you're wondering "why did we eval the argument to sqrt? It just gets passed straight on to sqrt-iter, and sqrt-iter evaluates it, so isn't that redundant?"

Well, if you instead write sqrt as (vau (x) % (sqrt-iter 1.0 x)), it will work so long as you only pass in primitive numbers as arguments. If you try (sqrt (+ 1 2)), though, Python will explode at you. What went wrong?

(vau (x) % (sqrt-iter 1.0 x)) passes sqrt-iter the symbol x, not it's value; sqrt-iter then evaluates the symbol x, and, since the dynamic environment of sqrt in which x was bound is included in the scope chain of the dynamic environment passed to sqrt-iter, it resolves to the expression passed to sqrt. If that was a number, all is well. But if it was a compound expression, it blows up! Defining sqrt as (vau (x) % (sqrt-iter 1.0 (eval % x))), however, ensures that when sqrt-iter evaluates the symbol x (which is value of sqrt-iter's formal parameter x), it will always get a number back. The square function actually could be simplified, since we know it's only ever called with primitive number arguments, but I've left it with its own call to eval so that it could be used in other situations.

Appendix: Syntactic Sugar

Based on the analogy of evaluating in an environment to referencing a namespace, we can add a little syntactic sugar to our parser to get rid of all the "evals". We'll have ENV::EXPR desugar to (eval ENV EXPR) and 'EXPR desugar to (q EXPR) so we can re-write our library functions like this:

(define sqrt (vau (x) % (sqrt-iter 1.0 %::x)))

(define sqrt-iter 
    (vau (guess x) % 
        (begin
            (define gv %::guess)
            (define xv %::x)
            (if (good-enough? gv xv) gv
                (sqrt-iter (improve gv xv) xv)))))

(define good-enough? 
    (vau (guess x) % 
        (< (abs (- %::x (square %::guess))) 0.00001)))
(define improve (vau (guess x) %
    (begin
        (:= gv %::guess)
        (average gv (/ %::x gv)))))

(define abs (vau (x) % (begin (define xv %::x) (if (< 0 xv) xv (- 0 xv)))))
(define square (vau (x) % (begin (define xv %::x) (* xv xv))))
(define average (vau (x y) % (* 0.5 (+ %::x %::y))))

(define q (vau (e) % e))
(define lambda (vau (args body) %
    (wrap %::(list vau args '$ body))))

(define null? (vau (x) % (= %::x '())))
(define map (vau (f l) % 
    (begin
        (define fv %::f)
        (define lv %::l)
        (if (null? lv) '()
            (cons (fv (car lv)) (map fv (cdr lv)))))))

(define curry (vau (f a) %
    (begin
        (define fv %::f)
        (define av %::a)
        (vau (b) & (fv av &::b)))))

We can also write an extra library function to get a handle on the current environment, in case we're not inside a vau expression:

(define get-env (vau () % %))

And that allows us to see that any variable reference

    vau> (define a 2)
    vau> a
    2

is actually equivalent to

    vau> (define std (get-env))
    vau> std::'a
    2

evaluating a symbol in the current namespace.

Code for a working vau interpreter can be found at https://github.com/gliese1337/schrodinger-lisp.