Saturday, February 4, 2012

JavaScript Needs Continuations

I am a JavaScript junkie. I love JavaScript. I love building things in JavaScript. And I love the fact that node.js lets me easily use JavaScript on the server as well as the client. But sometimes JavaScript is just plain missing a really useful feature, and it gets on my nerves. One thing I do not love about JavaScript, and I am far from alone in this, is the proliferation of nested callbacks.

The biggest problem with asynchronous callbacks is that they're infectious. Asynchronicity cannot be isolated and encapsulated.
Consider this code:
var returnHTML = renderAPage(name);
response.end(returnHTML);
...
function renderAPage(name){
    return "Hello "+(name||"World")+"!";
}

Now maybe I want to make it a little more interesting and read a template out of a file or something:
var returnHTML = renderAPage(name);
response.end(returnHTML);
...
function renderAPage(name){
    return fs.readFileSync("hello.html",'utf8').replace("{name}",name||"World");
}

But this is slow (the synchronous read blocks the whole event loop until the file arrives), so I clearly want to use asynchronous IO. But if I do, I cannot maintain the same interface! Altering the internal implementation of renderAPage requires leaking changes up through everything that calls it, and this is true no matter how deeply nested that one asynchronous call may be in other auxiliary functions. My new asynchronous code now looks like this:
renderAPage(name,function(err,returnHTML){
    if(err) throw err;
    response.end(returnHTML);
});
...
function renderAPage(name, callback){
    fs.readFile("hello.html",'utf8', function(err, text){
        if(err) callback(err);
        else callback(null, text.replace("{name}",name||"World"));
    });
}

Much more cluttered, and I've essentially had to transform my entire program by hand into continuation-passing style just so the event loop can take control away after that asynchronous system call and then give it back later. CPS transformations are supposed to be the job of a compiler, not a programmer. What I would really like to do is this:
var returnHTML = renderAPage(name);
response.end(returnHTML);
...
function renderAPage(name){
    var return_to = arguments.continuation;
    fs.readFile("hello.html",'utf8', function(err, text){
        if(err) throw err;
        return_to(text.replace("{name}",name||"World"));
    });
}
Where return_to is a genuine continuation: it is called like a function, but it passes its argument through as the return value at the place where the continuation was saved. Being able to save continuations makes asynchronicity encapsulatable. And unlike other approaches to fixing JavaScript concurrency, this does not require any additional syntax or keywords; just an extra field on arguments representing the continuation at the place where a function was called.
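
One way to picture what arguments.continuation would hold is "the rest of the caller, packaged up as a function". The sketch below only approximates that in today's JavaScript, with the continuation passed by hand. The names readTemplate and handleRequest and the fake response object are hypothetical stand-ins (readTemplate replaces fs.readFile so the example needs no file on disk).

```javascript
// Hypothetical stand-in for fs.readFile: delivers a fixed template asynchronously.
function readTemplate(k) {
    setTimeout(function () { k("Hello {name}!"); }, 0);
}

// The continuation k is an explicit parameter here; under the proposal it
// would arrive implicitly as arguments.continuation.
function renderAPage(name, k) {
    readTemplate(function (text) {
        k(text.replace("{name}", name || "World"));
    });
}

// Everything the caller wants to do after the call becomes the body of k.
function handleRequest(name, response) {
    renderAPage(name, function restOfCaller(returnHTML) {
        response.end(returnHTML);
    });
}

// Usage with a fake response object that just logs what it is given.
handleRequest("Ann", { end: function (html) { console.log(html); } });
// logs "Hello Ann!" once the timer fires
```

The point of the proposal is that restOfCaller would never have to be written out; the interpreter would capture it at the call site.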

Eh, but there is one complication: the way asynchronous calls are handled currently, renderAPage will end up returning twice, and the first time it'll return undefined, which is bad. We could just check the return value and terminate if it's not the real value, kind of like checking the return value of POSIX fork, but that fails to eliminate the leaking of implementation details. We could change the semantics of asynchronous calls, so they always suspend execution of that particular stack and never return. But then, what if you really do want to return twice?
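
The premature first return is easy to reproduce with today's semantics. Here is a minimal sketch, assuming a hypothetical readTemplate that stands in for fs.readFile and a made-up results array that exists only to record the order of events:

```javascript
var results = [];

// Hypothetical stand-in for fs.readFile.
function readTemplate(callback) {
    setTimeout(function () { callback(null, "Hello {name}!"); }, 0);
}

function renderAPage(name) {
    readTemplate(function (err, text) {
        if (err) throw err;
        // No caller is left to return to, so today the value can only be stashed somewhere.
        results.push(text.replace("{name}", name || "World"));
    });
    // Control falls off the end right away: this is the "first return".
}

var returnHTML = renderAPage("Ann");
results.push(returnHTML); // records undefined, before the template has arrived
```

With a saved continuation, the second value would flow back into returnHTML instead of into a side channel like results.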

I don't think that can be addressed without some additional syntax. Fortunately, it's a very simple bit of additional syntax. The break and continue keywords can already be used with label arguments, and they seem the perfect words to use for continuation control with expressions as arguments:
var returnHTML = renderAPage(name);
response.end(returnHTML);
...
function renderAPage(name){
    var return_to = arguments.continuation;
    break fs.readFile("hello.html",'utf8', function(err, text){
        if(err) throw err;
        return_to(text.replace("{name}",name||"World"));
    });
}
Here, I'm using the break keyword to signal that this function call will never return; if it tries to return, execution of that stack just terminates. Thus, the only way to get information out of it is to have the continuation called. But what if I want parallel execution?
var returnHTML = renderAPage(name);
var headers = renderHeaders();
response.writeHead(200,headers);
response.end(returnHTML);
renderAPage and renderHeaders might both contain asynchronous calls with break, but I have no need to run them sequentially, and I don't want to pause the whole thread while waiting for renderAPage to return via continuation. Well, that's where continue comes in:
var returnHTML, headers;
continue returnHTML = renderAPage(name);
continue headers = renderHeaders();
response.writeHead(200,headers);
response.end(returnHTML);
This usage of continue tells the interpreter not to worry about whatever might be going on inside the following expression: don't worry about side effects, don't worry about execution breaks. It may spawn a new thread to handle the expression, but that's an optional implementation choice. Execution is allowed to keep going through the following lines, but as soon as the result of that expression actually needs to be read, it must pause and wait for the return, whether that comes from an actual return statement or from calling a continuation. I'm not sure how one should handle the possibility of multiple returns in this situation; the simplest rule might be that continues are only allowed to return once, and subsequent calls to that continuation will throw an error or be ignored.
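
The "keep going, but wait when the value is read" behavior can be roughly approximated in present-day JavaScript with promises and async/await. To be clear, that is not the syntax proposed here, only an analogy. In this sketch renderAPage and renderHeaders are stand-ins with arbitrary timer delays, and handle and the fake response object are hypothetical. A settled promise also happens to implement the "ignore later returns" rule: the second resolve call does nothing.

```javascript
// Stand-in for the post's renderAPage; returns a promise instead of
// relying on the proposed continue keyword.
function renderAPage(name) {
    return new Promise(function (resolve) {
        setTimeout(function () {
            resolve("Hello " + (name || "World") + "!");
            resolve("ignored"); // a second "return" has no effect on a settled promise
        }, 10);
    });
}

// Stand-in for renderHeaders, with a made-up header set.
function renderHeaders() {
    return new Promise(function (resolve) {
        setTimeout(function () { resolve({ "Content-Type": "text/html" }); }, 5);
    });
}

async function handle(name, response) {
    // Both calls start immediately; neither waits for the other.
    var returnHTML = renderAPage(name);
    var headers = renderHeaders();
    // Execution pauses only at the points where a result is actually read.
    response.writeHead(200, await headers);
    response.end(await returnHTML);
}
```

The difference from the proposal is that the waiting is spelled out with await at each read, and handle itself becomes asynchronous, so the leak up the call chain remains.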

If the interpreter does feel free to actually spawn a new thread to handle "continues", this potentially gives the programmer great power to define new asynchronous functions without having to use the dreaded nextTick or setTimeout, perhaps something like this:
function myAsyncFunction(){
    var k = arguments.continuation;
    continue break (function(){
        ...
        k(return_value);
    })();
}
The continuation is saved; an anonymous internal function is called and specified not to return, but execution of the containing function is allowed to continue, and will initially return undefined, just like fs.readFile. At some point, however, the internal anonymous function calls the continuation, and it returns again.
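
For comparison, here is roughly the same shape written with today's tools, where the continuation has to be passed in explicitly and the deferral is done with setTimeout. It is a sketch under those assumptions, not the proposed syntax, and the computed value is only a placeholder.

```javascript
function myAsyncFunction(k) {
    setTimeout(function () {
        var return_value = 6 * 7; // placeholder for the real work
        k(return_value);          // the "second return", delivered through k
    }, 0);
    // No return statement: the caller immediately gets undefined.
}

var first = myAsyncFunction(function (value) {
    console.log("continuation called with", value);
});
console.log("initial return:", first);
// prints "initial return: undefined", then "continuation called with 42"
```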

These additional behaviors for break and continue do not conflict with any existing syntactically valid constructs, and since they don't require adding any new keywords, they're guaranteed not to break any pre-existing code. All of that extra syntax, however, is just about dealing with concurrency. Concurrency matters because it's the main impetus for my annoyance over JavaScript's lack of continuations, but once you've got continuations, they're useful for much more than that. And while adding real continuations may be a major implementation feat in the interpreter, the interface for it does not have to be, and again it should be perfectly backwards-compatible. Will anyone give me my arguments.continuation?

EDIT: Somehow, my code samples got all messed up when this was first published. They are now all corrected.