Friday, May 6, 2016

A Programming Language for Magic

Or, The Structure and Interpretation of Demonic Incantations, Part II.

As previously mentioned, a great deal of the training for a magician in the animist, demon-magic world consists of developing the mental discipline to compose precise and unambiguous instructions. Natural human languages are terrible at both precision and disambiguation; legalese is a direct result of this fact, and it still fails. In order to summon demons safely, therefore, a specialized formal language will be required- a programming (or "incantation") language.

Many people criticize FORTH and related languages for being hard to follow when data is passed implicitly on the program stack, but that's not an indictment of the concatenative style- it's an indictment of point-free (or, less charitably "pointless") style; i.e., the avoidance of explicit variable names. Concatenative programs can use explicit variables for convenience, and applicative languages can be written in point-free style as well. This is in fact quite common in Haskell and various LISPs, though the syntax of most mainstream languages makes it less convenient.

Despite both being called "languages", programming languages aren't much like human languages. That is, after all, why we need them- because human languages don't do the job. Conversely, programming languages tend to not play nice with humans' natural language abilities. Nobody likes trying to speak a programming language, or even to write fluently in one. When working out ideas on a whiteboard, or describing an algorithm in a paper, real-world programmers and computer scientists frequently avoid real programming languages and fall back on intuitive "pseudo-code" which sacrifices rigor in exchange for more easily communicating the essential ideas to other humans (or themselves 10 minutes later!), with careful translation into correct, computer-interpretable code coming later.

Now, magicians could work in a similar way, and it makes sense that they frequently would: laboriously working out correct spells in a written incantation language ahead of time, and then executing them by summoning a demon with the simple instruction "perform the task described on this paper". One can even go farther and imagine that, given the effort involved in creating such a spell, that they might be compiled into books for frequent re-use, stored in a university library, such that from then on magicians could execute common useful spells by instructing a demon to, e.g., "execute the incantation on page 312 of volume 3 of the Standard Encyclopedia of Spells as stored on shelf 4A of the Unseen University Library".

But, this is not a universal solution. For many simple spells, the description of where to find them may well be longer and more error-prone than simply reciting the spell itself. And, as any sysadmin could tell you, sometimes it's not worth writing a stored program for one-off tasks; you just need to be able to write a quite bash script on the command-line. For magical purposes, therefore, we'll want a programming language that is optimized for speaking, while retaining the precision of any other programming language.

After some discussion on the CONLANG-L mailing list, I think I have a pretty good idea of (one possibility for) what a "magical scripting language" might look like.

The major problem that I see needing to be solved is parenthesization; one can't be expected to properly keep track of indentation levels in speech, and explicit brackets around everything can get really hard to
keep track of even when you can look at what you're doing on a screen (which is why we have special editors that keep track of bracket-matching for us!)

Ideally, the language would not require explicit parenthesization anywhere, and other "syntactic noise" and boilerplate code would be minimized, so that every token of the language carries as much semantic weight as possible, to make them easier to memorize. This leads me to think that a concatenative, combinator-based language (like FORTH, or the non-programming-language conlang Fith) would be the best base, on top of which "idiomatic" flourishes could be added. Concatenative languages are often called "stack based", but they don't have to be implemented with a stack. They are in fact isomorphic to applicative languages like LISPs (i.e., there is a purely mechanical method for transforming any concatenative program into a fully-parenthesized applicative program and vice-versa), and can equally well be thought of as "data flow" languages, specifying how the outputs of one operation connect up with the inputs of another, and this is in fact a topic I have written about on this blog before.

There is a trade-off between being able to keep track of complex data flow without any explicit variable names, which can get very difficult if certain items are used over and over again, and being required to keep track of lots of temporary or bookkeeping variables (like loop indices). A concatenative language that allows you to define variables when you want, but minimizes extra variables whenever they would constitute "syntactic noise" with minimal semantic value, seems to me the best of both worlds. This should make simple incantations relatively easy to compose on-the-fly, and easy to memorize, as they would be mostly flat lists of meaningful words, similar to natural language.

Once you introduce re-usable variables, however, you have to decide the question of how variables are scoped. Letting all variables have the same meaning everywhere ("global scope") is an extremely bad idea; it makes it very dangerous to try re-using existing spells as components of a new spell, because variable names might be re-used for very different purposes in different sub-spells. If we want to be able to define and re-use sub-spells (i.e., functions, subroutines, procedures, or blocks, in terms of real-world programming languages), there are two major approaches to scoping rules available: dynamic scoping, and static scoping.

Loosely speaking, dynamic scoping means that any variables that are not defined in a function (or block) are the same as variables with the same name that are in-scope at the place where the function was called. This can only be figured out when a function is actually run, hence "dynamic". Static scoping means that any variables not defined in a function (or block) are the same as variables with the same name that are in-scope at the place where the function was defined. This can be figured out by looking at the text of a program without running it, hence "static".

Dynamic scope is generally easier to implement in (some kinds of) interpreters, while static scope is generally easier to implement in compilers, which early on in CS history led to a split between some languages using dynamic scope and some using static scope just because it was easier to implement the language that way. The modern concensus, however, is that dynamic scope is almost always a Bad Idea, because it makes it harder to analyze a program statically and prove that it is correct, whether with an automatic analysis tool or just in terms of being able to look at it and easily understand how the program works, without running it. Nevertheless, dynamic scope does have some legitimate use cases, and some modern languages provide work-arounds to let you enforce dynamic scoping where you really need it.

Scala, for example, simulates dynamic scoping with "implicit arguments", and this really highlights a good argument for dynamic scoping in a spoken programming language. One pain point with most programming languages, which would only be exacerbated in a spoken context, is remembering the proper order for long lists of function arguments. This can be ameliorated by named (rather than positional) arguments, but having to recite the names of every argument every time one calls on a sub-spell seems like a lot of boilerplate that it would be nice to be able to eliminate.

With dynamic scoping, however, you get the benefits of named arguments without having to actually list them either in a function definition (although that might well be a good idea for documentary purposes when spells are written down!) or when calling them. Just use reasonable names in the body of a sub-spell for whatever object that sub-spell is manipulating, and as long as variables with the appropriate names have already been defined when you wish to use a sub-spell, that's all you need. Making sure that those names refer to the correct things each time the sub-spell is called is taken care of automatically.


So, what might this incantation language actually look/sound like? Let's look first a simple script to do your dishes after a meal:

loop [more-than COUNT-DISHES 0] [let DISH [NEXT-DISH] DISH pickup DISH SINK move WATER DISH use SOAP DISH use DISH CABINET move]

Things that are assumed to built-in to the language are in lower case, while things that are assumed to be magician-defined are in upper case. If every "verb" has fixed arity (so you always know exactly how many arguments it will need- there are no ambitransitives) then you don't need brackets around arguments or special delimiters between statements/expressions. We only need them to delimit "complement clauses"- i.e., blocks of code that are themselves arguments to other operators (in this, case "loop" and "let").

If magicians want to create their own higher-order spells which take other spells as arguments, there's not much to be done about eliminating those brackets. There are transformations in combinator calculus that can be used to remove them, but the results are extremely difficult for a human to follow, and no one would compose them on-the-fly for anything task of significant complexity. We could introduce a macro metaprogramming system that would let magicians define their own clause-delimiter syntax to eliminate cruft or enhance memorability, but then you might have to remember separate syntax for every sub-spell, which could get even worse than having to remember a single set of brackets. This is, in fact, a real problem that the LISP community (among others) deals with- if your language is too powerful, and too expressive, you get fragmentation problems, and end up with extra memory burden trying to keep track of what ridiculous things your co-workers decided to use that power for. It's often easier to enforce consistency.

Still, we can achieve some improvement by adding some "flourishes" to some of the basic, built-in operators in the language, replacing generic brackets with special construct-specific keywords. This is in fact a strategy taken by many real-world programming languages, which use block keywords like "while...wend" or "if...endif" rather than generic brackets for everything. With that alteration made, the spell might look something like this:

while more-than COUNT-DISH 0 loop let DISH be NEXT-DISH in DISH pickup DISH SINK move WATER DISH use SOAP DISH use DISH CABINET move wend

Now there's a lot of distance between the "loop" and the "wend", which could, in a more complicated spell, make it easy to lose track of, even with the specialized keywords. To help with that, we can do some aggressive let-abstraction- assigning blocks to variables, and then using the short variable name in place of the long literal block definition. That gets us a spell like

while more-than COUNT-DISH 0 loop WASHING wend

In some ways, that's better, but now we have the potentially-confusing sequence of nested "let"s ("let ... be let ... be ... in ... in"), and we can't move the second "let" outside the definition of "WASHING" because it actually needs to be re-evaluated, redefining "DISH", every time "WASHING" is called. This wouldn't be a big problem in a written programming language, but it's seriously annoying in speech. There are a few ways it could be fixed; one solution I like is allowing cataphoric variables (which refer to a later definition) in addition to anaphoric variables (which refer to an earlier definition), so we can switch up which structure to use to avoid repetition, and make a choice about which structure is more conceptually appropriate in different kinds of spells. Cataphoric variables are not terribly common in real-world programming languages, but they do exist, as in Haskell's "where"-clauses. Implementing "where" in out magic language, we can write the spell as

while more-than COUNT-DISH 0 loop WASHING wend
where WASHING is let DISH be NEXT-DISH in DISH pickup DISH SINK move WATER DISH use SOAP DISH use DISH CABINET move done

We might also take advantage of dynamic scoping to write the algorithm in a more functional style, mapping over the collection of dishes, like so:

for DISH in DISHES loop WASHING rof
where WASHING is DISH pickup DISH SINK move WATER DISH use SOAP DISH use DISH CABINET move done


for DISH in DISHES loop WASHING rof

In either case, the nested "let" to define the DISH variable is eliminated by binding a DISH variable in a for-in construct over a collection of DISHES, and relying on dynamic scope to make that variable available in the body of WASHING. Note that this avoids the need for spoken function parameter definitions, although it would be relatively easy to add syntax for them in cases where they do turn out to be useful.

Now, this looks like a pretty straightforward spell, but note that it only works if we assume that things like "pickup", "NEXT-DISH", "use", and so forth have all been rigorously and safely defined ahead of time! These are all actually pretty complex concepts, and in practice demon magic would probably not be used for relatively low-effort, high-descriptive-complexity tasks like washing dirty dishes. But, for magical scripting to be useful, we would a lot of these kinds of everyday things to be pre-defined. And just as *NIX users tend to have opinions about what things they individually want to have available in their command-line environment, and maintain custom .bash_profile/.bashrc files per login account, I imagine that perhaps working magicians carry around phylacteries containing written copies of personally-preferred spells, and begin each incantation by referring to them to establish the appropriate execution environment.