Tuesday, March 14, 2023

Linguistics & Andy Weir

If you have read one book by Andy Weir, it's probably The Martian (also available in a classroom edition, with less swearing!); or perhaps you have seen the movie, starring Matt Damon!

Unfortunately, there isn't much interesting going on with linguistics or language representation in The Martian. However, Andy Weir has published two other space-adventure hard-SF books: Artemis, set on the Moon, and Project Hail Mary, which goes interstellar. And they actually do some neat stuff which isn't covered by previous books I've reviewed!

The protagonist of Artemis is bilingual in English and Arabic. For the most part, this is just an interesting bit of character-and-world-building background, which ties in to her national origin (not white or American), which in turn ties in to the economy of the titular city of Artemis. The vast majority of the time, she speaks English, and there are a couple of brief bits of dialog that are italicized to indicate, aha, this is not English, she's speaking Arabic now. But there is one absolutely brilliant line of transliterated, but not translated, Arabic dialog, which occurs when Our Heroine is being bothered by a tourist:

"Ma'alesh, ana ma'aref Englizy," I said with a shrug. [...] Nothing like a language barrier to make people leave you alone.

I do not speak Arabic, but I would bet that means something like "Sorry, I don't speak English."

[Goes to check Google Translate.]

Ah, apparently it means "I suck at English." Close enough! As far as I could tell, that is the only place in which bilingualism actually impacts the plot, and it could easily have been left out, but that is totally a thing that a bilingual person might do, in a very relatable situation! It's like a context clue, but relying on your understanding of the social context being described, rather than the literal context.

Project Hail Mary has a much higher count of interesting linguistic bits, but I can't tell you about them without some spoilers. So, if that's a thing you care about, click that Amazon Affiliate link, buy the book, read it, and then come back here and give me another page view!

Are we good now? Good!

The main character, Ryland Grace, starts out monolingual in English. However, he has to interact with speakers of Russian and Chinese, and an alien, along with text in three of their languages. The only Chinese which is directly represented is the name of the Hail Mary mission commander, Yáo Li-Jie, whose family name "Yáo" is represented as a written character and transliterated in Ryland's dialog and narration. The representation of Russian, on the other hand, is not completely consistent, but spans several representational levels in different circumstances:

Level 0: People are said to be speaking in Russian, but Grace doesn't understand it, so we get no explicit representation.

Level 1: Grace can hear people speaking in Russian, and recognize the sounds, so we get a transliteration of the Russian speech into Latin characters. E.g.:

"Eto Stratt. Chto sluchylos?" she demanded.
"Vzryv v issledovatel'skom tsentre," came the reply.
"The research center blew up," she said.

(Also note the partial diegetic translation, with context that allows the non-Russophone reader to infer what the initial question probably was.)

Level 2: When Grace sees Russian text, that text is represented as-is, in the original orthography, regardless of whether or not Ryland can understand it. E.g.:

The name patch reads ИЛЮХИНА, another name from the crest. This was Ilyukhina's uniform.

In this case, Grace does understand it, because he recognizes his crewmate's name, even if he doesn't speak Russian, and we get a diegetic transliteration. The same thing is done with the character for Yáo's name. And we know that he never actually learned Russian because of another instance of direct orthographic representation:

Five 1-liter bags of clear liquid labelled водка. It's Russian for "vodka". How do I know that? Because I spent months on an aircraft carrier with a bunch of crazy Russian scientists. I saw that word a lot.

Not because he learned to actually read Russian--because he saw that word a lot.

There is one example of orthographic representation of Russian in a Russian person's dialog--just a single word--which is where the inconsistency comes in. Ryland wouldn't have understood it (well, maybe he would, just because it sounds really similar to the English word in this case) or known how to write it, so it should've been transliterated for consistency. Unless Andy Weir was just trying to do some fancy thing beyond my understanding with that.

Anyway, the really cool stuff happens once Grace meets an alien, whom he names "Rocky". Rocky is from 40 Eridani A, lives under 28 atmospheres of pressure at over 200 degrees, and "sees" with passive sonar--very reminiscent of the Hot Abyormenites from Hal Clement's Cycle of Fire! (Although the precise mechanism of sound perception and processing between those species is quite different; in that respect, the Eridians remind me more of the Tenebrans from another Hal Clement novel, Close to Critical.) And...

"Fortunately, Rocky speaks with musical chords."

Like the Machi do, or the aliens from The Jupiter Theft by Donald Moffitt. (Huh. Maybe I should review that book some time....) And yeah, that is pretty dang fortunate, because it makes the alien language ridiculously easy to analyze, and to synthesize. I kinda have to assume that that's exactly why Andy Weir decided to design the Eridians that way--Ryland Grace is not a linguist, and while Weir does a remarkably good job of not sweeping first-contact language barriers under the rug, he's made several decisions about how Eridians and their language work that allow skipping a lot of the potential complexity. Donald Moffitt had a slightly different motivation in his work--giving the aliens a musical language allowed him to make it important to the plot that his main character had perfect pitch, which not all humans do, which made that main character specially suited to learn the alien language and, well... be the main character! But, in another parallel between these two works, by the end of the book Weir has Grace using a keyboard to "speak" to Eridians in their native language.

Grace is not stated to have perfect pitch, but he does rely on Rocky speaking in a consistent scale, particularly to have his computer (which does have perfect pitch!) automatically recognize Eridian words. That's not completely unreasonable, but I am quite glad that Weir did not explicitly state that the Eridian language was actually tied to an absolute pitch scale, because, as briefly mentioned in my review of the Machi languages, there are good reasons to think that any naturally-evolved audio communication system for biological beings could not be based on an absolute scale. Additionally, unless I missed something, the simplest syllables that are actually described in the text from Rocky's speech consist of chords of at least two notes, so identifying phonemes by frequency ratios with no fixed scale is a possibility. Unfortunately, we are told two unlikely-seeming things about the nature of Rocky's speech:

  1. Some Eridian words use chords consisting of notes that can be described in terms of named notes on the Western musical scale. That particular pattern of frequencies (or rather, family of patterns of frequencies, depending on which tuning system you use) for making up a scale is not even universal among human cultures, and certainly has no relation to the use of pitch in any human language with phonemic tone or any whistling language, so it kind of defies belief that an alien species would develop a tone-chord phonology that lined up with the modern Western musical scale. I choose to retcon this by saying that Ryland Grace just picked notes that were close enough to the frequency values spit out by his waveform analysis to make things easier to write down.
  2. Rocky is described as transposing his speech by an octave to indicate certain emotional states. It's important that the transposition is exactly one octave, because that makes it easy for Grace to figure out what's going on and fix it when his computer stops recognizing all of Rocky's words. Now, the octave is a very mathematically natural interval... but the idea of octave equivalence isn't actually natural even for humans; it has to be learned, and its importance as a musical concept is also not universal in human cultures. So... why would an alien species develop octave equivalence as a key feature of their natural language?

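
To make the frequency-ratio idea concrete, here is a minimal Python sketch. None of it comes from the novel: the function names (nearest_note, chord_signature), the A4 = 440 Hz reference pitch, and the example frequencies are all assumptions chosen for illustration. nearest_note does roughly what I imagine Grace doing when he writes Eridian chords down as Western note names (snap each measured frequency to the closest equal-tempered note and see how far off it is), while chord_signature identifies a chord purely by the intervals between its notes, so the result does not depend on any absolute scale.

```python
import math

A4_HZ = 440.0  # assumed reference pitch; nothing in the novel fixes this
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz):
    """Return (name, cents_off) for the closest 12-tone equal-tempered note."""
    # Distance from A4 in semitones, as a real number.
    semitones = 12 * math.log2(freq_hz / A4_HZ)
    nearest = round(semitones)
    cents_off = 100 * (semitones - nearest)
    # MIDI-style numbering: A4 is note 69, and octave numbers change at C.
    midi = 69 + nearest
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents_off


def chord_signature(freqs_hz, cents_step=10):
    """Describe a chord by its intervals above the lowest note, in rounded cents.

    Only frequency ratios are used, so the signature is unchanged when the
    whole chord is transposed by any interval, including an octave.
    """
    lowest = min(freqs_hz)
    cents = [1200 * math.log2(f / lowest) for f in sorted(freqs_hz)]
    # Rounding to a coarse step tolerates small measurement errors.
    return tuple(cents_step * round(c / cents_step) for c in cents)
```

With these made-up values, chord_signature([440.0, 660.0]) and chord_signature([880.0, 1320.0]) come out identical, so a word recognizer built on signatures like this would, at least in principle, not be thrown off when Rocky shifts everything up an octave.
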
A lot of the complication of learning an alien language is avoided by making Rocky (a non-viewpoint character) take on most of the load, rather than Grace. Rocky (if not Eridians in general) apparently has an eidetic memory for sounds, including human speech sounds, and can pick up Grace's English words for things on a single exposure. I have to wonder what implications this might have for the childhood Eridian language acquisition process, and how language works for them in general. The immediate implication, however, is that they quickly get to a point where Grace can just speak English and have Rocky understand him, while Rocky adopts a sort of Eridian-English pidgin in which he speaks Eridian words (not being able to articulate the human speech sounds of English) slotted into an English-like grammar. This has the convenient side-effect of meaning that Weir didn't have to actually construct any Eridian grammar! Although, it does appear that Rocky's native language lacks a distinction between nominative and possessive personal pronouns, based on the fact that his italicized dialog never features possessive pronouns.

This kind of "receptive multilingualism", in which each person speaks their own language while understanding the other, is not a new thing, although I believe this is the first media I have reviewed that uses it. It's notably quite common in Star Wars, where it is used for exactly the same purpose: to portray communication between species who can't pronounce each other's languages, most famously when Han Solo is conversing with Chewbacca, or anyone at all is talking with a beeping R2-series droid. However, receptive multilingualism is also a thing in real life, where it does not occur because of differences in physical articulatory abilities (which are the same for nearly all humans), but either as a side-effect of the simple fact that learning to understand a new language is far easier than learning to speak it, or due to cultural restrictions on who is permitted to use various languages.

While the diegetic purposes are the same, however, the presentation to the audience of receptive multilingualism in Star Wars vs. Project Hail Mary is quite different. In Star Wars, multilingual conversations without a translator are always structured such that the half of the conversation which the audience has access to is enough to infer all of the necessary information from the scene. Weir, however, uses a two-layered approach similar to what he does with Russian and Chinese: any Eridian speech that Grace does not understand is presented as a string of Unicode musical note symbols (e.g., ♪ and ♫)--a conceit which I have seen only once before, in Lorinda J. Taylor's The Termite Queen. There are no appropriate Unicode symbols for chords or staffs, so we have to assume that the symbols actually chosen do not represent anything salient about the actual phonetic content of Rocky's speech, except maybe the total number of chords/syllables, or the relative utterance length. Meanwhile, when Grace understands something that Rocky has said, it is presented as an English translation in italics.

As briefly implied above, during their initial interactions Grace uses a computer to record Rocky's utterances and recognize known utterances later to help him understand what Rocky is saying before he learns to recognize Eridian words himself. Additionally, he uses audio waveform analysis software to extract the component frequencies of each utterance. Computer assistance would almost certainly be essential in documenting and decoding any alien language we might come across, but it's too bad that Grace was not trained as a linguist, or he might have known about all of the software tools that exist for analyzing and documenting human language already, and pulled out Praat for doing spectral analysis of Rocky's speech--it would not be the first time Praat had been used to analyze non-human utterances! (A note on worldbuilding: the starship Hail Mary is supposed to have been loaded with every piece of software available to humanity at launch, just in case, so Praat would definitely have been in there.)
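
For the curious, pulling the component frequencies out of a chord can be sketched in a few lines of Python using nothing but the standard library. This is only an illustration, not what Grace's software (or Praat) actually does: the function names, the 8000 Hz sample rate, and the synthesized two-note "chord" are all assumptions of mine, and the test tones are deliberately chosen to land exactly on the analysis bins. A real recording would need windowing and proper peak detection, which is exactly the kind of thing dedicated tools handle for you.

```python
import cmath
import math


def synth_chord(freqs_hz, sample_rate=8000, n_samples=800):
    """Synthesize a test signal: equal-amplitude sine waves summed together."""
    return [
        sum(math.sin(2 * math.pi * f * n / sample_rate) for f in freqs_hz)
        for n in range(n_samples)
    ]


def dominant_frequencies(samples, sample_rate, count):
    """Return the `count` strongest frequencies (Hz) found by a naive DFT."""
    n = len(samples)
    magnitudes = []
    # Only bins below the Nyquist frequency carry independent information.
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        # Bin k corresponds to a frequency of k * sample_rate / n Hz.
        magnitudes.append((abs(coeff), k * sample_rate / n))
    # Keep the strongest bins, then report them in ascending frequency order.
    strongest = sorted(magnitudes, reverse=True)[:count]
    return sorted(freq for _, freq in strongest)
```

Calling dominant_frequencies(synth_chord([440.0, 660.0]), 8000, 2) recovers the two notes that went into the synthetic chord. The naive transform is slow (quadratic in the number of samples), which is fine for a toy example but is one more reason to reach for existing software in practice.
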

There is one instance in which Weir-via-Grace makes an explicit claim about linguistics:

The oldest words in a language are usually the shortest.

Which is... sketchy. Depending on how exactly you interpret it, it might not be false, but it's not particularly useful. For example, old words tend to be common words, and common words tend to be short... but not all common words are old, and not all old words are common. And this topic comes up when Grace is learning Rocky's words for numbers, which brings up the further question of why Grace assumes that numbers would necessarily be old words. However, this statement has absolutely no relevance to the story. Charitably, perhaps it is meant to show that Grace has no linguistic training, and only a folk understanding of linguistic science? But what really comes across is that the author didn't really know what he was talking about, and the book would've been better with that one sentence just cut out.

This does give us a nice segue to talking about Eridian numbers, though. For the most part, the problem of translating between numeric and unit systems, just like the problem of learning a new language, is offloaded to the non-viewpoint character, who is not merely a linguistic savant but also a mathematical savant, able to do unit-of-measure and numeric base conversions instantaneously in his head (er... cephalothorax?). Grace does, however, learn Eridian numbers to decode Eridian clocks, and works out pretty quickly that they have a base-six numeral system. The choice of how to represent Eridian numerals is kind of interesting--much like using musical note symbols to represent Eridian speech (or at least, that Eridian speech is happening), Weir shows Eridian numerals using existing Unicode symbols that are not typically used in English text and which approximate the diegetic forms of the Eridian symbols. That's the closest we come to any representation of Eridian writing, and it cleverly avoids needing to include any pictures in the text (aside from the diagram of the ship provided in the front of the book). Now, Rocky has 5 limbs and 15 fingers, so why would the Eridians have a base-6 system? Well, while all of Rocky's limbs are functionally interchangeable, balancing on two legs would be unnecessarily tricky for a natural pentapod--but an Eridian could stand, and possibly walk, on any three limbs at a time, leaving two free to use as arms, with a total of 6 fingers between them. Thus, developing a base-6 numeral system based on counting the six fingers of two Eridian hands would be directly analogous to humans developing base-10 numeral systems based on counting the 10 fingers of two of our hands. Note that the actual logic behind Eridian numerals is not addressed in the story, but this seems like a reasonable reverse-engineering of the author's probable intent.
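
The base conversion that Rocky does in his head is easy enough to sketch in Python. This is a minimal illustration under my own assumptions: the function names are hypothetical, and I use plain integers 0 through 5 for the digits rather than guessing at the glyphs used in the book.

```python
def from_base_six(digits):
    """Convert base-six digits (most significant first) to an ordinary int."""
    value = 0
    for d in digits:
        if not 0 <= d < 6:
            raise ValueError(f"not a base-six digit: {d}")
        # Shift the running total up one base-six place, then add the digit.
        value = value * 6 + d
    return value


def to_base_six(value):
    """Convert a non-negative int to base-six digits, most significant first."""
    if value < 0:
        raise ValueError("only non-negative values are supported")
    digits = []
    while True:
        # Peel off the least significant base-six digit each time around.
        value, remainder = divmod(value, 6)
        digits.append(remainder)
        if value == 0:
            break
    return digits[::-1]
```

So Rocky's 15 fingers would be written with the digits 2 and 3 (two sixes plus three), and a round "100" on an Eridian clock would mean 36 to us.
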
If Project Hail Mary had instead been written by a human who natively spoke a minority language of Papua New Guinea with a base-27 body-counting system, perhaps the Eridian numeral system would be slightly more opaque.


