Showing posts with label books. Show all posts

Monday, May 26, 2025

Tlön, Uqbar, Orbis Tertius

    Tlön, Uqbar, Orbis Tertius is a 1940 short story by Jorge Luis Borges, translated into English in 1961 and appearing in the collection Labyrinths.

    In modern terms, it describes the discovery of a secret society of sci-fi fans engaged in a multi-generational worldbuilding project--creating an encyclopedia of the world of Tlön. This is essentially the "explain a film plot badly" summary, but a proper summary is very philosophical and literary, and if you want that you can go read the Wikipedia page or something. I'm just here for the linguistic references!

There are no nouns in Tlön's conjectural Ursprache, from which the "present" languages and the dialects are derived: there are impersonal verbs, modified by monosyllabic suffixes (or prefixes) with an adverbial value. For example: there is no word corresponding to the word "moon," but there is a verb which in English would be "to moon" or "to moonate." "The moon rose above the river" is hlör u fang axaxaxas mlö, or literally: "upward behind the onstreaming it mooned."

    This isn't actually particularly odd. Many indigenous North American languages--especially the Salish family--are famous for having a heavy preference for verbs, and deriving nouns from relative clauses or participle-like constructions. Native American languages are not particularly famous for monosyllabic roots, but there's no particular reason those features should not be combined. Such a language could easily turn up in nature, and I would not be the slightest bit surprised if someone discovered a language exactly like that somewhere in Papua New Guinea!

    Borges clearly did not generate a complete Tlönian conlang for this short work, but there is some translated Tlönian there which we might as well try to analyze, just for fun. I don't know what the original text looked like, but at least in English translation, there is not a one-to-one correspondence between Tlönian words and English words, which is a plus. Many times, translations between arbitrary languages will end up with the same number of words just by chance, but by not matching up in number, we know that the words cannot match up one-to-one in sense, which gives us an excuse to think up different ways that information could be organized in the Tlönian sentence. Obviously, with so little data, it's impossible to settle on one obviously correct answer, but I like to think that Tlönian uses something like a relational noun construction (where the "noun" is of course actually a verb), and has reduplication for extended actions, leading to hlör = upward; u fang = at what-is-behind; axaxaxas = multiply-reduplicated form of ax, "to flow", maybe with an adverbial suffix as for "onward", "towards a goal"; and mlö = "(it) is-the-moon". Incidentally, "Axaxaxas mlö" is also the title of one of the books mentioned in Borges's more famous story The Library of Babel.

    We get one other word of Tlönian, though, which seems very much like a noun: hrönir (singular hrön), referring to duplicate instances of things which are lost once and then found multiple times. Maybe it's actually a verb meaning "to be found multiple times" and -ir isn't so much a plural as a pluractional or something. About other parts of Tlön, we are told that

In [languages] of the northern hemisphere [..] the prime unit is not the verb, but the monosyllabic adjective. The noun is formed by an accumulation of adjectives. They do not say "moon," but rather "round airy-light on dark" or "pale-orange-of-the-sky" or any other such combination. In the example selected the mass of adjectives refers to a real object, but this is purely fortuitous.

    This, too, is actually not so strange after all. Per Topics in Warlpiri Grammar by David Nash, Warlpiri does not formally distinguish nouns from adjectives, and can string them together in any order to pinpoint a more precise concept which is the intersection of all the provided descriptors. What would be strange is if, rather than merely focusing on "adjectives", northern Tlönian in fact only had words for semantic attributes and not entities; but Borges himself seems to have had a hard time conceiving of that, given that he includes "sky" in the provided glosses. The philosophy of not having fixed words for specific objects, but just using contextually-relevant descriptions as needed, regardless of whether we think of any of those descriptors as "adjectives" or "nouns" is, however, strongly reminiscent to me of the communicative philosophy of Toki Pona. So even if this is a less naturalistic vision of language than the first version of Tlönian represents, it has at least been shown to be eminently workable by a conlang community arising some 61 years after Borges posed this idea.


Monday, November 13, 2023

The Year of Sanderson

Brandon Sanderson has never put a conlang in a book. But he is aware of them, and has done stuff with fictional languages and naming practices. Brandon Sanderson also speaks Korean; not only is he bilingual, but his second language is not just another European language. It's something very different from English which I might expect to have provided him with a greater degree of metalinguistic awareness than the average author, and raises my expectations for linguistic sophistication in his books.

In my review of Larry Niven's Grammar Lesson, I wrote

There are all sorts of other ways that this kind of grammatical quirk could be integrated into a sci-fi story that have nothing to do with exemplifying or manipulating the speakers' psychology. Brandon Sanderson actually gives a good example of this in the Mistborn trilogy... which is something I shall have to discuss after I get my hands on Secret Project Four and can do a Big Unified Sanderson Linguistics Post.

This is that post. Now, I have not read everything that Brandon has ever written, and I have forgotten some of what I have read, so this will not be completely comprehensive, but we can start with that example from the Mistborn trilogy. (<- Amazon Affiliate link.) A large portion of the plot in the later stages of the story revolves around the interpretation of an ancient prophecy, which is complicated both by magical interference that alters the records, and by actual linguistic drift. Whatever language they speak on the planet Scadrial (which realistically has just one standard language amongst its human inhabitants, given the global level of control exercised by the immortal Lord Ruler) in the Mistborn era, it evidently has an English-like system of strictly gendered animate pronouns, whereas the ancient language of the prophecy has an epicene (gender neutral) third person--a feature which Brandon may have been aware of from Korean! This complicates the process of translation, as any given translator must make a choice about how to render this pronoun in the modern language, which biases the interpretations of the modern characters in plot-significant ways. Good job, Brandon! The names are just... eh, they're fine. But on the bright side, there is so little in the way of native names and non-English cultural terms that the field is wide open for any conlanger who might be hired to create a proper language--there's very little restriction imposed by the existing linguistic canon!

The bulk of this review, as you can tell from the title, will focus on the four books from the Year of Sanderson: Tress of the Emerald Sea, The Frugal Wizard's Handbook, Yumi & the Nightmare Painter, and The Sunlit Man. (<- All Amazon Affiliate links.)

It turns out that The Sunlit Man has the most linguistic content to comment upon, so I'll be going through the books in reverse publication order. There is still little enough that I can do a nearly-complete listing of the interesting bits.

Starting on page 2 of the Dragonsteel Premium Hardcover Edition, we get this:

The man shook him, barking at Nomad in a language he didn’t understand.
“Trans . . . translation?” Nomad croaked.
Sorry, a deep, monotone voice said in his head. We don’t have enough Investiture for that.

which is packed with information: there is translation magic, it's not working right now so we'll have to actually deal with the consequences of a language barrier, but we should expect it to start working eventually because explicitly mentioning it here makes it a massive Chekhov's gun, so we won't be getting a language-learning montage.

Page 21 has two bits of secondary language representation, with usages of diegetic translation and contextual irrelevance:

Another of the officers nodded, staring at Nomad. “Sess Nassith Tor,” he whispered.
Curious, the knight says. I almost understood that. It’s very similar to another language I’m still faintly Connected to.
“Any idea which one?” Nomad growled.
No. But . . . I think . . . Sess Nassith Tor . . . It means something like . . . One Who Escaped the Sun.
...

Glowing Eyes gestured to Nomad. “Kor Sess Nassith Tor,” he said with a sneer, then kicked Nomad again for good measure.
A few officers scrambled forward and grabbed him under the arms to drag him off.

For all I know, this connects with stuff in the Stormlight Archive, which I haven't read yet because I'm waiting for the series to be complete, but since I know from his own public statements that Brandon has not created any full conlangs, I kinda suspect this is ad-hoc--but it works because there is little enough there that the possibilities for how to analyze it and justify the translation are practically unrestricted, and it's impossible to prove any inconsistency. But, we also know that whatever this language is, it is definitely not just a relex of English, because Brandon had enough awareness to not allow for a word-for-word matchup! (I'd guess that "nassith" is some kind of participle, but like I said, interpretations are pretty much unrestricted with this little data.) In the second instance, we could try to make some guesses about what "kor" means based on the surrounding contextual actions, but ultimately it just doesn't actually matter, except that the glowy-eyed guy is emphasizing something, which we get from the italics.

Page 28 gives us a Failure To Communicate and a reminder of why translation magic isn't working, and that Nomad needs to be working on fixing that--i.e., reiterating that we ain't gonna see Nomad doing any monolingual fieldwork. After that, we get all the way to page 64 before we get some more metalinguistic description:

He said this in Alethi on purpose, which wasn’t his native tongue. Previous experiences had taught him not to speak in his own language, lest it slip out in the local dialect. That was how Connection worked; what Auxiliary was doing would make his soul think he’d been raised on this planet, so its language came as naturally to him as his own once had.

So, we get a name for a language that Nomad actually knows, we know that it isn't his native language (so maybe that'll come up later?), and we get some more details of how the translation magic actually works, which turns out to be probably the most sensible way to do it!  

Pages 71 and 79 tell us about the linguistic environment on this particular planet:

“Is this the stranger? What is his name?”
“I was not graced with such information,” Rebeke said. “He doesn’t seem able to understand the words I speak. As if . . . he doesn’t know language.”
Zeal made a few motions with his hands, gesturing at his ears, then tapping his palms together. He thought maybe Nomad was deaf? A reasonable guess, Nomad supposed. No one else on this planet had tried that approach.

So, apparently there is only one acoustic language on this planet (which turns out to be quite reasonable under the circumstances, as it was in Mistborn), and people are not generally aware that there can be other languages. However, there is also at least one sign language--so, yay for sign representation, and, wow, that implies quite a lot about this very tiny society that's struggling to survive. How the heck do they maintain a sign-using language community when there probably aren't that many deaf people around? But, moving on to page 79:

“I offer this thought: do you suppose he’s from a far northern corridor? They speak in ways that, on occasion, make a woman need to concentrate to understand.”
“If it pleases you to be disagreed with, Compassion,” Contemplation said, “I don’t think this is a mere accent. No, not at all. Regardless, there are more pressing matters.[...]”

it turns out that at least some people do have an awareness of dialect continua! Which, in contrast to the situation on Scadrial, absolutely should exist in this setting. 

On page 133, after getting his translation magic to work, Nomad manages to explain the concept of other languages to a local:

“Why do you do that?” Rebeke asked. “Talk gibberish sometimes?”
“It’s my own language,” he said. “In other places, Rebeke, people speak all kinds of words you wouldn’t recognize.”

And then on page 175, we get an in-character acknowledgment of the underlying language barrier:

“Wait, how tall are these mountains?” Nomad asked.
“Tall,” Zeal said. “At least a thousand feet.”
A thousand feet? Like a single thousand?
At first, he assumed that the Connection had stopped working, and he hadn’t interpreted those words correctly.

Not much to say about that aside from, hey, any representation of someone actually having realistic struggles with a non-native language is a rare thing and it's nice to see it acknowledged.

On page 238, we get a little background on the Alethi language that Nomad knows but is not his mother tongue, and also a word in his actual mother tongue with diegetic translation:

They called themselves the Alethi, but we knew them as the Tagarut. The breakers, it means.

On page 290, we get a fun cultural note:

“You blessed fool,” Hardy said. “We’re all a group of blessed fools.”
Wait, the knight says. Is that fellow using the word “blessed” as . . . as a curse?
“It’s a conservative religious society,” Nomad said in Alethi. “You use the tools you’re given.”

This is a good acknowledgment that the common sources of curse words vary from culture to culture. The way that Quebecois French speakers swear is etymologically quite different from how the overlapping English-speaking Canadian community swears! It's also worth noting here that for the most part, Brandon uses a non-diegetic translation convention with dialog tags clarifying the diegetic language when it is other-than-standard to indicate the variety of fictional languages present in this setting.

Page 342 is a comparative gold mine, where we get some information about Nomad's mother tongue and about the local culture:

“It is the name I deserve. And it sounds a little like my birth name, in my own language.”
“Which is?”
“Sigzil,” he whispered. [...]
“Nomad,” Compassion said. “A wanderer with no place. That name no longer fits you, Sigzel, because you have a place. Here, with us.” She said the name a little oddly, according to their own accents.
...

“We name you Zellion,” Contemplation said. [...]
“It means One Who Finds,” Compassion said. “Though I know not the original language.”
“It’s from Yolen,” he whispered. “Where my master was born.”

So, now we know that, whatever the word for "nomad" is in Nomad / Zellion's mother tongue, it is phonetically close to "Sigzil"; and we know that the local language has at least slightly different phonological rules, such that they can only approximate it as "Sigzel"; and we've got a probable participle or relativized verb from a third language, from a named planet so we can potentially correlate that with information from other books in the Cosmere. I really want to emphasize here that, although Brandon isn't being particularly innovative with interpretive techniques (we've just got straight diegetic translation going on), and there are no actual conlangs backing this up, Brandon is still managing to include references to realistic linguistic features that highlight differences that should exist between different fictional languages, which does a lot to add linguistic depth to the setting even without a fully constructed conlang or even a worked-out naming language.

On page 374, we get a couple more names of languages, including, finally, an identification of (a clear Anglicization of) Sigzil/Nomad/Zellion's mother tongue:

“Rosharan,” the man said in his own tongue. “Can we speak in a civilized language, please? Do you speak Malwish?”
Zellion shook his head, pretending not to understand and hoping they didn’t speak any of his native languages. At least he could honestly claim ignorance of Azish, having been forced to overwrite the ability to speak that with the local language.

And that is confirmed on page 413:

It was more of an Alethi thing actually, not an Azish one.

And there we have it: The complete overview of linguistic representation in The Sunlit Man.

Yumi & the Nightmare Painter has a very different approach to linguistic representation. Our two lead characters, Yumi and Painter (aka Nikaro) speak related languages (spoilers: one being a descendant of the other), and this is referenced to explain why they can understand each other, but there is no practical indication in the interactions between Yumi and Nikaro that there are any noticeable differences in the languages (thanks again to some magical translation shenanigans). There is a mention near the end of the book that people from Nikaro's city cannot understand those from Yumi's when the general populations finally meet, so they are in fact different languages, but for all that it impacts foreground character interactions, they might as well be speaking exactly the same language. Accordingly, there is much less material to catalog and analyze.

On page 3 of the Dragonsteel Premium Hardcover Edition, we get an introduction to the term "hion":

After losing his staring match, the nightmare painter strolled along the street, which was silent save for the hum of the hion lines.

which is thoroughly described by the following several paragraphs. But then on page 10, we get introduced to the term "yoki-hijo", with far more ambiguous translation:

The Chosen. The yoki-hijo. The girl of commanding primal spirits.

Are these all different titles? Or does "yoki-hijo" mean "The Chosen"? Or does it mean "the girl of commanding primal spirits"? This gets resolved by implication on page 13, where we have an example of appositional translation:

Yumi was one of the Chosen, picked at birth, granted the ability to influence the hijo, the spirits.

OK, so "hijo" means spirits, so "yoki-hijo" probably means "the girl of commanding primal spirits". That's a lot to pack into the word "yoki" and the semantics of whatever construction is implied by the juxtaposition. Quite a potential challenge for any conlanger who might try to engineer a proper conlang compatible with the textual evidence. (Spoiler: I'd bet the "hi-" in "hion" and "hijo" are meant to be related.)

On the next page (14), we get explicit translation by the narrator (who happens to be Hoid):

Liyun, her kihomaban—a word that meant something between a guardian and a sponsor. We’ll use the term “warden” for simplicity.

Back on page 12, we get introduced to the word "tobok", with a definition implied by context in the process of getting dressed:

Then the tobok, in two layers of thick colorful cloth, with a wide bell skirt.

And explicit translation for the term "getuk":

Torish clogs—they call them getuk—feel like bricks tied to my feet.

"Kihomaban" and "getuk" appear nowhere else after they are introduced and defined, so they seem to serve the sole purpose of providing scene setting--they tell you something about what the language they come from sounds like, and Hoid providing definitions reminds you that these people Are Not Speaking English. "Tobok" gets reused throughout the novel as a borrowed-into-English cultural term for this specific type of clothing, but never in dialog or thoughts by the actual characters. This word is apparently inspired by "bok", the Korean word for "clothing", which backs up the general Korean-inspired aesthetic of the whole book.

Also on page 14, we get an explicit discussion of historical linguistics and grammar:

Yumi quickly rose. “Is it time, Warden-nimi?” she said, with enormous respect.
Yumi’s and Painter’s languages shared a common root, and in both there was a certain affectation I find hard to express in your tongue. They could conjugate sentences, or add modifiers to words, to indicate praise or derision. Interestingly, no curses or swears existed among them. They would simply change a word to its lowest form instead.

This is obviously, as Brandon has publicly admitted, directly ripped off from Korean and Japanese. But much like "kihomaban" and "getuk", we don't really see this surfaced in the text; instead, dialog is annotated with parenthesized "(lowly)" and "(highly)" where relevant. That's not really something I would've predicted would work, and the fact that Brandon is massively famous and popular already means that I can't really use this book as evidence that it's a good idea. Maybe it's a failed experiment. But, I haven't actually seen any complaints about it in any reviews so far, so maybe that's a positive signal. I probably need to do a survey about this--comment if you have thoughts!

A good bit later, across pages 44 and 45, we get the common noun "kon":

“Six? A bowl normally costs two hundred kon.”
...
He laid a ten-kon coin on the counter,

which in context is pretty obviously a unit of currency. After that, all the language evidence is in proper names of people and places. For Yumi's time period (and thus Yumi's language), we have:

Personal Names: Chaeyung Dwookim Gyundok Honam Hwanji Liyun Samjae Sunjun Yumi
Places: Torio Gongsha Ihosen
Common Nouns: getuk kihomaban tobok

For Nikaro's time period, we have:

Personal Names: Akane Gaino Guri Hikiri Ikonora Ito Izumakamo Lee Masaka Nikaro Shinja Shishi Sukishi Takanda Tatomi Tesuaka Tojin Usasha Yuinshi
Places: Fuhima Futinoro Jito Kilahito Nagadan Shinzua
Common Nouns: kon hion

That's a decent corpus of words for a conlanger to start working with. The Nikaro-era names are pretty clearly Japanese-inspired, while the Yumi-era names are more Korean-esque, which implies a quite significant level of cultural change in naming practices and phonological shifts between Yumi's ancient language and Nikaro's modern one.
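For anyone who wants to poke at that corpus, here is a minimal sketch of one way to put a number on the "Japanese-inspired vs. Korean-esque" impression. It assumes, purely as a rough heuristic of my own, that a word ending in a written vowel is a loose proxy for a preference for open final syllables; the function name and the vowel set are illustrative choices, and the word lists are just the tallies above lumped together by era.

```python
# Word lists copied from the tallies above (names, places, and common nouns per era).
YUMI_ERA = (
    "Chaeyung Dwookim Gyundok Honam Hwanji Liyun Samjae Sunjun Yumi "
    "Torio Gongsha Ihosen getuk kihomaban tobok"
).split()
NIKARO_ERA = (
    "Akane Gaino Guri Hikiri Ikonora Ito Izumakamo Lee Masaka Nikaro "
    "Shinja Shishi Sukishi Takanda Tatomi Tesuaka Tojin Usasha Yuinshi "
    "Fuhima Futinoro Jito Kilahito Nagadan Shinzua kon hion"
).split()

# Assumption: spelling stands in for pronunciation, so only a, e, i, o, u count as vowels.
VOWELS = set("aeiou")


def vowel_final_share(words):
    """Return (count, total): how many words end in a written vowel, out of how many."""
    count = sum(1 for word in words if word[-1].lower() in VOWELS)
    return count, len(words)


for label, words in (("Yumi era", YUMI_ERA), ("Nikaro era", NIKARO_ERA)):
    count, total = vowel_final_share(words)
    print(f"{label}: {count}/{total} words end in a vowel")
```

On these lists the tally comes out to 5 of 15 for Yumi's era and 23 of 27 for Nikaro's, which at least lines up with the impression of a shift toward open syllables. With a sample this small, and with spelling standing in for sound, it is a starting point for a conlanger rather than evidence of anything.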

The Frugal Wizard's Handbook for Surviving Medieval England has some explicit paratextual discussion of linguistic issues, but otherwise not much of note. There are culturally-appropriate names for the simulated time period, which is neat and reflects a commendable research effort, but actually feels a little off given that the native-to-the-world characters speak essentially modern English, not the language in which those names would have been generated. There are a few other period-appropriate terms but for the most part they just get diegetically translated. There are two excerpts from the eponymous Handbook which directly address linguistic issues; on page 67 of the Dragonsteel Premium Hardcover Edition, we get this explanation:

GUARANTEE TWO
The people on Great Britain will speak a language that is intelligible to modern English speakers. We chose our dimensional band specifically for this reason!

In other words, there's a darn good diegetic reason why there is no language barrier in this interuniversal travel situation!

And then much later on page 146:

UNINTELLIGIBLE DIMENSIONS
The population of the British Isles in these dimensions doesn’t speak a language intelligible to any known Earth language speakers. Perfect for linguists or those who want an extra challenge! Visit the speedrun section of our website for current records for full dictionary creation in the various language groups.

Which I like to point out just because acknowledgment of linguists makes me happy. On page 132, we have a situation where a proper language might become relevant, as our protagonist runs into some foreigners who do not speak I-Can't-Believe-It's-Not-English. But then... it turns out that their leader does speak English after all. Oh well.

The most interesting thing about this book is the parallel between the exposition provided by excerpts from the Handbook and the more non-diegetic linguistic and cultural notes in Sara Nović's True Biz. With two examples of intercalated paratext, I've gotta think this is a solid expositional technique for linguistic information that deserves further attention. (And I've really gotta just write something up on paratext in general one of these days--especially the more traditional forms, like glossaries and pronunciation guides.)

Tress of the Emerald Sea has even fewer references to language, but there are a few. Starting on page 10 of the Dragonsteel Premium Hardcover Edition, we get a bit of description that acknowledges the existence of multiple languages and writing systems on Tress's world:

As they ate, she considered showing the two men her new cup. It was made completely of tin, stamped with letters in a language that ran top to bottom instead of left to right.

And much later on page 254, we get the sole mention of the (Anglicized) name of Tress's language, and a reference to the translation magic that we also see used in The Sunlit Man:

“Are you even speaking Klisian?” Tress asked.
“Technically yes, though I’m using Connection to translate my thoughts, which are in a language you’ve never heard of.[...]"

And while I don't want to ascribe a character's statements to the author (I have no idea how much Brandon knows about psycholinguistics or translation theory, so I'll give him the benefit of the doubt), I should point out for the sake of readers with less linguistic training that 

  1. Not everyone thinks in language--which will be a big "well, duh" to some of you, and absolutely mindblowing to some others. This particular character apparently does, though.
  2. Thinking in one language and then translating those thoughts into another language to speak is not a good way to think. It's very inefficient, and it's not how high-level speakers of adult-acquired languages work. Whether or not you perceive yourself as thinking "in" a particular language, for communicative purposes you should be aiming to encode your thoughts directly into the target language in a single step, not doing translation in your head. I have to assume that translation magic is being used sub-optimally in this case compared to its presentation in The Sunlit Man, and there's just sufficient power behind it to make the results seem competent and fluent anyway. 

On page 94, we get introduced to a deaf character (Fort) using an assistive device (acquired from off-world--Tress's planet has a far lower technological level) which transcribes speech for him and allows him to write his responses. Brandon makes use of bold face to indicate writing on Fort's communication board to distinguish it from acoustic speech in dialog. But the fact that such a device is both needed and useful brings up all sorts of questions about the broader society on Tress's world, which are much more interesting than the mere fact of the typographical convention used to represent it in the story.

We are told that, before acquiring his assistive device, Fort relied on lipreading, despite its limitations (and we are warned about the actual limitations of strict lipreading, so good job dispelling popular misconceptions there, Brandon!), and that this was in his childhood--so he didn't acquire language and literacy, and then lose his hearing as an adult. The Coppermind page for Fort claims that he previously communicated with a mix of sign language and lip reading, but that's not actually supported by the text--the only explicit mention of sign language is on page 448:

And Fort . . . well, he understood. Not because he knew another sign language, but because of that same bond.

And that is narration, not attributed to Fort himself, and doesn't actually indicate that he does know any sign languages. There's an earlier oblique reference on page 293:

Fort didn’t fill the time with idle chitchat, and while you might ascribe this to his deafness, I’ve known more than a few Deaf people who were quite the blabberhands.

But again, that is the narrator talking, and Hoid does not actually say that Fort is capable of using sign language--only that he has met other Deaf people who do. 

So, we have a deaf guy on a pre-industrial world who knows how to read and write. His parents cared about him enough to ensure that he was not subject to language deprivation and could learn to lipread for as much as that is worth, and then to become literate. This indicates surprisingly progressive views about deaf people, and we can also infer from other dialog that deaf people aren't particularly rare on this world (because someone once met a deaf dancer as well, who might have actually been a made-up stand-in for a deaf princess--but hey, deaf princess!) It's possible that Fort did grow up with sign language, but simply has to deal with a world full of other people who don't understand it themselves, so the board is useful--but given that no character other than the narrator, Hoid, ever mentions sign, and Hoid does not mention sign when we are told how Fort actually communicates, it seems that there is not enough of a population of deaf people with the ability to find and interact with each other on this world to sustain a viable sign language community. That's a weird contrast with having the social support to learn lipreading, reading, and writing, and that being common enough that one character was able to meet two socially high-functioning deaf people in not-that-many years of traveling the world. Not at all inconsistent, just kinda weird, and an interesting contrast to the situation in The Sunlit Man, where there is an awareness of sign language despite the extremely small world and corresponding extremely small population. Maybe everyone on Tress's world is actually a horrible audist and abused Fort into learning to interface with a language he could not perceive in its intended medium, but I kinda like the idea that everyone on Tress's world is just super supportive of deaf people while being completely ignorant of the concept of sign language.



The Linguistically Interesting Media Index

Thursday, September 28, 2023

Babel: Or, the Necessity of Violence

Babel, by R. F. Kuang, is a 2022 Alternate-History low-fantasy novel about translators who perform enchantments for the glory of the British Empire. The magic is fictional, but the translation theory is real: the Oxford translation class lectures are a legit callback to grad school. Why are translators performing magic? Because true translation is fundamentally impossible, and magic arises from the sometimes-subtle, sometimes-vast differences in meanings between attempted translations from one language to another.

Naturally, there is quite a lot of non-English representation in such a novel. Our main character, Robin, is a native speaker of Cantonese, so the first example we get is a string of orthographic Chinese characters, which I cannot type easily to reproduce for you here--but, we immediately get diegetic transcription and translation:

'Húlún tūn zǎo,' he read slowly, taking care to enunciate every syllable. He switched to English. 'To accept without thinking.'

Note the conventional use of italics for non-English text. Here we get three parallel representations of the same bit of language, allowing the reader to understand what it actually looks like written, approximately how it sounds via romanization, and approximately what it means through Robin's translation of what he just read.

Robin is quickly introduced to the non-magical responsibilities of translation and interpretation:

This all hinged on him, Robin realized. The choice was his. Only he could determine the truth, because only he could communicate it to all parties.

The book is chock full of this kind of stuff--not just directly representing other languages, but explicitly teaching the reader about real concepts in linguistics and translation theory through the mode of having the characters learn and discuss them. Skipping ahead a bit, here is a taste of one of the theory lectures:

'The first lesson any good translator internalizes is that there is no one-to-one correlation between words or even concepts from one language to another. [...] If [there was], then translation would not be a highly skilled profession - we would simply sit a class full of dewy-eyed freshers down with dictionaries and have the completed works of the Buddha on our shelves in no time. Instead, we have to learn to dance between that age-old dichotomy, helpfully elucidated by Cicero and Hieronymus: verbum e verbo and sensum e sensu. Can anyone--'
'Word for word,' Letty said promptly. 'And sense for sense.'

And a bit of philosophizing later on reminded me rather strongly of the aliens from The Embedding:

We will never speak the divine language. But by amassing all the world's languages under this roof, by collecting the full range of human expressions, or as near to it as we can get, we can try.

And in fact, this is not a bad description of the project of natural language documentation and typology. 

The next instance of non-English representation makes use of footnotes to provide a non-diegetic translation for what the character already understands:

Auferre trucidare rapere falsis nominibus imperium atque ubi solitudinem faciunt pacem appellant.
Robin parsed the sentence, consulted his dictionary to check that auferre meant what he thought it did, then wrote out his translation.*
*'Robbery, butchery, and theft - they call these things empire, and where they create a desert, they call it peace.'

Although in this case, the translation does exist in the story, and so could've been included in-line, that is not so for all of the footnotes, some of which exist entirely outside of the story. For example:

for a full year Robin thought The Rape of the Lock was about fornication with an iron bolt instead of the theft of hair.*
* A reasonable error. By rape, Pope meant 'to snatch, to take by force', which is an older meaning derived from the Latin rapere.

I could continue with a detailed analysis of every sample of non-English language, as I did exhaustively for some other books earlier on in this series--but I would end up quoting from about a third of all pages in the book, and we'd be here all day! The range of integrative and interpretive techniques in use is actually pretty well covered by those few examples I have quoted so far. But what's really unique about the book is the extent to which it confronts the reader with concepts that you might not otherwise have to face outside of a graduate-level course in linguistics or translation, and in ways that are actually relevant to the plot. Consider:

What was a word? What was the smallest possible unit of meaning, and why was that different from a word? Was a word different from a character? In what ways was Chinese speech different from Chinese writing?

That matters for understanding the magic system and for understanding the nature of the relationships between characters. This is a masterclass in science fiction with linguistics as the underlying science... except that it's technically fantasy instead of science fiction. There's refreshingly not a single whiff of Whorfianism or UG anywhere--as there shouldn't be, since those concepts would not have existed in the historical period in which this story is set!

The book also briefly addresses The Forbidden Experiment--and contributes to foreshadowing the true villainy of one of our antagonists by having him seriously entertain it as a possibility (which is unsurprising, given how he has up till then manipulated the lives of Robin and his friends).

I shall leave off with one more quote on semantic theory:

Does meaning refer to something that supersedes the words we use to describe our world? I think, intuitively, yes. Otherwise we would have no basis for critiquing a translation as accurate or inaccurate, not without some unspeakable sense of what it lacked.

 



True Biz & A Literature of Sign

So, remember that whole thing about A Literature of Sign, and how the heck are you supposed to put ASL into a book for an English audience when ASL has no standard orthography?

Well, Sara Nović does some stuff. True Biz is a 2022 novel about the administration, students, and families of students of the fictional River Valley School for the Deaf boarding high school. It's straight-up realistic fiction, practically literary, exploring civil rights and what it's like to grow up deaf in a hearing world--really not my usual genre, but dangit, I liked it anyway, and it's certainly linguistically interesting. There is so much linguistically-interesting stuff, in fact, that I gave up and stopped putting in bookmarks after page 87 out of 381 in the hardcover edition--so, I will not be quoting every example of non-English representation in this review, just a representative sample that's indicative of the range of techniques used.

The first notable thing Nović does in this novel is not use quotation marks to set off dialog, even when characters are speaking orally. It's a little jarring at first, but I got used to it fairly quickly. I am not sure what the authorial intent behind this decision was, but for me it had the effect of turning off (or rather, failing to turn on) my internal voice when encountering dialog, thus distancing my experience of the text from the mental audio loop. Which I could totally believe is part of the intent, since it's a book about Deaf people!

One of our viewpoint characters is Charlie, a severely hard-of-hearing girl whose parents opted for a cochlear implant that doesn't really work right, resulting in language deprivation. She begins learning ASL when transferred to River Valley, and her experience is contrasted with that of Austin, a native signer from a multi-generational Deaf family. Charlie doesn't always understand everything that is being said to or around her, in ASL or in English, and Nović represents this with underscores inserted into dialog in place of words that Charlie missed. Where relevant, these misunderstandings are resolved diegetically--so you, the reader, understand exactly as much as Charlie does, and in the same way. For example:

[The headmistress] looked back at Charlie. _____ here at school will be key, she said. As with any language.
The what? said Charlie.
The headmistress removed a notepad from beneath a pile of paperwork.
IMMERSION she wrote.

Immediately before this, we get a nuanced introduction to simcomm (simultaneous communication), although it is not explicitly referenced that way.

To sign and talk at the same time was an imperfect operation, the headmistress warned, and one Charlie wouldn't see much of at River Valley after today. Charlie longed to find meaning in the arc of the woman's hands, but that meant looking away from her lips, something she couldn't afford to do.

ASL conversations are all translated into English in italics, but Nović captures some of the spatial nature of ASL by arranging the dialog in columns according to the speaker, so each speaker's ASL dialog is spatially separated on the page just as their signing spaces would be separated in reality. Even when quoting a single ASL speaker, not in a conversation, their words and dialog tags will be confined to a distinct column separated from the flow of the main text, emphasizing the spatially-confined nature of the ASL utterance. The first example of such a conversation is as follows:

You hungry?

Hi, sweetie. How's school? All set up?
Getting there.
How was the meeting?
Fine, she said.
The girl struggled in mainstream.
No surprise there.
I'm sure you'll fix her right up.
We will. Come eat.

Right at the beginning of the book, I was uncertain whether this was intended to be a book for a Deaf audience, or a book to explain Deafness to a hearing audience. One particular feature shifted me solidly to the "this is for us hearies" side, though--the periodic inclusion between chapters of non-fiction explanatory notes on aspects of ASL and of Deaf culture and history that may be relevant to understanding what's going on in the adjacent chapters. This feels like a form of paratext, but where linguistic paratext usually takes the form of, e.g., name pronunciation guides in the front matter or back matter, or glossaries in an appendix--all presentations which can be easily skipped over if the reader doesn't care about them--this is interleaved with the main text, so it must be engaged with. This seems like an excellent way to present additional information about a minority culture in the real world, but I am uncertain how well it would translate to, for example, explaining a conlang in a fictional world. I was slightly reminded of this by the fictionally-non-fictional excerpts from the eponymous guide in Brandon Sanderson's The Frugal Wizard's Guide to Surviving Medieval England (review forthcoming), so it might be workable.

Finally, Nović occasionally includes schematic illustrations of signs inline in the text. Most pervasively, each chapter is headed with an illustration of the ASL fingerspelling handshape for that chapter's viewpoint character's first initial. In a couple of places, however, where Charlie is learning new signs, dictionary-style schematic illustrations of complex signs are included in parallel with the italicized-English translations. This is not at all space efficient, so it can't be used everywhere, but limited deployment works to help teach the reader a small number of signs and provide an initial mental image to help inform how you interpret subsequent conversations as signalled by the ASL-specific page formatting. 




Tuesday, September 26, 2023

Stridulation in Landscape with Invisible Hand

Landscape with Invisible Hand is a 2023 sci-fi film based on a book of the same name from 2017, taking its title from a work of art created by the protagonist in the story. It is set in a world that has been economically colonized by aliens known as the Vuvv--though it is unclear where that name comes from, as their language is unpronounceable by humans. And that's why we're doing this review!

The sounds of the Vuvv language are produced by stridulation--rubbing together pads on the ends of their appendages.


A Vuvv, seen rubbing pads together mid-sentence.

The Vuvv in the film are seen making a wide variety of articulatory gestures, which suggests the possibility of a range of distinguishable stridulation sounds which could form the basis of a phonemic inventory. However, this variety is not reflected in the accompanying audio. According to IMDB,
The unique sound of the alien Vuvv language was created using dried out coconuts with nails in them, rubbed against mossy rocks.
The inspiration for the sound of the alien Vuvv language came from a line in the book that the film is based on that describes the Vuvv language as "someone walking forcefully in corduroys."
Now, there is no inherent reason why a fully fleshed-out language could not be articulated by rubbing coconuts with nails in them against rocks... but between the experience of actually listening to the film, and the fact that IMDB doesn't list any language creator or consultant in the credits, I'm pretty sure they didn't bother. Also note that Vuvv language lessons for humans are a thing in the film, so we know that the relevant acoustic patterns are audible to humans, and it's not a matter of just not bothering to represent stuff that is theoretically there but not perceivable by the human characters or audience, as would be the case in, for example, a film adaptation of Little Fuzzy. (It's possible that the glyphs for Vuvv writing actually mean something, but I don't have high hopes for that.) Awesome idea for an alien language, and the presentation of the fictional language works for the film, but it's a little disappointing that there isn't more there. On the other hand, if Phil Lord and/or Chris Miller are reading--hey, you still have a chance to make Project Hail Mary the first major film to feature a fully fleshed-out alien language not pronounceable by human actors! And it would really be a shame to deprive audiences of the opportunity to learn to recognize Eridian words right alongside Ryland Grace...

But anyway, back to Landscape--there's really just one consistent choice of integration technique to make the Vuvv dialog comprehensible to the audience. It's 100% diegetic translation, which is carried out automatically by translator boxes that allow the characters in the scene to understand the Vuvvs talking to them. Meanwhile, all of the Vuvvs we see on-screen seem to be receptively bilingual--they can't pronounce human languages, just as we can't pronounce theirs, but they can comprehend English when spoken to. This arrangement actually works out really well--since translation is necessary for the characters, this nicely avoids the need for any additional integration mechanisms just for the sake of the audience. I.e., we don't need to worry about the possible need for subtitles. And that's a darn good thing, because a few possible integration techniques are taken off the table by the simple fact that this is a fully fictional language, rather than an artificial-but-real conlang--there is no meaning actually encoded in the Vuvv speech, so there's no way to expect the audience to extract what isn't there!

So, while I am disappointed at the lack of depth, we can take at least two good lessons from this film:
  1. Certain settings and stories lend themselves naturally to specific secondary-language integration techniques, and theoretically you could consciously choose to structure your story to take advantage of a particular technique. (I don't know if that is what happened here, but I would not be surprised--maybe they gave everybody translator boxes specifically to avoid having to do subtitles?)
  2. Stridulation! Man, I'd love to see someone tackle this as a modality for a real alien conlang.


Monday, September 25, 2023

Three Miles Down

Three Miles Down is the latest Alternate History novel from the prolific Harry Turtledove. In 1974, marine biology grad student Jerry Stieglitz is recruited by the CIA to assist with a secret operation to raise a sunken Russian nuclear sub... which turns out to be a cover story for an even more secret operation to raise a crashed alien spacecraft. Why do they need a marine biologist? Because Jerry has been studying whale vocalizations and trying to decode them, making him one of the best-qualified people on the planet to potentially decipher an alien language in a first-contact scenario.

This sounds like the perfect book for me, no? Well, don't get me wrong, it is a good book, and if you want a spy thriller featuring a bunch of classic SF authors, this is the book for you!--but I was left disappointed on the linguistic front. We don't actually come face-to-face with the aliens themselves until the last few pages, and the story wraps up before Jerry ever has to try actually talking to them. This is not Turtledove's first novel involving alien contact, and others (notably, the Worldwar series) do put human-alien interaction more front-and-center, so I'm gonna have to go re-read some of those older ones and see how the language barrier was handled--I hadn't taken a single formal class in linguistics or literary analysis the last time I read through A World of Difference, for example! (But hey, Harry--if you ever feel like writing a sequel, and you need advice on portraying the process of establishing communication in detail, hit me up!)

However, there is some Russian (because they're messing around near a Russian sub, so Russians show up) and some Yiddish (because Jerry is Jewish) which we can look at to see what techniques Turtledove employs for integration. When Jerry gets on the radio with a Russian ship, we are mostly treated to conventional non-diegetic translation of the conversation into English, but there are some Russian words thrown in; for example:

"I read you loud and clear," a Russian voice answered in his headphones.
"Talk slow, pozhaluista. My Russian nye khorosho."

We've got a lot of context clues just in this tiny excerpt to establish the translation convention and the fact that they are actually speaking Russian--a "Russian voice" answers, Jerry is talking about his Russian, and we get a few untranslated words thrown in as well, italicized to set them off as foreign. This is similar to the overall technique that Graham Bradley used in Kill the Beast, where untranslated French words are thrown into the mostly-English representation of the dialog just to periodically remind us that the characters are actually French. Note that Turtledove chose to use a Romanized transliteration of the Russian, so Anglophone readers can have some hope of figuring out what it ought to sound like, rather than putting the orthographic Cyrillic in the text.

The next interesting bit involves some diegetic code-switching, with non-diegetic translation for the sake of the reader (quoting as little as possible to avoid spoilers):

"[...]If you don't want to get dealt in to whatever people can learn from that spaceship, go ahead. Laugh at me. And yob tvoyu mat'."
His Russian TAs and profs had all warned him never to say that: 

At first, I thought Turtledove had chosen to render that in Russian so as not to offend the Anglophone reader, or at least "soften the blow" since foreign insults tend not to have the same emotional effect as those in your native language. But then he goes right ahead and gives the English translation on the next line (which you will note I have cut out from my quotation), and it hit me that exactly the opposite thing is going on: Jerry is explicitly trying to insult the person he is talking to, and knows that speaking in the audience's native language will both provide a greater emotional impact and remind them that, yes, Jerry does understand Russian himself. That's some excellent sociolinguistics right there.

And the next interesting bit subtly provides some insight into Russian culture:

After a while, the man in the outdated black suit looked in and asked, "You would like dinner?"
"Da, Georgi Pavlovich. Bolshoye spasibo," Jerry answered. He still had no idea of the Russian's family name. First name and patronymic were enough for politeness.

Now, my ability to analyze this objectively is a little strained by the fact that I already speak Russian, so what I think was made obvious is confounded by what I already know, and might not perfectly reflect the experience of the naive reader. But I think Turtledove has done a pretty good job here. "Da", as the equivalent of "yes", was introduced in an earlier conversation in the book, so Turtledove is doing just a little bit of Teaching the Reader here, but beyond that: Jerry is clearly addressing the guy who just asked a question, and we can recognize his name, Georgi, in the response. Jerry is choosing to respond in Russian to be polite because Georgi has just demonstrated that his English isn't perfect, by asking a grammatically-ill-formed question. The narrator tells us that Jerry doesn't know Georgi's family name, so we can conclude that "Pavlovich" is not his family name, and infer that it must be the patronymic, and that that is the formula for polite address in Russian. And you can probably make a good guess that "Bolshoye spasibo" is some variation on "thank you" because we know that Jerry is specifically concerned with politeness. Perfect exchange, dense narration, no notes.

There's a little bit more Russian, and I've completely ignored the Yiddish, but that should be sufficient to cover the most linguistically-interesting bits.



Saturday, March 25, 2023

The Sci-Fi Linguistics of The Embedding

The Embedding, by Ian Watson, is... not good. It tied for the Campbell Award in 1974 and won the Nebula Award for Best Novel in 1975, and has been called a "modern classic", but, much like the Hugonauts' review of Dune*, while I recognize that it has some great ideas, I just don't think they're actually executed all that well. Now, young people don't always like old SF, but I've read and liked a lot of old SF, and this one just really doesn't hold up on the strength of writing or the story. My feelings pretty much mirror those expressed in this review from 2006--its understanding of linguistic theory is slightly confused, and the multiple plot threads are poorly integrated and redundant, which is a great disappointment given the novel's stellar reputation. I can only assume it has that reputation because it blew everyone's minds by actually engaging with theoretical linguistics at all back in 1974, and nobody's seriously re-evaluated it since then.  But it is the most linguisticky linguistic fiction ever, and explores ideas that have not been done better since, despite the low bar! So, let's talk about those ideas.

*Go subscribe to the Hugonauts! They deserve more listeners.

The introduction to the Gollancz SF Masterworks edition says that "There are two ideas in linguistics that have had a particular influence on twentieth-century science fiction.": the Sapir-Whorf hypothesis, and Universal Grammar. The Sapir-Whorf hypothesis--the idea that the language you speak can influence or control how you think--is low-hanging fruit, and there's tons of SF, some of which I have previously reviewed, that plays with that.

The core idea of Universal Grammar is that humans come pre-wired with an understanding of how language should work; that our brains are built with a standard template for grammar with a multitude of switches that different languages merely set in different ways. This idea originates with Noam Chomsky, who famously claimed that Martians studying Earth would conclude that we all spoke mere dialects of one humanese--although quite a lot of advancement has been made in linguistics since that time, and Chomsky himself no longer holds that extreme view. If you do run with that extreme view, however, it lends itself to the trope of Incomprehensible Aliens--if we are hardwired to Do Language in a particular way, and they are hardwired to Do Language in a different way, then presumably we could never learn to understand each other's languages and communication would be forever impossible.

The idea of built-in, innate grammar arises from the "Poverty of the Stimulus" argument, which basically goes that human children aren't exposed to a large enough sample of language to learn how it works from first principles in the time it actually takes for children to acquire their first language. Any language whose rules didn't conform to whatever that built-in template is then could not be learned by children. This, of course, depends on the assumption that the stimulus is actually impoverished--that there really isn't enough information in the ambient linguistic data to which children are exposed during their lives to calculate the correct rules of a human language--and that assumption is not without controversy.

It is clear that humans must have some innate capacity for language--after all, something makes the difference between a human baby who learns to understand and speak English in only a few years, versus, say, a kitten who grows up in the same house and maybe learns to recognize a few individual words. Whatever that is is called "the biological endowment", and the idea that that could vary between linguistically-capable species is unexplored here, or in any other published story that I am aware of. But exactly how extensive our innate knowledge specific to language is, is still an active area of research, and the idea of an extensive Universal human Grammar can be attacked from two directions:

  1. Showing that, for some particular linguistic feature, the stimulus is not actually impoverished--that children are exposed to enough of the right kinds of examples to just "figure it out".
  2. Showing that we have some innate cognitive biases relevant to linguistic learning, but that they are not specific to linguistic learning--thus, other linguistically-capable species may well exhibit exactly the same linguistic biases, because of developing the same general reasoning capabilities.

Somewhat confusingly, some people use "Universal Grammar" to refer to any innate knowledge relevant to language, not just that which is specific to human language, and only evidence of type 1 is relevant to disproving that kind of Universal Grammar. But given the particular feature that Ian Watson chose to focus on (the eponymous "embedding"), and how interactions with aliens are portrayed in the book (they are fascinated by exactly the same constraint), I have to assume that that was the understanding that Watson had of the term "Universal Grammar"--that it was not merely universal to humans, but cosmologically "universal", based on principles that would be reliably replicated in any intelligent mind.

In some ways, Watson's choice of feature to focus on is a clever one; center embedding is an easy concept to explain to readers who are otherwise lacking in theoretical linguistic education, and he does just that in conversations between characters. For those who do not wish to go read the novel looking for the definition, center embedding is just taking a particular grammatical structure--like a relative clause--and sticking it in the middle of another structure of the same type, rather than at one end or the other. For example, take the sentence "This is the malt that the rat ate."--it's got a relative clause in it. We can self-embed another relative clause at the edge like this: "This is the malt that was eaten by the rat that was worried by the cat." Or, we can center-embed that relative clause--stick it in the middle of the first relative clause, breaking that up--like this: "This is the malt that the rat that the cat worried ate." That's harder to understand, but it's the sort of thing that might be said from time to time. But what if we add a third clause? "This is the malt that the rat that the cat that the dog chased worried ate." That's... really hard to interpret, and people just don't speak that way! And if you add one more level... no, that'll never happen!
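If it helps to see the mechanics, here is a minimal Python sketch of the two ways of stacking relative clauses, using the malt sentences above. The function names and the `CLAUSES` list are made up for this illustration (they are not from the novel or from any linguistics library), and the sketch only handles this one sentence frame.

```python
# Toy generator for the "malt" sentences. All names here are hypothetical,
# chosen for illustration only.

# Each clause is (subject, active verb, passive participle).
CLAUSES = [
    ("rat", "ate", "eaten"),
    ("cat", "worried", "worried"),
    ("dog", "chased", "chased"),
]


def edge_embedded(head, clauses):
    """Attach each new relative clause at the right edge of the previous one."""
    tail = "".join(f" that was {passive} by the {subj}" for subj, _, passive in clauses)
    return f"This is the {head}{tail}."


def center_embedded(head, clauses):
    """Put each new relative clause in the middle of the previous one.

    All the subjects come first, and the verbs then arrive in reverse order.
    """
    subjects = "".join(f" that the {subj}" for subj, _, _ in clauses)
    verbs = " ".join(active for _, active, _ in reversed(clauses))
    return f"This is the {head}{subjects} {verbs}."


if __name__ == "__main__":
    for depth in range(1, len(CLAUSES) + 1):
        print(edge_embedded("malt", CLAUSES[:depth]))
        print(center_embedded("malt", CLAUSES[:depth]))
```

Running it prints each depth in both styles. The edge-embedded versions stay readable as they grow, while the center-embedded ones pile up all the subjects before delivering any of the verbs.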

However, despite being a clever choice of linguistic phenomenon, it's not actually a test of Universal Grammar, as Chomsky intended the term! In fact, this gets at a completely different bit of Chomskyan linguistics, which Watson completely ignores: the distinction between competence and performance. "Competence" is what you know about the rules of language, and your ability to judge things as grammatical or ungrammatical. It is competence that allows us to say, yeah, mechanically, we could add a fourth embedded clause to that horrible incomprehensible sentence, and it wouldn't violate any grammatical rules. We are capable of learning the rules that would let us do that. Competence is what lets us look at the famous sentence "Colorless green ideas sleep furiously," and say "yeah, it's grammatical, but...." Meanwhile, performance is the fact that we sometimes make mistakes that we know are mistakes, and that we can rate things as acceptable or unacceptable, because they do or don't make sense or because they are easy or hard to interpret, independent of whether or not they are grammatical. The limitations on center embedding in English aren't grammatical, and this tells us nothing about the rules of Universal Grammar--they are just a consequence of the fact that humans have limited short-term working memory, so we lose track of the first halves of multiply-embedded structures before we get to the end! And in fact, you can prove that there is no hard grammatical limit on embedding depth by observing that equally-embedded structures can be more or less acceptable depending on which precise nouns, pronouns, and adjectives you happen to use; compare, for example: "The rat which the cat which the dog chased bit fell." vs. "The elegant woman whom the man that I love met moved to Barcelona."
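As a rough illustration of the working-memory point, here is a toy Python sketch that counts how many subjects a listener is still holding onto, waiting for their verbs, at the worst moment in a sentence. The "S"/"V" tagging and the `max_pending` function are assumptions invented for this example; this is a deliberately crude counter, not a real parser or a model of human sentence processing.

```python
# Toy "open dependency" counter. The tagging scheme is hypothetical:
# "S" = a subject that still needs its verb, "V" = a verb that resolves
# the most recent unresolved subject.

def max_pending(tags):
    """Return the largest number of subjects waiting for a verb at any point."""
    pending = 0
    worst = 0
    for tag in tags:
        if tag == "S":
            pending += 1
            worst = max(worst, pending)
        elif tag == "V":
            pending -= 1
    return worst


# "...the rat that the cat that the dog chased worried ate"
center = ["S", "S", "S", "V", "V", "V"]

# "the dog chased the cat that worried the rat that ate the malt"
edge = ["S", "V", "S", "V", "S", "V"]

if __name__ == "__main__":
    print("center-embedded:", max_pending(center))
    print("edge-embedded:", max_pending(edge))
```

Both sequences contain the same three clauses, but in the center-embedded order all three subjects are pending at once, while in the edge-embedded order there is never more than one. Nothing in the counter forbids a fourth level; it just makes the load bigger, which is the performance side of the competence/performance split.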

Each of the three major plot threads in the novel has its own mini-linguistic-ideas as well. The aliens engage with the Sapir-Whorf hypothesis in thinking that learning more languages--and specifically, learning a heavily-center-embedding language--will allow them to achieve new metaphysical abilities. The Amazonian natives use drugs to expand their linguistic competence, which could be interpreted as a precursor to the drugs used by Sheila Finch's Guild of Xenolinguists. And the opening thread of the book deals with conducting The Forbidden Experiment--isolating children from natural adult language to see what happens, or what you can make happen, in order to explore the boundaries of the biological endowment and of any Universal Grammar that we might have. As you can see from that Wikipedia link, The Embedding was not the first to explore this idea--it shows up in books, comics, and even The Twilight Zone. And if you believe in the strong Sapir-Whorf hypothesis, or linguistic determinism, it can be quite a compelling idea--raising children without exposure to any existing language would release them from the limitations of those languages, would it not? But... no. That's not actually what happens at all. It is very easy when thinking about problems in language acquisition and Universal Grammar to start thinking, "ugh, if only we could run controlled experiments on the acquisition process, we could answer so many questions so much more easily!" In fact, while working on this article, completely by accident, I came across this Tweet:

Which is a serious exaggeration, and if you click through to read the ensuing thread and quote-tweets, it's one which many people disagree with and have very strong feelings about, for good reason! But it is an exaggeration of something real. Let's be clear: when we say "intrusive thoughts", we mean "intrusive thoughts"; a lot of linguists really want to know what's going on in kids' heads when they acquire language, and Lingthusiasm even sells baby onesies with "Daddy's Little Longitudinal Language Acquisition Project" on them (which I have proudly clad all three of my children in!) but nobody is going around thinking "man, I would totally raise some kids in linguistic isolation for 10 years if it weren't for that pesky Ethics Review Board!" (Or at least, we all hope nobody is thinking that!) It is the sort of thing that you put in a depressing dystopian SF novel (which The Embedding most definitely is! No one makes good decisions, and the ending is typically Cold-War-Era depressing)--or, if you think of it "for real", you immediately feel bad about it and move on to trying to find practical methods of getting the data you want, or to working on something else. Unfortunately, we actually do have some data on linguistic deprivation, from studies of rescued feral children and deaf children of hearing adults who do not speak a sign language, and the effects of these "natural experiments" are dire, and a source of ongoing trauma to the Deaf community. So, no, language deprivation does not give you special insight or psychic powers--it just gives you brain damage.

The fact that the main character of The Embedding actually performed a deprivation experiment thus clearly marks him as a villain, and that's only the first of the many unethical things that are done for the sake of "science" and "progress" in this book. And what's more, the particular experiments Watson describes aren't actually testing Universal Grammar (the embedding experiment, realistically, is just training short-term working memory), which removes even the scientific justification! Every character is just straight-up unlikeable. So please, if you are an author--go write something that engages with theoretical linguistics as deeply as The Embedding does, but is more fun to read!

An additional note: The novel claims that "Stone Age children" took "hundreds of generations" to develop language; that's seriously misleading, based on what we know from some natural experiments. We have no idea exactly how long it took our biological endowment for language to evolve, but it seems that biologically-modern human children, in an appropriate social context, will spontaneously generate languages within one generation. This is evidenced by the development of pidgins into creole languages, and the spontaneous generation of new sign languages when new deaf communities are established--see, for example, the case of Nicaraguan Sign Language.


If you liked this post, please consider making a small donation!


Wednesday, March 15, 2023

The Sci-Fi Linguistics of Babel-17

Despite being far from the only, or even the best, novel, novella, or short story about the Sapir-Whorf hypothesis, Samuel R. Delany's Babel-17 (Amazon Affiliate link as usual) is famous as "that novel about the strong Sapir-Whorf hypothesis". But there is a bit more to it than that.

The opening of the story is highly reminiscent of the much later Story of Your Life by Ted Chiang: a language expert who has previously done work for the military is recruited by a general to decipher some alien communication and tells him that it's impossible without more data:

"Unknown languages have been deciphered without translations, Linear B and Hittite for example. But if I'm going to get further with Babel-17, I'll have to know a great deal more. [...] General, I have to know everything you know about Babel-17; where you got it, when, under what circumstances, anything that might give me a clue to the subject matter. [...] You gave me ten pages of double-spaced typewritten garble with the code name Babel-17 and asked me what it meant. With just that, I can't tell you. With more, I might. It's that simple."

In fact, Rydra Wong is in a much worse position with Babel-17 than historical linguists were with Hittite and Linear B--in each of those cases, although we lacked an equivalent to the Egyptian Rosetta Stone, at least we had the context of history and knowledge of other possible related languages to provide some direction in decoding their texts. Or at least, she would be... if she weren't psychic. Delany neatly sidesteps the entire problem of actually deciphering and learning the language (excusable, because that's not actually the point of the story) by giving Rydra Wong supernatural powers to extract meaning that just isn't actually there. Some biotechnobabble explanation is given for how her ability to read minds works, but it fails to extend to the fact that she is said to have a history of being able to look at an unbroken code and suddenly intuit what it was meant to say--an ability which she also employs to start cracking Babel-17, and which kind of undercuts the otherwise entirely reasonable claim that she needs more data to actually decipher it! She might as well be a D&D character casting Comprehend Languages.

Once Rydra learns Babel-17, we get only minimal descriptions of how it actually works as a language. It appears to be a sort of oligosynthetic speedtalk and taxonomic language, in which the form of every word encodes its definition, a feature which supposedly promotes clearer thinking and deep understanding of everything in the world that it can name. Additionally, it has no word for "I", which is supposed to imply that thinking in Babel-17 prevents someone from acting with self-awareness, with the explanation that

"Butcher, there are certain ideas which have words for them. If you don't know the words, you can't know the ideas."

Which is, well... crap. After all, we coin new words after conceiving of the new ideas we need words for, so clearly having the words for ideas is not necessary to having the ideas; rarely do we coin new words and then go looking for novel ideas to attach to them! When Rydra talks about language throughout the rest of the book, it's a mixture of reasonable stuff and linguistic technobabble. For example, as a weaker form of the previous statement, Rydra also explains that

"If you have the right words, it saves a lot of time and makes things easier."

which is absolutely true! That's why technical jargons exist. But this gets taken to a ridiculous science-fiction extreme in the description of another alien language: Supposedly, Çiribians can describe the complete schematics of an industrial facility with novel features that they want to duplicate in nine short words, which is... implausible, to say the least. Why would anyone have pre-existing short words to describe previously-unknown technological innovations developed by other aliens?

Then, we have this:

"Mocky, when you learn another tongue, you learn the way another people see the world, the universe."

Also very true! This is one of the many arguments for why documenting and trying to save dying languages is such important work--every time a language dies, the worldview communicated through that language, and the cultural knowledge encoded in that language, dies with it. But then...

"Well, most textbooks say language is a mechanism for expressing thought, Mocky. But language is thought. Thought is information given form. The form is language."

I was a little surprised that Delany-via-Rydra would even provide the hedge of "most textbooks say..." there, because for a long time real-world textbooks would've agreed with Rydra, and this is a commonly-assumed position among linguistically-naïve people. The fact is, many people do experience their own thoughts in the form of language, and are shocked and disbelieving when they discover that not everyone else shares this experience! Yet, such people do exist, despite the existence of a good bit of 20th century academic literature claiming that they can't possibly--literature which I spent a mid-term paper in my grad school Intro to Semantics class tearing to shreds. So, there is a certain type of person who would've read that line in the book, and just like me, immediately thought "Bull! Crap! Rydra!"--but if you are not that sort of person, just take it from me that language is not identical with thought.

But let's get to the actual point: that learning Babel-17 turns a person into an agent of the enemy. There is actually a teeny-tiny kernel of truth underlying this conceit: multilingual people do often tend to develop different personalities when using different languages. This is a multilayered effect--partially, it can probably be attributed to the fact that different languages require that you pay attention to different things. Thus, Russian speakers are, on average, better at distinguishing shades of blue than English speakers, and Guugu Yimithirr speakers are better at absolute orientation than English speakers, because the vocabulary choices and grammatical categories required by their languages require them to pay more attention to those things, and thus develop the skills; and it's not too hard to imagine that shifting aspects of your attention when shifting between languages could have some impact on personality. But a much, much larger component of the effect is simply an extension of the fact that we all have varying presentations of ourselves in different social groups, and languages are strongly associated with the social groups among whom we learned them and with whom we used them, and with the purposes we have in communicating with those groups. Rydra did not learn Babel-17 from "native speakers", in the presence of the enemy, so in reality, there is no particular reason to believe that just learning the language from decoding intercepted communications would have had anywhere near such a drastic effect on thought processes or personality.

So: neat idea, definitely science fiction. However, we can draw a parallel with a slightly more plausible idea from Neal Stephenson's novel Snow Crash. In both works, language is used as an attack vector to allow an enemy to take control of other people's actions. In Snow Crash, there is a language which acts like a programming language to insert instructions into people's brains; in Babel-17, the language itself is the program. (How is Snow Crash's take on this concept more realistic than Babel-17's? Well, you'll just have to wait for me to get around to reviewing Snow Crash to find that out.)

There is, however, another sci-fi linguistic idea in Babel-17 which is completely overlooked in most discussions of the novel: communication from "discorporate" people can't be remembered. Babel-17 is a wild ride through a psychedelic future with all kinds of ridiculous world-building details thrown in that have no direct bearing on the core premise of intergalactic war and Whorfian linguistic weapons, and one of those is the existence of ghosts, and in fact the requirement that some positions on a starship crew be filled by literal ghosts--or, as they are called in the novel, "discorporate people". The integration of discorporate people into the crew is complicated by the fact that living humans cannot remember anything said by a ghost for more than a few seconds, so special machinery is necessary to allow communication between the living and dead crew members--which is really kind of a neat concept all by itself, and I'd love to see that explored as the basis of a story on its own. (Not necessarily communication with ghosts, but just the idea that there is some class of people whose words cannot be remembered. Cf. the Silence from Doctor Who, but in that case nothing about the person can be remembered once you stop perceiving them, not merely their words.) Rydra ends up using her multilingualism to derive an advantage in this regard--while she can't remember the actual words spoken by a ghost, she gets around this by translating ghosts' speech into another language in her head as they are talking. And while she forgets the original words, she can remember the process of translation, and what she translated them into, and thus recall the content of the conversation without the need for assistive machinery.




Tuesday, March 14, 2023

Linguistics & Andy Weir

If you have read one book by Andy Weir, it's probably The Martian (also available in a classroom edition, with less swearing!); or perhaps you have seen the movie, starring Matt Damon!

Unfortunately, there isn't much interesting going on with linguistics or language representation in The Martian. However, Andy Weir has published two other space-adventure hard-SF books: Artemis, set on the Moon, and Project Hail Mary, which goes interstellar. And they actually do some neat stuff which isn't covered by previous books I've reviewed!

The protagonist of Artemis is bilingual in English and Arabic. For the most part, this is just an interesting bit of character-and-world-building background, which ties in to her national origin (not white or American), which in turn ties in to the economy of the titular city of Artemis. The vast majority of the time, she speaks English, and there are a couple of brief bits of dialog that are italicized to indicate, aha, this is not English, she's speaking Arabic now. But, there is one absolutely brilliant line of transliterated, but not translated, Arabic dialog, which occurs when Our Heroine is being bothered by a tourist:

"Ma'alesh, ana ma'aref Englizy," I said with a shrug. [...] Nothing like a language barrier to make people leave you alone.

I do not speak Arabic, but I would bet that means something like "Sorry, I don't speak English."

[Goes to check Google Translate.]

Ah, apparently it means "I suck at English." Close enough! As far as I could tell, that is the only place in which bilingualism actually impacts the plot, and it could easily have been left out, but that is totally a thing that a bilingual person might do, in a very relatable situation! It's like a context clue, but relying on your understanding of the social context being described, rather than the literal context.

Project Hail Mary has a much higher count of interesting linguistic bits, but I can't tell you about them without some spoilers. So, if that's a thing you care about, click that Amazon Affiliate link, buy the book, read it, and then come back here and give me another page view!

Are we good now? Good!

The main character, Ryland Grace, starts out monolingual in English. However, he has to interact with speakers of Russian and Chinese, and an alien, along with text in three of their languages. The only Chinese which is directly represented is the name of the Hail Mary mission commander, Yáo Li-Jie, whose family name "Yáo" is represented as a written character and transliterated in Ryland's dialog and narration. The representation of Russian, on the other hand, is not completely consistent, but spans several representational levels in different circumstances:

Level 0: People are said to be speaking in Russian, but Grace doesn't understand it, so we get no explicit representation.

Level 1: Grace can hear people speaking in Russian, and recognize the sounds, so we get a transliteration of the Russian speech into Latin characters. E.g.:

"Eto Stratt. Chto sluchylos?" she demanded.
"Vzryv v issledovatel'skom tsentre," came the reply.
"The research center blew up," she said.

(Also note the partial diegetic translation, with context that allows the non-Russophone reader to infer what the initial question probably was.)

Level 2: When Grace sees Russian text, that text is represented as-is, in the original orthography, regardless of whether or not Ryland can understand it. E.g.:

The name patch reads ИЛЮХИНА, another name from the crest. This was Ilyukhina's uniform.

In this case, Grace does understand it, because he recognizes his crewmate's name, even if he doesn't speak Russian, and we get a diegetic transliteration. The same thing is done with the character for Yáo's name. And we know that he never actually learned Russian because of another instance of direct orthographic representation:

Five 1-liter bags of clear liquid labelled водка. It's Russian for "vodka". How do I know that? Because I spent months on an aircraft carrier with a bunch of crazy Russian scientists. I saw that word a lot.

Not because he learned to actually read Russian--because he saw that word a lot.

There is one example of orthographic representation of Russian in a Russian person's dialog--just a single word--which is where the inconsistency comes in. Ryland wouldn't have understood it (well, maybe he would, just because it sounds really similar to the English word in this case) or known how to write it, so it should've been transliterated for consistency. Unless Andy Weir was just trying to do some fancy thing beyond my understanding with that.

Anyway, the really cool stuff happens once Grace meets an alien, whom he names "Rocky". Rocky is from 40 Eridani A, lives under 28 atmospheres of pressure at over 200 degrees, and "sees" with passive sonar--very reminiscent of the Hot Abyormenites from Hal Clement's Cycle of Fire! (Although the precise mechanism of sound perception and processing between those species is quite different; in that respect, the Eridians remind me more of the Tenebrans from another Hal Clement novel, Close to Critical.) And...

"Fortunately, Rocky speaks with musical chords."

Like the Machi do, or the aliens from The Jupiter Theft by Donald Moffitt. (Huh. Maybe I should review that book some time....) And yeah, that is pretty dang fortunate, because it makes the alien language ridiculously easy to analyze, and to synthesize. I kinda have to assume that that's exactly why Andy Weir decided to design the Eridians that way--Ryland Grace is not a linguist, and while Weir does a remarkably good job of not sweeping first-contact language barriers under the rug, he's made several decisions about how Eridians and their language work that allow skipping a lot of the potential complexity. Donald Moffitt had a slightly different motivation in his work--giving the aliens a musical language allowed him to make it important to the plot that his main character had perfect pitch, which not all humans do, which made that main character specially suited to learn the alien language and, well... be the main character! But, in another parallel between these two works, by the end of the book Weir has Grace using a keyboard to "speak" to Eridians in their native language.

Grace is not stated to have perfect pitch, but he does rely on Rocky speaking in a consistent scale, particularly to have his computer (which does have perfect pitch!) automatically recognize Eridian words. That's not completely unreasonable, but I am quite glad that Weir did not explicitly state that the Eridian language was actually tied to an absolute pitch scale, because, as briefly mentioned in my review of the Machi languages, there are good reasons to think that any naturally-evolved audio communication system for biological beings could not be based on an absolute scale. Additionally, unless I missed something, the simplest syllables that are actually described in the text from Rocky's speech consist of chords of at least two notes, so identifying phonemes by frequency ratios with no fixed scale is a possibility. Unfortunately, we are told two unlikely-seeming things about the nature of Rocky's speech:

  1. Some Eridian words use chords consisting of notes that can be described in terms of named notes on the Western musical scale. That particular pattern of frequencies (or rather, family of patterns of frequencies, depending on which tuning system you use) for making up a scale is not even universal among human cultures, and certainly has no relation to the use of pitch in any human language with phonemic tone or any whistling language, so it kind of defies belief that an alien species would develop a tone-chord phonology that lined up with the modern Western musical scale. I choose to retcon this by saying that Ryland Grace just picked notes that were close enough to the frequency values spit out by his waveform analysis to make things easier to write down.
  2. Rocky is described as transposing his speech by an octave to indicate certain emotional states. It's important that the transposition is exactly one octave, because that makes it easy for Grace to figure out what's going on and fix it when his computer stops recognizing all of Rocky's words. Now, the octave is a very mathematically natural interval... but the idea of octave equivalence isn't actually natural even for humans; it has to be learned, and its importance as a musical concept is also not universal in human cultures. So... why would an alien species develop octave equivalence as a key feature of their natural language?
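
To make the "frequency ratios with no fixed scale" idea a bit more concrete, here is a minimal Python sketch of how a word-recognizer could compare chords by the intervals between their notes rather than by absolute pitch, and how it could notice a whole-octave transposition. Everything here is my own assumption for illustration: the function names, the example frequencies, and the 20-cent tolerance are made up, and nothing in the book describes Grace's software at this level of detail.

```python
import math

def intervals_in_cents(freqs_hz):
    """Intervals of each note above the lowest note, in cents (1200 per octave)."""
    lowest = min(freqs_hz)
    return [1200 * math.log2(f / lowest) for f in sorted(freqs_hz)]

def same_chord(a, b, tolerance_cents=20.0):
    """True if two chords have the same internal ratios, whatever their absolute pitch."""
    ia, ib = intervals_in_cents(a), intervals_in_cents(b)
    return len(ia) == len(ib) and all(
        abs(x - y) <= tolerance_cents for x, y in zip(ia, ib)
    )

def octave_shift(a, b, tolerance_cents=20.0):
    """Whole number of octaves by which chord b sits above chord a, else None."""
    if not same_chord(a, b, tolerance_cents):
        return None
    shift_cents = 1200 * math.log2(min(b) / min(a))
    octaves = round(shift_cents / 1200)
    if abs(shift_cents - 1200 * octaves) > tolerance_cents:
        return None  # same shape, but transposed by something other than octaves
    return octaves

# Hypothetical two-note "syllable" (a 3:2 ratio), and the same syllable doubled in pitch.
calm = [220.0, 330.0]
excited = [440.0, 660.0]
print(same_chord(calm, excited))    # the ratios match
print(octave_shift(calm, excited))  # transposed up by one octave
```

A recognizer built on absolute frequencies would fail on the transposed version, which matches the failure Grace runs into; one built on ratios would not care. Note that the octave check is the only part that privileges the 2:1 ratio, which is exactly the assumption questioned above.
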
A lot of the complication of learning an alien language is avoided by making Rocky (a non-viewpoint character) take on most of the load, rather than Grace. Rocky (if not Eridians in general) apparently has an eidetic memory for sounds, including human speech sounds, and can pick up Grace's English words for things on a single exposure. I have to wonder what implications this might have for the childhood Eridian language acquisition process, and how language works for them in general. The immediate implication, however, is that they quickly get to a point where Grace can just speak English and have Rocky understand him, while Rocky adopts a sort of Eridian-English pidgin in which he speaks Eridian words (not being able to articulate the human speech sounds of English) slotted into an English-like grammar. This has the convenient side-effect of meaning that Weir didn't have to actually construct any Eridian grammar! Although, it does appear that Rocky's native language lacks a distinction between nominative and possessive personal pronouns, based on the fact that his italicized dialog never features possessive pronouns.

This kind of "receptive multilingualism", in which each person speaks their own language while understanding the other, is not a new thing, although I believe this is the first work I have reviewed that uses it. It's notably quite common in Star Wars, where it is used for exactly the same purpose: to portray communication between species who can't pronounce each other's languages, most famously when Han Solo is conversing with Chewbacca, or anyone at all is talking with a beeping R2-series droid. However, receptive multilingualism is also a thing in real life, where it does not occur because of differences in physical articulatory abilities (which are the same for nearly all humans), but either as a side-effect of the simple fact that learning to understand a new language is far easier than learning to speak it, or due to cultural restrictions on who is permitted to use various languages.

While the diegetic purposes are the same, however, the presentation to the audience of receptive multilingualism in Star Wars vs. Project Hail Mary is quite different. In Star Wars, multilingual conversations without a translator are always structured such that the half of the conversation which the audience has access to is enough to infer all of the necessary information from the scene. Weir, however, uses a two-layered approach similar to what he does with Russian and Chinese: any Eridian speech that Grace does not understand is presented as a string of Unicode musical note symbols (e.g., ♪ and ♫)--a conceit which I have seen only once before, in Lorinda J. Taylor's The Termite Queen. There are no appropriate Unicode symbols for chords or staffs, so we have to assume that the actually chosen symbols do not represent anything salient about the actual phonetic content of Rocky's speech, except maybe the total number of chords/syllables, or the relative utterance length. Meanwhile, when Grace understands something that Rocky has said, it is presented as an English translation in italics.

As briefly implied above, during their initial interactions Grace uses a computer to record Rocky's utterances and recognize known utterances later to help him understand what Rocky is saying before he learns to recognize Eridian words himself. Additionally, he uses audio waveform analysis software to extract the component frequencies of each utterance. Computer assistance would almost certainly be essential in documenting and decoding any alien language we might come across, but it's too bad that Grace was not trained as a linguist, or he might have known about all of the software tools that exist for analyzing and documenting human language already, and pulled out Praat for doing spectral analysis of Rocky's speech--it would not be the first time Praat had been used to analyze non-human utterances! (A note on worldbuilding: the starship Hail Mary is supposed to have been loaded with every piece of software available to humanity at launch, just in case, so Praat would definitely have been in there.)
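For the curious, here is a rough Python sketch of what "extracting the component frequencies" of a chord amounts to. It synthesizes a two-note chord and then picks the strongest bins out of a naive discrete Fourier transform. This is purely illustrative: the function names, sample rate, and test frequencies are my own assumptions, and real tools (Praat included) use much faster and more careful methods than this brute-force loop.

```python
import cmath
import math

def synth_chord(freqs_hz, sample_rate=1000, duration_s=0.2):
    """Sum of equal-amplitude sine waves, as a list of samples (a stand-in for a recording)."""
    n = int(sample_rate * duration_s)
    return [
        sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs_hz)
        for t in range(n)
    ]

def component_frequencies(samples, sample_rate, count):
    """Return the `count` strongest frequencies found by a naive O(n^2) DFT."""
    n = len(samples)
    magnitudes = []
    for k in range(1, n // 2):  # skip the DC bin, stop below the Nyquist frequency
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n) for t, s in enumerate(samples)
        )
        magnitudes.append((abs(coeff), k * sample_rate / n))
    strongest = sorted(magnitudes, reverse=True)[:count]
    return sorted(freq for _, freq in strongest)

# A made-up two-note "syllable"; both notes fall exactly on DFT bins (5 Hz apart here).
samples = synth_chord([220.0, 330.0])
print(component_frequencies(samples, sample_rate=1000, count=2))
```

The example is rigged so that both notes land exactly on analysis bins. With real recordings they would not, and you would need windowing and peak interpolation, which is one more reason to reach for existing phonetics software rather than rolling your own.
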

There is one instance in which Weir-via-Grace makes an explicit claim about linguistics:

The oldest words in a language are usually the shortest.

Which is... sketchy. Depending on how exactly you interpret it, it might not be false, but it's not particularly useful. For example, old words tend to be common words, and common words tend to be short... but not all common words are old, and not all old words are common. And this topic comes up when Grace is learning Rocky's words for numbers, which brings up the further question of why Grace assumes that numbers would necessarily be old words. However, this statement has absolutely no relevance to the story. Charitably, perhaps it is meant to show that Grace has no linguistic training, and only a folk understanding of linguistic science? But what really comes across is that the author didn't really know what he was talking about, and the book would've been better with that one sentence just cut out.

This does give us a nice segue to talking about Eridian numbers, though. For the most part, the problem of translating between numeric and unit systems, just like the problem of learning a new language, is offloaded to the non-viewpoint character, who is not merely a linguistic savant but also a mathematical savant, able to do unit-of-measure and numeric base conversions instantaneously in his head (er... cephalothorax?). Grace does, however, learn Eridian numbers to decode Eridian clocks, and works out pretty quickly that they have a base-six numeral system. The choice of how to represent Eridian numerals in the text is kind of interesting--much like using musical note symbols to represent Eridian speech (or at least, that Eridian speech is happening), Weir makes use of existing Unicode symbols that are not typically used in English text and which approximate the diegetic forms of the Eridian symbols to show Eridian numerals in the text. That's the closest we come to any representation of Eridian writing, and cleverly avoids needing to include any pictures in the text (aside from the diagram of the ship provided in the front of the book). Now, Rocky has 5 limbs and 15 fingers, so why would the Eridians have a base-6 system? Well, while all of Rocky's limbs are functionally interchangeable, balancing on two legs for a natural tetrapod would be unnecessarily tricky--but an Eridian could stand, and possibly walk, on any three limbs at a time, leaving two free to use as arms, with a total of 6 fingers between them. Thus, developing a base-6 numeral system based on counting the six fingers of two Eridian hands would be directly analogous to humans developing base-10 numeral systems based on counting the 10 fingers of two of our hands. Note that the actual logic behind Eridian numerals is not addressed in the story, but this seems like a reasonable reverse-engineering of the author's probable intent. 
If Project Hail Mary had instead been written by a human who natively spoke a minority language of Papua New Guinea with a base-27 body-counting system, perhaps the Eridian numeral system would be slightly more opaque.
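
If you want to play with base-six numbers yourself, the conversion Rocky does in his head is only a few lines of Python. This is a minimal sketch under my own assumptions: the function names are invented, and I use the ordinary digits 0 through 5 as stand-ins, not the symbols Weir uses in the book.

```python
def to_base_six(n):
    """Write a non-negative integer as a string of base-six digits (0-5)."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 6)  # peel off the lowest base-six digit
        digits.append(str(remainder))
    return "".join(reversed(digits))

def from_base_six(text):
    """Read a string of base-six digits back into an ordinary integer."""
    return int(text, 6)  # Python's int() accepts any base from 2 to 36

print(to_base_six(15))       # Rocky's 15 fingers: two sixes and three
print(to_base_six(36))       # six sixes, the base-six equivalent of "one hundred"
print(from_base_six("100"))  # and back again
```

So Rocky's 15 fingers come out as "23" (two full hands-worth of six, plus three), and 36 plays the role that 100 plays for us.
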
