Monday, January 17, 2022

Secondary Languages in _Time_ and _Heterogenia Linguistico_

Heterogenia Linguistico is a manga series about a field linguist / ethnologist exploring a fantasy realm and documenting the languages and cultural practices of fantastical races.

Time is a Hugo-award-winning long-form webcomic / animation hybrid thing, published as the 1190th strip of XKCD in 2013, about far-future humans occupying a dried-out Mediterranean basin who discover that the ocean is about to flood back through the Strait of Gibraltar and destroy their home.

What could these two bits of media possibly have in common? They both integrate secondary languages using techniques uniquely suited to the comic / graphic novel medium.

Additionally, both are ambiguous in their possible usage of a narrative translation convention! It is possible, though unlikely due to the large time span for linguistic evolution to take place in, that the main characters in Time actually do speak English. Meanwhile, though I read Heterogenia Linguistico in English translation (which obviously establishes a translation convention by virtue of the fact that it was literally translated, in the real world), the original is in Japanese, and seems to originate in a fantastical analog of Japan--so the human characters might very well be intended to actually be speaking Japanese. These things are not always clear-cut! However, Heterogenia Linguistico does display an explicit translation convention insofar as all speech which the main character understands is presented on the page as English (Japanese)--thus ensuring that the reader remains on the same metaphorical page as the viewpoint character.

Time features one full conlang (first appearing in frame 2658, externally labelled "Beanish" since its speakers appear to wear beanies), and one fictitious language (first appearing in frame 2865, externally labelled "Unglish" since it's... not actually English).

Beanish is presented in its own unique script. This has the effect of making it obvious to the reader that they, just like the main characters, are not expected (and thus not required) to understand what is being said. In fact, it short-circuits any attempt at understanding, as the invented script eliminates any possible phonetic cues that might prompt a reader to try... well, reading it! While theoretically this could be done in purely-written media (and written media like War and Peace will on occasion include examples of secondary natural languages in their native scripts, even when those scripts differ from the primary script of the work), it is considerably more difficult to do, both for technical reasons (the need to create custom fonts or embed images) and for audience-compatibility reasons; encountering an unreadable script switches one's brain from "visual language processing mode" into "generic image processing mode", and/or triggers skipping over that span of incomprehensible text to the next bit that you can recognize, and that kind of cognitive interruption is more unexpected and more jarring in running prose than in the context of a comic panel, where you are already primed to take in narrative-relevant information from the whole image rather than just text, and where the eye is already practiced at skipping between non-contiguous dialog sections.

In frame 2703, Cueball makes an attempt at speaking Beanish, and subsequently is excited that he has finally learned a word, giving a diegetic translation of it; however, in subsequent frames, we discover that he is aware of the "gavagai problem" (which previously came up in John Carter (of Mars)), and is unsure of the precise meaning of the word after all. Still, this is probably the best entry point we have into the decryption of Beanish--which is, as of yet, still incomplete. Despite the lack of a decryption, however, we can be confident that Beanish really is a consistent conlang, rather than simple visual gibberish, for two reasons: internally, it has regular repeating structure that looks language-like (although, so does the Voynich Manuscript, and plenty of people are convinced that that is just very cleverly-constructed nonsense); and externally, Randall Munroe has stated that he got a linguist to create it for him! Thus, rather than being purely a matter of Making It Irrelevant, the use of Beanish in Time seems like a very long-term example of Easter Egging, having presented Beanish as a puzzle to be solved.

Unlike Beanish, Unglish is partially comprehensible, with some difficulty, to Time's main characters. To convey that same experience to the readers, Unglish is presented as distorted English text, with smudging, odd grammar, and overlaid words. This is kind of the opposite of Making It Obvious--it is Made Unobvious, but accessible with difficulty. Again, this is an approach that simply could not be done with anything like comparable effectiveness in a different medium.

All dialog from Time, including images of the Beanish and Unglish text, with links to the source frames, can be found transcribed here.

In comparison, Heterogenia Linguistico, despite having more dialog and being explicitly about linguistics, shows much less sophistication in its presentation of secondary languages. As noted earlier, it simply translates everything that the main character can understand; but what about things that the main character can't understand? Unlike Time, it neither presents a textual representation of non-human languages (except for a few personal names, which are approximately-phonetically transcribed), nor does it visually distort words to impede easy comprehension. Instead, partially-understood speech is peppered with black boxes--kind of like redaction bars--replacing words, morphemes, or just weirdly-pronounced sounds. Much like the unique script of Beanish, the use of obviously-non-linguistic blocks of blackness makes it very clear to the reader that they are not meant to understand, and that that is in fact part of the intended experience; if the reader is missing something, that's fine, because the viewpoint character is also missing something and will act accordingly. As the narrative progresses, and the main character gains better familiarity with local languages, the distribution of redaction boxes shifts from content words almost entirely to function words (articles, prepositions), which (especially with the visual and narrative context) leaves the complete meaning easily recoverable while still conveying the idea that the main character's language competence is still not perfect. This partial redaction approach to secondary language representation is still something that I don't think you could get away with in any other medium... but it is very similar to something that you could get away with in English literature a Long Time Ago; in particular, works like H. G. Wells's The Time Machine occasionally simply replace bits of dialog with dashes, as illustrated in the line:

“Where’s——?” said I, naming our host.

in which it appears that the author simply couldn't be bothered to come up with a name that really should have appeared in dialog. If that were an established modern trope, it could probably be extended to representation of secondary languages with very explicit Make It Irrelevant messaging--but I would not recommend actually trying it outside of a comic strip setting!

While Heterogenia Linguistico is less sophisticated than Time in its handling of secondary language content, it nevertheless is a decent example of linguistic science fiction, largely focusing on alternate modalities for interspecies communication, like in Semiosis. And, of course, it is one of very, very few bits of popular media (like Disney's Atlantis) which have a linguist as the main character, so it gets points for that! But wait, you might be thinking, didn't I say it was fantasy? Well, yes... but it leans on linguistic science, and presents fictional linguistic scenarios to be scientifically analyzed. It may be within a fantasy setting, but none of the linguistics is explicitly fantastical; there is no telepathy, or Whorfian determinism, for example. And that is enough for me to call it linguistic sci-fi, even if it's set in a fantastical background. A sample of the linguistic fictions that it explores:

  1. A color-based writing system derived from the color-based language of Krakens (which can alter their skin color at will, just like real-world octopodes and squid), which appears to be used logographically / semasiographically as it is not phonetically connected to the speech of the Lizard People who use it.
  2. A werewolf community which uses anomalously little acoustic speech because they can obtain so much social information through smell instead.
  3. A language that uses primarily ingressive rather than egressive sounds.
  4. An interspecies lingua-franca that uses a different phonetic inventory for each species that uses it, tailored to their articulatory abilities.
  5. A sign language based on full-body dance movements as an interlanguage for an avian race which cannot make humanoid-like speech sounds and lacks hands for humanoid sign languages.
I'd like it better if some of these things were developed as actual conlangs and more fully integrated into the text (you'd think they'd have to do something for the color writing... but it's a black-and-white comic, so they nicely sidestepped that!), but the ideas are pretty cool on their own.

If you liked this post, please consider making a small donation!

The Linguistically Interesting Media Index

Thursday, January 13, 2022

Marvel's Multilingual Eternals

Marvel's Eternals showed up on Disney+ yesterday. So, I watched it.

My expectations were low, and thus I was pleasantly surprised! 

The most obvious linguistic thing about it is that it features a bunch of different languages in dialog! Including Babylonian, for which Assyriologist Martin Worthington was consulted, and ASL.

But, for the most part, its usage of them is non-notable. It's just using subtitles all the time, all over the place.

I suppose that is kind of notable, however, insofar as it demonstrates that you really can do that. In case there was still any doubt, it seems that movie audiences are just fine with reading subtitles--so if that's keeping you from writing a screenplay featuring a secondary language, get over it!

Now, Eternals does not have the best audience review score ever (78% right now according to Rotten Tomatoes), so you might be thinking "well, what if the people who hated it hated it because of subtitles? I don't want to drive away 22% of my potential audience!" So, I read every audience review on Rotten Tomatoes, to find out exactly why people liked or disliked it. (Well, skimmed; there are a lot of audience reviews!) And in all of those reviews, I could only find one that kinda sorta obliquely may have been related to language issues:

"Like why is there even a deaf chick? How does that help the story?"

So, I'm pretty sure subtitling was not the problem. Go forth, screenwriters! Let your characters be multilingual, secure in the knowledge that if you can't figure out anything more interesting, you can just subtitle them, and it'll be fine!

There were, however, two actually interesting things done with Babylonian & ASL early on in the film: First, we have a brief shot in which ASL is diegetically interpreted into, not English, but Babylonian, while simultaneously being translated in English subtitles for the audience. Three languages at once (two diegetic and one non-diegetic) is not something I have seen before, and certainly something that is much more suited to the medium of film than to prose. Second, we have this line spoken (in English) by Ikaris:

"If I want to spend more time with you, I need to get to know them."

as an explanation of why he had just spoken his previous line in Babylonian. In other words, the writers are showing Ikaris learning a new language specifically in order to insert himself into a social position that that language will give him access to! Good job with the (probably unintentional) sociolinguistic awareness! This nicely ties back into my comments on Toolmaker Koan, regarding the need for a secondary language to serve some purpose in the story. Without some secondary language in play (which one in this case being dictated by the setting), the writers would have been missing a tool to establish Ikaris's character development.

Now, let's look back at that Rotten Tomatoes review: why is there a "deaf chick" speaking ASL? How does that help the story? Well, sadly, it didn't help the story quite as much as it could have. Several scenes show Makkari relying on lip-reading to understand lines spoken by other characters who totally knew sign language as well, and could've signed to her. But, for some reason, they just... didn't, always. Nor were the unique advantages of the visual vs. auditory medium ever exploited--e.g., to communicate more effectively in noisy environments, or without attracting attention by making noise. And without those kinds of plot-integrated justifications, we are left wondering why a godlike Celestial like Arishem would create an Eternal hero with an apparent sensory disability. Nevertheless, the use of ASL, and the associated insertion of a Deaf character, does have a purpose in the story even if it doesn't have relevance to the plot--it's simply characterization. Why should there be a Deaf character? Well, because why not? Why shouldn't there be a Deaf character? She is there to challenge the audience's implicit conceptions of what a "default" character type is, and to provide representation with which an additional audience segment can identify. Now, there is almost always a way to make the secondary language plot relevant, whether signed or auditory, as Michaelbrent Collings did with Portuguese in This Darkness Light--and I do fault the writers of Eternals a bit for not bothering to find those ways. But simply representing someone from a different speech community is a totally valid reason to have a secondary language all by itself, despite what certain Rotten Tomatoes reviewers might think.

It is also worth noting that, as in The Dragon Prince, ASL is used to establish a second narrative translation convention alongside English. While we can assume that, in the contemporary scenes, the Eternals are in fact speaking English when they appear to be speaking English, and thus may actually be speaking ASL when they appear to be speaking ASL, this cannot be the case for the historical scenes, in which none of them would ever have heard of English or ASL yet, let alone had a chance to learn those languages. (Some media sources have reported that they actually created a whole new sign language for the movie, but that's a severely misleading headline--in fact, the actor Lauren Ridloff and her husband Douglas, who was the film's ASL consultant, just created new ASL name signs for the characters.) Thus, the Eternals circa 5000 BC were presumably using some unspecified fictional alien sign language, which is merely represented as ASL non-diegetically for the benefit of the modern audience.

Tuesday, January 11, 2022

The Toolmaker Metaphor

Toolmaker Koan (only available from third-party sellers, which is incredibly amusing if you've read the book, but I've provided an Amazon affiliate link anyway) is a 1988 science fiction novel by John McLoughlin about first contact with two different non-human... species? (don't want to spoil too much!)... while humanity is on the cusp of an apocalyptic nuclear war. It's only a few months older than me, just barely older than the collapse of the Soviet Union, and a prime example of just how consistently late 20th century sci-fi authors failed to predict said collapse!

There is quite a range of linguistically interesting stuff going on in here. Not a large total volume of it, but a lot of small bits of different kinds of things.

We start out, upon meeting our first alien intelligence (who has been studying humanity in secret for some time, and thus conveniently speaks English), with a technical articulatory phonetic description of how to pronounce its chosen name! ("Charon") This sort of linguistic detail doesn't show up again, but the specification that the Greeks would pronounce it with "the back voiceless stop consonant you call K" becomes relevant in the denouement for recognizing the adjective "Karonic".

A bit later, we get some Greek when discussing the significance of the alien's name:

"The Greeks always buried their dead with a coin offering, the danake, in their mouths. Charon, I'm afraid, was a bit of a miser; with the coin the soul paid Charon's naulon, his toll."
This is straightforward diegetic translation, in which the character is using appositive definitions. I'm not really sure why the character, and thus why the author who wrote that character, bothered in this case, though; it feels very much like a forcibly inserted chance to show off cool background knowledge, which reminds me of this XKCD comic (even though the words in this case are not made up by the author!) This seems like a good time to point out that, while I am a great proponent of Interesting Linguistic Content, all of the techniques I am documenting are pointless if they are not used in support of the story! You've got to find a way to give it a function, or, as much as it pains me, cut it out.

Later, the alien Charon, by application of god-like alien technology, uh... reconstitutes a tribe of Australopithecus. Several of their vocalizations are quoted (presumably onomatopoetically), but our human protagonists obviously don't understand them (or even know if there is anything to understand in the first place!), which makes this a very clear case of Making it Irrelevant. (For reference, if the Australopithicii have language, two of their words appear to be "Skaroch!" and "Skuh!")

Eventually, Charon introduces the humans to the whileelins--or at least, their name is spelled "whileelin". How it is pronounced is anyone's guess, as their language is supposed to be whistled! (Or perhaps "sung", as it is supposed to sound similar to birdsong.) Whistling is a modality in which natural human languages actually exist, but never a primary modality; like writing, it always serves as a secondary encoding of normally-spoken language. So a non-human species that naturally whistles is a neat idea--especially since I happen to be working on a primarily-whistled conlang myself right now!

Sadly, there is no indication of how information is actually encoded in the whistled signal, and no description of how the transcription system works, and thus no easy way to compare the whileelin language to human whistle languages. On the bright side, that means I am free to assume that it does not rely on absolute pitch discrimination! The only descriptions of the sounds or lexicon of the whileelin language that we get are that "Hwiliria" (the name of the whileelins' spaceship) sounds like a "four-toned burst of music" (a very strange description given that there are more than four letters and more than four types of letters in the name), and this bit of non-diegetic translation:

"Haijar," agreed the First.

which takes advantage of the specific semantics of the word to give you an approximate definition in the speech tag! That's not something you can get away with very often!

Prior to the humans' introduction to the whileelins, however, McLoughlin establishes some dramatic irony by shifting to the whileelins' point of view for a couple of scenes, which are used to establish a particular kind of Narrative Translation Convention, in which archaic thee/thou pronouns and associated verb conjugations in English are used to represent the "Patriarchal mode" of the whileelin language.

Whileelins are not built like humans, physically or mentally. Thus, there are three stages to whileelin languages: They are born with an innate, genetically-programmed understanding of a basic "creche language"; upon reaching adulthood, their brains grow to unlock another innate Patriarchal/Matriarchal language. In between, there is a ten-year period of high intelligence and mental flexibility in which all the variety of arbitrary language can be learned and developed--unless a whileelin is neutered, halting their transition into full adulthood and allowing them to maintain mental flexibility indefinitely. We only ever encounter one linguistic community of whileelin in this novel, but presumably this means that whileelin languages can diverge from one another... but all possible whileelin languages would be much more similar to each other than human languages are, due to the constraints of developing from a common innate language, and needing to accommodate the integration of a second innate language--at least, as long as juveniles and castrati care about learning to understand the speech of sexually-mature adults! This is a fascinating bit of fictional linguistic science that qualifies this work as linguistic science fiction--which does not focus on the strong Sapir-Whorf hypothesis!

Returning to human languages, there are three insertions of Spanish; one short bit of code-switching right near the beginning, and two longer phrases in the last third of the book. That bit of code-switching looks like this:

"Nossir. But we're good hackers, que no?"

In this case, English syntax Makes the meaning Obvious. It's a tag question, and English is pretty liberal with what can go in the tag slot already when multiple dialects are considered (cf. "eh?", "innit?", etc.)--so when you drop some random short thing with a question mark after it (which happens in this case to be Spanish) in the tag slot, it's pretty obvious what it means just from where you put it! And while we have been told that the characters here are a multi-ethnic, multi-national group, this little bit of inserted Spanish helps to show us that--even though the rest of the dialog is English (excepting the couple of words of Greek mentioned above) for the next 212 pages!

But, on page 242, we get a reintroduction to the character who was the addressee of that tag question, who was out of the action for a good long while. And just in case you forgot who he was, his reintroduction consists of thinking "Madre de dios!" while "wiping his lap frantically" because he spilled hot coffee! The context, and exclamation mark, and implicit background knowledge that people tend to slip back into their native (or most comfortable) languages when stressed or cursing (or stressed and cursing) make it pretty Obvious that this is an expletive, Irrelevant what the literal meaning is, and help remind us who this character is--oh yeah, it's the guy who was addressed with Spanish!

Later on, we get this interesting passage:

--but then this Charon had claimed to be a sentient machine, one speaking like a crazy old man. Un viejo poquito loco, and it claimed to know a great deal.

Like the earlier inclusion of Greek, I cannot see an obvious purpose for this; it might just be there to remind you again, in case you forgot, that at least one character here (this one) has a native language other than English. However, regardless of purpose, the structure is fascinating. You've got the English "crazy old man" and the Spanish "Un viejo poquito loco" in direct textual juxtaposition--but, they aren't actually in the same sentence, and so not in syntactic apposition, which considerably increases the cognitive load on the reader to identify one as a (rough) translation of the other. Now, that could be a bad thing if the meaning is really important--or perhaps a good thing if you want the reader to pause and think about a particular passage. But McLoughlin side-steps the issue by simultaneously ensuring that the meaning of the Spanish is totally irrelevant. You can just delete it, and the sentence is still perfectly comprehensible, so it doesn't matter if a monolingual English reader doesn't figure it out! They still will have been shown the Spanish and given that reminder.

Finally, in the Epilogue, we get this:

    "Munirda, strangeko!" The girl glanced shyly at the Mother, back again at the Karonic. "Mensch, two, Marma, oltimaku wringlerising!"
    "And to me, my noisy descendant, you speak English!"

Now, I really hope that that's just Irrelevant, 'cause I cannot figure out what exactly it's supposed to say. (I'm pretty sure it is, 'cause the rest of the epilogue makes perfect sense.) However, several individual bits are tantalizingly familiar--which, in the context of the response, suggests that all this seeming gibberish is supposed to accomplish is to show you that this girl in the future is speaking something descended (at least partially) from English, but different from contemporary English, which helps to suggest the depth of time that has passed between the last chapter and the epilogue.

But, what does all this have to do with metaphor? Well, the linguistic content of Toolmaker Koan reminded me of the conflict between the Conduit Metaphor and the Toolmakers Paradigm, first described by Michael Reddy.

The Conduit Metaphor is a conceptual metaphor deeply embedded in the English language; it is the conception of utterances as containers into which thoughts can be placed, and sent (through the conduit of speech) to another mind. A few examples from Reddy's paper:

  1. Try to get your thoughts across better.
  2. None of Mary’s feelings came through to me with any clarity.
  3. You still haven’t given me any idea of what you mean.
  4. Whenever you have a good idea practice capturing it in words.
  5. You have to put each concept into words very carefully.
  6. Try to pack more thoughts into fewer words.
  7. Insert those ideas elsewhere in the paragraph.
  8. Don’t force your meanings into the wrong words.
Though it may feel entirely natural to speak this way, it is not a necessary conception of how to talk about language. Other frameworks are possible. And to prove this, Reddy proposed the Toolmakers Paradigm; in this alternative metaphor, we are all isolated minds living in different mental environments, and creating tools (ideas) appropriate to those environments. We can pass blueprints for tools (utterances) between our environments, but, lacking shared context outside of the blueprints themselves, there's no way to ever tell if you actually built what someone else sent you the plans for, or if anyone else has interpreted your plans correctly. (I am aggressively summarizing here; I strongly suggest actually reading Reddy's paper.)

Any neurodivergent person who has encountered the double empathy problem, or any author who has encountered baffling analyses of their work, can easily understand the far more accurate nature of the Toolmakers Paradigm. And yet, despite being toolmakers, as Toolmaker Koan repeatedly reminds us that we are, we English speakers at least seem to really want it to not be so! It would be so nice if language actually contained thought and transmitted it accurately; and I'm sure it doesn't help that the Conduit Metaphor can be made more accurate for transmission of information between machines; but it just ain't that way for humans! In fact, I think we can even do better than the Toolmakers Paradigm as described by Reddy; language itself is a tool. No metaphor required! Even apart from acting as cognitive technology (as I referenced in my review of Ted Chiang stories), language is a blunt tool with which we try to sculpt crude replicas of our thoughts in other people's minds. The simple fact that humans create and use languages makes us toolmakers all on its own.

Friday, January 7, 2022

Linguistically Interesting Media Index


    1. Alien Communication in Semiosis
    2. The use of French in Kill the Beast
    3. Learning Portuguese in This Darkness Light
    4. Integrating a Conlang in A Game of Thrones
    5. Fictional Linguistics in Reading the Bones
    6. Why Disney's Luca is Bad, Actually
    7. Disney's Conlangs
    8. Barsoomian & John Carter
    9. Rabbits, Smeerps, and Empires
    10. Vance's Language of Pao
    11. Mr. Holland's ASL
    12. Linguistic Representation in The Dragon Prince
    13. The Hidden Language of K. A. Parkinson's Chosen Chronicles
    14. British Sign Language in Doctor Who
    15. Война et Paix: French in the Great Russian Novel
    16. Into the Night of Language Diversity
    17. The Mandalorian & Tusken Sign Language
    18. Shadowscent: The Darkest Conlang
    19. The Steerswoman & the Wood Gnome
    20. The Transgalactic Guide to Solar System M-17
    21. The Steerswoman & the Outskirters
    22. Rosemary Kirstein vs. The Enderverse
    23. The Language of Power: Unanswered Questions
    24. Language Planning in the Enderverse
    25. The Other Ted Chiang Stories
    26. The Toolmaker Metaphor
    27. Marvel's Multilingual Eternals
    28. Secondary Languages in Time and Heterogenia Linguistico
    29. The Trilingual Fiction of Eric James Stone
    30. A Literature of Sign
    31. Linguistics as the Science of Science Fiction
    32. Rylan & The Last Starfighter
    33. Decoding Sangheili in Halo
    34. How Can We Portray Languages In Games?
    35. OK, fine, I'll do Arrival
    36. Linguistics & Andy Weir
    37. The Sci-Fi Linguistics of Babel-17
    38. The Sci-Fi Linguistics of The Embedding
    39. Xenolinguistics: A Review for Authors & Conlangers
    40. Larry Niven's Grammar Lesson
    41. Three Miles Down
    42. Stridulation in Landscape with Invisible Hand
    43. True Biz & A Literature of Sign
    44. Babel: Or, the Necessity of Violence
    45. How is Castlevania like Luca?
    46. The Year of Sanderson
    47. What if... Marvel Audiences Had to Read Subtitles for Mohawk Dialog?
    48. Review: Reading Fictional Languages

    Lexember 2021: A vec2word Retrospective

    So last Lexember, I made heavy use of vec2word for machine-assisted vocabulary creation.

    As should be expected with an experimental prototype thing, it did not go as smoothly as I had hoped it would. But, it worked well enough that I think I can recommend the concept! In fact, it worked well enough that, for the first time ever, I did not miss a single day of Lexember--and actually produced significantly more than one word per day.

    I ended up using vec2word's suggested semantic fields to do my glossopoesis for both Tjugem (an in-progress whistled conlang) and Fysh A (the result of my pondering about speech in the modality of modulated electric fields), but I used it in slightly different ways for each language.

    For Fysh A, I generated two cluster lists from the same vector model--one for single-syllable words, and one for two-syllable words. The idea here was that the shorter list of one-syllable words would produce more semantically broad clusters, and short words should be semantically broad so that they can get used a lot. It turns out that that was not the best line of thinking after all, because common words are not always super semantically broad! That problem can be ameliorated in a couple of ways, though: First, Lexember doesn't have enough days to exhaust the complete monosyllabic word list, so there are a lot of monosyllables left over--I could try to continue using the vec2word outputs to assign meaning to all of them, but I can just as well simply decide not to, and get my common-yet-specific words through more traditional Artisanal Lexicon Creation processes. That will mess up the phonosemantic tendencies... but well, natural languages don't have universal phonosemantic systems anyway! Second, on several occasions I just decided that some particular word was going to mean a much more specific subset of what was suggested by the model. That requires more thought than I originally anticipated, but honestly it's how I would probably recommend using the system if you stick with the cluster-generation algorithm I described in my last vec2word post (i.e., trying to get clusters of words that are as semantically coherent as possible).

    For Tjugem, I also generated two cluster lists, one for (a subset of) possible single-syllable roots, and one for (a subset of) possible two-syllable roots. These served a very different purpose from the lists for Fysh A, however. The intention was to use the much larger (with correspondingly narrower semantic fields) list of two-syllable roots as slightly-polysemous stems, with single-syllable suffixes corresponding to broader semantic fields to disambiguate the precise meaning--sort of like Chinese two-character compounds. This approach, however, had a couple of problems: In many cases, two-syllable stems would already have precise enough meanings that any broadly compatible suffix wouldn't actually add anything useful; and conversely, it was often difficult to find a variety of suffixes that would all make sense in different ways with one stem. It was almost always possible, but it required a lot more searching and thought about how meanings might shift or how useful meaning might be implied by weird combinations. To make this kind of structure work, I suspect it would be better not to try to ensure that the clusters are as coherent as possible--in fact, bothering with vector clustering is probably entirely unnecessary, as one could just produce random jumbles of source words to represent random homophone sets; then, compounds could be automatically generated by finding stems and suffixes that have overlaps in their polysemies.
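
    To make that last idea a bit more concrete, here is a minimal Python sketch of what "random homophone sets plus overlap-based compounding" might look like. This is not how vec2word works, and I have not run it on Tjugem; the function names, the hyphenated compound format, and the toy word forms are all made up for illustration.

```python
import random

def random_homophone_sets(forms, source_words, senses_per_form, seed=0):
    """Give each form a random jumble of source words as its polysemy set.

    `source_words` must be a list; the seed only makes the jumble repeatable.
    """
    rng = random.Random(seed)
    return {form: set(rng.sample(source_words, senses_per_form)) for form in forms}

def find_compounds(stems, suffixes):
    """Pair every stem with every suffix whose polysemy set overlaps its own.

    Returns {"stem-suffix": shared_senses}; the shared senses are the
    candidate meanings of the compound.
    """
    compounds = {}
    for stem, stem_senses in stems.items():
        for suffix, suffix_senses in suffixes.items():
            shared = stem_senses & suffix_senses
            if shared:
                compounds[stem + "-" + suffix] = shared
    return compounds

# Hand-written toy data (hypothetical forms, not real Tjugem roots).
stems = {
    "kalu": {"river", "road", "vein"},
    "tiso": {"bark", "skin", "shell"},
}
suffixes = {
    "mi": {"river", "rain", "sea", "bark"},
    "po": {"road", "skin", "wall"},
}

for compound, senses in find_compounds(stems, suffixes).items():
    print(compound, sorted(senses))
```

    In this toy setup each stem is ambiguous on its own, and the suffix picks out whichever sense the two forms happen to share; stems and suffixes with no overlap simply never combine.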

    It may also be worth exploring entirely different clustering strategies. For example, a proper ontological hierarchy might be generated by, rather than producing all the clusters at once and then sorting them along their centroid dimensions, instead looking for a small number of high-level clusters corresponding to an initial phoneme or syllable; and then independently finding another layer of clusters within each of those, and so on, until you have the total number that you want. This is essentially how philosophical languages like Wilkins's Real Character work, although John Wilkins produced his ontology entirely manually!
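
    As a rough illustration of that top-down strategy, here is a small self-contained Python sketch. It is only a sketch under assumptions: the k-means routine is a bare-bones stand-in for whatever clustering method you prefer, the 2-dimensional vectors are toy values rather than output from a real word-vector model, and the phoneme inventory and function names are invented for the example.

```python
import random

def kmeans(points, k, iterations=20, seed=0):
    """Bare-bones k-means over a list of equal-length tuples.

    Returns the non-empty clusters as lists of indices into `points`.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    dims = len(points[0])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for i, p in enumerate(points):
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(i)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = tuple(
                    sum(points[i][d] for i in members) / len(members)
                    for d in range(dims)
                )
    return [members for members in clusters if members]

def build_hierarchy(words, vectors, phonemes, depth, prefix=""):
    """Split words into at most len(phonemes) clusters per level, recursively.

    Each level contributes one phoneme to the prefix, so words sharing a
    longer prefix sit closer together in the hierarchy.
    Returns {prefix: [words]} for the leaf clusters.
    """
    if depth == 0 or len(words) <= 1:
        return {prefix: words}
    k = min(len(phonemes), len(words))
    clusters = kmeans([vectors[w] for w in words], k)
    leaves = {}
    for phoneme, members in zip(phonemes, clusters):
        subset = [words[i] for i in members]
        leaves.update(
            build_hierarchy(subset, vectors, phonemes, depth - 1, prefix + phoneme)
        )
    return leaves

# Toy 2-D "embeddings" (made-up numbers, purely for demonstration).
toy_vectors = {
    "dog": (0.0, 0.0), "cat": (0.0, 1.0), "wolf": (1.0, 0.0),
    "hammer": (10.0, 10.0), "saw": (10.0, 11.0), "chisel": (11.0, 10.0),
    "rain": (0.0, 20.0), "snow": (1.0, 20.0), "hail": (0.0, 21.0),
}

leaves = build_hierarchy(list(toy_vectors), toy_vectors, ["k", "t", "s"], depth=2)
for form_prefix, members in sorted(leaves.items()):
    print(form_prefix, members)
```

    The point of doing it this way is that the first phoneme is decided by the coarsest split and each later phoneme by a split within the previous one, instead of cutting a single flat list of clusters into pieces after the fact.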

    I also experimented last month with various ways of trying to extract meaningful semantic relations in an unsupervised manner, which might suggest possible morphological processes for adding to a conlang. As I predicted it would be, this is an incredibly computationally expensive process, so not much ended up coming of it; however, some new approaches to efficient clustering have been suggested to me, so there are a few more approaches to this that I might still try in the future.

    That's pretty much all I have to say about that, but, as always, just in case you feel like giving me money: you can do that right here.

    Wednesday, January 5, 2022

    The Other Ted Chiang Stories

    If you've seen the movie Arrival, you may know that it is based on a novella by Ted Chiang: Story of Your Life, available in the collection Stories of Your Life and Others. And if you are a conlanger, you may know that the alien language Heptapod B had been discussed in conlanging circles for quite some time before the movie was made.

    But, I don't really want to talk about that story; it's been done to death, with its premise that's kind of frustrating because it's just another rehash of the strong Sapir-Whorf hypothesis but is nevertheless a fresh and intriguing take on that extremely broad category of linguistic sci-fi premises... and there I go talking about it by accident anyway! Let's move on!

    I acquired the above-mentioned anthology of Ted Chiang stories for Christmas; I had read all but one of the stories before, in other venues, but coming back to them again one after another in this format allowed me to recognize the repetition of linguistic themes, which go beyond the boundaries of Whorfianism. So, let's talk about some other Ted Chiang stories!

    Understand

    Understand is not about language, per se; it is about imagining what it might be like to have intelligence so much quantitatively greater than that of the smartest normal humans as to produce a qualitative difference in the experience of cognition. But in so doing, linguistic concepts are brought up; in fact, conlanging is brought up! (Although not by name.) The protagonist effectively identifies language as a cognitive technology (1, 2)--and, having determined that the natural language he speaks is sub-optimal for continuing to expand his cognitive abilities, sets out to create a new language that would be more suited to thought. He even plans to discard the basics of natural language and consult logic for the fundamental units on which to build his new language--rather like my approach to building WSL (although the intent is quite different)! The new language is described as being "gestalt-oriented", incapable of being spoken or written linearly; a description which puts me in mind of non-linear conlangs like UNLWS, and of course is parallel to the description of the ideograms used to write Heptapod B! Seeing as this story is older than Story of Your Life, it seems likely to me that this is where the idea of gestalt ideograms originated for Ted Chiang, then to be re-used in the more famous story. However, the theoretical language of Understand is further described as being sub-optimal even for a static page, best represented as a hologram or a video of a time-evolving image. I have no idea if these are the terms Ted Chiang was thinking in when he wrote that description, but that is in fact exactly what I would expect from a perfect non-linear loglang, which encodes a semantic graph. The planarity of a page puts limits on the types of semantic graphs that can be encoded, necessitating workarounds for notating crossing edges, or else allowance for ambiguity, but a 3-dimensional "surface" would in fact allow embedding any arbitrary graph!
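
    For a feel of how quickly a flat page runs out of room, here is a tiny Python sketch of the standard edge-count bound that follows from Euler's formula: a simple planar graph with v >= 3 vertices can have at most 3v - 6 edges. The function name is my own, and the check is only a necessary condition, so passing it does not prove that a graph can be drawn without crossings.

```python
def may_be_planar(num_vertices, num_edges):
    """Necessary (not sufficient) edge-count test for a simple planar graph.

    Returns False when the graph has too many edges to be drawn on a
    page without crossings; True means "not ruled out by this bound".
    """
    if num_vertices < 3:
        return True
    return num_edges <= 3 * num_vertices - 6

# Five concepts that are all pairwise related: the complete graph K5
# has 5 vertices and 10 edges, which exceeds the bound of 9.
print(may_be_planar(5, 10))

# Four mutually related concepts (K4: 4 vertices, 6 edges) still fit.
print(may_be_planar(4, 6))
```

    So a semantic graph in which just five nodes all relate to one another already forces a crossing edge (or some notational workaround) on a page.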

    Seventy-Two Letters

    Seventy-Two Letters is a unique take on the idea of a magical language--one which is inherent to the universe, rather than arbitrary, and command of which allows for the production of magical effects. The "magic" of the seventy-two-letters universe is realized in the form of twelve-by-six grids of letters constituting "true names" used to animate golems of various shapes and functions, thus constituting a kind of natural programming language whose semantics the characters must discover through a scientific process of investigation. The greatest breakthrough comes when the ability to produce self-referential statements is discovered (e.g., the famously paradoxical "this sentence is false"), which then leads to the creation of quines--and the ability of golems to contain the information necessary to build more of themselves. The idea of a magical language in and of itself is certainly not new, but this is a very original fictional take on how such a thing might be used!

    If you liked this post, please consider making a small donation!

    Monday, January 3, 2022

    Language Planning in the Enderverse

    Way back in the early 2000s, I read the whole Ender's Game quartet.

    And then, while comparing The Lost Steersman to the Enderverse, I discovered that the quartet is now a sextet! Apparently, there was a new sequel to Ender's Game published in 2008--and a follow-up to Children of the Mind published just last year!

    So, I picked up Ender in Exile just for fun... and then found out it has linguistic content!

    (Note that all links to books are Amazon affiliate links.)

    Ender in Exile features several Italian characters (with just a smidge of Italian text as a secondary language), as well as peripheral references to other language communities. In chapter 8, we get a direct reference to language planning:

        "I've been thinking of teaching English," said Valentine. "Offering a class."
        "Not English," said Ender. "Common. It's spelled better--no ughs and ighs--there's no subjunctive, no 'whom', and the word 'of' is spelled as the single letter 'v'. To name just a few of the differences."

    In the Enderverse, "Common" is established as a version of Controlled English, similar to Aviation English or Simplified Technical English, with the addition of a spelling reform.

    Simplified Technical English, like International Fleet Common, was intended for easier accessibility to second-language learners. And, as it happens, there was, historically, an attempt at English spelling reform for the same purpose: the Deseret Alphabet, a phonemic alphabet for (a particular range of dialects of) English that was supposed to simplify English language instruction, and thus cultural integration, of the large numbers of international converts to the Church of Jesus Christ of Latter-day Saints migrating to the Territory of Deseret (which later became parts of Utah, Colorado, Idaho, Wyoming, New Mexico, Arizona, Nevada, California, and Oregon--basically, the US Mountain West) in the mid to late 1800s. Sadly, the Deseret Alphabet never really caught on, and is now effectively extinct, but the Enderverse's International Fleet has a great deal more political clout to enforce a spelling reform of English (historically a very difficult project to pull off), and, given Orson Scott Card's religious affiliation, I have to wonder if the Deseret Alphabet was any sort of inspiration for this feature of International Fleet Common.

    A few pages later, we get an example of Diegetic Narrative Translation of Italian:

        "It doesn't matter," said Alessandra. "Not enough women ruoli, parti--how do you say it?" She turned to Valentine hopelessly.
        "'Role'," said Valentine. "Or 'part'."

    which is followed up by a couple of examples of Making It Obvious with other Italian words that are very similar to their English translations (holografi - holographs, Il teatro - the theatre).

    In chapter 14, we get a bit of Easter Egging:

        Alessandra stood there, her hand to her mouth. Then tears came to her eyes. "Per tutte sante," she said. "I was... doing what she wanted. [...]"

    No translation or explanation is given or necessary, as the context makes it obvious that this is some sort of emotional interjection. (Incidentally, it happens to mean "For all saints", a pretty standard entry in the category of religiously-based expletives, which also serves to reinforce the character's Catholic cultural background.)

    A little earlier, in chapter 13, however, we get this intriguing exchange:

        "[...]That way, no matter who wins this little power struggle, we'll be able to cash in. Am I correct?"
        Alessandra had spoken the phrase "cash in" in English. Dorabella seized on that. "Shakespeare Colony has no cash yet, darling," said Dorabella. "It's all bartering and allotment so far. [...]"

    which nicely serves to highlight a narrative translation convention. Of course, two native speakers of Italian having a private conversation with each other can reasonably be expected to actually be speaking Italian, even if the text on the page is English for the reader. But when they have each been portrayed in extensive conversation with English speakers, it might be easy to forget that! This callout of a character specifically using English is a sneaky way to remind us that we should actually be envisioning the remainder of the conversation in Italian, without ever having to explicitly say "they were speaking Italian."



    Saturday, January 1, 2022

    2021 Fiction Reading Year in Review

    Inspired by Graham Bradley's Year In Review, I have attempted to catalog all of the fiction which I have read to completion for the first time this year, not including kids' books that I read to my children.

    This is a very approximate list, and in no sort of chronological order, because a year ago is a long time, and I did not make notes of this stuff as I went... Where possible, I have included Amazon affiliate links for acquiring the books. Generally speaking, I have enjoyed everything I have read (I wouldn't have finished it otherwise!), so presence on this list can be interpreted as an endorsement and recommendation.

    Since I've already mentioned Graham, here's what I read by him this year:

    • Kill the Beast, a steampunk / gaslamp horror retelling of Beauty and the Beast, which I reviewed as one of the first entries in my series on secondary languages in fiction.
    • Sleepless Hollow (also available as a free audiobook, narrated by the author), a contemporary sci-fi / supernatural sequel to The Legend of Sleepy Hollow.
    • The Guild of Eldritch Adventurers (podcast serial only), a time-travelling homage to The League of Extraordinary Gentlemen featuring characters from American literature.

    And since I do ARC reading for Michaelbrent Collings, here are his new releases:

    • Synchronicity, a contemporary sci-fi thriller.
    • Malignant, a contemporary thriller based on the horrors of human trafficking & sex slavery.

    Note that I also reviewed another Michaelbrent book for my secondary language series.

    I made a decent dent in the H. P. Lovecraft bibliography:

    And since I seem to be grouping things by author so far, let's just keep that up!

    By Sue Burke, I read Semiosis (which I previously reviewed) and its sequel Interference, a sci-fi duology dealing with human contact with aliens who have radically different means of communication.

    By Sheila Finch, I read Reading the Bones, a novel in the Guild of Xenolinguists universe which I also previously reviewed.

    I made a pretty good dent in Terry Pratchett's Discworld series... which I may at some point have to do a short review of addressing the narrative usage of Dwarfish:

    By P. M. Freestone, I read the Shadowscent duology, a low-magic secondary world fantasy about which I previously interviewed linguist & conlanger Lauren Gawne. This consists of:

    By Arkady Martine, I read the Hugo-winning A Memory Called Empire, which I also reviewed. Even without having read the sequel yet, I have many additional thoughts that did not make it into that review, but unfortunately Arkady Martine's publicist has not yet gotten back to me about doing an interview, so I've been holding off on publishing a sequel review.

    In the course of a couple weeks, I knocked out the entire Hugo-and-Nebula-winning Murderbot Diaries series by Martha Wells:

    And finally, I read Rosemary Kirstein's 4-book-so-far Steerswoman series, an excellent entry in the surprisingly large genre of things that look like fantasy but actually turn out to be science fiction with female protagonists written by female authors starting in the 70s or 80s!

    I acquired a good bit more than this, but my reading rate has not kept up with my acquisition rate....