Wednesday, December 29, 2021

The Language of Power: Unanswered Questions

The Language of Power is the fourth book in Rosemary Kirstein's Steerswoman series (the three previous entries of which I reviewed here, here, and here).

This will not be a completely spoiler-free review; so, if that bothers you, I recommend clicking on those Amazon affiliate links, buying the books, reading the books, and then coming back here to see what I have to say about them!

First off, let's get the title out of the way: coming off of The Lost Steersman, I was kinda hoping that the Language of Power would be the demon language; but, no. The Language of Power abandons that sidetrack to get back to the main series arc as Rowan returns to civilization and continues trying to track down the mysterious master wizard Slado. In so doing, it picks back up several threads from book 1 that were dropped in books 2 and 3; in particular, we are re-introduced to the boy Willam, who had become apprenticed to the wizard Corvus--and from him, we learn that the Language of Power is actually computer programming. Or maybe mathematical physics... but probably computer programming. Much to my disappointment, there is no serious linguistic exploration going on here, but Willam's knowledge of "magic" and his attempts to explain it to Rowan give the most complete and straightforward evidence so far of the science-fictional, rather than fantastical, nature of this world.

Unlike the previous three books, which range over wide geographic areas, this volume takes place almost entirely in, or near to, the city of Donner, where Rowan has come following leads from an old Steerswoman's journal in order to try to uncover the history of Slado's origins and the fall of the guidestar. It's a fun combination of historical research, detective story, and even spy thriller when Willam spearheads a plan to break into the wizard Jannik's house to access records through his computer and satellite dish link (although, of course, the relevant equipment isn't actually called that in-world).

Unfortunately (or, I guess, fortunately for the author, because it means I will absolutely buy the next one as soon as it comes out), we are still left with a number of unanswered questions at the end:

  1. Will Willam ever actually get his revenge on the wizard Abremio and get his sister back?
  2. How do basilisks work? This volume features a detailed explanation of dragons, but that still leaves basilisks unaccounted for; what is the scientific explanation for what is presumably a war machine that looks like an animal, kills with its gaze, and is as dangerous to its own side as to the enemy?
  3. And, of course, can the demon language be deciphered?

If you liked this post, please consider making a small donation!

The Linguistically Interesting Media Index

Monday, December 27, 2021

Rosemary Kirstein vs. The Enderverse

The Lost Steersman is the third book in Rosemary Kirstein's Steerswoman series (previous entries of which I reviewed here and here, and the next of which I review here).

This will not be a completely spoiler-free review; so, if that bothers you, I recommend clicking on those Amazon affiliate links, buying the books, reading the books, and then coming back here to see what I have to say about them!

Now, here is a book that gets into some serious linguistic speculation! And somehow, the chunkiest of the currently-published volumes also manages to spend most of its time on a hard left turn away from the main series plot, while diving deep into things (such as the eponymous Lost Steersman, and the nature of demons) which were hinted at as background in the first two books and providing a huge amount of fascinating new worldbuilding detail.

The worldbuilding exploration is instigated by a series of demon attacks on the town of Alemeth--which conveniently gives us, the readers, as well as Rowan, our first insight into what "demons"--which have so far existed firmly "out of frame"--actually are. The subsequent discovery that the formerly-lost Steersman is in fact the only person to have acquired a "magical" talisman that can keep the creatures at bay--and that this is in fact intimately connected with the story of how he became lost to the order and later found again by Rowan--kicks off an adventure well beyond the boundaries of the known world to discover where all of these demons, previously presumed to be nearly extinct, are coming from, and why.

In the end, the story of the lost Steersman and the demons seems to me to bear a great deal of resemblance to Orson Scott Card's Speaker for the Dead (book 2 of the Ender series), which deals with the interactions of humans with the alien Pequeninos on their homeworld Lusitania. To recap (or just spoil it if you haven't gotten around to reading a book from 1986 yet), the Pequeninos' reproductive cycle requires males to die--and thus being ritualistically killed, such that they can reproduce, is considered a major honor. One which, of course, they wish to provide to their favorite human friends, without anyone bothering to talk about the situation first, even though there's enough of an established relationship that they totally could have, thus leading to serious misunderstandings. In fact, this isn't actually all that far off from the background inciting incident for the whole Ender series, in which the alien Formics don't realize that individual humans are sentient, and so don't think accidentally killing a few is a big deal.

Now, the Ender series are not bad books; in fact, they are very commercially successful books! But I don't think it should be too controversial to claim that the repeating circumstances in which aliens don't realize that killing humans is "bad, actually" are a little bit... contrived.

The conflict between humans and demons in The Lost Steersman is, on the other hand, much more reasonably motivated, with causes firmly rooted (at least in part) in human history--conquest of territory, and driving back of invaders. Initially, humans do not realize that demons are, in fact, people--and, most likely, demons don't realize that humans are people either. That's not too far off from the situation between Formics and humans in the Enderverse... except that "first contact" in the Steerswomen's world is between low-tech, nomadic peoples who have no means of communicating with each other, which makes the mistake far more reasonable than when you first encounter an alien in a friggin' starship. And, well, even if humans had figured out that demons are people much earlier on... let's be honest, they would've committed genocide anyway to take over their land. The humans of the Steerswomen's world are, after all, in the process of terraforming it and destroying all native life to make way for human habitation (as revealed in the last book), even if the current major players don't remember that that's what their cultural traditions are for.

Key to the misunderstanding is that demons do not communicate vocally, but rather visually--through sculpture. So when the lost Steersman is shipwrecked on their shores, losing all evidence of humanity's much more advanced maritime technology, the native demons see only unusual invasive animals to be driven away or eaten--and the humans see only monstrous animals trying to kill and eat them, and quite reasonably retaliate! Eventually, Rowan is able to deduce that the "magical talisman" that allowed the lost Steersman to survive and make his way back to human civilization--eventually bringing vengeful demons in pursuit--is (of course!) not magical at all, but merely a physical word in the demons' language, which they recognize and respond to.

Unfortunately for other humans, this word is impossible to replicate, because demon words are not sculpted from environmental materials; rather, they are produced by demons' biology. In particular, they are an exaptation of females' ability to excrete material for forming egg cases--which means that only female demons are actually capable of direct speech! Males can understand, and can collect discarded utterances and re-arrange them to communicate, but only in secret, as (at least in the particular demon society to which we are introduced), they are immediately executed if caught trying to speak. And this is in fact depicted as a complete language, not just a finite code, with females capable of gluing together individual word-objects into larger 3D sculptural discourses--although there does seem to be some innate, instinctual component to it, based on demons' consistent reactions to the talisman that keeps our human protagonists safe among them.

Sadly, the demon language is not actually deciphered to any significant extent, and I am skeptical that it has actually been worked out as a proper conlang, given how alien the premise is. Nevertheless, it is a fascinating and bold premise, and I would love to see someone try to create something that would fit the descriptions in the book!

We also see a bit more exploration of sign language, which featured in book one, as a result of Rowan attempting to decode and replicate (as well as a human with only two arms can) the demons' paralinguistic postural body language (which is the only method of spontaneous communication available to males). If the demons can be made to understand the concept of sign language, I would love to see this used as a bridge for communication between demons and humans in future stories. 

Stay tuned for my thoughts on the next entry in this series; and in the meantime, if you liked this post, please consider making a small donation!


Friday, December 24, 2021

The Steerswoman & the Outskirters

The Outskirter's Secret is the second book in Rosemary Kirstein's Steerswoman series (the first entry of which I reviewed here, the third here, and the fourth here).

This will not be a completely spoiler-free review; so, if that bothers you, I recommend clicking on those Amazon affiliate links, buying the books, reading the books, and then coming back here to see what I have to say about them!

This episode of the series sees Rowan, the eponymous Steerswoman, becoming embedded in the Outskirter culture from which her friend Bel originates as they travel across Outskirter territory on the way to the site of the fallen guidestar. As such, the story is less about the overall series arc introduced in book one, and more of a deep exploration of the world itself, in which the author continues to employ dramatic irony that relies on the fact that the reader knows more about physics and technology than the characters in the book do, and thus can make faster inferences about the nature of said world. This helps to firmly establish the series as science fiction, rather than fantasy--a distinction which I, at least, still found somewhat blurred by the end of the first book. Although not every fantastical object or occurrence is explained, there is enough of a pattern here to convince me that there is a scientific explanation for everything, even if it has not yet been revealed to the reader.

There are two notable bits of linguistics; one of the reasons I was convinced to start reading this series in the first place was the presence of Old English poetry. Sadly, there isn't any actual Old-English-style poetry presented in the text, but the structural specifications of Old English poetry are identified by Outskirters as the signal of a true poem--that is, a poem which has been composed with the intent to convey real historical information via oral tradition. I thought that was neat.

The second bit of interesting linguistics comes when Rowan elicits a listing of all of the Outskirters' family lines... which are listed in English alphabetical order! Except for one, which Rowan notices is out of place--a fact which she attributes to historical sound change in the Outskirter culture. This is a major clue that this world is in fact a sci-fi human colony world, rather than a fantasy secondary world, with a direct historical connection to Earth, and furthermore indicates that either

  1. Contrary to expectations for secondary-world fantasy stories, there is not in fact any narrative translation convention employed for this story; the characters who are presented as speaking English on the page, are in fact intended to be speaking English intrafictionally!
  2. Or, whatever translation convention there is is nevertheless very thin, only adapting a future version of English in which common contemporary names at least are still recognizable in written form back into a contemporary dialect for the audience.

Apart from the linguistic bits, though, I also found it fascinating how the portrayal of the Outskirter culture partially embraces and partially undermines an ancient literary trope which Bret Devereaux terms The Fremen Mirage. Societies of the Mirage are portrayed as:

  1. Unsophisticated and Poor.
  2. Morally Pure, particularly in gender roles, sexual purity, and abstinence from physical luxuries.
  3. Ruthless and Clever. Unsophisticated, yes, but not stupid.
  4. Martially Superior and quick to make war.
  5. Having a logical basis for their peculiarities, usually rooted in the environment (cf. "Hard times make hard men."), or, in more modern versions, in genetics and race.
  6. Existing in contrast with decadent civilization.

So, how do the Outskirters match up with each of these features?

They are absolutely unsophisticated and poor. They are nomadic, and have no possessions which they cannot carry with them. They have extremely limited food supplies. They have no ability to produce metal, and little enough access to it that they make swords from dense wood.

They are ruthless and clever, and perceived, both by themselves and by most Inner Landers, as excellent fighters; however, when tested, they are not inherently better than Inner Landers who have had martial training (although they do, quite naturally, have a better grasp of the dangers unique to their native environment than do Inner Landers who visit them). They are definitely quick to make war, and extremely untrusting of anyone not belonging to their same tribe.

They certainly have a logical basis for all of their peculiarities, even though they themselves do not always know what that basis is. In fact, identifying the reasons for all of the strange practices that Rowan is told about or introduced to among the Outskirters could be argued to be the central purpose of this book (sure, Rowan gets to the guidestar in the end, but that's just what happens, not what the book is really about), as it establishes the science fiction background of the setting. Additionally, the Outskirters themselves believe their peculiarities to stem from their own genetic superiority and racial responsibilities; in particular, they have a strongly held belief that they are the original humans, either separate from or ancestral to Inner Landers, and that being Outskirters requires them to live the way that they do.

They definitely exist in contrast with what they perceive as a decadent Inner Lander civilization; but the author, and the Inner Lander characters themselves, do not share that perception. More interestingly, however, the more-outer Outskirters believe themselves to be morally superior to culturally-decayed versions of themselves that exist closer to the Inner Lands--a belief which does seem to be shared by Rowan once she is able to compare them.

Note that I skipped over addressing point 2, because this is where the Outskirters most seriously diverge from the core of the Fremen Mirage trope. In a sense, the Outskirters do put significant moral weight on gender roles--but, specifically, in not having many of them. In almost all ways, men and women are treated completely equally in Outskirter society. They are also portrayed as quite sexually liberal, and while the particular customs around sexual encounters are new and weird for Rowan, the liberality is seemingly unremarkable. Given the harshness of their environment, which might reasonably be predicted to force significant division of labor along gendered lines, I can only see this as an attempt by the author to deliberately subvert the trope of sexual morality being connected to sexual conservatism, instead showing the Outskirters as moral because of their sexual liberation, in line with more progressive contemporary views on sexuality.

Stay tuned for my thoughts on the next entry in this series; and in the meantime, if you liked this post, please consider making a small donation!

Saturday, December 11, 2021

Conlanging with vec2word

Word2Vec is a family of machine-learning based NLP (Natural Language Processing) algorithms which encode the semantics of words as vectors on a unit hypersphere--that is, the meaning of each word is encoded as a big list of numbers (a vector) whose sum-of-squares is 1. The smaller the angle between any two vectors, the more similar their associated words are in meaning. These kinds of models let you do some kinda neat stuff, like evaluating the semantic similarity between documents (which is useful for fuzzy searches), and doing arithmetic on vectors to complete analogies (e.g., "king - man + woman = queen").
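
To make the "angle between vectors" idea concrete, here is a minimal sketch in plain Python. The 3-dimensional vectors are made-up toy values (a real model has hundreds of dimensions and learned values), and the helper names `normalize`, `cosine`, and `analogy` are just illustrative, not anything from a particular library.

```python
import math

def normalize(v):
    """Scale a vector to unit length, so that its sum of squares is 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine of the angle between two unit vectors (just their dot product)."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical toy vectors, invented for illustration only.
toy = {
    "king":  normalize([0.9, 0.8, 0.1]),
    "queen": normalize([0.9, 0.1, 0.8]),
    "man":   normalize([0.1, 0.9, 0.1]),
    "woman": normalize([0.1, 0.1, 0.9]),
    "child": normalize([0.5, 0.5, 0.5]),
}

def analogy(a, b, c, vectors):
    """Return the word whose vector is closest to a - b + c, skipping the inputs."""
    target = normalize(
        [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    )
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("king", "man", "woman", toy))
```

A larger cosine means a smaller angle, so `max` over cosines picks the semantically nearest word.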

It occurred to me (thanks to a Zoom discussion about a taxonomic philosophical language based on Semitic-style triliteral roots) that one could automatically generate vocabulary with taxonomic structure by starting with a word vector model, sorting the vectors along each dimension, and then mapping each vector entry to a phoneme based on its position along that dimension. Going not from source language words 2 vectors... but from vectors 2 conlang words.
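
As a rough sketch of that mapping (not necessarily how vec2word itself does it), the following sorts words along each dimension and splits the ranking into equal-sized bins, one bin per phoneme. The function name `vectors_to_forms`, the toy vectors, and the consonant inventories are all assumptions made up for the example.

```python
def vectors_to_forms(vectors, inventories):
    """Map each vector to a string with one phoneme per dimension.

    vectors:     dict of source word -> list of floats (all the same length)
    inventories: one list of phonemes per dimension, in the order they
                 should be spread along that dimension
    """
    words = list(vectors)
    forms = {w: [] for w in words}
    for dim, phonemes in enumerate(inventories):
        # Sort the source words by their coordinate along this dimension.
        ranked = sorted(words, key=lambda w: vectors[w][dim])
        for rank, w in enumerate(ranked):
            # Split the ranking into equal-sized bins, one per phoneme.
            bin_index = rank * len(phonemes) // len(ranked)
            forms[w].append(phonemes[bin_index])
    return {w: "".join(f) for w, f in forms.items()}

# Hypothetical 3-dimensional vectors and a tiny triliteral-root inventory.
toy_vectors = {
    "dog":   [0.1, 0.9, 0.5],
    "cat":   [0.2, 0.1, 0.4],
    "river": [0.8, 0.2, 0.9],
    "lake":  [0.9, 0.8, 0.1],
}
toy_inventories = [["k", "m"], ["t", "s"], ["b", "l"]]

roots = vectors_to_forms(toy_vectors, toy_inventories)
print(roots)  # {'dog': 'ksl', 'cat': 'ktb', 'river': 'mtl', 'lake': 'msb'}
```

In this toy case, the two animals share a first consonant and the two bodies of water share another, which is the kind of taxonomic structure the idea is after.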

(Side note: in theory, it would make more sense to convert semantic vectors into polyspherical coordinates and factor out the redundant radius dimension first... but in practice, sorting by all-but-one Euclidean coordinate gets you the same groupings, just in a possibly-different order [e.g., think about expressing your latitude and longitude in degrees, vs. miles north or south of the Earth's core and miles under Null Island, respectively; longitude coordinates end up sorting differently, and the scale is not linear, but coordinates that are close in one scheme are still close in the other], and is way more computationally efficient--not because there's anything special about Euclidean coordinates, but because pretty much every model ever already comes in that format, and the fastest conversion operation is the one you never do.)

Of course, decent models tend to have between 100 and 300 dimensions, which would make for really long words... but, we can do a neat thing called Principal Component Analysis (PCA) to figure out which vector components are the most important (i.e., in this case, carry the greatest semantic load), which means we just choose however long we want our words to be, extract that many principal components, and then proceed as before.
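
For the curious, here is a dependency-free sketch of what PCA is doing, using power iteration to find the directions of greatest variance one at a time. In practice you would almost certainly reach for a numerical library instead; the function name `pca`, its parameters, and the toy data are assumptions for illustration only.

```python
import math
import random

def pca(rows, n_components, iterations=200, seed=0):
    """Project rows onto their top principal components via power iteration."""
    rng = random.Random(seed)
    dims = len(rows[0])
    # Center the data so that components describe variance around the mean.
    means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
    centered = [[r[d] - means[d] for d in range(dims)] for r in rows]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(dims)]
        for _ in range(iterations):
            # Multiply v by the (unscaled) covariance matrix X^T X.
            scores = [sum(x * y for x, y in zip(r, v)) for r in centered]
            v = [sum(s * r[d] for s, r in zip(scores, centered)) for d in range(dims)]
            # Remove any part of v that lies along components already found.
            for c in components:
                dot = sum(x * y for x, y in zip(v, c))
                v = [x - dot * y for x, y in zip(v, c)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        components.append(v)
    projected = [
        [sum(x * y for x, y in zip(r, c)) for c in components] for r in centered
    ]
    return projected, components

# Toy data that varies mostly along the diagonal of the first two dimensions.
toy_rows = [
    [1.0, 1.0, 0.0],
    [2.0, 2.0, 0.1],
    [3.0, 3.0, -0.1],
    [4.0, 4.0, 0.1],
    [5.0, 5.5, 0.0],
]
reduced, axes = pca(toy_rows, n_components=2)
```

Each row of `reduced` is now a 2-entry vector, so a word generated from it would be two phonemes long instead of three.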

Additionally, if we don't allow an arbitrarily large number of phonemes to scatter along every dimension (which we don't, because languages have finite phonemic inventories, and rules about what phonemes can appear in what contexts), we won't necessarily be able to give every vector in the model a unique form. Thus, we will have to group semantically-similar source words together in the output. This is actually a good thing, because we don't want to just create an algorithmic relex of the source language (uh... unless you actually do, in which case, go for it, I guess!); rather, the list of source words associated with any given output word can serve as exemplars of a more general semantic field from which the precise definition of your new conlang word can be picked. It's not completely-automated, definitions-included word generation, but it is a lot easier than coming up with new words and definitions completely from scratch! And, even if you aren't intending to create a proper taxonomic language, this approach to producing semantic prompts for specific word forms can help produce a conlang lexicon which naturally contains discoverable sound symbolism patterns, without obvious taxonomic morphology.

Having realized that we will need to do some clustering, there are a few ways we could go about that. We could try just dividing the semantic space into rectangular regions, by dividing up each dimension individually, either at regular intervals or adjusted to try to get an equal distribution of source words in each cluster (and of course we can do that in Euclidean or polyspherical space)... but natural boundaries in semantic space aren't necessarily rectangular, and forcing rectangular clusters can end up putting weirdly different source words together in the same bucket. If you want that, to give you more flexibility in choosing which way to go with a definition, cool! But, there's another option: K-Means Clustering, which takes a set of points (i.e., vectors) and a number of clusters to group them into, and tries to find the best grouping into that number of clusters, dividing the space into Voronoi regions around the cluster centers. The number of clusters can be determined from the number of possible forms that are available, and then forms can be assigned just based on the locations of cluster centers.
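
Here is a bare-bones sketch of K-Means (Lloyd's algorithm) in plain Python, just to show the assign-then-update loop. It is not the code vec2word uses; the function name `kmeans`, the random initialization, and the toy points are assumptions for illustration.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Group points into k clusters; returns (labels, centers)."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center,
        # which carves the space into Voronoi regions.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])),
            )
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its old center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Two obvious toy clumps of 2-dimensional points.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
toy_labels, toy_centers = kmeans(toy_points, k=2)
```

With `k` set to the number of available word forms, each cluster center can then be handed a form based on where it sits in the space.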

(Another side note: it turns out that K-Means clustering of points on a sphere in Euclidean space produces exactly the same results as clustering points in polyspherical coordinate space--so once again, no coordinate transformation is required!)

If you choose to do K-Means clustering, there is then also the choice to perform clustering before or after doing dimensionality-reduction with PCA. Even though dimensionality reduction with PCA keeps the most important information around, when you are going from 300 dimensions down to, say, 3 (for a triliteral root), or even 10 (for a set of pretty darn long words), the stuff that gets thrown out can still be pretty important, so PCA will smush a bunch of stuff together which genuinely does have some kind of objective semantic relation... but whose relations you might be hard-pressed to actually figure out from the list of exemplars! Again, that could be a plus or a minus, depending on what you are going for (and clustering post-PCA will be more computationally efficient), but if you want the most semantically-coherent categories, clustering should be done prior to PCA.

That is the process that I have currently implemented in the vec2word Python program, hosted in the conlang-software-dev GitHub organization. And for this Lexember, I have been using prompts generated from a filtered (numbers and proper nouns removed) version of a word2vec model generated from a 2017 Wikipedia dump to create vocabulary for 2 new conlangs. This process might not work for everybody, but it's been my most consistent and productive Lexember to date! The exact process I have developed around this tool is not quite what I thought it would be when I first conceived of it, and is in fact different for each language, so I might write up some more on that later. But for now, the software is there for other people to try out, and I wanna get some more experience through the end of Lexember to solidify my process thoughts before putting them out here for the world.

Additional Thought: Useful semantic relations could be automatically extracted by taking the differences of vectors (e.g., "woman - man = queen - king") and looking to see how often applying that difference to some other vector yields another known word. Such vectors are not guaranteed to correspond to any existing regular morphological process in the source language used to build the model, but that's just perfect for providing inspiration for new stuff that you could do in a conlang!
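
A naive sketch of what that search might look like, assuming a similarity cutoff decides what counts as "yielding another known word". The function names, the 0.95 threshold, and the (unnormalized, made-up) toy vectors are all hypothetical.

```python
import math

def nearest(target, vectors, exclude=()):
    """Return (word, cosine) for the known vector most similar to target."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    best = max(
        (w for w in vectors if w not in exclude),
        key=lambda w: cos(vectors[w], target),
    )
    return best, cos(vectors[best], target)

def relation_hits(a, b, vectors, threshold=0.95):
    """Apply the offset b - a to every other word; keep results near a known word."""
    offset = [y - x for x, y in zip(vectors[a], vectors[b])]
    hits = []
    for w, v in vectors.items():
        if w in (a, b):
            continue
        shifted = [x + d for x, d in zip(v, offset)]
        match, score = nearest(shifted, vectors, exclude=(w,))
        if score >= threshold:  # threshold is an arbitrary, assumed cutoff
            hits.append((w, match))
    return hits

# Hypothetical toy vectors, left unnormalized to keep the numbers readable.
toy_model = {
    "man":   [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king":  [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "tree":  [0.0, 0.2, -1.0],
}
print(relation_hits("man", "woman", toy_model))
```

The more pairs a given offset produces, the more likely it is to be a relation worth building morphology around. Note that this checks every word against every other word for each candidate offset, which is where the computational expense comes from.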

I have not implemented this yet, partly because it would be extremely computationally expensive, but I am very tempted to.

And, as always, just in case you feel like giving me money: you can do that right here.

Tuesday, December 7, 2021

The Transgalactic Guide to Solar System M-17

I have been meaning to review the next three books in the Steerswoman series for months now, and continually failing to actually do so. And when I started reading The Transgalactic Guide, I had no idea that it would present me with any reason to review it here. But then it went and had alien language content (and what was I supposed to do? Not blog about that!?), so here we go...

The Transgalactic Guide to Solar System M-17 (beware the Amazon Affiliate link!) by Jeff Rovin is a satirical science-fiction travel guide, supposedly published by the Transgalactic touring company to describe the tourist attractions and accommodations available in the eponymous solar system M-17 (but actually published by the Perigee imprint of the Putnam Publishing Group in 1981). As science fiction, it is decidedly archaic; the author doesn't really give a crap about physical, chemical, or ecological plausibility, such that aside from the genre trappings of spaceships and alien planets, it's really more a work of fantasy--think C.S. Lewis's Space trilogy with less plot, less allegory, and more description of the bad science. Rovin makes repeated use of the "make them alien by not giving them eyes" trope (which Wayne Barlowe has applied to much greater effect), which is sometimes justified and sometimes... not so much. Nevertheless, for the modern author it may provide a decent source of inspiration for weird and interesting environments and creatures, if you are willing to do the work to clean them up a bit for modern audiences.

But the reason I am bothering to review it here is that each of the 5 worlds of M-17 has at least one, and sometimes several, native alien languages which are represented in the text, along with brief tourist glossaries. As conlangs go, they are also... not great, although there are some neat ideas. There is some decent effort put into the Alladis logography (which is supposedly tactile in nature), and into the basic idea of an Oleran scent-based language (a concept which is developed in more detail in the Semiosis duology).

Excerpts of alien languages are frequently integrated into the text to refer to alien concepts or proper nouns. For the most part, a very straightforward translation strategy, using appositives or parenthesized translations, is employed--and that's really all you would expect from something presenting itself in the style of a travel guide! However, I found it notable that chapter 3, on the planet Morana, actually attempts to Teach The Reader, employing italicized alien words untranslated to refer to objects and locations after they are first introduced; the attempt is not particularly skillful, but it is there!

If you liked this post, please consider making a small donation.
