Saturday, May 22, 2021

Learning Portuguese in _This Darkness Light_

So, I guess this is a Book Blog now?

This Darkness Light is a 2014... horror? Thriller?... novel by Michaelbrent Collings. One of the main characters is Portuguese/English bilingual nurse Serafina Cruz. In contrast to Kill the Beast, which I discussed last time, the multilingualism in this book exists both at the level of the text and intrafictionally.

Although she is perfectly comfortable operating in an Anglophone environment, Serafina's point of view exposes us to memories in Portuguese, and she naturally prefers her native Portuguese over English in stressful situations. Our first exposure to Portuguese is in the following lines:
ʺOnde está a minha filha?ʺ
Serafina stopped walking. Motionless as she had been that day as words spoke through time.
ʺOnde está a minha filha?ʺ
Where is my daughter?
At the lowest level, this is using simple narrative translation (i.e., "the subtitle method")--you show the second language, then you show the explicit translation. This is a slight variation on the basic pattern, as the first instance of second-language text is separated from the translation; i.e., you aren't meant to understand or focus on the literal meaning immediately (or if you do, it's a small-scale Easter Egg). Rather, the first repetition of the Portuguese text serves solely to remove the reader from their familiar context and show Serafina's Portuguese background. That characterization is more important than the literal meaning of the text. This first exposure to Portuguese also begins a larger-scale pattern of Teaching the Reader. This same sentence--ʺOnde está a minha filha?ʺ--is repeated on the very next page without translation. At that short remove, the reader can be expected to remember what it means, and additional exposure, exercising the reader's memory, helps to cement that in their mind.

We encounter the sentence again several chapters later. And a few chapters after that, a second instance of narrative translation:
ʺOnde está a minha filha?ʺ
Where is my daughter?
Later on, we have an instance of Making It Obvious:
(ʺOnde está a minha filha?ʺ)
ʺSheʹs here, Mom,ʺ said Serafina. ʺYour daughterʹs here. Not running anymore.ʺ
And another unmarked usage:
Serafina had never felt like this in a church. Her mother–
(ʺOnde está a minha filha?ʺ)
–had taken her every week, often two or three times a week, until she ran away.
The context of these usages helps to establish this particular sentence in the reader's mind as a motif for Serafina's memory of her mother. Throughout the book, Collings has thus used a variety of comprehension tactics (narrative translation and Making It Obvious) along with unmarked usages of the target phrase to enact a very simple program of spaced repetition--present the information, force recall of the information, present the information again, etc.--to teach this phrase to the reader, as well as teaching them the emotional context with which it is associated. And that allows Collings to execute this scene:
John touched her cheek. ʺOnde está a minha filha?ʺ he said. Isaiah didnʹt understand the words, but he felt Serafina start trembling. More so when John continued, ʺEla espera por você ainda.ʺ Then she was weeping as he finished, ʺYou have always been a good daughter. She loves you.ʺ
The character Isaiah doesn't understand the words, but you do, and you know what they represent for Serafina beyond their literal textual meaning. (Note that we also have an instance here of Make It An Easter Egg; you don't need to understand ʺEla espera por você ainda.ʺ for the scene to make sense, but it's fine if you do.)

Mixed in with all that, we have another minor instance of Making It Obvious when Serafina thinks about her mother:
Serafina knew her mother would not approve of such language. Even mentally.
Sorry, mamãe.
If the similarity in form weren't enough, the context makes the meaning of "mamãe" abundantly clear, and continues to establish the association of Portuguese with Serafina's memories of her family.

Interleaved with the spaced repetition of the ʺOnde está a minha filha?ʺ theme, Collings introduces another bit of thematic Portuguese with a variation on explicit narrative translation:
Pai Nosso, que estás no céu….
The Lordʹs Prayer was so beautiful in Portuguese. She looked down at Hershel. He always wore a white lab coat. So proud of that. He was a doctor, and he never let anyone forget it. The lab coat was no longer white.
Santificado seja o Teu Nome….
Then over him. She landed in a puddle of blood. Almost slipped and wished madly that she had some FiveFinger shoes–good traction!
Venha o Teu Reino….
She was past him. Room 752 was at her right. Room 753 coming up on her left. John Doeʹs room.
Seja feita a Tua Vontade….
That was where the prayer stopped in her mind. Thy Will be done.
In the first two lines of this passage, Collings could have simply employed the subtitle method to directly translate the opening of the Lord's Prayer--but in this case, the literal meaning of the text is of almost no importance. What matters is that the reader identify the cultural significance of the text, regardless of whether or not they remember the actual English words--and for that, implicitly telling them that this is part of the Lord's Prayer does the job. At the end of the passage though, the literal meaning of the last line is significant. The prayer transitions from a simple mantra to an expression of the character's intentions. Thus, rather than relying on the reader to be keeping track of where we are in the text by the time we get to that point (which you probably haven't been doing even if you do have the English version memorized, since it was implicitly established as unimportant at the beginning), Collings employs explicit translation to transition the text from mantra to plan in the reader's mind as well as the character's.

Later on, we see another identification of the opening of the Lord's Prayer (along with explicit narrative translation of a one-off variant):
Pai Nosso, que estás no céu….
The words of the Lordʹs Prayer, begun in the Portuguese of Serafinaʹs mother and her motherʹs mother, the prayer that had always comforted her, failed this time to give her comfort or any hope. Instead of hearing the words continue as they should, she heard others.
Pai Nosso, onde você está?
Our Father, where are you?
Now, this theme is not as important as ʺOnde está a minha filha?ʺ, so Collings doesn't spend as much time on it. But given these two labelled repetitions, the reader can be expected to recognize the theme when it shows up for a third time:
She just kept on whispering, over and over.
ʺPai Nosso, que estás no céu….ʺ
[...]
ʺ…santificado seja o Teu Nome, Venha o Teu Reino ….ʺ
ʺSHUT HER UP!ʺ
And in this case, if you have forgotten the identity of the text, that's OK; unlike with the ʺOnde está a minha filha?ʺ theme, all that is necessary for the intended effect here is that you can recognize this bit of Portuguese as something Serafina has memorized, so the spaced repetition pattern does not need to be as well established throughout the book.

This Darkness Light is available in Kindle or Paperback formats; and as usual note that as an Amazon Associate I will get a cut of purchases made through links in this post.

If you liked this post, please consider making a small donation.

The Linguistically Interesting Media Index

Friday, May 21, 2021

The use of French in _Kill the Beast_

Kill the Beast is a 2017 novella by author & illustrator Graham Bradley. It is a new, gaslamp-fantasy take on the "Beauty and the Beast" genre of fairytales, in which the Beauty is a ditz, the Beauty's father is genuinely insane, the Beast really is a dangerous monster, and the self-absorbed town heartthrob... is right. And grows a bit by the end.

Since the story is set in a fantastical version of 19th-century France, the characters are naturally assumed to be "actually" speaking French. But since the book is written for a modern English-speaking audience, the language of narration is, of course, English. Thus, there is an assumed translation convention--the dialogue is written, mostly, in English, because that is what the readers will understand, but we also understand that that English text is just a translated representation of fictionally-underlying French.

And to make the fictional reality of the setting clear, Bradley periodically allows the underlying French to shine through.

The instances in which this is done in Kill the Beast can be broadly categorized into three approaches:

1. Make It Obvious

"Tell us your name, dearie,"
"Je m'appelle Danielle"

A new character stumbles into the tavern (where all the best adventures start) from out in the cold. They are asked their name. They respond with... some French, and a name.

Is there any doubt about what the French dialogue might mean? No! The context makes it obvious. Sure, the author could have put something else there. The character could've been an uncooperative jerk and said something different. But if that were the case, the author wouldn't have chosen that specific place to show the French.

If you have a bit of text that you could simply blank out from the page, and the reader would not be lost because the context allows them to fill in what meaning must have been there, you have an opportunity to employ this strategy. It doesn't matter that the reader doesn't speak the language, because what it means... is obvious.

2. Make It Irrelevant

"Mon frère! It took my brother!"

What did she say first? Eh, I dunno, does it matter? If you know what it says, you know that it doesn't matter; everything you need to know is in the English.

People naturally say a lot of things that don't add much to the literal meaning of a conversation. Natural-sounding dialogue includes them. If you can delete it from the text entirely and the story will still make sense--not just redact it so the reader knows there's something to fill in, but remove all evidence of it--then you have an opportunity to employ this strategy. Most interjections fall into this category. Context usually makes the pragmatic meaning of exclamations like "sacre bleu!", "bon sang!", and so on, obvious enough; but more importantly, an understanding of their literal meaning is just not needed. Putting them in anyway, though, helps with the worldbuilding! 

3. Make it an Easter Egg

"Tais-toi, Leroux. Grab your things, we're going."

A story can be enjoyed without an Easter Egg, but noticing it makes the story just a bit better. The ideal deployment of an Easter Egg leaves a reader who doesn't know the language thinking that it was just another case of "Make It Irrelevant", but reveals something interesting-but-non-critical to the reader who does understand. In this case, the French looks like just another random interjection to get somebody's attention, as we have seen many times before. In reality, though, the speaker has just told Leroux to shut up. Is that a critical distinction to understand what comes next? No. But it does tell you something about the characters' relationship if you do understand it.

In other media, John Carpenter's The Thing makes excellent use of this strategy in the first 5 minutes. If you don't speak Norwegian, that's OK; you're not meant to understand it, and thus it is not subtitled. But if you do understand Norwegian, it adds an extra level of dramatic irony over everything that follows.

Other Strategies

Il était un ami fidèle.
"'He was a loyal friend.' That's quite beautiful, actually."

A fourth strategy--explicit narrative translation--is marginally present. This is usually employed as an extra-fictional narrative device, as when movies and TV shows use subtitles to make foreign-language dialogue accessible to the audience, but here it occurs entirely in character--a considerably more difficult thing to pull off, which also causes this example to overlap with the "Make it Obvious" strategy.

Kill the Beast is available for free on Kindle Unlimited or as a paperback through Amazon.

Note that, as an Amazon associate, I earn a cut of sales made through the links in this post.

If you liked this post, please consider making a small donation.

The Linguistically Interesting Media Index

Tuesday, May 18, 2021

Alien Communication in _Semiosis_

Semiosis is a 2018 science fiction novel by Sue Burke about the human colonization of an alien world (Pax) where life is a billion years older than it is on Earth. While there are several very bright species of animals on Pax (including some who have mastered the use of fire for hunting and cooking), the principal native intelligences are plants. What need might plants have for intelligence? Well, to more effectively wage chemical warfare with other plants, and, of course, to control useful animals! While mutualistic relationships between animals and plants on Earth are the result of millions of years of co-evolution, producing pairs of bees pollinating flowers or ants sheltering in, being fed by, and defending acacia trees, such relationships on Pax are the result of much faster processes of plants noticing useful animals in their environments and intentionally producing fruits, medicines, and poisons to encourage certain beneficial behaviors by those animals and discourage others.

And humans, being intelligent ourselves, turn out to be both easier to train and much more broadly useful than most native animals.

The novel has a rather unconventional structure, with each chapter switching viewpoints to a new human character in a different generation of colonists (and alternating male and female points of view). This allows for numerous timeskips between "interesting times", which is mostly OK... except for where it skips from the initial human realization that a plant is trying to get their attention and establish linguistic communication right to a time when such communication has been entirely figured out and made routine. This is probably for the best as far as acceptance by a general audience is concerned, but I for one was seriously looking forward to a proper depiction of monolingual fieldwork with an alien plant, and I didn't get it!

In light of my previous post on speech in a weird modality, however, it is worth pointing out that linguistic communication with Stevland (the human name for the primary vegetative character) is carried out, at least initially, entirely through writing; Stevland has chromatophores that can be used to change the coloration of his stalks and write out messages. And as it turns out, the limitations of this medium end up being quite significant to the plot.

Things get more interesting, however, when the humans finally encounter another group of interstellar colonists, dubbed "Glassmakers", who had beaten them to Pax, and communicated with Stevland themselves before losing their technological base and reverting to nomadic barbarism. While Stevland is able to teach the humans the written form of the Glassmaker language (with assistance from a collection of archeological artifacts they had left behind), mastering their spoken language turns out to present some difficulties, and not merely because human vocal tracts are the wrong shape--rather, because the Glassmakers' spoken language is multi-modal--like Donald Boozer's Dritok... but using a different pair of simultaneous modes.

Specifically, the Glassmaker language uses both the acoustic channel and scent, with the Glassmakers being able to produce a large number of different volatile organic compounds at will. Now, there are some significant problems with using odor alone to encode language--notably, that it is rather difficult to keep odors separated and control the order in which they will be perceived on short time scales, making the encoding of syntactic structure close to impossible. Burke seems to be well aware of this, however, not only providing the acoustic channel as a second modality for the language parallel and complementary to scent, but also ensuring that the glosses for odoremes all convey complete useful, common utterances without need for additional syntactic support (although they may require deictic support), making them comparable to human interjections or ideophones; the full list of explicitly-identified odoremes is as follows:

ethanol - welcome / relax
methanol - come
eugenol - what do you see?
2-heptanone - alert
limonene - attack
citronellol - defend
beta-pinene - flee

The acoustic portion of the language is represented in the text through approximate English translation, with distorted grammar. Intriguingly, many-but-not-all Glassmaker sentences are presented as having extra pronouns suffixed to the main verbs, but it is not at all obvious what role these are supposed to play; when they are used and what they coreference lacks any pattern that I was able to identify (and in one case, a proper noun is compounded with the verb instead!).

If this summary interests you, you can get your own copy of Semiosis here. Trigger warning: there is a brief depiction of rape in the second chapter. Also, note that, as an Amazon affiliate, I will get a cut of qualifying purchases made through that link.

If you liked this post, please consider making a small donation.

The Linguistically Interesting Media Index

Sunday, May 16, 2021

Speech in the Electroreceptive Modality

A few weeks ago, I saw this SciShow video on electric eels that hunt in packs.

And then I went to the local aquarium where my kids spent several minutes watching the electric eel and the "shock meter" posted above its tank.

So I got to thinking about how electric field modulation might be used as a modality for a proper language. After a little googling, I found a couple of interesting articles: Electrocyte Physiology: 50 years later and Regulation and modulation of electric waveforms in gymnotiform electric fish.

Which reveal that it is possible to modulate electrocyte activity on millisecond timescales, and that (at least one family of) electric fish can produce multiple simultaneous overlapping waveforms.

From which I conclude that it should be totally possible (modulo the fact that we humans happen not to have electrocytes).

The phonological structure of a language in this modality would be considerably less constrained than our own--it could consist of arbitrary combinations of some finite number of simultaneous formants, rather than the particular types of noises that happen to be obtainable from mechanical vibrations of air in a vocal tract with a limited range of geometries. But any language using this modality would naturally be adapted for use in an underwater environment (since electroception is much less effective in air), and for communication at relatively close range--not quite like protactile communication, because you have the option for lower-fidelity communication at longer range, but kinda similar.

For reference, information in human speech is not encoded in the specific frequencies, but in the relations between frequencies, which allows speech to be frequency-shifted without changing meaning (not the case for, e.g., the tonal language of the aliens from the novel The Jupiter Theft; note that I have linked to the book for convenience, but as an Amazon Associate I earn from qualifying purchases). This is necessary because human vocal tracts come in different sizes! A species using electrostatic communication will have similar constraints, but for a different reason--speaking more "loudly", to cover a longer range, entails a reduction in the maximum available signal frequency, since it takes time to build charge in electrocytes, and a larger charge producing a stronger electric field takes more time to build up. Additionally, electric field waveforms are influenced by hormonal and neuro-structural changes in real electric fishes, so the use of specific frequency bands may hold identifying information, just as human voices do.

So, suppose that our intelligent electric fish can generate up to three independent waveforms simultaneously, with differing maximum frequency components, because that seems biologically plausible based on the "Electrocyte physiology" paper. Since real electric fish can modulate electrocyte activity at scales of a couple of milliseconds, we'll put the top of the speech frequency range around 500Hz; that's considerably lower than the top of the standard human speech range used for telephony purposes, but well within the ranges at which human speech exists, and well above the minimum frequencies that humans can hear--so even though it is surely plausible to encode language in infrasound, we can maintain even more solid plausibility by containing at least the near-range frequency bands entirely within the equivalent of human hearing range.

We can use the lowest-frequency signaling component as an analog to the "voicing bar" in human acoustic phonology. Whatever the fundamental frequency is, it will set the baseline for interpreting all of the higher frequency components. It doesn't need to be expressed all the time (so there can be voiceless segments), but it does need to be expressed frequently, so that listeners don't lose track of the frequency base.

"Voicing" all by itself, however, does not carry linguistic information--it could just be the sound of someone engaged in active sensing. So for any given independent segment, we need at least one additional "formant" (or "just voicing" could be the equivalent of a schwa vowel or something, but I am ignoring that possibility for now)--although there could be "dependent" segments, or sub-segmental features, which involve just the higher formants, or just a single formant. Two-component signals could correspond to vowels, or perhaps more broadly to sonorants--they are unlikely to be "unvoiced" because that leaves you with only a single frequency component which is difficult to interpret as a segment in isolation. Adding in the third "formant" gives you literal con-sonants--sonorants with an additional frequency component added. Just like human acoustic phonology, some segments could be defined by motion of formants... but I am not sure how precise higher-order modulation of electrocyte activity like that might be, so I am more comfortable as a first pass just saying that each significant segment is defined by a fixed pattern of frequency relations, with movement between them being entirely incidental.

So, what can we conclude about this communication medium if we just look at what we know about production, and some generic information-theoretical constraints?

So, suppose that for near-range communication, we use 125Hz as the average base frequency (with some variation between individuals), with 500Hz as the absolute top, soprano-singer level of the available range. For maximum volume, that might scale down to 20Hz base (the lower limit of human hearing) with an 80Hz top. That gives us two full octaves of range in which to place the additional formants--analogous to a typical human vocal range. If we restrict segmental patterns to falling within a single octave, that would allow plenty of room for linguistically-significant tone, if you wanted it, where the whole frequency structure is shifted up or down without altering volume/range.
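As a quick sanity check on those numbers (a throwaway sketch, using the figures stated above): both the near-range band and the maximum-volume band span exactly two octaves.

```python
import math

# Both proposed bands span two full octaves: 125->500 Hz for near-range
# speech, 20->80 Hz at maximum volume. Values are those given in the text.
for base, top in [(125, 500), (20, 80)]:
    octaves = math.log2(top / base)
    print(f"{base} Hz to {top} Hz: {octaves} octaves")
```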

Unlike human articulatory phonology, there is no obvious reason why there should be limits on the articulation of sequences of consonants vs. sonorants, but the need to bring in the "voicing bar" periodically to establish the frequency baseline means that it makes sense to me to define syllable units that always begin with voicing, and may or may not end with voicing. Segmentation can be further improved (especially if many syllables are voiced throughout) if every syllable has a consonant onset, and a sonorant rhyme--analogous to a human CV syllable structure. The equivalent of unvoiced onsets would be partially-devoiced, leaving a "voiceless vowel" rhyme; additionally, one could have voiced onset / voiceless rhyme, and full voiced syllables, but in any case you get a regular pattern of 3 components, reducing to 2 components (either dropping the voicing bar or the consonant bar), optionally reducing to a single component (voiceless vowel), before reintroducing all three components for the next syllable.

Presuming that you need at least 2 full cycles of the base frequency to identify said frequency, that implies that light syllables could be spoken at a rate of 10Hz, and heavy syllables at a rate of 6Hz, at maximum volume. A typical English speech rate is 3 to 6 syllables per second (and of course people slow down when speaking loudly), so that should be fine!

If total charge is directly proportional to cycle time, that provides a dynamic range of 6.25 times in volume between "normal close range voice" (and of course, one could "whisper" by reducing magnitude further without further increasing frequency) and maximum volume, which is a 2.5 times increase in physical range for "yelling at the top of one's voice". Not a lot, but still potentially useful for "talking to one other person" vs. "talking to your whole hunting group". And that advantage is magnified by the fact that you can pack more people within a certain radius in a 3D aquatic environment than you can in a 2D land-based environment.
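Here is the back-of-the-envelope arithmetic behind those figures. The assumptions are mine, not all spelled out above: field strength is proportional to charge, charge is proportional to cycle time (and hence to 1/frequency), and effective range scales as the square root of field strength (i.e., an inverse-square falloff).

```python
import math

normal_base = 125  # Hz, "normal close range voice" fundamental
loud_base = 20     # Hz, maximum-volume fundamental

# Charge per cycle scales with cycle time, so dropping the fundamental
# from 125 Hz to 20 Hz multiplies field strength by 125/20.
volume_ratio = normal_base / loud_base   # 6.25x field strength
range_ratio = math.sqrt(volume_ratio)    # 2.5x physical range
print(volume_ratio, range_ratio)
```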


Now, what if we imagine some species-specific biophysical phonetic constraints? I'll call the relevant creatures "Fysh" (because they are like fish, but aliens). 

Fysh communication is accomplished through the modulation of electrostatic fields produced by electrocytes and detected by electroreceptor organs. There are a total of 3 electrically-active tissue systems under active neurological control, providing the possibility of producing 3 independent simultaneous frequencies of electrostatic field modulation.

One of these systems, evolved for active sensing, is semi-autonomic, similar to human breathing; it can be actively controlled, but when not under conscious control it produces a constant low-amplitude background pattern. When multiple individuals are near each other, they will instinctively adjust their frequencies to avoid confusion, with the lower-status individual adjusting to higher frequencies.

While the exact waveform is unique to each individual, it is an approximate sinusoid. This system is always used for the lowest frequency component in linguistic communication.

The other two systems are fully voluntary, and each produces sawtooth waves evolved for hunting and stunning prey. These are mutually indistinguishable from each other, so the specific organ or tissue used to produce a higher or lower formant is irrelevant, and the precise perceived ratio of volumes between these formants may change depending on the relative orientation of the speaker and listener.

The communication channel has a perceptual limit at approximately 20Hz, below which changes in electric field strength are not intuitively perceived by most individuals as being part of a single consistent wave pattern. The upper limit is set by articulatory constraints; Fysh cannot consciously produce frequencies over 500Hz in any of the three systems.

The average fundamental frequency for linguistic communication across all individuals is approximately 125Hz. Individuals can speak arbitrarily "quietly" at any frequency should they so choose, but higher volumes inherently limit the maximum achievable frequency, since there is a minimum time required to build any given level of charge. Shifting down to a 20Hz fundamental allows Fysh to "yell" at a maximum of about 6.25 times their normal volume with some distortion, but clear communication is impossible at higher volumes.

Due to volume restrictions on frequency and individual variations in the natural fundamental, Fysh speech segments are independent of absolute frequency (just like humans') and are defined by ratios within chords. Below 100Hz, Fysh can reliably recognize frequency differences of about 2Hz (finer percentage distinctions are perceptible above 100Hz, but the lower, louder end sets the limits for linguistic usage), resulting in around 10 potentially distinct frequencies per octave. Of course, any given language will not use that many distinctions, but variations in which precise ratios are used can be indicators of different dialects. Also, while I have used octaves as a basis for human reference, just as audio octave equivalence is not a universal experience across human cultures, any particular Fysh culture may or may not actually recognize electrostatic octave equivalence or give it any linguistic significance.
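The "around 10 distinct frequencies per octave" figure comes from the loud, low end of the range, where the 2Hz resolution limit bites hardest (a quick sketch of that arithmetic, using the values above):

```python
resolution_hz = 2        # minimum reliably perceptible frequency difference
octave_low = 20          # Hz, bottom of the loudest (lowest) octave
octave_high = 2 * octave_low

# 2 Hz steps across the 20-40 Hz octave give ~10 distinguishable frequencies;
# higher, quieter octaves are wider in Hz, so they are less constraining.
steps = (octave_high - octave_low) / resolution_hz
print(steps)
```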

Between segments, it is possible for a Fysh to initiate multiple formants quickly enough to be perceptually simultaneous. Ceasing articulation, however, can only be done one formant at a time, and instantaneous transitions between discrete frequencies are not possible.

Linguistically-significant segmental features always extend across at least two full cycles of the fundamental frequency, giving a maximum speech rate of 10 minimum-length segments per second at the bottom of the frequency range. Faster speech is possible at higher frequencies, but between 6 and 10 segments per second is a typical speed range for average speech frequencies.
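The two-cycle rule makes the maximum segment rate a simple function of the fundamental: each minimum-length segment lasts 2/f seconds, so at most f/2 segments fit in a second. Note this is an articulatory upper bound, not a typical speaking rate (at a 125Hz fundamental the bound is far above the 6-10 segments per second described above).

```python
# Maximum segment rate implied by the two-cycle minimum segment length.
for fundamental_hz in (20, 125):  # maximum-volume vs. average fundamentals
    max_segments_per_second = fundamental_hz / 2
    print(fundamental_hz, max_segments_per_second)
```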

Independent phones may consist of chords of any combination of 2 or 3 formants, such that they can be identified by the frequency interval between the formants. The one exception is the "schwa" phone, consisting of the base frequency by itself, which needs no second reference frequency because it has a unique distinguishable waveform.

Dependent phones may consist of one or two non-fundamental formants. These only occur as parts of larger utterances which contain a fundamental formant for reference at some point.


With that foundation more precisely specified, we can now consider the phonology and romanization of one specific language, which I shall identify as Fysh A. 

Fysh A features frequent "devoicing", where the fundamental formant is suppressed. Segments are organized into syllables based on a consistent 3-part chord. The possible "notes" of these chords are:

  1. An arbitrary fundamental frequency, roughly analogous to the human voicing feature.
  2. An "a" note, in a frequency band centered on 4/3 times the fundamental.
  3. An "o" note, in a frequency band centered on 3/2 times the fundamental.
  4. An "u" note, in a frequency band centered on 5/3 times the fundamental.
  5. An "i" note, in a frequency band centered on double the fundamental.

Note that the octave span of these frequency bands is essentially coincidental (i.e., I liked it); other Fysh languages may not have a similar structure. They may have more or fewer vowel frequencies, or they may allow a vowel frequency to overlap with the fundamental, being distinguished by waveform.

Syllables may begin fully voiced, or (exclusively at the beginning of a word) have a delayed-fundamental feature which offsets the initiation of the fundamental formant. All syllables then drop at least one formant, which may be any of the three.

Complex syllables will drop a second formant; the remaining formant cannot be the fundamental. Complex syllables are required word-finally to avoid simultaneous cessation of multiple formants (avoiding simultaneous cessation is a universal feature of Fysh languages, although complex syllables may not be phonemic in all of them).

Sequential syllables within a word must have at least one matching non-fundamental formant at their boundary. Where a phonemically-simple syllable occurs before another syllable which does not have two matching formants, there is a sub-segmental period in which the non-matching formant is dropped before the new syllable begins.

Between words, a final formant may transition smoothly to a neighboring formant in the following onset, or a sub-segmental length pause may be automatically inserted.

This results in a total of 8 syllable types:

1. Simple Devoiced: syllables which drop the fundamental and are not word-final.
2. Simple Voiced: syllables which drop a higher formant and are not word-final.

These first two types set the basic unit of syllable length.

3. Complex Devoiced: syllables which drop the fundamental followed by a higher formant.
4. Complex Voiced: syllables which drop a higher formant followed by the fundamental.

Complex syllables are the same length as simple syllables, but divide the length evenly among the three parts rather than two.

5 - 8: Voice-delayed. These syllables are one-third longer than non-delayed syllables due to time devoted to the initial unvoiced section. They can only occur word-initially.

Additionally, any syllable can be phonemically lengthened, which extends the time spent on the voiced core.

Epenthetic single-formant subsegments are one-sixth the length of a non-delayed syllable.

Fysh A also features lexical stress, realized as an increase in field amplitude by a factor of about 1.2 compared with immediately adjacent syllables in the same word, with one stressed syllable per word. Stress may also be associated with a proportional drop in frequency and corresponding increase in syllable length, but this is non-contrastive.

Romanized symbols can be used to represent the features of Fysh A syllables in a way that allows them to be mapped onto human pronunciations. As indicated above, single high-formant bands are represented by 4 vowel letters, by analogy with the rhymes of human CV syllables.

Each possible core chord, of which there are 6, is represented by a consonant letter, again by analogy with human CV syllables.

A straightforward mapping of Fysh A segments to Roman letters might use unvoiced letters for 2-part chords and voiced letters for 3-part chords. However, the romanization is shorter and more easily pronounceable if we instead use sonorant letters for the onsets of syllables which drop the fundamental first (because these can stand as entire syllables on their own), and obstruent letters for the onsets of syllables which drop the fundamental second--even though this strategy does not intuitively represent the detailed internal structure of a Fysh A syllable.

Each syllable type is thus romanized as follows:

  1. Simple Devoiced: A single sonorant letter.
  2. Simple Voiced: A sonorant letter followed by a vowel letter.
  3. Complex Devoiced: A voiceless obstruent letter followed by a vowel letter.
  4. Complex Voiced: A voiced obstruent letter followed by a vowel letter.

Voice delay is indicated by a leading <h>, and <e> indicates phonemic gemination of the voiced core of a syllable.

Lexical stress is indicated by an apostrophe at the beginning of the syllable, unless stress is word-initial.

The conventional selections for voiced obstruent, unvoiced obstruent, and sonorant romanizations, paired with their component vowels, are as follows:

  • z, s, l -> ao
  • x, c, r -> au
  • b, p, m -> ai
  • g, k, w -> ou
  • d, t, n -> oi
  • v, f, y -> ui

Suggested conventional pronunciations are as follows:

  • z /z/; s /s/; l /l/
  • x /ʒ/; c /ʃ/; r /ɹ/
  • b /b/; p /p/; m /m/
  • g /g/; k /k/; w /w/, /u/
  • d /d/; t /t/; n /n/
  • v /v/; f /f/; y /j/, /ɨ/
  • h /hə̥/
  • a /a/
  • o /o/
  • u /ʊ/
  • i /i/
  • e /e/, or actual gemination

And that's enough information to write a few computer programs that will generate the complete list of possible syllables, valid sequences of syllables, and words of arbitrary length; and to read the romanization and synthesize an audio representation of the actual electric field patterns that it describes....
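As a sketch of the first of those programs, here is a minimal Python enumeration of the non-delayed basic syllables in romanized form. The table of onset letters and component vowels is taken directly from the lists above; the assumption that each syllable's vowel letter marks the surviving high formant (giving two vowel choices per chord) is my reading of the romanization rules, not something spelled out explicitly:

```python
# Conventional romanization: each 3-part chord's (voiced obstruent,
# unvoiced obstruent, sonorant) letters, paired with the two vowel
# letters for its component high formants.
CHORDS = {
    ("z", "s", "l"): "ao",
    ("x", "c", "r"): "au",
    ("b", "p", "m"): "ai",
    ("g", "k", "w"): "ou",
    ("d", "t", "n"): "oi",
    ("v", "f", "y"): "ui",
}

def basic_syllables():
    """Enumerate the romanized non-delayed, non-lengthened syllables."""
    out = []
    for (voiced, unvoiced, sonorant), vowels in CHORDS.items():
        # 1. Simple Devoiced: a bare sonorant letter.
        out.append(sonorant)
        for v in vowels:  # v marks the surviving high formant
            out.append(sonorant + v)   # 2. Simple Voiced
            out.append(unvoiced + v)   # 3. Complex Devoiced
            out.append(voiced + v)     # 4. Complex Voiced
    return out

syls = basic_syllables()
# 6 chords x (1 + 3 types x 2 vowels) = 42 distinct basic syllables;
# prefixing <h> (voice delay, word-initially) or inserting <e>
# (lengthening) multiplies these further.
```

Under that reading there are 42 basic syllables, which the vocabulary below (e.g. <lo>, <za>, <mi>) draws from.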


So, I am ready to move on to actual words and grammar!

Now, the interesting part of this is simply being in a weird modality, so I don't intend to put in too much effort on a super intricate grammar--just whatever is needed to produce an introduction to the Conlangery Podcast! But being in this modality, and in the sort of environment that permits this modality (i.e., aquatic) will have some influence on the lexicon, and perhaps on the grammar as well. For example, because electric communication is inherently limited in range, the idea of a speech or address or broadcast to a large audience would be entirely absent--large-scale communication would have to rely on multiple steps of person-to-person repetition. So perhaps "podcast episode" ends up being translated with a word meaning something like "an extended-length memorized text for repetition to many people", with "podcast" somehow derived from that.

Without getting too much into other details of the speakers' anatomy, they must obviously have an electric sense, which suggests there should be basic vocabulary for electrical properties of materials--e.g., high vs. low relative permittivity, and high vs. low voltage.  And we should expect deictics and other spatial terminology addressing a full 3D environment (although with the vertical dimension still distinguished from the horizontal, as it is the axis of both gravity and pressure change)...

But none of that is necessary right now to introduce a podcast!

OK, so the Conlangery intro is: "Welcome to Conlangery, the podcast about constructed languages and the people who create them."

Let's suppose there is in fact a cultural tradition of extended memorized texts that can be passed around; those might be called something like "utterance memories". I don't want a word for "word", because words aren't real, and these are aliens, so why should I impose my Anglophone human ideas of linguistic analysis on their vocabulary? But "utterances" are real, and can be of arbitrary length, so there you go!

So I'm thinking the beginning will end up as something like "Welcome! This is an utterance-memory from language-art". For "podcast", I'm thinking I can go as far as assuming that Fysh will have some kind of mythological cycles within which individual stories might be named and extracted, and that could be generalized to refer to other compendia of knowledge. For "language", I'm thinking "utterance-way". "Constructed" and "people" and "create" are fairly basic vocabulary, so the end result is something like this:

"Welcome! Hear an utterance-memory from utterance-way-art, which is a myth_cycle about created utterance-ways and the people who create them."

That requires the following vocabulary:

  • Welcome - <yei>
  • Hear - <lo>
  • utterance - <re'go>
  • memory - <'pama>
  • way / method - <'weza>
  • art - <'tifu>
  • myth_cycle - <mi'feu>
  • create - <'deoza>
  • person - <'yino>

And enough grammar to combine them in the appropriate ways.

Sticking that vocab into some kind of grammatical framework (head-initial, heavily isolating so I don't have to think too hard about morphology in this system), we get:

<Yei> <lo> IMP  OBJ <'pama>-<re'go>. BE_PART REL OBJ <'tifu>-<'weza>-<re'go>, BE_EXAMPLE REL OBJ <mi'feu>, BE_PART REL OBJ "topic" <'weza>-<re'go>, <'deoza> REL SUBJ <'yino> <'yino>, OBJ "topic" <'yino> <'yino>, <'deoza> REL OBJ "it" "it".

So now I can assign phonological forms to grammatical morphs:

  • IMP - eh, let's go ahead and reduplicate that, just like I did for plurals!
  • OBJ - <za>
  • SUBJ - <gu>
  • BE_PART - <'bala>
  • BE_EXAMPLE - <ye'vi>
  • it / that - <ma>
  • REL - <rea>
  • topic - <'hreyi>

And boom, we've got a translation!

Yei 'lo lo za 'pama re'go 'bala rea za 'tifu 'weza re'go ye'vi rea za mi'feu 'bala rea za 'hreyi 'weza re'go 'deoza rea gu 'yino 'yino za 'hreyi 'yino 'yino 'deoza rea za ma ma.

"Welcome, hear 'memory of utterance' which consists in the art of way of utterance which is a myth-cycle which consists in the topic of way of utterance which people create [and] the topic of people which create them."

Which you can also hear at the beginning of this episode of Conlangery.

If you liked this post, please consider making a small donation.

Wednesday, December 23, 2020

On the superpermutations of the family of multisets [1, 1, n]

Inspection of the preliminary results for this family which we discovered last time reveals that some of the minimal-length superpermutations of the multisets [1, 1, 2] and [1, 1, 5] are palindromic.

Conveniently, a brute-force search of the space of possible palindromic superpermutations is considerably faster than a brute-force search of the entire superpermutation space. This is because it is only necessary to find a string which contains half of the full set of permutations (one representative from each mirrored pair), which can then be mirrored to produce the second half of a palindromic superpermutation. This requires checking (n/2)! orderings for each of the 2^(n/2) possible choices of representatives from the mirrored pairs of the full set of n permutations, and since the exponential function grows much more slowly than the factorial function, (n/2)! * 2^(n/2) is far smaller than n!, so this turns out to be a net win.
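As a concrete sanity check of the mirroring idea, the following Python sketch rebuilds the full 17-symbol palindromic superpermutation of [1, 1, 2] from its first half (up to and including the central symbol) and verifies both properties; the variable names are my own:

```python
from itertools import permutations

# First half of the 17-symbol [1, 1, 2] superpermutation, up to and
# including the central symbol.
half = "112311213"

# Mirror everything before the central symbol to complete the palindrome.
full = half + half[-2::-1]

# After relabelling, the multiset [1, 1, 2] is {1, 1, 2, 3}.
perms = {"".join(p) for p in permutations("1123")}

assert full == "11231121312113211"
assert full == full[::-1]               # it is a palindrome
assert all(p in full for p in perms)    # and contains all 12 permutations
```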

It turns out that all members of this family have palindromic optimal superpermutations of a common form (some have multiple palindromic superpermutations). For n=1 through 9, the relevant results are as follows:

(Note that these examples have all undergone rewriting to ensure that digits have their initial appearances in increasing order, and the palindromic suffix is marked out with brackets.)

[1,1,1]: 12312[1321]
[1,1,2]: 112311213[12113211]
[1,1,3]: 11123111213112[1131211132111]
[1,1,4]: 11112311112131112113[1121113121111321111]
[1,1,5]: 111112311111213111121131112[11131121111312111113211111]
[1,1,6]: 11111123111111213111112113111121113[1112111131121111131211111132111111]
[1,1,7]: 11111112311111112131111112113111112111311112[1111311121111131121111113121111111321111111]
[1,1,8]: 111111112311111111213111111121131111112111311111211113[11112111113111211111131121111111312111111113211111111]
[1,1,9]: 11111111123111111111213111111112113111111121113111111211113111112[1111131111211111131112111111131121111111131211111111132111111111]

Each of these starts with n ones, followed by repetitions of the following pattern:

'2' '1'{0} '3' '1'{n-0} '2' '1'{1} '3' '1'{n-1} '2' '1'{2} '3' '1'{n-2}...

I.e., alternating '2's followed by x '1's and '3's followed by n-x '1's, with x increasing by one in each chunk from 0 up to n, and the whole string closing with a final '2' followed by n more '1's.

This leads to a straightforward O(n^2) algorithm (the final length is of order n^2, so it takes at least that much time to output the entire string) for generating minimal superpermutations of any multiset in the family [1, 1, n], with O(1) complexity for generating each subsequent symbol incrementally. The lengths exactly match the optimal-length formula previously determined for this family:

k = 2n + (n+1)(n+2) + 1
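Here is a sketch of that generator in Python (function names are my own); the trailing loop re-checks the construction against the length formula, the palindrome property, and the superpermutation property:

```python
from itertools import permutations

def super_11n(n):
    """Conjectured-minimal superpermutation of the multiset [1, 1, n]:
    n ones, then chunks '2' + x ones + '3' + (n-x) ones for x = 0..n,
    then a final '2' followed by n ones."""
    chunks = ["1" * n]
    for x in range(n + 1):
        chunks.append("2" + "1" * x + "3" + "1" * (n - x))
    chunks.append("2" + "1" * n)
    return "".join(chunks)

def is_superperm(s, n):
    """Check that s contains every distinct permutation of {2, 3, 1 x n}."""
    perms = {"".join(p) for p in permutations("23" + "1" * n)}
    return all(p in s for p in perms)

for n in range(1, 7):
    s = super_11n(n)
    assert len(s) == 2 * n + (n + 1) * (n + 2) + 1   # matches the formula
    assert s == s[::-1]                              # palindromic
    assert is_superperm(s, n)                        # a true superpermutation
```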

Tuesday, December 22, 2020

An initial exploration of superpermutations of multisets

A superpermutation of a set of n distinct symbols is a sequence over those symbols which contains all n! permutations of them as substrings. A trivial superpermutation just lists all possible permutations one after another, for a total length of n*n!. However, considerably shorter superpermutations can be achieved by overlapping permutations with common prefixes and suffixes.

Without loss of generality, we can label all of the elements of a finite set with integers, and work exclusively with sets of consecutive integers. Each set, and its corresponding set of superpermutations, can then be parameterized with a single number--its size, or cardinality.

The unique superpermutation of the set of size 1 is "1".

The superpermutations of the set of size 2 are "121" and "212" (which are equivalent up to relabelling).

And minimal-length superpermutations are known for sets of up to size 5.

But what if, instead of working with proper sets, in which each symbol must occur exactly once, we decided to look at permutations of multisets instead?

Multisets allow each element to occur any number of times, known as the multiplicity of the element. The (equivalence classes of) multisets are no longer characterized by a single number; instead, we can talk about the support cardinality (the number of different kinds of elements it has), the multiplicities of each of those elements, and the multiset cardinality, which is the sum of the multiplicities. All of that information can be encoded as an ordered list of multiplicities (which, for the sake of consistency and without loss of generality, we can assert must be non-decreasing). Thus, the multiset { 1, 1, 1 } can be represented by the list [3], and the multiset { 1, 2, 2 } can be represented by the list [1, 2].

This provides several different dimensions along which to extend the size of a multiset, with different implications for their effect on the superpermutation.

Multisets with all multiplicities equal to 1 are equivalent to sets of the same cardinality, and have the same superpermutations. Due to the repeated elements, however, multisets with multiplicities greater than one have fewer distinct permutations than sets of the same cardinality, and thus have shorter minimal superpermutations. In the trivial case of multisets with support cardinality 1, they have a single permutation (a list of their single element repeated as many times as its multiplicity), which is also the unique superpermutation--just like sets of cardinality 1.

Things get more interesting for multisets of the form [1, n], with support cardinality 2. These have minimal superpermutations of the form 2{n}12{n}--i.e., the second element repeated as many times as its multiplicity on either side of the first element. E.g., 

super([1, 1]) = 212, equivalent to the superpermutation of the set of cardinality 2
super([1, 2]) = 22122
super([1, 3]) = 2221222

etc.

Each individual permutation consists of inserting the first symbol somewhere in the list of repetitions of the second symbol, and can be obtained by sliding a window of length n+1 across the superpermutation; the lengths of these superpermutations are trivially characterized by the formula L([1, n]) = 2n + 1.
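This sliding-window claim is easy to verify mechanically; a quick Python sketch (names are my own):

```python
from itertools import permutations

def super_1n(n):
    """Minimal superpermutation of [1, n]: n copies of '2', then '1',
    then n more copies of '2'."""
    return "2" * n + "1" + "2" * n

def window_set(s, w):
    """All substrings of s obtained by sliding a window of length w."""
    return {s[i:i + w] for i in range(len(s) - w + 1)}

for n in range(1, 8):
    s = super_1n(n)
    perms = {"".join(p) for p in permutations("1" + "2" * n)}
    assert len(s) == 2 * n + 1             # L([1, n]) = 2n + 1
    assert window_set(s, n + 1) == perms   # every window is a distinct permutation
```

Note that the windows and the distinct permutations coincide exactly, which is why this family meets its lower bound.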

If we move on to multisets of support cardinality 3, however, things suddenly get considerably more complicated. Adding a single additional element with multiplicity 1 increases the number of permutations by a factor of n+2 (going from 3 to 12 for n=2), as the additional element can be placed in any of the n+2 positions in each of the previous permutations. Through exhaustive search, the minimal unique (i.e., up to mirroring and relabelling) superpermutations of [1, 1, 2] are

13321331233132313
33123313231332133

with a length of 17.

A conservative lower bound on the possible lengths of minimal superpermutations can be acquired by observing that the number of permutations achievable by sliding a window along the superpermutation must equal or exceed the number of distinct permutations of the multiset. The number of windows of length n over a string of length k, where n<=k, is simply k - n + 1. The permutations of a multiset S are given by the formula C(S)!/P(m! | m in M(S)) where C is the cardinality function, P is the product operator, and M is the multiplicity function; in other words, it is the factorial of the cardinality of S divided by the product of the factorials of each of the multiplicities of S. Thus,

k >= C(S) + C(S)!/P(m! | m in M(S)) - 1

For the [1, n] multisets with n = 1, 2, and 3, this formula gives lower-bound lengths of 3, 5, and 7, respectively--so there it is in fact a tight bound. For the multiset [1, 1, 2], however, it gives a lower-bound length of 4 + 4!/(1!*1!*2!) - 1 = 4 + 24/2 - 1 = 4 + 12 - 1 = 15--while the actual minimal length, discovered by exhaustive search, was 17. And in fact, it is easy to see that each of the superpermutations above has 2 length-4 substrings which are not permutations of the multiset, accounting for the extra length. This is unsurprising, since it is already well known that the equivalent bound is not tight for superpermutations of sets, and sets are just a special case of multisets.
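The lower-bound formula is straightforward to encode directly; a sketch in Python (function names are my own):

```python
from math import factorial, prod

def num_permutations(multiplicities):
    """C(S)!/P(m! | m in M(S)): distinct permutations of a multiset,
    given as its ordered list of multiplicities."""
    c = sum(multiplicities)
    return factorial(c) // prod(factorial(m) for m in multiplicities)

def length_lower_bound(multiplicities):
    """k >= C(S) + C(S)!/P(m!) - 1, from the window-counting argument."""
    return sum(multiplicities) + num_permutations(multiplicities) - 1

# Tight for the [1, n] family...
assert [length_lower_bound([1, n]) for n in (1, 2, 3)] == [3, 5, 7]
# ...but not for [1, 1, 2], whose actual minimum is 17.
assert length_lower_bound([1, 1, 2]) == 15
```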

There are a few ways in which we could try to extend this result--increase the value of n again, to create a parameterized family of [1, 1, n] multisets; extend the number of singleton elements, to create a parameterized family of [1{n}, 2] multisets, with increasing support cardinality; or pivot entirely to see what happens with [n, n] multisets, with support cardinality 2 and multiset cardinality 2n.

It turns out that increasing the number of unique elements increases the number of permutations much faster than increasing the multiplicity of an individual element. Trying to brute force [1, 1, 1, 2] (with 60 permutations) just causes my software to run out of memory and fall over and die. That could probably be fixed, but it makes it much easier to go in different directions for now.

The Family [1, 1, n]

Doing a complete brute-force search of the superpermutation space for n > 2 takes a very long time; however, some superpermutations closely approaching the lower bound can be found relatively quickly.

The multiset [1, 1, 3] has a lower bound superpermutation length of 24. It appears to have optimal superpermutations of length 27 -- 3 more than the lower bound. An example is 332313323313233312333213332.

The multiset [1, 1, 4] has a lower bound superpermutation length of 35. It appears to have optimal superpermutations of length 39 -- 4 more than the lower bound. An example is 323331323333123333213333231333233133233.

The multiset [1, 1, 5] has a lower bound superpermutation length of 48. It appears to have optimal superpermutations of length 53 -- 5 more than the lower bound. An example is 33233313323333132333331233333213333323133332331333233.

Recalling that [1, 1, 2] had a minimal superpermutation length of 17 -- two more than its theoretical lower bound -- then assuming that this empirical pattern holds, this family appears to have optimal superpermutation lengths given by k = C(S) + C(S)!/P(m! | m in M(S)) + n - 1, which can be simplified to

k = 2n + (n+1)(n+2) + 1

The Family [n, n]

The multiset [2,2] has a total of 6 permutations, and a lower bound superpermutation length of 9. Its single unique superpermutation is

1122112121

which has a single length-4 substring which is not a permutation of the multiset, and a total length of 10.

As with the [1, 1, n] family, doing a complete brute-force search for n>2 takes a very long time, but we can find probable solutions fairly quickly.

The multiset [3, 3] has a lower bound superpermutation length of 25, and an apparent optimal superpermutation of length 29. An example is 12221112221211221212122112122.

The multiset [4, 4] has a lower bound superpermutation length of 77, and an apparent optimal superpermutation length of 117. An example is 222121112222111122221121222111212221121212211122122111222121121221211221212112221122112211211222121212122112122211211.

The multiset [5, 5] causes my search process to run out of memory and fall over. :(

The multiset [1, 1], equivalent to the set { 1, 2 }, has a minimum bound of 3, and an actual minimal superpermutation length of 3.

The differences from the minimum bound formula are 0, 1, 4, and 40 for n = 1, 2, 3, and 4 respectively.

Unfortunately, there is a large number of matches for the sequence 0, 1, 4... in The Online Encyclopedia of Integer Sequences... and none for 3, 10, 29, 117. So, without a more principled method of construction for these superpermutations, I am unfortunately at a loss to properly characterize them.

Friday, December 4, 2020

Why the Braid Group of Order 4 is the Best Braid Group

 See also: A New Kind of Algebra

Consider the braid group of order 1. It has a single element, the identity element "1", representing a section of a single strand. If you stick them together... well, 1*1 = 1, and one untwisted strand stuck onto another untwisted strand just gets you... an untwisted strand. Completely trivial and uninteresting.

Consider the braid group of order 2. It has two generators (basic members which cannot be created by combining other members, but which can combine to create the rest) in addition to the identity element: a twist, and an untwist. This is already an infinite group, because you can make an unlimited number of new irreducible things just by adding more twists! But, it's still a pretty uninteresting one. The only elements are just powers of a single generator--because any sequence that contains both a twist and an untwist simplifies to one that doesn't.

Consider the braid group of order 3. It has four generators, because you can put twists between either of the two pairs of adjacent strands. And finally, you get some non-trivial structure, because twists on different sets of strands don't cancel each other out!

But now, consider the braid group of order 4. It has six generators, and commutative relations! Now you can actually get some interesting algebra going on. And if you continue on to the braid group of order 5? Nothing new. After 4, you just get more things that commute with each other. No truly new structure ever appears again.

Furthermore, if we select elements of B4 with specific interesting properties, we get an interesting number...

Suppose we take a subset of B4 such that all elements obey the rules of Celtic knotwork: i.e., any given strand must alternate between over-crosses and under-crosses; no single strand can cross over another or under another twice in a row. All six generators follow this rule trivially, because they are all single crossings. However, we can characterize the remaining members of this set rather simply: they are all of the braids which contain only sequences of the following pairs of generators:

  1. aa
  2. bb
  3. cc
  4. /a/a
  5. /b/b
  6. /c/c
  7. a/b
  8. b/a
  9. /ab
  10. /ba
  11. b/c
  12. c/b
  13. /bc
  14. /cb
  15. ac (=ca)
  16. a/c (=/ca)
  17. c/a (=/ac)
  18. /a/c (=/c/a)
So, the entire set can be reduced down to 24 basic elements.

Now, imagine that we print the graphical representations of these elements onto cards, which we can physically manipulate. In addition to lining them up to concatenate them (equivalent to group multiplication), physical cards permit a new operation which is not part of the normal braid algebra: 180-degree rotation. Rotation has the following effect on each of the basic elements:
  • r(a)    = c     r(c)    = a
  • r(/a)   = /c    r(/c)   = /a
  • r(b)    = b
  • r(/b)   = /b
  • r(aa)   = cc    r(cc)   = aa
  • r(/a/a) = /c/c  r(/c/c) = /a/a
  • r(bb)   = bb
  • r(/b/b) = /b/b
  • r(a/b)  = /bc   r(/bc)  = a/b
  • r(b/a)  = /cb   r(/cb)  = b/a
  • r(/ab)  = b/c   r(b/c)  = /ab
  • r(/ba)  = c/b   r(c/b)  = /ba
  • r(ac)   = ac
  • r(a/c)  = c/a   r(c/a)  = a/c
  • r(/a/c) = /a/c

So, suppose that we wanted to build a deck which could provide any of these 24 elements with the minimum number of cards. For any rotational pair, we only need to print a single card, since we can get either member of the pair depending on which way we lay it down. That eliminates 9 elements, getting us down to 15. (It's an odd number because one pair of inverses rotate into each other: a/c ~ c/a.)
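We can verify that card count mechanically. The Python sketch below assumes that 180-degree rotation acts by reversing the order of crossings and swapping a <-> c (a reading which reproduces the rotation table above), and normalizes the commuting a/c-type sequences to the canonical spellings used in the list:

```python
# The 24 basic Celtic elements: 6 single crossings plus the 18 pairs.
ELEMENTS = [
    "a", "b", "c", "/a", "/b", "/c",
    "aa", "bb", "cc", "/a/a", "/b/b", "/c/c",
    "a/b", "b/a", "/ab", "/ba",
    "b/c", "c/b", "/bc", "/cb",
    "ac", "a/c", "c/a", "/a/c",
]

# Commuting a/c combinations have two equivalent spellings; pick one.
CANONICAL = {"ca": "ac", "/ca": "a/c", "/ac": "c/a", "/c/a": "/a/c"}

def tokens(elem):
    """Split an element into generator tokens: 'a/b' -> ['a', '/b']."""
    out, i = [], 0
    while i < len(elem):
        step = 2 if elem[i] == "/" else 1
        out.append(elem[i:i + step])
        i += step
    return out

def rotate(elem):
    """180-degree rotation: reverse crossing order and swap a <-> c."""
    swap = {"a": "c", "b": "b", "c": "a"}
    raw = "".join(("/" if t[0] == "/" else "") + swap[t[-1]]
                  for t in reversed(tokens(elem)))
    return CANONICAL.get(raw, raw)

# One card per rotation orbit: 9 mirrored pairs collapse, leaving 15.
orbits = {frozenset({e, rotate(e)}) for e in ELEMENTS}
assert len(orbits) == 15
```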

If you remove double twists, because they just aren't as cool, that gets you 11 basic cards--which is few enough to fit into the space of a 13-card suit in a regular deck! Thus, you can do Celtic braid algebra on 4 strands by assigning a simple braid value to each of a set of regular playing cards, with two left over--and you get four copies per deck, so you can actually do interesting things!

Go up to order 5, and the number of basic Celtic sequences is considerably larger, so even by throwing out double-twists, you can't get it to fit into a normal card deck nearly as nicely.