
Tuesday, March 25, 2025

Some Thoughts on Iljena

    Iljena is an alien conlang by Pete Bleackley, also the author of Khangaþyagon, which I reviewed previously.

    The key conceit of Iljena is that all words encode both a nominal root and a verbal root--and based on both the grammar notes and the dictionary, there are no other parts of speech. All verbs are monovalent, and you construct large propositions by chaining together noun-verbs that describe what each participant is doing. It's sort of like the disambiguation strategy sometimes employed in natlangs where a transitive clause that lacks distinctive subject and object marking (like two neuter nouns in a positive-polarity Russian sentence, or nouns with equal animacy in a direct-inverse language) can be split in two with an antipassive clause and a passive clause--i.e., instead of "Bob saw Bill", "Bob saw, Bill was seen". Except that Iljena doesn't have a passive construction, it just has enough different verb roots to cover all the necessary meanings, whether one is an agent or patient or instrument or whatever in any particular scene.

    With the lack of any other parts of speech, however, it is unclear how boundaries between clausal constituents are determined, how attachment ambiguities might be resolved, or how references to events-as-things are made, and the only ordering constraint is that 

Word order is used to convey the flow of the action between the participants, and to bring together closely related participants.

    However, David Gil has shown us that you don't really need to formalize all that grammatical machinery all of the time, and the corpus of Conlang Relay texts in Iljena, which have been translated reasonably faithfully by successive relay participants, demonstrates that it does work well enough. Pete's own documentation notes that Iljena could be considered a "verbless" language, based on the idea that verb roots could instead be interpreted as noun cases (which is one of the possible solutions to verblessness I discussed in my own article How To Not Verb), but he (and the fictional Leyen people who speak Iljena) prefer to think of the relevant open lexical class as verb roots, rather than case morphology--and I tend to agree. The complete lack of function words makes Iljena a decidedly non-human language, but that's fine--it's not supposed to be!

    As noted, Iljena does seem to work just fine as it is, so I won't presume to suggest improvements--but I think it would be neat to see a language that takes the one verb--one noun approach and embeds it in a larger system of grammatical function words for eliminating structural ambiguities. And it would also be neat to see some more detailed analyses of the existing corpus texts, beyond simple interlinear glosses, that might be able to extract more empirical rules about Iljena grammatical structure.

Some Thoughts... Index

Sunday, October 20, 2024

On the Tjugem Alphabet & Font

This Bluesky thread with Howard Tayler reminded me that, although I posted progress updates about it on Twitter back in the day, I never did a comprehensive write-up on how the thing works.

    A good place to start is this Reddit comment on Toki Suli. Yeah, it's not Tjugem, but phonetically it works the same way. Quote:

in the WAV files, the 'm' sounds seem to be going up rather than down, such as with "mi", even though the "m" is supposed to be grave. sharp and acute sounds seem to go down rather than up, such as in "tu".

is the linguistic term for "downward" vs "upward" the opposite of what i'd expect from a western music theory perspective? or am i maybe missing something as i'm listening to the files?

    Yes, Reddit user, you were missing something! Because in the phonetics of human whistle registers, "grave" and "acute" are positions, not motions. So, if you move from a vowel to a grave consonant, the formant will go down in pitch--from a middle-pitch vowel locus to a low-pitch consonant locus. But when going from a grave consonant to a vowel, pitch will go up--from a low-pitch consonant locus to a middle-pitch vowel locus. An "m" in between two vowels will be realized by a down-then-up formant motion, while a "t" between two vowels will be realized by an up-then-down motion.
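
    Here's a tiny sketch of that locus logic in Python; the pitch values are arbitrary stand-ins of mine, not measured whistle formant frequencies:

# Consonant loci are positions on the pitch axis; the perceived up/down
# motions fall out of moving between loci. Toy values only.
LOCI = {"m": 1.0, "t": 3.0}  # grave = low locus, acute = high locus
VOWEL_LOCUS = 2.0            # middle-pitch vowel locus

def formant_path(segments):
    """Map a segment string to its sequence of locus pitches."""
    return [LOCI.get(s, VOWEL_LOCUS) for s in segments]

def motions(path):
    """Describe each locus-to-locus transition as rising or falling."""
    return ["up" if b > a else "down" for a, b in zip(path, path[1:])]

print(motions(formant_path("imi")))  # ['down', 'up'] -- grave 'm' dips
print(motions(formant_path("iti")))  # ['up', 'down'] -- acute 't' peaks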

    Now, because whistled speech only has a single formant, it turns out to be not-unreasonable to write whistled speech as an image of the formant path on a spectrogram. You can just write a continuous line with a pen! Or, almost. There are some details--like amplitude variation--that are lost if you try to write with a ballpoint, and still difficult to get right if you write with a wide-tip marker or fountain pen. Thus, a few extra embellishments and decorations are useful, but that is the basic concept: each letter is just the shape that that letter makes on a spectrogram when pronounced. And with just that background, you should be able to start to make sense of this chart of Tjugem letters, as they would be written on lined paper:


    The correspondence between Tjugem glyphs and the standard romanization is as follows:

   
    Keep in mind, however, that the actual phonemes are whistles--not sounds that are representable with the IPA, despite the fact that the romanization is designed to be pronounceable "normally" if you really want to. And for the sake of space, only the allographs for one vowel environment are shown for each consonant. The G glyph is not so much a "glyph" as a lack of one, which is why it does not show up in the first image; acoustically, the phoneme is just a reduction in the amplitude of a vowel, represented by a break in the line. Thus, any line termination could be interpreted as a G. That necessitated the introduction of the line termination glyphs, which have no phonetic value but just indicate that a word ends with no phonemic consonant. The above-line vs. below-line variants of the Q glyph are chosen to visually balance what comes before or after them. Additionally, the "schwa" vowel (romanized as "E") is not represented by any specific glyph. The existence of a schwa sound in the first place is an unavoidable artifact of the fact that transitioning between certain consonants requires moving through the vowel space, but which vowel loci end up being hit isn't actually important. So, in the Tjugem script, the schwa just turns into whatever stroke happens to make the simplest connection between adjacent consonants.

    You shouldn't be expected to always be writing on lined paper, which explains the extra lines--a mark above or below a vowel segment tells you whether it is a high vowel or a low vowel, for those curves which could be ambiguous. And the circular embellishments help to distinguish manner of articulation for different consonants, which have the same spectral shape but different amplitude curves, which would otherwise have to be indicated by varying darkness or line weight. But note in particular that every consonant comes in a pair of mirror-symmetric glyphs: one moving from the vowel space to the consonant locus, and one moving from the consonant locus to the vowel space. And there are three different strokes for each half-consonant depending on which vowel is next to it! Making for a total of six different strokes for every consonant, because the actual spectral shapes of consonants change depending on their environment! It's allophony directly mirrored in allography.

    This makes creating a font for Tjugem rather... complicated. Sure, we could assign every allograph to a different codepoint, but that would be very inconvenient to use. It would be nice if we could just type out a sequence of phonemes, one keystroke per phoneme, and have the font take care of the allographic variation for us! Is that sort of thing possible? Yes! Yes, it is!

    The individual letter forms get assigned to a list of display symbols, specifying every possible consonant/vowel pairing:
# i_t i_d i_n i_k i_g i_q i_p i_b i_m
# a_t a_d a_n w_a_k j_a_k a_g w_a_q j_a_q a_p a_b a_m
# u_t u_d u_n u_k u_g u_q u_p u_b u_m
# t_i d_i n_i k_i g_i q_i p_i b_i m_i
# t_a d_a n_a k_a g_a q_a p_a b_a m_a
# t_u d_u n_u k_u g_u q_u p_u b_u m_u
# i_i j_a j_u_a j_u
# u_u w_a w_i_a w_i

and the slots for the romanized letters that we actually type out (a b d e g i j k m n p q t u w) are left blank. Contextual ligatures are then used to replace the sequence of input phonemes with an expanded sequence of intermediate initial, final, and transitional symbols, which are then finally substituted by the appropriate display symbols, which are then used to look up the correct alloglyphs. Then, if we update the boring straight-ruled glyph set with a slanted, more flowy-looking version, we can get a calligraphic font slightly reminiscent of Nastaliq, where lines can overlap each other because the ornamentation disambiguates; the Tjugem Tadpole script:
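
    For the curious, here is a minimal Python sketch of what that ligature chain accomplishes, reduced to the core idea of splitting each consonant into vowel-context-dependent entering and leaving strokes; it ignores line terminators, glides, and schwa connections, and the function is my illustration, not the actual font tables:

VOWELS = set("aiu")

def to_display_symbols(word):
    """Expand a romanized phoneme string into per-stroke display symbols.

    Each consonant yields a stroke entering it from the preceding vowel
    (e.g. 'u_t') and a stroke leaving it toward the following vowel
    (e.g. 't_u'), mirroring the paired allographs described above.
    """
    symbols = []
    for i, ch in enumerate(word):
        if ch in VOWELS:
            continue
        if i > 0 and word[i - 1] in VOWELS:
            symbols.append(f"{word[i - 1]}_{ch}")  # vowel-to-consonant stroke
        if i + 1 < len(word) and word[i + 1] in VOWELS:
            symbols.append(f"{ch}_{word[i + 1]}")  # consonant-to-vowel stroke
    return symbols

print(to_display_symbols("tuki"))  # ['t_u', 'u_k', 'k_i']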



Monday, August 12, 2024

Mapping out Tetrachromat Color Categories

Tetrachromacy is kind of convenient because you still have only 2 dimensions of hue, so you can actually diagram out what the color regions are, and just tell people "y'all already know how brightness and saturation work, so I don't need to put those on the chart".

    But I didn't actually try to make such a diagram until the last two episodes of George Corley's Tongues and Runes stream on Draconic gave me a solid motivation to figure out how to do it.

    Any such diagram will have to use some kind of false-color convention. We could try subdividing the spectrum to treat, e.g., yellow or cyan like a 4th physical primary for producing color combinations, and that might be the most accurate if you're trying to represent the color space of a tetrachromat whose total visual spectrum lies within ours, just divided up more finely--but the resulting diagrams are really hard to interpret. It's even worse if you try to stretch the human visible spectrum into the infrared or ultraviolet, 'cause you end up shifting colors around so that, e.g., what you would actually perceive as magenta ends up represented as green on the chart. The best option I could come up with was to map the "extra" spectral color--the color you can't see if it happens to be ultraviolet or infrared--to black, and use luminance to represent varying contributions of that cone to composite colors. Critically, if you don't want to work out the exact spectral response curves for a theoretical tetrachromatic creature to calculate their neurological opponent channels, you can map out the color space in purely physical terms, like we do with RGB color as opposed to, e.g., YCbCr or HSV color spaces. That doesn't require any ahead-of-time knowledge of which color combinations are psychologically salient.

    My first intuition on how to map out the 2D hue space was to arrange the axes along spectral hue--exactly parallel to the human sense of hue--and non-spectral hue, which essentially measures the distance between two simultaneous spectral stimuli. As the non-spectral hue gets larger, the space that you have to wiggle an interval back and forth before one end runs off the edge of the visible spectrum shrinks, so the space ends up looking like a triangle:

    This particular diagram was intended for describing the vision of RGBU tetrachromats, with black representing UV off the blue end of the spectrum; you could put black representing IR at the other end, but ultimately the perceivable spectrum ends up being cyclic so it doesn't really matter. If you want the extra cone to be yellow or cyan-receptive, though... eh, that gets complicated, and any false-color representation will be bad. But that highlights a general deficiency of this representation: it does a really bad job of showing which colors are adjacent at the boundaries. The spectrum along the top edge is properly cyclic, but otherwise the edges don't match up, so you can't just roll this into a cone.

    Another possible representation is based on the triangle diagram of trichromat color space:

    Each physical primary goes at the corner of a simplex, and each point within the simplex is colored based on the relative distance from each corner. This shows you both hue, with the spectrum running along the exterior edges, and saturation, with minimal saturation (equal amount of all primaries) in the center. We can easily extend this idea to tetrachromacy, where the 4-point simplex is a tetrahedron:

    The two-dimensional hue space exists on the exterior surface and edges of the tetrahedron, with either saturation or luma mapped to the interior space. Note that one triangular face of the tetrahedron is the trichromat color triangle, but the center of that face no longer represents white. If we call the extra primary Q (so as not to bias the interpretation towards UV, IR, or anything else), then the center of the RGB face represents not white, but anti-Q, which we perceive as white, but which is distinct from white to a tetrachromat. This is precisely analogous to how the center of the dichromat spectrum is "white", but what a dichromat (whose spectral range is identical to ours) sees as white could be any of white, green, or magenta to us. Similarly, what we see as white could be actual 4-white, or anti-Q.
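
    To make the coloring rule concrete, here's a minimal Python sketch of the barycentric weighting on a regular tetrahedron; the vertex coordinates are an arbitrary choice of mine for illustration, not taken from the diagrams:

import numpy as np

PRIMARIES = ["R", "G", "B", "Q"]
VERTS = np.array([
    [ 1.0,  1.0,  1.0],
    [ 1.0, -1.0, -1.0],
    [-1.0,  1.0, -1.0],
    [-1.0, -1.0,  1.0],
])  # a regular tetrahedron centered on the origin

def primary_weights(p):
    """Relative contribution of each primary at a point in the simplex."""
    # Solve sum(w_i * v_i) = p subject to sum(w_i) = 1.
    a = np.vstack([VERTS.T, np.ones(4)])
    b = np.append(np.asarray(p, dtype=float), 1.0)
    return dict(zip(PRIMARIES, np.linalg.solve(a, b)))

print(primary_weights([0, 0, 0]))          # center: equal weights -- true 4-white
print(primary_weights(VERTS[:3].mean(0)))  # center of RGB face: zero Q -- anti-Q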

    Since the surface of a tetrahedron is still 2D, we can unfold the tetrahedron into another flat triangle:

    Here, it is unfolded around the RGB face, but that is arbitrary--it could equally well be unfolded around any other face, with a tertiary anti-color in the center, and that would make no difference to a tetrachromat, just as spinning a color wheel makes no difference to you. Note that, after unfolding, the Q vertex is represented three times, and every edge color is represented twice--mirrored along the unfolded edges. This becomes slightly more obvious if we discretize the diagram:

    Primary colors at the vertices, secondary colors along the edges, tertiary colors (which don't exist in trichromat vision) on the faces. This arrangement, despite the duplications, makes it very easy to put specific labels on distinct regions of the space--although the particular manner in which the color space is divided up is somewhat artificial. And the duplications actually help to show what's going on with the unfolded faces--yes, the Q vertex shows up three times, but note that the total area of the discretized region around the Q vertex is exactly the same size as the area around the R, G, and B vertices.

    If we return to the trichromat triangle, note that you can obtain a color wheel simply by warping it into a circle; the spectrum of fully-saturated hues runs along the outside edge either way. Similarly, we can "inflate" the tetrahedron to get a color ball.

    If we want it flattened out again, any old map projection will do, but we have to keep in mind that the choice of poles is arbitrary; here's the cylindrical projection along the Q-anti-Q axis:

    And here's a polar projection centered on anti-Q:

    This ends up looking quite a lot like a standard color wheel, just extended past full saturation to show darkening as well as lightening; note the fully saturated ring at half the radius. However, the interpretation is quite different; remember, that center color isn't actually white. True tetrachromat white exists at the center of the ball, and doesn't show up on this diagram. And the false-color black around the edge isn't just background, it's the Q pole. If you need extra help to get your brain out of the rut of looking at this as a trichromat wheel, we can look at 7 other equally-valid polar projections that show exactly the same tetrachromatic hue information:

The Q pole.
The B pole.
The anti-B pole.
The R pole.
The anti-R pole.
The G pole.
The anti-G pole.
(I probably should've done some scaling for equal area on these; the opposite poles end up looking like they take up way more of the color gamut than they actually do, and the false-color-black Q pole ends up getting washed out as a result. But I don't really expect anybody to use these alternate projections for labelling regions of hue--they're just to help you understand that the space really is a sphere, not a wheel!)

    And we could produce alternately-oriented cylindrical projections as well, if we wanted to.

    Of course, the full tetrachromat color space still contains two more whole dimensions--saturation and luminosity. But those work exactly the same way as they do for trichromats. Thus, if you want to create separate named color categories for tetrachromatic equivalents of, say, brown (dark orange) or pink (light red), you can still place them on the map by identifying the relevant range of hues and then just adding a note to say, e.g., "this region is called X when saturated, but Y when desaturated".

    Now, go forth and create language for non-human speakers with appropriate lexical structure in color terms!

Friday, August 9, 2024

Some More Thoughts on Toki Pona

What the heck is Toki Pona?

After publishing my last short article, several people expressed interest in a deeper analysis of various aspects of toki pona--among them, Sai forwarding me a request from jan Sonja for one conlanger's opinion about how to categorize toki pona. So, I shall attempt to give that opinion here.

The Gnoli Triangle, devised by Claudio Gnoli in 1997, remains the most common way to classify conlangs into broad categories.


Within each of these three categories are numerous more specific classifications, but broadly speaking we can define each one as follows based on the goals behind a conlang's construction:

Artlang: A language devised for personal pleasure or to fulfill an aesthetic effect.

Engelang: A language devised to meet specific objective design criteria, often in order to test some hypothesis about how language does or can work.

Auxlang: A language devised to facilitate communication between people who otherwise do not share a common natural language. Distinct from a "lingua franca", a language which actually does function to facilitate communication between large groups of people without a native language in common.

Any given language can have aspects of all three of these potential categorizations. But, to figure out where in the triangle toki pona should fit, we need to know the motivations behind its creation.

To that end, I quote from the preface of Toki Pona: The Language of Good:

Toki Pona was my philosophical attempt to understand the meaning of life in 120 words. 

Through a process of soul-searching, comparative linguistics, and playfulness, I designed a simple communication system to simplify my thoughts.

I first published [Toki Pona] on the web in 2001. A small community of Toki Pona fans emerged.

In relation to the third point, in private communication jan Sonja confirmed that she never actively tried to get other people to use it. The community just grew organically. Even though the phonology was intentionally designed to be "easy for everyone", that tells me that the defining motivation behind toki pona was not that of an auxlang. In practice, it does sometimes serve as a lingua franca, but it wasn't designed with the intention of filling that role. It was designed to help simplify thoughts for the individual. Therefore, we can conclude that toki pona does not belong in the auxlang corner, or somewhere in the middle. A proper classification will be somewhere along the engelang-artlang edge--what I am inclined to call an "architected language" or "archlang" (although that particular term has been slow to catch on in anyone's usage but my own!).

So, what are the design criteria behind toki pona? Referring again to The Language of Good, toki pona was intended to be minimalist, using the "simplest and fewest parts to create the maximum effect". Additionally, "training your mind to think in Toki Pona" is supposed to promote mindfulness and lead to deeper insights about life and existence.

Toki Pona is also described as a "philosophical attempt"; can it then be classed as a "philosophical language"? I referred to it as such in my last post, and I think yes; it is, after all, the go-to example of a philosophical language on the Philosophical language Wikipedia page! The term "philosophical language" is sometimes used interchangeably with "taxonomic language", where the vocabulary encodes some classification scheme for the world, as in John Wilkins's Real Character, but more broadly a philosophical language is a type of engineered language designed from a limited set of first principles, typically employing a limited set of elemental morphemes (or "semantic primes"). Toki Pona absolutely fits that mold--which means it can be legitimately classed as an engelang as well.

However, Toki Pona was clearly not constructed entirely mechanistically. It came from a process of soul-searching and playfulness, and encodes something of Sonja's own sense of aesthetics in the phonology. Ergo, it is clearly also an artlang. Exactly where along that edge it belongs--what percentage of engelang vs. artlang it is--is really something that only jan Sonja can know, given these categorial definitions which depend primarily on motivations. But I for one am quite happy to bring it in to the "archlang" family.

To cement the artlang classification, I'll return to the "minor complexities" I mentioned in the last article. To start with, what's up with "li"? It is supposed to be the predicate marker, but you don't use it if the subject is "mi" or "sina"... yet you do for "ona", so it's clearly not a simple matter of "pronoun subjects don't need 'li'". But, if we imagine a fictional history for toki pona, it makes perfect sense. There is, after all, a fairly common historical process by which third person pronouns or demonstratives transform into copulas in languages that previously had a null copula. (This process is currently underway in modern Russian, for example.) So, suppose we had "mi, sina, li" as the "original" pronouns; "li", in addition to its normal referential function, ends up getting used in cleft constructions with 3rd person subjects to clarify the boundary between subject and predicate in null-copula constructions. Eventually, it gets re-analyzed as the copula, except when "mi" and "sina" are used, because they never required cleft-clarification anyway (and couldn't have used it if they did, because of person disagreement), and a new third-person pronoun is innovated to replace it--which, being new, doesn't inherit the historical patterning of "mi" and "sina", so you get a naturalistic-looking irregularity.

Or, take the case of "en". It seems fairly transparently derived from "and", and that is one of its glosses in The Toki Pona Dictionary, based on actual community usage, but according to The Language of Good it does not mean "and"--it just means "this is an additional subject of the same clause". Toki Pona doesn't really need a word for "and"; clauses can just be juxtaposed, and the particle "e" makes it clear where an object phrase starts, so you can just chain as many of those together as you want with no explicit conjunction. So, we just need a way to indicate the boundary between multiple different subject phrases. You could interpret that as just a kind of marked nominative case--except you don't use it when there's only one subject. It's this weird extra thing that solves a niche edge case in the basic grammar. A strictly engineering-focused language might've just gone with an unambiguous marked nominative, or an explicit conjunction, but Toki Pona doesn't. It's more complicated, in terms of how the grammatical rules are specified, than it strictly needs to be.

And then, we've got the issue of numerals. All numerals follow the nouns which they apply to, whatever their function--but that means an extra particle must be introduced into the lexicon to distinguish cardinal numerals (how many?) from ordinal numerals (which one?). That is an unnecessary addition which makes the lexicon not-strictly-minimalist. The existing semantics of noun juxtaposition within a phrase make it possible to borrow the kind of construction we see in, e.g., Hawai'ian, where using a numeral as the head of a noun phrase forces a cardinal interpretation (something like "a unit of banana", "a pair of shoes", "a trio of people", etc.), while postposing a numeral in attributive position forces an ordinal interpretation ("banana first", "shoe second", "person third"). But Toki Pona doesn't do that!

Finally, as discussed previously, the lexicon is not optimized. These are all expressions of unforced character--i.e., artistic choice.

But what if Toki Pona were an auxlang? How would it be different?

Well, first off, we'd fix those previous complexities. At minimum, introduce an unambiguous marked nominative (which also helps with identifying clause boundaries), unify the behavior of pronouns and the copula / predicate marker, and get rid of the unnecessary ordinal particle. Then, we look at re-structuring the vocabulary. I collected a corpus of Toki Pona texts, removed all punctuation, filtered for only the 137 "essential words", and ended up with a set of 585,888 tokens from which to derive frequency data. Based on this data set, 7 of the "essential words" appear zero times... which really makes them seem not that essential, and argues for cutting down the word list to an even 130. (Congratulations to jan Sonja for getting so close to the mark with the earlier choice of 120!) There are 72 two-syllable words that occur "too infrequently"--in the sense that there are three-syllable words that occur more frequently, and so should've been assigned shorter forms first. And similarly, there are 23 one-syllable words which are too infrequent compared to the two-syllable words. Honestly, predicting what these frequency distributions ought to be is really freakin' hard, so jan Sonja can't be blamed for these word-length incongruities even if she had been trying to construct a phonologically-optimized auxlang, but now we have the data from Toki Pona itself, so we could do better! Design a phonology, enumerate all of the possible word forms in order of increasing complexity, and then assign them to meanings according to the empirical frequency list!
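
That last step is mechanically simple; here's a sketch of it in Python, using a stand-in syllable inventory (the next paragraph derives the real one) and a stand-in frequency list:

from itertools import product

CONSONANTS = "npkw"  # provisional inventory; see below
VOWELS = "iau"
SYLLABLES = ["".join(s) for s in product(CONSONANTS, VOWELS)]

def word_forms(max_syllables=3):
    """Yield all CV words in order of increasing length, i.e., complexity."""
    for length in range(1, max_syllables + 1):
        for syls in product(SYLLABLES, repeat=length):
            yield "".join(syls)

def assign_lexicon(meanings_by_descending_frequency):
    """Give the most frequent meanings the shortest word forms."""
    return dict(zip(meanings_by_descending_frequency, word_forms()))

# Stand-ins for the real 130-item empirical frequency list:
print(assign_lexicon(["be", "thing", "good", "person"]))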

For that, of course, we need to define a new phonology. It needs to produce at least 129 (remember, we're dropping the ordinal particle) words of three syllables or less, but no more than that. Based on picking the most cross-linguistically common segments according to Phoible data, we can go with the following inventory:

i,a,u
n (/m), p, k, w (/v)

With a strict syllable structure of CV, that produces 12 monosyllables and 144 disyllables.
Cutting out w/v gives us 9 monosyllables and 81 disyllables--not enough to squish everything into two syllables or less. But there are 729 trisyllables--way more than we need! So, we could cut it down even more... But, that gets at a hard-to-quantify issue: usability. Aesthetics, it turns out, can be an engineering concern when engineering for maximal cross-cultural auxlang usability! Too few phonemes, and the language gets samey and hard to parse. Toki Pona as it is seems to hit a sweet spot in having some less-common phonemes, but sounding pretty good--good enough to naturally attract a speaker community. If I were doing this for real, I'd probably not just look at individual segments, but instead comb through Phoible for the features that are most cross-linguistically common, and try to design a maximally-large universally-pronounceable inventory of allophone sets based on that to give variety to the minimal set of words. But if we accept the numbers of phonemes, and accept their actual values as provisional, what happens if we enumerate words while also eliminating minimal pairs?

Well, then we get a maximum of 3 monosyllables (re-using any vowel would produce a minimal pair), well under a hundred disyllables, but plenty of trisyllables. It would be nice to not do worse than Toki Pona in the average word length, though, which means we probably need 118 monosyllables + disyllables--we can get that pretty easily by relaxing the word-difference constraints such that we can have minimal pairs between, e.g., /n/ and /k/, which are extremely unlikely to be confused. Or, we just go up to five consonants instead of four, probably adding in something like j (/l).
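
Those counts are easy to check by brute force; here's a sketch using a greedy filter (one of many possible selection orders, so the exact disyllable count will vary):

from itertools import product

CONSONANTS = "npkw"
VOWELS = "iau"

def cv_words(n_syllables):
    syllables = ["".join(s) for s in product(CONSONANTS, VOWELS)]
    return ["".join(w) for w in product(syllables, repeat=n_syllables)]

def is_minimal_pair(a, b):
    """Two equal-length words differing in exactly one segment."""
    return sum(x != y for x, y in zip(a, b)) == 1

def without_minimal_pairs(words):
    """Greedily keep words that don't form a minimal pair with any kept word."""
    kept = []
    for w in words:
        if not any(is_minimal_pair(w, v) for v in kept):
            kept.append(w)
    return kept

print(len(without_minimal_pairs(cv_words(1))))  # 3, as claimed
print(len(without_minimal_pairs(cv_words(2))))  # well under a hundred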

I'm still not super inclined to add to the mountain of failed auxlangs or tokiponidos in the world... but that's the process I would use to properly engineer an optimal auxlang-alternate to Toki Pona.

Some Thoughts... Index

Friday, August 2, 2024

Some Thoughts on Toki Pona

Toki Pona is a minimalist philosophical artistic language, not an auxlang. Nevertheless, it has attracted a fairly large and international community of users--enough so that it was possible for Sonja Lang to publish a descriptive book on natural usage of Toki Pona (The Toki Pona Dictionary)! Thus, while this should in no way be seen as a criticism in the negative sense of Sonja's creation, it seems fair to critique Toki Pona on how optimized its design is as an auxlang.

Toki Pona has 92 valid syllables, composed of 9 consonants and 5 vowels. Accounting for disallowed clusters at syllable boundaries, this results in 7519 possible 2-syllable words--far, far more than any accounting of the size of Toki Pona's non-proper-noun vocabulary, which does not surpass 200 words. In developing the toki suli whistle register, I discovered that some phonemes can be merged without any loss of lexical fidelity--so even if we wanted to add additional restrictions like spreading out words in phonological space to eliminate minimal pairs, or ensuring that the language was uniquely segmentable, the phonetic inventory and phonotactic rules are clearly larger and more permissive than they strictly need to be. And a smaller phonemic inventory and stricter phonotactics would theoretically make it trivially pronounceable by a larger number of people. For example, we could reduce it to a 3-vowel system (/a i u/), eliminate /t/ (merging with either /k/ or /s/), and merge /l/ and /j/. More careful consideration in building a system from scratch, rather than trying to pare away at Toki Pona's existing system, could minimize things even further, but if we start there, and require that all syllables be strictly CV, then we get 7x3=21 valid syllables and 441 valid 2-syllable words. We could rebuild a lexicon on top of that with no minimal pairs and unique segmentation just fine, or choose to make the phonemic inventory even smaller--all while still reducing the average Toki Pona word length, since the current vocabulary does include a few trisyllabic words!

The grammar, on the other hand, I really have no complaints about. It is not quite as simple as it could be (e.g., li could be made always obligatory, rather than obligatory-unless-the-subject-is-mi-or-sina), but it's really quite good--and the minor complexities actually help add to its charm as an artlang.

I am not much inclined to actually construct a phonologically-optimized relex of Toki Pona, as what would be the use? But it is fun to imagine an alternate history in which Toki Pona was designed from the outset with usage as an auxlang in mind. Would it actually have become as successful as it is, had Sonja taken that route? Perhaps we need to consider another contributor to Toki Pona's popularity--Sonja's specific phonological aesthetic. As mathematically sub-optimal as it is, Toki Pona sounds nice. Would it still have become popular if its sounds were instead fully min-maxed for optimal intercultural pronounceability, length, and distinctiveness? Maybe I'll build a Toki Pona relex after all, just to see if it can be made to sound pretty....

Some Thoughts... Index

Tuesday, March 19, 2024

Human Actors Shouldn't Be Able to Speak Alien Languages

Isn't it a little weird that humans can speak Na'vi? Or that aliens can learn to speak English? Or, heck, Klingon! The Klingon language is weird, but every single sound in it is used in human languages.

Of course, there's an obvious non-diegetic reason for that. The aliens are played by human actors. Actors wanna act. Directors want actors to act. It's less fun if all of your dialog is synthesized by the sound department. But while it is an understandable and accepted trope, we shouldn't mistake it for representing a plausible reality.

First, aliens might not even use sound to communicate! Sound is a very good medium for communication--most macroscopic animals on Earth make use of it to some extent. But there are other options: electricity, signs, touch, light, color and patterning, chemicals. Obviously, a human actor will not, without assistance, be able to pronounce a language encoded in changing patterns of chromatophores in skin, nor would a creature that spoke that language have much hope of replicating human speech. But since sound is a good and common medium of communication, let's just consider aliens that do encode language in sound.

The argument was recently presented to me that aliens should be able to speak human languages, and vice-versa, due to convergent evolution. An intelligent tool-using species must have certain physical characteristics to gain intelligence and use tools, therefore... I, for one, don't buy the argument that this means humanoid aliens are likely to start with, but supposing we do: does being humanoid in shape imply having a human-like vocal tract, or a vocal tract capable of making human-like noises? I propose that it does not. For one thing, even our closest relatives, the various great apes, cannot reproduce our sounds, and we can only do poor approximations of theirs. Their mouths are different shapes, the throats are different shapes, they have different resonances and constriction points. We have attempted to teach apes sign languages not just because they lack the neurological control to produce the variety of speech sounds that we do, but also because the sounds they can produce aren't the right ones anyway. Other, less-closely-related animals have even more different vocal tracts, and there is no particular reason to think they would converge on a human-like sound producing apparatus if any of them evolved to be more externally human-like. We can safely assume that creatures from an entirely different planet would be even less similar to us in fine anatomic detail. So, Jake Sully should not be able to speak Na'vi in his human body, and should not be able to speak English in his avatar body--yet we see Na'vi speaking English and humans speaking Na'vi all the time in those movies.

And that's just considering creatures that make sounds in essentially the same way that we do: by using the lungs to force air through vibrating and resonant structures connected with the mouth and nose. Not all creatures that produce sound do so with their breath, and not all creatures that produce sound with their breath breathe through structures in their heads! Intriguingly, cetaceans and aliens from 40 Eridani produce sound by moving air through vibrating structures between internal reservoirs, rather than while inhaling or exhaling--they're using air moving through structures in their heads, but not breath!

Hissing cockroaches make noise by expelling air from their spiracles. Arguably, this should be the basis for Na'vi speech as well: nearly all of the other animals on Pandora breathe through holes in their chests, with no obvious connection between the mouth and lungs. They also generally have six limbs and multiple sets of eyes. Wouldn't it have been cooler to see humanoid aliens with those features, and a language to match? But, no; James Cameron inserted a brief shot of a monkey-like creature with partially-fused limbs, no opercula, and a single set of eyes to provide a half-way-there justification for the evolution of Na'vi people who are just like humans, actually.

Many animals produce sound by stridulation. No airflow required. Cicadas use a different mechanism to produce their extremely loud songs: they have structures called tymbals which are crossed by stiff ribs; flexing muscles attached to the tymbals causes the ribs to pop, and the rest of the structure to vibrate. It's essentially the same mechanism that makes sound when you stretch or compress a bendy straw (or, as Wikipedia calls them, straws with "an adjustable-angle bellows segment"). This sound is amplified and adjusted by passage through resonant chambers in the insects' abdomens. Some animals use percussion on the ground to produce sounds for communication. Any of these mechanisms could be recruited by a highly intelligent species as a means of producing language, without demanding any deviation from an essentially-humanoid body plan.

There is, of course, one significant exception: birds have a much more flexible sound-production apparatus than mammals, and some of them are capable of reproducing human-like sounds, even though they do it by a completely different mechanism (but it does still involve expelling air from the lungs through the mouth and nose!) Lyrebirds in particular seem to have the physiological capacity to mimic just about anything... but the extent to which they choose to imitate unnatural or human sounds is limited. Parrots and corvids are known to specifically imitate human speech, but they do so with a distinct accent; their words are recognizable, but they do not sound like humans. And amongst themselves, they do not make use of those sounds. Conversely, intraspecific communication among birds tends to make use of much simpler sound patterns, many of which humans can imitate, about as well as birds can imitate us, by whistling. So, sure, some aliens may be able to replicate human speech--but they should have an accent, and if their sound production systems are sufficiently flexible to produce our sounds by different means, there is no reason they should choose to restrict themselves to human-usable sounds in their own languages. Similarly, humans may be able to reproduce some alien languages, but they will not sound like human languages--and when's the last time you heard a human actor in alien makeup whistling? (Despite the fact that this is a legitimate form of human communication as well!)

The most flexible vocal apparatus of all would be something that mimics the action of an electronic speaker: directly moving a membrane through muscular action to reproduce any arbitrary waveform. As just discussed, birds come pretty close to capturing this ability, but they aren't quite there. There are a few animals that produce noise whose waveform is directly controlled by muscular oscillation which controls a membrane, but they are very small: consider bees and mosquitoes, whose buzzing is the result of their rapid wing motions (or, in the case of bumblebees, muscular vibrations of the thorax). Hummingbirds are much bigger than those insects, and they can actually beat their wings fast enough to create audible buzzing sounds (hence, I assume, the name "humming"bird), but they are still pretty small animals. And despite these examples of muscle-driven buzzing, it seems rather unlikely that a biological entity--or at least, one which works at all similarly to us--could have the muscular response speed and neurological control capabilities to replicate the complex waveforms of human speech through that kind of mechanism. But if they did (say, like the Tines from Vernor Vinge's A Fire Upon the Deep), just like parrots and crows, why would their native communication systems happen to use any sounds that were natural for humans?

Now, some people might argue with my assertion that "any of these mechanisms could be recruited... as a means of producing language". That doesn't really impinge on my more basic point that an alien language should not reasonably be expected to be compatible with the human vocal apparatus, but let's go ahead and back up the assertion anyway. Suppose a certain creature's sound-production apparatus isn't even flexible enough to reproduce the kinds of distinctions humans use in whistled speech, based on modulating pitch and amplitude (which cicadas certainly can). Suppose, in fact, that it can produce only four distinct sounds. That should be doable by anybody that can produce sound at all--heck, there are more than 4 ways of clapping your hands. With 2 consecutive sounds, you can produce 16 distinct words. If you allow 3, it goes up to 80 words. At a word length of 4 or less, you've got 336 possible words. So far, that doesn't sound like very much. But then, there are 1360 possible words of length 5 or less, and 5456 of length 6 or less. At a length of 7, you get 21,840 possible words--comparable to the average vocabulary of an adult English speaker. The average length of English words is a little less than 5 letters, and we frequently (10 letters) use words that are longer than 7 letters, so needing to go up to 7 to fit your entire adult vocabulary isn't too bad. And that's before we even consider the ability to use homophones to compress the number of distinct words needed! So: we might argue about exactly how many words are needed for a fully-functional language with equivalent expressive power to anything humans use, but through the power of combinatorics, even small numbers of basic phonetic segments can produce huge numbers of possible words--indisputably more than any number we might come up with as a minimum requirement. A language with only four sounds might be difficult for humans to use, as it would seem repetitive and difficult to segment... but we're talking about aliens here. If 4 sounds is all their bodies have to work with, their brains would simply specialize to efficiently process those specific types of speech sounds, just as our brains specialize for our speech sounds.
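
The arithmetic here is just a geometric series; note that, following the counts above, single-sound words are excluded:

def possible_words(n_sounds, max_len, min_len=2):
    """Count distinct words of min_len..max_len segments over n_sounds sounds."""
    return sum(n_sounds ** k for k in range(min_len, max_len + 1))

for max_len in range(2, 8):
    print(max_len, possible_words(4, max_len))
# 2: 16, 3: 80, 4: 336, 5: 1360, 6: 5456, 7: 21840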

Now, to be clear, this is not intended to disparage any conlanger who's making a language for aliens and using human-compatible IPA sounds to do so. It's an established trope! And even if it's not ever used in a film or audio drama, it can be fun. There are plenty of awesome, beautiful examples of conlangs of this type, and there's no inherent problem with making more if that's what you want to do. Y'all do what you want. But we should not mistake adherence to the trope for real-world plausibility! And it would be great to see more Truly Alien Languages out there.

Sunday, February 25, 2024

Review: "Reading Fictional Languages"

I'm going meta! I'm reviewing people who are reviewing people who use conlangs in fiction!

Reading Fictional Languages (that's an Amazon Affiliate link, but you can also get it directly from Edinburgh University Press) is a collection of articles that follows up on the presentations given at the eponymous Reading Fictional Languages conference, which brings together both creators and scholars of constructed languages used in fictional works. I was provided with a free review copy as a PDF, but not until after I had bought my own hardcover anyway.

The first thing to note is that the title is kind of poorly chosen. It is telling that articles by conlangers refer to their subject as "constructed languages" or "conlangs", while articles by literary scholars refer to their subject as "fictional languages". Based on personal communication with some of the contributors, it seems that the organizers of the conference on which this volume was based (which I did submit an abstract for myself, but was not accepted) were unaware of the modern conlanging community and taken somewhat by surprise when actual language creators showed up to talk about their work! And they had thus developed their own analytical terminology ahead of time in isolation from conlanging practitioners.

Chapter 1, the introduction, contrasts "real" languages with languages which are "imagined for an equally fictional community of users, where the environment is being imagined at the same time as the language is being constructed". However, that misses out on a very important distinction in the types of non-natural languages that are actually used in fictional works: those that do not exist as usable languages in the real world, and those that do. I.e., those which actually are fictional, and those which are real, despite being artificially constructed.

Skipping to page 77: Chapter 6, "Design intentions and actual perception of fictional languages: Quenya, Sindarin, and Na'vi" by Bettina Beinhoff, specifies that "fictional languages" are a subset of "constructed languages", being languages constructed for use in fictional works. That's sensible, but when talking about Quenya, Sindarin, and Na'vi in particular--all languages which have been heavily developed and actively used by communities outside of their fictional contexts--it really highlights the inadequacy of this academic terminology.

We also get an explanation of the "Reading" part of the title--in short, it's about the reader's interaction with a text, and how the use of invented languages influences the creative process and the reading experience. Apart from defining terminology, however, Chapter 1 does provide a decent overview of the history of invented languages in fiction and of the contents of the book that follow.

Chapter 2, by David Peterson and Jessie Sams (who has since become Jessie Peterson), explores the nature of working with television and film makers as a language creator. I couldn't possibly do this justice in summary; David and Jessie probably have more experience with film and TV language construction than everyone else in the industry combined, and they certainly know what they're talking about! One complication of working in Hollywood, however, is not unique to working in Hollywood:

A script writer often won’t have heard of language creation and will have no sympathy for someone whose role they don’t understand commenting that the line of dialogue they want to be cut mid-word won’t work in translation because the verb in the conlang comes at the end of the sentence and won’t have been uttered yet if cut off after three words

That's basically the lament of every translator ever! Especially the ones that have to translate dialog for foreign-language editions of novels, movies, and TV shows.

Just from having been active in the conlanging community for a good long time, there was a lot in this chapter that I already knew, even though I could not have articulated it as well as David and Jessie do. But the biggest insight I gained came in an explanation of how the form of a constructed language is constrained by the needs of a film production--and not just in the sense that actors need to be able to use it. Additionally, the language creator needs to be able to translate rapidly, which means they need to construct a language that is easy for them to use without too much practice. I have long thought that Davidsonian languages all seem to have a common sort of character about them, which is partially attributable to David's construction process--but now I can see there's a darn good reason for it, and I can't actually blame him! That's just more reason to work towards getting a greater diversity of language creators into the film industry, so that we can start to see a greater diversity of languages reflecting differences in what is easy for individual creators to use in service of the needs of a film production.

I found Chapter 3, "On the inner workings of language creation: using conlangs to drive reader engagement in fictional worlds", by BenJamin Johnson, Anthony Gutierrez, and Nicolás Matías Campi, to be the most immediately useful to me, and probably to most of the people who read my blog (or at least, the intended audience for the Linguistically Interesting Media Index, which is authors who want to figure out how to do this better!) It's pretty comprehensive, covering why you might want to do this, how to handle collaboration between an author and a conlanger if you don't happen to fill both roles yourself, and some very basic stuff about the mechanics of actually using a conlang in fiction. This is where BenJamin introduces his 5-level categorization of the types of textual representation for conlangs, which I immediately latched onto and began expanding on after seeing the conference presentation that preceded this chapter, as a complement to my own categorization of comprehension-support strategies.

Chapter 4 is a case study in creating dialectal variation in a constructed language. Useful for a language creator, but you're left on your own as far as making use of that variation in your fiction writing. Personally, I think it might be hard to justify, given the difficulty of representing natural language dialects in a non-annoying way in most modern writing. Of course, if you get one of those coveted film jobs, it becomes more practical; see, for example, Paul Frommer being called back to create a new dialect of Na'vi for The Way of Water.

Chapter 5, by Victor Fernandes Andrade and Sebastião Alves Teixeira Lopes, is an exploration of the visual influence of Asian scripts on alien typography in science fiction media. I'm not completely convinced, but the argument is worth reading. They've got interesting data to look over, at least.

I already briefly mentioned Chapter 6; essentially, it determines that the languages studied were perceived as intended on some subjective axes, such as "pleasantness", by a surveyed population, but failed in aesthetic design aims on other axes, and that cultural context is important to aesthetic evaluations. Chapter 7, "The phonaesthetics of constructed languages: results from an online rating experiment" by Christine Mooshammer, Dominique Bobeck, Henrik Hornecker, Kierán Meinhardt, Olga Olina, Marie Christin Walch, and Qiang Xia, is essentially the same thing, just better: it covers a broader selection of conlangs, gathers responses from both English and German speakers, rather than just English speakers from the UK, and controls for gender, age, and linguistic background. They additionally tested listeners' abilities to discriminate between conlangs, as well as their subjective evaluations. This is potentially useful information for conlangers who are trying to target a particular aesthetic effect on a particular audience--however, it also suggests that doing specific research on this isn't really necessary for a creator, as the languages studied were pretty good at achieving their creators' stated goals already!

Chapter 8 "Tolkien’s use of invented languages in The Lord of the Rings" by James K. Tauber is basically exactly what I do on this blog--an analysis of how secondary languages are used in a fictional work to augment the narrative! I've avoided doing this sort of analysis on The Lord of the Rings myself because it is a Very Large Work, so I'll definitely be coming back to this chapter to see what I can integrate into my own analytical system later.

Chapter 9, "Changing tastes: reading the cannibalese of Charles Dickens' Holiday Romance and nineteenth-century popular culture" by Katie Wales, analyses the representation of a truly fictional language--one which does not exist as a developed and usable language in the real world--in terms of the sociological environment in which it was published, and how the tastes of modern audiences, and thus the appropriate means of cultural representation, have changed over time. It is a reminder that appreciating old literature often requires being intentional about not imposing modern points of view and modern judgments on people of the past, and trying to understand the literature as it would've been read by its original intended audience.

Chapter 10, "Dialectal extrapolation as a literary experiment in Aldiss' 'A spot of Konfrontation'" by Israel A. C. Noletto, reads like a pretty standard sample of Dr. Noletto's work; he's the only academic author represented in this volume with whom I have a prior acquaintance, such that I can compare his other work! Noletto argues that "the presence of an unfamiliar fictional language interlaced with English as the narrative medium does not necessarily constitute a barrier to understanding as might otherwise be expected", and that the use of the extrapolated dialect in fact serves as an important means of conveying the theme of the story through narrative style. There's also a little bit of my sort of detailed analysis of the text, to show how it is constructed to support comprehension.

Chapter 11 "Women, fire, and dystopian things" by Jessica Norledge examines the successes, failures, and impact of Suzette Haden Elgin's Láadan language as a language for a dystopia--and particularly as a language meant to expand the user's capacity for thought, in contrast to other dystopian languages, like 1984's Newspeak, which are intended to restrict thought in a Whorfian fashion. The title is of course a reference to George Lakoff's Women, Fire, and Dangerous Things.

Chapter 12, "Building the conomasticon: names and naming in fictional worlds" by Rebecca Gregory, is a broad survey of how names are constructed and reflect language and culture--or fail to do so--in a variety of fictional works. She ends with "a bid for names to be seen as just as fundamental a part of language creation and conceptualisation as any other of language's building blocks", which I can only read as a plea to academics doing literary analysis, not language creators or authors, given the broad recognition that already exists in the conlanging community of "naming languages" as a thing that is useful in worldbuilding for fiction across many types of media.

Chapter 13, "The language of Lapine in Watership Down" by Kimberley Pager-McClymont, analyses the idioms, conceptual patterns, and attested formal structure of the Lapine language, how it is connected to the embodied experience of rabbits, and how it thus contributes to generating empathy in the reader for non-human protagonists. An excellent case study to reference for conlangers who want inspiration on developing the connection between language and culture, and especially for those working on non-human languages.

The final chapter, 14, "Unspeakable languages" by Peter Stockwell, presents another case where my intuitions clash with the chosen terminology. Stockwell examines languages which are difficult or impossible to represent directly in the narrative--i.e., a subset of truly fictional languages which necessarily remain fictional for practical reasons related to their asserted nature, not merely because the author didn't bother to flesh them out. Stockwell introduces the term "nonlang" for what I would simply call a fictional language. Terminological disputes aside, though, this chapter presents an intriguing overview of how science fiction works have dealt with the concept of the "linguistically ineffable"--languages which we can never hope to decipher or understand. The only quibble I have with the actual content is Stockwell's claim that "it is evident that the pragmatics of a question and an exclamation are still carried even in Speedtalk by intonation (marked here by ‘?’ and ‘!’)"--but that is an unwarranted conclusion based on the evidence presented, as intonation is definitely not evident on the page, and we should not assume that the use of '?' and '!' in the text actually corresponds to intonation contours in the fictional spoken form--or, if they do, that the intonation contours so indicated actually correspond to questions and exclamations, given that the Speedtalk text is untranslated and explicitly not understood by the character transcribing it.

Overall: I have some complaints, and not all chapters are of equal quality or usefulness from my point of view--but there is plenty of good stuff in here that makes it worth a read, and I for one am strongly in favor of further, perhaps more intentional, collaborations between academics and conlangers in analyzing the use of constructed languages in fiction.



Saturday, January 20, 2024

Describing Non-human Vision

Thanks to LangTime Studio creating languages for a lot of mammals with dichromatic vision, a few years ago I did a good bit of research into how visual perception varies between different species. The issue of non-human vision came up again yesterday in George Corley's (of Conlangery fame) latest Draconic language stream, so I dug up some old notes on how to describe colors that you can't see. And in fact, this isn't just useful for conlangers trying to come up with vocabulary for a non-human language; this is good information for fantasy and sci-fi writers, too!

Since I started out with researching rabbits... let's talk about rabbits. It turns out that rabbit vision differs from human vision in just about every way that tetrapod vision can, so it makes an excellent case study. Rabbits have 2 types of color-receptive cone cells, corresponding to peak sensitivities in the green and blue ranges, and one rod cell type. I.e., they are dichromats, like most mammals. Rods don't contribute to color differentiation, so we can ignore those. At first glance, this seems similar to human red-green color blindness, except the peak sensitivities of the rabbit green cone and the red/green cones of a deuteranopic human are not in the same place! This is the first area in which human and non-human visual perception can differ--even other trichromats (e.g., penguins, honeybees) may not have the same spectral sensitivities as humans, and so see completely different color distinctions than we do. The rabbit cone sensitivities are shifted downward to a 509nm peak, compared to the human green cones, which peak at 530nm, and red cones, which peak at 560nm. Thus, not only can rabbits not distinguish red from green, but everything on the red end of the spectrum appears much dimmer than it would to a human, due to the weaker response of the long-wavelength cones to those spectral colors. Note, however, that not having separate cones for red and green does not mean that rabbits (or dogs, for that matter) would always see things-we-perceive-as-red and things-we-perceive-as-green as indistinguishable--it depends on the actual spectral signature of each object. For example, where we perceive two objects as having equal perceptual brightness but different hue, rabbits might perceive identical hue but lower perceptual brightness for the red object compared to the green.
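To make that dimming effect concrete, here's a minimal sketch (in Python) of the comparison--with the caveat that the Gaussian curve shape and the 40nm width are my own simplifying assumptions; real photopigment sensitivity curves are asymmetric, and you'd want proper nomogram templates for serious work:

```python
import numpy as np

def cone_response(wavelength_nm, peak_nm, width_nm=40.0):
    # Crude Gaussian stand-in for a photopigment sensitivity curve;
    # real nomograms are asymmetric, but this shows the qualitative effect.
    return np.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)

# Peaks from the text: rabbit green cone ~509nm; human M ~530nm, L ~560nm.
for wl in (530, 620):  # a green stimulus and a red stimulus
    rabbit = cone_response(wl, 509)
    human = cone_response(wl, 530) + cone_response(wl, 560)
    print(f"{wl}nm -> rabbit: {rabbit:.3f}, human M+L: {human:.3f}")
```

With these toy curves, the 620nm "red" stimulus drives the rabbit cone to only about 2% of its green-stimulus response, versus about 23% for the combined human M+L response--red things really do just look dim to the rabbit.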

Much like humans have an anomalous blue response in our red cones, which causes us to conflate purple (red+blue) and violet (a spectral color, extreme blue), rabbit and rat green cones also have a sensitivity peak in the ultraviolet. Initially, I assumed that, unlike with the human anomalous blue response, this would have no perceptual consequences, because UV light would be blocked by the structures of the eye, as it is in humans; however, while talking with a sci-fi writer friend of mine about non-human vision last night (as ya do, y'know), when I mentioned that rabbit and rat green-cone pigments have a weird bi-stable response to UV light, but UV is absorbed by mammalian eye tissue, so it's probably just a random non-conserved evolutionary quirk... he noted that UV is absorbed by primate eye tissue, but had I actually explicitly checked on rabbits? And I had not. So I did. And it turns out that lapine corneal, lens, and vitreous humor tissues are considerably more transparent to near-UV light than human eye tissues are. Now, nobody (that I have been able to find) is actually saying outright that rabbits (or rats) can see UV... but rabbits might actually be able to see UV. If they can, it would be indistinguishable to them from green (not blue!). If it was not already clear from the shifted sensitivity peaks, I think that should highlight the impossibility of just taking, e.g., a JPEG image captured with equipment built for humans and transforming it into an accurate representation of what some other animal would see--if nothing else, the UV information would be completely missing!

Incidentally, if rabbits are UV-sensitive, the bistable nature of the UV response in their green cones means that they would actually be more strongly sensitive to UV in the dark than they are under daytime illumination. I have no idea what to make of that, as there isn't really a whole lot of environmental UV going around at night or in tunnels... but it's a quirk you can keep in mind as a possibility for fictional creatures. In general, just note that spectral response can vary under different environmental conditions; humans lose the ability to distinguish color entirely in low-light conditions (and your brain lies to you to fill in the colors that you believe things should be), but things can be more complicated than that.

Another interesting feature of rabbit eyesight is that they have a much less dense foveal region than humans (so less effective resolution), and their color-sensitive cells are not evenly distributed--there is a thin band with a mixture of both green and blue cones, with blue cones concentrated at the bottom of the retina (corresponding to the top of the visual field) and green cones concentrated at the top (corresponding to the bottom of the visual field). I.e., their vision along the horizon is in color, but the top and bottom extents of their visual fields are black and white, and specialized for better spectral response to the most common wavelengths of light coming from those directions--blue from the sky, green from the ground. This isn't too different from human peripheral vision (where color information is inferred by the brain, not actually present in the raw retinal output), except that in rabbits different parts of the peripheral fields actually have a different peak spectral response! In wild rabbits, this is probably just an adaptation for extracting the maximum information from a predominantly-blue-background sky and a predominantly-green(/red)-background ground, but intelligent rabbits could theoretically learn to extract additional color information (e.g., distinguishing monochromatic white from dichromatic white) from an object by wiggling their eyes up and down or tilting their heads to put it in different parts of the visual field. Or not, if their brains just fill in missing color information automatically like ours do.... But if you want to write about a creature that can do that, by authorial fiat, it could have a whole auxiliary class of color words, analogous to pattern words like "speckled" or "sparkly", to describe objects that have different appearances in different parts of the visual field.

But, if we abstract away from physiological perceptual abilities, what would their experience of color space be like? Tetrapod retinas pre-process raw cone-cell signals into antagonistic opponent channels before color information gets sent to the brain; i.e., what your visual cortex has access to is not the original cone cell activations, but sums and differences of the activations of multiple types of cone cells. In human eyes, that means our brains see color coming down the optic nerves as a combination of red-vs.-green and blue-vs.-yellow signals--even though yellow isn't actually a physiological primary color! In dichromats like rabbits, the two raw spectral signals (green and blue) are still processed by an antagonistic opponent system in the retinal ganglia; thus, just like we can't perceive the impossible colors "reddish green" or "yellowish blue", they cannot have any perception of a distinct blue-green mixture--dim dichromatic light at both spectral peaks will look exactly the same as bright monochromatic light exactly in between, which will be indistinguishable from white. In effect, the loss of one cone type compared to humans reduces the color space from 3 dimensions to 2, and the perceptual dimension that is lost after ganglial processing is that of saturation.
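If you want to play with that metamerism claim numerically, here's a tiny sketch of dichromat opponent processing. It treats light as a list of (wavelength, power) spectral lines and uses the same toy Gaussian cones as above; note that the 425nm blue-cone peak is a placeholder value I picked for illustration, not a measured rabbit figure:

```python
import numpy as np

# The 509nm green-cone peak comes from the text; the 425nm blue-cone peak and
# 40nm Gaussian width are placeholder assumptions for illustration only.
G_PEAK, B_PEAK, WIDTH = 509.0, 425.0, 40.0

def activation(spectrum, peak):
    # Total activation of one cone type from a list of (wavelength, power) lines.
    return sum(p * np.exp(-0.5 * ((wl - peak) / WIDTH) ** 2) for wl, p in spectrum)

def opponent(spectrum):
    g, b = activation(spectrum, G_PEAK), activation(spectrum, B_PEAK)
    return g + b, g - b  # luminance, and the single chroma channel

dichromatic = [(G_PEAK, 1.0), (B_PEAK, 1.0)]  # dim light at both peaks
mid = (G_PEAK + B_PEAK) / 2                   # a wavelength exactly in between
scale = activation(dichromatic, G_PEAK) / activation([(mid, 1.0)], G_PEAK)
monochromatic = [(mid, scale)]                # brighter light at one wavelength

print(opponent(dichromatic))    # ~(2.22, 0.0)
print(opponent(monochromatic))  # the identical pair: the two lights are metamers
```

The two physically different spectra produce identical (luminance, chroma) pairs after opponent processing--exactly the blue-green/white conflation described above.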

The lapine color space is thus defined by a 2D, triangular range with black at one vertex, white (or whatever you want to call it) at the center of the opposite edge, and pure green and pure blue at the remaining vertices. The hue and saturation axes are the same, with green fading into white and then white fading into blue.



If the most basic colors are defined by the extrema of the opponent-process space, as they are for humans, there should be 3 basic colors, corresponding to black, blue, and green. White would be the natural next step, followed perhaps by light and dark shades of blue and green. Or you could call the green extremum "yellow" instead, as the long-wavelength cone still has sensitivity into the yellow and red ranges of the spectrum, even though its peak is in green, as I have done in the image above. Fundamentally, the 3D human color space and 2D dichromat color spaces are mathematically incommensurate, so all human-perceptible representations involve some arbitrary choices anyway. Treating the long-wavelength end as "yellow" rather than "red" is convenient if you want to do something like copying the Old Norse poetic convention of treating blood and gold as being the same color. :)

We can squish and stretch that gamut to get a representation of the dichromat color wheel, with saturation on the radial axis and the merged hue/brightness dimension running around the circumference:


And the sort of Cartesian representation that an intelligent dichromat graphic designer would use to pick out colors in a computer graphics program:


Keep in mind that the actual colors used in these illustrations are completely arbitrary, aside from being "towards the long-wavelength end" vs. "towards the short-wavelength end". What matters is just the set of possible distinctions. Figuring out exactly what lapine color any particular object corresponds to would require recording the actual emission spectrum of that object, and then mapping it into the rabbit color space--and being dichromatic does not merely mean that they see a subset of the colors that we can see; the available distinctions are different. E.g., of two objects which look identically purple to a human, one may be monochromatic in the violet spectral range while the other is dichromatic, with light in the blue and red ranges; those two objects will look distinct to a rabbit--the first being obviously pure blue, the second being light blue or white.
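Continuing the toy model from above (same assumed Gaussian cones, same placeholder 425nm blue peak), we can check the purple example numerically:

```python
import numpy as np

# Same toy cones as before: 509nm from the text; 425nm and the width assumed.
G_PEAK, B_PEAK, WIDTH = 509.0, 425.0, 40.0

def rabbit_color(spectrum):
    g = sum(p * np.exp(-0.5 * ((wl - G_PEAK) / WIDTH) ** 2) for wl, p in spectrum)
    b = sum(p * np.exp(-0.5 * ((wl - B_PEAK) / WIDTH) ** 2) for wl, p in spectrum)
    lum = g + b
    return lum, (g - b) / lum  # brightness, plus position on the single hue line

violet   = [(420.0, 1.0)]                 # monochromatic "purple"
red_blue = [(620.0, 1.0), (450.0, 1.0)]   # dichromatic "purple"
print(rabbit_color(violet))    # chroma/luminance ~ -0.84: strongly blue
print(rabbit_color(red_blue))  # chroma/luminance ~ -0.39: much closer to white
```

Two objects a human would lump together as "purple" land at visibly different points on the rabbit's blue-white-green line.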

So, that's dichromatism... what about tetrachromatism, or higher? My best reference on this subject is this absolutely lovely article: Ways of Coloring: Comparative Color Vision as a Case Study for Cognitive Science, which contains descriptions of comparative color spaces for humans, bees (also trichromats, but with different frequency response), goldfish, turtles (both of which are tetrachromats), and pigeons (suspected pentachromats). And it has an excellent statement of what the problem actually is:
It is important to realize that such an increase in chromatic dimensionality does not mean that pigeons exhibit greater sensitivity to the monochromatic hues that we see. For example, we should not suppose that since the hue discrimination of the pigeon is best around 600nm, and since we see a 600nm stimulus as orange, pigeons are better at discriminating spectral hues of orange than we are. Indeed, we have reason to believe that such a mapping of our hue terms onto the pigeon would be an error: [...] 
Among other things, this result strongly emphasizes how misleading it may be to use human hue designations to describe color vision in non-human species. This point can be made even more forcefully, however, when it is a difference in the dimensionality of color vision that we are considering. An increase in the dimensionality of color vision indicates a fundamentally different kind of color space. We are familiar with trichromatic color spaces such as our own, which require three independent axes for their specification, given either as receptor activation or as color channels. A tetrachromatic color space obviously requires four dimensions for its specification. It is thus an example of what can be called a color hyperspace. The difference between a tetrachromatic and a trichromatic color space is therefore not like the difference between two trichromatic color spaces: The former two color spaces are incommensurable in a precise mathematical sense, for there is no way to map the kinds of distinctions available in four dimensions into the kinds of distinctions available in three dimensions without remainder. One might object that such incommensurability does not prevent one from “projecting” the higher-dimensional space onto the lower; hence the difference in dimensionality simply means that the higher space contains more perceptual content than the lower. Such an interpretation, however, begs the fundamental question of how one is to choose to “project” the higher space onto the lower. Because the spaces are not isomorphic, there is no unique projection relation.

It is also the case that lower-dimensional color spaces, such as those of dogs or rabbits (both dichromats, but in slightly different ways), are incommensurate with our 3D color space, in exactly the same way that our 3D color space is incommensurate with the higher-dimensional perceptions of a pigeon, turtle, or goldfish, and likewise have no unique projections. Thus, visualizations of how your dog or cat sees things are always only approximations--we can try to recreate the kinds of distinctions relevant to a dichromatic animal within our own color space, but we will always experience it differently.

A common feature of all of the systems described is the production of a combined luminance channel from the raw n-dimensional cone-cell inputs, plus n-1 oppositional chroma channels--in humans, these are the red-green and blue-yellow oppositions, which produce a two-dimensional neurological color space orthogonal to the luminosity axis. The YUV family of color spaces (used for analog color TV transmission, and in digital form as YCbCr) arises from representing the two chromatic dimensions directly in Cartesian coordinates. Saturation arises as the radial dimension--distance from the white-black axis--in a polar transformation of this oppositional color space to produce the trichromat color wheel, with hue arising as the angular coordinate. Trichromat color spaces for different species can vary both in their precise spectral sensitivities, and in how the oppositional chroma channels are generated in the retina; i.e., instead of an RG-B opposition, where the R and G physical channels combine to produce Y, there can also be an R-GB opposition, where G and B combine to produce cyan. For us, there's no such thing as reddish-green (nor blueish-yellow), because yellow comes in between, but we do have blueish-green. For that other sort of trichromat, reddish-green would make perfect sense, but blueish-green and reddish-cyan would be impossible to perceive instead.
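As a sketch of that pipeline, here's roughly what the Cartesian-opponents-to-polar transformation looks like for a human-style trichromat. The channel weights here are illustrative stand-ins, not the actual retinal weightings:

```python
import numpy as np

def opponents(l, m, s):
    # Illustrative channel weights only -- not the real retinal weightings.
    lum = l + m            # luminance (S-cones contribute little to it in humans)
    rg = l - m             # red vs. green
    yb = (l + m) / 2 - s   # yellow (L+M) vs. blue
    return lum, rg, yb

def hue_saturation(rg, yb):
    # Polar transform of the 2D chroma plane: angle = hue, radius = saturation.
    return np.degrees(np.arctan2(yb, rg)) % 360, np.hypot(rg, yb)

lum, rg, yb = opponents(0.9, 0.4, 0.1)  # a reddish stimulus
print(lum, hue_saturation(rg, yb))      # brightness, (hue angle, saturation)
```

The R-GB alternative mentioned above would just swap in different sums and differences for the two chroma channels; the polar structure on top stays the same.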

Monochromatic vision is pretty easy to understand--it's just black-and-white / greyscale--luminosity is the only dimension, and leaves zero additional channels for chroma information. As illustrated above, in dichromat vision, the equivalent of the trichromatic color "wheel" is just a line--the radial dimension is not meaningfully distinct from the single linear chromatic dimension, and while we require an additional axis to represent brightness, the dichromat color wheel really does represent every color they can possibly see. As a result, "saturation" and "hue" (or, alternatively, brightness and hue) are indistinguishable to dichromats, and grey (or white, depending on whether you represent the space as a triangular gamut or a Cartesian diamond) is a spectral color. There are only two primary colors (or 4, if you count white and black), and no secondary colors.

In higher-dimensional color spaces, as determined by discrimination experiments on tetrachromatic and pentachromatic organisms, we still see the generation of oppositional color channels by retinal processing. How to generate these oppositional channels, however, is not obvious a priori; for example, in humans one opposition is between red and green, both of which are primary colors, but the other is between blue, a primary color, and yellow, a composite--and, as mentioned above, that could be reversed in a different species with different specific spectral sensitivities. But why that particular combination for us?

It turns out that, across different species, opponent channels are constructed to maximize decorrelation--in other words, to remove the redundant information caused by the overlapping response curves of different receptor types. Thus, the precise method of calculating color channels will be slightly different for each species, dependent on the physical characteristics of the retinal cells, but they are all qualitatively the same kind of signal, and end up producing a higher-dimensional chroma space orthogonal to the white-black luminosity axis. However, there's pretty good reason to believe that this is a convergently-evolved process to maximize visual acuity (except in some specific circumstances, like mantis shrimp), so this analysis of color perception plausibly applies universally, to most kinds of weird aliens you might come up with, so long as they have eyes at all. Effectively, the retinal ganglia are performing Principal Component Analysis to turn "list of specific frequency activations" information into "total luminosity vs. list of chroma components" information.
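You can watch that decorrelation fall out of the math with a quick experiment: generate a pile of smooth random stand-in "natural" spectra, push them through three overlapping toy cone curves, and run PCA on the resulting activations (all of the curves, widths, and spectra here are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
wl = np.linspace(380, 700, 161)
# Illustrative human-like cone peaks with assumed Gaussian shapes.
cones = np.stack([np.exp(-0.5 * ((wl - p) / 40.0) ** 2)
                  for p in (440.0, 530.0, 560.0)])

# Stand-in "natural" spectra: smooth random curves built from broad bumps.
spectra = np.zeros((2000, wl.size))
for i in range(2000):
    for _ in range(4):
        c, w, a = rng.uniform(380, 700), rng.uniform(40, 120), rng.uniform(0, 1)
        spectra[i] += a * np.exp(-0.5 * ((wl - c) / w) ** 2)

acts = spectra @ cones.T  # receptor activations: heavily correlated
eigvals, eigvecs = np.linalg.eigh(np.cov(acts, rowvar=False))
for v in eigvecs.T[::-1]:  # strongest principal component first
    print(np.round(v, 2))
# Expect the first vector to be single-signed (a luminance-like sum) and the
# rest to mix signs (opponent-like differences).
```

The first principal component weights all the cones with the same sign--a luminance channel--while the remaining components mix positive and negative weights, i.e., opponent channels, just as the retina does.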

Meanwhile, in any such neurological color space, there is only ever a single radial coordinate. Trichromatic vision is kind of special in that it is the first dimensionality at which chroma can be split into saturation and hue components. At higher dimensionalities, the hue space gets more complex, but we can say with some confidence that the extra dimensions introduced in higher-dimensional perceptual color spaces are not some extra sort of radial-coordinate saturation or any kind of weird third thing, but are in fact additional dimensions of hue--and along with extra dimensions of hue, qualitatively different kinds of composite colors!

Monochromats don't have any color. Dichromats don't have any secondary colors--just the spectral colors, which, strangely to us, include white/grey. Our three-dimensional human color space allows us to perceive two opponent channels, corresponding to 4 pure hues--red, yellow, green, and blue--and weighted binary combinations thereof that give rise to the secondary colors--r+y (orange), y+g (chartreuse?), g+b (cyan), and b+r (magenta), of which one is a non-spectral hue (magenta). Non-spectral colors derive from simultaneous activation of cones with non-adjacent response peaks, and with three cones, there's only one such possibility. Meanwhile, a tetrachromatic system would have 3 opponent axes with 6 basic hues (r-g, y-b, and the new p-q), binary combinations of those hues with their non-opponents producing 12 secondary colors (r+y, r+b, r+p, r+q, g+y, g+b, g+p, g+q, y+p, y+q, b+p and b+q), and ternary combinations producing 8 extremal instances of an entirely new kind of hue--tertiary colors--not found in the perceptual structure of trichromatic color space (r+y+p, r+y+q, r+b+p, r+b+q, g+y+p, g+y+q, g+b+p, g+b+q), just as our secondary colors are not found in the dichromatic space. Additionally, there is not merely one non-spectral secondary color (magenta) in the fully-saturated hue space, but 3--and in general, that number will correspond to however many pairs of non-spectrally-adjacent sensor types there are (which actually works out to the sequence of triangular numbers!). If we assume that r, g, b, and q are the physiological primaries (note that the spectral locations of y and p depend on the decorrelation output for a specific set of 4 receptors with species-specific sensitivities), then the non-spectral secondaries are r+b, r+q, and g+q. All of the tertiary colors are non-spectral.
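That combinatory math is easy to mechanize. Here's a little counting function following the scheme just described--n-1 opponent axes, with a k-ary composite hue picking k distinct axes and one pole from each, and non-adjacent receptor pairs yielding the non-spectral secondaries:

```python
from math import comb

def hue_inventory(n_cones):
    # n-1 opponent axes, two pure hues per axis; a k-ary composite hue picks
    # k distinct axes and one pole from each.
    axes = n_cones - 1
    composites = {k: comb(axes, k) * 2 ** k for k in range(1, axes + 1)}
    # Non-spectral secondaries: receptor pairs with non-adjacent peaks,
    # C(n,2) - (n-1), i.e. the triangular numbers.
    nonspectral = comb(n_cones, 2) - (n_cones - 1)
    return composites, nonspectral

for n in (2, 3, 4, 5):
    print(n, "cones:", hue_inventory(n))
# 3 cones -> ({1: 4, 2: 4}, 1): our 4 pure hues, 4 secondaries, and magenta
# 4 cones -> ({1: 6, 2: 12, 3: 8}, 3): exactly the counts given above
```

For 5 cones (pigeons), this predicts 8 pure hues, 24 secondaries, 32 tertiaries, 16 quaternaries, and 6 non-spectral secondaries--a ready-made list of focal colors to assign words to.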

Ultimate writer takeaway: you may not be able to intuitively understand what non-human color experiences are like, but you can make some arbitrary implicit decisions about retinal physiology (i.e., just decide where you want the opponent colors to appear along the spectrum), do some basic combinatory math, and then you have a list of descriptions of basic focal colors that you can assign words to--or, if you want to be a little more realistic, assign words to ranges around those focal colors, which you can precisely mathematically describe. This gets more complicated at higher dimensionalities (like pigeons' pentachromatic color space), but tetrachromacy is kind of convenient because you still have only 2 dimensions of hue, so you can actually diagram out what the color regions are, and just tell people "y'all already know how brightness and saturation work, so I don't need to put those on the chart".

Someday, I aspire to have a program where you can input the physiological frequency response curves for an arbitrary organism, along with a spectrum, and it'll give you a mathematical description of the perceptual color that that combination would produce. But till then, you'll just have to do your best at guessing what the aliens and monsters and anthropomorphic animals see whenever a human thinks something is a particular color--but guess informedly, knowing what the structure of their color spaces is like!
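The core of such a program is actually not that much code. Here's a bare-bones sketch of what it might look like, using plain PCA on the receptor overlap matrix as a stand-in for whatever decorrelation a real retina performs; the rabbit curves and "leaf" spectrum below are, as ever, invented toy inputs:

```python
import numpy as np

def perceptual_color(spectrum, sensitivities):
    # sensitivities: (n_receptors, n_wavelengths); spectrum: (n_wavelengths,).
    # Plain PCA on the receptor overlap matrix stands in for whatever
    # decorrelating opponent wiring a real retina evolves.
    acts = sensitivities @ spectrum            # raw receptor activations
    overlap = sensitivities @ sensitivities.T  # how much the curves correlate
    _, vecs = np.linalg.eigh(overlap)
    coords = vecs.T[::-1] @ acts               # strongest component first
    return coords[0], coords[1:]               # ~luminance, ~chroma channels

wl = np.linspace(350.0, 700.0, 176)
# Toy rabbit: assumed 425nm blue peak, 509nm green peak, Gaussian shapes.
rabbit = np.stack([np.exp(-0.5 * ((wl - p) / 40.0) ** 2) for p in (425.0, 509.0)])
leaf = np.exp(-0.5 * ((wl - 550.0) / 30.0) ** 2)  # invented "green leaf" spectrum
print(perceptual_color(leaf, rabbit))
```

The hard part isn't this arithmetic--it's getting real sensitivity curves and real object spectra to feed into it.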

P.S. What was that about mantis shrimp? Well, mantis shrimp have 16 different light receptor types, of which 12 are color receptors, which kinda suggests that they should have a 12-dimensional color space with 10 dimensions of hue. But... empirically, that's not what happens. Experimentally, they don't actually have all of those different color categories, or a particularly fine capacity for spectral distinction. Rather, they have a large number of different receptor types so that they can identify spectral colors at high speed, without doing any retinal pre-processing--chartreuse cone fires? Cool, that's a chartreuse thing! No need to bother with opponent processing! These kinds of extremely high-dimensional visual systems might end up working more like our senses of smell or taste than like our perception of color. However, there's also another aspect of mantis shrimp vision that's outside of color perception (and not entirely unique to mantis shrimp, either): they can see polarization (hence the 4 visual receptor types that aren't for color, rather than just 1). This ability is comparatively easy to imagine and describe--it's an overlay of geometric information that tells you "not only does this light have a particular color, it is also oriented in a particular way". Mantis shrimp are, however, unique in being able to distinguish circularly polarized light; other creatures with polarization sensitivity are unable to tell circularly polarized light from unpolarized light.