Sunday, October 20, 2024

On the Tjugem Alphabet & Font

This Bluesky thread with Howard Tayler reminded me that, although I posted progress updates about it on Twitter back in the day, I never did a comprehensive write-up on how the thing works.

    A good place to start is this Reddit comment on Toki Suli. Yeah, it's not Tjugem, but phonetically it works the same way. Quote:

in the WAV files, the 'm' sounds seem to be going up rather than down, such as with "mi", even though the "m" is supposed to be grave. sharp and acute sounds seem to go down rather than up, such as in "tu".

is the linguistic term for "downward" vs "upward" the opposite of what i'd expect from a western music theory perspective? or am i maybe missing something as i'm listening to the files?

    Yes, Reddit user, you were missing something! Because in the phonetics of human whistle registers, "grave" and "acute" are positions, not motions. So, if you move from a vowel to a grave consonant, the formant will go down in pitch--from a middle-pitch vowel locus to a low-pitch consonant locus. But when going from a grave consonant to a vowel, pitch will go up--from a low-pitch consonant locus to a middle-pitch vowel locus. An "m" in between two vowels will be realized by a down-then-up formant motion, while a "t" between two vowels will be realized by an up-then-down motion.
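If it helps to see the "positions, not motions" idea spelled out, here is a minimal Python sketch. The numeric locus values are made up purely for illustration (only their ordering matters), and only the two consonants discussed above are included.

```python
# Hypothetical relative pitch positions: the vowel locus sits in the middle,
# a grave consonant ("m") below it, an acute consonant ("t") above it.
VOWEL_LOCUS = 0
CONSONANT_LOCUS = {"m": -1, "t": +1}  # illustrative values, not measurements


def contour(consonant):
    """Pitch positions visited by a vowel-consonant-vowel sequence."""
    return [VOWEL_LOCUS, CONSONANT_LOCUS[consonant], VOWEL_LOCUS]


def describe(path):
    """Turn a list of positions into the motions between them."""
    steps = ["down" if b < a else "up" for a, b in zip(path, path[1:])]
    return "-then-".join(steps)


print(describe(contour("m")))  # down-then-up
print(describe(contour("t")))  # up-then-down
```

The consonant itself is just a target position; the direction you hear depends entirely on which side of it you are approaching from.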

    Now, because whistled speech only has a single formant, it turns out to be not-unreasonable to write whistled speech as an image of the formant path on a spectrogram. You can just write a continuous line with a pen! Or, almost. There are some details--like amplitude variation--that are lost if you try to write with a ballpoint, and still difficult to get right if you write with a wide-tip marker or fountain pen. Thus, a few extra embellishments and decorations are useful, but that is the basic concept: each letter is just the shape that that letter makes on a spectrogram when pronounced. And with just that background, you should be able to start to make sense of this chart of Tjugem letters, as they would be written on lined paper:


    The correspondence between Tjugem glyphs and the standard romanization is as follows:

   
    Keep in mind, however, that the actual phonemes are whistles--not sounds that are representable with the IPA, despite the fact that the romanization is designed to be pronounceable "normally" if you really want to. And for the sake of space, only the allographs for one vowel environment are shown for each consonant. The G glyph is not so much a "glyph" as a lack of one, which is why it does not show up in the first image; acoustically, the phoneme is just a reduction in the amplitude of a vowel, represented by a break in the line. Thus, any line termination could be interpreted as a G. That necessitated the introduction of the line termination glyphs, which have no phonetic value but just indicate that a word ends with no phonemic consonant. The above-line vs. below-line variants of the Q glyph are chosen to visually balance what comes before or after them. Additionally, the "schwa" vowel (romanized as "E") is not represented by any specific glyph. The existence of a schwa sound in the first place is an unavoidable artifact of the fact that transitioning between certain consonants requires moving through the vowel space, but which vowel loci end up being hit isn't actually important. So, in the Tjugem script, the schwa just turns into whatever stroke happens to make the simplest connection between adjacent consonants.

    You shouldn't be expected to always be writing on lined paper, which explains the extra lines--a mark above or below a vowel segment tells you whether it is a high vowel or a low vowel, for those curves which could be ambiguous. And the circular embellishments help to distinguish manner of articulation for different consonants, which have the same spectral shape but different amplitude curves, which would otherwise have to be indicated by varying darkness or line weight. But note in particular that every consonant comes in a pair of mirror-symmetric glyphs: one moving from the vowel space to the consonant locus, and one moving from the consonant locus to the vowel space. And there are three different strokes for each half-consonant depending on which vowel is next to it! Making for a total of six different strokes for every consonant, because the actual spectral shapes of consonants change depending on their environment! It's allophony directly mirrored in allography.

    This makes creating a font for Tjugem rather... complicated. Sure, we could assign every allograph to a different codepoint, but that would be very inconvenient to use. It would be nice if we could just type out a sequence of phonemes, one keystroke per phoneme, and have the font take care of the allographic variation for us! Is that sort of thing possible? Yes! Yes, it is!

    The individual letter forms get assigned to a list of display symbols, specifying every possible consonant/vowel pairing:
# i_t i_d i_n i_k i_g i_q i_p i_b i_m
# a_t a_d a_n w_a_k j_a_k a_g w_a_q j_a_q a_p a_b a_m
# u_t u_d u_n u_k u_g u_q u_p u_b u_m
# t_i d_i n_i k_i g_i q_i p_i b_i m_i
# t_a d_a n_a k_a g_a q_a p_a b_a m_a
# t_u d_u n_u k_u g_u q_u p_u b_u m_u
# i_i j_a j_u_a j_u
# u_u w_a w_i_a w_i

and the slots for the romanized letters that we actually type out (a b d e g i j k m n p q t u w) are left blank. Contextual ligatures are then used to replace the sequence of input phonemes with an expanded sequence of intermediate initial, final, and transitional symbols, which are in turn substituted by the appropriate display symbols, which are used to look up the correct alloglyphs. Then, if we update the boring straight-ruled glyph set with a slanted, more flowy-looking version, we get a calligraphic font slightly reminiscent of Nastaliq, where lines can overlap each other because the ornamentation disambiguates; the Tjugem Tadpole script:
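Going back to the substitution step for a moment: the font does this with contextual substitution rules, but the basic idea can be simulated in a few lines of Python. This is only a rough sketch under simplifying assumptions. It skips the intermediate symbols entirely, and it does not handle glides (j, w), schwa (e), line terminators, or the w_a_k / j_a_k style variants. The function name `expand` is made up for illustration.

```python
VOWELS = "iau"
CONSONANTS = "tdnkgqpbm"

# Consonant/vowel display symbols copied from the list above
# (the glide rows are left out of this sketch).
DISPLAY = set("""
i_t i_d i_n i_k i_g i_q i_p i_b i_m
a_t a_d a_n w_a_k j_a_k a_g w_a_q j_a_q a_p a_b a_m
u_t u_d u_n u_k u_g u_q u_p u_b u_m
t_i d_i n_i k_i g_i q_i p_i b_i m_i
t_a d_a n_a k_a g_a q_a p_a b_a m_a
t_u d_u n_u k_u g_u q_u p_u b_u m_u
""".split())


def expand(word):
    """Map typed phonemes to one display symbol per consonant/vowel boundary."""
    symbols = []
    for left, right in zip(word, word[1:]):
        cv = left in CONSONANTS and right in VOWELS
        vc = left in VOWELS and right in CONSONANTS
        if not (cv or vc):
            raise ValueError(f"context {left!r}+{right!r} is outside this sketch")
        name = f"{left}_{right}"
        if name not in DISPLAY:
            # e.g. a+k, which needs w_a_k or j_a_k depending on wider context
            raise ValueError(f"{name} needs a context-dependent variant")
        symbols.append(name)
    return symbols


print(expand("tami"))  # ['t_a', 'a_m', 'm_i']
```

Each typed consonant ends up contributing two half-strokes (one arriving, one leaving), which is why one keystroke per phoneme is enough even though there are six strokes per consonant.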



A Brief Note on John Wick

The actual Russian dialog in the John Wick movies is, uh... not great? But, the fact that John Wick is diegetically fluent in Russian ends up kicking off the plot of the first movie, when Russian gangster Iosef tries to buy John's car. Iosef asks how much, John says it ain't for sale, then, from the script:

                                              IOSEF
                         (in Russian, subtitled)
                     Everything's got a f[*****]g price.
                         
                                              JOHN
                         (in Russian, subtitled)
                     Maybe so... but I don't.

          Taken aback by John's fluency, he watches as John enters the
          vehicle, guns the engine, and drives off.

(Censored for sensitive eyes.)

However, that's not actually how it was filmed! The Russian dialog for that scene in the movie is as follows (or at least, my interpretation of it; the pronunciations are bad):

                                              IOSEF
                     У всего, сука, своя цена.
                         
                                              JOHN
                     А у этой суки нету.

This is closed-captioned as

                                              IOSEF
                     Everything's got a price, b[***]h.
                         
                                              JOHN
                     Not this b[***]h.

Which is not word-for-word, but essentially accurate. Given that Iosef did not expect John to understand him, we have to assume that his switch into Russian was expressing frustration to himself, even though it contains a vocative, clearly addressing the sentiment to John. Possibly, he was going to switch back into English to attempt another pitch, after reminding himself that everything has a price. And if that's what had happened, then this insertion of Russian dialog would've been just a bit of implicit character exposition, with a bit of an Easter Egg for a Russophone audience. But John responding at all suddenly changes the dynamic. That's also an implicit character exposition moment--we learn that John, despite being American, speaks Russian for some reason, which is further explicated later on. But in the scene, Iosef realizes that John must have understood him, and so knows that Iosef was insulting him! That turns the outcome of the interaction into a face-threatening issue. Now, in addition to still wanting the car which John has denied him, Iosef has to back up the implied threat of his insult to save face.

The change in dialog from the script also adds a layer of double meaning, because John has his (female) dog with him in the car. Thus, Iosef could be interpreted as insulting the dog (which--spoiler alert--he later kills), which John has a strong emotional attachment to. (It turns out the Russian word for "female dog" has exactly the same insulting double-meaning that it does in English!) Out of context, John's reply could even be interpreted as claiming that his dog is not for sale, as opposed to his car--and both interpretations are true! The same cannot be said about Iosef's statement, but the oblique association is a nice addition to the scene as filmed.


The Linguistically Interesting Media Index

Wednesday, October 9, 2024

Newtonian Mechanics in 4+1 Dimensions

In the higher-dimensional universe of the world of Ord, most of Newtonian mechanics generalizes to 4 spatial dimensions (and 5 total dimensions when you include time--hence the 4+1 in the title) just fine. 

F = ma
is still true when F and a are 4-component vectors instead of 3-component vectors, and so is
p = mv
for linear momentum. Squaring vectors still produces scalar quantities, so

KE = 1/2mv^2
is still true, and 
still works just fine. Rotation occurs in a plane with some fixed center for all numbers of dimensions, so the formula for moment of inertia in a given plane,
I = mr^2
is also still valid.

But when it comes to angular momentum and torque, we've got a problem. 

L = r × p
and
τ = r × F
contain cross products, which only exist in exactly 3 dimensions. Usually, these are explained as creating a unique vector that is perpendicular to both of the inputs; but in fewer than 3 dimensions, there is no such vector, and in 4 dimensions or more, there is a whole plane (or more) of possible vectors. In reality, angular momentum and torque are not vectors--they are bivectors, oriented areas rather than oriented lines, which exist in any space of more than 1 dimension. It just happens that planes and lines are dual in 3D--for every plane, there is a normal vector, and for every vector there is a perpendicular plane, so we can explain the cross product as producing the normal vector to the plane of the bivector.

In 4D, you can't implicitly convert a bivector into its dual vector and back, so we have to deal with the bivectors directly. Bivectors are formed from the outer product or wedge product (denoted ∧) of two vectors, or the sum of two other bivectors. Thus, we can write the angular formulas for a point particle in any number of dimensions as 

L = r ∧ p
and
τ = r ∧ F
And those are good for orbital momentum and torque about an external point on an arbitrary body as well. To get spin, we need a sum, or an integral, over all of the components of an extended body. That means we need to be able to sum bivectors! That's easy to do in 2D and 3D; in 2D, bivectors can be represented by a single number (their magnitude and sign), and we know how to add numbers; in 3D, as we saw, bivectors can be uniquely identified with their normal vectors, and we can add normal vectors. In either case, you always get a simple bivector (also called a blade) as a result; i.e., for any bivector in 2D and 3D space, you can find a pair of vectors whose wedge product is that bivector. But in 4 dimensions and above, that is no longer true. This is because, once you identify a plane in 4+ dimensions, there are still 2 or more dimensions left over in which you can specify a second completely perpendicular plane which intersects the first at exactly one point (or zero or one points in 5+ dimensions), and there is no set of two vectors that can span multiple planes. This also means that there can be two simultaneous independent rotations, with unrelated angular velocities, and the formulas for angular momentum and torque must be able to account for arbitrary complex bivector values. You could, of course, just represent sums of bivectors as... sums of bivectors, with plus signs in between them. But that's really inconvenient, and if you can't simplify sums of bivectors, then those formulas aren't very useful for predicting how an object will spin after a torque is applied to it!
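The claim that sums of bivectors stop being simple in 4D is easy to check numerically. Here is a small Python sketch, assuming a bivector is stored as six components indexed by coordinate plane. The helper names (`wedge`, `add`, `is_simple`) are my own for this illustration. The simplicity test uses the standard fact that a 4D bivector B is a blade exactly when B ∧ B = 0, which in components is B01·B23 − B02·B13 + B03·B12 = 0.

```python
from itertools import combinations

# The six coordinate planes of 4D space: (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)
PLANES = list(combinations(range(4), 2))


def wedge(a, b):
    """Wedge product of two 4-vectors, as a dict of six plane components."""
    return {(i, j): a[i] * b[j] - a[j] * b[i] for i, j in PLANES}


def add(B, C):
    """Component-wise sum of two bivectors."""
    return {k: B[k] + C[k] for k in PLANES}


def is_simple(B, tol=1e-12):
    """True if B can be written as a single wedge product (B ∧ B = 0)."""
    q = B[0, 1] * B[2, 3] - B[0, 2] * B[1, 3] + B[0, 3] * B[1, 2]
    return abs(q) <= tol


xy = wedge((1, 0, 0, 0), (0, 1, 0, 0))  # unit bivector in the first plane
zw = wedge((0, 0, 1, 0), (0, 0, 0, 1))  # unit bivector in the completely perpendicular plane

print(is_simple(xy), is_simple(zw))  # True True
print(is_simple(add(xy, zw)))        # False
```

So two perfectly ordinary angular momenta in completely perpendicular planes add up to something that no single r ∧ p can produce.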

Fortunately, even though the contributions of multiple not-necessarily-perpendicular and not-necessarily-parallel simple bivectors will not always simplify down to a single outer product, it turns out that in 4 dimensions, any bivector can be decomposed into the sum of two orthogonal simple bivectors--and most of the time, the result is unique. Unlike vector / bivector addition in 3D, this is not a simple process of just adding together the corresponding components, but there are fixed formulas for computing the two orthogonal components of any sum of two bivectors. They are complicated and gross, but at least they exist! So, we can, in fact, do physics!
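For the curious, here is one way to compute such a decomposition in Python. To be clear, this is a sketch of one route that I am assuming for illustration (splitting the bivector into its self-dual and anti-self-dual parts with the Hodge dual), not necessarily the same fixed formulas referred to above. Bivectors are stored as six components indexed by coordinate plane, and the names `hodge`, `norm`, and `decompose` are made up for this example.

```python
import math
from itertools import combinations

PLANES = list(combinations(range(4), 2))

# Hodge dual of each basis plane in 4D Euclidean space: *e_ij = sign * e_kl
DUAL = {(0, 1): ((2, 3), 1), (0, 2): ((1, 3), -1), (0, 3): ((1, 2), 1),
        (1, 2): ((0, 3), 1), (1, 3): ((0, 2), -1), (2, 3): ((0, 1), 1)}


def hodge(B):
    """Hodge dual of a 4D bivector."""
    return {target: sign * B[plane] for plane, (target, sign) in DUAL.items()}


def norm(B):
    return math.sqrt(sum(v * v for v in B.values()))


def decompose(B, eps=1e-12):
    """Split B into two orthogonal simple bivectors B1 + B2."""
    D = hodge(B)
    plus = {k: (B[k] + D[k]) / 2 for k in PLANES}   # self-dual part
    minus = {k: (B[k] - D[k]) / 2 for k in PLANES}  # anti-self-dual part
    p, m = norm(plus), norm(minus)
    if p < eps or m < eps:
        # Equal-magnitude (isoclinic) case: no unique answer exists.
        raise ValueError("isoclinic bivector: decomposition is not unique")
    B1 = {k: 0.5 * (p + m) * (plus[k] / p + minus[k] / m) for k in PLANES}
    B2 = {k: 0.5 * (p - m) * (plus[k] / p - minus[k] / m) for k in PLANES}
    return B1, B2
```

For example, feeding in 3 units of rotation in the (0,1) plane plus 1 unit in the (2,3) plane hands back those same two pieces, and a bivector that is already simple comes back as itself plus zero.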

The result of bivector addition does not have a unique decomposition exactly when the two perpendicular rotations have exactly the same magnitude. This is known as isoclinic rotation. With isoclinic rotations, you can choose any pair of orthogonal planes you like as a decomposition. Once you pick a coordinate system to use, there are exactly 4 isoclinic rotations, depending on the signs of each of the two component bivectors. In isoclinic rotation, every point on the surface of a hypersphere follows an identical path, and there is no equivalent of an equator or pole. Meanwhile, simple rotation results in a circular equator, but also a circular pole--i.e., a circle of points that remain stationary as the body spins. That circle is also the equator for the second plane of rotation, so the ideas of "equator" and "pole" become effectively interchangeable for any object in non-isoclinic complex rotation. One plane's equator is the other plane's pole, and vice-versa.

Looking ahead a little bit to quantum mechanics, particle spin in 4D is still quantized, still inherent, still divides particles into fermions and bosons--but has two components, just like the angular momentum of a macroscopic 4D object. Whether a particle is a boson or a fermion depends on the sum of the magnitudes of the two components. If the sum is half-integer, the particle is a fermion. If the sum is integer, then it's a boson. Thus, bosons can (but need not necessarily) have isoclinic spins, and the weird feature of quantum mechanics that the spin is always aligned with the axis you measure it in would not be so weird, because that's the case for isoclinic rotation of macroscopic objects, too! Fermions, on the other hand, can never have isoclinic spins! Because if one component has a half-integer magnitude, the other must not. In both cases, however, there end up being four possible spin states for all particles with complex spins, allowing fermions to pack more tightly than they do in our universe; 2 spin states (as in our universe) for particles with simple spins; and of course only a single spin state for particles with zero spin.
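The bookkeeping in that last paragraph can be summarized in a few lines of Python. This is just a sketch of the rules as stated above, under the assumption that each spin component has a magnitude that is a non-negative multiple of 1/2. The function name `classify` is made up for illustration.

```python
from fractions import Fraction


def classify(s1, s2):
    """Return (kind, number of spin states, isoclinic?) for two spin magnitudes."""
    s1, s2 = Fraction(s1), Fraction(s2)
    for s in (s1, s2):
        # Assumption: magnitudes are non-negative multiples of 1/2.
        if s < 0 or (2 * s).denominator != 1:
            raise ValueError("magnitudes must be non-negative multiples of 1/2")
    # Integer sum -> boson; half-integer sum -> fermion.
    kind = "boson" if (s1 + s2).denominator == 1 else "fermion"
    # Zero spin: 1 state; simple spin (one component): 2; complex spin (two): 4.
    nonzero = sum(1 for s in (s1, s2) if s != 0)
    states = {0: 1, 1: 2, 2: 4}[nonzero]
    isoclinic = s1 == s2 and s1 != 0
    return kind, states, isoclinic


print(classify("1/2", 0))      # ('fermion', 2, False)
print(classify("1/2", 1))      # ('fermion', 4, False)
print(classify("1/2", "1/2"))  # ('boson', 4, True)
print(classify(0, 0))          # ('boson', 1, False)
```

Note that the isoclinic flag can only ever come out true for bosons: two equal magnitudes always sum to an integer.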