Saturday, May 14, 2016

Jelinek & Demers Were Wrong

But not about what everybody else thinks they're wrong about! No, I'm totally on board with the whole "Straits Salish has no nouns" thing.

In Predicates and Pronominal Arguments in Straits Salish, Eloise Jelinek and Richard A. Demers make the following assertion:
[F]or a language to lack a noun/verb contrast, it must have only [pronouns] in A-positions (i.e. argument positions). Otherwise, if each root heads its own clause, there would be an infinite regress in argument structure.
This is further footnoted as follows:
We thank an anonymous reviewer for raising the question of whether there might be a language just like Straits Salish, except for having DetPs in A-positions. In such a language, the predicates of which the argumental DetP would be based would in turn have their own DetP argument structure, and so on ad infinitum.
But this is an obvious false dichotomy! Yes, there cannot be a language otherwise like Straits Salish that only allows DetP arguments, because that would lead to infinite regress. But who says that just because you decide to allow DetP arguments, that you must eliminate your base case? You can still use pronominal arguments, or just dropped arguments, to halt the recursion!
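To make the base-case point concrete, here is a minimal sketch of the argument structure as a recursive data type. The names (Pro, DetP, Clause, depth) are hypothetical and purely illustrative, not anything from Jelinek and Demers. The only claim is that a recursive argument type still terminates as long as a non-recursive alternative is allowed alongside it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Pro:
    """Base case: a pronominal (or dropped) argument. Contains no clause."""
    person: str


@dataclass
class Clause:
    """A predicate root plus its single argument (illustrative only)."""
    root: str
    arg: "Arg"


@dataclass
class DetP:
    """Recursive case: a determiner wrapping a whole clause."""
    det: str
    clause: Clause


Arg = Union[Pro, DetP]


def depth(clause: Clause) -> int:
    """Count how many DetPs are nested below this clause."""
    if isinstance(clause.arg, Pro):
        return 0  # the recursion halts at a pronominal argument
    return 1 + depth(clause.arg.clause)


# "man DET [hit DET [coyote PRO]]": two DetPs, then a pronoun ends the chain.
example = Clause(
    "man",
    DetP("the", Clause("hit", DetP("the", Clause("coyote", Pro("3sg"))))),
)
print(depth(example))  # 2
```

A grammar that allowed only DetP in the arg slot would have no way to build a finite Clause at all, which is the regress Jelinek and Demers describe. Adding Pro to the union is what removes it.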

So, of course, I had to conlang this. Not creating a language just like Straits Salish (that would be boring!), but something that has all of the same relevant bits of morphosyntax, but also allows nesting determiner phrases as clause arguments. This is a project I've been thinking about for a good long while, but as I just mentioned the language in another post, I figured this would be a good time to finally blog about it.

The name of the language is given in that other post as "Valaklusha", but that's an English exonym- it's a close approximation of how the name should be pronounced in normal English orthography. The proper romanization of the language's endonym (its name in itself) is "Valaklwuuxa". I do not promise to be at all consistent in which spelling I use at different times.

This name was inspired by Guugu Yimithirr, which essentially means "language that has yimi", where yimi is the Guugu Yimithirr word for "this". The meaning of yimi is totally irrelevant to the name of the language, but it's a distinctive word that that language has and its neighbors don't. The practice of naming based on some distinctive word occurs in many other languages, and that's what I did with Valaklwuuxa. That name essentially means "about saying lwuuxa", or "how one says lwuuxa" where lwuuxa is the word for "woman" (pronounced /ɬʷu:ʃa/; the whole name is /fa ɬak ɬʷu:ʃa/). If Valaklwuuxa had a bunch of neighbor languages, presumably their words for "woman" would be different. Extra-fictionally, this came about because I had a hard time deciding which option I liked best for the word for "woman" in the as-yet-unnamed language, so I picked one and told myself that all of its uncreated, theoretical close relative languages have all of the other versions that I didn't pick.

(Side note: While collecting my notes, I was reminded of Lila Sadkin's Tenata, which was presented at LCC2. Tenata and Valaklwuuxa have no real relation, except that Tenata is another language that erases the noun/verb distinction, but it's cool and you should check it out. Tenata is actually much more similar to WSL, which I've blogged about extensively.)

But enough of that! I can get back into interesting bits of lexical semantics later- on to the hard-hitting morphosyntax! I'm just gonna give a brief overview of the important parts here; if you want gory details, see the ever-evolving Google Doc documentation.

Like WSL, Valaklusha, despite being a proof-of-concept engelang, is not a minimal language- it has a lot of accidental complexities, because they are fun and make it more natural-looking. Many of these accidental complexities make it not much at all like the Salish languages (which have their own collections of accidental complexities), except in this basic core structure:

The only open-class roots are basically verbs, divided into basic transitives and basic intransitives. There are proclitic pronouns used for subjects, when an explicit subject phrase is missing, and pronominal agreement suffixes for all other arguments. There is one preposition (maybe two, depending on how you analyze it), used to mark certain oblique arguments, and there are a few determiners. Additionally, there are a few basic adverbs, and "lexical suffixes", which is a very Salishan feature.

Verbs show polypersonal agreement with suffixes and hierarchical alignment with a 2p > 1p > 3/4p animacy hierarchy. Further gradations of animacy in non-discourse-participants are unnecessary, because the subject marking clitic (or lack thereof) tells you whether any explicit 3/4p argument is the subject or not. The 3rd person refers to proximate objects or people who are present or may be addressed, while 4th person refers to distal objects and people who are not present or who are otherwise excluded from a conversation. Verbs also take several valence-altering affixes, including passive, antipassive, inverse, and transitivizer suffixes, as well as applicative prefixes.

Aside from the subject clitics, there are no explicit pronouns that can stand as arguments on their own; Valaklusha is essentially obligatorily pro-drop. All other arguments are determiner phrases.

Brief aside for phonology & romanization
I'm about to start quoting actual morphemes from Valaklusha, and some readers may want to know how to actually read those. If you don't care, you can skip this bit.

The phonology is very not Salishan, but it has some fun features I've wanted to play with for a while- notably, clicks. Stops come in a voiced and unvoiced series: the usual p/t/k, b/d/g distinctions, and then a voiced glottal stop and an unvoiced pharyngeal, <g'> and <k'>. Yes, there are apostrophes, but they actually mean something sensible! There are two basic clicks: a dental/alveolar click <tq>, and a lateral click <lq>.
All stops and clicks can also be prenasalized, indicated by an n-digraph, for a total of 12 stop and 4 click phonemes. Voiced prenasalized stops, however, are only realized as stops pre-vocalically; in other positions, they become homorganic nasal continuants. There are no phonemic simple nasals, so this creates no risk of homophony.
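For the curious, the realization rule for voiced prenasalized stops can be sketched as a small rewrite over the romanization. This is a rough illustration under several assumptions of mine: the helper name is hypothetical, only <nb>, <nd>, and <ng> are handled (<ng'> is deliberately skipped, since nothing above says what its nasal realization would be), the nasals are written m, n, and ŋ, and "pre-vocalic" is approximated as "followed by a, e, u, or the w of a rounded-vowel digraph".

```python
import re

# Hypothetical mapping from a voiced stop to its homorganic nasal.
NASAL = {"b": "m", "d": "n", "g": "ŋ"}

# n + voiced stop, NOT followed by an apostrophe (which would make it <ng'>)
# and NOT followed by a letter that can begin a vowel (a, e, u, or w).
NON_PREVOCALIC = re.compile(r"n([bdg])(?!['aeuw])")


def realize_prenasalized(word: str) -> str:
    """Replace non-prevocalic <nb>/<nd>/<ng> with a plain nasal."""
    return NON_PREVOCALIC.sub(lambda m: NASAL[m.group(1)], word)


print(realize_prenasalized("tupund"))    # tupun
print(realize_prenasalized("tupundsa"))  # tupunsa
print(realize_prenasalized("anda"))      # anda (made-up word; stop kept before a vowel)
```

The first two inputs are real words that show up in the examples later on, where <tupund> is transcribed with a final plain [n].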

There is only a single series of fricatives: <l> (/ɬ/, a lateral fricative), <x> (/ʃ/), <s> (/s/), <h> (/θ/), and <v> (/f/). The fricatives and clicks default to unvoiced, but gain voicing between two voiced sounds (i.e., intervocalically, or between a vowel and a voiced stop), in a sort of Old-Englishy way. I chose <v> rather than <f> for that labiodental fricative, however, just because I think it looks nicer.

The phonemic fricative inventory is very front-heavy, so to balance things out /k/ and /g/ undergo allophonic variation and become fricatives in intervocalic positions, thus providing some back-of-the-mouth fricative phones (even though they're not distinctive phonemes), one phonemic voicing distinction between fricative phones, and one unvoiced intervocalic phone.

There are three basic vowels, which come in rounded and unrounded pairs: <a> (/a/ ~ /ɑ/), <e> (/ɛ/ ~ /i/), and <u> (/ɯ/); and <wo> (/ɔ/ ~ /o/), <we> (/ø/ ~ /y/), and <wu> (/u/). The digraph <wo> was chosen instead of <wa> for the low rounded vowel just because I think that the average reader is more likely to remember to pronounce that correctly. The rounded vowels induce rounding-assimilation on preceding consonants and preceding hiatus vowels. This can induce mutation of previously-final vowels during affixation. Rounded vowels can occur in hiatus with each other (in which case the <w> is only written once, as in <lwuuxa>), but unrounded vowels can only occur individually. Identical vowels in hiatus (again, as in <lwuuxa>) are half-long.

There are a bunch of other phonotactic rules, but you only need to know those to make up new words; this should suffice for reading.

Back to morphosyntax

Determiner phrases are formed from an article or other determiner (e.g., demonstrative) followed by a relative or complement clause. Only subjects can be relativized, and relative clauses never contain subject proclitics- any explicit arguments are always objects or obliques. Complement clauses (with one exception) are distinguished from relatives by the presence of a subject clitic. In sentences with explicit subjects, where the subject marking clitic would otherwise be absent, the clitic <he=> is used, which can result in ambiguity in sentences that are conjugated for a 4th person argument.

The quotative determiner <lak> is used to introduce direct quoted speech as the argument of a verb; this determiner can never introduce a relative clause, and no additional clitics are required to mark the complement. Clauses introduced by <lak> can be used as core arguments or obliques, without additional marking- thus the possibility that this might be analyzed as a preposition as well.

The language has only one unambiguous preposition (<va>), which is used before a determiner (other than the aforementioned <lak>) to mark obliques. Some verbs (like passivized transitives, or natural semantic ditransitives like "give") assign a specific role to at least one oblique argument, but extra oblique arguments can be added just about anywhere as long as they "make sense"- e.g., for time or locative expressions. If, for example, there were two obliques in a passive clause, one of which refers to a person and one of which refers to a building, nobody's going to be too terribly confused about which one is the demoted agent and which one is the locative.

Aside from subject clitics, any clause can have at most one explicit core argument. This is because the hierarchical alignment system cannot distinguish participants if there are two 3rd or 4th person arguments in the same clause; some languages are fine with this ambiguity, but the typical Salishan approach, copied here, is just to make it ungrammatical- if you need to talk about two 3rd or 4th person referents with explicit determiner clauses, you'll just have to find a way to re-arrange it.
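The one-core-argument restriction is simple enough to state as a check. This is only a sketch under simplifying assumptions: the function names are hypothetical, each argument phrase is given as a string, a phrase counts as oblique exactly when it starts with <va>, and <lak> clauses (which can be either) are ignored.

```python
def core_arguments(phrases):
    """Return the phrases not marked as oblique by the preposition <va>."""
    return [p for p in phrases if p.split()[0] != "va"]


def allowed(phrases):
    """A clause may carry at most one explicit core argument."""
    return len(core_arguments(phrases)) <= 1


# One explicit core argument: fine.
print(allowed(["txe swetqe-la"]))                      # True
# Two explicit 3rd person core arguments: ruled out.
print(allowed(["txe swetqe-la", "txe nk'ap-la"]))      # False
# One core argument plus a <va>-marked oblique (a constructed phrase): fine.
print(allowed(["txe nk'ap-la", "va txe swetqe-la"]))   # True
```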

The explicit marking on obliques means that oblique and core arguments can come in any order, and you are thus free to rearrange them in whatever way is most convenient- for example, to minimize center-embedding.

And it gets a whole lot more complicated with serial verb constructions, and tense and aspect and mood and so forth, but those are the bare essentials.

Now I think it's time for some examples!

First, some basics:

le-swetqe /ɬɛ.sʷy.ǀɛ/
le=swetqe-0
3sg.SUB=man-3sg
"He is a man."

le-nk'ap /ɬɛ.n͡ʡap/
le=nk'ap-0
3sg.SUB=coyote-3sg
"It is a coyote."

Here we have words for "man" and "coyote", but they are not nouns- by themselves, they are intransitive verbs, meaning "to be a man", and "to be a coyote". Also note that the third-person singular subject clitic does not distinguish gender, and the third-person singular intransitive conjugation is null.

le-tupund /ɬɛ.tɯ.pɯn/
le=tupund-0
3sg.SUB=hit.TRANS-3sg.3sg
"It hit(s) it."

Here we see the word for hit, which is transitive, but otherwise behaves identically to the words for "man" and "coyote"- they're all verbs.

tupund txe swetqe-la /tɯ.pɯn tʃɛ sʷy.ǀɛ.ɬa/
tupund-0 txe swetqe-0=la
hit.TRANS-3sg.3sg DEF man-3sg=ART
"The man hit(s) it" / "The hitter is the one who is a man"

Now we have relativized <swetqe> ("to be a man") with the definite article <txe> (ignore the <-la> bit for now- it's complicated). We can also switch this around to get

swetqe ta tupund-la /sʷy.ǀɛ ta tɯ.pɯn.ɬa/
swetqe-0 ta tupund-0=la
man-3sg IND hit.TRANS-3sg.3sg=ART
"The/A man is a hitter."

which just further confirms that there is no morphosyntactic distinction between <swetqe> and <tupund>- either one can act as a predicate, and either one can act as an argument.

It certainly looks in both cases like that determiner phrase is acting like a non-pronominal argument, either of <tupund> or <swetqe>, especially since the subject clitics are missing! If we put the subject clitic back in along with an explicit determiner phrase, the meaning changes substantially:

le-tupund txe nk'ap-la /ɬɛ.tɯ.pɯn tʃɛ n͡ʡap.ɬa/
le=tupund-0 txe nk'ap-0=la
3sg.SUB=hit.TRANS-3sg.3sg DEF coyote-3sg=ART
"He/it hit/is hitting the coyote."

But with some fiddling one could argue that perhaps there really is a null pronominal argument, and the relative clause isn't actually nested, but just adjacent... so let's go further!

swetqe txe tupund txe nk'ap-la /sʷy.ǀɛ tʃɛ tɯ.pɯn tʃɛ n͡ʡap.ɬa/
swetqe-0 txe tupund-0 txe nk'ap-0=la
man-3sg DEF hit.TRANS-3sg.3sg DEF coyote-3sg=ART
"The man hit(s) the coyote." / "The one who hit(s) the coyote is a man."

nk'ap txe tupundsa txe swetqe-la /n͡ʡap tʃɛ tɯ.pɯn.sa tʃɛ sʷy.ǀɛ.ɬa/
nk'ap-0 txe tupund-sa-0 txe swetqe-0=la
coyote-3sg DEF hit.TRANS-3sg.3sg DEF man-3sg=ART
"The coyote is/was hit by the man."

Note that we can't actually make <tupund> the main verb in this sentence, because then we'd have both <swetqe> and <nk'ap> left over to use as arguments, and we can't have two 3rd person core arguments in one clause. But here we can clearly see that, in either case, there must be an intransitive relative clause nested inside a transitive relative clause nested inside another intransitive matrix clause- finite, two-level deep recursive nesting, with no pronominal arguments and no noun/verb distinction!
The two determiner phrases can't both be arguments to the initial predicate both because we know that those predicates are intransitive and because we can't have more than one core argument in a clause, so either they are disconnected sentences talking about different things, which is ruled out by the known interpretation, or they are connected in some other way. Furthermore, we have evidence that the intransitive relative clauses must be nested as arguments inside the transitive relative clauses because changing their order radically changes the interpretation and is only marginally grammatical, if you treat the first part as a parenthetical aside with the first half of the sentence missing:

... (swetqe txe nk'ap-la) txe tupund
... (swetqe-0 txe nk'ap-0=la) txe tupund-0
... (man-3sg DEF coyote-3sg=ART) DEF hit.TRANS-3sg.3sg
"... (and the man is also a coyote), who hit it."
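The nesting argued for above can be sketched as a right-branching bracketing of the surface string. This is a toy, not a parser for the language. It assumes (my simplification) that every determiner opens a phrase running to the end of the sentence, that there are no obliques, and that <txe> and <ta> are the only determiners in play. The function names are hypothetical.

```python
DETERMINERS = {"txe", "ta"}  # the two determiners used in the examples


def bracket(words):
    """Right-nest each determiner phrase inside the clause before it."""
    if not words:
        return ""
    head, rest = words[0], words[1:]
    if head in DETERMINERS:
        # A determiner opens a phrase containing everything that follows.
        return "[" + head + " " + bracket(rest) + "]"
    if rest:
        return head + " " + bracket(rest)
    return head


def nesting_depth(words):
    """Under the assumptions above, depth equals the number of determiners."""
    return sum(1 for w in words if w in DETERMINERS)


sentence = "swetqe txe tupund txe nk'ap-la".split()
print(bracket(sentence))        # swetqe [txe tupund [txe nk'ap-la]]
print(nesting_depth(sentence))  # 2
```

The depth is finite because the innermost clause has no determiner phrase of its own.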

So, bam, take that, Jelinek and Demers. No nouns, DetPs in argument position, finite recursive depth. Booyah.

Stay tuned for more Fun With Valaklusha!