Sunday, September 13, 2015

Non-intersective Nouns & Negative Scopes

This post is a follow-up to several prior discussions on formal semantics for a minimalistic monocategorial language. If you aren't familiar with them already, you may want to read part one and part two before proceeding.

Non-intersective Nouns

There is a class of adjectives called "non-intersective adjectives" because the set of referents of a noun phrase that includes them does not intersect the set of referents for the same noun phrase without them.
This concept is best explained with examples: "a short basketball player" is also "a basketball player", so "short" is an intersective adjective--the set of "short things" intersects the set of "basketball players", and the meaning of the whole phrase is the intersection of those two sets.
On the other hand, "a former basketball player" is not "a basketball player", so "former" is non-intersective. The meaning of "former basketball player" is not a subset of "basketball player", and isn't formed by intersecting it with anything. It's a completely disjoint set, but one that does have a logical relationship to the set of "basketball players".

Similarly, if you "almost finish", then you do not "finish", so "almost" is a non-intersective adverb.

My formal education in formal semantics only really exposed me to one way of modelling non-intersective adjectives/adverbs: as higher-order functions that take predicates as arguments and produce new, different predicates. Thus, the predicate logic notation for a "former player" is former(player')(x) (vs. player'(x)), for some entity x.

But this model is not very compositional (i.e., if a "former player" is former(player')(x), a "former basketball player" is... what exactly?), and results in icky complicated interpretation rules. As a result, I have so far mostly avoided them in WSL, and the few times I've needed them I've punted and just decided that they are taken care of by morphological affixes. That works as long as you just want to say something like "a former player", and leave the "basketball" (or any other additional descriptors) out of it.

Contemplating the programming language Prolog, however, provides a way out of this mess. Basic versions of Prolog do not have functions that can compute arbitrary values--just predicates that can be true or false. A Prolog system can, however, simulate functions by allowing you to query it about what the possible values are that would need to be plugged in to one or more argument positions of any given predicate in order to make it evaluate to "true". You can specify "input" values for whatever arguments you want, and for whatever arguments you leave unspecified, the Prolog system will give you a set of all possible values of those variables that satisfy the predicate as "output". Since Prolog is based on first-order logic, the exact same transformation for simulating functions works for modelling non-intersective adjectives in a predicate logic model for formal semantics.
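To make the Prolog trick concrete, here is a minimal sketch in Python rather than Prolog itself. It treats a two-place predicate as a plain set of facts and "queries" it by leaving some argument positions unbound. The relation contents and the helper name `query` are invented for illustration; this is not how a real Prolog engine resolves goals.

```python
# A toy fact base for a two-place predicate former(a, b).
# Every name in it is hypothetical, made up for this example.
FORMER = {("bob", "bob_then"), ("carol", "carol_then")}

def query(relation, *pattern):
    """Return the facts matching `pattern`; None marks an unbound argument."""
    return [
        fact
        for fact in sorted(relation)
        if len(fact) == len(pattern)
        and all(p is None or p == v for p, v in zip(pattern, fact))
    ]

# "Function" direction: fix the first argument, ask for the second.
print(query(FORMER, "bob", None))
# The same predicate run "backwards": fix the second, ask for the first.
print(query(FORMER, None, "carol_then"))
# No matching fact means the predicate cannot be made true.
print(query(FORMER, "alice", None))
```

The point is only that a predicate with one "input" and one "output" position does the work a function would, which is the shape the non-intersective relations below rely on.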

So, non-intersective adjectives can be modeled as two-place predicates (with one input argument and one output argument) that specify the relation between the actual final referent of a noun phrase and the thing-that-it-isn't. Plus some mathematical machinery to glue it all together. That insight will allow us to add the capacity for translating non-intersective adjectives into our monocategorial language. I say "translate" because, being monocategorial, the language does not have a distinct class of "adjectives" to add non-intersective members to. Instead, it will have "non-intersective nouns" (or "non-intersective noun-jectives").

Introducing that machinery into our monocategorial language requires a little bit of updating to the existing interpretation rules. The altered syntactic rules are as follows:

[|P: w P|] = λx.λy.λz.[|w|](x, y, z, [|P|])
[|P: w ,|] = λx.λy.λz. [|w|](x, y, z, λx.λy.λz. y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)})

Essentially, rather than immediately evaluating the denotation of the current word and the rest of the phrase with the same arguments, we pass the denotation of the rest of the phrase into the semantic function for the current word, which will allow the lexical semantics of a word to control what arguments are given to the rest of the phrase. At the end of the phrase, we pass in the "default" expressions that account for the possibility that no explicit quantifier or relation words were present. Note also that I have introduced the comma-separated argument list notation as "sugar" for repeated application of a curried function, since the large number of parameters to our lexical semantic level has started to become unwieldy.
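As a rough sketch of the two rules above, the following Python folds a phrase from left to right, handing each word the denotation of the rest of the phrase. It assumes a tiny finite model in which ∃r ranges over a fixed dictionary of named relations. The names `ROLES`, `default`, `interpret` and `traced` are all invented for illustration, and ordinary multi-argument calls stand in for the curried application.

```python
# Hypothetical toy model: one event, two participants, one thematic relation.
ROLES = {"agent": {("e1", "bob"), ("e1", "alice")}}

def default(x, y, z):
    """End-of-phrase default: y ⊆ z & ∃r. y ⊆ {w : ∃e. e ∈ x & r(e, w)}."""
    return y <= z and any(
        y <= {w for (e, w) in rel if e in x} for rel in ROLES.values()
    )

def interpret(words):
    """[|P: w P|] and [|P: w ,|]: pass the rest of the phrase to the first word."""
    head, rest = words[0], words[1:]
    p = interpret(rest) if rest else default
    return lambda x, y, z: head(x, y, z, p)

def traced(name, log):
    """A semantically empty word that records the arguments it receives."""
    def word(x, y, z, p):
        log.append((name, sorted(x), sorted(y), sorted(z)))
        return p(x, y, z)
    return word

log = []
phrase = interpret([traced("w1", log), traced("w2", log)])
result = phrase({"e1"}, {"bob"}, {"bob", "alice"})
print(result)
for entry in log:
    print(entry)
```

Each word decides what to hand to `p`. Here both words pass their arguments through unchanged, so the trace shows the same triple twice before the default conditions are checked.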

Of course, since we've changed the lexical semantic interface in the syntactic interpretation rules, we have to also update the templates for our lexical semantic classes. The existing classes are updated as follows:

a) λx.λy.λz.λp. z ⊆ red & p(x, y, z)
b) λx.λy.λz.λp. y ⊆ {w : ∃e. e ∈ x & ag(e, w)} & p(x, y, z)
c) λx.λy.λz.λp. x ⊆ run & p(x, y, z)
d) λx.λy.λz.λp. x = y & p(x, y, z)
e) λx.λy.λz.λp. |y| > |z - y| & p(x, y, z)
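The five templates above can be sketched as Python closures over a small finite model. This is only an illustration under assumptions: the extensions `RED`, `RUN` and `AG` are invented, ∃r is restricted to the relations listed in `ROLES`, and the helper names are mine rather than part of the formalism.

```python
# Hypothetical toy model; every name and extension is invented for illustration.
RED = {"bob", "alice", "carol"}          # things that count as "red"
RUN = {"e1"}                             # events that are runnings
AG = {("e1", "bob"), ("e1", "alice")}    # ag(e, w): w is the agent of e
ROLES = {"agent": AG}                    # the relations ∃r may range over

def default(x, y, z):
    return y <= z and any(
        y <= {w for (e, w) in rel if e in x} for rel in ROLES.values()
    )

def interpret(words):
    head, rest = words[0], words[1:]
    p = interpret(rest) if rest else default
    return lambda x, y, z: head(x, y, z, p)

def class_a(extension):                  # a) restricts the quantifier base z
    return lambda x, y, z, p: z <= extension and p(x, y, z)

def class_b(relation):                   # b) relates the referent set y to the event x
    return lambda x, y, z, p: (
        y <= {w for (e, w) in relation if e in x} and p(x, y, z)
    )

def class_c(extension):                  # c) describes the event x
    return lambda x, y, z, p: x <= extension and p(x, y, z)

def class_d(x, y, z, p):                 # d) identifies the event with the referent
    return x == y and p(x, y, z)

def class_e_most(x, y, z, p):            # e) quantifier: |y| > |z - y|
    return len(y) > len(z - y) and p(x, y, z)

red, agent, run = class_a(RED), class_b(AG), class_c(RUN)
phrase = interpret([class_e_most, red, agent, run])
team = {"bob", "alice", "carol"}
print(phrase({"e1"}, {"bob", "alice"}, team))   # two of three: a majority
print(phrase({"e1"}, {"bob"}, team))            # one of three: not a majority
```

Every closure ends with `and p(x, y, z)`, which is the pass-through-and-conjoin pattern shared by all five templates.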

In every case, we simply pass along all the original arguments directly to the semantic function for the remainder of the phrase, and logically conjoin it to the original lexical semantic expression. However, we now have the option of messing with those arguments if we so please; thus, we can introduce an additional semantic class for non-intersectives:

f) λx.λy.λz.λp.
        y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
        ∃x'.∃y'.∃z'. z ⊆ {a : ∃b. b ∈ y' & former(a, b)} &
        p(x', y', z')

There's a lot going on here, but most of it is just book-keeping boilerplate. Let's break it down:
First, we cleanly terminate the description of the current referent; by the time we get to the end of the phrase, we won't be talking about the same thing anymore, so we have to throw in all the "just-in-case" assertions (that the referent set is some subset of the quantifier base and that it has some relation to the sentential event) in here as well. In the next line, we assert the existence of a new quantifier base and a new referent set (z' and y', respectively) and, crucially, a new event x'--because if we're talking about a "former basketball player", then the "basketball player" which-he-isn't doesn't have any relation to this clause's event, so we have to replace it with something else. We then assert that all members of the current quantifier base have a specific relationship (in this case, "former") to some member of the new referent set. And finally, we use the rest of the phrase, denoted by p, to describe the new referent and new event.
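Here is a hedged sketch of the class-f template evaluated over a tiny finite model. The existential quantifiers over x', y' and z' are implemented by brute-force enumeration of all subsets of the model, which only works because the model is tiny. The individuals (`bob`, `bob_then`, `alice`), the extensions and all helper names are assumptions invented for this example.

```python
from itertools import combinations

# Hypothetical toy model; every name below is made up for illustration.
ENTITIES = {"bob", "alice", "bob_then"}
EVENTS = {"e_now", "e_then"}
PLAYER = {"bob_then"}                    # only Bob's earlier self is a player
FORMER = {("bob", "bob_then")}           # former(a, b)
ROLES = {"agent": {("e_now", "bob"), ("e_now", "alice"), ("e_then", "bob_then")}}

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def default(x, y, z):
    return y <= z and any(
        y <= {w for (e, w) in rel if e in x} for rel in ROLES.values()
    )

def interpret(words):
    head, rest = words[0], words[1:]
    p = interpret(rest) if rest else default
    return lambda x, y, z: head(x, y, z, p)

def class_a(extension):
    return lambda x, y, z, p: z <= extension and p(x, y, z)

def class_f(relation):
    def word(x, y, z, p):
        # Line 1 of the template: terminate the current referent's description.
        if not default(x, y, z):
            return False
        # Lines 2-3: ∃x'.∃y'.∃z' such that every member of z stands in
        # `relation` to some member of y', and the rest of the phrase
        # describes the new triple.
        return any(
            z <= {a for (a, b) in relation if b in y2} and p(x2, y2, z2)
            for x2 in subsets(EVENTS)
            for y2 in subsets(ENTITIES)
            for z2 in subsets(ENTITIES)
        )
    return word

player, former = class_a(PLAYER), class_f(FORMER)
now = {"e_now"}
print(interpret([former, player])(now, {"bob"}, {"bob"}))      # True
print(interpret([former, player])(now, {"alice"}, {"alice"}))  # False
print(interpret([player])(now, {"bob"}, {"bob"}))              # False
```

In this model Bob satisfies "former player" without satisfying "player", and the witness found for x' is the earlier event `e_then` rather than the clause's own event.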

The addition of this mechanism to the language has a few interesting long-range consequences. The most obvious is that word order actually matters. Up until now, we could've cheated on the phrase-structure rule and just said P → w w* , using the Kleene star operator for arbitrary repetition, instead of P → w P | w ,; but now, the fact that later words are in smaller phrases embedded at a lower syntactic level than previous words is very significant. Changing the ordering of words around a non-intersective noun can change which referent those words are actually describing. Similar effects are present in English; a "former blue car", for example, is not necessarily the same thing as a "blue former car". The first one may still be a car, but of a different color, while the second is definitely no longer a car, but definitely is blue.
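The word-order effect can be checked mechanically in a small model. The sketch below is an illustration under assumptions: it invents two objects, a repainted car that used to be blue and a blue sculpture that used to be a car, and implements the existentials by enumerating subsets. None of the names come from the formalism itself.

```python
from itertools import combinations

# Hypothetical model: "repainted" is still a car but no longer blue;
# "sculpture" is blue now but no longer a car.
ENTITIES = {"repainted", "repainted_then", "sculpture", "sculpture_then"}
EVENTS = {"e_now", "e_then"}
CAR = {"repainted", "repainted_then", "sculpture_then"}
BLUE = {"repainted_then", "sculpture"}
FORMER = {("repainted", "repainted_then"), ("sculpture", "sculpture_then")}
ROLES = {"theme": {("e_now", "repainted"), ("e_now", "sculpture"),
                   ("e_then", "repainted_then"), ("e_then", "sculpture_then")}}

def subsets(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def default(x, y, z):
    return y <= z and any(
        y <= {w for (e, w) in rel if e in x} for rel in ROLES.values()
    )

def interpret(words):
    head, rest = words[0], words[1:]
    p = interpret(rest) if rest else default
    return lambda x, y, z: head(x, y, z, p)

def class_a(extension):
    return lambda x, y, z, p: z <= extension and p(x, y, z)

def class_f(relation):
    def word(x, y, z, p):
        return default(x, y, z) and any(
            z <= {a for (a, b) in relation if b in y2} and p(x2, y2, z2)
            for x2 in subsets(EVENTS)
            for y2 in subsets(ENTITIES)
            for z2 in subsets(ENTITIES)
        )
    return word

blue, car, former = class_a(BLUE), class_a(CAR), class_f(FORMER)
former_blue_car = interpret([former, blue, car])
blue_former_car = interpret([blue, former, car])

def holds_of(phrase, thing):
    return phrase({"e_now"}, {thing}, {thing})

for thing in ("repainted", "sculpture"):
    print(thing, holds_of(former_blue_car, thing), holds_of(blue_former_car, thing))
```

In "former blue car", both "blue" and "car" sit below the non-intersective, so they describe the earlier object. In "blue former car", "blue" sits above it and describes the present one. That is why the two phrases pick out different things in this model.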

Note that while syntax now encodes new information in word order about referent scope, the syntactic rules no longer encode information about logical connectives; the implicit conjunction of all lexical items is now a function of lexical semantics instead. We could consider this an accidental artifact of the formal framework we're using, but we might also come back to it later and exploit the lexification of logical connectives to come up with some new lexical semantic classes.

The second long-range consequence is that class-c words now have a real essential function; in the beginning of a phrase, prior to any non-intersectives, they are essentially adverbs, unconnected to the phrasal referent, which can float between phrases with no change in sentential meaning. After a non-intersective, however, they no longer act on the clausal event. Instead, they describe some event that the new referent is a participant in. The same applies to class-b and -d relation words.

Negative Scopes

Most non-intersectives have an implicit "not" built in to them. A "former basketball player" is not a "basketball player", a "fake Picasso" is not a "Picasso", and while an "alleged thief" might be a "thief", he also might not. And it turns out that with the interpretive machinery we've built so far, we can actually translate "not" as a non-intersective noun with the following lexical semantics:

not: λx.λy.λz.λp.
        y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
        ∃x'.∃y'.∃z'. z ∩ y' = {} &
        p(x', y', z')

I.e., "the relation between the quantifier base for the current referent set and the new referent set is that they have no elements in common".[1]

We can thus expect all non-intersectives to behave somewhat similarly to negatives in the behavior they trigger in lower scopes.[2] This gives us a possible use for semantically-significant stress focus, in addition to the rising-falling intonation patterns that are used to denote phrase and sentence boundaries. We can use stress focus to indicate the specific reason that a referent "is not" something else. Referring to a previous example, the ambiguity of "former blue car" can be resolved by stating that it is either a "former BLUE car" or a "former blue CAR" (capitals marking the stressed word)--indicating that the reason the item in question is "former" is because it is no longer "blue" in the first case and no longer a "car" in the second.

More traditionally, we might want to make some special accommodations for phrases that are described by monotone-decreasing quantifiers, like "no" or "few", which are also "negative", in a slightly different way. Perhaps the "reason" for a decreasing quantifier can also be indicated by stress focus (do "few MEN work" or do "few men WORK"?). The practical impact of that level of ambiguity ("does this instance of focus refer to the quantifier or the non-intersective scope?") is likely to be minimal.

In either case, this is already a very long blog post, so updating the syntax to recognize focused items is left as an exercise for the reader.

The Weird Ones

Non-intersective nouns, it turns out, don't have to be used only to translate what English encodes as non-intersective adjectives. They're just words that specify some arbitrary not-necessarily-intersecting relationship between the quantifier base for one referent set, and some other referent set. Y'know what else acts like that? Adnominal adpositions. E.g., prepositions that describe noun phrases. Also, genitives (for which English can use the preposition "of", but doesn't always).

Wanna say "Bob's cat climbs trees" in monocategorial form? "Cat agent of Bob, tree theme, climb event." The word "of" ends up as a non-intersective noun! Crazy! Incidentally, its semantics are as follows:

of: λx.λy.λz.λp.
        y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
        ∃x'.∃y'.∃z'. z ⊆ {a : ∃b. b ∈ y' & ∃r. r(a, b)} &
        p(x', y', z')

I.e., "all members of this quantifier base have some kind of relation with some member of the new referent set".

And there are all sorts of normal English nouns that entail a relationship to some other unspecified referent. "Father", for example. A "father" cannot exist without a child, so the concept of "father" is naturally expressed in terms of a two-place predicate... which will be embedded inside the machinery for non-intersectives to allow describing both halves of the relationship, should you so desire[3]. So, "Bob's father builds houses"? In monocategorial form, it's "Agent father Bob, many house patient, build event". Note that in this case, the word order in the phrase "Agent father Bob" is critical. If we swap things around, we get the following different readings:

"father agent Bob": "Bob-who-does-something's father is somehow involved"
"Bob father agent": "Bob, who is the father of someone who does something, is somehow involved"
"Bob agent father": "Bob, who is a father, is the agent (builds houses)"

And if we want to cover both sides of the relationship at the same time: "John agent father Bob, many house patient, build event" ("Bob's father, John, builds houses.")

[1] Note that this is not the same as a translation for "no", the negative quantifier, which is addressed a little further down the page. This sort of "not" means "I'm talking about a referent that is identified by not being that other thing", as opposed to saying "no referents of this description are involved in the action".
[2] While similar, this is actually not the same thing as the "negative scopes" that license negative polarity items like "anymore" in English. Those are a feature of the behavior of certain quantifiers, like the negative quantifier "no" described in [1].
[3] WSL encodes these kinds of nouns as Roles. When we get back to the semantic model for WSL in a later post, we'll see the utility of a productive morphological derivation system to turn Roles into non-intersective Nouns and vice-versa.