Tuesday, September 8, 2015

Generalized Quantifiers for a Monocategorial Language

At the end of yesterday's post, I briefly mentioned the concept of generalized quantifiers. Today, I want to investigate how the semantics of the minimalistic language can be extended with that concept to eliminate the need for "built-in" existential quantification.

The first step is to recognize that noun phrases do not necessarily denote single referents. In a sentence such as "Every student goes to school", for example, the phrase "every student", while grammatically singular, does not refer to just one student; rather, it denotes a set of students (in this case, all of them), and the sentence makes a statement about the properties of that set: all of its members also belong to the set of things that go to school. Or, in other words, [|every student|] is a subset of [|goes to school|].

We can thus modify yesterday's semantics so that all entity variables actually refer to sets. In this case, the form of the syntactic interpretation rules need not change (although how we read them may be tweaked, replacing "There exists some x" with "There exists some set x"), but lexical semantics for each possible word type end up looking like this:

a) λx.λy. y ⊆ red
b) λx.λy. y ⊆ {z : ∃e. e ∈ x & ag(e, z)}
c) λx.λy. x ⊆ run
d) λx.λy. x = y

Note that predicates are defined by the set of arguments for which they are true. Thus, predicates are set-valued entities on which we can use set operators like ⊆ (subset) and ∈ (element of); the traditional predicate logic notation that we have been using so far, pred(x), is simply shorthand for the set-theoretic formula x ∈ pred.

The only major alteration introduced here is seen in the semantics for relation words, class b, which must be modified to explicitly construct the subset of entities which have a particular relation to some element of the set of events.
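As a rough illustration, the four classes can be sketched in Python over a small finite model. The model itself (the sets RED, RUN, and AG and their members) is invented for the example, and the function names class_a through class_d are just labels for the four lexical classes; none of this is part of the language definition.

```python
# Toy finite model; every name and member here is an assumption for illustration.
RED = frozenset({"apple", "ball"})                 # entities that are red
RUN = frozenset({"e1", "e2"})                      # events that are runnings
AG = frozenset({("e1", "alice"), ("e2", "bob")})   # ag(e, z): z is the agent of e


def class_a(x, y):
    """a) λx.λy. y ⊆ red"""
    return y <= RED


def class_b(x, y):
    """b) λx.λy. y ⊆ {z : ∃e. e ∈ x & ag(e, z)}"""
    # Explicitly build the set of entities that are agents of some event in x.
    agents = {z for (e, z) in AG if e in x}
    return y <= agents


def class_c(x, y):
    """c) λx.λy. x ⊆ run"""
    return x <= RUN


def class_d(x, y):
    """d) λx.λy. x = y"""
    return x == y
```

Class b is the only one that has to construct a set before testing it, which matches the observation that relation words are where the real change happens.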

It is now straightforward to add a fifth semantic class of words which in some way restrict the cardinality of (or quantify) a set:

e) λx.λy. |y| = 5

(There could be an additional sixth class that restricts the cardinality of the event set, represented by the variable x, but for simplicity we will ignore that possibility for now.)

This works great for simple numerals (like 5, as shown in the example) and more vague things like "many" or "a few", but it causes problems for quantifiers like "most" or "one-third", which restrict the cardinality of the referent set compared to what it would have been if it were not quantified. We need some way of keeping track of that original maximal set.
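To make the problem concrete, here is a minimal Python sketch. The function five follows the two-argument class-e form directly; one_third_of is a hypothetical helper, not part of the semantics so far, that shows what a proportional quantifier would need to see.

```python
def five(x, y):
    """e) λx.λy. |y| = 5: the referent set y alone is enough."""
    return len(y) == 5


def one_third_of(base, y):
    """Hypothetical proportional check: y must be a third of the unquantified set.

    The extra parameter `base` is exactly the information that the
    two-argument form λx.λy. ... has no access to.
    """
    return y <= base and 3 * len(y) == len(base)
```

The same two-member y counts as "one-third" of a six-member base but not of a nine-member one, so no function of x and y alone can give the right answer in both cases.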

In a more "normal" language, generalized quantifiers would operate at a separate syntactic level from nouns, and could take in the compositional denotation of the rest of a noun phrase all at once, and produce a new restricted set from it. In this language, however, we don't have that luxury. If we want to keep things monocategorial, we need to find some way of keeping track of the base set and restrictions on the final quantified set simultaneously as additional quantifiers and other words are added in arbitrary orders. This will require altering our lexical semantics to account for a third argument, and that will in turn require altering the syntactic interpretation rules to provide that third argument.

The altered interpretation rules look like this:

[|S|] = ∃x.[|C|](x)

[|C: P C|] = λx. ∃y. ∃z. [|P|](x)(y)(z) & [|C|](x)
[|C: P .|] = λx. ∃y. ∃z. [|P|](x)(y)(z)

[|P: w P|] = λx.λy.λz.[|w|](x)(y)(z) & [|P|](x)(y)(z)

[|P: w ,|] = λx.λy.λz. [|w|](x)(y)(z) & y ⊆ z & ∃r. y ⊆ {w : ∃e. e ∈ x & r(e, w)}

Here we have simply asserted the existence of an additional set variable, z, and added a term to define y, the set of referents for a phrase, to be a subset of z.
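One way to sanity-check these rules is to evaluate them by brute force over a tiny finite model, a sketch of which follows. Several things here are assumptions of the sketch rather than part of the semantics: the existential over r is restricted to a fixed list of role relations (ROLES), set variables range over the powerset of a small domain, and the model and the words "one" and "red" are made up for the example.

```python
from itertools import chain, combinations


def powerset(domain):
    """All subsets of a finite domain, as frozensets."""
    items = list(domain)
    subsets = chain.from_iterable(combinations(items, n) for n in range(len(items) + 1))
    return [frozenset(s) for s in subsets]


def phrase_final(word, roles):
    """[|P: w ,|]: apply w, require y ⊆ z, and require some role r linking y to x."""
    def denotation(x, y, z):
        linked = any(y <= {v for (e, v) in r if e in x} for r in roles)
        return word(x, y, z) and y <= z and linked
    return denotation


def phrase_cons(word, rest):
    """[|P: w P|]: conjoin w with the rest of the phrase on the same x, y, z."""
    return lambda x, y, z: word(x, y, z) and rest(x, y, z)


def clause(phrases, entities):
    """[|C|]: every phrase gets its own existentially bound y and z; all share x."""
    sets = powerset(entities)
    return lambda x: all(any(p(x, y, z) for y in sets for z in sets) for p in phrases)


def sentence(c, events):
    """[|S|] = ∃x.[|C|](x)"""
    return any(c(x) for x in powerset(events))


# Hypothetical toy model.
ENTITIES = {"apple", "ball", "cat"}
EVENTS = {"e1"}
RED = {"apple", "ball"}
ROLES = [{("e1", "cat")}, {("e1", "apple")}]   # an agent-like and a patient-like relation


def red(x, y, z):
    return z <= RED        # noun-jective word, constraining the basis set z


def one(x, y, z):
    return len(y) == 1     # cardinality word, constraining the referent set y


one_red = phrase_cons(one, phrase_final(red, ROLES))   # the phrase "one red ,"
result = sentence(clause([one_red], ENTITIES), EVENTS)
```

In this sketch the empty set satisfies every subset condition, so the example uses a cardinality word to force a non-empty y; the search then has to find an actual red entity that bears some role in e1.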

The new forms of the different lexical semantic classes look like this:

a) λx.λy.λz. z ⊆ red
b) λx.λy.λz. y ⊆ {w : ∃e. e ∈ x & ag(e, w)}
c) λx.λy.λz. x ⊆ run
d) λx.λy.λz. x = y
e) λx.λy.λz. |y| > |z - y|

Now, class-a noun-jective words operate on the set z, which forms the basis set for quantification. Relation words (classes b and d) act, as before, on x and on y, which represents the actual referents of the phrase and is the result of quantification; and class-e quantifier words specify some relation between the set of referents y and the basis set z. The example given shows the semantics for the quantifier "most": the set of referents y has more members than the remainder of the basis set, z - y.
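The "most" entry is simple enough to check by hand, but a short Python sketch makes the division of labor between y and z explicit. The set of students is made up for the example, and the sketch assumes that the y ⊆ z condition is enforced elsewhere (by the interpretation rules) rather than by the word itself.

```python
def most(x, y, z):
    """e) λx.λy.λz. |y| > |z - y|"""
    # z - y is the part of the basis set left out of the referent set.
    return len(y) > len(z - y)


# Hypothetical basis set of five students and two candidate referent sets.
students = frozenset({"ann", "ben", "cho", "dev", "eli"})
three_of_them = frozenset({"ann", "ben", "cho"})
two_of_them = frozenset({"ann", "ben"})
```

With five students in the basis set, three referents leave a remainder of two, so "most" holds; two referents leave a remainder of three, so it fails.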

So far, we have not actually eliminated the need for logical existential quantifiers "built-in" to our semantics, but we have eliminated their semantic effect; all entities described by a sentence in the monocategorial language are no longer implicitly existentially quantified. Rather, the quantifiers in our predicate logic forms serve only to bind variables that we can use to refer to the different referent sets; and the referent sets are explicitly quantified by whatever quantifier words you feel like using. In the absence of any explicit quantifier word, sentences are evaluated as being true for some, unspecified, subset of the basis. This is exactly the same level of ambiguity present in natural languages (like Mandarin) which lack obligatory grammatical number.

Actually eliminating the existential quantifiers from our predicate logic forms would require directly constructing the relevant sets from unions and intersections, so as to eliminate the need for a common variable to use to tie the different parts together. That, in turn, requires separating the classes of quantifier and relation words from the class of noun-jectives, which cannot be done while maintaining the monocategorial analysis. It will, however, be possible to do so in WSL, which does have the necessary multiple syntactic levels.

It should also be noted that in adding quantifier words to the monocategorial language, we actually did not need to go all the way to introducing the full formalism of "generalized quantifiers"; in fact, it looks like we can't, since the order of compositional operations which motivates the use of generalized quantifiers in the semantics of natural languages just doesn't exist here. The denotations of our monocategorial phrases are simple sets of referents, whose members have some thematic relation to the implicit event. In contrast, the denotations of natural-language noun phrases composed of nouns and generalized quantifiers are sets of possible sets of referents with the appropriately restricted cardinalities; and the correct set of referents is then extracted at the next higher level of composition, when a role is assigned to the noun phrase by a verb or adposition. Without that extra level, the monocategorial language must simply specify the final referent set directly. Again, however, WSL does have the more typical separation of nouns, quantifiers, and role-assigning words at different syntactic levels, and so we will be able to explore a more naturalistic analysis for that language.

For more thoughts on the monocategorial language, see the following post.