Last time we saw how to introduce generalized quantifiers and arbitrary specifier positions into our model of WSL; but, in the process, we lost any recognition of the scoping effects between quantifiers. Figuring out how scoping effects work for generalized quantifiers represented by sets-of-sets can get pretty complicated and confusing, so we're gonna take it really slow and step-by-step.
First, let's revise and review our model of the syntax so far. The complete syntactic model that I'll be using for the rest of this post is given by the following grammar:
S → P AP
AP → RP AP | 0
RP → QP R | QP e
QP → Q NP
NP → N NP | 0
Where an S is a Sentence, an AP is an Argument Phrase, an RP is a Role Phrase, an R is a Role, a QP is a Quantifier Phrase, a Q is a Quantifier, an NP is a Noun Phrase, and an N is a Noun.
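If you want to poke at this grammar mechanically, here's a quick recursive-descent sketch of it in Python. It parses bare category tags rather than actual WSL word forms (assigning categories to words is outside the scope of the sketch), and it flattens the right-recursive AP and NP chains into lists for readability:

def parse_np(toks, i):
    # NP -> N NP | 0 : gather a run of N tags
    kids = []
    while i < len(toks) and toks[i] == "N":
        kids.append(toks[i]); i += 1
    return ("NP", kids), i

def parse_rp(toks, i):
    # RP -> QP R | QP e, where QP -> Q NP
    assert toks[i] == "Q", "every RP starts with a quantifier"
    np, j = parse_np(toks, i + 1)
    qp = ("QP", toks[i], np)
    assert toks[j] in ("R", "e"), "an RP ends in a Role or the specifier clitic"
    return ("RP", qp, toks[j]), j + 1

def parse_s(toks):
    # S -> P AP ; AP -> RP AP | 0
    assert toks[0] == "P"
    rps, i = [], 1
    while i < len(toks):
        rp, i = parse_rp(toks, i)
        rps.append(rp)
    return ("S", toks[0], rps)

# Schematic tag sequence for "Ka ves anz i siru jest anz jo.", assuming the
# specifier's null quantifier and clitic still occupy Q and e slots:
print(parse_s(["P", "Q", "N", "R", "Q", "N", "e", "Q", "N", "R"]))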
Next, let's look at some examples to get a better grasp on how scope should work. Consider the English sentence
"Everybody loves somebody."
This has several possible translations into WSL; two of them are:
1) Ka ves anz i siru jest anz jo.
and
2) Ka jest anz jo siru ves anz i.
Sentence (1) states that, for every person, there is someone whom that person loves- but not every lover necessarily loves the same lov-ee. Sentence (2), on the other hand, states that there is some single person whom everybody else loves, because the existentially-quantified patient is now outside the scope of the universally quantified agent.
Additionally, sentence (1) allows that every agent might be participating in a totally separate instance of "loving", while sentence (2) indicates that there is only one instance of "loving" going on, and every person is a simultaneous agent in it. This is because of differences in the scope of the existentially-quantified specifier ("siru", with a phonologically-null quantifier). Moving "siru" to different positions produces more subtly-different interpretations; e.g., taking (2) and moving the specifier to the end would indicate that there is one person who is separately and independently loved by everyone- possibly at different times. And adding multiple conjoined specifiers would make things even more complicated.
Now let's take a look at how Roles get assigned to QPs inside Role Phrases, as of last time:
[|RP: QP R|] = λx. {y : ∃z. z ∈ x & [|R|](z)(y)} ∈ [|QP|]
What we're doing here is constructing the set of all things that bear a particular relation to some element of the specifier set, and then asserting that that is the same as one of the potential referent sets from the Quantifier Phrase. This implicitly imposes some constraints on the identity of the specifier set as well. A Role Phrase containing a specifier works a little differently:
[|RP: QP e|] = λx.∃Y. Y ∈ [|QP|] & Y ⊆ x
This just imposes an explicit constraint on the identity of the specifier set. (Note that I have chosen here to use an upper-case Y for the referent set in the rule for specifiers, to distinguish it from the lowercase y used for an element of the referent set for normal Role Phrases.)
In order to respect quantifier scoping, we need to arrange things so that we can reconstruct all lower-scope quantifier sets and select the relevant constraints independently for every element y of the referent set that we're constructing for the current RP.
In order to do that, we first have to make the denotations of all lower-scoped RPs available during the interpretation of any given RP. That means modifying our interpretation rules for APs as follows:
[|AP|] = λx.[|RP|](x)([|AP|])
such that the denotation of the next-lower-scope Argument Phrase (which contains all the remaining Role Phrases) is filtered through the current Role Phrase as a parameter. That will, of course, require updating the rules for RPs to take multiple arguments, and doing the right thing with them:
[|RP: QP R|] =
λx.λa. {y : ∃z. z ∈ x & [|R|](z)(y) & a(x)} ∈ [|QP|]
[|RP: QP e|] =
λx.λa. ∃Y. Y ∈ [|QP|] & Y ⊆ x & a(x)
This places the evaluation of each Argument Phrase inside the scope of the variable y bound by the next higher Argument Phrase. Although y itself is not accessible in the lower scopes (we only pass along the shared specifier set as a parameter to a), this means that lower quantifiers are re-evaluated for every y. Thus, in "Ka ves anz i siru jest anz jo.", it is possible in the evaluation of "jest anz jo" to select a different existentially-quantified referent to correspond to every member of the universally-quantified referent set of "ves anz i". This is obvious in the semantics for specifiers, where we're still cheating a little bit by using the "∃" symbol to bind the variable Y and borrowing its scoping behavior.
The interpretation for the internal structure of a QP, containing a Q and an NP, remains unchanged. If we round things out with an explicit rule for null APs, we get the following completed model for the syntax-semantics interface:
[|S|] = [|P|]([|AP|])
[|AP: RP AP|] = λx.[|RP|](x)([|AP|])
[|AP: 0|] = λx. true
[|RP: QP R|] =
λx.λa. {y : ∃z. z ∈ x & [|R|](z)(y) & a(x)} ∈ [|QP|]
[|RP: QP e|] =
λx.λa. ∃Y. Y ∈ [|QP|] & Y ⊆ x & a(x)
[|QP|] = [|Q|]([|NP|])
[|NP: N NP|] = [|N|] ∩ [|NP|]
[|NP: 0|] = U
[|P|] = G[P]
[|R|] = G[R]
[|Q|] = G[Q]
[|N|] = G[N]
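As a sanity check, here's the completed model transcribed into brute-force Python over a tiny toy universe. Everything lexical here- the individuals, the events, the role assignments, and the glosses for "anz" and "siru"- is invented for the sake of the example, and the quantifier denotations are the set-of-sets constructors from the Generalizing Quantifiers post. It's a mechanical transcription of the rules above, handy for experimenting with different models and word orders:

from itertools import combinations

def powerset(u):
    s = list(u)
    return [frozenset(c) for n in range(len(s) + 1) for c in combinations(s, n)]

# Toy model: two people and two loving events (all names hypothetical).
PEOPLE = frozenset({"alice", "bob"})
EVENTS = frozenset({"e1", "e2"})
U = PEOPLE | EVENTS

# Role denotations G[R]: curried relations between an event z and a participant y.
AGENT = {("e1", "alice"), ("e2", "bob")}
PATIENT = {("e1", "bob"), ("e2", "alice")}
role = lambda pairs: lambda z: lambda y: (z, y) in pairs

# Quantifier denotations G[Q]: a noun set mapped to its set of candidate referent sets.
ves = lambda n: {x for x in powerset(U) if n <= x}   # "every"
jest = lambda n: {x for x in powerset(U) if n & x}   # "some"

# [|RP: QP R|] = λx.λa. {y : ∃z. z ∈ x & [|R|](z)(y) & a(x)} ∈ [|QP|]
def rp_role(qp, r):
    return lambda x: lambda a: frozenset(
        y for y in U if any(r(z)(y) and a(x) for z in x)) in qp

# [|RP: QP e|] = λx.λa. ∃Y. Y ∈ [|QP|] & Y ⊆ x & a(x)
def rp_spec(qp):
    return lambda x: lambda a: any(Y <= x and a(x) for Y in qp)

# [|AP: RP AP|] = λx.[|RP|](x)([|AP|]) ; [|AP: 0|] = λx. true
ap = lambda rp, rest: lambda x: rp(x)(rest)
ap0 = lambda x: True

# [|S|] = [|P|]([|AP|]) with [|P|] = G["ka"] = λy.∃x. y(x)
ka = lambda a: any(a(x) for x in powerset(U))

anz, siru = PEOPLE, EVENTS  # assumed glosses: "anz" = person, "siru" = loving

# (1) "Ka ves anz i siru jest anz jo."
print(ka(ap(rp_role(ves(anz), role(AGENT)),
         ap(rp_spec(jest(siru)),
         ap(rp_role(jest(anz), role(PATIENT)), ap0)))))  # True in this model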
This gets us the ability to model a pretty big chunk of all WSL declarative sentences. Still to come: non-intersective Nouns, modality, alternative Projectors, subordinate clauses, and controlling semantic projection.
Monday, September 21, 2015
A Compositional "yes" and Other Discoveries
Since WSL is radically everything-drop, it recently occurred to me that a complementizer (or "projector" in WSL-specific terminology) and a modal particle with no other content words nevertheless constitute a complete sentence (in fact, just a bare complementizer should constitute a complete sentence, but I decided that would be pragmatically weird in every case, and just Not Done).
So, of course, I had to set out to figure out what all the combinations would actually mean. Not as an exercise in formal semantics, but idiomatically; how would native speakers of WSL actually use these short phrases in everyday life?
The four basic modal particles are
es - normal realis mood
miy - approximately equivalent to "could be" or "might, for all I know".
bi - logical possibility
pek - roughly equivalent to "should" or "it better be"
There are ten projectors, so that gives a total of 40 possible combinations. 16 of them (plus two more we'll consider later) can be complete sentences on their own:
k'es (ka+es): an assertion that some contextually-provided proposition is true. I.e., "yes". But not all the time! We'll come back to this later....
ka miy: "Sure, if you say so."
ka bi: "That is definitely a logical possibility." (I expect this one is typically used sarcastically, with the implication of "no, I don't really think so".)
k'pek (ka+pek): "If not, we got problems."
tc'es: "No". But again, not all the time....
tce miy: "That is probably wrong"
tce bi: "Not necessarily"
tce pek: "I hope not"
em es: "Is it?"
e'miy: "Could it (for all you know)?"
em bi: "Could it ever?"
em pek: "Should it?"
mi's: "Ain't it?"
mi miy: "Couldn't it?"
etc. You can guess the last two.
While figuring those out, I also came up with this bonus phrase, which includes an extra third word:
ka ves bi: "This is an obvious logical necessity in all possible worlds".
Or, more colloquially: "Well, duh!"
(This sentence, small as it is, is actually structurally ambiguous, but the alternate reading is the very weird "it is a logical possibility that everything is that thing". Not something you'd need to say very often!)
The next 16 combinations are not complete sentences, but are valid nominal clauses:
vr'es (vor+es): Something. Anything. I'm thinking this might end up getting used as a generic indefinite pronoun for when you really don't care what (cf. "mek", which is the indefinite pronoun "one", but frequently gets used as a cataphor for dislocated nominal clauses)
vor miy: a hypothetical entity which you posit to exist, but might not
vor bi: any hypothetical entity, whose actual existence is irrelevant
vor pek: that which darn well ought to be. Like flying cars.
Em intc mot "flying cars" jesihu!? K'ajnu vor pek es!
Where are my flying cars!? We should have those by now!
votc es: a nonexistent thing. I'm thinking this might be an idiom comparable to "unicorn"
votc miy: a thing which you are really darn sure does not exist. The emphatic "rainbow-vomiting nuclear unicorn", for example.
votc bi: a logical impossibility
votc pek: that which darn well oughtn't to be. Like social spiders... ick!
Ka votc peku "social spiders" es!
Social spiders are a thing which should not be!
vm'es (vem+es): "whether it is"
ve'miy (vem+miy): "whether it could be, as far as you know"
vem bi: "whether it's logically possible"
vem pek: "whether it should be"
vmi's (vmi+es): "whether it isn't"
vmi'y (vmi+miy): "whether it isn't, as far as you know"
etc.
Finally, the following combinations can be either complete stand-alone sentences, or nominal clauses:
s'es (sa+es): "Yes, they are" or "the fact that they are", asserting that a predicative relationship holds.
satc es: "No, they aren't" or "the fact that they aren't"
You can probably fill in the remaining 6 combinations for the other moods.
So, not only have I discovered the WSL words for "yes" and "no", we've also found that there are two different ways of saying "yes" (k'es and s'es) and two different ways of saying "no" (tc'es and satc es), conditioned on whether the question you are responding to is about a predicative relationship or not.
Additionally, the internal structure of a compositional "yes" or "no" parallels the syntactic structure of the question. So, if somebody asks you a negative question, like "Aren't you going?", responding with "Tc'es" means "No, I'm not"- confirming that the negative that the questioner used was appropriate. If, however, you respond with "K'es", that actually means "Yes, I am"- contradicting the questioner's use of a negative. Thus, there is no confusion over "was that a 'yes, I'm not', or a yes-like-'no, I am'?", and no need for yet another word like the French "si" just to take care of answering negative questions unambiguously.
A Progressive Model of WSL Syntax & Interpretation: Part 2
Last time, I ended with the note that properly modelling specifier phrases would require splitting the interpretation of argument phrases in half; in particular, I had in mind the idea that variable bindings for noun phrases would need to be moved around to ensure that the specifier variable would be in-scope in the semantics for every argument phrase.
It turns out that the solution is actually much simpler. First, we will introduce a very simple change to the syntax rule for a sentence to account for sentential Projectors (a part of speech which heads independent clauses in WSL):
S → P QP e AP
Next, we'll stop treating the specifier phrase separately, and account for it as a special case of an argument phrase:
S → P AP
AP → A AP | 0
A → QP R | QP e
We could also choose to treat the specifier clitic as a kind of Role, which would be slightly simpler, but this formulation better reflects my own psychological perception of what a specifier is (and thus presumably reflects the intuition of the fictional native speakers of WSL as well). Note that this allows a single clause to contain multiple specifiers, as well as putting them in arbitrary positions with respect to the other arguments; that situation is accounted for in the WSL Primer, which says that the semantics of multiple specifier phrases is the same as that of multiple conjoined specifiers, except that using multiple specifiers allows you to place them all in different quantifier scoping levels- the same as the interpretation for repeated roles.
The semantics for these bits of syntax is as follows:
[|S|] = [|P|]([|AP|])
[|AP|] = λx.[|A|](x) & [|AP|](x)
[|A: QP R|] = λx.[|QP|](λy. [|R|](x)(y))
[|A: QP e|] = λx.[|QP|](λy. y ⊆ x)
To summarize: the denotation of a sentence is the denotation of the argument phrase chain filtered through the denotation of the Projector; the denotation of an argument phrase is the denotation of the argument given an entity variable x conjoined with the denotation of the remaining argument phrase given x; and the denotation of an argument is the denotation of a relation on the shared entity variable x and the phrasal entity variable y filtered through the denotation of the quantifier phrase (which for the moment is unchanged from last time).
The case of an argument containing a QP and specifier clitic instead of a QP and a Role just contains the explicit relation that the phrasal entity y is a subset of x. Note that this requires interpreting y and x not as representing single referents, but as sets of possible referents- an idea I introduced earlier in the series on semantics for a monocategorial language.
For now, we'll only deal with a single Projector: "ka", which indicates a simple declarative sentence. Its semantics are very simple:
[|P|] = G["ka"] = λy.∃x. y(x)
This just says that some x (which now refers to a set) exists, and its identity will be constrained by y (which is bound to the denotation of the argument phrase chain). We could build this into the interpretation of an S directly, but we will have to deal with the semantics of Projectors at some point, so we might as well start here.
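As a sanity check, here's this whole pipeline transcribed into Python over a toy model. The QP rule isn't restated in this post, so the sketch assumes a simple Part-1-style existential QP denotation (λp.∃y. y ⊆ G[N] & p(y)) and a set-valued reading of Roles; all the lexical content is invented for the example:

from itertools import combinations

U = frozenset({"alice", "e1"})
subsets = lambda u: [frozenset(c) for n in range(len(u) + 1)
                     for c in combinations(list(u), n)]

# Assumed QP denotation: λp.∃y. y ⊆ noun & y != {} & p(y)
qp = lambda noun: lambda p: any(y and p(y) for y in subsets(noun))

# Set-valued Role: [|R|](x)(y) holds when every member of y bears the
# relation to some member of x (consistent with the set-based posts).
AGENT = {("e1", "alice")}
role = lambda pairs: lambda x: lambda y: all(
    any((z, w) in pairs for z in x) for w in y)

# [|A: QP R|] = λx.[|QP|](λy. [|R|](x)(y)) ; [|A: QP e|] = λx.[|QP|](λy. y ⊆ x)
a_role = lambda q, r: lambda x: q(lambda y: r(x)(y))
a_spec = lambda q: lambda x: q(lambda y: y <= x)

# [|AP|] = λx.[|A|](x) & [|AP|](x), flattened here into one conjunction
ap = lambda args: lambda x: all(a(x) for a in args)

# [|S|] = [|P|]([|AP|]) with [|P|] = G["ka"] = λy.∃x. y(x)
ka = lambda f: any(f(x) for x in subsets(U))

person, event = frozenset({"alice"}), frozenset({"e1"})
print(ka(ap([a_spec(qp(event)), a_role(qp(person), role(AGENT))])))  # True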
We now have the ability to intersperse specifiers and other arguments in any order, with the members of the specifier set constrained by the specifier phrases, and the scope of all quantifiers corresponding exactly to their surface order!
Generalizing Quantifiers
The next step in building our model of WSL semantics is to improve the handling of quantifiers. As described in this article, the denotation of a Quantifier phrase will be represented not by a proposition in predicate logic, but by a set of sets of possible referents- all those sets of referents which contain the appropriate quantity of the type of referent identified by a given Noun phrase. This helps our model conform to the intuition that a bare noun or quantifier phrase does not correspond to a logical assertion- i.e., that a given entity exists- but merely to a possible entity itself. The semantics for any individual quantifier are given by a function which takes in the denotation of a Noun phrase and uses it to construct the appropriate set of sets. The new interpretation rule for QPs is as follows:
[|QP|] = G[Q]([|N|])
And some examples of Quantifier semantics are as follows:
G["ves"] = λy. {x ⊆ U : y ⊆ x} ("every" or "all"; i.e, the set of all sets in the universe U that contain the entire set y)
G["jest"] = λy. {x ⊆ U : |y ∩ x| > 0} ("some"; i.e, the set of all sets in the universe that contain at least on element of y)
G["hiq"] = λy. {x ⊆ U : |y ∩ x| = 5} ("five"; i.e, the set of all sets in the universe that contain exactly some five elements of the set y)
This of course also requires a new formulation of the semantics for Noun phrases that produces the basis set of "referents of the right type". The new Noun rule is as follows:
[|N: n N|] = G[n] ∩ [|N|]
[|N: 0|] = U
Rather than binding a new entity variable which we assert to satisfy a given predicate, or to be a member of a given set (given by looking up the Noun n in the lexicon), we simply directly construct the intersection of all of the sets that are the denotations of individual Nouns.
Finally, we have to update the interpretation rules for arguments (again) to handle the new kind of denotation for QPs:
[|A: QP R|] = λx. {y : ∃z. z ∈ x & [|R|](z)(y)} ∈ [|QP|]
[|A: QP e|] = λx.∃y. y ∈ [|QP|] & y ⊆ x
This says that a normal argument containing a role marker asserts that the set of all entities y which satisfy a particular relation with some element of the specifier set x is in the denotation of the QP; or, that an argument containing the specifier clitic asserts that some element y of the denotation of the QP is a subset of the specifier set x.
Unfortunately, we've just undone our progress in allowing for the correct quantifier scopes! In constructing a compositional semantics for quantifiers in terms of sets, we have thrown out the conventions for establishing variable scopes in predicate logic- because we eliminated the variables! In order to recover the proper quantifier scopes, we're going to have to find a way to take into account the constraints imposed in lower scopes while constructing the sets for higher-scoped quantifiers.
We'll take a look at that problem in the next post in the series.
Sunday, September 13, 2015
Non-intersective Nouns & Negative Scopes
This post is a follow-up to several prior discussions on formal semantics for a minimalistic monocategorial language. If you aren't familiar with them already, you may want to read part one and part two before proceeding.
Non-intersective Nouns
There is a class of adjectives called "non-intersective adjectives" because the set of referents of a noun phrase that includes them does not intersect the set of referents for the same noun phrase without.
This concept is best explained with examples: "a short basketball player" is also "a basketball player", so "short" is an intersective adjective--the set of "short things" intersects the set of "basketball players", and the meaning of the whole phrase is the intersection of those two sets.
On the other hand, "a former basketball player" is not "a basketball player", so "former" is non-intersective. The meaning of "former basketball player" is not a subset of "basketball player", and isn't formed by intersecting it with anything. It's a completely disjoint set, but one that does have a logical relationship to the set of "basketball players".
Similarly, if you "almost finish", then you do not "finish", so "almost" is a non-intersective adverb.
My formal education in formal semantics only really exposed me to one way of modelling non-intersective adjectives/adverbs: as higher-order functions that take predicates as arguments and produce new, different predicates. Thus, the predicate logic notation for a "former player" is former(player')(x) (vs. player(x)), for some entity x.
But this model is not very compositional (i.e., if a "former player" is former(player')(x), a "former basketball player" is... what exactly?), and results in icky complicated interpretation rules. As a result, I have so far mostly avoided them in WSL, and the few times I've needed them I've punted and just decided that they are taken care of by morphological affixes. That works as long as you just want to say something like "a former player", and leave the "basketball" (or any other additional descriptors) out of it.
Contemplating the programming language Prolog, however, provides a way out of this mess. Basic versions of Prolog do not have functions that can compute arbitrary values--just predicates that can be true or false. A Prolog system can, however, simulate functions by allowing you to query it about what the possible values are that would need to be plugged in to one or more argument positions of any given predicate in order to make it evaluate to "true". You can specify "input" values for whatever arguments you want, and for whatever arguments you leave unspecified, the Prolog system will give you a set of all possible values of those variables that satisfy the predicate as "output". Since Prolog is based on first-order logic, the exact same transformation for simulating functions works for modelling non-intersective adjectives in a predicate logic model for formal semantics.
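For anyone who hasn't used Prolog, the trick translates directly into any language: store the predicate as a set of facts, and treat whichever argument slots you leave unspecified as outputs to be enumerated. A toy Python analogue, with made-up facts:

# former(a, b): a is a former b-type thing (hypothetical facts)
FORMER = {("p1", "player1"), ("p2", "player2")}

def query(facts, a=None, b=None):
    # Enumerate (a, b) bindings that satisfy the predicate, treating any
    # argument left as None as an unspecified "output" slot.
    return [(x, y) for (x, y) in facts
            if (a is None or x == a) and (b is None or y == b)]

print(query(FORMER, a="p1"))  # [('p1', 'player1')]: the thing that p1 "was"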
So, non-intersective adjectives can be modeled as two-place predicates (with one input argument and one output argument) that specify the relation between the actual final referent of a noun phrase and the thing-that-it-isn't. Plus some mathematical machinery to glue it all together. That insight will allow us to add the capacity for translating non-intersective adjectives into our monocategorial language. I say "translate" because, being monocategorial, the language does not have a distinct class of "adjectives" to add non-intersective members to. Instead, it will have "non-intersective nouns" (or "non-intersective noun-jectives").
Introducing that machinery into our monocategorial language requires a little bit of updating to the existing interpretation rules. The altered syntactic rules are as follows:
[|P: w P|] = λx.λy.λz.[|w|](x, y, z, [|P|])
[|P: w ,|] = λx.λy.λz. [|w|](x, y, z, λx.λy.λz. y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)})
Essentially, rather than immediately evaluating the denotation of the current word and the rest of the phrase with the same arguments, we pass the denotation of the rest of the phrase into the semantic function for the current word, which will allow the lexical semantics of a word to control what arguments are given to the rest of the phrase. At the end of the phrase, we pass in the "default" expressions that account for the possibility that no explicit quantifier or relation words were present. Note also that I have introduced the comma-separated argument list notation as "sugar" for repeated application of a curried function, since the large number of parameters to our lexical semantic level has started to become unwieldy.
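Here's that continuation-passing plumbing transcribed into Python, with toy set-valued denotations. Each word's denotation receives the denotation p of the rest of the phrase and decides what arguments to hand it; the small inventory of thematic relations used to discharge the ∃r in the terminator is an assumption of the sketch:

U = frozenset({"a", "b", "e"})
AG = {("e", "a")}
RELATIONS = [AG]  # assumed inventory for brute-forcing the ∃r

def terminator(x, y, z):
    # the [|P: w ,|] default: y ⊆ z & ∃r. y ⊆ {w : ∃e. e ∈ x & r(e, w)}
    return y <= z and any(
        y <= frozenset(w for (ev, w) in r if ev in x) for r in RELATIONS)

def phrase(words):
    # [|P: w P|] = λx.λy.λz. [|w|](x, y, z, [|P|])
    if len(words) == 1:
        return lambda x, y, z: words[0](x, y, z, terminator)
    rest = phrase(words[1:])
    return lambda x, y, z: words[0](x, y, z, rest)

def noun(pred):  # class a: λx.λy.λz.λp. z ⊆ pred & p(x, y, z)
    return lambda x, y, z, p: z <= pred and p(x, y, z)

red, car = frozenset({"a"}), frozenset({"a", "b"})
print(phrase([noun(red), noun(car)])(
    frozenset({"e"}), frozenset({"a"}), frozenset({"a"})))  # True: a red car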
Of course, since we've changed the lexical semantic interface in the syntactic interpretation rules, we have to also update the templates for our lexical semantic classes. The existing classes are updated as follows:
a) λx.λy.λz.λp. z ⊆ red & p(x, y, z)
b) λx.λy.λz.λp. y ⊆ {w : ∃e. e ∈ x & ag(e, w)} & p(x, y, z)
c) λx.λy.λz.λp. x ⊆ run & p(x, y, z)
d) λx.λy.λz.λp. x = y & p(x, y, z)
e) λx.λy.λz.λp. |y| > |z - y| & p(x, y, z)
In every case, we simply pass along all the original arguments directly to the semantic function for the remainder of the phrase, and logically conjoin it to the original lexical semantic expression. However, we now have the option of messing with those arguments if we so please; thus, we can introduce an additional semantic class for non-intersectives:
f) λx.λy.λz.λp.
y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
∃x'.∃y'.∃z'. z ⊆ {a : ∃b. b ∈ y' & former(a, b)} &
p(x', y', z')
There's a lot going on here, but most of it is just book-keeping boilerplate. Let's break it down:
First, we cleanly terminate the description of the current referent; by the time we get to the end of the phrase, we won't be talking about the same thing anymore, so we have to throw in all the "just-in-case" assertions (that the referent set is some subset of the quantifier base and that it has some relation to the sentential event) in here as well. In the next line, we assert the existence of a new quantifier base and a new referent set (z' and y', respectively) and, crucially, a new event x'--because if we're talking about a "former basketball player", then the "basketball player" which-he-isn't doesn't have any relation to this clause's event, so we have to replace it with something else. We then assert that all members of the current quantifier base have a specific relationship (in this case, "former") to some member of the new referent set. And finally, we use the rest of the phrase, denoted by p, to describe the new referent and new event.
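Continuing the sketch from above (reusing its U, phrase, noun, terminator, and RELATIONS), a class-f word can be implemented as a brute-force search over the primed variables. The "former" facts, and the extra relation tying the new referent to an event, are again invented for the example:

from itertools import combinations

subsets = lambda u: [frozenset(c) for n in range(len(u) + 1)
                     for c in combinations(list(u), n)]

RELATIONS.append({("e", "b")})  # let the new referent relate to some event

def non_intersective(rel):
    # class f: terminate the current referent, then search for new x', y', z'
    # with z ⊆ {a : ∃b. b ∈ y' & rel(a, b)} and the rest of the phrase true.
    def w(x, y, z, p):
        return terminator(x, y, z) and any(
            z <= frozenset(a for (a, b) in rel if b in y1) and p(x1, y1, z1)
            for x1 in subsets(U) for y1 in subsets(U) for z1 in subsets(U))
    return w

FORMER = {("a", "b")}  # "a" is a former "b"-type thing
player = frozenset({"b"})
print(phrase([non_intersective(FORMER), noun(player)])(
    frozenset({"e"}), frozenset({"a"}), frozenset({"a"})))  # True: a former player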
The addition of this mechanism to the language has a few interesting long-range consequences. The most obvious is that word order actually matters. Up until now, we could've cheated on the phrase-structure rule and just said P → w w* , using the Kleene star operator for arbitrary repetition, instead of P → w P | w ,; but now, the fact that later words are in a smaller phrase embedded at a lower syntactic level than previous words is very significant. Changing the ordering of words around a non-intersective noun can change which referent those words are actually describing. Similar effects are present in English; a "former blue car", for example, is not necessarily the same thing as a "blue former car". The first one may still be a car, but of a different color, while the second is definitely no longer a car, but definitely is blue.
Note that while syntax now encodes new information in word order about referent scope, the syntactic rules no longer encode information about logical connectives; the implicit conjunction of all lexical items is now a function of lexical semantics instead. We could consider this an accidental artifact of the formal framework we're using, but we might also come back to it later and exploit the lexification of logical connectives to come up with some new lexical semantic classes.
The second long-range consequence is that class-c words now have a real essential function; in the beginning of a phrase, prior to any non-intersectives, they are essentially adverbs, unconnected to the phrasal referent, which can float between phrases with no change in sentential meaning. After a non-intersective, however, they no longer act on the clausal event. Instead, they describe some event that the new referent is a participant in. The same applies to class-b and -d relation words.
Negative Scopes
Most non-intersectives have an implicit "not" built in to them. A "former basketball player" is not a "basketball player", a "fake Picasso" is not a "Picasso", and while an "alleged thief" might be a "thief", he also might not. And it turns out that with the interpretive machinery we've built so far, we can actually translate "not" as a non-intersective noun with the following lexical semantics:
not: λx.λy.λz.λp.
y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
∃x'.∃y'.∃z'. z ∩ y' = {} &
p(x', y', z')
I.e., "the relation between the quantifier base for the current referent set and the new referent set is that they have no elements in common".[1]
We can thus expect all non-intersectives to behave somewhat similarly to negatives in the behavior they trigger in lower scopes.[2] This gives us a possible use for semantically-significant stress focus, in addition to the rising-falling intonation patterns that are used to denote phrase and sentence boundaries. We can use stress focus to indicate the specific reason that a referent "is not" something else. Referring to a previous example, the ambiguity of "former blue car" can be resolved by stating that it is either a "former *blue* car" or a "former blue *car*"--indicating that the reason the item in question is "former" is because it is no longer "blue" in the first case and no longer a "car" in the second.
More traditionally, we might want to make some special accommodations for phrases that are described by monotone-decreasing quantifiers, like "no" or "few", which are also "negative", in a slightly different way. Perhaps the "reason" for a decreasing quantifier can also be indicated by stress focus (do "few *men* work" or do "few men *work*"?) The practical impact of that level of ambiguity ("does this instance of focus refer to the quantifier or the non-intersective scope?") is likely to be minimal.
In either case, this will already be a very long blog post, so updating the syntax to recognize focused items is left as an exercise for the reader.
The Weird Ones
Non-intersective nouns, it turns out, don't have to be used only to translate what English encodes as non-intersective adjectives. They're just words that specify some arbitrary not-necessarily-intersecting relationship between the quantifier base for one referent set, and some other referent set. Y'know what else acts like that? Adnominal adpositions. E.g., prepositions that describe noun phrases. Also, genitives (for which English can use the preposition "of", but doesn't always).
Wanna say "Bob's cat climbs trees" in monocategorial form? "Cat agent of Bob, tree theme, climb event." The word "of" ends up as a non-intersective noun! Crazy! Incidentally, its semantics are as follows:
of: λx.λy.λz.λp.
y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)} &
∃x'.∃y'.∃z'. z ⊆ {a : ∃b. b ∈ y' & ∃r. r(a, b)} &
p(x', y', z')
I.e., "all members of this quantifier base have some kind of relation with some member of the new referent set".
And there are all sorts of normal English nouns that entail a relationship to some other unspecified referent. "Father", for example. A "father" cannot exist without a child, so the concept of "father" is naturally expressed in terms of a two-place predicate... which will be embedded inside the machinery for non-intersectives to allow describing both halves of the relationship, should you so desire[3]. So, "Bob's father builds houses"? In monocategorial form, it's "Agent father Bob, many house patient, build event". Note that in this case, the word order in the phrase "Agent father Bob" is critical. If we swap things around, we get the following different readings:
"father agent Bob": "Bob-who-does-something's father is somehow involved"
"Bob father agent": "Bob, who is the father of someone who does something, is somehow involved"
"Bob agent father": "Bob, who is a father, is the agent (builds houses)"
And if we want to cover both sides of the relationship at the same time: "John agent father Bob, many house patient, build event" ("Bob's father, John, builds houses.")
[1] Note that this is not the same as a translation for "no", the negative quantifier, which is addressed a little further down the page. This sort of "not" means "I'm talking about a referent that is identified by not being that other thing", as opposed to saying "no referents of this description are involved in the action".
[2] While similar, this is actually not the same thing as the "negative scopes" that license negative polarity items like "anymore" in English. Those are a feature of the behavior of certain quantifiers, like the negative quantifier "no" described in [1].
[3] WSL encodes these kinds of nouns as Roles. When we get back to the semantic model for WSL in a later post, we'll see the utility of a productive morphological derivation system to turn Roles into non-intersective Nouns and vice-versa.
Tuesday, September 8, 2015
Generalized Quantifiers for a Monocategorial Language
At the end of yesterday's post, I briefly mentioned the concept of generalized quantifiers. Today, I want to investigate how the semantics of the minimalistic language can be extended with that concept to eliminate the need for "built-in" existential quantification.
The first step is to recognize that noun phrases do not necessarily denote single referents. In a sentence such as "Every student goes to school", for example, the phrase "every student", while grammatically singular, does not refer to just one student; rather, it denotes a set of students (in this case, all of them), and the sentence makes a statement about the properties of that set- all of its members also belong to the set of things that go to school. Or, in other words, [|every student|] is a subset of [|goes to school|].
We can thus modify yesterday's semantics so that all entity variables actually refer to sets. In this case, the form of the syntactic interpretation rules need not change (although how we read them may be tweaked, replacing "There exists some x" with "There exists some set x"), but lexical semantics for each possible word type end up looking like this:
a) λx.λy. y ⊆ red
b) λx.λy. y ⊆ {z : ∃e. e ∈ x & ag(e, z)}
c) λx.λy. x ⊆ run
d) λx.λy. x = y
Note that predicates are defined by the set of arguments for which they are true. Thus, predicates are set-valued entities on which we can use set operators like ⊆ (subset) and ∈ (element of); the traditional predicate logic notation that we have been using so far, pred(x), is simply shorthand for the set-theoretic formula x ∈ pred.
The only major alteration introduced here is seen in the semantics for relation words, class b, which must be modified to explicitly construct the subset of entities which have a particular relation to some element of the set of events.
It is now straightforward to add a fifth semantic class of words which in some way restrict the cardinality of (or quantify) a set:
e) λx.λy. |y| = 5
(There could be an additional sixth class that restricts the cardinality of the event set, represented by the variable x, but for simplicity we will ignore that possibility for now.)
This works great for simple numerals (like 5, as shown in the example) and more vague things like "many" or "a few"; but, it causes problems for quantifiers like "most" or "one-third" which restrict the cardinality of the referent set compared to what it would have been if it were not quantified. We need some way of keeping track of that original maximal set.
In a more "normal" language, generalized quantifiers would operate at a separate syntactic level from nouns, and could take in the compositional denotation of the rest of a noun phrase all at once, and produce a new restricted set from it. In this language, however, we don't have that luxury. If we want to keep things monocategorial, we need to find some way of keeping track of the base set and restrictions on the final quantified set simultaneously as additional quantifiers and other words are added in arbitrary orders. This will require altering our lexical semantics to account for a third argument, and that will in turn require altering the syntactic interpretation rules to provide that third argument.
The altered interpretation rules look like this:
[|S|] = ∃x.[|C|](x)
[|C: P C|] = λx. ∃y. ∃z. [|P|](x)(y)(z) & [|C|](x)
[|C: P .|] = λx. ∃y. ∃z. [|P|](x)(y)(z)
[|P: w P|] = λx.λy.λz.[|w|](x)(y)(z) & [|P|](x)(y)(z)
[|P: w ,|] = λx.λy.λz. [|w|](x)(y)(z) & y ⊆ z & ∃r. y ⊆ {z : ∃e. e ∈ x & r(e, z)}
Here we have simply asserted the existence of an additional set variable, z, and added a term to define y, the set of referents for a phrase, to be a subset of z.
The new forms of the different lexical semantic classes look like this:
a) λx.λy.λz. z ⊆ red
b) λx.λy.λz. y ⊆ {w : ∃e. e ∈ x & ag(e, w)}
c) λx.λy.λz. x ⊆ run
d) λx.λy.λz. x = y
e) λx.λy.λz. |y| > |z - y|
Now, class-a noun-jective words operate on the set z, which forms the basis set for quantification. Relation words (classes b and d) act on y, which represents the actual referents of the phrase and is the result of quantification, and x, as before; and class-e quantifier words specify some relation between the set of referents y and the basis set z. The example given shows the semantics for the quantifier "most": the cardinality of the set of referents is greater than the cardinality of its difference with the basis set.
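Here's the whole three-argument pipeline run end to end in Python over a toy model. The lexical sets, and the small inventory of relations used to brute-force the ∃r at the end of a phrase, are assumptions of the sketch:

from itertools import combinations

U = frozenset({"r1", "r2", "e"})
RED = frozenset({"r1", "r2"})
AG = {("e", "r1"), ("e", "r2")}
RELATIONS = [AG]

subsets = lambda u: [frozenset(c) for n in range(len(u) + 1)
                     for c in combinations(list(u), n)]

# lexical classes as predicates over (x, y, z)
red   = lambda x, y, z: z <= RED                                      # class a
agent = lambda x, y, z: y <= frozenset(w for (e, w) in AG if e in x)  # class b
most  = lambda x, y, z: len(y) > len(z - y)                           # class e

def phrase(words):
    # [|P: w ,|]: conjoin all the words, then require y ⊆ z and that y bears
    # some relation to the event set x (the ∃r brute-forced over RELATIONS)
    return lambda x, y, z: all(w(x, y, z) for w in words) and y <= z and any(
        y <= frozenset(w for (e, w) in r if e in x) for r in RELATIONS)

# [|C: P .|] = λx.∃y.∃z. [|P|](x)(y)(z) ; [|S|] = ∃x.[|C|](x)
clause = lambda p: lambda x: any(
    p(x, y, z) for y in subsets(U) for z in subsets(U))
sentence = lambda c: any(c(x) for x in subsets(U))

# "most red agent": most of the red things are agents of some event
print(sentence(clause(phrase([red, agent, most]))))  # True in this toy model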
So far, we have not actually eliminated the need for logical existential quantifiers "built-in" to our semantics, but we have eliminated their semantic effect; all entities described by a sentence in the monocategorial language are no longer implicitly existentially quantified. Rather, the quantifiers in our predicate logic forms serve only to bind variables that we can use to refer to the different referent sets; and the referent sets are explicitly quantified by whatever quantifier words you feel like using. In the absence of any explicit quantifier word, sentences are evaluated as being true for some, unspecified, subset of the basis. This is exactly the same level of ambiguity present in natural languages (like Mandarin) which lack obligatory grammatical number.
Actually eliminating the existential quantifiers from our predicate logic forms would require directly constructing the relevant sets from unions and intersections, so as to eliminate the need for a common variable to use to tie the different parts together. That, in turn, requires separating the classes of quantifier and relation words from the class of noun-jectives, which cannot be done while maintaining the monocategorial analysis. It will, however, be possible to do so in WSL, which does have the necessary multiple syntactic levels.
It should also be noted that in adding quantifier words to the monocategorial language, we actually did not need to go all the way to introducing the full formalism of "generalized quantifiers"- and in fact, it looks like we can't, since the order of compositional operations which motivates the use of generalized quantifiers in the semantics of natural languages just doesn't exist here. The denotations of our monocategorial phrases are simple sets of referents, whose members have some thematic relation to the implicit event. In contrast, the denotations of natural-language noun phrases composed of nouns and generalized quantifiers are sets of possible sets of referents with the appropriately restricted cardinalities; and the correct set of referents is then extracted at the next higher level of composition, when a role is assigned to the noun phrase by a verb or adposition. Without that extra level, the monocategorial language must simply specify the final referent set directly. Again, however, WSL does have the more typical separation of nouns, quantifiers, and role-assigning words at different syntactic levels, and so we will be able to explore a more naturalistic analysis for that language.
For more thoughts on the monocategorial language, see this following post.
Monday, September 7, 2015
A Sister Language for WSL
My last post was triggered by recent discussions on CONLANG-L which inspired me to start formalizing the semantics of WSL. But after I started formalizing WSL, those discussions kept going. And as it happens, I got inspired to start on the design of a new language, similar in basic structure to WSL but also very different. So, before we get to Part II of WSL's semantic model, we're going to take a brief detour through the basic design of a new sister language.
Background: This idea came out of a discussion on how to describe the semantics of a monocategorial language whose complete syntax could supposedly be described by the following simple grammar:
S → w S | 0
Or, "A sentence consists of a list of words." That's it. Any words in the language, in any order- all of them are grammatical sentences. Which, really, is equivalent to "no syntax at all". The obvious choice for semantic rules when presented with that syntax (or lack thereof) is David Gil's polyadic association operator, but the creator was adamant that that was not a correct analysis of his language. My conclusion was that "S → w S | 0" was simply not the correct grammar as claimed, but rather that it was a two-level structure that grouped words into distinct phrases, but where phrase boundaries are maximally ambiguous. This still permits every possible linear arrangement of words as a valid grammatical sentences, since the order of words within a phrase and the order of phrases within a sentence are still completely free.
With only one class of words and completely free word order, resulting in no way to tell where phrase boundaries are and how words should be grouped together, such a language would initially seem to be fairly useless- even discounting lexical ambiguity, the number of possible phrase groupings grows exponentially with the length of a sentence (each of the n-1 gaps between n words may or may not be a phrase boundary, giving 2^(n-1) groupings)- an ambiguity load that dwarfs what exists in any natural language, and would quickly swamp what you can reasonably handle with pragmatics.
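Here's a quick way to see that growth (a throwaway Python sketch of my own, counting the possible phrase groupings of a linear word string):

def groupings(words):
    # each of the n-1 gaps between n words is a boundary or not: 2**(n-1) options
    n = len(words)
    for mask in range(2 ** (n - 1)):
        phrases, cur = [], [words[0]]
        for i in range(1, n):
            if mask >> (i - 1) & 1:
                phrases.append(cur)
                cur = [words[i]]
            else:
                cur.append(words[i])
        phrases.append(cur)
        yield phrases

print(sum(1 for _ in groupings(["w1", "w2", "w3", "w4", "w5"])))   # 16 == 2**4

And that still ignores the free ordering of words within each phrase.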
In a spoken language, though, more function words or morphological words would not necessarily be required to eliminate that ambiguity- phrase and sentence boundaries could be quite adequately delimited by intonation. And intonation can in turn be encoded in text via appropriate punctuation, while still reasonably claiming that this is a monocategorial language at the lexical level (although it will have multiple types of internal syntactic nodes). I've never really played with the intonation rules for a conlang before, and especially not the effect of intonation on semantics; and I haven't seen much of that documented in other people's conlangs, either. So, this is a pretty enticing opportunity to really isolate the semantics of suprasegmental intonation.
Now, the point of WSL was to create something that very obviously does not have anything that could reasonably be called a category of "verbs" at any level, but not necessarily to be simple or minimalistic. And WSL does in fact have quite an array of different parts of speech. But for this one, the aim will be to see how far it can go before it becomes necessary to add any additional lexical classes.
The syntax of this new language ends up looking like this:
S → C
C → P C | P .
P → w P | w ,
This reads as "A sentence consists of a clause, a clause consists of a phrase followed by another clause or a phrase followed by a period, and a phrase consists of a word followed by another phrase, or a word followed by a comma." We also specify the phonological / orthographical rule that a sequence of ", ." coalesces into a single "."
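As a sanity check that the grammar really is this trivial to process, here's a toy Python reader for it (the word forms in the example are made up):

def parse(tokens):
    # S → C; C → P C | P "."; P → w P | w ","
    phrases, cur = [], []
    for tok in tokens:
        if tok in (",", "."):
            phrases.append(cur)   # "," closes a phrase; "." closes the last one
            cur = []
            if tok == ".":
                break             # end of the clause/sentence
        else:
            cur.append(tok)
    return phrases

print(parse("red ball ag , run ev .".split()))
# [['red', 'ball', 'ag'], ['run', 'ev']]

Note that the ", ." coalescence falls out for free here, since a bare "." closes both the phrase and the clause.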
At the phonological level, the "," and "." are realized as particular intonation patterns on the preceding phrase. I'm not wedded to anything yet, but I'm thinking rising tone over the last word of a phrase for ",", and contrasting falling tone for ".". That would lead to an intonation pattern over a whole sentence that consists of a series of level tones followed by rises, and then terminated by a fall.
The extra level of rules that turns an S into a C may seem superfluous (and if we just want to describe syntactic structure by itself, it is), but the extra level makes the semantic interpretation rules much simpler.
Those basic interpretation rules look like this:
[|S|] = ∃x.[|C|](x)
"There exists some x such that the denotation of C is true for x."
[|C: P C|] = λx. ∃y. [|P|](x)(y) & [|C|](x)
[|C: P .|] = λx. ∃y. [|P|](x)(y)
"For some x, there exists some y such that the denotation of P applied to x and y is true, and
the denotation of C is true for x."
[|P: w P|] = λx.λy.[|w|](x)(y) & [|P|](x)(y)
"For some x and y, the denotation of w and the denotation of P applied to x and y are true."
[|P: w ,|] = λx.λy. [|w|](x)(y) & ∃r. r(x,y)
"For some x and y, the denotation of w applied to x and y is true and some relation r exists between x and y."
Basically, this is just a fancy mathematically formalized way of saying that a sentence describes an event which gets passed into each sub-clause, and then each phrase describes its own separate entity, and the meaning of the whole sentence is just the conjunction of the meanings of each word, applied to the whole-sentence event and the entity for that word's containing phrase, along with the assertion that the entity for a phrase has some kind of relationship to the sentence.
Every word in the language has the "semantic interface" of a two-place predicate, or a two-argument curried lambda expression, taking in an event variable and an entity variable and specifying some restriction on either or both referents and/or a relationship between them.
Some words will be simple predicates that restrict the referent of the phrase, or tell you about its properties. They will have meanings that look something like this:
a) λx.λy. red(y)
which completely discards the event and just applies some predicate (in this case, "red") to the entity variable.
Some other words will be two-place relations that tell you about the thematic role of the entity in relation to the event. They will have meanings like
b) λx.λy. ag(x, y)
which tells you that the referent of this phrase (represented by the entity variable y) is the agent of the event.
And a third class of words will tell you about the event itself. These words could come in two sub-varieties; things that look like
c) λx.λy. run(x)
which discard the entity variable and just apply a predicate to the event; and things that look like
d) λx.λy. x = y
which tells you that the entity for this phrase is, in fact, an event, and that the event is a subset or superset of (or, in this particular case, simply is) the entity described by the enclosing phrase.
Now, semantic class c has the interesting property that, since the meanings of words in that class do not depend on the entity of the phrase, they can appear in any phrase in a sentence without altering the literal meaning. That's a fairly unique behavior, and could be used to argue for recognizing them as a separate part of speech from the rest, but they don't have to be analyzed so. Their syntactic behavior is undistinguished from every other word. Even so, I'm not sure if I will want to include some in the language for "fun", or if they should be disallowed so as to avoid the argument.
Also, the boundary between classes b and d is very fuzzy, since subset, superset, and identity could just as well be modeled as binary relations between a phrasal entity and a sentential event as things like "agent" and "patient" are.
Finally, the a category, which would typically seem to correspond with nouns and adjectives, also does not have any distinguished behavior compared to classes b, c, and d. Relation words and event words can be left out, and you can have a complete sentence that consists only of class-a semantic noun-jectives, which are asserted to exist and to have some unspecified relation to some unspecified event[1]. In WSL, role markers are obligatory, but here we have the extra "& ∃r. r(x,y)" in the interpretation of phrases just to account for the case where words with the semantics of a role marker are missing.
On the other hand, you can also leave out all class-a words, and have a complete sentence that consists only of class-b relations; and the same applies to the last two classes of event words as well. Finally, there are no selection rules that cause a word of any of the four classes to disallow the use of any other particular class in the same phrase or sentence; some combinations of words may be contradictory or nonsensical, but every string of words is grammatical, and can be interpreted.
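To check that all of these pieces fit together, here is a minimal model-theoretic sketch of the interpretation rules in Python; the domain and the four-word lexicon are toy assumptions of my own:

events = {"e1", "e2"}
entities = {"dog1", "cat1"} | events      # events may themselves be referents

red, run, agent = {"cat1"}, {"e1"}, {("e1", "dog1")}

lexicon = {
    "red": lambda x, y: y in red,         # class a: property of the entity
    "ag":  lambda x, y: (x, y) in agent,  # class b: thematic relation
    "run": lambda x, y: x in run,         # class c: property of the event
    "ev":  lambda x, y: x == y,           # class d: the entity is the event
}

def phrase_true(words, x):
    # [|P|](x): some entity y satisfies every word in the phrase; the trailing
    # "∃r. r(x, y)" conjunct is trivially satisfiable and omitted here
    return any(all(lexicon[w](x, y) for w in words) for y in entities)

def sentence_true(phrases):
    # [|S|] = ∃x. [|C|](x): some event makes every phrase true
    return any(all(phrase_true(p, x) for p in phrases) for x in events)

# One phrase names the agent, one describes the event itself: "dog1 runs"
print(sentence_true([["ag"], ["run", "ev"]]))   # True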
It should also be possible to represent quantifiers in this framework, as totally undistinguished words at the syntactic level which merely happen to have another different internal structure in their lexical semantics. This would allow getting rid of some of the built-in existential quantifiers, but will first require removing a few layers of abstraction from my current semantic notation in order to uncover the set-theoretic mechanics of generalized quantifiers. My efforts to that effect are detailed in this follow-up post.
Next, I'd like to figure out some useful application for stress-marked focus, which could be indicated orthographically with Capital Letters or something. That will take some thinking, since English examples often rely on the semantics of some focus-sensitive lexical item, and using it that way would provide a good argument for recognizing focus-sensitive items as a second part-of-speech. But some really simple rising/falling intonation gets us pretty dang far doing nothing but marking linear phrase boundaries!
[1] Which means that elliptical answers to questions aren't actually elliptical at all- they're still complete grammatical sentences!
Tuesday, September 1, 2015
A Progressive Model of WSL Syntax & Interpretation: Part 1
Late last year, as the result of a challenge to create a language with no distinction between verbs and nouns (and more strictly, with no verb phrases at all), I began working on a new conlang called WSL. Originally, this stood for "Weird Syntax Language", but has since been backformed to an acronym for the autonym "Wjerih Sarak Lezu", whose documentation is being slowly fleshed out in the linked-to Google Doc.
In response to a follow-up challenge, I have now started the process of producing a formal description of the syntax and semantics of WSL. While relatively simple compared to most languages, WSL is still rather intimidating to make a full syntactic model of in one go. So, I'm going to take the approach of doing little pieces of it at a time, with commentary.
First, we start with the most basic requirements for predicate-argument structure:
S → a e A
A → a p A | 0
This is (almost) equivalent in expressive power to the syntax for a fully binarized neo-Davidsonian predicate logic notation that has recently been under discussion on the CONLANG-L mailing list, whose syntax is given by the single production rule
S → paaS | 0
Where p represents a binary predicate and a represents an argument variable.
Compared to that, the WSL grammar has been altered in two significant ways:
- p follows a instead of preceding
- Binary predicates are replaced with unary predicates on the surface, and the first (potentially Davidsonian) logical argument of each predicate is required to be identical and specified once separately (being distinguished by the e token rather than a following p).
Extracting the common first argument reduces the expressive freedom of this grammar compared to the paaS language, in that any situation that requires referring to multiple entities that both occupy first-argument positions requires multiple sequential clauses. In exchange, we get the benefit of much reduced repetition.
In a more standard predicate logic notation, like paaS, as would be variables or constants with unique referents, and we would require predicates both to indicate the argument place occupied by each variable and to restrict the referents of variables in so doing. In WSL, however, we define ps to represent two-place predicates which specify named argument positions, and as to specify one-place predicates which each take a unique argument that is not present in the surface syntax. The e can then be a unique symbol (in fact represented by the phonologically-variable clitic <=u>), since all the information about the identity of the shared first semantic argument will be provided by the a that precedes it.
The next level of complexity looks like this:
S → N e A
A → N r A | 0
N → n N | 0
Here, I have replaced the a for argument places with n (for 'noun'), due to the insertion of an N(oun) phrase layer between the individual one-place-predicate words and the A(argument) phrase level, which contains a two-place predicate. The representation of two-place predicates has also changed, replacing p with r (for 'role'). This now allows us to use multiple logical predicates (represented by multiple surface nouns) to describe the same argument (i.e., to take the same implicit semantic variable as their argument). This allows us to translate English phrases which, for example, use adjectives to describe nouns, or adverbs to describe verbs, except that WSL syntax does not distinguish the adjectives from the noun or the adverbs from the verb. Each shared variable can, however, still occupy only a single role. To relax this restriction, we make the following additional modification:
S → N e A
A → N R A | 0
N → n N | 0
R → r R | 0
Now, we allow multiple two-place predicates to take the same semantic arguments (where each level of A-phrase embedding introduces a new semantic variable), thus allowing for the expression of reflexives (among other things), as well as allowing multiple one-place predicates to take the same arguments. Given an appropriate range of lexical semantics selections for the predicates, this allows for the expression of arbitrarily complex semantic graphs (within the space allowed by the restriction that all two-place predicates share one common first argument) among arbitrarily-precisely described referents.
The next significant addition to the syntax is to allow the use of explicit quantifiers ("all", "most", "some", etc.). That is done as follows:
S → QP e A
A → QP R A | 0
QP → Q N
N → n N | 0
R → r R | 0
It may at first seem like we could have avoided adding an extra rule, and just modified the production rule for S to S → Q N e A, and for A to A → Q N R A | 0; it will become important later, however, that Q and N are bound together in a Quantifier Phrase, and that Quantifier Phrases are in fact internal to Arguments.
We now have enough of WSL syntax built up to describe some basic, but interesting, declarative sentences. With that foundation laid, we will introduce the interpretation rules that give meaning to the syntax.
(In actuality, all grammatical WSL sentences require an additional part of speech known as a Projector, which distinguishes, for example, declarative sentences from questions. The semantics of projectors are, however, rather complex; thus, we will ignore them for now and work strictly with declarative sentences with no explicit projector.)
In the notation for interpretation, [|x|] is used to indicate the denotation of the syntax x; in cases where some particular type of syntactic node may have multiple options for the kind and arrangement of daughter nodes that it contains, [|x : y...|] is used to indicate the denotation of some syntactic node x consisting of daughters y....
G[s] is used to indicate looking up the meaning of the symbol s in the lexicon, and G[x:s] is used to disambiguate homonymous symbols belonging to different syntactic categories given by x, in the case where their denotations are different.
The denotation of any null syntactic node will be assumed to be empty; there is, however, still the possibility of phonologically-null lexical items, which have contentful denotations, and occupy non-null syntactic nodes. This is the primary use-case for the G[x:s] notation- to distinguish the different kinds of null lexemes.
The interpretation rules for this subset of WSL are as follows:
[|S|] = [|QP|]([|A|])
[|A|] = λx.[|QP|](λy. [|R|](x)(y) & [|A|](x))
[|QP|] = λz.G[Q]([|N|])(z)
[|N|] = λy.G[n](y) & [|N|](y)
[|R|] = λx.(λy. G[r](x,y) & [|R|](x)(y))
And the forms of the denotations for lexical items (or phrases) in the classes of Q, n, and r are:
G[Q:_] = λz.λw. _y. z(y) → w(y); i.e., some quantifier (represented by the placeholder _) binds a variable y and provides that variable to both of its arguments, where the first argument is the denotation of a noun phrase whose truth value implies the truth of the second argument.
G[n:] is always some monovalent predicate.
G[r:] is always some bivalent predicate.
Note that these definitions contain no free variables. All variables are bound by either a quantifier or a lambda expression. This allows us to freely rename variables for clarity and to ensure that we can perform valid beta reductions of lambda expressions in any order.
Temporarily ignoring the syntax and semantics of projectors, we can now fully interpret many simple sentences like
"Ka vesu jes i ajs tey mot."
The parse of this sentence (again dispensing with the projector) is
(S (QP (Q "ves") (N (n 0))) (e "=u")
(A (QP (Q 0) (N (n "jes"))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot"))))))
And the first few steps of the semantic derivation are:
[|(S (QP (Q "ves") (N (n 0))) (e "=u")
(A (QP (Q 0) (N (n "jes"))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot"))))))
|]
= [|(QP (Q "ves") (N (n 0)))|]([|
(A (QP (Q 0) (N (n "jes"))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot")))))
|])
= G["ves"]([|n:0|])([|
(A (QP (Q 0) (N (n "jes")))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot"))))
|])
= ∀z. [|n:0|](z) →
[|(A (QP (Q 0) (N (n "jes")))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot"))))
|](z)
= ∀z. U(z) →
[|(A (QP (Q 0) (N (n "jes")))
(R (r "i"))
(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot"))))
|](z)
Note here that the denotation of the null noun is the universal predicate U, which is always true for any argument. Thus, we can simplify by removing "U(z) →" from the formula with no change in meaning.
Skipping a few steps for brevity, we get to
= ∀z. G[Q:0]([|(N (n "jes"))|])(λy. [|(R (r "i"))|](z)(y)
& [|(A (QP (Q 0) (N (N (n "ajs")) (n "tey"))) (R (r "mot")))|](z))
= ∀z. ∃b.[|(N (n "jes"))|](b) →
(λy. [|(R (r "i"))|](z)(y)
& [|(A (QP (Q 0) (N (N (n "ajs")) (n "tey")))
(R (r "mot")))|](z))(b)
Note here that the null quantifier has the semantics of an existential. After a long string of additional reductions, we end up with
= ∀z. ∃b. G[n:"jes"](b) →
ag(z, b) & ∃d. (λy. G[n:"tey"](y) & G[n:"ajs"](y))(d) →
G[r:"mot"](z)(y)
= ∀z. ∃b. 1sg(b) →
ag(z, b) & ∃d. place(d) & this(d) → near(z, d)
"For all entities z, there exists some entity b such that b is me and[1] that b is the agent of z and that there exists some entity d such that d is a place and d is 'this-ish' (i.e., nearby and capable of being pointed at), which implies that z occurs near d."
Or, in normal English: "I do everything around here."
Note, however, that this is an extremely hyperbolic version of "I do everything around here." Literally, it means that there exists nothing which is not both close by some other place that's near me and of my doing.
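To make that hyperbole tangible, here is a small Python sketch of the interpretation rules above, checked against a toy model. The extensions and the domain are my own assumptions, and (following the standard reading rather than the literal formula- see footnote [1]) the null existential quantifier is implemented with conjunction rather than implication:

DOMAIN = {"e1", "me", "here"}             # one event, plus me and here
ext = {
    "0":   DOMAIN,                        # null noun: the universal predicate U
    "jes": {"me"},                        # 1sg
    "ajs": {"here"}, "tey": {"here"},     # 'this' and 'place', both true of here
}
rel = {
    "i":   {("e1", "me")},                # ag(z, b)
    "mot": {("e1", "here")},              # near(z, d)
}

def N(nouns):                             # [|N|]: conjunction of noun predicates
    return lambda y: all(y in ext[n] for n in nouns)

def Q(q, npred, scope):                   # G[Q]: "ves" is ∀; the null Q is ∃
    if q == "ves":
        return all(not npred(y) or scope(y) for y in DOMAIN)
    return any(npred(y) and scope(y) for y in DOMAIN)

def A(node, x):   # [|A|] = λx. [|QP|](λy. [|R|](x)(y) & [|A|](x))
    if node is None:
        return True
    q, nouns, roles, sub = node
    return Q(q, N(nouns),
             lambda y: all((x, y) in rel[r] for r in roles) and A(sub, x))

# "Ka vesu jes i ajs tey mot.": ves-0 is the specifier, then (jes, i), (ajs tey, mot)
inner = ("0", ["ajs", "tey"], ["mot"], None)
outer = ("0", ["jes"], ["i"], inner)
print(Q("ves", N(["0"]), lambda z: A(outer, z)))
# False: "me" and "here" are not themselves events of my doing occurring near
# here, so the literal universal claim already fails in this tiny model.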
Expressing a more typical meaning for the English sentence like "for all z such that z is near here, I do z" requires additional syntactic machinery to insert the necessary qualifications into the body of the quantifier phrase, which I may or may not ever get around to actually formalizing.
Also note that this model contains no rules for quantifier raising; thus, other possible readings of the English version, like "there is some specific place ('here') near which everything is of my doing", also cannot be expressed. It turns out that WSL does not have quantifier raising at all (not merely excluded from the subset described so far)- the scope of quantifiers in the semantics is exactly given by the order of quantifier phrases in the surface syntax. Expressing different quantifier scopes thus requires some mechanism for allowing the clause specifier (the first QP which is not part of an argument phrase, marked by the e symbol which is realized on the surface as the clitic <=u>) to move around to non-initial positions.
Formalizing the semantics for a larger subset of WSL that allows arbitrary specifier placement to control quantifier scope is rather complicated (as it requires splitting the interpretation of argument phrases in half), so I shall leave that for a later installment.
[1] Alternately: "there exists some entity b such that b being me implies...."
Thursday, June 11, 2015
Uniform Call Syntax in Parles
After a long hiatus, I've come back to do a little more work on Parles. The GitHub repository now has a very simple but functioning compiler and VM that can run very simple programs. Now, I've got some thoughts on the next feature I'd like to add.
Uniform Function Call Syntax is a concept implemented in several modern programming languages (like D, or Magpie) which helps make the core language smaller by treating object-oriented method calls, of the form foo.bar(baz), as simple syntactic sugar for function calls of the form bar(foo, baz). If you remove the parentheses, it becomes obvious that this is a fairly simple reordering rule: foo . bar baz gets rewritten as bar foo baz, with bar and foo switching places.
Parles already has two similar bits of token-reordering syntactic sugar (the ; and | operators), so adding a third one should really be no big deal. Since parentheses are already allowed in exactly the positions that you would put them in a method call in a language like C++ or JavaScript, this minor addition would allow Parles to simulate OO method call syntax exactly. We just need to figure out the rules for what exactly gets swapped with what.
The obvious simple solution is to just swap expressions (individual words or blocks). But, what if the expression preceding the . produces multiple values? Calling a method on multiple values doesn't make much sense. It would be nice to define the . operator so that it requires the programmer to use it in a way that "makes sense" in terms of OO method calls, rather than as an arbitrary re-ordering operator. To handle that, we'll need to look at type information to guarantee that the output of foo is in fact a single value. It's also important that foo not require any arguments- i.e., that it actually be a value, not a partially-applied function call.
Additionally, we need to ensure that the method name (whatever comes after a .) is in fact a name - a single word - not an arbitrary expression. In other words, we want to make sure that programmers can write things like foo.bar, but not foo."hello world" or foo.(+ 1 2). That's a very simple syntax check that doesn't require any type information. It may also be a good idea to ensure that bar is a function that takes at least one argument (namely, foo).
These typing rules are similar to the restrictions that come with parens, and can be handled in a similar way. So, we'll start out by simply swapping the expressions on either side of a ., but adding implicit type annotations to the AST so that the type checker can ensure that a method call "makes sense".
Method Chaining
Swapping single expressions on either side of a . lets us do simple method calls like foo.bar(baz), which gets re-written as bar foo (baz). But what if we want to chain method calls together, writing something like foo.bar(baz).qux(fred)? If . associates to the left, this gets re-written in two steps as bar foo (baz).qux(fred) and then bar foo qux (baz) (fred), which is not what we want! It should be qux bar foo (baz) (fred). We don't want to use just the first expression to the left as the first argument to qux - we want to use the whole preceding method call. There are several possible approaches to fixing this.
First, we could try using type information, just like we would for auto-parenthesization. There is some circularity involved if we use type information to figure out how to re-order things, because we can't actually do type unification until we know the proper order for all expressions, but there is a way around that: only the left argument is potentially ambiguous (since we already said the right argument must always be a single word), so the type checker just needs to keep track of the types of all contiguous subsequences of expressions up to the point where it encounters a . operator. That takes O(n^2) time for arbitrarily long subsequences, but we can skip sequences that cross block boundaries or ; or | operators. In that case, splitting a long program into two halves with a single block or re-ordering operator also cuts the typechecking time in half. As long as the basic chunks are small enough, this should be reasonably fast, close to O(n) time to typecheck a complete program. With progressive typechecking, every time a . is encountered, the compiler can then select some previous sequence that takes no arguments and returns a single value, and use that sequence as the left argument for the . reordering operator. If the Parles implementation does auto-parenthesization, the necessary type information will have to be calculated anyway, and re-using it for method call left-argument identification adds no additional typechecking overhead. Given that there may be more than one type-compatible left subsequence, though, we still need a rule for how to consistently select exactly one of them. There are two sane options: take the longest type-appropriate sequence, or the shortest.
The longest possible sequence is unlikely to be what you really want in most cases, and it potentially requires programmers to hold a lot more stuff in their heads to understand how any particular program is put together. Using the shortest possible sequence seems like a much better choice; it keeps the scope of reordering smaller and easier to think about, and in most cases will identify just a single expression anyway. Unfortunately, that is also a problem. For example, if baz returns a single value, then foo.bar(baz).qux(fred) will end up being re-written as bar foo qux (baz) (fred), which is not what we wanted! This approach would work fine for chaining method calls with functions that take zero, two, or more arguments, but fails for methods of one argument, which is not an uncommon case.
So, it seems that relying on the typechecker to tell us how to rewrite method calls is not going to work. The second option is to simply ignore the problem, continue to only swap single expressions on either side of a ., and make programmers explicitly mark off whatever they want to function as the left argument in a method call with a block of some sort whenever it is larger than a single word. In that case, we'd just require foo.bar(baz).qux(fred) to be written instead as (foo.bar baz).qux(fred), or {foo.bar baz}.qux(fred). That fails to faithfully simulate how method chaining works in other languages, though, and is thus rather unsatisfying.
The third option is to still require explicit delimiters for left method arguments, but in a slightly more subtle way: we can simply always collect everything to the left of a . up to the last enclosing block boundary, variable binding, |, or ;, regardless of type information. That means that, if you want to embed foo.bar(baz) in the middle of a larger line, you'll probably have to explicitly mark it off as {foo.bar(baz)} so that the . operator doesn't suck up anything more to the left of foo. But honestly, if you're writing Parles code in an imperative, OO style where method call syntax makes sense, you're not likely to want to embed {foo.bar(baz)} in the middle of a line- you'll stick it on a line by itself terminated by a semicolon and beginning with a variable binding and/or preceded by a block boundary or another line terminated by a semicolon.
After all that consideration, then, that's what Parles will do: . becomes a straightforward re-ordering operator working purely at the surface syntactic level, like ; and |, which swaps everything to the left up to the last enclosing block, variable binding, semicolon, or pipe, with a single following word, and which introduces implicit type annotations similar to those implied by parentheses.
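As a sanity check on that rule, here's a short Python sketch of the surface rewrite (my own illustration, not the actual Parles implementation; variable bindings are omitted, and tokens are pre-split):

OPEN, CLOSE, BREAKS = {"(", "{"}, {")", "}"}, {";", "|"}

def split_left(out):
    # index just past the last *unclosed* block opener, ";", or "|"
    depth = 0
    for j in range(len(out) - 1, -1, -1):
        if out[j] in CLOSE:
            depth += 1
        elif out[j] in OPEN:
            if depth == 0:
                return j + 1
            depth -= 1
        elif out[j] in BREAKS and depth == 0:
            return j + 1
    return 0

def rewrite_methods(tokens):
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == "." and i + 1 < len(tokens):
            j = split_left(out)
            left = out[j:]             # everything back to the last delimiter...
            del out[j:]
            out.append(tokens[i + 1])  # ...swaps with the single following word
            out.extend(left)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

print(" ".join(rewrite_methods("foo . bar ( baz ) . qux ( fred )".split())))
# qux bar foo ( baz ) ( fred )

Note how the chained case comes out right without any type information at all, which is the whole point of the third option.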
Addendum: the git commit which implements method call syntax can be seen here on GitHub.