Scholarly article on topic 'New methods for analyzing case and adposition meaning'

New methods for analyzing case and adposition meaning Academic research paper on "Languages and literature"

Academic journal
Language Sciences
OECD Field of science
{Case / Adposition / Semantics / Pragmatics / "Formal methods" / "Higher-order logic"}

Abstract of research paper on Languages and literature, author of scientific article — Erkki Luuk

Abstract The paper proposes two new methods for analyzing case and adposition meaning. The method for analyzing case and adposition semantics is based on an analysis of semes (semantic components) as arguments and predicates in a higher-order logic. The method for case and adposition pragmatics is based on an analysis of case and adposition functions as a series of functional derivations. The methods are complementary, cross-linguistically universal and allow for cross-category generalizations. A thorough discussion of the methods (including comparisons with earlier ones), formal definitions of case/adposition meanings and a variety of examples are provided.


Language Sciences xxx (2012) xxx-xxx


New methods for analyzing case and adposition meaning Erkki Luuk *

Institute of Computer Science, University of Tartu, Postimaja pk 149, Tartu 51004, Estonia


Article history:

Received 13 September 2011; received in revised form 15 June 2012; accepted 13 August 2012; available online xxxx

Keywords: Case; Adposition; Semantics; Pragmatics; Formal methods; Higher-order logic



© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

1.1. Case and adposition

Case is a system for marking dependents for the type of relationship they bear to their heads (Blake, 2001, p. 1). The head is usually the head of the clause (frequently although not necessarily (Luuk, 2010) a finite V) but could also be, e.g., the head of a possessive phrase. The dependents are usually (but again not necessarily) nominals, interpreted here as the set {N, ADJ, PRO}. Case and adposition, while formally different (ADP is a word and CAS an affix), are partly co-extensive semantically and pragmatically (Blake, 2001; Butt, 2006). However, there is probably no total functional overlap: I am not sure that there are languages with adpositional equivalents of ABS or ERG. Still, this is due more to the parsimony of grammar than to semantic reasons. It is easier to make the basic distinction between S and DO with CAS or word order than to have separate words for it. Thus, we can be reasonably confident that CAS and ADP are functionally co-extensive and that a universal semantic analysis of cases would also apply to adpositions and vice versa.

However, as a rule of thumb, there are more adpositions than cases in a language. This phenomenon has its roots in usage frequency and the parsimony related to it. Words are more costly than affixes to produce and parse, so it is optimal to have affixes for more frequently used meanings and words for less frequently used ones (cf. Lestrade, 2010). Correspondingly, the functions of adpositions tend to be narrower than the functions of cases. Thus, as a rule, adpositions are defined more narrowly than cases over (roughly) the same semantic domain. In other respects they are clearly distinct. CAS is an affix, which is a type of morpheme, and morphemes are form-meaning pairs. Thus, CAS is a form-meaning pair where the form and sometimes also the meaning component systematically differ from those of an adposition, which is a word, frequently (but not necessarily) another type of morpheme. Below are a few examples of adpositions and cases in different languages:

* Tel.: +372 58896309. E-mail address:



= equivalence

→ derivation, implication, function or rewrite arrow (depending on the context)

:= definition

x' the semantics of x

x|y|z selection from mutually exclusive categories x, y, z

x/y/z selection from mutually nonexclusive categories x, y, z

ADJ adjective

ABL ablative

ABS absolutive

ACC accusative

ADE adessive

ADP adposition

ADV adverb

ALL allative

CAS case

cl case landmark

cs case subject

DAT dative

DEL delative

DO direct object

DP determiner phrase

ELA elative

ERG ergative

GEN genitive

ILL illative

INE inessive

LAT lative

LOC locative (case)

NOM nominative

O object

PAR partitive

PRO pronoun

PUR purposive

S subject

SEP separative

TRL translative

V verb



(1) Estonian

[...] (maja-s | maj-ja | maja juurde)

[...] (house-INE | house-ILL | house to)

[...] walks (in|into|to) (a|the) house

(2) Swahili

yeye anatembea (na|ndani ya) nyumba

(s)he walks (to|into) house

(s)he walks (to|into) (a|the) house

It is evident that cases and adpositions have meanings (as walks (in|into|to) a house do not mean the same, and neither do their Estonian and Swahili counterparts). It can also be observed that CAS and ADP are at least partly co-extensive across languages.

1.2. Semes or semantic components

The method we will use in analyzing CAS/ADP semantics is based on semanalysis or component analysis (e.g. Frawley, 1992). Seme or, equivalently, semantic component is defined as the smallest unit of meaning in language, the (non)presence1 of a certain property. For example, the semantics of snow can be analyzed into the set of semantic components {frozen, granulated, water}, while other partitionings are possible. For example, 'water' could be analyzed further into {substance, tasteless, odorless, transparent, nontoxic, with freezing point at 0 °C at one atmosphere pressure, with boiling point at..., etc.} until sufficient precision has been achieved.

1 In a binary system, the nonpresence of a property can imply the presence of another property and vice versa. For example, the nonpresence of PLURAL can imply the presence of SINGULAR.
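The stepwise refinement described above can be sketched in Python. This is only an illustration under stated assumptions: the table `DECOMPOSITION` and the function `semes` are hypothetical names, and the entry for 'water' is cut off at the components the text lists explicitly, since the original list is open-ended.

```python
from typing import Dict, FrozenSet

# Hypothetical decomposition table, restricted to the components named in the text.
# The list for "water" is open-ended in the text and is truncated here.
DECOMPOSITION: Dict[str, FrozenSet[str]] = {
    "snow": frozenset({"frozen", "granulated", "water"}),
    "water": frozenset({"substance", "tasteless", "odorless", "transparent", "nontoxic"}),
}


def semes(unit: str, depth: int) -> FrozenSet[str]:
    """Analyze `unit` into semantic components, refining at most `depth` levels."""
    parts = DECOMPOSITION.get(unit)
    if parts is None or depth == 0:
        # No further analysis is available or wanted: treat the unit as a seme.
        return frozenset({unit})
    result = set()
    for part in parts:
        result |= semes(part, depth - 1)
    return frozenset(result)


print(sorted(semes("snow", 1)))
print(sorted(semes("snow", 2)))
```

The `depth` parameter plays the role of "until sufficient precision has been achieved": a larger value replaces 'water' by its own components.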

To keep the focus, I will not digress much further into the topic of semes. In the hierarchy of semantic levels, they occupy the second level counting from the most elementary (Luuk, 2008). The third level is occupied by morphemes, while the first one is reserved for semantic primitives (but not in the Wierzbicka (1996, 2000) sense of lexical primitives; semantic primitives do not have to be lexicalized). As an example, one might hypothesize that the semantics of INE involves the following set of semes (at least some of which may be semantic primitives): {SPACE, LOCATION, OBJECT, 3-DIMENSIONAL}. In this case, it is clear that not all these possible semes are semantic primitives (as a semantic primitive cannot, by definition, be reduced to a more elementary meaning). Here, LOCATION, OBJECT and 3-DIMENSIONAL can be reduced to SPACE, which singles out the latter as the only possible semantic primitive in the set. Semantic primitives in this (non-Wierzbicka) sense are mentioned in passing in Gärdenfors (1998), Langacker (1987), Taylor (1999) and explored to a certain depth in Luuk (2008).

2. The method for analyzing case and adposition semantics

2.1. The method

The method I am proposing involves a depiction of case semantics in a higher-order logic with semes and seme complexes as predicates and arguments.2 For example, the semantics of ALL and onto can be analyzed as

(3) DG(cs,RS(cl))

where DG := goal-determined direction; cs := case subject; cl := case landmark; RS := surface region (see Appendix A). All these are semes or seme complexes (at least two semes in both DG and RS). Thus, DG is a two-place predicate taking cs and RS(cl) as arguments. With DG, I am following the observation that directional adpositions can be profiled only with respect to goal (e.g. (in)to, onto), source (from, off) or route (across, via) (Jackendoff, 1983; Zwarts and Winter, 2000). As adpositions tend to have more specific meanings than cases, the observation naturally extends to cases. Throughout our walk through case/adposition semantics, cs and cl will appear as the argument variables ('objects' in the ontology developed in Appendix A), whereas predicates over them are usually constants. It is likely that one will need more lowest-order arguments than cs and cl for describing certain other cases and adpositions (e.g. DAT, which subsumes three arguments). There is no fixed limit to the number of lowest-order arguments that the notation can accommodate but, for all practical purposes, 3-4 is probably sufficient. The semantics of a CAS or ADP subsumes the number (in our examples, 1-2) and valency (cs, cl) of the arguments, as well as the operation(s) with these. The arguments are semantic, not syntactic, categories, with names describing their roles: a case subject that usually acts upon, is oriented or moves to(wards), from or via a case landmark (in a 2-argument scenario). A straightforward semantic interpretation is given by the assignment hierarchy, which for (3) is

(4) cs>DG>RS>cl

in (other) words, a case subject is assigned a direction and goal, which is assigned a surface region, which is assigned a case landmark. For simplicity, > can be omitted, yielding

(5) csDGRScl

Observe that (5) is an infix notation3 of (3). Thus, the semantic interpretation of a case or adposition is given by the infix notation of its predicate formula.
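The relation between the predicate formula (3) and its readings (4) and (5) can be illustrated with a short Python sketch. The names `Pred`, `prefix` and `hierarchy` are hypothetical and not part of the paper. The sketch assumes only the two formula shapes that occur in (3): a two-place predicate is written between its arguments, and a one-place predicate (like RS over cl) is written before its argument.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Term = Union[str, "Pred"]  # a lowest-order argument (cs, cl) or a predicate application


@dataclass(frozen=True)
class Pred:
    name: str
    args: Tuple[Term, ...]


def prefix(term: Term) -> str:
    """Standard predicate-argument notation, e.g. DG(cs,RS(cl))."""
    if isinstance(term, str):
        return term
    return f"{term.name}({','.join(prefix(a) for a in term.args)})"


def hierarchy(term: Term) -> List[str]:
    """Flatten a formula into its assignment hierarchy (assumed reading rules)."""
    if isinstance(term, str):
        return [term]
    if len(term.args) == 2:
        left, right = term.args
        return hierarchy(left) + [term.name] + hierarchy(right)
    if len(term.args) == 1:
        return [term.name] + hierarchy(term.args[0])
    raise ValueError("only one- and two-place predicates are covered by this sketch")


# Formula (3): the semantics of ALL and onto.
ALL = Pred("DG", ("cs", Pred("RS", ("cl",))))

print(prefix(ALL))               # formula (3)
print(">".join(hierarchy(ALL)))  # assignment hierarchy (4)
print("".join(hierarchy(ALL)))   # infix notation (5)
```

Formulas with other shapes would need additional reading rules, which this sketch does not attempt to guess.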

2.2. Examples

2.2.1. Allative

(6) John puts a vase on(to) the table

2 Unlike first- and second-order logics, higher-order logics allow for predicates as arguments of other (higher-order) predicates. For example, P(K(x)) is a formula of higher-order logic (with arguments in brackets, predicates in front of the brackets, capital letters denoting predicates and small letters the arguments that are not predicates). Thus, K is the lower-order predicate as compared to P, and x is the lower-order argument as compared to K(x).

3 Cf. the two notations of the addition formula below, the first standard predicate-argument and the second infix: = (3, +(1,2)); 1+2 = 3.

(7) Estonian

John paneb vaasi laua-le

John puts vase table-ALL

John puts (a|the) vase on(to) (a|the) table

cs = vase; cl = table

(5) seems to work with (6)-(7) - the vase is assigned a DG, which is assigned a surface region, which is assigned (a|the) table.

(8) Estonian

John läheb kala-le

John goes fish-ALL

John goes fishing

cs = John; cl = fishing

Again, (5) seems to work with (8): John is assigned a DG, which is assigned a surface region, which is assigned to fishing. Possibly, the surface region refers to surface water or the surface of water. Alternatively, it may be some kind of metaphor. It is easier to use a metaphor to widen the pragmatic scope of an existing case than to introduce a new one to the language. The third possibility (not mutually exclusive with the previous ones) is that the surface region of fishing is a historical coincidence (i.e. it could also have been construed with INE/in(side) etc.). We will analyze (8) in more detail in Section 3.

Observe the difference between the values of cs: in (7), cs is O but in (8) S. Thus, it is not the case that cs can be assigned a specific syntactic category (as we already mentioned in Section 2.1).

2.2.2. Adessive

After ALL/onto, ADE/on should be easy:

(9) RS(cs,cl) = csRScl

in (other) words, a cs is assigned a surface region, which is assigned a cl (see Appendix A). For example, a book is on the table or the stone was lying on the ground (cs = book, stone; cl = table, ground).

2.2.3. Inessive, lative, locative, at, near, far

A range of spatial cases (INE/in(side), ALL/onto, ADE/on, LAT/to, LOC, at, near, far) is defined in Appendix A. With the exception of ALL/onto, all these can be analyzed with the general formula P(cs,cl) = csPcl, substituting P with the defined predicate.

2.2.4. Accusative

Let's try with another, this time a grammatical case, ACC:

(10) A(cs,cl) = csAcl

where A is activity and the assignment hierarchy is as follows: a cs is assigned an activity which is assigned a cl. For example

(11) John broke the vase

(12) John loves Mary

cs = John; cl = vase, Mary

2.2.5. Nominative

As we do not have time to go through all the 15 cases in Estonian (much less all the more than fifty cases and probably more than a hundred different adpositions in the languages of the world), the final test will be with the unmarked case, NOM:

(13) E(cs) = csE

where E is exists. For example

(14) John loves Mary

(15) John laughs

cs = John

Arguably, the semantics of NOM in (13) is sufficiently general to be subsumed by the semantics of ACC, INE etc. This seems relatively unproblematic - as NOM is an unmarked case, its semantics is expected to be generic, and overridden by the semantics of more specific, marked cases (if any). NOM could be given a unique interpretation

(16) I(cs,cl) = csIcl

where I is identity (i.e. cs and cl are identical). However, (16) seems more like a trick for getting a unique interpretation than an informative interpretation of NOM.

2.3. Summary

All 15 Estonian cases were found to be analyzable with this method, but it is sometimes difficult to find the optimal (the most exact, informative and parsimonious) formula. A criterion for working formulas (both optimal and suboptimal) is a working interpretation (assignment hierarchy). Given the universality of the methods (semanalysis and mathematical logic), the relatively wide scope of the Estonian case system, the fact that case and adposition are semantically (almost) co-extensive, and the fact that the same formulas work cross-linguistically (and the more general universality of semantics; Haspelmath, 2007), it is reasonable to assume that all cases and adpositions in all languages can be given working formulas with this method. Ideally, all predicates in the formulas would be formally defined. For various reasons, I have not defined A and E in Appendix A. I am not sure whether it makes sense to define Activity and Existence: the notions are so general that the ontology might become prohibitively complex without adding any precision to what is obvious without defining. There is also the danger of distorting our natural intuitions with the added precision, in which case the definitions would not even apply to the objects they are supposed to define.

3. The method for analyzing case and adposition pragmatics

3.1. The method

The results presented in the previous chapter are universal but also very general and rigid, as each case and adposition has only one (a focal) interpretation. It is well known that a case (and probably also an adposition) can have different interpretations depending on the context. To describe the situation, more analysis is needed. First, we will adopt our own version of Mel'chuk's (1986) notion 'autonomous case', redefined as follows:

(17) Autonomous case := a case or an adposition that has a morphological marker or a word form associated with only one focal interpretation

The notion of ''focal interpretation'' is explained in the next paragraph. As one can see, (17) extends Mel'chuk's (1986) notion of 'autonomous case' to adpositions. However, this is not the only difference from Mel'chuk's definition. He states that ''a case... c is (morphologically) autonomous if it has at least one marker that does not coincide with a marker of any other case... which can appear on the same base (= stem) as c; otherwise, c is non-autonomous'' (Mel'chuk 1986, p. 66). Positing a non-autonomous case d (instead of a peripheral use of case c - a distinction overlooked by Mel'chuk) is justified only if otherwise the surface-syntactic rules which select cases would have to refer to the individual properties of the lexeme to be declined (Mel'chuk 1986, p. 67). However, peripheral uses of cases frequently depend on the individual properties of the lexeme (see e.g. (23) below). In fact, according to Mel'chuk, there would be at least 50 cases in Estonian, not more than 10 of which would be autonomous. While not falsifying Mel'chuk's definition, this would be either undesirable or absurd. However, the definition would be falsified by a non-autonomous case that is necessary for the language whilst being independent from the stems' individual lexemic properties. The Estonian non-autonomous ACC is such a case. Estonian has clearly a working NOM/ACC case system, and there is no way to get it working without an ACC. However, Estonian has no autonomous ACC, the function is performed by NOM, PAR and GEN, and the choice between these cases depends on the verb and mood, not on the individual lexemic properties of the declined stem.

In short, Mel'chuk's definition is falsified, but a notion of non-autonomous case is important and has to be retained, which is the reason for positing (17). By ''focal interpretation'' I mean an interpretation to which an existing (in any language) case label centrally (commonly, traditionally) corresponds. In the case of adpositions, ''focal interpretation'' is an interpretation to which an existing (in any language) case label or adposition centrally (commonly, traditionally) corresponds. Obviously, there will be many non-focal interpretations to which no case label or adposition in any language focally corresponds. The only (?) downside of (17) is that the notion of the focal interpretation (or meaning) of a case/adposition is somewhat vague. First, we lack an overview of case labels, adpositions and their focal meanings in all languages. Second, the same label may be associated with different focal meanings in different languages. In the latter situation one should proceed as follows. Assuming that the case label ascribed for the language is not patently wrong, one should use its focal meaning in this language for describing the language. Otherwise, one should choose the most common focal meaning associated with the (semantically) most appropriate case label cross-linguistically. If such a label is unavailable, one is dealing with a non-focal meaning (i.e. a peripheral use).

By (17), a case marker or an adposition can have many focal meanings associated with it. For example, the Estonian ALL marker encodes DAT; and NOM, PAR and GEN markers encode ACC (Estonian lacks both autonomous DAT and ACC). To address such complexities (along with many subtler pragmatic distinctions), I devised the following method for analyzing case and adposition pragmatics:

(18) {base case(s)} → marker case → ((functional case)/[(metaphor|metonymy): x = y])

In (18), {base}, marker and (functional) case, as well as [metaphor|metonymy] are derived from one another and identified by bracket style. Importantly, 'case' receives a broader interpretation in (18) than elsewhere in the paper: an element of {CAS, ADP}. Thus, like our analysis of case semantics, (18) applies to cases and adpositions alike. Base case(s) are cases from which the marker case in the language historically (directly) derives. Depending on the grammaticalization, there may also be a base ADV, N or V etc. instead of a base case (Heine and Kuteva, 2002a). Marker case is the common semantics (if not patently wrong) of the case label (if not patently falsely) associated with the case marker in the language. (With adpositions, marker case is the generic or common semantics of the adposition.) In the case of patently wrong labels/semantics, suitable corrections should be made. Functional case can be either autonomous (one per marker) or non-autonomous (many per marker). A marker case can have a finite number (constrained by (17)) of functional cases. Metaphor is defined as the production of concept z by attributing properties of concept y to sign (form-concept pair) x, where the concepts x≠y≠z (the 'concept' is 'meaning'; ≠ denotes nonequivalence). For example, the metaphor the Moon like sixpence can be analyzed as [metaphor: the moon = sixpence].4 Metonymy is defined as the relation by which sign x (form-concept pair) represents concept y, where the concepts x and y are in a natural relation (e.g. cause-effect, whole-part, spatial or temporal proximity) but x≠y. For example, the metonym the tall denoting a tall person can be analyzed as [metonymy: the tall = tall person].

Observe that the marker case can also be the functional case and the base case, adposition or concept can be unknown or void. The pragmatic formula could then be e.g.

(19) ACC

(19) makes use of the following convention: if the marker case is also the functional case, the functional case can be omitted. Thus, ACC → (ACC) and (19) are equivalent.

With (18) we will describe case and adposition pragmatics. The core idea is that a marker case is, more often than not, only one stage in the chain of functional derivations (→), some of which are synchronic, others diachronic. Parts of these chains can be described as grammaticalization, others are of a more pragmatic nature. The method is not only compatible with the method of analyzing CAS and ADP semantics but complementary to it, as all the three (or more) cases in (18) can be given semantic formulas analogous to (3), (5), (9) and (10). The functions in (18) work as follows: the first one maps from base case(s) to marker case, the second from marker to functional case or (metaphor|metonymy), the third from functional case or (metaphor|metonymy) to (metaphor|metonymy) or functional case, respectively. Only the second mapping is necessary, the rest are optional and, as shown above, the second can also be the identity function (the marker case can also be the functional one). Method (18) describes a situation where a series of uses are diachronically and/or synchronically derived from one another. In this sense, its application to cases and adpositions is only one possible application.

3.2. Examples

Let's have a look at the following example:

(20) Estonian

J armastab putr.u

J loves porridge.PAR

J loves porridge

Partitive is a case stipulating that the clause predicate applies only to a part of one of its arguments. Clearly, the functional case in (20) is not partitive - in fact, nothing in or about (20) except the partitive suggests that a part of porridge(s) is loved. The PAR encodes ACC here (remember that there is no autonomous ACC in Estonian). Thus, we can describe (20) in notation (18) as

(21) PAR → (ACC)

with PAR as the marker and ACC as the functional case. However, it seems that the chain in (21) can be extended to the left: according to Rätsep (1977, 1979), the foundational cases5 of modern Estonian PAR are Proto-Baltic-Finnic SEP and ACC:

(22) {SEP/ACC} → PAR → (ACC)

Let's now have another look at (8) in Section 2.2.1. The particular use of ALL is unproductive and seemingly confined to the following words:

(23) Estonian6

seeni-le marju-le korje-le

mushrooms-ALL berries-ALL gathering-ALL

kala-le jahi-le heina-le

fish-ALL hunt-ALL hay-ALL

The pattern can be captured with the following formula:

4 In the notations of metaphor and metonymy, = denotes valuation.

5 The foundational or the base cases (both morphologically and functionally foundational).

6 In Estonian, the majority of case markers attach to GEN rather than NOM stems. As the feature is irrelevant for the present analysis, I have not glossed the GEN (e.g. hunt.GEN-ALL) in e.g. (23), (26) and (27).

(24) {LAT} → ALL → (LAT) → [metaphor: procurement = space]

First, the base case of Estonian ALL is LAT (Rätsep, 1979). Second, as it is not clear why RS should be invoked here, the functional case could be LAT (csDGcl) as well. On the other hand, the RS may have something to do with all the activities subsumed by (23) taking place on the surface of the land (rather than e.g. in the Earth's crust). Fortunately, there is a way to test whether the functional case in (23) is ALL. As Estonian has a functional ALL adposition (the semantic and pragmatic equivalent of onto), we can test whether substituting the allatives with this adposition works for (23). The test

(25) Estonian

(kala|jahi|heina|...) peale

(fish|hunt|hay|...) onto

onto (fish|hunt|hay|...)

shows that it does not: (25) literally means onto fish, hunt, hay etc., i.e. something completely different from (23). As an adposition test with an equivalent of into would fail as well, (23) must be functional LAT (the English to).

Still, all this gets us nowhere in explaining why use (23) is confined to these particular stems. I suggest that the pattern of stem and case choice is consistent with [metaphor: procurement = space], i.e. with procuring being attributed a spatial scope. [metaphor: procurement = space] is a subcase of [metaphor: operation = space], as in e.g. think on this, react on this etc. Another subcase of [metaphor: operation = space] is [metaphor: investment = space], as in invest in this, put money on this, stake on this etc.
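The pragmatic formulas (19), (22) and (24) can be assembled mechanically from their stages, writing the derivation arrow as →. The Python sketch below is an illustration only: the class names `Functional`, `Trope` and `Chain` are hypothetical, and the treatment of the omission convention of (19) (dropping a functional case that is identical to the marker case when it is the only further stage) is one possible reading of that convention.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Functional:
    case: str

    def render(self) -> str:
        return f"({self.case})"


@dataclass(frozen=True)
class Trope:
    kind: str  # "metaphor" or "metonymy"
    x: str
    y: str

    def render(self) -> str:
        return f"[{self.kind}: {self.x} = {self.y}]"


@dataclass(frozen=True)
class Chain:
    marker: str
    base: Tuple[str, ...] = ()  # mutually nonexclusive base cases, joined by "/"
    steps: Tuple[Union[Functional, Trope], ...] = ()

    def render(self) -> str:
        parts = []
        if self.base:
            parts.append("{" + "/".join(self.base) + "}")
        parts.append(self.marker)
        steps = self.steps
        # Assumed reading of the convention in (19): a functional case equal
        # to the marker case may be omitted when it is the only further stage.
        if steps == (Functional(self.marker),):
            steps = ()
        parts.extend(step.render() for step in steps)
        return " → ".join(parts)


acc = Chain("ACC", steps=(Functional("ACC"),))
par = Chain("PAR", base=("SEP", "ACC"), steps=(Functional("ACC"),))
all_lat = Chain(
    "ALL",
    base=("LAT",),
    steps=(Functional("LAT"), Trope("metaphor", "procurement", "space")),
)

for chain in (acc, par, all_lat):
    print(chain.render())
```

Because the stages after the marker case are kept as an ordered sequence, a chain in which a metaphor precedes the functional case can be represented in the same way.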

For the final example we will select something different again. Estonian has TRL, as in

(26) Estonian

ta osutus mehe-ks

(s)he turned out to be man-TRL

(s)he turned out to be a man

The main function of Estonian (and Finnish) TRL is to express a future or a transient/accidental present state (cf. Erelt et al., 2000). Besides this, TRL can also encode PUR,7 as in

(27) Estonian

selle-ks on liiga hilja

this-TRL is too late

it is too late for this

One can distinguish between these two functions by the questions they answer. TRL, as in (26), answers to ''to what/whom?'', while PUR answers to ''for what?'' (but not ''for whom?''). For (26), the formula is TRL. For (27), the formula is either

(28) TRL → (PUR)

or

(29) TRL → [metaphor: purpose = state] → (PUR)

with purpose metaphorically construed as the state, which licenses the use of TRL as PUR. As I cannot think of a test to validate the intermediate stage [metaphor: purpose = state], we will adopt the simpler formula which describes both (28) and (29): as → [metaphor: purpose = state] → is a derivation, it can be expressed with the generic → for derivation.

4. Background

The commonest method for analyzing case and adposition meaning, the one used in almost all grammars of natural languages, is verbal description (e.g. Erelt et al., 1995). The merits of the present methods over verbal description are many; they are listed in Section 5. Since the methods I am proposing are formal, it makes sense to compare them with similar earlier methods. For this reason (and also for space considerations), the section will be confined to more formal accounts of case and adposition meaning.

Nearly all approaches agree that CAS and ADP meanings should be analyzed as some kinds of relations, although the nature, arity and valence of these relations, as well as the ways to analyze them, vary considerably. Jackendoff (1983) distinguishes between two main senses of spatial prepositions, PLACE and PATH, analyzing them with rewrite rules and an informal function notation, e.g. [Place x] → [Place PLACE-FUNCTION ([Thing y])], where subscript denotes type, function and argument constants are in uppercase and variables in lowercase. For example, the prepositional phrase in the mouse is under the table is analyzed as [Place UNDER ([Thing TABLE])]. There are many differences between Jackendoff's (1983) and my analysis. First, Jackendoff (1983) considers only spatial ADPs, whereas the present methods apply to spatial and nonspatial cases and adpositions alike. Moreover, Jackendoff analyzes spatial prepositional phrases rather than adpositions. Correspondingly, in his analysis there is no place for elements that are syntactically extrinsic to prepositional phrases (like cs, e.g. the mouse). A major difference is that Jackendoff does not use semanalysis. His types (Place, Thing, Path, etc.) are more akin to semantic roles than to semes. For example, Place could be analyzed further into semes {region, bounded} etc. Another major difference is the lack of formal definitions in Jackendoff (1983, 1990). Although he tacitly relies on higher-order logic in analyzing PATHS that may take PLACES as arguments, this is his only (and an informal) use of it. Furthermore, according to Jackendoff (1983, pp. 167-168; and contrary to Kracht, 2002) there is no logical (e.g. higher- vs. lower-order) basis for distinguishing between PATHS and PLACES, only a denotational one. In short, there are some similarities but both the objects and methods of analysis are different.

7 Once again, there is no autonomous PUR in Estonian.

Zwarts and Winter (2000) propose a model-theoretic semantics for spatial prepositions that is based on a vector space ontology. Thus, our objects of analysis differ. However, differently from prepositional phrases, spatial prepositions are at least a proper subset of the set of cases and adpositions. Zwarts and Winter (2000) analyze the semantics of onto as follows: dir1(λA.λv.ext(v,A) ∧ |v|<r), where dir1 is DG, A is a region (a set of points), v is a boundary vector externally closest to A, and r ≈ 0. They define boundary vector as a vector that starts from a region's boundary; ''externally closest to A'' means that the vector stretches outwards from the boundary of A at even angles. When we compare this definition with DG(cs,RS(cl)), three components (dir1 ~ DG, A ~ cl and v ~ RS) are somewhat similar. The similarity pertains to semantics only. Formally, the entities and their definitions are completely different. Obviously, they use λ-calculus and functions instead of higher-order logic and predicates (although for the purposes of their analysis the formalisms may be equivalent). While the analysis is precise and logically sound, I do not find it particularly suitable for ADP and CAS semantics. First, it cannot be extended to nonspatial CAS and ADP. Second, the particular analysis is wrong, as it does not discriminate between e.g. putting something onto, against the side, bottom of or next to the table, whereas onto applies to the former only. This is a result of using the same formula λA.λv.ext(v,A) ∧ |v|<r for on and at, which are semantically clearly distinct (see Appendix A).

Zwarts (2005) analyzes directional prepositions (across, along, down etc.) as paths defined through locative prepositions (above, at, below etc.). For example, over x is analyzed as {p: there is an interval... for which p(i) is on/above x}, where p: means ''all paths p such that'' and i ∈ [0,1]. Defining ADPs through other ADPs is not nearly as detailed or desirable as defining them through lower-order elements (e.g. semes). It is even dubious whether the former qualifies as truly compositional unless the other ADPs are defined in turn (e.g. as vector space relations; Zwarts and Winter, 2000).

Kracht (2002, 2003) analyzes the syntax of locatives with categorial grammar and semantics with λ-calculus. He defines case/adposition alternately as sequences of morphemes (i.e. signs) and exponents (i.e. pure forms). What starts as an innocent simplification, meant to allow for string (i.e. form) substitutions on sequences of morphemes, leads to a puzzling theory stipulating semantically vacuous cases. For such cases a syntax-semantics tradeoff is assumed, with all functions being relegated to syntax. The theory (both the string substitutions and syntax-semantics tradeoff) is motivated by the claim, or rather an assumption, based mostly on certain types of German and English prepositional phrases and on Jackendoff (1983 and/or 1990; Kracht is not precise about this), that locatives have the structure [M [L DP]], where M and L are modalizer and localizer, respectively. However, according to Jackendoff (1983, pp. 167-168), the structure is not universal even in English. To defend the idea of semantically vacuous cases, Kracht postulates the Emptiness Principle, stating that there can be markers that function purely syntactically in some contexts. While this is certainly possible (think of it in it is raining), the principle is very weak, and nothing follows from it whatever (except perhaps for the fact that syntax exists). Additionally, none of the examples of purely syntactic function that Kracht brings in defining the Emptiness Principle (selection, agreement, sandhi) qualifies as such. For example, argument selection correlates with semantic roles, a unit marked by agreement is usually a semantic as well as syntactic one (e.g. a DP), and sandhi is not even a syntactic phenomenon (it is a phonological one). Furthermore, if Kracht were correct, one would expect to find a language with an a priori semantically vacuous case or adposition. However, no one has ever heard of anything like this. For example, English of, about, in front of, from etc. have meaningful as well as seemingly purely syntactic uses (see Kracht, 2002, for the latter). Given this, it is perhaps not too bold to conjecture that the former may interfere with or contribute to the latter. In sum, Kracht invokes the Emptiness Principle, as his analysis cannot be extended to case pragmatics and his semantic analysis does not cover nonspatial cases. Kracht (2002) analyzes the semantics of in as follows: λx.λt.{r: r ⊆ i(loc(x)(t)), r a region}, where i(loc(x)(t)) is the convex hull of the location of object x at time point t (loc(x) is the location of object x). The analysis seems to be correct, and if we compare it with (34), there are some similarities. An obvious advantage of (Lo ⊂ Lp) over λx.λt.{r: r ⊆ i(loc(x)(t)), r a region} is that the interpretation of the former is transparent and automatic, whereas the interpretation of the latter is opaque. Also, the definition of (Lo ⊂ Lp) is not as complex as that of λx.λt.{r: r ⊆ i(loc(x)(t)), r a region} (cf. Appendix A and Kracht, 2002). Another advantage of the present methods/analysis is that Kracht does not analyze surface cases (and it is not clear how to use his methods for this purpose).

The present paper is, at least nominally, also related to Potts' (1978) paper on case roles and componential analysis. However, Potts (1978) analyzed only case (i.e. semantic) roles, not the meanings of cases. The closest he got to analyzing a case was the analysis of Location: a in b. The analysis is essentially correct but there is a long way from it to (34) or (37).

5. Discussion

I have described two formal methods for analyzing case and adposition meaning. The method for case and adposition semantics is based on an analysis of semes and seme complexes as arguments and predicates in a higher-order logic (and, more fundamentally, on mathematics). The method for case and adposition pragmatics is based on an analysis of case and adposition functions as a series of functional derivations. The latter method can be extended to all kinds of pragmatic phenomena, if restated as "the method for analyzing the pragmatics of x that is based on an analysis of functions of x as a series of functional derivations", where x is a linguistic (or even a cognitive or a logical?) category. This method also subsumes the method for analyzing metaphor and metonymy.

The method for analyzing pragmatics is a macro-level notation for modeling all the relevant functional derivations, while the method for analyzing CAS/ADP semantics is designed for an in-depth analysis of particular functions. Thus, the methods are complementary. As a specific predicate-argument system can be used to analyze the semantics and/or syntax of words, sentences and phrases (Luuk, 2009), the semantic notation could also complement the program of deriving the meanings of these larger units from the meanings of their constituents. Furthermore, given the generality of component analysis and mathematical logic, it is likely that the semantic analysis could be extended to other grammatical categories (e.g. tense-aspect-mood) as well. As mentioned above, the macro-level notation is in principle general already.

The main merit of these methods is that, being formal yet modeled on (and thus not detached from) linguistic facts, they balance exactness and parsimony, which is uncommon in analyzing the meanings of lower-order linguistic categories such as case, adposition, tense-aspect-mood etc. The usual method for analyzing the meanings of such categories is verbal description, which (as compared to the present methods) lacks either exactness or parsimony or both. Symptomatically, verbal descriptions of a meaning of a lower-order linguistic category are readily comparable neither with each other nor with verbal descriptions of other meanings of the same category, leading to confusions that the present methods are designed to avoid. Another advantage of the notations over verbal description is brevity. Besides being prone to vagueness, a verbal rendering of the information captured in e.g. (5) or (24) would simply be too long and cumbersome.

As generic descriptions of the function of a semantic category, all the semantic formulas we established are cross-linguistically universal in the languages that have the category. Interestingly, some pragmatic formulas seem to be near-universal as well (cf. Heine and Kuteva, 2002b). It's a long shot but, besides their theoretical import, the methods could also hold promise for more effective natural language parsing, comprehension and learning algorithms.


Acknowledgements

I thank Sander Lestrade, Kees Hengeveld, Tania Kuteva, Robert Van Valin and Haldur Õim for their helpful comments. The work was supported by the Alexander von Humboldt Foundation, the target-financed theme No. 0180078s08, the National Programme for Estonian Language Technology project "Semantic analysis of simple sentences 2", and the European Regional Development Fund through the Estonian Center of Excellence in Computer Science, EXCS.

Appendix A

The appendix contains the formal definitions and technical explanations along with a few examples. For an introduction to the concepts and/or notation see e.g. Hummel (2000) or any other introduction to mathematical logic and set theory.

There is a bijection between a set $S \subset N$ and the indicator function of $S$, $N \to \{0,1\}$, which returns 1 for each $s \in S$ and 0 otherwise. As predicates are functions of type $A \to \{0,1\}$, they are indicator functions (e.g. of their truth-conditions). This property allows us to define predicates alternately as sets, indicator functions or other predicates.
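The correspondence between sets and indicator functions can be illustrated with a minimal Python sketch. It assumes a finite universe so that the set can be recovered by enumeration; the names `indicator` and `extension` are hypothetical helpers introduced only for this illustration.

```python
def indicator(subset):
    """Return the indicator function of `subset`: 1 for members, 0 otherwise."""
    return lambda x: 1 if x in subset else 0


def extension(predicate, universe):
    """Recover the set defined by an indicator function over a finite universe."""
    return {x for x in universe if predicate(x) == 1}


# Assumed toy universe and subset (not from the paper).
universe = set(range(10))
evens = {0, 2, 4, 6, 8}

is_even = indicator(evens)           # the set viewed as a predicate
recovered = extension(is_even, universe)  # the predicate viewed as a set
```

Going from the set to the predicate and back yields the original set, which is the sense in which a predicate can be given alternately as a set or as an indicator function.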

A path is a continuous function $p: [0,1] \to \mathbb{R}^n$, where $\mathbb{R}^n$ is a Euclidean $n$-space and $p(0)$ is the starting and $p(1)$ the end point of the path. Thus, paths are subsets of Euclidean $n$-spaces, and the following restrictions hold: a path is restricted to the universe of discourse; a path is restricted to the $\mathbb{R}^n$ space for which it is defined. A set $X \subset \mathbb{R}^n$ is path-connected iff $(\forall x,y \in X)(\exists p)[p \text{ is path} \;\&\; p(0) = x \;\&\; p(1) = y]$. A set $X \subset \mathbb{R}^n$ is strongly path-connected iff $(\forall x,y \in X)(\exists p)[p \text{ is path} \;\&\; p(0) = x \;\&\; p(1) = y \;\&\; p \subset X]$.

Like Kracht (2002), we assume all objects to be in one piece, i.e. if one is in $n$ pieces we assume $n$ separate objects. Given a set of objects $O \subset N$ and the set of time points $\mathbb{R}$, there is a partial function $e: (O \times \mathbb{R}) \to \mathbb{R}^n$ that returns a strongly path-connected set $L \subset \mathbb{R}^n$ for an object $o \in O$ at time point $t \in \mathbb{R}$. Given this, $L$ is the location of $o$ at $t$. Suppose $o$ is cs and $p$ cl:

(30) $o := cs$

(31) $p := cl$

(32) $L_o := L(o, t_o)$

(33) $L_p := L(p, t_p)$

The semantics of INE/in(side) then becomes:

(34) $\mathrm{INE/in(side)}' := (L_o \subset L_p)$
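As a rough illustration of (34), the following Python sketch models locations as finite sets of grid cells, which is a simplifying assumption (the definitions above use strongly path-connected subsets of a Euclidean space). The helper names `box` and `ine` are hypothetical, and inclusion is read here as not necessarily proper.

```python
def box(x0, y0, x1, y1):
    """Location of a rectangular object as a frozenset of integer grid cells."""
    return frozenset((x, y) for x in range(x0, x1) for y in range(y0, y1))


def ine(loc_o, loc_p):
    """INE/in(side)': the location of o is included in the location of p."""
    return loc_o <= loc_p


# Assumed toy scene (not from the paper).
room = box(0, 0, 10, 10)
chair = box(2, 2, 4, 4)    # lies entirely within the room
tree = box(12, 0, 14, 2)   # lies entirely outside the room

chair_in_room = ine(chair, room)  # True
tree_in_room = ine(tree, room)    # False
```

The point of the sketch is only that the interpretation of (34) reduces to a single inclusion test once the two locations are given.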

A vector is a partial function $v: (\mathbb{R}^n \times \mathbb{R}^n) \to \mathbb{R}$ that returns the distance between an ordered pair of points in a Euclidean $n$-space: $v(a,b) = |v|$, where $a$ is the starting, $b$ the end point and $|v|$ the length of the vector. If $v$ is a vector, $v(c)$ is a point in $v$. Given locations $L_o$ and $L_p$ of two distinct objects $o$ and $p$, the goal-determined direction $D_G$ of $o$ is a vector $v \in \{v(a,b) : a \in L_o \;\&\; b \in L_p\}$. Notice that $D_G$ is any member of this set of vectors. Also, it does not have to be a movement vector; it may e.g. be determined by a $p$ that $o$ is looking or pointing at. $t_o$ may equal $t_p$, but we allow for the possibility that it does not if $p$ is moving. Now we can define the semantics of LAT/to:

(35) $\mathrm{LAT/to}' := D_G(L_o, L_p)$

Technically, $cs\,D_G\,cl$ is merely a useful shorthand for $D_G(L_o, L_p)$, useful because the assignment hierarchy provides a straightforward interpretation (see Section 2.1).
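The set of vectors from which $D_G$ is drawn can be sketched in Python as follows. The sketch assumes that locations are finite lists of points (a discretization not made in the definitions above) and represents a vector as a triple of starting point, end point and length; `vector` and `goal_directions` are hypothetical names.

```python
import math
from itertools import product


def vector(a, b):
    """An ordered pair of points (a, b) together with its length |v|."""
    return (a, b, math.dist(a, b))


def goal_directions(loc_o, loc_p):
    """All vectors v(a, b) with a in L_o and b in L_p; D_G is any one of them."""
    return [vector(a, b) for a, b in product(loc_o, loc_p)]


# Assumed toy locations (not from the paper).
loc_o = [(0.0, 0.0), (1.0, 0.0)]
loc_p = [(4.0, 0.0)]

candidates = goal_directions(loc_o, loc_p)
lengths = sorted(length for _, _, length in candidates)  # [3.0, 4.0]
```

Since $D_G$ is any member of the set, the sketch returns all candidates rather than selecting one.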

An adjacency of location $L$ is a strongly path-connected set $A_L \subset \mathbb{R}^n$ such that $(\forall a \in A_L)(\exists p)[p \text{ is path} \;\&\; p(x) = a \;\&\; p(x) \notin L \;\&\; p(y) \notin A_L \;\&\; p(y) \in L \;\&\; p(x) \approx p(y)]$. The adjacent region of $L$ is defined as follows ($\bigcup A_L$ := the generalized union of adjacencies of $L$):

(36) $R_A(L) := \bigcup A_L$

This gives us the semantics of LOC/at ($x_{(G|R)}'$ := the (generalized|restricted) semantics of $x$, where $x$ is an expression; $\bigcup S'$ := the generalized union of the semantics of similar applicable spatial cases):

(37) $\mathrm{LOC}_G' := (L_o \cap R_A(L_p)) \cup (L_o \cap L_p)$

(38) $\mathrm{LOC}_R/\mathrm{at}' := \mathrm{LOC}_G' - \bigcup S'$

For example, at the table $\to \mathrm{LOC}_G' - (\mathrm{on}' \cup \mathrm{in}' \cup \mathrm{near}')$ because in, on and near the table are possible, similar and have (obviously) different meanings, being thus excluded from the meaning of at for this expression. By contrast, at home $\to \mathrm{LOC}_G' - \mathrm{near}'$ because in and on home are unacceptable.

Assume that $L_o$ and $L_p$ do not intersect ($\neg\exists x(x \in L_o \;\&\; x \in L_p)$). Given their adjacent regions $R_A(L_o)$ and $R_A(L_p)$, and minimal $n$-balls $B_r(x)$ and $B_s(y)$ such that $L_o \subset B_r(x)$ and $L_p \subset B_s(y)$, where $(r,s)$ are the radii and $(x,y)$ the centerpoints of the balls, we say that $o$ is near $p$ and $p$ near $o$ iff $(\exists v(a,b))[v \text{ is vector} \;\&\; a \in R_A(L_o) \;\&\; b \in R_A(L_p) \;\&\; (\neg\exists v(c))[v(c) \in (L_o \cup L_p)] \;\&\; z < |v| < 3(r+s+z) \;\&\; z \text{ is small positive number}]$. Here, $a$ and $b$ are the vector's starting and end points, $(\neg\exists v(c))[v(c) \in (L_o \cup L_p)]$ locates the vector outside $L_o$ and $L_p$, and $z < |v| < 3(r+s+z)$ specifies its length ($z$ is required for distinguishing near' from $R_A$). Analogously, $o$ is far from $p$ and $p$ far from $o$ iff $(\exists v(a,b))[v \text{ is vector} \;\&\; a \in R_A(L_o) \;\&\; b \in R_A(L_p) \;\&\; (\neg\exists v(c))[v(c) \in (L_o \cup L_p)] \;\&\; |v| > 9(r+s)]$. The numbers may be somewhat haphazardly chosen, but the idea is that there is a transition zone between near' and far' which is neither or both and for which neutral expressions are preferred. Using radii instead of a fixed metric captures the relativity of the notions: the difference between near' in "the town is near the city" and "the ball is near the table" shows that near' depends on the sizes of cs and cl.
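The role of the radii can be made concrete with a small Python sketch. It rests on strong simplifying assumptions that are not part of the definitions above: each object is taken to coincide with its minimal ball, and only the shortest vector between the two surfaces (the gap) is tested, so the transition zone comes out as "neither" rather than "neither or both". The function names and the default value of `z` are hypothetical.

```python
import math


def gap(center_o, r, center_p, s):
    """Length |v| of the shortest vector between the surfaces of two balls."""
    return math.dist(center_o, center_p) - (r + s)


def proximity(center_o, r, center_p, s, z=1e-6):
    """Classify o relative to p as 'adjacent', 'near', 'transition' or 'far'."""
    d = gap(center_o, r, center_p, s)
    if d < 0:
        raise ValueError("locations intersect; near'/far' are not defined here")
    if d <= z:
        return "adjacent"      # within the adjacent region rather than near'
    if d < 3 * (r + s + z):
        return "near"          # z < |v| < 3(r + s + z)
    if d > 9 * (r + s):
        return "far"           # |v| > 9(r + s)
    return "transition"        # between the near' and far' thresholds


# The same gap of 3 units with different object sizes.
large_objects = proximity((0.0, 0.0), 1.0, (5.0, 0.0), 1.0)   # 'near'
small_objects = proximity((0.0, 0.0), 0.1, (3.2, 0.0), 0.1)   # 'far'
in_between = proximity((0.0, 0.0), 1.0, (14.0, 0.0), 1.0)     # 'transition'
```

The first two calls use the same gap but different radii and receive different labels, which is the relativity that a fixed metric would miss.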

An object's primary gravity vector is the shortest gravity vector that determines its location (e.g. my primary gravity vector points to the center of the Earth, the secondary to the center of the Sun, the tertiary probably to the center of the Milky Way etc.). We say that an adjacency of $L$ is horizontal iff its mean gradient is 90 ± 4° with respect to the object's primary gravity vector.

The surface region $R_S \subseteq R_A$ of $L$ is defined as follows ($M_L$ := $L$'s maximum adjacency; $H(M_L)$ := $M_L$ is horizontal; $X(M_L)$ := $M_L$ is (accessible/visible) as required by the context):

(39) $R_S(L) := M_L$ such that $[H(M_L) \;\&\; X(M_L)] \vee [X(M_L) \;\&\; \neg\exists M_L(H(M_L) \;\&\; X(M_L))]$

For example, putting p onto o requires o to be accessible, whereas pointing onto o requires o to be visible. The general idea is as follows: if there is a choice between different adjacencies of an object, there is a selection hierarchy for $R_S(L)$: $[H(M_L) \;\&\; X(M_L)] > [\neg H(M_L) \;\&\; X(M_L)]$. Thus, when something is on or put onto a table, the table's maximum horizontal and visible surface (usually the tabletop) is preferred to its legs and the bottom (assuming that the table is not upside down or on its side). Since a wall has no horizontal surface, on and onto a wall select $[\neg H(M_L) \;\&\; X(M_L)]$. In the case of $\neg X(M_L)$, ALL/onto, ADE/on and DEL/"from surface" do not apply. All this is stipulated by the definition of $R_S$, a major advantage over the definition in Zwarts and Winter (2000), which did not distinguish on from at. $R_S$ gives us the semantics of ADE/on, and thus partly also of ALL/onto and DEL/"from surface"⁸:

(40) $\mathrm{ADE/on}' := L_o \cap R_S(L_p)$

(41) $\mathrm{ALL/onto}' := D_G(L_o, R_S(L_p))$
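The selection hierarchy behind $R_S$ can be sketched in Python. The sketch makes several assumptions not found in the definitions above: adjacencies are given as a finite list of records, "maximum" is read as largest by area, and accessibility or visibility is a precomputed flag. The class `Adjacency` and the functions `is_horizontal` and `surface_region` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjacency:
    name: str
    area: float              # assumed measure for choosing the maximum adjacency
    angle_to_gravity: float  # mean gradient in degrees w.r.t. the gravity vector
    accessible: bool         # X(M_L): accessible/visible as the context requires


def is_horizontal(adj, tolerance=4.0):
    """H(M_L): the mean gradient is 90 +/- 4 degrees to the gravity vector."""
    return abs(adj.angle_to_gravity - 90.0) <= tolerance


def surface_region(adjacencies):
    """Pick R_S: prefer [H & X]; fall back to [not H & X]; else None."""
    usable = [a for a in adjacencies if a.accessible]
    horizontal = [a for a in usable if is_horizontal(a)]
    candidates = horizontal or usable
    return max(candidates, key=lambda a: a.area) if candidates else None


# Assumed toy objects (not from the paper).
table = [
    Adjacency("tabletop", 2.0, 90.0, True),
    Adjacency("underside", 2.0, 90.0, False),
    Adjacency("leg", 0.3, 0.0, True),
]
wall = [Adjacency("face", 12.0, 0.0, True)]

table_surface = surface_region(table)  # the tabletop
wall_surface = surface_region(wall)    # the face, as no horizontal option exists
```

When no adjacency is accessible or visible, the function returns `None`, mirroring the case in which ALL/onto, ADE/on and DEL do not apply.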

The analyses of the semantics of ILL/into, ABL/SEP/from, ELA/from inside and DEL/"from surface" are omitted here for brevity but can be deduced from those of ADE/on, ALL/onto, LOC/at, LAT/to and INE/in(side).

8 The terminology is confusing here, as the Finnish (and especially Estonian) ABL have roughly the same meaning as the Hungarian DEL ("from surface"). It seems more appropriate to use DEL instead of ABL for Estonian and Finnish, as ABL/SEP is commonly reserved for the generic from. Admittedly, the picture is complicated by the fact that Finno-Ugric local cases derive from a generic {LAT, LOC, (ABL|SEP)} system, the influence of which is still present in some expressions.


References

Blake, B.J., 2001. Case: Cambridge Textbooks in Linguistics. Cambridge University Press, Cambridge [England]; New York, NY, USA.

Butt, M., 2006. Theories of Case: Cambridge Textbooks in Linguistics. Cambridge University Press, Cambridge.

Erelt, M., Kasik, R., Metslang, H., Rajandi, H., Ross, K., Saari, H., Tael, K., Vare, S., 1995. Eesti Keele Grammatika: Morfoloogia; Sõnamoodustus, vol. I. Eesti Teaduste Akadeemia Eesti Keele Instituut, Tallinn.

Erelt, M., Erelt, T., Ross, K., 2000. Eesti Keele Käsiraamat. Eesti Keele Sihtasutus, Tallinn.

Frawley, W., 1992. Linguistic Semantics. Lawrence Erlbaum Associates, Hillsdale, New Jersey etc.

Gärdenfors, P., 1998. Some tenets of cognitive semantics. In: Allwood, J., Gärdenfors, P. (Eds.), Cognitive Semantics: Meaning and Cognition. John Benjamins, Amsterdam, Philadelphia, pp. 19-36.

Haspelmath, M., 2007. Pre-established categories don't exist: consequences for language description and typology. Linguistic Typology 11, 119-132.

Heine, B., Kuteva, T., 2002a. On the evolution of grammatical forms. In: Wray, A. (Ed.), The Transition to Language. Oxford University Press, Oxford, pp. 376-397.

Heine, B., Kuteva, T., 2002b. World Lexicon of Grammaticalization. Cambridge University Press, Cambridge.

Hummel, K.E., 2000. Introductory Concepts for Abstract Mathematics. Chapman & Hall/CRC, Boca Raton, FL.

Jackendoff, R., 1983. Semantics and Cognition. MIT Press, Cambridge, MA.

Jackendoff, R., 1990. Semantic Structures. MIT Press, Cambridge, MA.

Kracht, M., 2002. On the semantics of locatives. Linguistics and Philosophy 25 (2), 175-232.

Kracht, M., 2003. Against the feature bundle theory of case. In: Brandner, E., Zinsmeister, H. (Eds.), New Perspectives on Case Theory. CSLI, Stanford.

Langacker, R.W., 1987. Foundations of Cognitive Grammar. Theoretical Prerequisites, vol. I. Stanford University Press, Stanford.

Lestrade, S., 2010. The Space of Case. PhD Thesis, Radboud Universiteit Nijmegen.

Luuk, E., 2008. Semantilised tasandid ja semantilised primitiivid. Keel ja Kirjandus 12, 949-967.

Luuk, E., 2009. The noun/verb and predicate/argument structures. Lingua 119 (11), 1707-1727.

Luuk, E., 2010. Nouns, verbs and flexibles: implications for typologies of word classes. Language Sciences 32 (3), 349-365.

Mel'chuk, I., 1986. Toward a definition of case. In: Brecht, R.D., Levine, J.S. (Eds.), Case in Slavic. Slavica, Columbus, OH, pp. 35-85.

Potts, T.C., 1978. Case grammar as componential analysis. In: Abraham, W. (Ed.), Valence, Semantic Case, and Grammatical Relations. Benjamins, Amsterdam, pp. 399-457.

Rätsep, H., 1977. Eesti keele ajalooline morfoloogia I. Tartu Riiklik Ülikool, Tartu.

Rätsep, H., 1979. Eesti keele ajalooline morfoloogia II. Tartu Riiklik Ülikool, Tartu.

Taylor, J.R., 1999. Cognitive semantics and structural semantics. In: Blank, A., Koch, P. (Eds.), Historical Semantics and Cognition. Mouton de Gruyter, Berlin, New York, pp. 17-48.

Wierzbicka, A., 1996. Semantics: Primes and Universals. Oxford University Press, Oxford.

Wierzbicka, A., 2000. Lexical prototypes as a universal basis. In: Vogel, P.M., Comrie, B. (Eds.), Approaches to the Typology of Word Classes. Mouton de Gruyter, Berlin, New York, pp. 285-317.

Zwarts, J., 2005. Prepositional aspect and the algebra of paths. Linguistics and Philosophy 28, 739-779.

Zwarts, J., Winter, Y., 2000. Vector space semantics: a model-theoretic analysis of locative prepositions. Journal of Logic, Language and Information 9, 169-211.