
PBML

The Prague Bulletin of Mathematical Linguistics NUMBER 105 APRIL 2016 143-193

Universal Annotation of Slavic Verb Forms

Daniel Zeman

Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics

Abstract

This article proposes application of a subset of the Universal Dependencies (UD) standard to the group of Slavic languages. The subset in question comprises morphosyntactic features of various verb forms. We systematically document the inventory of features observable with Slavic verbs, giving numerous examples from 10 languages. We demonstrate that terminology in literature may differ, yet the substance remains the same. Our goal is practical. We definitely do not intend to overturn the many decades of research in Slavic comparative linguistics. Instead, we want to put the properties of Slavic verbs in the context of UD, and to propose a unified (Slavic-wide) application of UD features and values to them. We believe that our proposal is a compromise that could be accepted by corpus linguists working on all Slavic languages.

1. Introduction and related work

Universal Dependencies (Nivre et al., 2016)1 is a project that seeks to design cross-linguistically consistent treebank annotation for as many languages as possible. Besides dependency relations, UD also defines universally applicable tags for parts of speech (universal POS tags) and common morphosyntactic features (universal features). The features are taken from a previous project called Interset (Zeman, 2008).

Being suitable for a variety of unrelated languages means that the core concepts of UD must be sufficiently general; at the same time, their definitions must be descriptive enough to signal that two phenomena in two different languages are (or are not) the same thing, despite conflicts in traditional terminologies.

There is always the danger that researchers working on different languages will apply the UD concepts differently. As UD gains in popularity and new datasets are converted to its annotation scheme, enforcing consistency is an increasingly important issue. It seems natural to start by looking at closely related languages and first make sure that they annotate the same things the same way, then to widen the view to larger language groups, and so on.

1 http://universaldependencies.org/

© 2016 PBML. Distributed under CC BY-NC-ND. Corresponding author: zeman@ufal.mff.cuni.cz

Cite as: Daniel Zeman. Universal Annotation of Slavic Verb Forms. The Prague Bulletin of Mathematical Linguistics No. 105, 2016, pp. 143-193. doi: 10.1515/pralin-2016-0007.

The first work on Slavic-specific issues in UD was Zeman (2015). The present article focuses on part-of-speech tags and features of individual words, not on interword dependency relations. Some verb forms are analytical (periphrastic), made of two or more individual words. We occasionally use the periphrastic constructions for illustrative purposes but bear in mind that tags and features must be assigned to individual words only. Also note that UD postulates the concept of syntactic word, something that is not necessarily identical to the space-delimited orthographic word. An orthographic word may be understood as a fusion of two or more syntactically autonomous units; the annotation treats each of them separately.

Some work has been published that pre-dates UD and is related to our current effort. Besides Interset (Zeman, 2008), the outcomes of the MULTEXT-East project are highly relevant (Erjavec, 2012). Quite a few Slavic languages have morphosyntactic tagsets stemming from MULTEXT-East. These tagsets are similar to each other and they were indeed intended to encode the same phenomena identically across languages. Unfortunately, they have not always reached this goal. Traditional views and legacy resources sometimes outweighed the desire for uniformity. UD faces the same danger and we should strive hard to avoid it.

In the following sections we discuss UD tags and features applicable to Slavic verbs (as well as some words on the border between verbs and other parts of speech). We give numerous examples and inflection tables together with the proposed annotation.2 We list the native names of the verb forms in the beginning of each section.

We use ISO 639 language codes when referring to individual languages: [be] Belarusian, [bg] Bulgarian, [cs] Czech, [cu] Old Church Slavonic, [dsb] Lower Sorbian, [hr] Croatian, [hsb] Upper Sorbian, [mk] Macedonian, [pl] Polish, [ru] Russian, [sk] Slovak, [sl] Slovenian, [sr] Serbian, [uk] Ukrainian.

Six Slavic languages ([bg], [cs], [cu], [hr], [pl] and [sl]) already have datasets in the current release of UD (1.2) and other languages are expected to get covered in the near future. We briefly summarize the approaches taken in the current data in Section 18.

2. Universal Features

The following universal features are discussed in the article. See the on-line documentation of UD (http://universaldependencies.org/) for their detailed description with examples. Here we provide just a list for quick reference:

2 The tables were compiled using on-line resources such as Wiktionary, verb conjugators and language courses, as well as printed grammars and dictionaries. We do not cite these sources individually due to space considerations.

• Aspect: Imp (imperfective), Perf (perfective)

• VerbForm: Fin (finite verb), Inf (infinitive), Sup (supine), Part (participle), Trans (transgressive)

• Mood: Ind (indicative), Imp (imperative), Cnd (conditional)

• Tense: Past (past), Imp (imperfect), Pres (present), Fut (future)

• Voice: Act (active), Pass (passive)

• Number: Sing (singular), Dual (dual), Plur (plural)

• Person: 1, 2, 3

• Gender: Masc (masculine), Fem (feminine), Neut (neuter)

• Animacy: Anim (animate/human), Nhum (animate nonhuman), Inan (inanimate)

• Case: Nom (nominative), Gen (genitive), Dat (dative), Acc (accusative), Voc (vocative), Loc (locative), Ins (instrumental)

• Definite: Ind (indefinite), Def (definite)

• Negative: Pos (affirmative), Neg (negative)
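For quick reference in annotation tools, the inventory above can be written down as a lookup table. The following is a minimal Python sketch and not part of the UD specification: the names UNIVERSAL_FEATURES, parse_feats and invalid_pairs are our own, and we assume that Name=Value pairs are joined by a vertical bar, as in the table captions of this article. Features with multiple values are not handled.

```python
# Feature inventory copied from the list above.
# The dictionary and helper names are illustrative assumptions.
UNIVERSAL_FEATURES = {
    "Aspect": {"Imp", "Perf"},
    "VerbForm": {"Fin", "Inf", "Sup", "Part", "Trans"},
    "Mood": {"Ind", "Imp", "Cnd"},
    "Tense": {"Past", "Imp", "Pres", "Fut"},
    "Voice": {"Act", "Pass"},
    "Number": {"Sing", "Dual", "Plur"},
    "Person": {"1", "2", "3"},
    "Gender": {"Masc", "Fem", "Neut"},
    "Animacy": {"Anim", "Nhum", "Inan"},
    "Case": {"Nom", "Gen", "Dat", "Acc", "Voc", "Loc", "Ins"},
    "Definite": {"Ind", "Def"},
    "Negative": {"Pos", "Neg"},
}


def parse_feats(feats: str) -> dict:
    """Parse 'Name=Value|Name=Value' into a dict; '_' or '' means no features."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def invalid_pairs(feats: str) -> list:
    """Return the Name=Value pairs that are not in the inventory above."""
    bad = []
    for name, value in parse_feats(feats).items():
        if value not in UNIVERSAL_FEATURES.get(name, set()):
            bad.append(f"{name}={value}")
    return bad
```

With this sketch, invalid_pairs("VerbForm=Fin|Mood=Ind|Tense=Pres") returns an empty list, while a language-specific value such as Tense=Aor is reported because it is not in the list above.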

3. Universal Part of Speech Tag and Lemma

We discuss various finite and non-finite forms of verbs in Slavic languages. We include some forms on the border of verbs and other parts of speech because we want to define the borderline between parts of speech uniformly for all Slavic languages.

We propose a simple (but approximate!) rule of thumb: if it inflects for Case, it is not a VERB. It is either an ADJ, or a NOUN. We treat such forms as adjectives or nouns derived from verbs. Nevertheless, they may have some features such as VerbForm and Tense that are normally used with verbs and that do not occur with other adjectives and nouns.

Verbal nouns have the neuter gender and they are rarely seen in plural.

Participles may, depending on language, have short and long forms. The long forms almost always inflect for Case and can be used attributively (as modifiers of nouns). We propose to classify them as adjectives. The short forms of some participle types receive the VERB tag: it signals that their inflection is limited3 and their usage is prevailingly predicative. In south Slavic languages even some short participles inflect for Case4 and get the ADJ tag; the short vs. long forms differ in the feature of Definite(ness) there.

Only a few Slavic verbs may function as auxiliaries and be tagged AUX. All of them may also be tagged VERB in other contexts. The main auxiliary verb is to be (být, bývat, byť, być, бути, быть, biti...). It may be used to form the future tense, past tense, conditional and passive. Serbo-Croatian languages use a different auxiliary verb, htjeti "will", to form the future tense. We do not see any benefit in granting the auxiliary status to verbs that are not needed in periphrastic verb forms; in particular, modal verbs are tagged VERB, although UD for Germanic languages treats them as auxiliaries. In accord with the UD guidelines, the verb to be is tagged VERB if it functions as copula.

3 A rare example of short form inflection in Czech is the feminine accusative, e.g. udělanu.

4 Actually only a few forms (masculine singular nominative and masculine inanimate singular accusative) distinguish "long" vs. "short" forms in [sl] and [hr]. In the other cases there is just one form and it does not make much sense to classify it as either long or short.

All words tagged VERB or AUX must have a non-empty value of the feature VerbForm.

The POS tag also determines what word form will be used as the lemma. For VERB and AUX, the lemma is the infinitive (Section 5),5 except for [bg] and [mk]: these languages do not have infinitives, and present indicative forms are used as lemmas there. However, if the word is tagged ADJ, the masculine singular nominative form of the adjective serves as the lemma. The annotation does not show the infinitive of the base verb (except for an optional reference in the MISC column). Similarly, the lemma of a verbal NOUN is its singular nominative form.
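The rules of this section can be summarized in a few lines of code. The sketch below is only an illustration under our own assumptions: the function names are hypothetical, the caller is assumed to know whether a form inflects for Case and whether it is a verbal noun, and the rule of thumb remains approximate (it does not decide between VERB and AUX).

```python
def pos_for_deverbal(inflects_for_case: bool, is_verbal_noun: bool = False) -> str:
    """Rule of thumb: a form that inflects for Case is not a VERB."""
    if not inflects_for_case:
        return "VERB"
    return "NOUN" if is_verbal_noun else "ADJ"


def lemma_form(upos: str, lang: str) -> str:
    """Describe which word form serves as the lemma for a given POS tag."""
    if upos in ("VERB", "AUX"):
        # [bg] and [mk] have no infinitive; a present indicative form is used.
        return "present indicative" if lang in ("bg", "mk") else "infinitive"
    if upos == "ADJ":
        return "masculine singular nominative"
    if upos == "NOUN":
        return "singular nominative"
    raise ValueError(f"tag not covered by this sketch: {upos}")


def missing_verbform(upos: str, feats: dict) -> bool:
    """True if a VERB or AUX lacks the obligatory VerbForm feature."""
    return upos in ("VERB", "AUX") and not feats.get("VerbForm")
```

For example, a long participle that inflects for Case comes out as ADJ with a masculine singular nominative lemma, while a Bulgarian finite verb keeps the VERB tag and is lemmatized by a present indicative form.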

4. Aspect

Slavic languages distinguish two aspects: imperfective (Aspect=Imp) and perfective (Aspect=Perf). The feature is considered lexical, that is, all forms of one lemma (usually) belong to the same aspect. A few verbs (many of them loanwords from non-Slavic languages) work with both aspects. We omit the Aspect feature for these verbs. Most Slavic verbs are part of aspect pairs where one verb is imperfective and the other is perfective. They have different lemmas and the morphological processes that create one from the other are considered derivational. Examples (Imp - Perf): [cs] dělat - udělat "to do", sedět - sednout "to sit", kupovat - koupit "to buy", brát - vzít "to take". Although the meaning of the two verbs is similar, in perfective verbs the action is completed and in imperfective verbs it is ongoing.

The equivalents of the verb to be are imperfective.

5. Infinitive and Supine

[cs] infinitiv, neurčitek; [sk] infinitív, neurčitok; [hsb] infinitiw; [pl] bezokolicznik; [uk] інфінітив; [ru] инфинитив; [sl] nedoločnik (Inf), namenilnik (Sup); [hr] infinitiv. Tables 1 and 2.

Most Slavic languages have a distinct infinitive form, which is used as argument of modal and other verbs (control, purpose), and sometimes in construction of the periphrastic future tense. The infinitive is also used as the citation form of verbs. It does not exist in Macedonian and Bulgarian.

Czech has two forms of infinitive, e.g. dělat and dělati "to do". The longer form with the final -i is considered archaic, otherwise they are grammatically equivalent.

5 We do not prescribe whether inherently reflexive verbs such as [cs] smát se "to laugh" should or should not have the reflexive pronoun incorporated in their lemma.

| en | to be | can | to go | to do | to accept |
|---|---|---|---|---|---|
| cs | být, býti | moct, moci | jít, jíti | dělat, dělati | akceptovat, akceptovati |
| sk | byť | môcť | ísť | robiť | akceptovať |
| hsb | być | móc | hić | dźěłać | akceptować |
| pl | być | móc | iść | robić | akceptować |
| uk | бути (buty) | могти (mohty) | йти (jty) | робити (robyty) | акцептувати (akceptuvaty) |
| ru | быть (byt') | мочь (moč') | идти (idti) | делать (delat') | акцептовать (akceptovat') |
| sl | biti | moči | iti | delati | akceptirati |
| hr | biti | moći | ići | delati, delat | akceptirati, akceptirat |
| cu | быти (byti) | мощи (mošti) | ити (iti) | дѣлати (dělati) | |

Table 1. VerbForm=Inf

| en | to be | can | to go | to do | to accept |
|---|---|---|---|---|---|
| sl | bit | | it | delat | akceptirat |
| cu | бытъ (bytъ) | | итъ (itъ) | дѣлатъ (dělatъ) | |

Table 2. VerbForm=Sup

In contrast, Slovenian uses only the longer form (delati) as infinitive, while the shorter form is called supine and is used after motion verbs (meaning "to go somewhere to do something").6 In Croatian both are considered infinitive but the short form is only used in future tense if the infinitive precedes the auxiliary verb: Učit ću hrvatski. "I will learn Croatian." but Hoću učiti hrvatski.

Infinitive and supine verbs lack most other verbal features; they only have non-empty values of Aspect, VerbForm and in some languages also of Negative.

6. Present and Future Indicative

[cs] přítomný čas (prezens), budoucí čas (futurum); [sk] prítomný čas, budúci čas; [hsb] prezens, futur; [pl] czas teraźniejszy, czas przyszły; [uk] теперішній час, майбутній час; [ru] настоящее время, будущее время; [sl] sedanjik, prihodnjik; [hr] sadašnje vrijeme, buduće vrijeme; [bg] сегашно време, бъдеще време. Tables 3-15.

6 The supine is an old form, attested in Old Church Slavonic. Besides Slovenian, it has also survived in Lower Sorbian.

| Language | Sing 1 | Sing 2 | Sing 3 | Dual 1 | Dual 2 | Dual 3 | Plur 1 | Plur 2 | Plur 3 |
|---|---|---|---|---|---|---|---|---|---|
| cs | jsem | jsi | je | | | | jsme | jste | jsou |
| sk | som | si | je | | | | sme | ste | sú |
| hsb | sym | sy | je | smój | staj | staj | smy | sće | su |
| pl | jestem | jesteś | jest | | | | jesteśmy | jesteście | są |
| uk | є (je) | єси, є (jesy, je) | є (je) | | | | є (je) | є (je) | є (je) |
| ru | | | есть (est') | | | | | | суть (sut') |
| sl | sem | si | je | sva | sta | sta | smo | ste | so |
| hr | jesam, sam | jesi, si | jest, je | | | | jesmo, smo | jeste, ste | jesu, su |
| bg | съм (săm) | си (si) | е (e) | | | | сме (sme) | сте (ste) | са (sa) |
| cu | есмь (jesmь) | еси (jesi) | естъ (jestъ) | есвѣ (jesvě) | еста (jesta) | есте (jeste) | есмъ (jesmъ) | есте (jeste) | сѫтъ (sǫtъ) |

Table 3. To be, VerbForm=Fin | Mood=Ind | Tense=Pres. Note that in Ukrainian and Russian the original non-3rd person forms of this verb have become archaic.

Present tense is a simple finite verb form that marks person and number of the subject. Present forms of perfective verbs have a future meaning; however, we prefer morphology (form) to semantics (function) and annotate them Tense=Pres, regardless of aspect and meaning.7

Future tense of imperfective verbs is usually formed periphrastically, using infinitive or participle of the content verb, and special forms of the auxiliary verb to be, e.g. [cs] budu dělat "I will do". These special forms are different from the present forms and they are annotated Tense=Fut. The infinitive of the content verb does not have the tense feature.

In Croatian, the periphrastic future is formed using another auxiliary verb, htjeti "will / want". This verb can also be used as a content (non-auxiliary) verb, and its auxiliary forms are not different from its normal present forms. Therefore they will be annotated Tense=Pres.

7 Some tagsets prefer to call these forms non-past verb, cf. Przepiórkowski and Woliński (2003).

| Language | Sing 1 | Sing 2 | Sing 3 | Dual 1 | Dual 2 | Dual 3 | Plur 1 | Plur 2 | Plur 3 |
|---|---|---|---|---|---|---|---|---|---|
| cs | budu | budeš | bude | | | | budeme | budete | budou |
| sk | budem | budeš | bude | | | | budeme | budete | budú |
| hsb | budu | budźeš | budźe | budźemoj | budźetej | budźetej | budźemy | budźeće | budu |
| pl | będę | będziesz | będzie | | | | będziemy | będziecie | będą |
| uk | буду (budu) | будеш (budeš) | буде (bude) | | | | будемо (budemo) | будете (budete) | будуть (budut') |
| ru | буду (budu) | будешь (budeš') | будет (budet) | | | | будем (budem) | будете (budete) | будут (budut) |
| sl | bom | boš | bo | bova | bosta | bosta | bomo | boste | bodo |
| cu | бѫдѫ (bǫdǫ) | бѫдеши (bǫdeši) | бѫдетъ (bǫdetъ) | бѫдевѣ (bǫdevě) | бѫдета (bǫdeta) | бѫдете (bǫdete) | бѫдемъ (bǫdemъ) | бѫдете (bǫdete) | бѫдѫтъ (bǫdǫtъ) |

Table 4. To be, VerbForm=Fin | Mood=Ind | Tense=Fut.

A handful of Czech, Slovak and Slovenian motion verbs also have simple future forms, created by the prefix po- (or its variant pů-): [cs] půjde "he will go", pojede "he will ride", poletí "he will fly" but also pokvete "it will bloom". In these cases the prefix is not derivational because it does not create a new perfective lemma with a full paradigm. Thus we annotate these forms as future so they are distinguished from the present forms. In other languages the situation may be different. Russian пойти (pojti) is a full perfective counterpart of the imperfective идти (idti) and its present forms are annotated Tense=Pres.

Ukrainian is special in that it has regular simple future forms of imperfective verbs (not restricted to motion verbs). The periphrastic future also exists.
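The decisions of this section can be sketched as a small function. This is only an illustration under our own assumptions: the class FiniteForm and the three morphology labels ("present", "future_aux_be", "simple_future") are hypothetical and would have to be supplied by a lexicon or a morphological analyzer. The point of the sketch is that the Tense value follows from the morphological form alone and that Aspect is deliberately not consulted.

```python
from dataclasses import dataclass


@dataclass
class FiniteForm:
    lang: str        # ISO 639 code
    form: str        # word form
    aspect: str      # lexical aspect of the lemma: "Imp" or "Perf"
    morphology: str  # hypothetical label: "present", "future_aux_be", "simple_future"


def indicative_tense(f: FiniteForm) -> str:
    """Assign Tense to a finite indicative form from its morphology only."""
    if f.morphology == "present":
        # Includes perfective presents with future meaning and the
        # Croatian future auxiliary htjeti.
        return "Pres"
    if f.morphology in ("future_aux_be", "simple_future"):
        return "Fut"
    raise ValueError(f"unknown morphology label: {f.morphology}")


EXAMPLES = [
    FiniteForm("cs", "udělám", "Perf", "present"),
    FiniteForm("cs", "budu", "Imp", "future_aux_be"),
    FiniteForm("cs", "půjde", "Imp", "simple_future"),
    FiniteForm("ru", "пойдёт", "Perf", "present"),
    FiniteForm("hr", "ću", "Imp", "present"),
    FiniteForm("uk", "йтиму", "Imp", "simple_future"),
]

for example in EXAMPLES:
    print(example.lang, example.form, "Tense=" + indicative_tense(example))
```

Note that the Czech půjde and the Russian пойдёт receive different values although they look alike: only the Russian form belongs to a separate perfective lemma with a full paradigm.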

| Language | Sing 1 | Sing 2 | Sing 3 | Dual 1 | Dual 2 | Dual 3 | Plur 1 | Plur 2 | Plur 3 |
|---|---|---|---|---|---|---|---|---|---|
| cs | půjdu | půjdeš | půjde | | | | půjdeme | půjdete | půjdou |
| sk | pôjdem | pôjdeš | pôjde | | | | pôjdeme | pôjdete | pôjdu |
| hsb | póńdu | póńdźeš | póńdźe | póńdźemoj | póńdźetej | póńdźetej | póńdźemy | póńdźeće | póńdu |
| sl | pojdem | pojdeš | pojde | pojdeva | pojdeta | pojdeta | pojdemo | pojdete | pojdejo |
| uk | йтиму (jtymu) | йтимеш (jtymeš) | йтиме (jtyme) | | | | йтимемо (jtymemo) | йтимете (jtymete) | йтимуть (jtymut') |

Table 5. To go, VerbForm=Fin | Mood=Ind | Tense=Fut.

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | jsem | můžu, mohu | jdu | dělám | akceptuji |
| Sing | 2 | jsi | můžeš | jdeš | děláš | akceptuješ |
| Sing | 3 | je | může | jde | dělá | akceptuje |
| Plur | 1 | jsme | můžeme | jdeme | děláme | akceptujeme |
| Plur | 2 | jste | můžete | jdete | děláte | akceptujete |
| Plur | 3 | jsou | můžou, mohou | jdou | dělají | akceptují |

Table 6. [cs] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | som | môžem | idem | robím | akceptujem |
| Sing | 2 | si | môžeš | ideš | robíš | akceptuješ |
| Sing | 3 | je | môže | ide | robí | akceptuje |
| Plur | 1 | sme | môžeme | ideme | robíme | akceptujeme |
| Plur | 2 | ste | môžete | idete | robíte | akceptujete |
| Plur | 3 | sú | môžu | idú | robia | akceptujú |

Table 7. [sk] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | sym | móžu | du | dźěłam | akceptuju |
| Sing | 2 | sy | móžeš | dźeš | dźěłaš | akceptuješ |
| Sing | 3 | je | móže | dźe | dźěła | akceptuje |
| Dual | 1 | smój | móžemoj | dźemoj | dźěłamoj | akceptujemoj |
| Dual | 2 | staj | móžetej | dźetej | dźěłatej | akceptujetej |
| Dual | 3 | staj | móžetej | dźetej | dźěłatej | akceptujetej |
| Plur | 1 | smy | móžemy | dźemy | dźěłamy | akceptujemy |
| Plur | 2 | sće | móžeće | dźeće | dźěłaće | akceptujeće |
| Plur | 3 | su | móža, móžeja | du, dźeja | dźěłaja | akceptuja |

Table 8. [hsb] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | jestem | mogę | idę | robię | akceptuję |
| Sing | 2 | jesteś | możesz | idziesz | robisz | akceptujesz |
| Sing | 3 | jest | może | idzie | robi | akceptuje |
| Plur | 1 | jesteśmy | możemy | idziemy | robimy | akceptujemy |
| Plur | 2 | jesteście | możecie | idziecie | robicie | akceptujecie |
| Plur | 3 | są | mogą | idą | robią | akceptują |

Table 9. [pl] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | є (je) | можу (možu) | йду (jdu) | роблю (roblju) | акцептую (akceptuju) |
| Sing | 2 | єси, є (jesy, je) | можеш (možeš) | йдеш (jdeš) | робиш (robyš) | акцептуєш (akceptuješ) |
| Sing | 3 | є (je) | може (može) | йде (jde) | робить (robyt') | акцептує (akceptuje) |
| Plur | 1 | є (je) | можемо (možemo) | йдемо, йдем (jdemo, jdem) | робимо, робим (robymo, robym) | акцептуємо (akceptujemo) |
| Plur | 2 | є (je) | можете (možete) | йдете (jdete) | робите (robyte) | акцептуєте (akceptujete) |
| Plur | 3 | є (je) | можуть (možut') | йдуть (jdut') | роблять (robljat') | акцептують (akceptujut') |

Table 10. [uk] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | | могу (mogu) | иду (idu) | делаю (delaju) | акцептую (akceptuju) |
| Sing | 2 | | можешь (možeš') | идёшь (idëš') | делаешь (delaeš') | акцептуешь (akceptueš') |
| Sing | 3 | есть (est') | может (možet) | идёт (idët) | делает (delaet) | акцептует (akceptuet) |
| Plur | 1 | | можем (možem) | идём (idëm) | делаем (delaem) | акцептуем (akceptuem) |
| Plur | 2 | | можете (možete) | идёте (idëte) | делаете (delaete) | акцептуете (akceptuete) |
| Plur | 3 | суть (sut') | могут (mogut) | идут (idut) | делают (delajut) | акцептуют (akceptujut) |

Table 11. [ru] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | sem | morem | grem | delam | akceptiram |
| Sing | 2 | si | moreš | greš | delaš | akceptiraš |
| Sing | 3 | je | more | gre | dela | akceptira |
| Dual | 1 | sva | moreva | greva | delava | akceptirava |
| Dual | 2 | sta | moreta | gresta | delata | akceptirata |
| Dual | 3 | sta | moreta | gresta | delata | akceptirata |
| Plur | 1 | smo | moremo | gremo | delamo | akceptiramo |
| Plur | 2 | ste | morete | greste | delate | akceptirate |
| Plur | 3 | so | morejo | gredo, grejo | delajo | akceptirajo |

Table 12. [sl] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | съм (săm) | мога (moga) | отивам (otivam) | правя (pravja) | акцептирам (akceptiram) |
| Sing | 2 | си (si) | можеш (možeš) | отиваш (otivaš) | правиш (praviš) | акцептираш (akceptiraš) |
| Sing | 3 | е (e) | може (može) | отива (otiva) | прави (pravi) | акцептира (akceptira) |
| Plur | 1 | сме (sme) | можем (možem) | отиваме (otivame) | правим (pravim) | акцептираме (akceptirame) |
| Plur | 2 | сте (ste) | можете (možete) | отивате (otivate) | правите (pravite) | акцептирате (akceptirate) |
| Plur | 3 | са (sa) | могат (mogat) | отиват (otivat) | правят (pravjat) | акцептират (akceptirat) |

Table 13. [bg] VerbForm=Fin | Mood=Ind | Tense=Pres

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | jesam, sam | mogu | idem | delam | akceptiram |
| Sing | 2 | jesi, si | možeš | ideš | delaš | akceptiraš |
| Sing | 3 | jest, je | može | ide | dela | akceptira |
| Plur | 1 | jesmo, smo | možemo | idemo | delamo | akceptiramo |
| Plur | 2 | jeste, ste | možete | idete | delate | akceptirate |
| Plur | 3 | jesu, su | mogu | idu | delaju | akceptiraju |

Table 14. [hr] VerbForm=Fin | Mood=Ind | Tense=Pres

Number Person be can go do

Sing 1 ECMb jesmb Mors. mogp has, lAS id? AkAAra delaj?

Sing 2 ECH jesi MoXEmH mozesi HAEmH, lAEmH idesi AkAAEmH delajesi

Sing 3 ECTt jestb MOXETt mozete HAETt, lAETt idetb AkAAATt delaatb

Dual 1 ECBk jesve MOXEBk mozeve HAEBk, lAEBk ideve AkAAEBk delajeve

Dual 2 ecta jesta MOXETA mozeta HAETa, lAETa ideta AkAAETA delajeta

Dual 3 ECTE jeste MOXETE mozete HAETE, lAETE idete AkAAETE delajete

Plur 1 ECMt jesmb MOXEM! mozemo HAEMt, lAEMt idemb AkAAEMt delajemb

Plur 2 ECTE jeste MOXETE mozete HAETE, lAETE idete AkAAETE delajete

Plur 3 CXT! s?tb MOrSTt moggtb HASTt, igsTt id?tb AkAAraTt delajptb

Table 15. [cu] VerbForm=Fin | Mood=Ind | Tense=Pres

7. Imperative

[cs] rozkazovací způsob (imperativ); [sk] imperatív (rozkazovací spôsob); [hsb] imperatiw; [pl] tryb rozkazujący; [uk] наказовий спосіб; [ru] повелительное наклонение; [sl] velelnik, velelni naklon; [hr] imperativ; [bg] повелително наклонение (императив). Tables 16-25.

Imperative is a simple finite verb form that marks person and number but it does not mark tense (we leave the Tense feature empty). Imperative forms are not available in the 3rd person (appeals to third persons may be formed periphrastically, using particles and present indicative forms; these are not annotated as imperatives). Imperative also does not exist in the 1st person singular. Modal verbs usually do not have imperatives.
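The constraints just described can be sketched as a small helper that builds the feature string of an imperative form. The function name imperative_feats is our own, and the alphabetical ordering of features joined by a vertical bar is an assumption about the output format rather than something this article prescribes.

```python
def imperative_feats(person: int, number: str) -> str:
    """Build the feature string of an imperative form.

    Tense is left empty; 3rd person and 1st person singular are rejected
    because such imperative forms do not exist.
    """
    if number not in ("Sing", "Dual", "Plur"):
        raise ValueError(f"unknown number: {number}")
    if person not in (1, 2):
        raise ValueError("imperative forms are not available in the 3rd person")
    if person == 1 and number == "Sing":
        raise ValueError("imperative does not exist in the 1st person singular")
    feats = {
        "Mood": "Imp",
        "Number": number,
        "Person": str(person),
        "VerbForm": "Fin",
    }
    return "|".join(f"{name}={value}" for name, value in sorted(feats.items()))
```

For instance, imperative_feats(2, "Sing") describes a second person singular form, while imperative_feats(3, "Plur") raises an error because third person appeals are periphrastic and are not annotated as imperatives.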

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | buď | jdi, pojď | dělej | akceptuj |
| Plur | 1 | buďme | jděme, pojďme | dělejme | akceptujme |
| Plur | 2 | buďte | jděte, pojďte | dělejte | akceptujte |

Table 16. [cs] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | buď | choď | rob | akceptuj |
| Plur | 1 | buďme | choďme | robme | akceptujme |
| Plur | 2 | buďte | choďte | robte | akceptujte |

Table 17. [sk] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | budź | dźi, póńdź | dźěłaj | akceptuj |
| Dual | 1 | budźmoj | dźemoj, póńdźmoj | dźěłajmoj | akceptujmoj |
| Dual | 2 | budźtej | dźetej, póńdźtej | dźěłajtej | akceptujtej |
| Plur | 1 | budźmy | dźemy, póńdźmy | dźěłajmy | akceptujmy |
| Plur | 2 | budźće | dźeće, póńdźće | dźěłajće | akceptujće |

Table 18. [hsb] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | bądź | idź | rób | akceptuj |
| Plur | 1 | bądźmy | idźmy | róbmy | akceptujmy |
| Plur | 2 | bądźcie | idźcie | róbcie | akceptujcie |

Table 19. [pl] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | будь (bud') | йди (jdy) | роби (roby) | акцептуй (akceptuj) |
| Plur | 1 | будьмо (bud'mo) | йдімо, йдім (jdimo, jdim) | робімо, робім (robimo, robim) | акцептуймо (akceptujmo) |
| Plur | 2 | будьте (bud'te) | йдіть (jdit') | робіть (robit') | акцептуйте (akceptujte) |

Table 20. [uk] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | будь (bud') | иди (idi) | делай (delaj) | акцептуй (akceptuj) |
| Plur | 1 | будемте (budemte) | идёмте (idëmte) | делаемте (delaemte) | акцептуемте (akceptuemte) |
| Plur | 2 | будьте (bud'te) | идите (idite) | делайте (delajte) | акцептуйте (akceptujte) |

Table 21. [ru] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | bodi | pojdi | delaj | akceptiraj |
| Dual | 1 | bodiva | pojdiva | delajva | akceptirajva |
| Dual | 2 | bodita | pojdita | delajta | akceptirajta |
| Plur | 1 | bodimo | pojdimo | delajmo | akceptirajmo |
| Plur | 2 | bodite | pojdite | delajte | akceptirajte |

Table 22. [sl] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | budi | idi | delaj | akceptiraj |
| Plur | 1 | budimo | idimo | delajmo | akceptirajmo |
| Plur | 2 | budite | idite | delajte | akceptirajte |

Table 23. [hr] VerbForm=Fin | Mood=Imp

| Number | Person | be | go | do | accept |
|---|---|---|---|---|---|
| Sing | 2 | бъди (bădi) | отивай (otivaj) | прави (pravi) | акцептирай (akceptiraj) |
| Plur | 2 | бъдете (bădete) | отивайте (otivajte) | правете (pravete) | акцептирайте (akceptirajte) |

Table 24. [bg] VerbForm=Fin | Mood=Imp

Number Person be go do

Sing 2 EXAH bçdi HAH, IAH idi A^AÄH delai

Dual 2 EXACTA bçdeta ha^tä, ia^tä ideta A^AÄHTÄ delaita

Plur 2 exacte bçdete ha^te, ia^te idete a^aähte delaite

Table 25. [cu] VerbForm=Fin | Mood=Imp

8. Aorist Indicative

[cs] aorist; [hsb] preteritum; [hr] aorist (pređašnje svršeno vreme); [bg] минало свършено време. Tables 26, 32, 28 and 30.

Aorist is the old Slavic simple past tense. It is a finite form that marks person and number of the subject. It existed in the Old Church Slavonic language and it has survived in several languages until today; however, many languages have replaced it by the l-participle. For example, aorist is attested in Old Czech but it vanished during the 15th century.

Aorist is regularly used (together with imperfect, see Section 9) in Bulgarian and Macedonian. It is still understood in Serbian and Croatian, albeit its usage is limited. Aorist has also survived in the Sorbian languages, where it has effectively merged with imperfect into one simple past called preterite. Unlike in Bulgarian, in Sorbian the forms stemming from aorist are only found with perfective verbs, and the historical forms of imperfect only with imperfective verbs8 (Breu, 2000). Hence we have just two inflection patterns, instead of two different tenses.

We can use the simple Tense=Past feature to annotate aorist in Slavic languages as it does not collide with the other past forms. This has been the original intention in Interset and in Universal Dependencies and it is used currently both in the Old Church Slavonic and the Bulgarian data. On the other hand, UD Ancient Greek uses a language-specific value Tense=Aor; if the future versions of the universal guidelines adopt this value, it might be more appropriate to use it.

The Sorbian preterite will also be tagged Tense=Past, regardless of whether the verb is perfective or imperfective.

9. Imperfect Indicative

[cs] imperfektum; [hr] imperfekat (pređašnje nesvršeno vreme); [bg] минало несвършено време. Tables 27, 29 and 31.

Imperfect is another simple past tense that only survived in a few languages. It does not have any equivalent in English, but there are imperfect tenses in Romance languages.

For the merged aorist-imperfect (preterite) in Sorbian languages, see Section 8.

Verbs in imperfect describe states or actions that were happening during some past moment. They may or may not continue at and after the moment of speaking. What matters is the past context and the relation of the action (state) to some other action (state) happening in the past.

8 It could be argued that the Sorbian usage is prototypical, while the imperfect tense of perfective verbs in Bulgarian is marked. Nevertheless, such change of perspective would have no impact on our proposed analysis.

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | bych | možech | jid | dělach | přijiech |
| Sing | 2 | by | može | jide | děla | přijie |
| Sing | 3 | by | može | jide | děla | přijie |
| Dual | 1 | bychově | možechově | jidově | dělachově | přijiechově |
| Dual | 2 | bysta | možesta | jideta | dělasta | přijiesta |
| Dual | 3 | bysta | možesta | jideta | dělasta | přijiesta |
| Plur | 1 | bychom | možechom | jidom | dělachom | přijiechom |
| Plur | 2 | byste | možeste | jidete | dělaste | přijieste |
| Plur | 3 | bychu | možechu | jidú | dělachu | přijiechu |

Table 26. Old [cs] VerbForm=Fin | Mood=Ind | Tense=Past

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | biech | možiech | jdiech | dělajiech | přijiech |
| Sing | 2 | bieše | možieše | jdieše | dělajieše | přijieše |
| Sing | 3 | bieše | možieše | jdieše | dělajieše | přijieše |
| Dual | 1 | biechově | možiechově | jdiechově | dělajiechově | přijiechově |
| Dual | 2 | biešta | možiešta | jdiešta | dělajiešta | přijiešta |
| Dual | 3 | biešta | možiešta | jdiešta | dělajiešta | přijiešta |
| Plur | 1 | biechom | možiechom | jdiechom | dělajiechom | přijiechom |
| Plur | 2 | biešte | možiešte | jdiešte | dělajiešte | přijiešte |
| Plur | 3 | biechu | možiechu | jdiechu | dělajiechu | přijiechu |

Table 27. Old [cs] VerbForm=Fin | Mood=Ind | Tense=Imp

Despite the name, both imperfective and perfective verbs can be used in the imperfect tense! Perfective verbs in the imperfect tense denote actions that were repeated in the past. Hence the Aspect feature should not be used to mark this tense. As discussed in Section 4, that feature should be reserved to denote the lexical aspect of Slavic verbs, bound to their lemma. Instead, Universal Features provide a feature dedicated to the imperfect tense, Tense=Imp.

Examples [bg]:

• Когато се прибрах вкъщи, децата вече спяха. (Kogato se pribrah vkăšti, decata veče spjaha.) "When I came home, the children were already asleep."

• Щом дойдеше, веднага запалваше цигара. (Štom dojdeše, vednaga zapalvaše cigara.) "Every time he came, he always lit a cigarette."
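The treatment of the simple past tenses in Sections 8 and 9 can be sketched as follows. The mapping SIMPLE_PAST and the function simple_past_feats are hypothetical names of our own, and the alphabetical ordering of features is an assumption about the output format. The sketch shows that Tense encodes the tense form while Aspect stays tied to the lemma, so that a perfective verb in the imperfect tense gets both Aspect=Perf and Tense=Imp.

```python
# Tense value proposed for each simple past form.
SIMPLE_PAST = {
    "aorist": "Past",
    "imperfect": "Imp",
    "preterite": "Past",  # merged aorist-imperfect of the Sorbian languages
}


def simple_past_feats(kind: str, aspect: str, person: int, number: str) -> str:
    """Build the feature string of a finite simple past form.

    `kind` names the tense form; `aspect` is the lexical aspect of the lemma
    and is passed through unchanged.
    """
    feats = {
        "Aspect": aspect,
        "Mood": "Ind",
        "Number": number,
        "Person": str(person),
        "Tense": SIMPLE_PAST[kind],
        "VerbForm": "Fin",
    }
    return "|".join(f"{name}={value}" for name, value in sorted(feats.items()))


# Forms from the Bulgarian examples above.
print("прибрах", simple_past_feats("aorist", "Perf", 1, "Sing"))
print("спяха", simple_past_feats("imperfect", "Imp", 3, "Plur"))
print("дойдеше", simple_past_feats("imperfect", "Perf", 3, "Sing"))
```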

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | бях (bjah) | можах (možah) | отивах (otivah) | правих (pravih) | акцептирах (akceptirah) |
| Sing | 2 | беше, бе (beše, be) | можа (moža) | отива (otiva) | прави (pravi) | акцептира (akceptira) |
| Sing | 3 | беше, бе (beše, be) | можа (moža) | отива (otiva) | прави (pravi) | акцептира (akceptira) |
| Plur | 1 | бяхме (bjahme) | можахме (možahme) | отивахме (otivahme) | правихме (pravihme) | акцептирахме (akceptirahme) |
| Plur | 2 | бяхте (bjahte) | можахте (možahte) | отивахте (otivahte) | правихте (pravihte) | акцептирахте (akceptirahte) |
| Plur | 3 | бяха (bjaha) | можаха (možaha) | отиваха (otivaha) | правиха (praviha) | акцептираха (akceptiraha) |

Table 28. [bg] VerbForm=Fin | Mood=Ind | Tense=Past

| Number | Person | be | can | go | do | accept |
|---|---|---|---|---|---|---|
| Sing | 1 | бях (bjah) | можех (možeh) | отивах (otivah) | правех (praveh) | акцептирах (akceptirah) |
| Sing | 2 | беше, бе (beše, be) | можеше (možeše) | отиваше (otivaše) | правеше (praveše) | акцептираше (akceptiraše) |
| Sing | 3 | беше, бе (beše, be) | можеше (možeše) | отиваше (otivaše) | правеше (praveše) | акцептираше (akceptiraše) |
| Plur | 1 | бяхме (bjahme) | можехме (možehme) | отивахме (otivahme) | правехме (pravehme) | акцептирахме (akceptirahme) |
| Plur | 2 | бяхте (bjahte) | можехте (možehte) | отивахте (otivahte) | правехте (pravehte) | акцептирахте (akceptirahte) |
| Plur | 3 | бяха (bjaha) | можеха (možeha) | отиваха (otivaha) | правеха (praveha) | акцептираха (akceptiraha) |

Table 29. [bg] VerbForm=Fin | Mood=Ind | Tense=Imp

Number Person be can go do

Sing 1 БЪ1ХЪ byckb могъ mogъ идъ, 1ДЪ idъ дЪлахъ dëlackb

Sing 2 БЪСТЪ bys^ МОЖЕ moze НДЕ, 1ДЕ ide дЪлашЕ dëlase

Sing 3 БЪСТЪ, БЪ bysЪ by МОЖЕ moze НДЕ, 1ДЕ ide дЪлашЕ dëlase

Dual 1 БЪХОВЪ bychovë моговЪ mogove ндовЪ, 1ДовЪ idove дЪлаховЪ dëlachovë

Dual 2 бъстл bysta можЕта mozeta НДЕта, 1ДЕта ideta дЬласта dëlasta

Dual 3 БЪСТЕ byste МОЖЕТЕ mozete НДЕТЕ, 1ДЕТЕ idete дъллсте dëlaste

Plur 1 БЪХОМЪ bychomъ могомъ mogomъ НДОМЪ, 1ДОМЪ idomъ дЪлахомъ dëlachomъ

Plur 2 БЪСТЕ byste МОЖЕТЕ mozete НДЕТЕ, 1ДЕТЕ idete дъллсте dëlaste

Plur 3 БЪША bysç могж mogg НДЖ, 1ДЖ idg дЪлашА. dëlasç

Table 30. [cu] VerbForm=Fin | Mood=Ind | Tense=Past

Number Person be can go do

Sing 1 Ekxi bechv Moxaaxi mozaachi HAkaxi, lAkaxi ideachi AkAaaxi deiaachi

Sing 2 Ek be MoxaamE mozaase HAkamE, igkamE idease AkAaamE delaase

Sing 3 Ek, EkamE be, bease MoxaamE mozaase HgkamE, igkamE idease AkAaamE delaase

Dual 1 EkxoBk bechove MoxaaxoBk mozaachove HAkaxoBk, igkaxoBk ideachove AkAaaxoBk deiaachove

Dual 2 EkcTa besta MoxaamETa mozaaseta HgkamETa, igkamETa ideaseta AkAaamETa delaaseta

Dual 3 EkamETE, EkCTE beasete, beste MoxaamETE mozaasete HgkamETE, igkamETE idéasete AkAaamETE delaasete

Plur 1 EkxoMl bechomi MoxaaxoMi mozaachomi HAkaxoMi, igkaxoMt ideachomi AkAaaxoMi delaachomi

Plur 2 EkCTE beste MoxaamETE mozaasete HgkamETE, igkamETE ideasete AkAaamETE delaasete

Plur 3 Ekaxs, Ekm& beachg, besg Moxaaxx mozaachg Hgkaxs., igkaxx ideachg AkAaaxs. delaachg

Table 31. [cu] VerbForm=Fin | Mood=Ind | Tense=Imp

Number Person be can go do accept

Sing 1 bech mózech diech dielach akceptowach

Sing 2 bese mózese diese dielase akceptowase

Sing 3 bese mózese diese dielase akceptowase

Dual 1 bechmoj mózechmoj diemoj dielachmoj akceptowachmoj

Dual 2 bestej mózestej diestej dielastej akceptowastej

Dual 3 bestej mózestej diestej dielastej akceptowastej

Plur 1 bechmy mózechmy diechmy dielachmy akceptowachmy

Plur 2 besce mózesce diesce dielasce akceptowasce

Plur 3 bechu mózechu diechu dielachu akceptowachu

Table 32. [hsb] VerbForm=Fin | Mood=Ind | Tense=Past

10. Active Participle and Past Tense

[cs] příčestí činné, minulý čas; [sk] minulý čas; [hsb] ł-forma, perfekt; [pl] czas przeszły; [uk] минулий час; [ru] прошедшее время; [sl] opisni deležnik na -l, preteklik; [hr] glagolski pridjev radni, prošlo vreme; [bg] минало деятелно свършено причастие, минало деятелно несвършено причастие. Tables 33-42.

The typical formation of the past tense in most (but not all) modern Slavic languages is periphrastic, using a finite form of the auxiliary verb to be and the active participle (as opposed to the passive participle). The participle may also be called the past participle because of its close ties to the past tense, even though it is also used to form the conditional and, in some languages, the future tense. Sometimes the participle itself is called the past tense, which makes sense because in some languages the auxiliary verb is omitted. Or it is simply called the l-participle because its suffixes typically involve the consonant -l.

Early stages of Slavic languages (and those modern languages that have retained the aorist) treat the constructions with the l-participle as perfect tenses, comparable to those known from English. Present perfect, past perfect and future perfect may be constructed, depending on the form of the auxiliary verb. Interestingly, the periphrastic past tense is termed préteritum in descriptions of Modern Czech (Academia, 1986), but the term perfektum prevails when Old Czech is described (Komárek et al., 1967) (cf. Präteritum = Imperfekt vs. Perfekt in German).

Like other Slavic participles, the l-participle marks gender and number. Typically it has only the short form, which is used in predicates; it does not inflect for case and is tagged VERB or AUX. Occasional long forms exist but they are considered derived adjectives and tagged ADJ. The derivation is not productive. It applies mainly to intransitive perfective verbs, while the passive participle would be used with transitive verbs for the same purpose. Examples [cs]: spadlý "the one who fell down", shnilý "rotten", pokleslý "dropped". Annotating VerbForm of the derived adjective is purely optional. The short, predicative form should always have VerbForm=Part.

Voice=Act should also always be present so that the l-participle is distinguished from the passive participle.

Some Bulgarian verbs have two l-participles (past participles): perfect and imperfect. We cannot use the Aspect feature to distinguish them because the feature is bound to lemma, and an imperfective verb can have both perfect and imperfect participles. Nevertheless, the distinction is an analogy to the distinction between the two simple past tenses, and we will use the Tense feature to distinguish the participles. The default is Tense=Past (for past perfect participles). Past imperfect participles will get Tense=Imp.

Number Gender Animacy be can go do accept

Sing Masc byl mohl šel dělal akceptoval

Sing Fem byla mohla šla dělala akceptovala

Sing Neut bylo mohlo šlo dělalo akceptovalo

Plur Masc Anim byli mohli šli dělali akceptovali

Plur Masc Inan, Fem byly mohly šly dělaly akceptovaly

Plur Neut byla mohla šla dělala akceptovala

Table 33. [cs] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Gender be can go do accept

Sing Masc bol mohol išiel robil akceptoval

Sing Fem bola mohla išla robila akceptovala

Sing Neut bolo mohlo išlo robilo akceptovalo

Plur boli mohli išli robili akceptovali

Table 34. [sk] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

It is less clear whether the l-participle should be annotated with Tense=Past in the other languages, in which it is not necessary to distinguish different types of l-participles. In many Slavic languages (especially the northern ones) this is the prominent and default function of the l-participle.9 Even in languages where it is used in periphrastic perfect tenses (which co-exist with simple past tenses), the perfect or resultative meaning implies that the action happened in the past, although the past is relative to a point in time that may be different from the moment of speaking. Therefore we recommend including Tense=Past in the annotation.

See Section 18 for the annotation of l-participles used in the current UD datasets.
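
As an illustration only, the recommendation for l-participles can be sketched as a small Python helper that assembles a UD FEATS string. The function name and its parameters are our own, hypothetical choices and are not part of any UD tool; the sketch merely assumes that features are joined with "|" and sorted alphabetically by name.

```python
def l_participle_feats(gender=None, number=None, animacy=None, tense="Past"):
    """Assemble a FEATS string for a short (predicative) l-participle.

    Hypothetical helper: VerbForm=Part and Voice=Act are always set,
    Tense defaults to Past, and Tense=Imp is reserved for the
    Bulgarian past imperfect participle.
    """
    if tense not in ("Past", "Imp"):
        raise ValueError("an l-participle takes Tense=Past or Tense=Imp")
    feats = {"VerbForm": "Part", "Voice": "Act", "Tense": tense}
    # Agreement features are added only when the form marks them.
    if gender:
        feats["Gender"] = gender
    if number:
        feats["Number"] = number
    if animacy:
        feats["Animacy"] = animacy
    return "|".join(f"{name}={value}" for name, value in sorted(feats.items()))


# [cs] byl (Sing Masc) and [bg] можел (Sing Masc, imperfect participle)
print(l_participle_feats("Masc", "Sing"))
print(l_participle_feats("Masc", "Sing", tense="Imp"))
```

The two calls differ only in the Tense value, which mirrors the proposal that Tense, not Aspect, separates the two Bulgarian l-participles.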

In Slovenian and Serbo-Croatian, the finite form of the auxiliary is used with all persons and numbers: Je šel v šolo. "He went to the school." Sem šel v šolo. "I went to the school." In Czech and Slovak, the finite form of the auxiliary is omitted in the 3rd person: Šel do školy. "He went to the school." Šel jsem do školy. "I went to the school." In Ukrainian and Russian, the auxiliary is omitted in all persons. That is why the subject cannot be dropped in Russian. The person could be understood from a finite verb but not from the participle, hence we need a personal pronoun: Он пошел в школу. (On pošel v školu.) "He went to the school." Я пошел в школу. (Ja pošel v školu.) "I went to the school."

9 As mentioned above, it is also used in conditional and in some languages even in the future tense. Still, we are looking for distinctive features of individual words rather than of the periphrastic expressions. In a Slavic-wide perspective, Past seems as close as we can get without defining a language-specific feature for l-participles.

Number Gender be can go do accept

Sing Masc był móhł šoł dźěłał akceptował

Sing Fem była móhła šła dźěłała akceptowała

Sing Neut było móhło šło dźěłało akceptowało

Dual byłoj móhłoj šłoj dźěłałoj akceptowałoj

Plur byli móhli šli dźěłali akceptowali

Table 35. [hsb] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Gender Animacy be can go do accept

Sing Masc był mógł szedł robił akceptował

Sing Fem była mogła szła robiła akceptowała

Sing Neut było mogło szło robiło akceptowało

Plur Masc Anim byli mogli szli robili akceptowali

Plur Masc Nhum, Masc Inan, Fem, Neut były mogły szły robiły akceptowały

Table 36. [pl] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

In Polish, the auxiliary and the participle have merged into one past-tense form. However, the auxiliary morphemes can also attach to a preceding word: Cieszę się, żeś zrozumiał. "I am glad that you have understood." (The auxiliary -ś is attached to a conjunction.) Myśmy nie wiedzieli, że przyjadą. "We did not know they were coming." (Attached to a pronoun.) That is why the tokenization in the Polish treebank cuts off the finite morpheme as a separate syntactic word of a special type called "agglutination". We keep this approach to tokenization, emphasizing the parallelism between the Polish data and the other Slavic languages: Poszedł do szkoły. "He went to the school." Poszedł-em do szkoły. "I went to the school." (The hyphen in the second example indicates tokenization but it does not appear in the surface text.)
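
The tokenization described above can be illustrated with a deliberately naive Python sketch. It is not the procedure used to build the Polish treebank; the suffix list, the list of participle endings and the function name are our own assumptions, and the function is meant to be applied only to tokens already known to be past-tense verb forms.

```python
# Assumed inventory of agglutinating auxiliary morphemes, longest first,
# so that "em" is tried before "m" and "eś" before "ś".
AGGLUTINATES = ("śmy", "ście", "em", "eś", "m", "ś")

# Assumed endings of the l-participle that must remain after the split.
PARTICIPLE_ENDINGS = ("ł", "ła", "ło", "li", "ły")


def split_past_form(form):
    """Split a Polish past-tense form into [participle, auxiliary].

    Returns [form] unchanged when no auxiliary morpheme is recognized,
    e.g. for 3rd person forms, which carry no auxiliary.
    """
    for suffix in AGGLUTINATES:
        if form.endswith(suffix):
            participle = form[: -len(suffix)]
            if participle.endswith(PARTICIPLE_ENDINGS):
                return [participle, suffix]
    return [form]


print(split_past_form("poszedłem"))     # 1st person singular masculine
print(split_past_form("poszedł"))       # 3rd person: nothing to split
print(split_past_form("wiedzieliśmy"))  # 1st person plural
```

A real tokenizer would need a morphological lexicon, because a purely orthographic rule cannot tell verbs from other words that happen to end in the same letters, and it does not handle auxiliaries attached to conjunctions or pronouns (żeś, myśmy).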

Note that there are other types of participles that could be (and sometimes are) called active participles. See Section 13 for details.

Number Gender be can go do accept

Sing Masc був buv міг mih йшов jšov робив robyv акцептував akceptuvav

Sing Fem була bula могла mohla йшла jšla робила robyla акцептувала akceptuvala

Sing Neut було bulo могло mohlo йшло jšlo робило robylo акцептувало akceptuvalo

Plur були buly могли mohly йшли jšly робили robyly акцептували akceptuvaly

Table 37. [uk] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Gender be can go do accept

Sing Masc был byl мог mog шел šel делал delal акцептовал akceptoval

Sing Fem была byla могла mogla шла šla делала delala акцептовала akceptovala

Sing Neut было bylo могло moglo шло šlo делало delalo акцептовало akceptovalo

Plur были byli могли mogli шли šli делали delali акцептовали akceptovali

Table 38. [ru] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Gender be can go do accept

Sing Masc bio mogao šao delao akceptirao

Sing Fem bila mogla šla delala akceptirala

Sing Neut bilo moglo šlo delalo akceptiralo

Plur Masc bili mogli šli delali akceptirali

Plur Fem bile mogle šle delale akceptirale

Plur Neut bila mogla šla delala akceptirala

Table 39. [hr] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Gender be can go do accept

Sing Masc bil mogel šel delal akceptiral

Sing Fem bila mogla šla delala akceptirala

Sing Neut bilo moglo šlo delalo akceptiralo

Dual Masc bila mogla šla delala akceptirala

Dual Fem, Neut bili mogli šli delali akceptirali

Plur Masc bili mogli šli delali akceptirali

Plur Fem bile mogle šle delale akceptirale

Plur Neut bila mogla šla delala akceptirala

Table 40. [sl] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Tense Number Gender be can go do accept

Past Sing Masc бил bil могъл mogăl отивал otival правил pravil акцептирал akceptiral

Past Sing Fem била bila могла mogla отивала otivala правила pravila акцептирала akceptirala

Past Sing Neut било bilo могло moglo отивало otivalo правило pravilo акцептирало akceptiralo

Past Plur били bili могли mogli отивали otivali правили pravili акцептирали akceptirali

Imp Sing Masc можел možel правел pravel

Imp Sing Fem можела možela правела pravela

Imp Sing Neut можело moželo правело pravelo

Imp Plur можели moželi правели praveli

Table 41. [bg] VERB,AUX | VerbForm=Part | Voice=Act

Number Gender be can go do

Sing Masc бꙑлъ bylъ моглъ moglъ шелъ šelъ дѣлалъ dělalъ

Sing Fem бꙑла byla могла mogla шла šla дѣлала dělala

Sing Neut бꙑло bylo могло moglo шло šlo дѣлало dělalo

Dual Masc бꙑла byla могла mogla шла šla дѣлала dělala

Dual Fem Neut бꙑлѣ bylě моглѣ moglě шлѣ šlě дѣлалѣ dělalě

Plur Masc бꙑли byli могли mogli шли šli дѣлали dělali

Plur Fem бꙑлꙑ byly моглꙑ mogly шлꙑ šly дѣлалꙑ dělaly

Plur Neut бꙑла byla могла mogla шла šla дѣлала dělala

Table 42. [cu] VERB,AUX | VerbForm=Part | Voice=Act | Tense=Past

Number Sing Dual Plur

Person 1 2 3 1 2 3 1 2 3

cs bych bys by bychom byste by

hsb bych by by bychmoj byštej byštej bychmy byšće bychu

pl -bym -byś -by -byśmy -byście -by

uk б, би b, by

ru бы, б by, b

hr bih bi bi bismo biste bi

bg бих bih би bi би bi бихме bihme бихте bihte биха biha

cu бимь bimь би bi би bi бивѣ bivě биста bista бисте biste бимъ bimъ бисте biste бѫ, бишѧ bǫ, bišę

Table 43. To be, AUX | VerbForm=Fin | Mood=Cnd.

11. Conditional

[cs] podmiňovací způsob; [sk] podmieňovací spôsob; [hsb] konjunktiw; [pl] tryb przypuszczający; [uk] умовний спосіб; [ru] условное наклонение, кондиционал; [sl] pogojnik; [hr] mogućni način, potencijal; [bg] условно наклонение. Table 43.

The conditional mood (both present and past) is formed periphrastically using the active (l-) participle of the content verb and a special form of the auxiliary verb to be. The auxiliary form is annotated Mood=Cnd, the participle is not. The Tense feature of the auxiliary is empty. Some languages have present and past conditional but the difference is expressed analytically and the same auxiliary form is used in both.

The auxiliary form is finite and in some languages (e.g. Czech) it inflects for number and person. In other languages (e.g. Russian) it has been reduced to a single frozen form that is used in all persons and numbers. Some authors may prefer to tag the frozen auxiliary as particle (PART), but we suggest that it be tagged AUX, with the verb to be as its lemma, to keep the annotation parallel across Slavic languages.

In Slovak and Slovenian, the reduced particle-like conditional auxiliary by / bi is used and combined with the present indicative auxiliary exactly as for the past tense (all persons in Slovenian, only 1st and 2nd in Slovak). The present auxiliary is written separately. A similar analysis is possible in Polish, where the present auxiliary takes the form of the agglutinating morpheme (cf. Section 10) but is treated as an independent syntactic word: potrafili-by-śmy "we would be able".

Sometimes the conditional auxiliary merges with a subordinating conjunction as in Czech aby "so that", kdyby "if", Polish żebyście "so that you", gdybyśmy "if we", or Russian чтобы (čtoby) "so that". According to the UD guidelines we should split such fusions back into syntactic words in the annotation (что-бы).
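
The splitting of such fusions, together with the annotation proposed for the conditional auxiliary, can be sketched as follows. This is a minimal illustration under our own assumptions: the lookup table and the function name are hypothetical, only the Russian entry is taken from the text, and the UPOS tag SCONJ for the conjunction part is our assumption.

```python
# Hypothetical lookup table of fused forms:
# surface form -> (conjunction, auxiliary form, lemma of the auxiliary).
# Only the Russian example from the text is listed here.
FUSED_CONDITIONALS = {
    "чтобы": ("что", "бы", "быть"),
}


def split_fused_conditional(form):
    """Return two syntactic words for a fused conjunction+auxiliary, else None."""
    entry = FUSED_CONDITIONALS.get(form.lower())
    if entry is None:
        return None
    conjunction, auxiliary, aux_lemma = entry
    return [
        # Assumption: the conjunction part is tagged SCONJ.
        {"form": conjunction, "upos": "SCONJ", "feats": "_"},
        # The auxiliary is tagged AUX, lemmatized as "to be", marked
        # Mood=Cnd and finite; its Tense feature stays empty.
        {"form": auxiliary, "lemma": aux_lemma, "upos": "AUX",
         "feats": "Mood=Cnd|VerbForm=Fin"},
    ]


for word in split_fused_conditional("чтобы"):
    print(word)
```

Entries for the Czech and Polish fusions mentioned above could be added to the table in the same way once their intended segmentation is decided.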

12. Adverbial Participle (Transgressive)

[cs] přechodník přítomný, přechodník minulý; [sk] prechodník; [hsb] transgresiw; [pl] imiesłów przysłówkowy współczesny, imiesłów przysłówkowy uprzedni; [uk] дієприслівник теперішнього часу, дієприслівник минулого часу; [ru] деепричастие настоящего времени, деепричастие прошедшего времени; [sl] deležje; [hr] glagolski prilog sadašnji, glagolski prilog prošli; [bg] деепричастие. Tables 44-52.

Adverbial participles, also called transgressives, verbal adverbs, converbs (Nedjalkov and Nedjalkov, 1987) or even gerunds (Comrie and Corbett, 2001),10 are non-finite forms of verbs that can be used as adverbial modifiers in a clause. The circumstance they specify is that the action of the main verb happens while the action of the transgressive is happening (present transgressive), or that it happens after the action of the transgressive has happened (past transgressive). The subject of the clause and of the transgressive is identical.

10 The term gerund may cause confusion: in English it is close to verbal nouns (cf. Section 16), in Romance languages the term denotes present participles. The term transgressive is unique but it is not widely known. We can encounter it in descriptions of Czech and the Sorbian languages; more generally, its usage is limited to the German-Slavic linguistic tradition. We use the term here because it is part of the UD guidelines v1, encoded as the feature VerbForm=Trans.

Tense Number Gender be can go/come do accept

Pres Sing Masc jsa moha jda dělaje akceptuje

Pres Sing Fem,Neut jsouc mohouc jdouc dělajíc akceptujíc

Pres Plur jsouce mohouce jdouce dělajíce akceptujíce

Past Sing Masc byv přišed udělav akceptovav

Past Sing Fem,Neut byvši přišedši udělavši akceptovavši

Past Plur byvše přišedše udělavše akceptovavše

Table 44. [cs] VERB,AUX | VerbForm=Trans. Plural forms do not distinguish gender. The present and past transgressives in the "go/come" and "do" columns are forms of different lemmas (imperfective vs. perfective).

be can go do accept

súc môžuc idúc robiac akceptujúc

Table 45. [sk] VERB,AUX | VerbForm=Trans | Tense=Pres. Modern Slovak has only the present transgressive.

Tense be can go/come do accept

Pres móžo dźejo dźěłajo, dźěłajcy akceptujo, akceptujcy

Past bywši póšowši, póšedši nadźěławši akceptowawši

Table 46. [hsb] VERB,AUX | VerbForm=Trans. The present and past transgressives in the "do" column are forms of different lemmas (imperfective vs. perfective).

Present transgressives tend to be created from imperfective verbs and past transgressives from perfective verbs, but exceptions exist (Academia, 1986, p. 154). Again, Aspect should be fixed to lemma and not used to distinguish the two transgressives. The Tense feature should be used instead.

Transgressives are tagged VERB or AUX but not ADV, and their features include VerbForm=Trans. In some languages they mark gender and number of the subject. In others they do not.
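
These constraints lend themselves to a simple consistency check. The following Python sketch is only an illustration; the function names are hypothetical, and we assume that a missing Tense is acceptable for languages with a single transgressive.

```python
def parse_feats(feats):
    """Turn a FEATS string such as 'Tense=Pres|VerbForm=Trans' into a dict."""
    if feats in ("", "_"):
        return {}
    return dict(pair.split("=", 1) for pair in feats.split("|"))


def check_transgressive(upos, feats):
    """Return a list of problems found in the annotation of a transgressive.

    Tokens without VerbForm=Trans are not checked at all.
    """
    features = parse_feats(feats)
    problems = []
    if features.get("VerbForm") != "Trans":
        return problems
    if upos not in ("VERB", "AUX"):
        problems.append(f"transgressive tagged {upos}, expected VERB or AUX")
    tense = features.get("Tense")
    # Assumption: Tense may be absent, but if present it is Pres or Past.
    if tense is not None and tense not in ("Pres", "Past"):
        problems.append(f"unexpected Tense={tense} on a transgressive")
    return problems


print(check_transgressive("VERB", "Tense=Pres|VerbForm=Trans"))
print(check_transgressive("ADV", "Tense=Past|VerbForm=Trans"))
```

Note that the check never looks at Aspect, in line with the recommendation that Aspect is tied to the lemma and Tense alone separates the two transgressives.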

Tense be can go/come do accept

Pres będąc mogąc idąc robiąc akceptując

Past bywszy poszedłszy zrobiwszy akceptowawszy

Table 47. [pl] VERB,AUX | VerbForm=Trans. The present and past transgressives in the "go" and "do" columns are forms of different lemmas (imperfective vs. perfective).

Tense be can go/come do accept

Pres будучи budučy можучи možučy йдучи jdučy роблячи robljačy акцептуючи akceptujučy

Past бувши buvšy могши mohšy прийшовши pryjšovšy зробивши zrobyvšy акцептувавши akceptuvavšy

Table 48. [uk] VERB,AUX | VerbForm=Trans. The present and past transgressives in the "go/come" and "do" columns are forms of different lemmas (imperfective vs. perfective).

Tense be can go/come do accept

Pres будучи buduči идя idja делая delaja акцептуя akceptuja

Past быв, бывши byv, byvši могши mogši шедши šedši делав, делавши delav, delavši акцептовавши akceptovavši

Table 49. [ru] VERB,AUX | VerbForm=Trans.

Tense be can go/come do accept

Pres bodoč idoč delaje akceptiraje

Past bivši prišedši dodelavši akceptiravši

Table 50. [sl] VERB,AUX | VerbForm=Trans. The present and past transgressives in the "go/come" and "do" columns are forms of different lemmas (imperfective vs. perfective).

Tense be can go/come do accept

Pres budući mogući idući delajući akceptirajući

Past bivši došavši dodelavši akceptiravši

Table 51. [hr] VERB,AUX | VerbForm=Trans. The present and past transgressives in the "go/come" and "do" columns are forms of different lemmas (imperfective vs. perfective).

be can go do accept

бъдейки, бидейки bădejki, bidejki можейки možejki отивайки otivajki правейки pravejki акцептирайки akceptirajki

Table 52. [bg] VERB,AUX | VerbForm=Trans.

13. Verbal Adjective or Active Participle

[cs] přídavné jméno slovesné činné (zpřídavnělý přechodník); [sk] činné príčastie; [hsb] prezensowy particip; [pl] imiesłów przymiotnikowy czynny; [uk] активний дієприкметник; [ru] действительное причастие; [sl] deležnik na -č, -ši; [hr] particip, glagolski pridjev; [bg] сегашно деятелно причастие. Tables 53-61.

Active verbal adjectives (or participles) correspond to transgressives (see Section 12) and are different from the active l-participle (see Section 10). They are used attributively (not predicatively) and inflect for Case, except in Bulgarian, which has neither long participles nor cases.

They should be tagged ADJ, not VERB or AUX, although their derivation from verbs is quite productive. Their lemma is the nominative singular form of the adjective, not the infinitive of the verb.

Optionally their relation to verbs may be documented using the features VerbForm=Part, Voice=Act, Aspect (same as the aspect of the base verb) and Tense (whether they correspond to the present or the past transgressive). The meaning directly follows from the transgressive: [cs] dělající "one who is doing" (present verbal adjective); udělavší "one who has done" (past verbal adjective).
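
The proposed treatment of active verbal adjectives can be summarized in a short Python sketch. The function and its parameter names are hypothetical and serve only as an illustration; agreement features (Case, Gender, Number) are left out for brevity.

```python
def annotate_active_verbal_adjective(lemma, aspect, tense, document_verbal_origin=True):
    """Return lemma, UPOS and FEATS for an active verbal adjective.

    The lemma is the nominative singular of the adjective (not the
    infinitive) and the tag is always ADJ. The verbal features are
    optional, so they can be switched off.
    """
    feats = {}
    if document_verbal_origin:
        feats = {"Aspect": aspect, "Tense": tense,
                 "VerbForm": "Part", "Voice": "Act"}
    feats_string = "|".join(f"{k}={v}" for k, v in sorted(feats.items())) or "_"
    return {"lemma": lemma, "upos": "ADJ", "feats": feats_string}


# [cs] present and past verbal adjectives from the text
print(annotate_active_verbal_adjective("dělající", "Imp", "Pres"))
print(annotate_active_verbal_adjective("udělavší", "Perf", "Past"))
```

With document_verbal_origin=False the result carries no verbal features at all, which corresponds to the minimal annotation that the proposal still permits.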

In standard Ukrainian, active verbal adjectives are considered ungrammatical, being a consequence of russification.11

11 http://nl.ijs.si/ME/V4/msd/html/msd.A-uk.html#msd-body.1_div.3_div.11_div.5_div.1

Number Sing Plur

Gender Masc Neut Fem

Animacy Anim Inan

Nom dělající dělající

Gen dělajícího dělajících

Dat dělajícímu dělajícím

Acc dělajícího dělající dělající dělající

Voc dělající dělající

Loc dělajícím dělajících

Ins dělajícím dělajícími

Table 53. [cs] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective dělající means "doing" and is derived from the imperfective verb dělat "to do". The corresponding past adjective is udělavší; it is derived from the perfective verb udělat and uses the same suffixes.

Number Sing Plur

Gender Masc Neut Fem Masc Fem,Neut

Animacy Anim Inan Anim Inan

Nom robiaci robiace robiaca robiaci robiace

Gen robiaceho robiacej robiacich

Dat robiacemu robiacej robiacim

Acc robiaceho robiaci robiace robiacu robiacich robiace

Loc robiacom robiacej robiacich

Ins robiacim robiacou robiacimi

Table 54. [sk] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective robiaci means "doing" and is derived from the imperfective verb robiť "to do". The corresponding past adjective is robivší with similar suffixes.

Number Sing Dual Plur

Gender Masc Neut Fem Masc Fem,Neut Masc Fem,Neut

Animacy Anim Inan Anim Inan Anim Inan

Nom dźěłacy dźěłace dźěłaca dźěłacaj dźěłacej dźěłaci dźěłace

Gen dźěłaceho dźěłaceje dźěłaceju dźěłacych

Dat dźěłacemu dźěłacej dźěłacymaj dźěłacym

Acc dźěłaceho dźěłacy dźěłace dźěłacu dźěłaceju dźěłacej dźěłacych dźěłace

Loc dźěłacym dźěłacej dźěłacymaj dźěłacych

Ins dźěłacym dźěłacej dźěłacymaj dźěłacymi

Table 55. [hsb] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective dźěłacy means "doing" and is derived from the imperfective verb dźěłać "to do".

Number Sing Plur

Gender Masc Neut Fem Masc Fem,Neut

Animacy Anim,Nhum Inan Anim Nhum,Inan

Nom robiący robiące robiąca robiący robiące

Gen robiącego robiącej robiących

Dat robiącemu robiącej robiącym

Acc robiącego robiący robiące robiącą robiących robiące

Voc robiący robiące robiąca robiący robiące

Loc robiącym robiącej robiących

Ins robiącym robiącą robiącymi

Table 56. [pl] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective robiący means "doing" and is derived from the imperfective verb robić "to do". The corresponding past adjective is zrobiwszy; it is derived from the perfective verb zrobić and uses the same suffixes.

Number Sing Plur

Gender Masc Neut Fem

Animacy Anim Inan

Nom делающий delajuščij делающее delajuščee делающая delajuščaja делающие delajuščie

Gen делающего delajuščego делающей delajuščej делающих delajuščih

Dat делающему delajuščemu делающей delajuščej делающим delajuščim

Acc делающего delajuščego делающий delajuščij делающее delajuščee делающую delajuščuju делающие delajuščie

Loc делающем delajuščem делающей delajuščej делающих delajuščih

Ins делающим delajuščim делающей, делающею delajuščej, delajuščeju делающими delajuščimi

Table 57. [ru] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective делающий (delajuščij) means "doing" and is derived from the imperfective verb делать (delat') "to do". The corresponding past adjective is сделавший (sdelavšij); it is derived from the perfective verb сделать (sdelat') and uses the same suffixes.

Number Sing Dual Plur

Gender Masc Neut Fem Masc Fem,Neut Masc Fem Neut

Animacy Anim Inan

Nom delajoč delajoče delajoča delajoča delajoči delajoči delajoče delajoča

Gen delajočega delajoče delajočih

Dat delajočemu delajoči delajočima delajočim

Acc delajočega delajoč delajoče delajočo delajoča delajoči delajoče delajoča

Loc delajočem delajoči delajočih

Ins delajočim delajočo delajočima delajočimi

Table 58. [sl] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective delajoč / delajoči means "doing" and is derived from the imperfective verb delati "to do".

Number Sing Plur

Gender Masc Neut Fem Masc Fem Neut

Animacy Anim Inan

Nom delajući delajuće delajuća delajući delajuće delajuća

Gen delajućeg delajuće delajućih

Dat delajućem delajućoj delajućim

Acc delajućeg delajući delajuće delajuću delajuće delajuća

Voc delajući delajuće delajuća delajući delajuće delajuća

Loc delajućem delajućoj delajućim

Ins delajućim delajućom delajućim

Table 59. [hr] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective delajući means "doing" and is derived from the imperfective verb delati "to do". The corresponding past adjective is dodelavši; it is derived from the perfective verb dodelati and uses the same suffixes.

Number Sing Plur

Gender Masc Fem Neut

Ind правещ pravešt правеща pravešta правещо pravešto правещи pravešti

Def правещият praveštijat правещата praveštata правещото praveštoto правещите praveštite

Table 60. [bg] правещ (pravešt) "doing" ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The rows correspond to different values of Definite. Bulgarian adjectives do not inflect for Case.

Number Sing Dual

Gender Masc Neut Fem Masc Neut Fem

Nom дѣлаѩ dělaję дѣлаѭщи dělajǫšti дѣлаѭща dělajǫšta дѣлаѭщи dělajǫšti

Gen дѣлаѭща dělajǫšta дѣлаѭщѧ dělajǫštę дѣлаѭщоу dělajǫštu

Dat дѣлаѭщоу dělajǫštu дѣлаѭщи dělajǫšti дѣлаѭщема dělajǫštema дѣлаѭщама dělajǫštama

Acc дѣлаѭщь dělajǫštь дѣлаѭще dělajǫšte дѣлаѭщѫ dělajǫštǫ дѣлаѭща dělajǫšta дѣлаѭщи dělajǫšti

Voc дѣлаѩ dělaję дѣлаѭщи dělajǫšti дѣлаѭща dělajǫšta дѣлаѭщи dělajǫšti

Loc дѣлаѭщи dělajǫšti дѣлаѭщоу dělajǫštu

Ins дѣлаѭщемь dělajǫštemь дѣлаѭщеѭ dělajǫštejǫ дѣлаѭщема dělajǫštema дѣлаѭщама dělajǫštama

Number Plur

Gender Masc Neut Fem

Nom дѣлаѭще dělajǫšte дѣлаѭща dělajǫšta дѣлаѭщѧ dělajǫštę

Gen дѣлаѭщь dělajǫštь

Dat дѣлаѭщемъ dělajǫštemъ дѣлаѭщамъ dělajǫštamъ

Acc дѣлаѭщѧ dělajǫštę дѣлаѭща dělajǫšta дѣлаѭщѧ dělajǫštę

Voc дѣлаѭще dělajǫšte дѣлаѭща dělajǫšta дѣлаѭщѧ dělajǫštę

Loc дѣлаѭщихъ dělajǫštichъ дѣлаѭщахъ dělajǫštachъ

Ins дѣлаѭщи dělajǫšti дѣлаѭщами dělajǫštami

Table 61. [cu] ADJ | Aspect=Imp | VerbForm=Part | Voice=Act | Tense=Pres. The adjective дѣлаѩ (dělaję) means "doing" and is derived from the imperfective verb дѣлати (dělati) "to do". The corresponding past adjective is съдѣлавъ (sъdělavъ); it is derived from the perfective verb съдѣлати (sъdělati) and uses similar suffixes: Sing Masc Gen съдѣлавъша (sъdělavъša), Sing Fem Nom съдѣлавъши (sъdělavъši) etc. The table shows the short ("strong") forms of the nominal declension.

14. Passive Participle

[cs] příčestí trpné, přídavné jméno slovesné trpné; [sk] trpné príčastie; [hsb] preteritowy particip; [pl] imiesłów przymiotnikowy bierny; [uk] пасивний дієприкметник; [ru] страдательное причастие; [sl] trpni deležnik; [hr] glagolski pridjev trpni; [bg] минало страдателно причастие. Tables 62-72.

The passive participle is a non-finite verbal form used to construct the periphrastic passive. It is the only form that bears the feature Voice=Pass.

All the other verb forms may take part in passive constructions. Examples [cs]: je nominován "he is (being) nominated"; byl jsem nominován "I was nominated"; byl bych nominován "I would be nominated"; budeš nominován "you will be nominated"; buďte nominován "be nominated"; být nominován "to be nominated" etc. It is always the passive participle that makes the construction passive. The auxiliary verb forms do not differ morphologically from the forms used in the active voice, which is the default. Therefore they should either be marked Voice=Act, or the Voice feature should be left empty. We suggest that the explicit annotation of Voice=Act be mandatory for the other participles, so that all types of participles are explicitly distinguished. For the other verbal forms, the feature is optional.

Note that Slavic languages also have the reflexive passive, consisting of a reflexive pronoun and a 3rd person indicative verb ([cs] Prezident se volí každé 4 roky. "The president is elected every 4 years."). Although the analytical construction is passive, the participating verb is morphologically not passive and will not be marked as such. The passive nature of the clause will be visible in the dependency annotation (the subject will be attached as nsubjpass and the reflexive pronoun will be attached using the language-specific relation auxpass:reflex). In [ru] the reflexive pronoun is written as one word with the finite verb: негласно считалось, что ему простительно всякое (neglasno sčitalos', čto emu prostitel'no vsjakoe) "it was silently thought that he could be forgiven everything". When it is used to form the reflexive passive, we could in theory mark the whole form as passive; however, we recommend splitting the form into two syntactic words (считало+сь / sčitalo+s') and making it parallel with the other Slavic languages.
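
The recommended split of the Russian reflexive forms can be sketched in a few lines of Python. This is a simplification under our own assumptions: the function name is hypothetical, the function relies on a part-of-speech tag supplied by the caller, and it does not decide whether a given reflexive form is actually used as a reflexive passive.

```python
REFLEXIVE_POSTFIXES = ("ся", "сь")


def split_reflexive(form, upos):
    """Split a Russian verb form into [verb, reflexive postfix].

    Non-verbs are returned unchanged, because many other words
    (e.g. весь "all") merely happen to end in the same letters.
    """
    if upos in ("VERB", "AUX"):
        for postfix in REFLEXIVE_POSTFIXES:
            if form.endswith(postfix) and len(form) > len(postfix):
                return [form[: -len(postfix)], postfix]
    return [form]


print(split_reflexive("считалось", "VERB"))
print(split_reflexive("весь", "DET"))
```

After the split, the first syntactic word is an ordinary active form and the reflexive element can be attached with the dependency relation mentioned above, as in the other Slavic languages.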

Passive participles may have short and long forms. As explained above (see Section 3), this distinction can be interpreted as indefinite vs. definite adjectives in the South Slavic languages. In the north it applies to Czech and Russian, where the short forms are used predicatively and their Case inflection has almost vanished (Czech short participles may form the accusative but it is very rare). Since we cannot distinguish the forms by the Definite feature here, we suggest tagging the short forms VERB, even though the remnants of case inflection make this decision slightly inconsistent with the rest.12 The long forms are also called passive verbal adjectives and we treat them as adjectives derived from verbs. Their tag should be ADJ and their lemma should be the adjectival form in masculine singular nominative, not the verb infinitive. They can be used as attributive modifiers of noun phrases (with which they agree in gender, number and case).

12 We also lose the parallelism between short passive participles and short forms of adjectives in Czech (nemocen vs. nemocný "ill"). The short adjectives are used in predicates as well. This is a controversial issue and the guideline we propose may be revised in future.

Number Sing Plur

Gender Masc Neut Fem Masc Fem Neut

Animacy Anim Inan Anim Inan

Nom dělaný dělané dělaná dělaní dělané dělaná

Gen dělaného dělané dělaných

Dat dělanému dělané dělaným

Acc dělaného dělaný dělané dělanou dělané dělaná

Voc dělaný dělané dělaná dělaní dělané dělaná

Loc dělaném dělané dělaných

Ins dělaným dělanou dělanými

VERB dělán děláno dělána děláni dělány dělána

Table 62. [cs] dělaný / dělán "done" ADJ,VERB | Aspect=Imp | VerbForm=Part | Voice=Pass.

Number Sing Plur

Gender Masc Neut Fem Masc Fem,Neut

Animacy Anim Inan Anim Inan

Nom robený robené robená robení robené

Gen robeného robenej robených

Dat robenému robenej robeným

Acc robeného robený robené robenú robených robené

Loc robenom robenej robených

Ins robeným robenou robenými

Table 63. [sk] robený "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass.

The long forms of passive participles may also be used in predicates, especially in languages that have only the long forms (e.g. Slovak). However, since they are tagged as adjectives, the dependency layer will analyze them as adjectival predicates with a copula.

In Polish and Ukrainian, the attributive form of singular neuter is different from the predicative one: [uk] писане правило (pysane pravylo) "a written rule" vs. правило писано (pravylo pysano) "a rule is/was written". One might be tempted to tag the predicative forms as VERB instead of ADJ, to make them parallel with the short (predicative) participles in Czech and Russian. Unfortunately, that would mean that two very similar Ukrainian sentences would get different part-of-speech and dependency analyses just because their subjects differ in gender and/or number. Therefore it seems better to classify these forms as adjectives, too.

Number Sing Dual Plur

Gender Masc Neut Fem Masc Fem,Neut Masc Fem,Neut

Animacy Anim Inan Anim Inan Anim Inan

Nom dźěłany dźěłane dźěłana dźěłanaj dźěłanej dźěłani dźěłane

Gen dźěłaneho dźěłaneje dźěłaneju dźěłanych

Dat dźěłanemu dźěłanej dźěłanymaj dźěłanym

Acc dźěłaneho dźěłany dźěłane dźěłanu dźěłaneju dźěłanej dźěłanych dźěłane

Loc dźěłanym dźěłanej dźěłanymaj dźěłanych

Ins dźěłanym dźěłanej dźěłanymaj dźěłanymi

Table 64. [hsb] dźěłany "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass.

Number Sing Plur

Gender Masc Neut Fem Masc Fem,Neut

Animacy Anim,Nhum Inan Anim Nhum,Inan

Nom robiony robione robiona robieni robione

robiono

Gen robionego robionej robionych

Dat robionemu robionej robionym

Acc robionego robiony robione robioną robionych robione

Voc robiony robione robiona robieni robione

Loc robionym robionej robionych

Ins robionym robioną robionymi

Table 65. [pl] robiony "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass.

Slovenian and Serbo-Croatian inflect both short and long adjectives for Case, and the same applies to passive participles (passive verbal adjectives).

Number Sing Plur

Gender/Animacy Masc/Anim Masc/Inan Neut Fem

Nom делаемый delaemyj делаемое delaemoe делаемая delaemaja делаемые delaemye

Gen делаемого delaemogo делаемой delaemoj делаемых delaemyh

Dat делаемому delaemomu делаемой delaemoj делаемым delaemym

Acc делаемого delaemogo делаемый delaemyj делаемое delaemoe делаемую delaemuju делаемых, делаемые delaemyh, delaemye

Loc делаемом delaemom делаемой delaemoj делаемых delaemyh

Ins делаемым delaemym делаемой, делаемою delaemoj, delaemoju делаемыми delaemymi

VERB делаем delaem делаемо delaemo делаема delaema делаемы delaemy

Table 66. [ru] ADJ,VERB | Aspect=Imp | VerbForm=Part | Voice=Pass | Tense=Pres.

Number Sing Plur

Gender/Animacy Masc/Anim Masc/Inan Neut Fem

Nom сделанный sdelannyj сделанное sdelannoe сделанная sdelannaja сделанные sdelannye

Gen сделанного sdelannogo сделанной sdelannoj сделанных sdelannyh

Dat сделанному sdelannomu сделанной sdelannoj сделанным sdelannym

Acc сделанного sdelannogo сделанный sdelannyj сделанное sdelannoe сделанную sdelannuju сделанных, сделанные sdelannyh, sdelannye

Loc сделанном sdelannom сделанной sdelannoj сделанных sdelannyh

Ins сделанным sdelannym сделанной, сделанною sdelannoj, sdelannoju сделанными sdelannymi

VERB сделан sdelan сделано sdelano сделана sdelana сделаны sdelany

Table 67. [ru] сделанный / сделан (sdelannyj / sdelan) "done" ADJ,VERB | Aspect=Perf | VerbForm=Part | Voice=Pass | Tense=Past.

Number Sing Plur

Gender Masc Neut Fem

Animacy Anim Inan Anim Inan

Nom зроблений zroblenyj зроблене zroblene зроблена zroblena зроблені zrobleni

зроблено zrobleno

Gen зробленого zroblenoho зробленої zroblenoï зроблених zroblenych

Dat зробленому zroblenomu зробленій zroblenij зробленим zroblenym

Acc зробленого zroblenoho зроблений zroblenyj зроблене zroblene зроблену zroblenu зроблених zroblenych зроблені zrobleni

Loc зробленому zroblenomu зробленій zroblenij зроблених zroblenych

Ins зробленим zroblenym зробленою zroblenoju зробленими zroblenymy

Table 68. [uk] зроблений (zroblenyj) "done" ADJ | Aspect=Perf | VerbForm=Part | Voice=Pass. The Nom-Ins rows show Case inflections of verbal adjectives.

Number Sing Plur

Gender Masc Fem Neut

Ind правен praven правена pravena правено praveno правени praveni

Def правеният pravenijat правената pravenata правеното pravenoto правените pravenite

Table 69. [bg] правен (praven) "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass. The rows correspond to different values of Definite. Bulgarian adjectives do not inflect for Case.

Definite adjectives are longer than indefinite ones also in Bulgarian and Macedonian, although the construction is different from that of [sl] and [hr]. The definite forms are used only attributively, the short forms both as attributes and predicates. As this also applies to passive participles, it seems appropriate to classify them (both forms) as ADJ. They do not inflect for Case but neither do adjectives because [bg] and [mk] have lost the case system.

Russian and Old Church Slavonic distinguish present and past passive participles: журнал, читаемый студентом (žurnal, čitaemyj studentom) "journal that is being read by the student" vs. журнал, прочитанный студентом (žurnal, pročitannyj studentom) "journal that has been read by the student". The distinction will be annotated using the Tense feature. Note that other languages will have the Tense feature empty. In Czech, both examples use the same (the only) passive participle and differ only in the prefix, because the second verb is perfective: časopis (pře)čtený studentem "journal read by the student".

Number Sing Dual Plur
Gender Masc Neut Fem Masc Fem,Neut Masc Fem Neut
Animacy Anim Inan
Nom delan delano delana delana delani delani delane delana
Gen delanega delane delanih
Dat delanemu delani delanima delanim
Acc delanega delan delano delana delani delane delana
Loc delanem delani delanih
Ins delanim delano delanima delanimi

Table 70. [sl] delan / delani "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass.

Number Sing Plur
Gender Masc Neut Fem Masc Fem Neut
Animacy Anim Inan
Nom delan delano delana delani delane delana
Gen delanog delane delanih
Dat delanom delanoj delanim
Acc delanog delan delano delanu delane delana
Voc delan delano delana delani delane delana
Loc delanom delanoj delanim
Ins delanim delanom delanim

Table 71. [hr] delan / delani "done" ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass.

Passive participles are normally formed for transitive verbs, although verbs that subcategorize for a non-accusative object may also have a passive participle (neuter singular only).

Number Sing Dual

Gender Masc Neut Fem Masc Neut Fem

Nom дѣлаѥмъ дѣлаѥмо дѣлаѥма дѣлаѥма дѣлаѥмѣ
dělajemъ dělajemo dělajema dělajema dělajemě
Gen дѣлаѥма дѣлаѥмы дѣлаѥмоу
dělajema dělajemy dělajemu
Dat дѣлаѥмоу дѣлаѥмѣ дѣлаѥмома дѣлаѥмама
dělajemu dělajemě dělajemoma dělajemama
Acc дѣлаѥмъ дѣлаѥмо дѣлаѥмѫ дѣлаѥма дѣлаѥмѣ
dělajemъ dělajemo dělajemǫ dělajema dělajemě
Voc дѣлаѥмъ дѣлаѥмо дѣлаѥма дѣлаѥмѣ
dělajemъ dělajemo dělajema dělajemě
Loc дѣлаѥмѣ дѣлаѥмоу
dělajemě dělajemu
Ins дѣлаѥмомь дѣлаѥмоѭ дѣлаѥмома дѣлаѥмама
dělajemomь dělajemojǫ dělajemoma dělajemama

Number Plur
Gender Masc Neut Fem
Nom дѣлаѥми dělajemi дѣлаѥма dělajema дѣлаѥмы dělajemy
Gen дѣлаѥмъ dělajemъ
Dat дѣлаѥмомъ dělajemomъ дѣлаѥмамъ dělajemamъ
Acc дѣлаѥмы dělajemy дѣлаѥма dělajema дѣлаѥмы dělajemy
Voc дѣлаѥми dělajemi дѣлаѥма dělajema дѣлаѥмы dělajemy
Loc дѣлаѥмѣхъ dělajeměchъ дѣлаѥмахъ dělajemachъ
Ins дѣлаѥмы dělajemy дѣлаѥмами dělajemami

Table 72. [cu] ADJ | Aspect=Imp | VerbForm=Part | Voice=Pass | Tense=Pres. The adjective дѣлаѥмъ (dělajemъ) means "being done" and is derived from the imperfective verb дѣлати (dělati) "to do". The corresponding past adjective is съдѣланъ (sъdělanъ) "done"; it is derived from the perfective verb съдѣлати (sъdělati) and uses similar suffixes: Sing Fem Nom съдѣлана (sъdělana), Sing Neut Nom съдѣлано (sъdělano) etc. The table shows the short ("strong") forms of the nominal declension.

Example    Gloss                Languages                       L  Tag       VerbForm  Voice  Tense   Definite
budoucí    what will be         all?                            l  ADJ       (Part)    -      (Fut)   -
delajoč    who is doing         sl, bg, cu                      s  ADJ       Part      Act    Pres    Ind
dělající   who is doing         all                             l  ADJ       Part      Act    Pres    (Def)
съдѣлавъ   who has done         cu                              s  ADJ       Part      Act    Past    Ind
udělavší   who has done         cs, sk, hsb, pl, uk, ru, hr     l  ADJ       Part      Act    Past    (Def)
dělal      did / (has) done     all                             s  VERB/AUX  Part      Act    Past    -
правел     was doing            bg                              s  VERB      Part      Act    Imp     -
minulý     what has passed      all?                            l  ADJ       (Part)    -      (Past)  -
dělán      (is (being)) done    cs                              s  VERB      Part      Pass   -       -
delan      ((who) is) done      sl, hr, bg                      s  ADJ       Part      Pass   -       Ind
dělaný     who is/was done      cs, sk, hsb, pl, uk, sl, hr, bg l  ADJ       Part      Pass   -       (Def)
делаем     (is being) done      ru                              s  VERB      Part      Pass   Pres    -
дѣлаѥмъ    (is being) done      cu                              s  ADJ       Part      Pass   Pres    Ind
делаемый   who is being done    ru, cu                          l  ADJ       Part      Pass   Pres    (Def)
сделан     (has been, is) done  ru                              s  VERB      Part      Pass   Past    -
съдѣланъ   (who is) done        cu                              s  ADJ       Part      Pass   Past    Ind
сделанный  who has been done    ru, cu                          l  ADJ       Part      Pass   Past    (Def)

Table 73. Participles. The "L" column denotes short (s) vs. long (l) forms. The Def feature only applies in languages where the Ind counterpart exists. A dash marks an empty feature.

15. Participle Summary

Participles are words that share properties of verbs and adjectives. Just like adjectives, they have short and long forms. Historically, the long forms emerged as a fusion of the short form and a pronoun. North Slavic languages either do not have the short form or they do not mark the Case on it; there, short and long forms are distinguished by the POS tag (VERB/ADJ). South Slavic languages use the short form and inflect it for Case (except for [bg] and [mk], which have lost cases). The long form is definite. Both forms are ADJ; short vs. long is distinguished by Definite=Ind/Def. The l-participle is special: its short form is VERB even in the South Slavic languages (the Definite and Case features of the short form are empty). Table 73 gives a summary of the proposed annotation of participles. Adverbial participles are not covered here because we tag them as transgressives (VerbForm=Trans, see Section 12). [cu] does not have transgressives, but the nominative forms of its active participles correspond to transgressives and can be used as adverbial modifiers.

Number Sing Plur
Nom dělání dělání
Gen dělání dělání
Dat dělání děláním
Acc dělání dělání
Voc dělání dělání
Loc dělání děláních
Ins děláním děláními

Table 74. [cs] dělání "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

Number Sing Plur
Nom robenie robenia
Gen robenia robení
Dat robeniu robeniam
Acc robenie robenia
Loc robení robeniach
Ins robením robeniami

Table 75. [sk] robenie "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.
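The general rules of this summary can be written down as a small decision function. The following Python sketch is only an illustration under stated assumptions: the function names, the coarse "north"/"south" grouping and the dictionary layout are hypothetical, and per-language exceptions discussed elsewhere in the article (for instance the Ukrainian predicative forms, or long adjectives derived from l-participles) are deliberately left out.

```python
from typing import Dict


def format_feats(feats: Dict[str, str]) -> str:
    """Render features as a CoNLL-U style string; "_" stands for no features."""
    return "|".join(f"{k}={feats[k]}" for k in sorted(feats, key=str.lower)) or "_"


def tag_participle(group: str, long_form: bool, l_participle: bool = False) -> Dict[str, str]:
    """Return a UPOS tag and features for a participle (hypothetical helper).

    group        -- "north" or "south" (a simplification made for this sketch)
    long_form    -- True for the long form, False for the short form
    l_participle -- True for the short l-participle, which is VERB everywhere
                    (or AUX for auxiliaries; that distinction is not modelled here)
    """
    if group not in ("north", "south"):
        raise ValueError(f"unknown group: {group!r}")
    if l_participle:
        if long_form:
            # Adjectives derived from l-participles are outside this sketch.
            raise ValueError("long l-participle adjectives are not covered here")
        # Definite and Case stay empty for the short l-participle.
        return {"upos": "VERB", "feats": format_feats({"VerbForm": "Part"})}
    feats = {"VerbForm": "Part"}
    if group == "north":
        # Short vs. long is expressed by the POS tag alone.
        upos = "ADJ" if long_form else "VERB"
    else:
        # Both forms are ADJ; short vs. long is expressed by Definite.
        upos = "ADJ"
        feats["Definite"] = "Def" if long_form else "Ind"
    return {"upos": upos, "feats": format_feats(feats)}


print(tag_participle("north", long_form=False))
print(tag_participle("south", long_form=True))
```

Voice, Tense and Aspect are omitted on purpose; they would be added to the feature dictionary in the same way, following Table 73.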

16. Verbal Noun

[cs] podstatné jméno slovesné; [sk] slovesné podstatné meno; [hsb] werbalny substantiw; [pl] rzeczownik odczasownikowy; [uk] віддієслівний іменник; [ru] отглагольное существительное; [sl] glagolsko ime; [hr] radna (glagolska) imenica; [bg] отглаголно съществително име. Tables 74-83.

A verbal noun is an abstract noun productively derived from a verb, denoting the action of the verb. It inflects for Case and Number, although it is only rarely seen in the plural. Its gender is always Neut. We tag it NOUN and use its singular nominative form as the lemma (not the infinitive of the base verb).

The UD guidelines v1 suggest that VerbForm=Ger can be used to distinguish verbal nouns from other nouns. This works in English, where the corresponding form is termed gerund. Unfortunately, this feature might cause confusion in Slavic linguistics, where some authors use the term gerund for adverbial participles (cf. Section 12). Hence we advise against using it with Slavic verbal nouns. Nevertheless, verbal nouns may mark the Aspect of their base verb.

Number Sing Dual Plur
Nom dźěłanje dźěłani dźěłanja
Gen dźěłanja dźěłanjow
Dat dźěłanju dźěłanjomaj dźěłanjam
Acc dźěłanje dźěłani dźěłanja
Loc dźěłanju dźěłanjomaj dźěłanjach
Ins dźěłanjom dźěłanjomaj dźěłanjemi

Table 76. [hsb] dźěłanje "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

Number Sing Plur
Nom robienie robienia
Gen robienia robień
Dat robieniu robieniom
Acc robienie robienia
Voc robienie robienia
Loc robieniu robieniach
Ins robieniem robieniami

Table 77. [pl] robienie "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

Verbal nouns use suffixes similar to passive participles. Unlike passive participles, they can be derived from intransitive verbs as well.
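As an illustration of the proposed treatment of verbal nouns, the following Python sketch builds a token record for one form from Table 74. The helper name and the dictionary layout are hypothetical; the only points taken from the text are the NOUN tag, the nominative singular lemma, the fixed neuter gender, the optional Aspect, and the absence of VerbForm=Ger.

```python
from typing import Dict, Optional


def annotate_verbal_noun(form: str, lemma: str, case: str, number: str,
                         aspect: Optional[str] = None) -> Dict[str, str]:
    """Build an annotation record for a Slavic verbal noun (hypothetical helper).

    The lemma is expected to be the nominative singular of the noun itself,
    not the infinitive of the base verb. VerbForm is intentionally not set.
    """
    feats = {"Case": case, "Gender": "Neut", "Number": number}
    if aspect is not None:
        # Aspect is inherited from the base verb and is optional.
        feats["Aspect"] = aspect
    feat_string = "|".join(f"{k}={feats[k]}" for k in sorted(feats))
    return {"form": form, "lemma": lemma, "upos": "NOUN", "feats": feat_string}


# Czech instrumental singular from Table 74.
token = annotate_verbal_noun("děláním", "dělání", case="Ins", number="Sing", aspect="Imp")
print(token["upos"], token["feats"])
```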

17. Negation

Slavic verbs are negated by a local variant of the morpheme ne, which is either a bound morpheme (prefix), or a separate word (particle). If it is a prefix, we do not cut it off during tokenization.

A standalone negating word is tagged PART and it has the feature Negative=Neg. On the dependency level, it is attached to the negated verb using the neg relation.

Number Sing Plur

Nom роблення roblennja роблення roblennja

Gen роблення roblennja роблень roblen

Dat робленню roblennju робленням roblennjam

Acc роблення roblennja роблення roblennja

Loc робленні, робленню roblenni, roblennju робленнях roblennjach

Ins робленням roblennjam робленнями roblennjamy

Table 78. [uk] роблення (roblennja) "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

In the case of the negative prefix, the verb itself bears the Negative=Neg feature. This type of prefixing is considered inflectional rather than derivational, that is, the lemma is still the affirmative (unprefixed) infinitive. If the language negates verbs by prefixing, all affirmative forms of these verbs should be annotated Negative=Pos.

In periphrastic constructions it is normal that only one participating word is negated, but languages may differ in which participant it is. Cf. [cs] Včera jsem nešel domů. "I did not go home yesterday." (negated participle) and [hr] Jučer nisam išao kući. (negated auxiliary).

Verbal adjectives (long forms of participles) and verbal nouns are negated in a similar fashion.

Czech is an example of a language where all verbs are negated using the prefix ne-. Russian is an example of the opposite: all finite forms and the l-participles are negated using the particle не (ne). With the other participles, however, it becomes a prefix: несовершенный (nesoveršennyj) "imperfect". Croatian differs yet again: the negative particle is the default, except for the verbs biti, htjeti and imati, which take the negative morpheme as a prefix.
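The two negation strategies lead to two different annotation patterns, which the following Python sketch tries to make explicit. Both helper names and the record layout are hypothetical; the feature values, the PART tag and the neg relation are those described in this section.

```python
from typing import Dict, Optional


def verb_polarity(negated_by_prefix: bool, has_negative_prefix: bool) -> Optional[str]:
    """Value of the Negative feature on a verb form, or None if it stays empty.

    negated_by_prefix   -- True if this verb (in this language) is negated by a prefix
    has_negative_prefix -- True if the current form actually carries the prefix
    """
    if not negated_by_prefix:
        # Negation is expressed by a separate particle; the verb is left unmarked.
        return None
    # Prefix-negating verbs mark affirmative forms explicitly as well.
    return "Neg" if has_negative_prefix else "Pos"


def negative_particle(form: str, head: int) -> Dict[str, object]:
    """Annotation of a standalone negating word attached to the verb with index `head`."""
    return {"form": form, "upos": "PART", "feats": "Negative=Neg",
            "head": head, "deprel": "neg"}


print(verb_polarity(True, True))    # Czech nešel: prefix present
print(verb_polarity(True, False))   # Czech šel: affirmative form of a prefix-negating verb
print(verb_polarity(False, False))  # Russian finite verb: negation is a separate token
print(negative_particle("не", head=2))
```

Passing the strategy per verb rather than per language keeps mixed systems such as Croatian expressible: biti, htjeti and imati would be called with negated_by_prefix=True, other verbs with False.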

18. Current Data

UD version 1.2, released in November 2015, contains data from 6 Slavic languages: Czech, Polish, Slovenian, Croatian, Bulgarian and Old Church Slavonic. Most of these datasets distinguish AUX from VERB (except for [cu], which uses only the VERB tag) and most of them have a non-empty value of VerbForm for all verbs (auxiliary or not). Here the exceptions are [hr] (finite verbs are not marked), [pl] (predicative nonverbs such as to "it (is)" are tagged VERB) and [bg] (empty VerbForms are probably annotation errors). [cu] uses the subjunctive mood (Mood=Sub) instead of Mood=Cnd for the conditional auxiliaries.

Number Sing Plur
Nom делание, деланье делания, деланья
delanie, delan'e delanija, delan'ja
Gen делания, деланья деланий
delanija, delan'ja delanij
Dat деланию, деланью деланиям, деланьям
delaniju, delan'ju delanijam, delan'jam
Acc делание, деланье делания, деланья
delanie, delan'e delanija, delan'ja
Loc делании, деланье, деланьи деланиях, деланьях
delanii, delan'e, delan'i delanijah, delan'jah
Ins деланием, деланьем деланиями, деланьями
delaniem, delan'em delanijami, delan'jami

Table 79. [ru] делание (delanie) "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

All but [bg] have occurrences of VerbForm=Inf, [cu] and [sl] also have VerbForm=Sup.

All languages except [pl] tag verbal nouns as regular NOUN, without setting the VerbForm. Polish tags them VERB with VerbForm=Ger.

VerbForm=Trans is used in [cs], [pl] and [sl]. In Czech and Polish, the main part of speech of transgressives is VERB (or AUX), while in Slovenian it is ADV. The Croatian data ignore the Trans value and annotate transgressives as ADV plus VerbForm=Part. Bulgarian tags them as regular adverbs, without any distinctive feature.

By far the largest proportion of inconsistency is caused by participles.

[cs]: The l-participles are tagged VERB/AUX VerbForm=Part | Tense=Past | Voice=Act. Short forms of passive participles are tagged VERB VerbForm=Part | Voice=Pass (empty Tense). Long forms are tagged as regular adjectives (empty VerbForm). Active participles related to transgressives are tagged ADJ VerbForm=Part | Voice=Act and distinguished by tense and aspect: either Aspect=Imp | Tense=Pres or Aspect=Perf | Tense=Past.

[pl]: All participles are tagged VERB. Present active (progressive) participles are marked Voice=Act | Tense=Pres, while the passive participles have Voice=Pass and empty Tense. The l-participles are marked as finite forms (VerbForm=Fin instead of Part!) with Tense=Past and empty Voice.

Number Sing Dual Plur

Nom delanje delanji delanja

Gen delanja delanj

Dat delanju delanjema delanjem

Acc delanje delanji delanja

Loc delanju delanjih

Ins delanjem delanjema delanji

Table 80. [sl] delanje "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

Number Sing Plur

Nom delanje delanja

Gen delanja delanja

Dat delanju delanjima

Acc delanje delanja

Voc delanje delanja

Loc delanju delanjima

Ins delanjem delanjima

Table 81. [hr] delanje "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

[sl]: The predicatively used l-participles are tagged VERB/AUX VerbForm=Part, with empty Voice and Tense. Participles tagged as adjectives (ADJ VerbForm=Part) are mostly passive participles, although their Voice feature is empty, too. However, some of them are adjectives derived from the l-participles (minuli, ostali, odrasle) and, rarely, also present active participles (boleče).

[hr]: The l-participles are tagged VERB/AUX VerbForm=Part and they are the only active participles marked. Passive participles are tagged ADJ VerbForm=Part. The Tense and Voice features are always empty.

[bg]: Only the l-participles of the verb to be are tagged VERB/AUX VerbForm=Part. Predicatively used l-participles of other verbs appear as finite verbs (VerbForm=Fin); they are thus indistinguishable from the aorist and imperfect simple past tenses, respectively. For example, both можах and могла (aorist and perfect l-participle of could) are annotated Voice=Act | Tense=Past. In parallel, both можех and можел (simple imperfect and imperfect l-participle of the same verb) are annotated Voice=Act | Tense=Imp. All other participles, including some l-participles, are tagged ADJ VerbForm=Part (they actually can take the definite suffix: миналата, останалите, миналия). Passive participles have empty Tense. Active participles are distinguished by Tense=Pres (imperfective verbs, progressive meaning) and Tense=Past (the l-participles).

Number Sing Plur
Ind правене pravene правения, правенета pravenija, praveneta
Def правенето praveneto правенията, правенетата pravenijata, pravenetata

Table 82. [bg] правене (pravene) "doing" NOUN | Aspect=Imp. The rows correspond to different values of Definite. Bulgarian nouns do not inflect for Case.

Number Sing Dual Plur
Nom дѣланиѥ dělanije дѣлании dělanii дѣланиꙗ dělanija
Gen дѣланиꙗ dělanija дѣланию dělaniju дѣлании dělanii
Dat дѣланию dělaniju дѣланиѥма dělanijema дѣланиѥмъ dělanijemъ
Acc дѣланиѥ dělanije дѣлании dělanii дѣланиꙗ dělanija
Voc дѣланиѥ dělanije дѣлании dělanii дѣланиꙗ dělanija
Loc дѣлании dělanii дѣланию dělaniju дѣланиихъ dělaniichъ
Ins дѣланиѥмь dělanijemь дѣланиѥма dělanijema дѣлании dělanii

Table 83. [cu] дѣланиѥ (dělanije) "doing" NOUN | Aspect=Imp. The rows correspond to different values of Case.

[cu]: All participles are tagged VERB VerbForm=Part and no other part-of-speech tag occurs with the VerbForm feature. Except for the l-participle, which is relatively rare, all participle types can inflect for Case. Active participles are further distinguished by Tense=Pres, Past and in one case even Fut (бѫдѫщии). The l-participles have Voice=Act but no Tense; on the other hand, they currently have a special value of Aspect=Res, disregarding the lexical aspect of the lemma. Passive participles use the Tense feature to distinguish present and past forms.
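Some of the differences listed in this section could, in principle, be removed by simple rewriting rules. The Python sketch below shows the idea for three of them. It is a hypothetical illustration, not a tool used by any of the treebanks: the function names are invented, and each rule rests on an assumption stated in its comment (for example, that every Mood=Sub in the [cu] data marks a conditional auxiliary, and that every Polish VerbForm=Fin with Tense=Past is an l-participle). Changing the lemma of a Polish verbal noun to its nominative singular would need a lexicon and is not attempted.

```python
from typing import Dict, Tuple


def parse_feats(feats_string: str) -> Dict[str, str]:
    """Parse a CoNLL-U style feature string ("_" means no features)."""
    if feats_string in ("", "_"):
        return {}
    return dict(item.split("=", 1) for item in feats_string.split("|"))


def format_feats(feats: Dict[str, str]) -> str:
    """Inverse of parse_feats, with features sorted by name."""
    return "|".join(f"{k}={feats[k]}" for k in sorted(feats, key=str.lower)) or "_"


def harmonize(lang: str, upos: str, feats_string: str) -> Tuple[str, str]:
    """Apply a few illustrative harmonization rules (hypothetical)."""
    feats = parse_feats(feats_string)
    # Assumption: all Mood=Sub in [cu] are conditional auxiliaries.
    if lang == "cu" and feats.get("Mood") == "Sub":
        feats["Mood"] = "Cnd"
    # Polish verbal nouns: VERB + VerbForm=Ger becomes NOUN without VerbForm.
    if lang == "pl" and upos == "VERB" and feats.get("VerbForm") == "Ger":
        upos = "NOUN"
        del feats["VerbForm"]
    # Assumption: Polish Fin + Past forms are l-participles.
    if lang == "pl" and feats.get("VerbForm") == "Fin" and feats.get("Tense") == "Past":
        feats["VerbForm"] = "Part"
        feats.setdefault("Voice", "Act")
    return upos, format_feats(feats)


print(harmonize("cu", "VERB", "Mood=Sub|Number=Sing"))
print(harmonize("pl", "VERB", "Aspect=Imp|Case=Nom|VerbForm=Ger"))
print(harmonize("pl", "VERB", "Tense=Past|VerbForm=Fin"))
```

Rules of this kind only cover cases where the existing annotation already identifies the form; distinctions that were never annotated (such as the Bulgarian l-participles tagged as finite verbs) cannot be recovered from the tags alone.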

19. Conclusion

We have presented the various combinations of morphological features of verbs that occur in Slavic languages, and we have proposed their unified and consistent representation within the Universal Dependencies framework. There already exist UD treebanks of six Slavic languages and we have shown that their authors have not always applied the UD annotation style in the same manner. Datasets for other languages are being prepared at the time of this writing, and their authors will have to take similar decisions. Our proposal should contribute to further harmonization of all these datasets: we hope to trigger discussion that will eventually lead to a more precise specification of UD guidelines for Slavic languages.

Acknowledgments

The author wishes to thank the following people for their valuable comments: Kaja Dobrovoljc, Natalia Kotsyba, Patrice Pognan, Martin Popel, Alexandr Rosen and Zdeněk Žabokrtský. This work has been supported by the Czech Science Foundation (GAČR) grant no. GA15-10472S. It has been using language resources stored and distributed by the LINDAT/CLARIN project of the Ministry of Education, Youth and Sports of the Czech Republic (project LM2015071).

Bibliography

Academia. Mluvnice češtiny (2) Tvarosloví. Academia, nakladatelství Československé akademie věd, Praha, Czechoslovakia, 1986.

Breu, Walter. Probleme der Interaktion von Lexik und Aspekt (ILA), volume 412 of Linguistische Arbeiten. Niemeyer, Tübingen, Germany, 2000. ISBN 3-484-30412-X.

Comrie, Bernard and Greville G. Corbett. The Slavonic Languages. Routledge, London, UK, 2001. ISBN 0-415-04755-2.

Erjavec, Tomaž. MULTEXT-East: Morphosyntactic Resources for Central and Eastern European Languages. Language Resources and Evaluation, 46(1):131-142, 2012.

Komárek, Miroslav, Václav Vážný, and František Trávníček. Historická mluvnice česká II. Tvarosloví. Státní pedagogické nakladatelství, Praha, Czechoslovakia, 1967.

Nedjalkov, Vladimir P. and Igor' V. Nedjalkov. On the typological characteristics of converbs. In Help, Toomas, editor, Symposium on language universals, pages 75-79, Tallinn, Soviet Union, 1987.

Nivre, Joakim, Marie-Catherine de Marneffe, Filip Ginter, Yoav Goldberg, Jan Hajič, Christopher Manning, Ryan McDonald, Slav Petrov, Sampo Pyysalo, Natalia Silveira, Reut Tsarfaty, and Daniel Zeman. Universal Dependencies v1: A Multilingual Treebank Collection. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, 2016. European Language Resources Association.

Przepiórkowski, Adam and Marcin Woliński. A Flexemic Tagset for Polish. In Proceedings of Morphological Processing of Slavic Languages, EACL 2003, 2003. URL http://nlp.ipipan.waw.pl/~adamp/Papers/2003-eacl-ws12/ws12.pdf.

Zeman, Daniel. Reusable Tagset Conversion Using Tagset Drivers. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), pages 213-218, Marrakech, Morocco, 2008. European Language Resources Association. ISBN 2-9517408-4-0.

Zeman, Daniel. Slavic Languages in Universal Dependencies. In Gajdošová, Katarína and Adriána Žáková, editors, Natural Language Processing, Corpus Linguistics, E-learning (proceedings of SLOVKO 2015), pages 151-163, Bratislava, Slovakia, 2015. Slovenská akadémia vied, RAM-Verlag. ISBN 978-3-942303-32-3.

Address for correspondence:

Daniel Zeman zeman@ufal.mff.cuni.cz

Ústav formální a aplikované lingvistiky, Matematicko-fyzikální fakulta, Univerzita Karlova v Praze, Malostranské náměstí 25, CZ-11800 Praha, Czechia