L2-Annotation-Project

L2 Speech POS and Dependency Annotation Project

Guidelines for Dependency Annotation

We will be annotating our corpus for Universal Dependency Relations (version 2).

If you have questions during annotation, please follow this procedure:

Introduction to Dependency Relations

Dependency relations capture functional grammatical relationships between words in an utterance. Each linguistic item in an utterance has one (and only one) syntactic head, but may have multiple (or no) dependents. For example, in the sentence The hungry person ate the pizza, the word person has one syntactic head, ate. person is connected to its head via a nominal subject (nsubj) relationship. person also has two syntactic dependents: hungry (via an adjectival modifier relationship, amod) and The (via a determiner relationship, det).
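The head and dependent structure described above can be sketched in Python. This is only an illustration, not part of the annotation tooling: the tuple layout (form, head index, relation) is an assumption modelled loosely on CoNLL-U, and the relations given for the and pizza follow standard UD practice rather than anything stated in the paragraph above.

```python
# Each token is (form, head_index, relation). Indices are 1-based and a head
# index of 0 marks the root. Storing one head per token enforces the
# "one and only one syntactic head" constraint by construction.
SENTENCE = [
    ("The", 3, "det"),
    ("hungry", 3, "amod"),
    ("person", 4, "nsubj"),
    ("ate", 0, "root"),
    ("the", 6, "det"),    # assumed analysis, not discussed in the text
    ("pizza", 4, "obj"),  # assumed analysis, not discussed in the text
]


def head_of(sentence, index):
    """Return the form of the head of the token at 1-based index (None for the root)."""
    head = sentence[index - 1][1]
    return None if head == 0 else sentence[head - 1][0]


def dependents_of(sentence, index):
    """Return (form, relation) pairs for every token headed by the token at index."""
    return [(form, rel) for form, head, rel in sentence if head == index]


print(head_of(SENTENCE, 3))        # head of "person"
print(dependents_of(SENTENCE, 3))  # dependents of "person"
```

A token can have any number of dependents (including none), which is why dependents_of returns a list while head_of returns a single form.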

UD Annotation Scheme

Each dependency relation is described below, with an example of each.

| Universal Dependencies | Description | Example | Notes |
| --- | --- | --- | --- |
| acl | clausal modifier of noun | First of all , I was really furious of watching a show_head starring_acl by a different actor , instead of Danny Brook . | The main verb in the modifying clause will take the “acl” annotation |
| acl:relcl | relative clause modifier | There is something_head I like_acl:relcl to know from you . | The main verb in the relative clause will take the “acl:relcl” annotation |
| advcl | adverbial clause modifier | I ask_head you for returning_advcl my money back . | The main verb in the adverbial clause will take the “advcl” annotation |
| advmod | adverbial modifier | For instance a lot of children nowadays_advmod replace_head their real life with the virtual world , which computers create . | |
| amod | adjectival modifier | It was about fourty five minutes later than original_amod starting_amod time_head | There are 2 dependencies in this example |
| appos | appositional modifier | It was Katha_head , a girl_appos who had the same problems like Pat . | |
| aux | auxiliary | You have_aux also written_head in your advertisement that it would_aux started_head at half past seven but it started at 20:15 ! | There are 2 dependencies in this example |
| aux:pass | passive auxiliary | If he had told everybody, J would have been_aux:pass blamed_head by whole school . | |
| case | case marking | If she was not followed by_case journalists_head , she would n’t die . | |
| cc | coordinating conjunction | And_cc the reason I choose painting is_head it ‘s being very exciting to me , | |
| cc:preconj | preconjunct | In conclusion , there are both_cc:preconj advantages_head and disadvantages of covering celebrities . | |
| ccomp | clausal complement | You ca n’t imagine_head how funny_ccomp it was ! | |
| compound | compound | would you check the weather_compound forecast_head for me , please ? | |
| compound:prt | phrasal verb particle | They make their life style much easy more fast because they need to catch_head up_compound:prt with the other people . | |
| conj | conjunct | I have knowladge about navigation_head , engine_conj , ropes_conj and knots_conj and sails_conj as well . | There are 4 dependencies in this example |
| cop | copula | Unfortunatly , Pat was_cop n’t very good_head at keeping secrets . | |
| csubj | clausal subject | I really think that living_csubj will be_head like living in the moon , very exciting . | |
| csubj:pass | clausal passive subject | In the advertisement , it was told_head that Danny Brook was starring_csubj:pass , but in place of him there was a different actor and he was really dissapointing . | |
| dep | unspecified dependency | MOON LANDING HOAX_head : CHURCH = TECHNOLOGY_dep & GOVERNMENT - Shuttle Carried on Aircraft model | Example from EWT corpus, limited examples |
| det | determiner | I would rather be accommodated in a tent because I already have the_det necessary material_head for camping . | |
| det:predet | predeterminer | But sometimes I have to forget my opinion and spent all_det:predet the day_head looking for my mother in a huge department-store. | |
| discourse | discourse element | I would like_head to travel on July , please_discourse , because that when I am in vacation . | |
| dislocated | dislocated elements | steel_dislocated or iron , which use for our building , it_head can make it stronger and easy to destroy . | Limited examples in corpora |
| expl | expletive | In these gallaries there_expl are_head Historical paintings , antique furnitures which Chiang used . | |
| fixed | fixed multiword expression | Then , we should film the English class so_head as_fixed to_fixed show who makes the film . | There are 2 dependencies in this example |
| flat | flat multiword expression | After concert I left with Ricky_head Martin_flat . | |
| flat:foreign | foreign words | Lopez ‘s 1561 book “Libro_head de_flat:foreign la_flat:foreign invencion_flat:foreign liberal_flat:foreign y_flat:foreign arte_flat:foreign del_flat:foreign Juego_flat:foreign del_flat:foreign Acedraz_flat:foreign “ became THE classic on Chess openings , including the one that bears his name . | Example from EWT corpus, only example in corpus, there are 10 dependencies in this example |
| goeswith | goes with | All my class_head mates_goeswith are looking forward to go to London . | |
| iobj | indirect object | I would like to explain_head you_iobj my experience because I want you to give my money back . | |
| list | list | How many jeans_head , Pants_list , sweater_list ect._list do I have to take ? | There are 3 dependencies in this example |
| mark | marker | This is the fact that_mark shopping is not enjoyable_head . | |
| nmod | nominal modifier | How is the home_head of the future_nmod will be ? | |
| nmod:npmod | noun phrase as adverbial modifier of noun | Yes she is having physical therapy 3 times_head a week_nmod:npmod . | Example from EWT corpus |
| nmod:poss | possessive nominal modifier | I will go to your_nmod:poss office_head this week . | |
| nmod:tmod | temporal modifier | Her husband became a citizen_head of the US just the week_nmod:tmod before last . | Example from EWT corpus |
| nsubj | nominal subject | I_nsubj will go_head to your office this week . | |
| nsubj:pass | passive nominal subject | We will not use black clothes as we_nsubj:pass are used_head to do . | |
| nummod | numeric modifier | I usually spend my time on playing computer about three_nummod hours_head a day . | |
| obj | object | Even we could make_head a trip_obj to Paris in the school holiday ! | |
| obl | oblique nominal | Thank_head you very much for you time_obl . | |
| obl:npmod | noun phrase as adverbial modifier | They are a bit_obl:npmod far_head from town but they have huge car park . | |
| obl:tmod | temporal modifier | I ‘m so happy with the experience I lived_head last month_obl:tmod . | |
| orphan | orphan | So in the last century our daily life changed dramandesly and we became lazy and our life_head unpersonal_orphan , fast and unromantic . | Only example in ESL corpus |
| parataxis | parataxis | They would be sinthetic clothes_head , I think_parataxis made of plastic or something similar . “ | |
| punct | punctuation | The scene of Toyko is not beautiful_head ,_punct but you think that this city is a developing place ._punct | There are 2 dependencies in this example |
| reparandum | overridden disfluency | It was just unbelievably dissapointing because the reason_head I mean_reparandum main reason which made me to go was to see Danny Brook . | Limited examples in corpora |
| root | root | I could n’t believe_root it but he gave me the contract ot fill it out . | The root is the head of itself; tag in WebAnno by first holding shift, and then dragging an arrow from the root tag to itself. |
| vocative | vocative | Now Kim_vocative , I will finish_head my letter but I promise you to write again as soon as possible , maybe if I ‘ve developed the photos of the concert and the stars and the …. | Limited examples in corpora |
| xcomp | open clausal complement | Further because I like_head to live_xcomp in contact with the nature . | |

Clarifications and special cases (tags)

acl

The UD guidelines define acl as finite or non-finite clauses that modify a nominal. The noun that the clause modifies is the head of the acl relation.

The dependent as a VBG:

The dependent as a VBN:

The dependent as a VB in an infinitival clause:

Relative clauses are also instances of acl, though they are tagged with acl:relcl.

acl or xcomp

The distinction between acl and xcomp is sometimes difficult to make. Both dependency relationships can contain infinitive clauses. To distinguish between the two, determine whether the head of the clause is a verb, adjective or a nominal.

If the clause is modifying a verb or adjective, then it is typically xcomp.

If the clause is modifying a nominal (that is, if the head of the clause is a noun phrase) then it is acl.
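The acl versus xcomp heuristic above can be written as a small lookup. This is a rough sketch only: the function name and the POS sets (Penn Treebank style tags, with PRP counted as nominal) are assumptions made for illustration, and real cases still need an annotator's judgment.

```python
# Hypothetical POS groupings used only for this illustration.
VERBAL_OR_ADJECTIVAL = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "JJ", "JJR", "JJS"}
NOMINAL = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def acl_or_xcomp(head_pos):
    """Suggest a relation for an infinitival clause from the POS of its head."""
    if head_pos in VERBAL_OR_ADJECTIVAL:
        return "xcomp"  # the clause modifies a verb or adjective
    if head_pos in NOMINAL:
        return "acl"    # the clause modifies a nominal
    return None         # the heuristic does not cover this head


print(acl_or_xcomp("NN"))
print(acl_or_xcomp("VBP"))
```

Returning None for other heads keeps the sketch from guessing where the guideline gives no rule.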

Adjectives at the beginning of utterances

Quite a few sentences in the data begin with an adjective or an adjectival phrase. We tag these occurrences differently depending on the type of utterance and construction.

When the adjective is part of an unambiguously implied copular construction, it can be tagged as the root of the sentence…

<fine , thank you > should be tagged as:

This can also be the case in situations where there is explicit subordination attached to the initial adjective phrase:

< very busy because I attend big project now >

When there is neither explicit subordination nor an unambiguously implied copular construction, the initial adjective is likely a discourse marker. This will often be the case when the adjective can be replaced by another discourse marker like ‘OK’ or ‘yeah’ and gives no clear implication of a copular construction.

<fine , I will do it > will be tagged as:

ccomp

Clausal complements ccomp are given when a clause has an overt subject OR the implied subject can be interpreted as something other than the subject of the head clause (see ccomp or xcomp).

In most sentences, a ccomp dependent has an overt subject and is a VB or JJ which has a head that is a type of VB or JJ.

ccomp can sometimes be a NN when it is the head of a cop dependent.

ccomp or xcomp

Open clausal complements (xcomp) are given when a clause has no overt subject AND the implied subject is the same as the subject of the head clause. This is most common with infinitive clauses.

Clausal complements (ccomp) are given when a clause has an overt subject OR the implied subject can be interpreted as something other than the subject of the head clause.
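The two conditions above reduce to a single boolean test. The sketch below assumes the annotator has already decided the two yes/no questions; the function and parameter names are hypothetical.

```python
def ccomp_or_xcomp(has_overt_subject, implied_subject_matches_head_clause):
    """Pick between ccomp and xcomp for a complement clause.

    xcomp needs BOTH: no overt subject AND an implied subject that is the
    same as the subject of the head clause. Everything else is ccomp.
    """
    if not has_overt_subject and implied_subject_matches_head_clause:
        return "xcomp"
    return "ccomp"


# No overt subject, implied subject shared with the head clause.
print(ccomp_or_xcomp(False, True))
# Overt subject in the complement clause.
print(ccomp_or_xcomp(True, False))
```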

csubj

csubj is used when the subject of a clause is itself a clause. The root of the subject clause (not the root of the sentence) is the csubj dependent.

csubj is often the main verb of the subject clause.

csubj is often not a verb when there is a copular verb in the subject clause.

discourse or parataxis

Utterances such as you know and I mean can be a bit confusing to tag. Although they might seem like discourse markers, the guidelines for the discourse tag explicitly indicate that instances such as you know are not counted as discourse markers. Even in cases where these (and related utterances) appear to be functioning as discourse markers, they should be tagged as instances of parataxis.

expl

expl is used for the existential “there”.

“it” is marked as expl when it is used in extraposition constructions.

flat:foreign

flat:foreign is only used when there is a sequence of foreign words. The first foreign word is the head of the flat:foreign dependents.
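As a sketch of the rule above, the following Python builds the flat:foreign edges for a run of foreign words, using the book title from the table as input. The function name and the (head position, dependent position, relation) edge format are assumptions for illustration; positions are 1-based within the foreign sequence.

```python
def flat_foreign_edges(foreign_words):
    """Attach every word after the first to the first word via flat:foreign."""
    return [
        (1, position, "flat:foreign")
        for position in range(2, len(foreign_words) + 1)
    ]


title = "Libro de la invencion liberal y arte del Juego del Acedraz".split()
edges = flat_foreign_edges(title)
print(len(edges))  # number of flat:foreign dependencies
print(edges[0])
```

Using positions rather than word forms avoids ambiguity when a word such as del occurs more than once in the sequence.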

fixed vs flat vs compound vs compound:prt

fixed is used for multiword expressions which behave like function words or short adverbials.

flat is used for multiword expressions such as names and titles which do not use regular syntactic relations (otherwise see compound). The first word in the multiword expression is the head of the flat dependencies.

compound is generally used for multiword expressions of nouns. This does not include mistakenly separated words (see goeswith).

Proper nouns which use regular syntactic relations are tagged as compound (otherwise see flat).

Multiword expressions of numbers also take the compound dependency.

compound can sometimes be a JJ.

The particle of an idiomatic phrasal verb is marked as a compound:prt dependent. The head of the compound:prt dependent is the verb element of the phrasal verb.

goeswith

goeswith is a tag that allows annotators to correct transcription errors. Use goeswith to mark orthography that should be combined, but was separated during transcription. The first part is always the head, and all other parts are goeswith dependents of this head.

goeswith is used for abbreviations that form a single word.

When abbreviations form two words, the compound tag is used instead.
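To make the effect of goeswith concrete, here is a minimal sketch that rejoins parts marked goeswith with their head, using the class mates example from the table. The token layout (form, head index, relation) and the function name are assumptions, and the non-goeswith relations in the sample are illustrative only.

```python
def merge_goeswith(tokens):
    """Rejoin words split in transcription.

    tokens: list of (form, head_index, relation) with 1-based head indices.
    Relies on the guideline that the first part is always the head, so the
    head has already been seen when a goeswith dependent is reached.
    """
    merged = {}  # index of the first part -> combined form
    for index, (form, head, relation) in enumerate(tokens, start=1):
        if relation == "goeswith":
            merged[head] += form
        else:
            merged[index] = form
    return [merged[index] for index in sorted(merged)]


# Fragment of "All my class mates are looking forward ..."
fragment = [
    ("my", 2, "nmod:poss"),   # illustrative relation
    ("class", 0, "root"),     # treated as root of this fragment only
    ("mates", 2, "goeswith"),
]
print(merge_goeswith(fragment))
```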

nmod

“An nmod relation is used for nominal dependents of another noun or noun phrase”. nmod relationships are typically realized via prepositional phrases, wherein a prepositional phrase is modifying a noun phrase.

Note that an nmod token can be a nominal itself (as is typical) as well as an attributive adjective (see ‘blue’ in the example above).

PRP is sometimes nmod because it can function as the nominal dependent of another noun or noun phrase.

nsubj

nsubj is used to mark the subject or agent of a clause. This most frequently occurs before a conjugated verb.

When the clause contains a copular verb (cop), the head of the nsubj dependent is the same as the head of the cop.

When “is” or “are” is the head of a preceding expl dependent, it is also the head of a following nominal nsubj dependent.

obl

In most sentences, an obl dependent is an NNP, NN, or PRP which has a head that is a VB*, JJ, or RB.

An obl dependent can also be IN. Common IN words which can be obl dependents include: “with”, “for”, “like”, “from”, “about”, “into”, and “of”.

“Which” and “that” can be obl dependents even though they are WDT.

An obl dependent can also be JJ. This occurs in a variety of contexts.

An obl dependent can also be DET when it is preceded by an IN or TO. This occurs with the words “this”, “that”, “all”, and “another”.

obl or nmod

Making the distinction between whether a token takes an oblique (obl) or nominal modifier (nmod) dependency comes down to constituency. This distinction often is required when examining prepositional phrases. To make this distinction, one needs to determine the head of the prepositional phrase by asking the question: what is the prepositional phrase modifying? If the PP is modifying a noun phrase, or argument that is functioning as a noun phrase (see ‘another_DET’ below), the head of the PP is likely an nmod. Note that these cases of nmod often immediately follow the noun phrase they’re modifying.

If the PP in question modifies a verb, adjective, or adverb it will likely be tagged as obl. Oblique phrases can also immediately follow a noun phrase, so the position/location of the phrase in the sentence isn’t a foolproof heuristic. That is, you can’t determine the dependency of the phrase just by its relative position.
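The question "what is the prepositional phrase modifying?" can be sketched as a lookup on the POS of the modified word. The function name is hypothetical, the tags are Penn Treebank style, and treating DT as noun-like (for determiners that function as noun phrases) is an assumption drawn from the wording above. As the text notes, position in the sentence is deliberately not used.

```python
def obl_or_nmod(modified_pos):
    """Suggest obl or nmod for the head of a PP from what the PP modifies."""
    if modified_pos.startswith("NN") or modified_pos in {"PRP", "DT"}:
        return "nmod"  # PP modifies a noun phrase (or something acting as one)
    if modified_pos.startswith(("VB", "JJ", "RB")):
        return "obl"   # PP modifies a verb, adjective, or adverb
    return None        # not covered by the heuristic


print(obl_or_nmod("NN"))
print(obl_or_nmod("VBD"))
```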

obl:npmod

This relation is used when a noun phrase is used as an adverbial modifier. This relation is often realized in:

(i) measure phrases:

(iii) In the constructions “years old” and “years ago”:

(iv) to mark time by describing the frequency of a recurring event or state:

See obl:npmod vs. obl:tmod vs nmod:tmod

obl:npmod vs. nmod:npmod

In the uncommon situation where obl:npmod is used to describe the frequency of a recurring event or state, and both the head and the dependent of the obl:npmod relationship are nouns, then the nmod:npmod tag is used instead.

obl:npmod vs. obl:tmod vs nmod:tmod

obl:tmod and nmod:tmod dependencies are nominal obl and nmod dependencies which specify time.

The obl:tmod dependency marks nominal indicators of time that are headed by a VB*, JJ, or RB.

The nmod:tmod dependency is used when a nominal indicator of time is headed by another nominal construction.

See obl or nmod for more.

When a temporal word is not headed by a nominal construction (and is therefore a type of obl rather than an nmod:tmod) it is most often an obl:tmod dependent, with 2 exceptions where it is instead an obl:npmod dependent. (Note that the corpora are inconsistent in this distinction).

Use obl:npmod instead of obl:tmod:

1) In the constructions “years old” and “years ago”.

2) To describe the frequency of a recurring event or state.
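The choice among these temporal relations, together with the nmod:npmod case from the obl:npmod vs. nmod:npmod section, can be summarised as a short decision function. This is a sketch under the assumption that the annotator supplies the three yes/no answers; the function and parameter names are hypothetical.

```python
def temporal_relation(head_is_nominal, years_old_or_ago=False, marks_frequency=False):
    """Suggest a relation for a nominal that indicates time."""
    if head_is_nominal:
        # Headed by another nominal: an nmod subtype.
        return "nmod:npmod" if marks_frequency else "nmod:tmod"
    # Headed by a VB*, JJ, or RB: an obl subtype, with the two exceptions.
    if years_old_or_ago or marks_frequency:
        return "obl:npmod"
    return "obl:tmod"


print(temporal_relation(head_is_nominal=False))                         # e.g. "lived last month"
print(temporal_relation(head_is_nominal=False, years_old_or_ago=True))  # "years old", "years ago"
print(temporal_relation(head_is_nominal=True, marks_frequency=True))    # e.g. "3 times a week"
```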

parataxis

Parataxis is used in discourse-feeling constructions such as “you know” or “I mean” when they interrupt a clause.

If these discourse-feeling constructions introduce a clause, then it is a higher syntactic unit (such as root or a conj dependent headed by the root) which heads a ccomp dependent.

Parataxis is used for “a pair of what could have been standalone sentences, but which are being treated together as a single sentence” (UD Guidelines). In this case, the first sentence is the head of the following sentences, which are tagged as parataxis dependents.

When a parataxis dependent breaks syntactic structure by occurring in-between two elements that should be adjacent, the parataxis dependent is headed by the same syntactic unit it interrupts.

In this case, the obl dependent is interrupted by a parataxis dependent (by separating the case dependent from its head), so the parataxis dependent is also headed by the head of the obl dependent (the word ).

punct

Punctuation is, unfortunately, a slightly tedious element of dependency annotation. Different projects have followed different guidelines and processes for annotating punctuation. As a result, the corpora are tagged rather inconsistently for punctuation and likely shouldn’t be used as a guide. This section, which is based on UD version 2, will therefore act as our guide for tagging punctuation.

1) Sentence level punctuation (periods, question marks, exclamation marks etc.)

Sentence level punctuation is attached to the root of the sentence.

2) Clause & phrase level punctuation (commas, quotation marks, parentheses, occasionally hyphens, etc.)

Below are four rules, extrapolated from the UD guidelines page, that are intended to be clear and directional.

(i) Punctuation preceding or following a dependent clause or phrase is attached to the head of that clause (subordinate clauses like relative clauses, adverbial clauses, parataxis, obl, npmods etc.).

When there are sequential dependent clauses/phrases (at the same syntactic level), then the following clause takes precedence over the comma separating the two clauses:

(ii) Punctuation separating coordinated units attaches to the following conjunct, in accordance with rule (i).

(iii) Within the relevant clause or phrase, punctuation is attached to the highest syntactic unit (the head) of that clause or phrase.

(iv) Paired punctuation (quotation marks, parentheses) is attached to the same unit.

3) Word level punctuation (word-connecting hyphens)

Hyphens, or dashes, connecting two or more words are attached to the higher level (dominant) unit. The dominant unit in these cases is the word following the hyphen.
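The two most mechanical rules above (sentence level punctuation attaches to the root, and a word-connecting hyphen attaches to the following word) can be sketched as follows. The function name and the sample sentence are hypothetical, and clause level punctuation is intentionally left undecided because rules (i) to (iv) need the full tree and an annotator's judgment.

```python
SENTENCE_FINAL = {".", "?", "!"}


def punct_head(forms, punct_index, root_index):
    """Return the 1-based index of the head for a punctuation token.

    forms: list of word forms; punct_index and root_index are 1-based.
    Returns None for clause level punctuation, which this sketch does not handle.
    """
    form = forms[punct_index - 1]
    if form in SENTENCE_FINAL:
        return root_index        # rule 1: attach to the root
    if form == "-":
        return punct_index + 1   # rule 3: attach to the word after the hyphen
    return None


# Hypothetical sentence with "bought" (index 2) as the root.
forms = ["She", "bought", "a", "second", "-", "hand", "car", "."]
print(punct_head(forms, 8, root_index=2))  # final period
print(punct_head(forms, 5, root_index=2))  # hyphen
```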

punct special cases

, commas around cc, discourse, and advmods

Discourse, ccs, and advmods only govern punctuation as a last resort. Rather, the punctuation that often appears either preceding or following these tags is a dependent of the same unit that governs the cc, discourse, or advmod tag.

, commas around appos and dislocated dependents.

Appos and dislocated dependents can both govern punctuation. In the first example below, the appos dependent also heads a flat dependent.

` : `

Colons are treated slightly differently than other clause-level punctuation. Colons are governed by the higher level clause, NOT the dependent clause.

reparandum

A reparandum dependent is a disfluency which is overridden by a repair. The repair is the head of the reparandum dependent.

Overridden disfluencies occur when correcting an utterance:

or repeating an utterance:

xcomp

Open clausal complements (xcomp) are given when a clause has no overt subject AND the implied subject is the same as the subject of the head clause (see ccomp or xcomp).

In most sentences, an xcomp dependent has an implied subject and is a VB or JJ which has a head that is a type of VB or JJ.

VB is often an xcomp when it is immediately preceded by an infinitival TO, which is in turn preceded by a head that is a type of VB.

VBG is often an xcomp when the subject is implied from the head (which is a type of VB).

JJ can be an xcomp when it modifies the action of the head. This only happens when the head is a type of VB. This can happen in imperative contexts.

Notice that xcomp can happen in imperative contexts as both a JJ and/or a type of VB.

Clarifications and special cases (particular words)

Copular be

In most sentences, the main verb of the first independent clause in a sentence is the “Root” of the sentence. In the example The hungry person ate the pizza, the syntactic head of ate is the “root” of the sentence (which isn’t represented aside from the root tag.)

In constructions with copular be (e.g., She is a professor.), however, the root is the predicate (in this case, the nominal complement professor). The word professor has three dependents: She via an nsubj relationship, is via a cop relationship, and a via a det relationship.

In copular constructions with clausal predicates, however, as in The important thing is to keep calm, the “normal” conventions are used (is is the root).
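The copular analysis of She is a professor can be checked with a short sketch. The (form, head index, relation) layout and the function name are assumptions for illustration, and the final period is left out to match the three dependents listed above.

```python
# 1-based head indices; 0 marks the root.
COPULAR = [
    ("She", 4, "nsubj"),
    ("is", 4, "cop"),
    ("a", 4, "det"),
    ("professor", 0, "root"),
]


def root_with_dependents(sentence):
    """Return the root form and the (form, relation) pairs it directly heads."""
    root_index = next(
        index for index, (_, head, _) in enumerate(sentence, start=1) if head == 0
    )
    dependents = [(form, rel) for form, head, rel in sentence if head == root_index]
    return sentence[root_index - 1][0], dependents


print(root_with_dependents(COPULAR))
```

The copula is a dependent of the predicate here, which is the opposite of the clausal-predicate case where is itself is the root.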

ellipses

Sometimes ellipses represent words lost during transcription, and their dependency relationship can be inferred based on context.

When the dependency relationship of ellipses cannot be inferred based on context, they are tagged as a punct dependent.

so

The word so will most often be in an advmod relationship:

said and think

The dependency tag obj is used when the complement of the verb said or think is a non-clausal construction.

The dependency tag ccomp is used when the complement of the verb said or think is a clausal construction.

Note: In more discourse-like contexts, think is often tagged as parataxis. See parataxis.

Clarifications and special cases (multi-word)

one day last week

This utterance can be both common and difficult to tag.

day should head the nummod dependent one, the nmod:tmod dependent week, and often any punctuation which separates the phrase from the root.

day is most often an obl:tmod dependent headed by a verb.
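The pattern can be illustrated on a made-up sentence, I went one day last week. Everything beyond the guideline itself is an assumption: the sentence is hypothetical, the token layout is (form, head index, relation), and tagging last as an amod of week is a guess that the text above does not confirm.

```python
EXAMPLE = [
    ("I", 2, "nsubj"),
    ("went", 0, "root"),
    ("one", 4, "nummod"),      # headed by day
    ("day", 2, "obl:tmod"),    # headed by the verb
    ("last", 6, "amod"),       # assumed analysis, not stated in the guideline
    ("week", 4, "nmod:tmod"),  # headed by day
]


def follows_one_day_last_week_pattern(tokens):
    """Check that one and week are both headed by day with the expected relations."""
    forms = [form for form, _, _ in tokens]
    day_index = forms.index("day") + 1
    _, one_head, one_rel = tokens[forms.index("one")]
    _, week_head, week_rel = tokens[forms.index("week")]
    return (one_head, one_rel) == (day_index, "nummod") and (
        week_head,
        week_rel,
    ) == (day_index, "nmod:tmod")


print(follows_one_day_last_week_pattern(EXAMPLE))
```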

Sentences with only discourse

If a sentence contains only discourse, and the discourse elements are all at the same syntactic level, the root is the first discourse element in the sentence. The root is the head of the remaining discourse dependents.
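A minimal sketch of this rule, assuming all elements are at the same syntactic level: the first element becomes the root and the rest attach to it as discourse dependents. The function name and the two-word input are hypothetical.

```python
def discourse_only_tree(forms):
    """Build (form, head_index, relation) triples for a discourse-only sentence."""
    return [
        (form, 0, "root") if index == 1 else (form, 1, "discourse")
        for index, form in enumerate(forms, start=1)
    ]


print(discourse_only_tree(["yeah", "ok"]))
```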

take care …

When the construction take care precedes a nominal, it should be tagged as follows.

1) care is an obj dependent of the verb take.

2) The nominal is an nmod dependent headed by care.

Note: Pronouns are also nominals.

3) The nominal tends to head words between the nominal and the noun care.
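The three points above can be sketched as a skeleton builder for a span that starts with take care and ends with the nominal. The function name and the sample phrase take care of yourself are hypothetical. Relations the guideline does not fix (the relation of take itself and of the words in between) are left as None for the annotator to decide.

```python
def take_care_skeleton(forms):
    """Return (form, head_index, relation) triples for: take care ... nominal.

    Indices are 1-based within the span; None marks values the guideline
    leaves to context.
    """
    nominal_index = len(forms)
    skeleton = [
        (forms[0], None, None),  # take: head and relation depend on context
        (forms[1], 1, "obj"),    # 1) care is an obj of take
    ]
    for form in forms[2:-1]:
        # 3) words between care and the nominal tend to be headed by the nominal
        skeleton.append((form, nominal_index, None))
    skeleton.append((forms[-1], 2, "nmod"))  # 2) nominal is an nmod of care
    return skeleton


print(take_care_skeleton(["take", "care", "of", "yourself"]))
```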

Words with TO and IN POS special cases

When words with the POS TO or IN occur before punctuation, or at the end of a sentence, if they have no other option than to be headed by a preceding verb, they are generally tagged as obl dependents of that verb.

If one of these words is unambiguously an infinitival to, then it is instead tagged as an xcomp dependent.