74 Semantics – Advances in Theories and Mathematical Models

**4.2 Treebanks and annotated corpora**

The creation of corpora annotated at different levels (layers) constitutes a further development and a sound premise for a good selection of metadata. Here we refer only to the creation, for a large number of languages, of corpora annotated at the syntactic level (treebanks), and to the systematically planned discussion of the relationship between annotation as such and the apparatus adopted for carrying it out (not only manually but also, of course, automatically): the Treebanks and Linguistic Theories (TLT) conference series<sup>37</sup>.

**4.3 PropBank** *et relata*

From lexicon (and lexicography) through syntax: the step towards propositions has been taken, and its results can be seen in the established and still ongoing PropBank project, which adds predicate-argument relations to the syntactic trees of the Penn Treebank (for English), thus producing a corpus of text annotated with information about basic semantic propositions. A continuation of this project aims at creating Parallel PropBanks (the English-Chinese Treebank/PropBank)<sup>38</sup>.

Building on PropBank, top-down observation and analysis has once again been carried out, producing a verb index<sup>39</sup> (a system which merges links and web pages from four different natural language processing projects) and an index of nouns<sup>40</sup>, whose goal is to mark the sets of arguments that co-occur with nouns in PropBank. These are the Unified Verb Index and NomBank, respectively.

**4.4 FrameNet and SemLink. Towards increasing semantic annotation and resource combination**

In order to extend annotation from the syntactic to the semantic level and to arrive at frames by way of verbs and valences, other resources have been produced and are still under construction, and they can be developed for different corpora and languages: VerbNet<sup>41</sup> (the largest on-line verb lexicon currently available for English) and valence lexica<sup>42</sup> following the PDT-ValencyLexicon<sup>43</sup> model.

The most refined annotation on the semantic level of predicate-argument relationships is still provided by FrameNet<sup>44</sup>, Fillmore's project, which is consistent with his life-long research into cases, first, and then frames. "The FrameNet project is building a lexical database of English that is both human- and machine-readable, based on annotating examples of how words are used in actual texts. From the student's point of view, it is a dictionary of more than 10,000 word senses, most of them with annotated examples that show the meaning and usage. For the researcher in Natural Language Processing, the more than 170,000 manually annotated sentences provide a unique training dataset for semantic role labeling, used in applications such as information extraction, machine translation, event recognition, sentiment analysis, etc. For students and teachers of linguistics it serves as a valence dictionary, with uniquely detailed evidence for the combinatorial properties of a core set of the English vocabulary."

<sup>37</sup> The list of the first seven conferences is published at http://tlt8.unicatt.it/Links.htm ; the addresses of the last three editions are the following: http://tlt8.unicatt.it/ ; http://math.ut.ee/tlt9/index.html ; http://tlt10.cl.uni-heidelberg.de/ . See also, for a case study regarding a particular predicative structure, (Bamman, Passarotti, Crane, 2008).

<sup>38</sup> http://verbs.colorado.edu/~mpalmer/projects/ace.html

<sup>39</sup> http://verbs.colorado.edu/verb-index/

<sup>40</sup> http://nlp.cs.nyu.edu/meyers/NomBank.html

<sup>41</sup> http://verbs.colorado.edu/~mpalmer/projects/verbnet.html

<sup>42</sup> Cf. http://jochenleidner.posterous.com/english-valency-lexicon-online

<sup>43</sup> See (Hajič, J., Panevová, J., Urešová, Z., Bémová, A., Kolářová, V., Pajas, P., 2003) and http://ufal.mff.cuni.cz/PDT-Vallex/ . At the time of writing (January 2012), PDT-Vallex contains over 11000 valency frames for more than 7000 verbs. It has been built in close connection with the Prague Czech-English Dependency Treebank project.

<sup>44</sup> https://framenet.icsi.berkeley.edu/fndrupal/

As is already evident from both a strategic, epistemological point of view and a practical one, resource compatibility and unification are highly desirable, and not only as a goal to be pursued in the future. SemLink, for instance, is "the effort to map between complementary lexical resources: WordNet, FrameNet, VerbNet and PropBank. The goal is to develop a broad-coverage, unified English resource that has a fine granularity and rich semantics of Word-Net and Frame-Net, that is a platform for syntactically based generalizations based on VerbNet, and that provides PropBank style effective training data for supervised Machine Learning techniques." (Palmer, 2009).
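To make the idea of mapping between complementary resources more concrete, here is a minimal sketch in Python. It does not reproduce SemLink's actual data format or any existing API: the class names, the `ROLE_MAP` table and the role correspondences given for the verb *give* are illustrative assumptions, chosen only to show how a PropBank-style numbered argument might be relabelled with a VerbNet-style thematic role and a FrameNet-style frame element.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Argument:
    label: str  # PropBank-style numbered argument, e.g. "ARG0"
    text: str   # the span of the sentence that fills this argument


@dataclass(frozen=True)
class Proposition:
    roleset: str                      # PropBank-style roleset id, e.g. "give.01"
    arguments: Tuple[Argument, ...]   # the annotated arguments of the predicate


# Hypothetical mapping table (an assumption for illustration only):
# roleset -> {PropBank label: (VerbNet-style role, FrameNet-style frame element)}
ROLE_MAP: Dict[str, Dict[str, Tuple[str, str]]] = {
    "give.01": {
        "ARG0": ("Agent", "Donor"),
        "ARG1": ("Theme", "Theme"),
        "ARG2": ("Recipient", "Recipient"),
    }
}

Row = Tuple[str, str, Optional[str], Optional[str]]


def relabel(prop: Proposition,
            role_map: Dict[str, Dict[str, Tuple[str, str]]] = ROLE_MAP) -> List[Row]:
    """Return (text, PropBank label, VerbNet role, FrameNet element) per argument.

    Arguments with no entry in the mapping table keep their PropBank label
    and get None for the other two labels.
    """
    mapping = role_map.get(prop.roleset, {})
    rows: List[Row] = []
    for arg in prop.arguments:
        verbnet_role, frame_element = mapping.get(arg.label, (None, None))
        rows.append((arg.text, arg.label, verbnet_role, frame_element))
    return rows


# "Mary gave John a book", annotated as a single proposition.
example = Proposition(
    roleset="give.01",
    arguments=(
        Argument("ARG0", "Mary"),
        Argument("ARG1", "a book"),
        Argument("ARG2", "John"),
    ),
)

for row in relabel(example):
    print(row)
```

Keeping the mapping in a plain lookup table, separate from the annotated propositions, mirrors the general idea of linking independently developed resources rather than merging them into one; an actual system would load such correspondences from the resources' own distribution files.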

We would like to conclude our quick survey by quoting Martha Palmer's words at the conclusion of the same paper: "Efforts to link the PropBank/VerbNet and FrameNet resources to one another and to WordNet, and to define semantics for the roles used by each resource, are a likely avenue for future improvements in semantic role labeling systems, and will benefit Question-Answering, Information Extraction and other NLP applications." Let's pursue such avenues.
