saturation (or, rather, as the mark of an ended task). No morpheme, no lexeme proper; intonation, rather, an unsuspended one; word order, possibly. But most of all, the plain intonation of an assertion contrasted with, for example, the rising intonation of a question. What does this mean? Different authors in different contexts have underlined the presence of a covert constituent in judging: the personal assent or dissent which determines the affirmative or negative structure of predication itself in assertions, and constitutes its illocutionary force.

After having quoted Frege's expressions on this point (3.3), let us recall Brentano's statements about the role of assenting (or dissenting) while judging: once an object is given in presentation, with our judgements we express its acceptance or rejection (Brentano, 1995). This way of considering the further commitment involved through an act of judgement helps us gain a unified perspective on the two different kinds of questions mentioned in § 2.3. Any assertion – this is the suggestion – qualifies itself as a yes or no answer, even if apparently no question at all has generated it; completive questions just pave the way for oriented questions. Answers will then confirm or deny the orientation proposed, thus underlining the strict relationship between predicate as sentence-centre and predication as basic syntagmatic act, whatever illocutionary act may follow, be it an assertion or not.

**4. Tools and resources: from WordNet to FrameNet** *et alia*

Having reconstructed some basic steps of the more than bi-millennial thread of philosophic-linguistic thought, we must next survey the data, in order to check the validity of theoretical contributions. From this point of view we are now in a privileged position: as scholars, we benefit from the creation of a specific area of studies, computational linguistics, and of its related resources, which support and inspire theorists.

From the works of ancient grammarians to those of present-day linguists, the interplay between data and theory has always been of vital importance in developing sound, deep competences.

The work is hard but well worth the effort, and it keeps us from restricting ourselves to armchair philosophy (Austin 1956/57) or armchair linguistics (Fillmore 1992).

**"Armchair linguistics** – writes Fillmore – does not have a good name in some linguistics circles. A caricature of the armchair linguist is something like this. He sits in a deep soft comfortable armchair, with his eyes closed and his hands clasped behind his head. Once in a while he opens his eyes, sits up abruptly shouting, "Wow, what a neat fact!", grabs his pencil, and writes something down. Then he paces around for a few hours in the excitement of having come still closer to knowing what language is really like.

(There isn't anybody exactly like this, but there are some approximations).

**Corpus linguistics** does not have a good name in some linguistics circles. A caricature of the corpus linguist is something like this. He has all of the primary facts that he needs, in the form of a corpus of approximately one zillion running words, and he sees his job as that of deriving secondary facts from his primary facts. At the moment he is busy determining the relative frequencies of the eleven parts of speech as the first word of a sentence versus as the second word of a sentence.

(There isn't anybody exactly like this, but there are some approximations).

These two don't speak to each other very often, but when they do, the corpus linguist says to the armchair linguist, 'Why should I think that what you tell me is true?', and the armchair linguist says to the corpus linguist, 'Why should I think that what you tell me is interesting?'" (Fillmore, 1992).

By 'linguistic resources' we mean "Collections of data which primarily document communicative acts of humans by some form of recording and/or descriptions, both directly as in corpora, or at higher levels of abstraction in lexicons and ontologies. The primary data can be text, video recording and/or audio tracks."<sup>33</sup>

In 2010 a new initiative was launched by LREC (Language Resources and Evaluation Conference) in its 7th edition, the compilation of a *Map of Language Resources, Technologies and Evaluation*, "a collective enterprise of the LREC community, as a first step towards the creation of a very broad, community-built, Open Resource Infrastructure; […] The map was intended to monitor the use and creation of language resources (datasets, tools, etc.)"<sup>34</sup>.

We will now mention some of the main resources available, which enable both the collection of data *and* their annotation at different levels, proceeding in a bottom-up direction.

#### **4.1 WordNet and MultiWordNet**

We start with lexical units, that is, plain words: WordNet "is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations"<sup>35</sup>. From the point of view of our subject, predication and the predicate-argument relationship, it is of particular note that "The majority of the WordNet's relations connect words from the same part of speech (POS). Thus, WordNet really consists of four sub-nets, one each for nouns, verbs, adjectives and adverbs, with few cross-POS pointers. Cross-POS relations include the "morphosemantic" links that hold among semantically similar words sharing a stem with the same meaning: observe (verb), observant (adjective) observation, observatory (nouns). In many of the noun-verb pairs the semantic role of the noun with respect to the verb has been specified: {sleeper, sleeping\_car} is the LOCATION for {sleep} and {painter} is the AGENT of {paint}, while {painting, picture} is its RESULT."
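The noun-verb links just quoted can be pictured as a small data structure. What follows is only a minimal sketch under stated assumptions, not WordNet's actual storage format or API: the class names `Synset` and `MiniLexicon`, the method names, and the POS labels `"n"` and `"v"` are hypothetical and chosen for illustration; only the three example links come from the passage above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Synset:
    """A set of synonymous lemmas sharing one part of speech (hypothetical class)."""
    pos: str            # "n" for noun, "v" for verb (labels are an assumption)
    lemmas: frozenset


@dataclass
class MiniLexicon:
    """Toy store of cross-POS links as (noun synset, role, verb synset) triples."""
    links: list = field(default_factory=list)

    def add_link(self, noun: Synset, role: str, verb: Synset) -> None:
        # In this sketch a link always joins a noun synset to a verb synset.
        if noun.pos != "n" or verb.pos != "v":
            raise ValueError("expected a noun synset and a verb synset")
        self.links.append((noun, role, verb))

    def roles_for(self, verb_lemma: str) -> dict:
        """Map each semantic role to the sorted noun lemmas linked to the verb."""
        result = {}
        for noun, role, verb in self.links:
            if verb_lemma in verb.lemmas:
                result.setdefault(role, []).extend(sorted(noun.lemmas))
        return result


# The three links mentioned in the quoted WordNet description.
sleep = Synset("v", frozenset({"sleep"}))
paint = Synset("v", frozenset({"paint"}))

lexicon = MiniLexicon()
lexicon.add_link(Synset("n", frozenset({"sleeper", "sleeping_car"})), "LOCATION", sleep)
lexicon.add_link(Synset("n", frozenset({"painter"})), "AGENT", paint)
lexicon.add_link(Synset("n", frozenset({"painting", "picture"})), "RESULT", paint)

print(lexicon.roles_for("paint"))
print(lexicon.roles_for("sleep"))
```

Storing each link as a triple keeps the semantic role attached to the relation rather than to either synset, which mirrors the way the quoted description speaks of the role "of the noun with respect to the verb".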

MultiWordNet is a multilingual lexical database, aligned with Princeton WordNet<sup>36</sup>.


<sup>33</sup> From the Glossary of the INTERA project: http://www.mpi.nl/INTERA/

<sup>34</sup> http://www.informatik.uni-trier.de/~ley/db/conf/lrec/lrec2010.html: see especially section 0.33, Question Answering.

<sup>35</sup> http://wordnet.princeton.edu/

<sup>36</sup> http://multiwordnet.fbk.eu/english/home.php


cases, first, and then frames. "The FrameNet project is building a lexical database of English that is both human- and machine-readable, based on annotating examples of how words are used in actual texts. From the student's point of view, it is a dictionary of more than 10,000 word senses, most of them with annotated examples that show the meaning and usage. For the researcher in Natural Language Processing, the more than 170,000 manually annotated sentences provide a unique training dataset for semantic role labeling, used in applications such as information extraction, machine translation, event recognition, sentiment analysis, etc. For students and teachers of linguistics it serves as a valence dictionary, with uniquely detailed evidence for the combinatorial properties of a core set of the English vocabulary."

As is already evident from both a strategic, epistemological point of view and a practical one, resource compatibility and unification are highly desirable, and not only as goals to be pursued in the future. SemLink, for instance, is "the effort to map between complementary lexical resources: WordNet, FrameNet, VerbNet and PropBank. The goal is to develop a broad-coverage, unified English resource that has a fine granularity and rich semantics of WordNet and FrameNet, that is a platform for syntactically based generalizations based on VerbNet, and that provides PropBank style effective training data for supervised Machine Learning techniques." (Palmer, 2009)

We would like to conclude our quick survey by quoting Martha Palmer's words at the conclusion of the same paper: "Efforts to link the PropBank/VerbNet and FrameNet resources to one another and to WordNet, and to define semantics for the roles used by each resource, are a likely avenue for future improvements in semantic role labeling systems, and will benefit Question-Answering, Information Extraction and other NLP applications." Let's pursue such avenues.

**5. Conclusion**

We have considered the differences between questions and requests, and their co-presence in the structure of queries.

Because of the so-called "descriptive fallacy" in philosophy of language, it took rather a long time to give them the attention they were due. Thanks to pragmatics, this oversight has been rectified.

Asking questions testifies to the strong relationship between lack of determinacy (poverty, both in knowledge and in action) and the need to overcome it (in order to attain plentifulness). Interrogative structures are devices where triggers such as *wh*-words or suspended assent are at work to retrieve missing information, extract knowledge, or receive the cooperation requested.

Answers are therefore not only assertions, but also permissions, prohibitions, orders, suggestions, etc. The logico-linguistic structure which is always required across this variety of speech acts, and which makes possible the wording of questions and requests, is predication. Even in elliptical or simply verbless sentences, predication is at work, albeit implicit or implied. To be at work means that it is a necessary condition for the complete efficiency and comprehensibility of the sentence itself. To be at work, then, means that the addressee/hearer/reader has to bear in mind, or retrieve, the predication, where the absence of recognition would prevent him/her from understanding the meaning, i.e. the
