**Intelligent Information Access Based on Logical Semantic Binding Method**

Rabiah A. Kadir¹, T.M.T. Sembok² and Halimah B. Zaman² *¹Universiti Putra Malaysia, Faculty of Computer Science and IT; ²Universiti Kebangsaan Malaysia, Faculty of Science and IT; Malaysia*

#### **1. Introduction**

The idea of a computer system capable of simulating understanding, by reading a document and answering questions about it, has attracted researchers since the early 1970s. Information access has recently received increased attention within the natural language processing (NLP) community as a means to develop and evaluate robust question answering methods. Most recent work has stressed the value of information access as a challenge because it targets successive skill levels of human performance and because independently developed scoring algorithms and human performance measures exist. It is an exciting research area in natural language understanding because it requires broad-coverage techniques and semantic knowledge, which can be used to gauge how well a computer system understands natural language.

In 2003, MITRE Corporation defined a new research paradigm for NLP by implementing a question answering system for reading comprehension. Reading comprehension offers a new challenge and a human-centric evaluation paradigm for human language technology. It is an exciting testbed for natural language understanding research aimed at the information access problem.

The current state of the art in computer-based language understanding makes a reading comprehension system a good project (Hirschman et al., 1999). It can be a valuable tool for assessing natural language understanding. This has been demonstrated by a series of works on question answering for the reading comprehension task: Hirschman et al. (1999) reported an accuracy of 36.3% on answering the questions in the test stories. Subsequently, the work of Charniak et al. (2000), Riloff & Thelen (2000), Ng et al. (2000) and Bashir et al. (2004) achieved 41%, 39.8%, 23.6% and 31.6%-42.8% accuracy, respectively. However, all of the above systems used simple bag-of-words (BOW) matching, bag-of-verb-stems (BOV) matching, hand-crafted heuristic rules, machine learning, or advanced BOW and BOV approaches. In contrast, this chapter discusses a logic representation and logical deduction approach to inference. We aim to extend the proposed logical formalisms towards semantics for question answering rather than relying on surface analysis alone. Sets of words, lexical and semantic clues, feature vectors and lists of word tokens were utilized for knowledge representation in this approach.


This chapter describes a method for natural language understanding concerned with the problem of generating automated answers to open-ended questions (i.e. WHO, WHAT, WHEN, WHERE and WHY). Generating an automated answer involves sophisticated knowledge representation, reasoning, and inferential processing. Here, an existing resolution theorem prover, with some components modified, is explained on the basis of experiments covering knowledge representation and automated answer generation. The answer to a question typically refers to a string in the text of a passage and comes only from the short story associated with the question, even though some answers require knowledge beyond the text of the passage. To address this problem, the research utilizes world knowledge to support the answer extraction procedure and to broaden the scope of the answer, based on the theory of cognitive psychology (Lehnert, 1981; Ram & Moorman, 2005). The implementation uses backward-chaining deductive reasoning for inference over a knowledge base represented in simplified logical form. The knowledge base representation, known as Pragmatic Skolemized Clauses, is based on first-order predicate logic (FOPL) and uses the Extended Definite Clause Grammar (X-DCG) parsing technique to represent the semantic formalism.
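The backward-chaining deduction mentioned above can be illustrated with a minimal sketch. The clause format, the atom names and the `prove` helper are illustrative assumptions, not the chapter's implementation, and the knowledge base is restricted to ground (variable-free) definite clauses to keep the example short.

```python
# Minimal backward chaining over ground definite clauses (illustrative only).
# Each head atom maps to a list of alternative bodies; a fact has an empty
# body. Atom names such as "bear(pooh)" are hypothetical examples.
RULES = {
    "bear(pooh)": [[]],
    "likes(pooh, honey)": [[]],
    "animal(pooh)": [["bear(pooh)"]],
    "eats(pooh, honey)": [["animal(pooh)", "likes(pooh, honey)"]],
}


def prove(goal, rules, seen=frozenset()):
    """Return True if `goal` follows from `rules` by backward chaining."""
    if goal in seen:  # guard against cyclic rules
        return False
    for body in rules.get(goal, []):
        # The goal holds if every subgoal of some body can be proved.
        if all(prove(sub, rules, seen | {goal}) for sub in body):
            return True
    return False
```

Here `prove("eats(pooh, honey)", RULES)` returns `True` because the goal is reduced to facts, whereas a goal with no matching clause fails. A full first-order prover would additionally unify variables, which this sketch omits.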

This form of knowledge representation adopts a translation strategy involving a noun phrase grammar, a verb phrase grammar and a lexicon. However, stored documents are translated only partially, constrained by the limited grammar and lexicon. Queries are restricted to verb and noun phrase forms for a particular document. This restriction is appropriate, since the objective is to acquire inductive reasoning between the queries and the document input. A logical-linguistic representation is applied, and the details of the translation deserve special attention. This chapter deals with a question answering system, where the translation should be as close as possible to the real meaning of the natural language phrases in order to give an accurate answer to a question. The aim of the translation is to produce a good logical model representation that can be applied to the information access process and retrieve an accurate answer. This means that the chosen logical-linguistic representation of semantic theory is practically correct for the intended application.

The representation of questions and answers, and the reasoning mechanisms for question answering, are the concern of this chapter. To achieve a question answering system capable of generating automatic answers for all the question types covered, this chapter describes the implementation of logical semantic binding, with its arguments, within an existing theorem prover technique. Different types of questions require different strategies to find the answer. A semantic model of question understanding and processing is needed, one that recognizes equivalent questions regardless of the words, syntactic inter-relations or idiomatic forms used. The reasoning process for generating an automated answer begins with the execution of resolution theorem proving. Answer extraction then proceeds with the logical semantic binding approach, which continues tracking the relevant semantic relation rules in the knowledge base that contain the answer key in the form of a skolem constant that can be bound. A complete relevant answer is defined as a set of skolemized clauses containing at least one skolem constant that is shared and bound between them. The answers produced by the reasoning technique can be classed into two types: satisfying and hypothetical answers. The two classes are formally distinguished by whether the answer is stated explicitly or implicitly in the text. Using the logical semantic binding approach over logical forms allows for more complex cases, such as *Why* questions, where the information extracted is an implicit context from a text passage. The types of questions handled using this approach are causal antecedent, causal consequent, instrumental or procedural, concept completion, judgemental and feature specification.
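The idea of an answer as a set of clauses tied together by shared skolem constants can be sketched as follows. This is a hedged illustration only: the tuple encoding of clauses, the `sk` prefix for skolem constants, the sample knowledge base and both function names are assumptions introduced here, not the chapter's actual representation.

```python
# Illustrative sketch: collect the clauses linked to a seed clause through
# shared skolem constants. The "sk" prefix convention and the sample clauses
# are assumptions made for this example.

def skolems(clause):
    """Return the skolem constants among a clause's arguments."""
    _predicate, args = clause
    return {a for a in args if a.startswith("sk")}


def bound_clauses(seed, clauses):
    """Return all clauses reachable from `seed` via shared skolem constants."""
    answer = [seed]
    frontier = set(skolems(seed))
    known = set()
    while frontier:
        constant = frontier.pop()
        known.add(constant)
        for clause in clauses:
            if clause not in answer and constant in skolems(clause):
                answer.append(clause)
                # Newly seen constants extend the search.
                frontier |= skolems(clause) - known
    return answer


KB = [
    ("boy", ("sk1",)),
    ("name", ("sk1", "christopher")),
    ("live", ("sk2", "sk1")),
    ("place", ("sk2", "england")),
    ("bear", ("sk3",)),
]
```

Starting from the clause `("live", ("sk2", "sk1"))`, the function gathers the clauses mentioning `sk1` or `sk2` and leaves out the unrelated `bear` clause, which shares no skolem constant with the seed.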

The enhancement of the logical-linguistic approach also depends on discourse understanding from external knowledge, supplied as an additional input in order to understand the text query and to produce as output some description or hypernyms of the information conveyed by the text. World knowledge is knowledge about the world: experience, or compilations of experience, together with other information that does not refer to the particular passage being asked about and that holds true in the real world. Real-world knowledge includes knowledge from the end user, architectural or implementation knowledge from the software developer, and other levels of knowledge as well. World knowledge is used to support the information extraction procedure and to broaden the scope of information access, based on the theory of cognitive psychology (Ram & Moorman, 2005). Several studies, beginning in 2001, have tried to exploit world knowledge to support information extraction (Golden & Goldman, 2001; Ferro et al., 2003).

The information access task attempts to retrieve a set of the most relevant answer literals for a query. To this end, binding is performed between the given query and the stored documents, which are represented in Pragmatic Skolemized Clauses logical form. This chapter presents a comprehensive discussion of how the logical semantic binding approach makes it practical to access information semantically.

## **2. Syntax-semantic formalism**

In addition to handling the semantics of a language, which involves ascertaining the meaning of a sentence, this section describes the nature of reading comprehension, which includes the understanding of a story. Generally, a document can be understood sentence by sentence. This is done through sentence understanding: the study of context-independent meaning within an individual sentence, which must include the event, the object, the properties of the object, and the thematic role relationship between the event and the object in the sentence. Based on this theory of sentence understanding, an experiment was carried out using logical linguistics, with Definite Clause Grammar (DCG) chosen as the basis of semantic translation.
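To make the target of sentence understanding concrete, the following toy sketch translates a simple "subject verb object" sentence into literals for the event, the objects, their properties and the thematic roles. It is a hand-written parser rather than a DCG, and the tiny lexicon, the role names `agent` and `theme`, and the constants `e1`, `x1` and `x2` are all assumptions chosen for illustration.

```python
# Toy translation of a "subject verb object" sentence into logical-form
# literals. The lexicon, role names and constant naming are hypothetical;
# a real DCG would cover far more syntax.
LEXICON = {
    "det": {"the", "a", "an"},
    "adj": {"pink", "small"},
    "noun": {"boy", "milk", "bear"},
    "verb": {"pushed", "drank", "saw"},
}


def parse_np(tokens, pos, const):
    """Parse DET? ADJ* NOUN starting at pos; return (literals, next_pos)."""
    literals = []
    if pos < len(tokens) and tokens[pos] in LEXICON["det"]:
        pos += 1
    while pos < len(tokens) and tokens[pos] in LEXICON["adj"]:
        literals.append(f"{tokens[pos]}({const})")  # property of the object
        pos += 1
    if pos >= len(tokens) or tokens[pos] not in LEXICON["noun"]:
        raise ValueError("expected a noun")
    literals.insert(0, f"{tokens[pos]}({const})")  # the object itself
    return literals, pos + 1


def translate(sentence):
    """Return literals for the event, its objects and their thematic roles."""
    tokens = sentence.lower().rstrip(".").split()
    subject, pos = parse_np(tokens, 0, "x1")
    if pos >= len(tokens) or tokens[pos] not in LEXICON["verb"]:
        raise ValueError("expected a verb")
    verb = tokens[pos]
    obj, pos = parse_np(tokens, pos + 1, "x2")
    if pos != len(tokens):
        raise ValueError("unexpected trailing words")
    return subject + obj + [f"{verb}(e1)", "agent(e1, x1)", "theme(e1, x2)"]
```

For the sentence "The boy drank the pink milk." the sketch yields one literal per object and property, one for the event, and two linking the event to its participants.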

#### **2.1 Document understanding**

Document understanding focuses on the inferential processing, common sense reasoning, and world knowledge required for in-depth understanding of documents. These efforts are concerned with specific aspects of knowledge representation, inference techniques, and question types (Hirschman et al. 1999; Lehnert et al. 1983; Grohe & Segoyfin 2000).

The challenge of having a computer system read a document and demonstrate understanding through question answering was first addressed by Charniak (1972), as cited in Dalmas et al. (2004). This work showed the diversity of both logical and common sense reasoning that needs to be linked with what is said explicitly in the story or article in order to answer questions about it. More recent work has attempted to systematically determine the feasibility of reading comprehension as a research challenge in terms of targeting successive skill levels of human performance for open-domain question answering (Hirschman et al. 1999; Riloff & Thelen 2000; Charniak et al. 2000; Ng et al., 2000; Wang et al. 2000; Bashir et al. 2004; Clark et al. 2005). The work initiated by Hirschman et al. (1999) also used the same data set. Earlier works from 1999 to 2000 introduced the 'bag-of-words' approach to represent sentence structure. Ferro et al. (2003) introduced knowledge diagrams and conceptual graphs to represent sentence structure. This chapter, however, focuses on the logical relationship approach to handling syntactic and semantic variants of sentence structure. This approach is discussed thoroughly in the following sections.

The input to document understanding is divided into individual sentences. Intersentential interactions, such as reference, are an important aspect of language understanding and of the task of sentence understanding. The types of knowledge used in analyzing an individual sentence (such as syntactic knowledge) are quite different from the kinds of knowledge that come into play in intersentential analysis (such as knowledge of discourse structure).

#### **2.1.1 Sentence understanding**

A sentence can be characterised as a linear sequence of words in a language. The output desired from a sentence understander must include the event, object, properties of object, and the thematic role relationship between the event and the object in the sentence (Ram & Moorman 2005). In addition, it is also desirable to include the syntactic parse structure of the sentence. A fundamental problem in mapping the input to the output in terms of showing sentence understanding is the high degree of ambiguity in natural language. Several types of knowledge such as syntactic and semantic knowledge can be used to resolve ambiguities and identify unique mappings from the input to the desired output. Some of the different forms of knowledge relevant for natural language understanding (Allen 1995; Doyle 1997; Mahesh 1995; Mueller 2003; Dowty et al. 1981; Capel et al. 2002; Miles 1997) are as follows:


i. Morphological knowledge – this concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language. For example, the meaning of the word *friendly* is derivable from the meaning of the noun *friend* and the suffix *–ly*, which transforms a noun into an adjective.

ii. Syntactic knowledge – this concerns how words can be put together to form correct sentences, and determines what structural role each word plays in the sentence and what phrases are subparts of other phrases.

iii. Semantic knowledge – this concerns what words mean and how these meanings combine in sentences to form sentence meanings. This involves the study of context-independent meaning.

iv. Pragmatic knowledge – this concerns how sentences are used in different situations and how their use affects the interpretation of a sentence.

v. Discourse knowledge – this concerns how the immediately preceding sentences affect the interpretation of the next sentence. This information is especially important for interpreting pronouns and temporal aspects of the information conveyed.

vi. World knowledge – this includes the general knowledge pertaining to the structure of the world that the language user must have in order to, for example, maintain a conversation. It includes what each language user must know about the other user's beliefs and goals.

Recognizing Textual Entities

There are various textual entities in a document that must be recognized. The following are examples of textual entities:

Words: *was*, *pushed*

Phrases: *freight elevator*, *buffer springs*

Times: *yesterday*, *last week*

Places: *downtown Brooklyn*, *East End*

Names: *John J. Hug*, *Mary-Ann*

Numbers: *\$1,200*, *12 inches*

These entities may be detected using various techniques. Regular expressions and pattern matching are often used (Mueller 1999; Zamora 2004; Li & Mitchell 2003). For example, the ThoughtTreasure system developed by Mueller (1998) provides text agents for recognizing lexical entries, names, places, times, telephone numbers, media objects, products, prices, and email headers.
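As a hedged illustration of regular-expression-based entity detection, the sketch below recognizes a few of the entity types listed above. The patterns are simplified assumptions written for this example; they are not taken from ThoughtTreasure or any of the cited systems, and the names `PATTERNS` and `find_entities` are hypothetical.

```python
import re

# Illustrative recognizers only; the patterns are simplified assumptions and
# would need many more cases in a real system.
PATTERNS = {
    "money": re.compile(r"\$\d{1,3}(?:,\d{3})*(?:\.\d+)?"),
    "measure": re.compile(r"\b\d+(?:\.\d+)? (?:inches|inch|feet|foot)\b"),
    "time": re.compile(r"\b(?:yesterday|today|tomorrow|last week)\b", re.IGNORECASE),
    "name": re.compile(r"\b[A-Z][a-z]+ [A-Z]\. [A-Z][a-z]+\b"),
}


def find_entities(text):
    """Return (label, matched_text) pairs in order of appearance."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _start, label, value in sorted(hits)]
```

Calling `find_entities("Yesterday John J. Hug paid $1,200 for springs 12 inches long.")` labels the time expression, the name, the amount and the measurement in the order they occur.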

Anaphora

A document understanding system must resolve various anaphoric entities to the objects to which they refer (Mitkov 1994). Examples of anaphoric entities are pronouns (*she*, *they*), possessive determiners (*my*, *his*), and arbitrary constructions involving the following:

Adjectives (*the pink milk*)

Genitives (*Jim's milk*)

Indefinite and definite articles (*an elevator salesman*, *the shaft*, *the buffer springs*)

Names (*John J. Hug*)

Relative clauses (*the \$1,200 they had forced him to give them*, *the milk that fell on the floor*)

Anaphora resolution is a difficult problem to tackle. In this research, however, anaphora resolution is achieved by adding world knowledge as an input to the original passage.

Commonsense Knowledge Bases

A commonsense knowledge base is a useful resource for a document understanding system. Most importantly, the commonsense knowledge base can evolve along with the document understanding system. Whenever a piece of commonsense knowledge proves useful in the document understanding system, it can be added to the database. The database can thus be expanded and become more useful for the document understanding application.

Such databases have various advantages and disadvantages. For example, WordNet (Fellbaum 1998) was designed as a lexical rather than a conceptual database. This means that it lacks links between words in different syntactic categories. For example, there is no link between the noun *creation* and the verb *create*.

#### **2.2 First-order predicate logic syntax-semantic formalism**

A crucial component of understanding involves computing a representation of the meaning of sentences and texts. The notion of representation must be defined first, because most words have multiple meanings, known as senses (Fillmore & Baker 2000; Sturgill & Segre 1994; Vanderveen & Ramamoorthy 1997). For example, the word *cook* has a sense as a verb and a sense as a noun, and *still* has senses as a noun, verb, adjective, and adverb. This ambiguity would inhibit the system from making the appropriate inferences needed to model understanding.

To represent meaning, a more precise language is required. The tools to do this can be derived from mathematics and logic. This involves the use of formally specified representation languages. Formal languages are composed of very simple building blocks. The most fundamental is the notion of an atomic symbol, which is distinguishable from any other atomic symbol simply by how it is written.

#### **2.2.1 Syntax**

It is common, when using formal languages in computer science or mathematical logic, to abstract away from the details of concrete syntax in terms of strings of symbols and instead work solely with parse trees. The syntactic expressions of FOPL consist of terms, atomic formulas, and well-formed formulas (wffs) (Shapiro 2000; Dyer 1996). Terms consist of individual constants, variables and functional terms. Functional terms, atomic formulas, and wffs are nonatomic symbol structures. The atomic symbols of FOPL are individual constants, variables, function symbols, and predicate symbols. Individual constants comprise the following:


Variables comprise the following:


Function symbols comprise the following:


Predicate symbols comprise the following:


Each function symbol and predicate symbol must have a particular arity. The arity need not be shown explicitly if it is understood. In any specific predicate logic language, the individual constants, variables, function symbols, and predicate symbols must be disjoint.

Syntax of Terms: Every individual constant and every variable is a term. If *fn* is a function symbol of arity *n*, and *t1, …, tn* are terms, then *fn(t1, …, tn)* is a (functional) term.

Example:

free\_vars( C,FreeVars ), free\_vars( [C0|Cs], Fvs, FVs )

Syntax of Atomic Formulas: If *Pn* is a predicate symbol of arity *n*, and *t1, …, tn* are terms, then *Pn(t1, …, tn)* is an atomic formula.

Example:

proper\_noun( male, christopher). noun( bear, bears).

Syntax of Well-Formed Formulas (Wffs): Every atomic formula is a wff. If *P* is a wff, then so is *¬P*. If *P* and *Q* are wffs, then so are (*P ∧ Q*), (*P ∨ Q*), (*P ⇒ Q*), and (*P ⇔ Q*). If *P* is a wff and *x* is a variable, then *∀x*(*P*) and *∃x*(*P*) are wffs. ∀ is called the universal quantifier. ∃ is called the existential quantifier. *P* is called the scope of quantification.

Parentheses may be omitted when there is no ambiguity, in which case ∀ and ∃ have the highest priority, then ∧ and ∨ have higher priority than ⇒, which, in turn, has higher priority than ⇔. For example, ∀xP(x) ∧ ∃yQ(y) ⇔ ¬P(a) ⇒ Q(b) will be written instead of ((∀x(P(x)) ∧ ∃y(Q(y))) ⇔ (¬P(a) ⇒ Q(b))).

Every occurrence of *x* in *P* that is not in the scope of some occurrence of *∀x* or *∃x* is said to be free in *P* and bound in *∀xP* and *∃xP*. Every occurrence of every variable other than *x* that is free in *P* is also free in *∀xP* and *∃xP*. A wff with at least one free variable is called open, a wff with no free variables is called closed, and an expression with no variables is called ground.

Syntactic Category: The syntactic categories of the English fragment covered by the DCG are given by the set SynCat (Partee 2006; Partee 2001):

SynCat = {S, NP, VP, DET, CNP, ProperN, ADJ, REL, CN, TV, IV, PP}

The elements of SynCat are symbols representing the English categories as follows:

S: sentences

NP: noun phrases

VP: verb phrases

DET: determiners

CNP: common noun phrases

142 Advances in Knowledge Representation

means that it lacks links between words in different syntactic categories. For example, there

A crucial component of understanding involves computing a representation of the meaning of sentences and texts. The notion of representation has to be defined earlier, because most words have multiple meanings known as senses (Fillmore & Baker 2000; Sturgill & Segre 1994; Vanderveen & Ramamoorthy 1997). For example, the word *cook* can be sensed as a verb and a sense as a noun; and still can be sensed as a noun, verb, adjective, and adverb. This ambiguity would inhibit the system from making appropriate inferences needed to

To represent meaning, a more precised language is required. The tools to do this can be derived from mathematics and logic. This involves the use of formally specified representation languages. Formal languages are comprised of very simple building blocks. The most fundamental is the notion of an atomic symbol, which is distinguishable from any

It is common, when using formal language in computer science or mathematical logic, to abstain from details of concrete syntax in term of strings of symbols and instead work solely with parse trees. The syntactic expressions of FOPLs consist of terms, atomic formulas, and well-formed formulas (wffs) (Shapiro 2000; Dyer 1996). Terms consist of individual constants, variables and functional terms. Functional terms, atomic formula, and wffs are nonatomic symbol structures. The atomic symbols of FOPLs are individual constants, variable, function symbols, and predicate symbols. Individual Constants comprised the

iii. Any character string not containing blanks or other punctuation marks. For example,

is no link between the noun *creation* and the verb *create*.

model understanding.

**2.2.1 Syntax** 

following:

**2.2 First-order predicate logic syntax-semantic formalism** 

other atomic symbol that is simply based on how it is written.

i. Any letter of the alphabet (preferable early) ii. Any (such) letter with a numeric subscript

i. Any letter of the alphabet (preferably late)

ii. Any (such) letter with a numeric subscript

Predicate Symbols comprised the following:

ii. Any (such) letter with a numeric subscript

i. Any letter of the alphabet (preferably early middle)

i. Any letter of the alphabet (preferably late middle)

Function Symbols comprised the following:

ii. Any (such) letter with a numeric subscript. For example, *x, xy, g7*.

iii. Any character string not containing blanks. For example, *noun, prep*.

iii. Any character string not containing blanks. For example, *read\_sentence, gensym*.

*Christopher, Columbia*.

Variables comprised the following:

free\_vars( C,FreeVars ), free\_vars( [C0|Cs], Fvs, FVs )

Syntax of Atomic Formulas:

If *Pⁿ* is a predicate symbol of arity *n*, and *t₁, …, tₙ* are terms, then *Pⁿ(t₁, …, tₙ)* is an atomic formula.

example:

proper\_noun( male, christopher). noun( bear, bears).
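As a minimal sketch of how the arity of a function or predicate symbol can be inspected, the following Prolog fragment classifies a symbol structure with the standard built-ins `atomic/1`, `var/1`, `compound/1` and `functor/3`. The helper name `describe/2` is an assumption made for illustration only and is not part of the system described in this chapter.

```prolog
% describe(+Term, -Description)
% Hypothetical helper: reports whether Term is an individual constant,
% a variable, or a nonatomic structure with a given name and arity.
describe(Term, constant(Term)) :-
    atomic(Term), !.
describe(Term, variable) :-
    var(Term), !.
describe(Term, compound(Name, Arity)) :-
    compound(Term),
    functor(Term, Name, Arity).   % Name/Arity of a functional term or atomic formula
```

For instance, the query `?- describe(proper_noun(male, christopher), D).` yields `D = compound(proper_noun, 2)`, which makes the arity of the predicate symbol explicit even though it is not written in the formula itself.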

Syntax of Well-Formed Formulas (Wffs): Every atomic formula is a wff. If *P* is a wff, then so is ¬*P*. If *P* and *Q* are wffs, then so are (*P* ∧ *Q*), (*P* ∨ *Q*), (*P* ⇒ *Q*), and (*P* ⇔ *Q*). If *P* is a wff and *x* is a variable, then ∀*x*(*P*) and ∃*x*(*P*) are wffs. ∀ is called the universal quantifier. ∃ is called the existential quantifier. *P* is called the scope of quantification.

Parentheses may be omitted when there is no ambiguity, in which case ∀ and ∃ will have the highest priority, then ∧ and ∨ will have higher priority than ⇒, which, in turn, will have higher priority than ⇔. For example, ∀xP(x) ∧ ∃yQ(y) ⇔ P(a) ⇒ Q(b) will be written instead of ((∀x(P(x)) ∧ ∃y(Q(y))) ⇔ (P(a) ⇒ Q(b))).

Every occurrence of *x* in *P* that is not in the scope of some occurrence of ∀*x* or ∃*x* is said to be free in *P* and bound in ∀*xP* and ∃*xP*. Every occurrence of every variable other than *x* that is free in *P* is also free in ∀*xP* and ∃*xP*. A wff with at least one free variable is called open, a wff with no free variables is called closed, and an expression with no variables is called ground.
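The definition of free and bound occurrences can be sketched in Prolog as follows. This is an illustration under stated assumptions, not the chapter's own `free_vars` predicate: object-level variables are written `v(Name)`, quantified wffs are written `forall(Name, P)` and `exists(Name, P)`, and the predicate names `wff_free_vars/2`, `open_wff/1` and `closed_wff/1` are hypothetical.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

% wff_free_vars(+Wff, -Vars): Vars is the sorted list of variable names
% that occur free in Wff.
wff_free_vars(v(X), [X]) :- !.
wff_free_vars(forall(X, P), Vars) :- !,
    wff_free_vars(P, PVars),
    subtract(PVars, [X], Vars).          % the quantifier binds X in its scope
wff_free_vars(exists(X, P), Vars) :- !,
    wff_free_vars(P, PVars),
    subtract(PVars, [X], Vars).
wff_free_vars(Term, Vars) :-
    Term =.. [_|Args],                   % constants have no arguments
    maplist(wff_free_vars, Args, Lists),
    append(Lists, All),
    sort(All, Vars).                     % sort/2 also removes duplicates

% A wff is closed if it has no free variables, and open otherwise.
closed_wff(Wff) :- wff_free_vars(Wff, []).
open_wff(Wff)   :- wff_free_vars(Wff, [_|_]).
```

For example, `?- wff_free_vars(forall(x, implies(p(v(x)), q(v(x), v(y)))), Vs).` gives `Vs = [y]`, since *x* is bound by the quantifier while *y* remains free, so the wff is open.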

Syntactic Category: The syntactic categories of the English fragment covered by the DCG are given by the set SynCat (Partee 2006; Partee 2001):

SynCat = {S, NP, VP, DET, CNP, ProperN, ADJ, REL, CN, TV, IV, PP}

The elements of SynCat are symbols representing the English categories as follows:

S: sentences

NP: noun phrases

VP: verb phrases

DET: determiners

CNP: common noun phrases

ProperN: proper nouns

ADJ: adjectives

REL: relative clauses

CN: common nouns

TV: transitive verbs

IV: intransitive verbs

PrepP: prepositional phrases

#### **2.2.2 Semantic**

Although the intensional semantics of a FOPL depends on the domain being formalized, and the extensional semantics also depends on a particular situation, a specification of the types of entities is usually given as the intensional and the extensional semantics of FOPL expressions.

The usual semantics of FOPL assumes a Domain, *D*, of individuals, functions on individuals, sets of individuals, and relations on individuals. Let *I* be the set of all individuals in the Domain *D*.

Semantic of Atomic Symbols

Individual Constants:

If *a* is an individual constant, [*a*] is some particular individual in I.

Function Symbols:

If *fⁿ* is a function symbol of arity *n*, [*fⁿ*] is some particular function in D,

$$[f^n] \colon \underbrace{\mathbf{I} \times \cdots \times \mathbf{I}}_{n\ \text{times}} \to \mathbf{I}$$

Predicate Symbols:

If *P¹* is a unary predicate symbol, [*P¹*] is some particular subset of I. If *Pⁿ* is a predicate symbol of arity *n*, [*Pⁿ*] is some particular subset of the relation I × … × I (*n* times).

Semantic of Ground Terms

Individual Constants:

If *a* is an individual constant, [*a*] is some particular individual in I.

Functional Terms:

If *fⁿ* is a function symbol of arity *n*, and *t₁, …, tₙ* are ground terms, then [*fⁿ*(*t₁*, …, *tₙ*)] = [*fⁿ*]([*t₁*], …, [*tₙ*]).

Semantic of Ground Atomic Formulas

i. If *P¹* is a unary predicate symbol, and *t* is a ground term, then [*P¹*(*t*)] is True if [*t*] ∈ [*P¹*], and False otherwise.

ii. If *Pⁿ* is an *n*-ary predicate symbol, and *t₁*, …, *tₙ* are ground terms, then [*Pⁿ*(*t₁*, …, *tₙ*)] is True if ⟨[*t₁*], …, [*tₙ*]⟩ ∈ [*Pⁿ*], and False otherwise.

Semantic of Wffs

i. If *P* is a ground wff, then [¬*P*] is True if [*P*] is False, otherwise, it is False.

ii. If *P* and *Q* are ground wffs, then [*P* ∧ *Q*] is True if [*P*] is True and [*Q*] is True, otherwise, it is False.

iii. If *P* and *Q* are ground wffs, then [*P* ∨ *Q*] is False if [*P*] is False and [*Q*] is False, otherwise, it is True.

iv. If *P* and *Q* are ground wffs, then [*P* ⇒ *Q*] is False if [*P*] is True and [*Q*] is False, otherwise, it is True.

v. If *P* and *Q* are ground wffs, then [*P* ⇔ *Q*] is True if [*P*] and [*Q*] are both True or both False, otherwise, it is False.

vi. [∀*xP*] is True if [*P*{*t*/*x*}] is True for every ground term, *t*. Otherwise, it is False.

vii. [∃*xP*] is True if there is some ground term, *t*, such that [*P*{*t*/*x*}] is True. Otherwise, it is False.

Semantic Types of FOPL

Every English expression of a particular syntactic category is translated into a semantic expression of a corresponding type. The semantic types are defined as in Table 1.

| Syntactic Category | Semantic Type | Expressions |
|---|---|---|
| S | t | sentences |
| ProperN | e | names (*Chris*) |
| NP | e | "e-type" or "referential" NPs (*Chris, the president*) |
| NP | e → t | NPs as predicates (*an animal*, *a president*) |
| CN(P) | e → t | common noun phrases (*cat*) |
| ADJ(P) | e → t | predicative adjectives (*pretty*, *big*) |
| REL | e → t | relative clauses (*who(m) read the book*) |
| VP, IV | e → t | verb phrases, intransitive verbs (*read the book, is big*) |
| TV | ⟨e, e⟩ → t | transitive verbs (*read, lives*) |
| is | none | temporary treatment: pretend it is not there |
| DET | e → t to e | universal quantifier (*the*) |
| DET | e → t to e → t | existential quantifier (*a*) |

Table 1. Semantic Types of Syntactic Category in FOPL

where:

e is a type, representing object of sort entity

t is a type, representing truth values
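The types e and t, together with the truth conditions listed under "Semantic of Wffs", can be made concrete with a very small model. The following Prolog sketch is an illustration only: the entities, the extensions, and the names `entity/1`, `extension/2` and `true_in_model/1` are assumptions, and it further assumes that every individual in the domain is named by a constant, so that quantifying over ground terms amounts to enumerating `entity/1`. Entities play the role of type e, the success or failure of `true_in_model/1` plays the role of type t, and each predicate symbol denotes a set of tuples of entities.

```prolog
:- use_module(library(lists)).

% Domain of individuals (type e). Illustrative data only.
entity(chris).
entity(pooh).
entity(owl).

% Extensions of predicate symbols: sets of tuples of individuals.
extension(animal, [[pooh], [owl]]).
extension(boy,    [[chris]]).
extension(loves,  [[chris, pooh]]).

% true_in_model(+Wff): succeeds iff the closed wff is True in the model.
true_in_model(not(P))        :- !, \+ true_in_model(P).
true_in_model(and(P, Q))     :- !, true_in_model(P), true_in_model(Q).
true_in_model(or(P, Q))      :- !, ( true_in_model(P) -> true ; true_in_model(Q) ).
true_in_model(implies(P, Q)) :- !, ( true_in_model(P) -> true_in_model(Q) ; true ).
true_in_model(iff(P, Q))     :- !, ( true_in_model(P) -> true_in_model(Q)
                                   ; \+ true_in_model(Q) ).
% Universal: there is no individual for which the scope is false.
true_in_model(forall(X, P))  :- !, \+ ( entity(X), \+ true_in_model(P) ).
% Existential: some individual makes the scope true (X is bound to a witness).
true_in_model(exists(X, P))  :- !, entity(X), true_in_model(P).
% Ground atomic formula: the argument tuple must belong to the extension.
true_in_model(Atom) :-
    Atom =.. [Pred|Args],
    extension(Pred, Tuples),
    memberchk(Args, Tuples).
```

With this model, `?- true_in_model(exists(X, and(animal(X), loves(chris, X)))).` succeeds with `X = pooh`, whereas `?- true_in_model(forall(X, animal(X))).` fails because *chris* is not in the extension of *animal*.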

Based on Table 1, some of the compositional semantics are as follows:

S: Denotes a truth value, relative to an assignment of the values to free variables

NP: Is of two kinds. First, a referential NP formed with the definite article *the* is of type e and denotes an individual, as in *The boy writes*. Second, a predicate NP formed with the indefinite article *a* is of type e → t and denotes a set, as in *Christopher is a boy*. There are also lexical NPs of the referential kind, including proper nouns (*George, Robin*) and indexed pronouns (*heᵢ*), which are interpreted as individual variables *xᵢ*.

CN, CNP, ADJ, VP, IV, REL: All of type e → t, one-place predicates, denoting sets of individuals. For this type, the parser will freely go back and forth between sets and their characteristic functions, treating them as equivalent.

TV: Is of type ⟨e, e⟩ → t, a function from ordered pairs to truth values, i.e. the characteristic function of a set of ordered pairs. A 2-place relation is represented as a set of ordered pairs, and any set can be represented by its characteristic function.

DET (a): Forms predicate nominals as an identity function on sets. It applies to any set as argument and gives the same set as value. For example, for the set of individuals in the model who are students, ||a student|| = ||a||(||student||) = ||student||.

DET (the): Forms e-type NPs as the *iota* operator, which applies to a set and yields an entity if its presuppositions are satisfied, otherwise it is undefined. It is defined as follows:

||ια|| = *d* if there is one and only one entity *d* in the set denoted by ||α||

For example, if the set of animals that Chris loves contains only Pooh, then *the animal who Chris loves* denotes Pooh. If Chris loves no animal or loves more than one animal, then *the animal who Chris loves* is undefined, i.e. has no semantic value.

Semantic Representation of English Expression in FOPL

Table 2 shows the semantic representation, or syntax-semantic formalism, of a number of simple basic English expressions and phrases, along with a way of representing each formula in Prolog.

| Syntactic Category | Semantic Representation | As written in Prolog |
|---|---|---|
| *animal* (CN) | 1-place predicate (λx)animal(x) | X^animal(X) |
| *young* (ADJ) | 1-place predicate (λx)young(x) | X^young(X) |
| *writes* (TV) | 2-place predicate (λy)(λx)writes(x,y) | Y^X^writes(X,Y) |
| *read* (IV) | 1-place predicate (λx)read(x) | X^read(X) |
| *with* (PrepP) | 2-place predicate (λy)(λx)with(x,y) | Y^X^with(X,Y) |
| *is an animal* (Copular VP) | 1-place predicate (λx)animal(x) | X^animal(X) |
| *Christopher* (PN) | logical constant *Christopher* | christopher |
| *young animal* (CN with ADJ) | 1-place predicate joined by 'and' (λx)(young(x) ∧ animal(x)) | X^(young(X),animal(X)) |

Table 2. Representation of Simple Words and Phrases

The basic expressions *animal* and *young*, of categories CN and ADJ, are translated into the predicates (λx)animal(x) and (λx)young(x) respectively. The word *young* is considered as a property, not as a thing. This has to do with the distinction between sense and reference. A common noun such as *owl* can refer to many different individuals, so its translation is the property that these individuals share. The reference of *animal* in any particular utterance is the value of *x* that makes animal(x) true.
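The notion of reference as "the value of *x* that makes animal(x) true", and the uniqueness presupposition of the *iota* operator for *the*, can be sketched in Prolog. The facts and the predicate name `the/2` below are assumptions chosen for illustration; failure of `the/2` stands for "undefined".

```prolog
% Illustrative facts (not taken from the chapter's knowledge base).
animal(pooh).
animal(owl).
loves(chris, pooh).

% the(+X^Property, -D)
% D is the one and only entity that satisfies Property.
% The goal fails when no entity, or more than one entity, satisfies it.
the(X^Property, D) :-
    findall(X, call(Property), Found),
    sort(Found, [D]).        % exactly one distinct referent
```

Here `?- animal(X).` enumerates the possible referents of *animal* (`pooh`, then `owl`), `?- the(X^(animal(X), loves(chris, X)), D).` gives `D = pooh`, and `?- the(X^animal(X), D).` fails because two animals satisfy the description.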

The situation is different for verbs, which require different numbers of arguments. For example, the intransitive verb *read* is translated into the one-place predicate (λx)read(x), while a transitive verb such as *writes* is translated into a two-place predicate such as (λy)(λx)writes(x,y). The copula (*is*) has no semantic representation. The representation for *is an animal* is the same as for *animal*, (λx)animal(x).

Basic expressions can be combined to form complex expressions through the unification process, which can be accomplished by arguments on DCG rules. The following illustrates the combination of several predicates in the N1 by joining them with the ∧ (and) symbol (Covington 1994). From

young = (λx)young(x)

smart = (λx)smart(x)

animal = (λx)animal(x)

the complex expression is presented as:

young smart animal = (λx)(young(x) ∧ smart(x) ∧ animal(x))

DCG rules for the lexicon entries for the particular words:

adj(X^young(X)) --> [young].

adj(X^smart(X)) --> [smart].

adj(X^green(X)) --> [green].

n(X^animal(X)) --> [animal].

n(X^cat(X)) --> [cat].

The syntactic and translation rules of DCG are equivalent to the rules defined in PS. For the PS rules, the semantic of the whole N1 is as follows:

n1(Sem) --> n(Sem).

and below is the rule that combines an adjective with an N1:

n1(X^(P,Q)) --> adj(X^P), n1(X^Q).

Through these implementation rules, basic English expressions are combined to form complex expressions, and at the same time translated into FOPL expressions using the Prolog unification process. The implementation rule for the determiner in natural language corresponds to the quantifiers in formal logic. The determiner (DET) can be combined with a common noun (CN) to form a noun phrase. The quantifier ∃ normally goes with the connective ∧, and ∀ with ⇒. The sentence *An animal called Pooh* contains the quantifier ∃, and its semantic representation is *(∃x)(animal(x) ∧ called(x,Pooh))*. In Prolog notation, where constants are written in lower case, this is written as exist(X,animal(X),called(X,pooh)).
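The rules discussed above can be assembled into a small runnable grammar. The sketch below reuses the adjective, noun and N1 rules in the style of Covington (1994); the `det//1`, `rel//1` and `np//1` rules are hypothetical additions made here only to show how a determiner could yield the `exist(X, Restrictor, Scope)` form, and they are not claimed to be the rules of the system described in this chapter.

```prolog
% Lexicon: adjectives and nouns as one-place predicates X^P.
adj(X^young(X)) --> [young].
adj(X^smart(X)) --> [smart].
n(X^animal(X))  --> [animal].
n(X^cat(X))     --> [cat].

% N1: a noun, optionally preceded by adjectives; predicates are joined by ','.
n1(Sem)      --> n(Sem).
n1(X^(P, Q)) --> adj(X^P), n1(X^Q).

% Determiners name the quantifier (assumed entries).
det(exist) --> [a].
det(exist) --> [an].
det(all)   --> [every].

% Reduced relative clause "called <Name>" (assumed rule).
rel(X^called(X, Name)) --> [called, Name].

% NP: determiner + N1 + relative clause, sharing the variable X by unification.
np(Sem) -->
    det(Quant), n1(X^Restrictor), rel(X^Scope),
    { Sem =.. [Quant, X, Restrictor, Scope] }.
```

The query `?- phrase(n1(S), [young, smart, animal]).` binds `S` to `X^(young(X), smart(X), animal(X))`, and `?- phrase(np(S), [an, animal, called, pooh]).` binds `S` to `exist(X, animal(X), called(X, pooh))`. Sharing the single variable `X` across the rule arguments is what lets unification build the complex expression while parsing.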

#### **3. Knowledge representation**

Knowledge representation is the symbolic representation of aspects of some closed universe of discourse. There are four properties of a good knowledge representation system in our domain: representational adequacy, inferential adequacy, inferential efficiency, and acquisitional efficiency (Mohan, 2004). The objective of knowledge representation is to make knowledge explicit. Knowledge can be shared less ambiguously in its explicit form, and this became especially important when machines started to be applied to facilitate knowledge management. Knowledge representation is a multidisciplinary subject that applies theories and techniques from three other fields (Sowa, 2000): logic, ontology, and computation.

Logic and ontology provide the formalization mechanisms required to make expressive models easily sharable and computer aware. Thus, the full potential of knowledge accumulations can be exploited. However, computers play only the role of powerful processors of more or less rich information sources. It is important to remark that the possibilities for applying actual knowledge representation techniques are enormous. Knowledge is always more than the sum of its parts, and knowledge representation provides the tools needed to manage accumulations of knowledge.

Solving the complex problems encountered in artificial intelligence requires both a large amount of knowledge and some mechanisms for manipulating that knowledge to create solutions to new problems. To put human knowledge in a form with which computers can reason, it must be translated from its 'natural' language form into an artificial language called symbolic logic. Logic representation has been accepted as a good candidate for representing the meaning of natural language sentences (Bratko, 2001) and also allows more subtle semantic issues to be dealt with. A complete logical representation of open-ended queries and the whole text of passages needs an English grammar and lexicon (Specht, 1995; Li, 2003). The output required for the reading comprehension task from each input English phrase must include the event, the object, the properties of the object, and the thematic role relationship between the event and the object in the sentence (Ram & Moorman, 2005).

The translation strategy involves a noun phrase grammar, a verb phrase grammar and a lexicon, which are built entirely for experimental purposes. However, the translation of stored passages will only be done partially, based on the limited grammar and lexicon. The queries will be restricted to verb and noun phrase form. This restriction is appropriate, since the objective of the reading comprehension task in this research is to acquire deductive reasoning between the queries and the passage input, which can be done using verb and noun phrases. Some evidence has been gathered to support this view (Ferro et al., 2003; Bashir et al., 2004).

This work deals with a question answering system in which the translation should be as close as possible to the real meaning of the natural language phrases in order to give an accurate answer to a question. The given query and the stored passages are represented in PragSC logical form. In general, the translation of the basic expressions or English words into semantic templates is based on their syntactic categories as shown in Table 3, where X stands for an object CN, Y stands for an object CN or ADJ, 'predicate' stands for the English word, Q stands for the quantifier exists or all, and 'app-op' stands for & or ⇒.

| Categories | Template Forms |
|---|---|
| CN | [ X \| predicate(X) ] |
| ADJ | [ X \| predicate(X) ] |
| TV | [[ X \| A ], Y \| A & predicate(X, Y) ] |
| IV | [[ X \| A ] \| A & predicate(X) ] |
| DET | [[ X \| A ],[ X \| C ] \| Q(X, A app-op C) ] |
| Prep | [[ X \| A ], Y \| A & predicate(X, Y) ] |
| AUX | Temporary treatment as in FOPL: pretend it is not there |

Table 3. Syntax-semantic formalism of English fragment

The syntax-semantic formalism defines the notion of representation that shows the meaning of a sentence. A new logical form, known as PragSC, has been proposed for designing an effective logical model representation that can be applied to the question answering process to retrieve an accurate answer. The main advantage of logical representation in this problem is its ability to give names to constituents such as the noun phrase and the verb phrase. This means that it recognizes a sentence as more than just a string of words. Unlike the template and keyword approaches, it can describe recursive structure, meaning that longer sentences have shorter sentences within them. Figure 1 illustrates the translation of the English phrase *an animal called Pooh*:

Fig. 1. Semantic tree

Each natural language text is directly translated into PragSC form, which can be used as a complete content indicator of a passage or query. The passages and queries are processed to form their respective indexes through the translation and normalization process, which is composed of simplification processes. The similarity values between the passage and query indexes are computed using the skolemized clauses binding of the resolution theorem prover technique. This representation is used to define implication rules for any particular question answering and for defining synonym and hypernym words.

g15

Set of relevant clauses: two(g15) book(g15) his(g9) own(g9) of(g15,g9) write(chris,g15) two(g15) book(g15) famous(g18) be(likes(tells(g15,it)),g18)

The example considers write(chris,g15) as the key skolemized clause. g15 is the key of the answer, which is used to accumulate the relevant clauses through a linking-up process either to its subject side or to its object side.

Implementation of skolem clauses binding is more complicated when the clauses contain variables, because two such skolemized clauses cannot be unified. In this experiment, the operation involves "normalization" of the variables, just enough for two skolemized clauses to be unified. Normalization is the process of giving a standard atom to each common noun in each input text passage that was represented as a variable during the translation process. The skolem clauses normalization involves the X-DCG parsing technique, which has been extended with the functionality of a bi-clausifier. The details of the X-DCG parsing technique have been explained in chapter 5. Skolem constants are generated through the first parsing process. The normalization is then implemented in the second parsing, which is a transformation process that identifies two types of skolem constant to differentiate between quantified (fn) and ground term (gn) variable names.

Binding, within this experiment, refers to the process of accumulating relevant clauses by a skolem constant or atom connected to any clauses existing in the knowledge base. Each skolemized clause is conceived as connected if each pair of clauses in it is interrelated by the key answer, which consists of a skolem constant or atom. The idea that it should be specific is based on coherent theory, which deals with this particular set of phenomena and originated in the 1970s, based on the work in transformational grammar (Peters & Ritchie, 1973; Boy, 1992).

This work was conducted to solve the problem by connecting the key of an answer that has been produced through the resolution theorem prover. The skolemized clauses binding technique gives the interrelation of skolemized clauses that could be considered as a relevant answer by connecting its key of answer. To establish this logical inference technique, Figure 2 illustrates the inference engine framework.

An answer is literally generated by negating a query and implementing skolemized clauses binding. This enables a resolution theorem prover to go beyond a simple "yes" answer by providing a connected skolem constant used to complete a proof. Concurrently, a semantic relation rule is also specified in pragmatic skolemized clauses as a knowledge base representation. In the example provided, this can be seen as the binding process proceeds. If the semantic relation rule being searched contains rules that are unified to a question through its skolem constant, the answers will be produced. Consider the following sample as a

A query is translated into its logical representation in the same way as documents are translated. This representation is then simplified and partially reduced. The resulting query representation is then ready to be proven against the passage representation, and the literal answers are retrieved. The proving is performed through an uncertain implication process in which predicates are matched and propagated, and which finally gives a literal answer value between the query and the passage. The following section describes the query process and its literal answer value in more detail.

## **4. Logical semantic binding inference engine**

Work on open-ended question answering requires sophisticated linguistic analysis, including discourse understanding, and deals with questions about nearly everything rather than relying only on general ontologies and world knowledge. To achieve a question answering system that can generate answers automatically for all the question types covered, skolemize clauses binding, together with its arguments, is introduced into an existing theorem prover technique. Automated theorem proving served as an early model for question answering in the field of AI (Wang et al., 2000). The skolemize clauses binding approach over logical forms allows for more complex cases, such as *Why* questions, where the information to be extracted is implicit in the context of a text passage. The approach relates how one clause can be bound to others. With it, the theorem prover need only determine which skolem constant can be applied, and valid clauses are produced automatically.

Skolemize clauses binding is designed to work with a simplified logical formula that has been transformed into Pragmatic Skolem Clauses form. The basic idea is that if the key skolemize clause matches any skolemize clause in the knowledge base, the two clauses are unified, and the relevant clauses are accumulated by connecting the normalized skolem constant or atom on the subject side or the object side of another clause. The normalized skolem constant or atom is the key of the answer, depending on the phrase structure of the query. Given a key skolemize clause in negated form and a set of appropriately related clauses in the knowledge base, the approach generates the set of relevant clauses that follows from it. Consider the English query *Why did Chris write two books of his own?* to illustrate the idea of skolemize clauses binding.

Example: Why did Chris write two books of his own?

Key skolemize clause: ~ write(chris,g15)

Unification: ~ write(chris,g15) :- write(chris,g15)

Key of answer (Object): g15

Set of relevant clauses: two(g15) book(g15) his(g9) own(g9) of(g15,g9) write(chris,g15) two(g15) book(g15) famous(g18) be(likes(tells(g15,it)),g18)

The example takes write(chris,g15) as the key skolemize clause. g15 is the key of the answer and is used to accumulate the relevant clauses by linking to either its subject side or its object side.
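The linking step in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(predicate, arguments)` tuple encoding and the helper name `link_sides` are hypothetical and are not the authors' Prolog representation, unary clauses are counted as subject-side links, and the nested clause be(likes(tells(g15,it)),g18) is left out to keep the sketch flat.

```python
# Each clause is a (predicate, arguments) pair. This encoding is an assumption
# made for illustration; the chapter stores clauses as Prolog terms.
KB = [
    ("two", ("g15",)),
    ("book", ("g15",)),
    ("his", ("g9",)),
    ("own", ("g9",)),
    ("of", ("g15", "g9")),
    ("write", ("chris", "g15")),
    ("famous", ("g18",)),
]


def link_sides(key, kb):
    """Split the clauses that mention `key` by the side on which it occurs."""
    subject_side, object_side = [], []
    for predicate, args in kb:
        if args[0] == key:
            # The key is the first argument (unary clauses land here too).
            subject_side.append((predicate, args))
        elif key in args[1:]:
            # The key is a later argument, i.e. on the object side.
            object_side.append((predicate, args))
    return subject_side, object_side


subject_side, object_side = link_sides("g15", KB)
print("subject side:", subject_side)
print("object side:", object_side)
```

Only clauses that mention g15 directly are returned here; clauses such as his(g9) are reached in a further binding step through the constant g9 introduced by of(g15,g9).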

Implementing skolem clauses binding is more complicated when the clauses contain variables, because two skolemize clauses then cannot be unified directly. In this experiment, the variables are therefore "normalized" just enough for the two skolemize clauses to unify. Normalization assigns a standard atom to each common noun in the input text passage that was represented as a variable during translation. The skolem clauses normalization uses the X-DCG parsing technique, extended with the functionality of a bi-clausifier. The details of the X-DCG parsing technique are explained in chapter 5. Skolem constants are generated in the first parsing pass. Normalization is then carried out in the second parsing pass, a transformation that identifies two types of skolem constant in order to differentiate between quantified (fn) and ground term (gn) variable names.

Binding, as the term is used in this experiment, refers to the process of accumulating relevant clauses through a skolem constant or atom that is connected to any clause in the knowledge base. A set of skolemize clauses is regarded as connected if each pair of clauses in it is interrelated by the key of the answer, which consists of a skolem constant or atom. The idea that it should be specific is based on coherent theory, which deals with this particular set of phenomena and originated in the 1970s from work in transformational grammar (Peters & Ritchie, 1973; Boy, 1992).

This work addresses the problem by connecting the key of an answer produced by the resolution theorem prover. The skolemize clauses binding technique establishes the interrelation of skolemize clauses that can be considered a relevant answer by connecting the key of the answer. Figure 2 illustrates the inference engine framework for this logical inference technique.

An answer is generated by negating a query and applying skolemize clauses binding. This enables a resolution theorem prover to go beyond a simple "yes" answer by providing the connected skolem constant used to complete a proof. At the same time, semantic relation rules are specified in pragmatic skolemize clauses as the knowledge base representation. In the example provided, this can be seen as the binding process proceeds. If the semantic relation rules being searched contain rules that unify with the question through its skolem constant, the answers are produced. Consider the following sample semantic relation rules, used as an illustration, which are based on a children's passage entitled "School Children to Say Pledge" from Remedia Publications.

Fig. 2. The architecture of an inference engine framework

Semantic relation rules in PragSC form:

```
cl([pledge(f25)],[])
cl([young(g37)],[])
cl([people(g37)],[])
cl([proud(g38)],[])
cl([feels(g37,g38)],[])
cl([makes(f25,g37)],[])
cl([writes(r(frances & bellamy),f25)],[])
```
Given the simple semantic relation rules above and the question *Why did Frances Bellamy write the pledge?*, the following logical form of the question is produced.

```
~ pledge(f25) # ~ writes(r(frances & bellamy),f25) # answer(f25)
```
Based on the above representation, f25 and r(frances & bellamy) are unified with the semantic relation rules in the knowledge base:

~ pledge(f25) :- pledge(f25)

~ writes(r(frances & bellamy),f25) :- writes(r(frances & bellamy),f25)

Both entities are then bound to any relevant semantic relation rule to find the answer.

a. cl([pledge(f25)],[])

b. cl([young(g37)],[])

c. cl([people(g37)],[])

d. cl([proud(g38)],[])

e. cl([feels(g37,g38)],[])

f. cl([makes(f25,g37)],[])

g. cl([writes(r(frances & bellamy),f25)],[])

The skolemized clauses (a) to (g) form a collection of answer sets that unify with the given question, because each clause is bound by at least one skolem constant. The semantic relation rule base indicates that r(frances & bellamy) is bound to clause (g), while f25 (pledge) is bound to clauses (a) and (g). The system continues tracking any relevant semantic relation rules in the knowledge base that contain the skolem constant f25 and can be bound. In this case clause (f) is picked out. Clause (f) extends the binding process through another skolem constant, g37, which represents the *young people* predicates. The process of skolem constant binding continues until no skolem clauses remain that can be bound. It is a process of accumulating relevant clauses by a *skolem constant (x)* or *atom* connected to any clauses existing in the knowledge base.

$$\mathbf{x} \rightarrow \mathbf{P}(\mathbf{x}, \mathbf{x}_1) \land \mathbf{P}(\mathbf{x}_1, \mathbf{x}_2) \land \dots \land \mathbf{P}(\mathbf{x}_{n-1}, \mathbf{x}_n) \land \mathbf{P}(\mathbf{x}_n) \tag{1}$$

The example shows what happens when the facts r(frances & bellamy) and pledge are bound to other clauses or semantic relation rules. The resulting answer is:

$$\begin{array}{c} \text{makes(f25,g37)}\\ \text{young(g37)}\\ \text{people(g37)}\\ \text{feels(g37,g38)}\\ \text{proud(g38)} \end{array}$$

All of these skolemized clauses are considered a set of answers relevant to the question, and they may be the best information available. Further examples are shown in Table 2. Each example begins with part of a collection of semantic rules in the knowledge base, represented as skolemized clauses. In this research, a question Q is represented as a proposition, and a traditional proof is initiated by adding the negation of the clause form of Q to a consistent knowledge base K. If an inconsistency is found through unification, the skolemized clauses binding process proceeds to find the relevant answer.
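The binding process described above can be sketched as a breadth-first traversal over shared skolem constants. This is a minimal sketch under stated assumptions, not the authors' implementation: the tuple encoding, the function name `bind`, and the decision to leave the query's own literals out of the result are all choices made here for illustration. Because every clause in this example is ground, "unification" reduces to checking that each query literal occurs in the knowledge base.

```python
from collections import deque

# Knowledge base K from the Frances Bellamy example, encoded as
# (predicate, arguments) pairs (an illustrative encoding).
KB = [
    ("pledge", ("f25",)),
    ("young", ("g37",)),
    ("people", ("g37",)),
    ("proud", ("g38",)),
    ("feels", ("g37", "g38")),
    ("makes", ("f25", "g37")),
    ("writes", ("r(frances & bellamy)", "f25")),
]

# Positive forms of the negated query literals in Q.
QUERY = [
    ("pledge", ("f25",)),
    ("writes", ("r(frances & bellamy)", "f25")),
]


def bind(query, kb):
    """Accumulate clauses reachable from the query through shared constants."""
    # Step 1: each negated query literal must clash with a fact in the
    # knowledge base, otherwise no inconsistency is found and nothing is bound.
    if not all(literal in kb for literal in query):
        return []

    # Step 2: seed the traversal with every constant or atom in the query.
    seen, frontier = set(), deque()
    for _, args in query:
        for arg in args:
            if arg not in seen:
                seen.add(arg)
                frontier.append(arg)

    # Step 3: keep binding until no further clause can be reached.
    bound = []
    while frontier:
        key = frontier.popleft()
        for clause in kb:
            if clause in query or clause in bound:
                continue
            if key in clause[1]:
                bound.append(clause)
                for arg in clause[1]:
                    if arg not in seen:
                        seen.add(arg)
                        frontier.append(arg)
    return bound


answer = bind(QUERY, KB)
for predicate, args in answer:
    print(f"{predicate}({','.join(args)})")
```

On this knowledge base the traversal follows f25 to makes(f25,g37), then g37 to the *young people* clauses and feels(g37,g38), and finally g38 to proud(g38), which mirrors the chain in equation (1).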

#### **4.1 Relevant answer**

A relevant answer to a particular question can be defined in general as an answer that implies all clauses of that question. Relevance for answers has been defined as unifying the skolem constant with the question. In a rule base consisting solely of skolem constants, the unification of a single skolem constant with a question would be considered a relevant answer. When rules are added, the experiment becomes more complicated. When a taxonomic relationship is represented in a rule base, a relevant answer can be defined as an interconnection of all clauses that unify and bind the same skolem constants. Table 4 depicts two examples that illustrate the skolemized clauses binding process used to extract a relevant answer.


| | Example 1 | Example 2 |
|---|---|---|
| Semantic relation rules (*K*) | cl([end(r(pony & express),g1)],[]) cl([now(g1)],[]) cl([new(f1)],[]) cl([faster(f1)],[]) cl([way(f1)],[]) cl([sents(g1,f1)],[]) cl([now(g1)],[]) | cl([two(g46)],[]) cl([book(g46)],[]) cl([own(his)],[]) cl([writes(chris,g46)],[]) cl([famous(g52)],[]) cl([be(like(tells(g46,it)),g52)],[]) |
| Proposition (*Q*) | ~ end(r(pony & express),g1) # answer(g1) | ~ two(g46) # ~ book(g46) # ~ writes(chris,g46) # answer(g46) |
| Unifying process | ~ end(r(pony & express),g1) :- end(r(pony & express),g1) | ~ two(g46) :- two(g46); ~ book(g46) :- book(g46); ~ writes(chris,g46) :- writes(chris,g46) |
| SCB key connecting | *g1* connecting: now(g1) mail(g1) sents(g1,f1) | *g46* connecting: two(g46) book(g46) be(like(tells(g46,it)),g52) |
| SCB clauses interrelating | now(g1) mail(g1) new(f1) faster(f1) way(f1) sents(g1,f1) | two(g46) book(g46) famous(g52) be(like(tells(g46,it)),g52) |

Table 4. Example of question answering process

In the first example in Table 4, g1 is taken as the skolem constant to be unified with a skolemized clause in the knowledge base, ~ end(r(pony & express),g1) :- end(r(pony & express),g1). Then g1 binds to any skolemized clauses containing the same skolem constant, and the process tracks all possible skolemized clauses in the knowledge base by binding the further skolem constant that exists, f1, until all skolem constant bindings are complete. The relevant answer consists of several clauses that are bound by g1. The output is as follows:

> sents(g1,f1). now(g1). mail(g1). new(f1). faster(f1). way(f1).

As in the first example, the second example recognises g46 as the skolem constant to be unified with skolemized clauses in the knowledge base, here involving more than one clause: ~ two(g46) :- two(g46); ~ book(g46) :- book(g46); ~ writes(chris,g46) :- writes(chris,g46). Then g46 binds to any skolemized clauses containing the same skolem constant, and the process tracks all possible skolemized clauses in the knowledge base by binding the further skolem constant that exists, g52, until all skolem constant bindings are complete. The relevant clauses are as follows:

> two(g46) book(g46) famous(g52) be(like(tells(g46,it)),g52).

Throughout this experiment, providing information in the form of pragmatic skolemized clauses is just a method of collecting the keywords for relevant answers. The problem of rendering an answer in correct English phrases can be considered another important area of research in question answering. This problem has been considered in this research, but thus far only in the form of observations rather than formal theories. It remains an area for further research.

## **5. Intelligent information access**

This topic aims to extract relevant answers, which are classified into satisfying and hypothetical answers. When the idea of an answer is expanded to include all relevant information, question answering may be viewed as a process of searching for and returning information to a questioner that takes place at different points in time. Because one of the most challenging and important processes in question answering systems is retrieving the most relevant text excerpts with regard to the question, Ofoghi et al. (2006) proposed a novel approach that exploits not only the syntax of the natural language of the questions and texts but also the semantics beneath them, via a semantic question rewriting and passage retrieval task. In our experiment, we therefore used a logical description to provide a natural representation and reasoning mechanism for answering a question, combining a resolution theorem prover with a new approach called skolemize clauses binding.

In addition, external knowledge sources are added to give a better understanding of the text and to produce descriptions of the information conveyed by the text passages. The external knowledge sources consist of two components with different roles and motivations. First, world knowledge is used to resolve the ambiguity introduced by anaphora and polysemy. Second, a hypernym matching procedure helps the system look up the meaning of superordinate words in the given question. The purpose of this component is to produce a variety of answers depending on the different ways in which a question is asked. This work has demonstrated their importance and applicability to question answering, including their relationship to the input passage in natural language. In particular, it focuses on providing a detailed formal definition of world knowledge.
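The hypernym matching component can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `HYPERNYMS` table, its entries, and the function names are hypothetical, and the chapter does not specify how its hypernym rules are stored or which lexical resource supplies them.

```python
# Hypothetical hypernym links (word -> superordinate word). A real system
# would take these from a lexical resource or from hand-written rules.
HYPERNYMS = {
    "pony": "horse",
    "horse": "animal",
    "bear": "animal",
}


def hypernym_chain(word):
    """Return the word followed by its successive superordinates."""
    chain = [word]
    while chain[-1] in HYPERNYMS:
        chain.append(HYPERNYMS[chain[-1]])
    return chain


def matches(question_word, passage_word):
    """True if the question word equals the passage word or one of its hypernyms."""
    return question_word in hypernym_chain(passage_word)


print(hypernym_chain("pony"))
print(matches("animal", "pony"))
print(matches("pony", "animal"))
```

The match is deliberately one-directional: a question phrased with the superordinate word can reach a more specific word in the passage, but not the reverse.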

Situating a query as a concept in a taxonomic hierarchy makes explicit the relationship among type of questions, and this is an important part of intelligent intelligent extraction. A

Intelligent Information Access Based on Logical Semantic Binding Method 157

The interpreter is a domain-independent embodiment of logical inference approach to generate a clauses form representation. The translation process is guided by a set of phrase structure rules of the sentence and build a tree structure of sentence. The rules mean: An S can consist of an NP followed by a VP. An NP can consist of a D followed by an N. A VP can consist of a V followed by an NP, and etc. This set of rules is called a Definite-Clause

S :- NP, VP NP :- D, N VP :- V, NP The parsing process is like left-right top-down parsers, DCG-rule parsers go into a loop when they encounter a rule of the form. Each position in the tree has labels, which may indicate procedure to be run when the traversal enters or leaves that position. The leaves of the tree will be words, which are picked out after morphological processing, or pieces of the original text passage. In the latter case, the interpreter looks up the phrase structure in the lexicon dictionary to find realization for the words that satisfied the lexical items. Below is

D( a, singular). N( animal, animals ). V( amaze, amazes, amazed, amazed, amazing, amazes ). The result is a new logical form representation of phrase structure tree, possibly with part(s) of the original text passage. In this way, the entire text passage is gradually translated into

> alive(\_36926 ^ isa(r(christopher & robin),\_36926)) & well(\_36926 ^ isa(r(christopher & robin),\_36926))

exists(\_46238,((pretty(\_46238) & home(\_46238)) & calls(\_46238,r(cotchfield & farm))) & lives(chris,\_46238)) After got a way of putting logical formula into a nice tidy form, an obvious thing to investigate was need a way of writing something in clausal form known as Pragmatic Skolemize Clauses (PragSC). PragSC form is a collection of clauses with at most one unnegated literal. The logical formula must turns out into PragSC form, to work with logical inference approach as proposed. The interpreter does some additional work in translation process, therefore, some modification to its was required. Before PragSC can be generated, it is required to generate a new unique constant symbol known as Skolem Constant using multi-parsing approach. The first parsing used to generate skolem constant, introducing two types of skolem constant to differentiate between quantified (*fn*) and ground term (*gn*) variable names. Meanwhile, the second parsing was implemented an algorithm to convert a

to be deeper than their knowledge of the source language.

Grammar (DCG) as shown below:

shown an example of lexical items.

logical form as shown below.

simplified logical formula into PragSC form.

the same in-depth knowledge to re-encode the meaning in the target language. In fact, in general, interpreters' knowledge of the target language is more important, and needs

logical technique solves a constraint satisfaction problem by the combination of two different methods. Logical reasoning applied an inference engine to extract an automatic answer. Logical technique exploits the good properties of different methods by applying them to problems they can efficiently solve. For example, search is efficient when the problem has many solutions, while an inference is efficient in proving unsatisfiability of overconstrained problems.

This logical technique is based on running search over a set of variables and inference over the other ones. In particular, backtracking or some other form of search is executed with a number of variables; whenever a consistent partial assignment over these variable is found, an inference is executed on the remaining variables to check whether this partial assignment can be extended to form a solution based on logical approach. This affects the choice of the variables evaluated by the search. Indeed, once a variable is evaluated, it can be effectively extracted from the knowledge base, restricting all constraints it is involved with in its value. Alternatively, an evaluated variable can be replaced by a skolem constant, one for each constraint, all having a single-value domain. This mixed technique is efficient if the search variables are chosen in a manner where duplicating or deleting them turns the problem into one that can be efficiently solved by inference.

## **6. Discussion and conclusion**

To appreciate fully the significance of the findings of this research, it helps to firstly understand the level of scientific rigor used to guide the formation of conclusions from the research. The experiments are considered complete when the expecting results or findings replicate across previous research and settings. Findings with a high degree of replicability are finally considered as incontrovertible findings and these form the basis for additional research. Each research study within this research domain network usually follows the most rigorous scientific procedures.

The study does not embrace any a priori theory, but represent the linguistic knowledge base into logical formalisms to build up the meaning representation and enforce syntactic and semantic agreements that include all information that are relevant to a question. In a true scientific paradigm, the study is tested in different behaviour or condition which involve two kinds of external knowledge sources. This contrasts with the usual nature of previous researches in the same domain, where none was ever tested against all four conditions as in this study. The detail of the research works and experiences are as follows:

 **Logical Interpreter Process.** The interpreter process, whether for translation or interpreting, can be described as decoding the meaning of the source text and re-encoding this meaning in the target representation. In this experiment the target representation is a simplified logical model. To decode the meaning of a text, the interpreter must first identify its component "interpreter units," that is to say, the segments of the text to be treated as a cognitive unit. An interpreter unit may be a word, a phrase or even one or more sentences. Behind this seemingly simple procedure lies a complex cognitive operation. To decode the complete meaning of the source text, the interpreter must consciously and methodically interpret and analyze all its features. This process requires thorough knowledge of the grammar, semantics, syntax, dictionary, lexicons and the like of the source language. The interpreter needs the same in-depth knowledge to re-encode the meaning in the target language. In fact, interpreters' knowledge of the target language is generally more important, and needs to be deeper, than their knowledge of the source language.

The interpreter is a domain-independent embodiment of the logical inference approach that generates a clause-form representation. The translation process is guided by a set of phrase structure rules and builds a tree structure for each sentence. The rules read as follows: an S can consist of an NP followed by a VP; an NP can consist of a D followed by an N; a VP can consist of a V followed by an NP; and so on. This set of rules is called a Definite-Clause Grammar (DCG), as shown below:

$$\begin{array}{l} \text{S :- NP, VP} \\ \text{NP :- D, N} \\ \text{VP :- V, NP} \end{array}$$

Like other left-to-right top-down parsers, DCG-rule parsers go into a loop when they encounter a left-recursive rule, that is, a rule whose right-hand side begins with its own left-hand-side category. Each position in the tree has labels, which may indicate a procedure to be run when the traversal enters or leaves that position. The leaves of the tree are words, which are picked out after morphological processing, or pieces of the original text passage. In the latter case, the interpreter looks up the phrase structure in the lexicon to find realizations for the words that satisfy the lexical items. An example of lexical items is shown below.

> D( a, singular). N( animal, animals ). V( amaze, amazes, amazed, amazed, amazing, amazes ).
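As a rough illustration of how the three phrase structure rules and a lexicon of this kind drive top-down parsing, the following Python sketch builds a tree for a short sentence. It is only a sketch under stated assumptions: the function names, the tuple-based tree format and the lexicon entries "the" and "dog" are hypothetical and were added for the demonstration, and number agreement is not checked.

```python
# Lexicon: "a", "animal(s)" and the forms of "amaze" follow the text;
# "the" and "dog" are hypothetical additions for the demo sentence.
LEXICON = {
    "D": {"a", "the"},
    "N": {"animal", "animals", "dog"},
    "V": {"amaze", "amazes", "amazed", "amazing"},
}


def word(category, tokens, i):
    """Match one lexical item of the given category at position i."""
    if i < len(tokens) and tokens[i] in LEXICON[category]:
        return (category, tokens[i]), i + 1
    return None


def sequence(label, parts, tokens, i):
    """Apply the sub-parsers in `parts` left to right and build a labelled node."""
    children = []
    for part in parts:
        result = part(tokens, i)
        if result is None:
            return None
        node, i = result
        children.append(node)
    return (label, *children), i


def d(tokens, i):
    return word("D", tokens, i)


def n(tokens, i):
    return word("N", tokens, i)


def v(tokens, i):
    return word("V", tokens, i)


def np(tokens, i):
    # NP :- D, N
    return sequence("NP", [d, n], tokens, i)


def vp(tokens, i):
    # VP :- V, NP
    return sequence("VP", [v, np], tokens, i)


def s(tokens, i):
    # S :- NP, VP
    return sequence("S", [np, vp], tokens, i)


def parse(sentence):
    """Return the parse tree, or None if the sentence is not covered."""
    tokens = sentence.lower().split()
    result = s(tokens, 0)
    if result is None or result[1] != len(tokens):
        return None
    return result[0]


tree = parse("The animal amazes the dog")
```

None of these three rules is left-recursive, so the recursive calls always consume a word before recursing and the parser terminates.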

The result is a new logical form representation of the phrase structure tree, possibly with part(s) of the original text passage. In this way, the entire text passage is gradually translated into logical form, as shown below.

> alive(\_36926 ^ isa(r(christopher & robin),\_36926)) & well(\_36926 ^ isa(r(christopher & robin),\_36926))
>
> exists(\_46238,((pretty(\_46238) & home(\_46238)) & calls(\_46238,r(cotchfield & farm))) & lives(chris,\_46238))

Once the logical formula is in a tidy form, the next step is to write it in a clausal form known as Pragmatic Skolemize Clauses (PragSC). The PragSC form is a collection of clauses with at most one unnegated literal. The logical formula must be converted into PragSC form to work with the proposed logical inference approach. Because the interpreter does some additional work in the translation process, some modification to it was required. Before PragSC can be generated, a new unique constant symbol, known as a Skolem constant, must be generated using a multi-parsing approach. The first pass generates Skolem constants, introducing two types to differentiate between quantified (*fn*) and ground-term (*gn*) variable names. The second pass implements an algorithm that converts a simplified logical formula into PragSC form.

After both passes, the two logical forms above are represented by the following PragSC clauses:

> cl([alive(g34)],[]). cl([isa(r(christopher & robin),g34)],[]). cl([well(g35)],[]). cl([isa(r(christopher & robin),g35)],[]).
>
> cl([pretty(f29)],[]). cl([home(f29)],[]). cl([calls(f29,r(cotchfield & farm))],[]). cl([lives(chris,f29)],[]).
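The two passes can be sketched in Python as follows. This is an illustrative approximation, not the authors' algorithm: the nested-tuple encoding of atoms, the names `SkolemNames`, `substitute` and `to_clauses`, the single-variable interface and the numbering that starts at 1 are all assumptions. Each output pair of lists only mirrors the shape of the `cl([...],[])` terms shown above.

```python
from itertools import count


class SkolemNames:
    """Hands out fresh constants: f1, f2, ... and g1, g2, ..."""

    def __init__(self):
        self._counters = {"f": count(1), "g": count(1)}

    def fresh(self, quantified):
        # "f" marks a quantified variable, "g" a ground-term variable.
        prefix = "f" if quantified else "g"
        return f"{prefix}{next(self._counters[prefix])}"


def substitute(term, variable, constant):
    """Replace every occurrence of `variable` in a nested-tuple term."""
    if isinstance(term, tuple):
        return tuple(substitute(t, variable, constant) for t in term)
    return constant if term == variable else term


def to_clauses(atoms, variable, quantified, names):
    """Pass 1 picks a Skolem constant; pass 2 emits one unit clause per atom."""
    constant = names.fresh(quantified)
    return [([substitute(atom, variable, constant)], []) for atom in atoms]


names = SkolemNames()
home = to_clauses(
    [("pretty", "X"), ("home", "X"), ("lives", "chris", "X")],
    "X", quantified=True, names=names,
)
alive = to_clauses([("alive", "Y")], "Y", quantified=False, names=names)
```

Keeping separate counters for the two prefixes makes it possible to tell from the name alone whether a constant came from a quantified variable or from a ground term.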

 **Identifying Inference Engine Methodology.** In this experiment, the inference procedure has to identify the type in order to generate a relevant answer. The inference procedure is a key component of the knowledge engineering process. After all preliminary information gathering and modeling are completed, queries are passed to the inference procedure to get answers. In this step, the inference procedure operates on the axioms and problem-specific facts to derive the targeted information. During this process, inference is used to seek out assumptions which, when combined with a theory, can achieve some desired goal for the system without contradicting known facts. By seeking out more and more assumptions, worlds are generated with non-contradicting knowledge.

In the inference process, the binding of skolemized clauses with their arguments is introduced into an existing theorem-prover technique. The answer literal enables a resolution refutation theorem prover to keep track of variable bindings as a proof proceeds. Resolution refutation can be thought of as the bottom-up construction of a search tree, where the leaves are the clauses produced by the knowledge base and the negation of the goal. For example, if the question asked has the logical form ∃*y P*(*x*, *y*), then a refutation proof is initiated by adding the clause {¬*P*(*x*, *y*)} to the knowledge base. When the answer literal is employed, the clause {¬*P*(*x*, *y*), *ANSWER*(*x*)} is added instead. The *x* in the answer literal *ANSWER*(*x*) will reflect any substitutions made to the *x* in ¬*P*(*x*, *y*), but the *ANSWER* predicate will not participate in (and thus will not affect) resolution. The inference process then proceeds using the skolemized-clause binding approach, which relates how one clause can be bound to others. For example, if the key of a skolemized clause (*x*) matches any skolemized clause in the knowledge base, then both clauses are unified to accumulate the relevant clauses by connecting its normalized Skolem constant or atom on the subject side or the object side of another, formulated as $x \rightarrow P(x, x_1)\ P(x_1, x_2)\ \dots\ P(x_{n-1}, x_n)\ P(x_n)$.
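The chaining idea behind the binding formula can be approximated with a short Python sketch. It is an assumption-laden illustration rather than the method itself: literals are encoded as nested tuples, the names `constants` and `bind` are hypothetical, and two literals are treated as bound whenever they share any argument constant, which is a simplification of the subject-side and object-side matching described above.

```python
from collections import deque


def constants(term):
    """Yield every argument constant in a nested-tuple literal (functors excluded)."""
    for arg in term[1:]:
        if isinstance(arg, tuple):
            yield from constants(arg)
        else:
            yield arg


def bind(key, clauses):
    """Collect the literals reachable from `key` through shared constants."""
    seen = {key}
    queue = deque([key])
    bound = []
    while queue:
        current = queue.popleft()
        for literal in clauses:
            if literal in bound or current not in constants(literal):
                continue
            bound.append(literal)
            # Every constant of a bound literal becomes a new link in the chain.
            for other in constants(literal):
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return bound


# Unit clauses modelled on the PragSC examples in the text.
KB = [
    ("alive", "g34"),
    ("isa", ("r", "christopher", "robin"), "g34"),
    ("pretty", "f29"),
    ("home", "f29"),
    ("calls", "f29", ("r", "cotchfield", "farm")),
    ("lives", "chris", "f29"),
]
chain = bind("chris", KB)
```

Starting from the key `chris`, the chain first reaches `lives(chris, f29)` and then, through the Skolem constant `f29`, the literals `pretty`, `home` and `calls`; the clauses about `g34` share no constant with the chain and are left out.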

**7**

**Knowledge Representation in a Proof Checker for Logic Programs**

Emmanouil Marakakis, Haridimos Kondylakis and Nikos Papadakis *Department of Sciences, Technological Educational Institute of Crete, Greece*

#### **1. Introduction**

Lately the need for systems that ensure the correctness of software has been increasing rapidly. Software failures can cause significant economic loss, endanger human life or cause environmental damage. Therefore, the development of systems that verify the correctness of software under all circumstances is crucial.

*Formal methods* are techniques based on mathematics which aim to make software production an engineering subject as well as to increase the quality of software. *Formal verification*, in the context of software systems, is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property, using formal methods of mathematics. *Formal program verification* is the process of formally proving that a computer program does exactly what is stated in the program specification it was written to realize. Automated techniques for producing proofs of correctness of software systems fall into two general categories: 1) *Automated theorem proving* (Loveland, 1986), in which a system attempts to produce a formal proof given a description of the system, a set of logical axioms, and a set of inference rules. 2) *Model checking*, in which a system verifies certain properties by means of an exhaustive search of all possible states that a system could enter during its execution.

Neither of these techniques works without human assistance. Automated theorem provers usually require guidance as to which properties are "interesting" enough to pursue. Model checkers can quickly get bogged down in checking millions of uninteresting states if not given a sufficiently abstract model.

*Interactive verifiers* or *proof checkers* are programs which are used to help a user in building a proof and/or finding parts of proofs. These systems provide information to the user regarding the proof in hand, and the user can then decide on the next proof step to follow. Interactive theorem provers are generally considered to support the user, acting as clerical assistants in the task of proof construction. The *interactive systems* have been more suitable for the systematic formal development of mathematics and for mechanizing formal methods (Clarke & Wing, 1996). *Proof editors* are interactive language editing systems which ensure that some degree of "semantic correctness" is maintained as the user develops the proof. The *proof checkers* are placed between the two extremes, which are the automatic theorem provers and the proof editors (Lindsay, 1988).

