**Knowledge Representation in a Proof Checker for Logic Programs**

Emmanouil Marakakis, Haridimos Kondylakis and Nikos Papadakis *Department of Sciences, Technological Educational Institute of Crete, Greece* 

## **1. Introduction**


Lately, the need for systems that ensure the correctness of software has been increasing rapidly. Software failures can cause significant economic loss, endanger human life or damage the environment. Therefore, the development of systems that verify the correctness of software under all circumstances is crucial.

*Formal methods* are techniques based on mathematics which aim to make software production an engineering subject as well as to increase the quality of software. *Formal verification*, in the context of software systems, is the act of proving or disproving the correctness of a system with respect to a certain formal specification or property, using formal methods of mathematics. *Formal program verification* is the process of formally proving that a computer program does exactly what is stated in the program specification it was written to realize. Automated techniques for producing proofs of correctness of software systems fall into two general categories: 1) *Automated theorem proving* (Loveland, 1986), in which a system attempts to produce a formal proof given a description of the system, a set of logical axioms, and a set of inference rules. 2) *Model checking*, in which a system verifies certain properties by means of an exhaustive search of all possible states that a system could enter during its execution.

Neither of these techniques works without human assistance. Automated theorem provers usually require guidance as to which properties are "interesting" enough to pursue. Model checkers can quickly get bogged down in checking millions of uninteresting states if not given a sufficiently abstract model.

*Interactive verifiers* or *proof checkers* are programs which help a user to build a proof and/or to find parts of proofs. These systems provide information to the user regarding the proof in hand, and the user then decides on the next proof step. Interactive theorem provers are generally considered to support the user, acting as clerical assistants in the task of proof construction. The *interactive systems* have been more suitable for the systematic formal development of mathematics and for mechanizing formal methods (Clarke & Wing, 1996). *Proof editors* are interactive language editing systems which ensure that some degree of "semantic correctness" is maintained as the user develops the proof. *Proof checkers* are placed between the two extremes, namely the automatic theorem provers and the proof editors (Lindsay, 1988).


In this chapter we present a *proof checker*, or *interactive verifier*, for logic programs which are constructed by a schema-based method (Marakakis, 1997), (Marakakis & Gallagher, 1994), and we focus on the knowledge representation and on its use by the core components of the system. A *meta-program* is any program which uses another program, the object program, as data. Our proof checker is a meta-program which reasons about object programs. The logic programs and the other elements of the theory represented in the Knowledge Base (KB) of our system are the object programs. The KB is the data of the proof checker, which accesses and changes it. The representation of the underlying theory (object program) in the proof checker (meta-program) is a key issue in the development of the proof checker. Our system has been implemented in SICStus Prolog and its interface has been implemented in Visual Basic (Marakakis, 2005), (Marakakis & Papadakis, 2009).

## **2. An Overview of the main components of the proof checker**

This verifier of logic programs requires a lot of interaction with the user. That is why emphasis is placed on the design of its interface. The design of the interface aims to facilitate the proof task of the user. A screenshot of the main window of our system is shown in Fig. 1.


Fig. 1. The main window of the proof-checker.

Initially, all proof decisions are taken by the programmer. This interactive verifier of logic programs consists of three distinct parts: the *interface*, the *prover* or *transformer*, and the *knowledge base (KB)*. The interface offers an environment where the user can think and decide about the proof steps that have to be applied. The user specifies each proof step and the prover performs it. A high-level design of our system is depicted in Fig. 2, which shows the main components of the proof checker with their functions. The prover of the system consists of the following two components. 1) The component "*Spec Transformer*" transforms a specification expressed in typed FOL into the structured form which is required by our correctness method (Marakakis, 1997), (Marakakis, 2005). 2) The component "*Theorem Proof Checker*" supports the proof task of the selected correctness theorem.

The "*KB Update*" subsystem allows the user to update the KB of the system through a user-friendly interface. The knowledge base (KB) and its contents are also shown in Fig. 2. The KB contains the representation of specifications, theorems, axioms, lemmas, and program complements. It also has the representation of FOL laws in order to facilitate their selection for application. These entities are represented in ground representation (Hill & Gallagher, 1998). The main benefit of this representation is that the semantics of the object program variables is kept distinct from that of the meta-variables. The user, however, would like to see theorems, axioms, lemmas and programs in a comprehensible form which is independent of their representation. The ground representation cannot be easily understood by users, and editing elements in ground representation is error-prone. For this reason, part of the interface of the system is the "*Ground-Nonground Representation Transformer*" component, which transforms an expression in ground representation into a corresponding one in the standard formalism of FOL and vice versa. The standard form of expressions helps users in the proof task and in the update of the KB.

Fig. 2. Main components of the proof-checker.
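The direction from ground representation to a readable form can be sketched in a few lines of Prolog. This is only an illustration and not the transformer of the system: the predicate names `readable/2`, `readable_list/2` and `indexed_atom/3` are hypothetical, and the sketch assumes the conventions of Section 3.3, where an object variable *xN* is represented by `v(N)` and a type variable *αN* by `tv(N)` (printed here as `alphaN`).

```prolog
% readable(+Ground, -Readable): replace ground variable terms by display atoms.
% Hypothetical sketch; v(N) and tv(N) follow the conventions of Section 3.3.
readable(v(N), X) :- integer(N), !, indexed_atom(x, N, X).
readable(tv(N), A) :- integer(N), !, indexed_atom(alpha, N, A).
readable(T, R) :-
    compound(T), !,
    T =.. [F|Args],
    readable_list(Args, RArgs),
    R =.. [F|RArgs].
readable(T, T).                      % constants and numbers are left unchanged

readable_list([], []).
readable_list([T|Ts], [R|Rs]) :-
    readable(T, R),
    readable_list(Ts, Rs).

% indexed_atom(+Prefix, +N, -Atom): for example indexed_atom(x, 1, x1).
indexed_atom(Prefix, N, Atom) :-
    number_codes(N, Codes),
    atom_codes(Suffix, Codes),
    atom_concat(Prefix, Suffix, Atom).
```

With these clauses, the query `?- readable(len(v(1):seq(tv(1))):nat, R).` binds `R` to `len(x1:seq(alpha1)):nat`. The opposite direction would need a parser for the FOL notation and is left out of the sketch.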

## **3. Knowledge representation**

*Knowledge* and *representation* are two distinct concepts. They play a central role in the development of intelligent systems. *Knowledge* is a description of the world, i.e. the problem domain. *Representation* is how knowledge is encoded. *Reasoning* is *how* to extract more information from what is explicitly represented.


Different types of knowledge require different types of representation. Different types of knowledge representation require different types of reasoning. The most popular knowledge representation methods are based on *logic*, *rules*, *frames* and *semantic nets*. Our discussion will be focused on knowledge representation based on logic.

Logic is a language for reasoning. It is concerned with the truth of statements about the world. Each statement is either "true" or "false". Logic includes the following: a) *syntax*, which specifies the symbols in the language and how they can be combined to form sentences, b) *semantics*, which specifies how to assign a truth value to a sentence based on its meaning in the world, and c) *inference rules*, which specify methods for computing new sentences from existing sentences. There are different types of logic, e.g. propositional logic, first-order predicate logic, fuzzy logic, modal logic, description logic, temporal logic, etc. We are concerned with knowledge representation and reasoning based on typed first-order predicate logic because our correctness method is based on typed FOL.

Another classification of knowledge representation is into *procedural* and *declarative* knowledge representation. *Declarative knowledge* concerns the representation of the problem domain (world) as a set of true sentences. This representation expresses "*what something is*". *Procedural knowledge*, on the other hand, concerns tasks which must be performed to reach a particular goal. In procedural representation, the control information which is necessary to use the knowledge is embedded in the knowledge itself. It focuses on "*how something is done*". In the same way, *declarative programming* is concerned with writing down "*what*" should be computed and much less with "*how*" it should be computed (Hill & Lloyd, 1994). Declarative programming separates the control component of an algorithm (the "*how*") from the logic component (the "*what*"). The key idea of declarative programming is that a program is a theory (in some suitable logic) and computation is deduction from the theory (Lloyd, 1994). The advantages of declarative programming concern: a) teaching, b) semantics, c) programmer productivity, d) meta-programming and e) parallelism. Declarative programming in Logic Programming means that programs are theories. The programmer has to supply the intended interpretation of the theory. Control is usually supplied automatically by the system, i.e. the logic programming language. We have followed the declarative approach for the representation of the knowledge base of our system.
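The separation of "*what*" from "*how*" can be seen in a standard textbook example, which is not part of the proof checker. The hypothetical predicate `app/3` below states only what list concatenation is, and the Prolog system supplies the control.

```prolog
% app(Xs, Ys, Zs): Zs is the concatenation of the lists Xs and Ys.
app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :- app(Xs, Ys, Zs).
```

Because the clauses are read as a theory, the same definition answers different questions. The query `?- app([1,2], [3], Zs).` computes `Zs = [1,2,3]`, while `?- app(Xs, Ys, [1,2]).` enumerates the three ways of splitting the list `[1,2]` into two parts.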

#### **3.1 Meta-programming, ground and non-ground representation**

A language which is used to reason about another language (or possibly itself) is called the *meta-language* and the language reasoned about is called the *object language*. A *meta-program* is a program whose data is another program, i.e. the *object program*. Our proof-checker is a meta-program which manipulates other logic programs. It has been implemented in Prolog and the underlying theory, i.e. the logic programs being verified and the other elements of the KB, is the object program. An important decision is how to represent programs of the object language (i.e. the KB elements in our case) in the programs of the meta-language, i.e. in the meta-programs. *Ground representation* and *non-ground representation* are the two main approaches to the representation of object programs in meta-programs. We have followed the ground representation approach for the representation of the elements of the KB of our system. We first discuss ground and non-ground representation and then the advantages and drawbacks of the two representations.
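The notion of a meta-program can be illustrated by the classic "vanilla" meta-interpreter. It is shown here only as a minimal sketch and it is not the proof checker described in this chapter. The predicate names `solve/1` and `app/3` are hypothetical. The object program is stored as ordinary clauses and its variables are Prolog variables, i.e. the sketch uses the non-ground representation.

```prolog
:- dynamic app/3.          % allows clause/2 to inspect the object program

% Object program.
app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :- app(Xs, Ys, Zs).

% Meta-program: solve(Goal) holds if Goal follows from the object program.
solve(true) :- !.
solve((A, B)) :- !, solve(A), solve(B).
solve(H) :- clause(H, Body), solve(Body).
```

For example, the query `?- solve(app(Xs, [3], [1,2,3])).` succeeds with `Xs = [1,2]`. The meta-program never defines unification itself; it relies on the unification of the underlying Prolog system.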


In logic programming there is no clear distinction between programs and data because data can be represented as program clauses. The semantics of a meta-program depends on the way the object program is represented in the meta-program. Normally, a distinct representation is given to each symbol of the object language in the meta-language. This is called the *naming relation* (Hill & Gallagher, 1998). Rules of construction can be used to define the representation of the constructed terms and formulas. Each expression in the language of the object program should have at least one representation as an expression in the language of the meta-program. The *naming relation* for constants, functions, propositions, predicates and connectives is straightforward. That is, constants and propositions of the object language can be represented as constants in the meta-language. Functions and predicates of the object language can be represented as functions in the language of the meta-program. A connective of the object language can be represented either as a connective or as a predicate or as a function in the meta-language. The main problem is the representation of the variables of the object language in the language of the meta-program. There are two approaches. One approach is to represent the variables of the object program as ground terms in the meta-program. This representation is called *ground representation*. The other approach is to represent the variables of the object program as variables (or non-ground terms) in the meta-program. This representation is called *non-ground representation*.
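The two approaches can be compared on a single object clause. The following facts are a hypothetical sketch: the predicate names `ng_clause/2` and `g_clause/2` and the function symbol `cons/2` are introduced only for this illustration, while the terms `v(N)` follow the convention used for object variables in this chapter.

```prolog
% Non-ground representation: object variables are Prolog (meta-level) variables.
ng_clause(app([X|Xs], Ys, [X|Zs]), [app(Xs, Ys, Zs)]).

% Ground representation: the object variables x1..x4 are named by the ground
% terms v(1)..v(4) and the list constructor is named by cons/2.
g_clause(app(cons(v(1), v(2)), v(3), cons(v(1), v(4))), [app(v(2), v(3), v(4))]).
```

The query `?- g_clause(H, B), ground(H-B).` succeeds, whereas `?- ng_clause(H, B), ground(H-B).` fails, because the non-ground clause still contains meta-level variables that any later unification in the meta-program may bind.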

With a non-ground representation of the object program it is much easier to make an efficient implementation of the meta-program than with a ground representation. In non-ground representation, there is no need to provide definitions for *renaming*, *unification* and *application* of substitutions for object language formulas. These time-consuming operations do not require special treatment for the object language terms, because the corresponding operations of the meta-language are used directly. The inefficiency of the ground representation is mainly due to the representation of the variables of the object program as ground terms in the meta-program. Because of this representation, complicated definitions for *renaming*, *unification* and *application* of substitutions to terms are required. On the other hand, there are semantic problems with non-ground representation. The meta-program will not have clear declarative semantics. There is no distinction between the variables of the object program and those of the meta-program, although they range over different domains. This problem can be solved by using a typed logic language instead of the standard first-order predicate logic. The ground representation is clearer and more expressive than the non-ground one and it can be used for many meta-programming tasks. Ground representation is suitable for meta-programs which have to reason about the computational behavior of the object program. The ground representation is required in order to perform any complex meta-programming task in a sound way. Its inherent complexity can be reduced by specialization. That is, such meta-programs can be specialized with respect to the representation of the object program (Gallagher, 1993).
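The extra work required by the ground representation can be made concrete. The sketch below gives explicit definitions of substitution application and renaming for terms whose variables are represented as `v(N)`. It is an illustration under stated assumptions, not the code of the proof checker: the predicate names `apply_subst/3`, `rename_apart/3`, `lookup/3` and the helper predicates are hypothetical, and a substitution is assumed to be a list of pairs `v(N)-Term`.

```prolog
% apply_subst(+Subst, +Term, -Instance): replace every bound v(N) in Term.
apply_subst(Subst, v(N), T) :-
    !,
    (   lookup(v(N), Subst, T0) -> T = T0 ; T = v(N) ).
apply_subst(Subst, Term, Instance) :-
    compound(Term), !,
    Term =.. [F|Args],
    apply_subst_list(Subst, Args, NewArgs),
    Instance =.. [F|NewArgs].
apply_subst(_, Term, Term).          % constants are unchanged

apply_subst_list(_, [], []).
apply_subst_list(S, [T|Ts], [I|Is]) :-
    apply_subst(S, T, I),
    apply_subst_list(S, Ts, Is).

% lookup(+Var, +Subst, -Term): first binding of Var in Subst, if any.
lookup(V, [V0-T0|Rest], T) :-
    (   V == V0 -> T = T0 ; lookup(V, Rest, T) ).

% rename_apart(+Offset, +Term, -Renamed): shift every variable index by Offset.
rename_apart(K, v(N), v(M)) :- !, M is N + K.
rename_apart(K, Term, Renamed) :-
    compound(Term), !,
    Term =.. [F|Args],
    rename_list(K, Args, NewArgs),
    Renamed =.. [F|NewArgs].
rename_apart(_, Term, Term).

rename_list(_, [], []).
rename_list(K, [T|Ts], [R|Rs]) :-
    rename_apart(K, T, R),
    rename_list(K, Ts, Rs).
```

For example, `?- apply_subst([v(1)-a, v(2)-nil], app(cons(v(1), v(2)), v(3), v(4)), I).` binds `I` to `app(cons(a, nil), v(3), v(4))`. In the non-ground representation none of these predicates is needed, since binding a Prolog variable instantiates every occurrence of it automatically.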

Another issue is how the theory of the object program is represented in the meta-program. There are again two approaches. One approach is the object program to be represented in the meta-program as program statements (i.e. clauses). In this case, the components of the object program are fixed and the meta-program is specialized for just those programs that can be constructed from these components. The other approach is the object program to be represented as a term in a goal that is executed in the meta-program. In this case the object program can be either fixed or it can be constructed dynamically. In this case the metaprogram can reason about arbitrary object programs. This is called *dynamic meta-*

Knowledge Representation in a Proof Checker for Logic Programs 167

The type variables are specified by the lower case Greek letter *a* followed by a positive integer which is the unique identifier of the variables e.g. *a1*, *a2*, *a3*, *a4* etc. Each type variable is represented in ground form by a term of the form *tv(N)* or in simplified form *tvN* where *N* stands for the unique identifier of the variable. For example, the ground representation of

Object program variables and variables in specifications are expressed using the lower case English letter *x* followed by a positive integer which is the unique identifier of the variables e.g. *x1*, *x2*, *x3*, *x4* etc. Each object variable is represented in ground form by a term of the form *v(N)* where *N* stands for the unique identifier of the variable. For example, the ground representation of the object variables *x1, x2, x3* is *v(1)* ,*v(2) and v(3)* respectively. Note that, the quantifier of each variable comes before the variable in the formula. Subscripted variables of the form *x1i* represent elements from constructed objects. They are represented by a term of the form *v(Id, i:nat):ElementType* where the first argument "*Id*" represents its unique identifier and the second one represents its subscript. "*Id*" is a natural number. This type of variables occurs mainly in specifications. A term like *v(Id, i:nat):ElementType* can be assumed as representing either a regular compound term or an element of a structured object like a sequence. The distinction is performed by checking the types of the elements *x* and *x(i)*. For example, for *i=1* by checking *x1:seq(α1)* and *x1(1:nat):α1*, it can be inferred that

A set of axioms is applied to each DT including the "*domain closure*" and the "*uniqueness*" axioms which will be also presented later on Section 3.7. Each axiom is specified by a FOL formula. Axioms are represented by the predicates "*axiom\_def\_ID/1*" and "*axiom\_def/4*" as

represents the identifiers of all axioms in the KB. Its argument "*Axiom\_Ids*" is a list with the identifiers of the axioms. For example, the representation "axiom\_def\_ID([1,2,3,4])" says

The argument "*Axiom\_Id*" is the unique identifier of the axiom, i.e. a positive integer. "*DT\_name*" is the name of the DT which the axiom is applied to. "*Axiom\_name*" is the name of axiom. "*Axiom\_specification*" is a list which has the representation of the specification of

**Example**: *Domain closure axiom for sequences*. Informally, this axiom says that a sequence can

The specification of each axiom is represented by a predicate of the following form

*"axiom\_def(Axiom\_Id, DT\_name, Axiom\_name, Axiom\_specification)".* 

type variables *a1*, *a2*, *a3* could be *tv(1)* or *tv1*, *tv(2)* or *tv2*, *tv(3)* or *tv3* respectively.

**3.3 Representation of variables** 

**3.3.2 Object program variables** 

*x1(1:nat):α1* is an element of *x1:seq(α1)*.

follows. The predicate

the axiom.

"*axiom\_def\_ID(Axiom\_Ids)"*

**3.4 Representation of axioms and lemmas** 

that the KB has four axioms with identifiers 1,2,3 and 4.

be either empty or it will consist from head and tail.

**3.3.1 Type variables** 

*programming.* The object program in our proof checker is represented as clauses. The underlying theory is fixed for each proof task.

#### **3.2 Ground representation of object programs in the proof-checker**

The KB shown in Fig. 2 contains the representation of specifications, theorems, axioms, lemmas and program completions. It also contains the representation of FOL laws in order to facilitate their selection for application. These KB elements are represented in ground representation (Hill & Gallagher, 1998). The representation of the main symbols of the object language which are used in this chapter is shown below.

| Object language symbol | Representation |
|---|---|
| constant | constant |
| object program variable | term *v(i)*, *i* is natural |
| type variable | term *tv(i)*, *i* is natural |
| function | term *g(i)*, *i* is natural |
| predicate | term *p(i)*, *i* is natural |
| proposition, formulas of FOL | term *f(i)*, *i* is natural |
| connectives (∨, ∧, ~, →, ↔) | \/, /\, ~, ->, <-> |
| exist (∃) | *ex* |
| for all (∀) | *all* |
| equality (=) | *eq* |
| inequality (≠) | *~eq* |
| less-equal (≤) | *le* |
| greater-equal (≥) | *ge* |
| operation plus (+) | *plus* |
| operation minus (-) | *minus* |
| type natural (N) | *nat* |
| type integer (Z) | *int* |
| nonzero naturals (*N1*) | *posInt* |
| type sequence | *seq* |
| empty sequence (<>) | *nil\_seq* |
| sequence constructor (*Head* :: *Tail*) | *seq\_cons*(*Head*, *Tail*) where *Head* and *Tail* are defined in ground representation accordingly |
| length of sequence *x1* (#x1) | *len(v(1):Type):nat* |
| *x1<sub>i</sub>*/*Type* (e.g. *x1<sub>1</sub>*/*α1*) | *v(1, i:nat):Type* (e.g. *v(1, 1:nat):tv(1)*) |
| operator / (*Object*/*Type*) | (*Object* : *Type*) |

Predicates are represented by their names, assuming that each predicate has a unique name. In case of name conflicts, we use the ground term *p(i)* where *i* is a natural number. The sum of *n* elements, i.e.

$$\sum_{i=1}^{n} x_i$$

is represented by the following ground term: *sum(1:nat, v(2):nat, v(3, v(4):nat):Type):Type* where "*Type*" is the type of *x<sub>i</sub>*.

#### **3.3 Representation of variables**

#### **3.3.1 Type variables**

Type variables are denoted by the lower-case Greek letter *α* followed by a positive integer which is the unique identifier of the variable, e.g. *α1*, *α2*, *α3*, *α4* etc. Each type variable is represented in ground form by a term of the form *tv(N)*, or in simplified form *tvN*, where *N* stands for the unique identifier of the variable. For example, the ground representations of the type variables *α1*, *α2*, *α3* are *tv(1)* or *tv1*, *tv(2)* or *tv2*, and *tv(3)* or *tv3* respectively.

#### **3.3.2 Object program variables**

Object program variables and variables in specifications are denoted by the lower-case English letter *x* followed by a positive integer which is the unique identifier of the variable, e.g. *x1*, *x2*, *x3*, *x4* etc. Each object variable is represented in ground form by a term of the form *v(N)* where *N* stands for the unique identifier of the variable. For example, the ground representations of the object variables *x1*, *x2*, *x3* are *v(1)*, *v(2)* and *v(3)* respectively. Note that the quantifier of each variable comes before the variable in the formula. Subscripted variables of the form *x1i* represent elements of constructed objects. They are represented by a term of the form *v(Id, i:nat):ElementType* where the first argument "*Id*" is the unique identifier of the variable, a natural number, and the second argument is its subscript. Variables of this kind occur mainly in specifications. A term like *v(Id, i:nat):ElementType* can be read either as a regular compound term or as an element of a structured object such as a sequence. The two readings are distinguished by checking the types of *x* and *x(i)*. For example, for *i=1*, by checking *x1:seq(α1)* and *x1(1:nat):α1* it can be inferred that *x1(1:nat):α1* is an element of *x1:seq(α1)*.
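The naming scheme for variables can be illustrated with a small sketch. The Python below is not part of the proof checker; it only builds, as strings, the ground terms described in Sections 3.3.1 and 3.3.2. The function names `type_var`, `object_var`, `subscripted_var` and `typed` are hypothetical helpers introduced for this illustration.

```python
def type_var(n: int, simplified: bool = False) -> str:
    # Type variable with identifier n: tv(n), or tvn in the simplified form.
    if n < 1:
        raise ValueError("identifiers are positive integers")
    return f"tv{n}" if simplified else f"tv({n})"


def object_var(n: int) -> str:
    # Object program variable with identifier n: v(n).
    if n < 1:
        raise ValueError("identifiers are positive integers")
    return f"v({n})"


def subscripted_var(ident: int, subscript: str, element_type: str) -> str:
    # Subscripted variable: v(Id, i:nat):ElementType.
    if ident < 1:
        raise ValueError("identifiers are positive integers")
    return f"v({ident}, {subscript}:nat):{element_type}"


def typed(term: str, type_term: str) -> str:
    # Attach a type to a ground term, i.e. Object : Type.
    return f"{term}:{type_term}"


print(typed(object_var(1), f"seq({type_var(1)})"))  # v(1):seq(tv(1))
print(subscripted_var(1, "1", type_var(1)))         # v(1, 1:nat):tv(1)
```

The two printed terms correspond to *x1:seq(α1)* and its first element, which is the pair of terms whose types are compared when deciding whether a subscripted term denotes an element of a sequence.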

#### **3.4 Representation of axioms and lemmas**

A set of axioms applies to each DT, including the "*domain closure*" and the "*uniqueness*" axioms, which are also presented in Section 3.7. Each axiom is specified by a FOL formula. Axioms are represented by the predicates "*axiom\_def\_ID/1*" and "*axiom\_def/4*" as follows. The predicate

"*axiom\_def\_ID(Axiom\_Ids)"*

represents the identifiers of all axioms in the KB. Its argument "*Axiom\_Ids*" is a list with the identifiers of the axioms. For example, the representation "axiom\_def\_ID([1,2,3,4])" says that the KB has four axioms with identifiers 1,2,3 and 4.

The specification of each axiom is represented by a predicate of the following form

*"axiom\_def(Axiom\_Id, DT\_name, Axiom\_name, Axiom\_specification)".* 

The argument "*Axiom\_Id*" is the unique identifier of the axiom, i.e. a positive integer. "*DT\_name*" is the name of the DT to which the axiom applies. "*Axiom\_name*" is the name of the axiom. "*Axiom\_specification*" is a list which has the representation of the specification of the axiom.

**Example**: *Domain closure axiom for sequences*. Informally, this axiom says that a sequence is either empty or consists of a head and a tail.


**Specification**:

*[∀x1/seq(α2), [x1 = < > ∨ (∃x3/α2, ∃x4/seq(α2), [x1 = x3::x4])]]*

#### **Representation**:

*axiom\_def (1, sequences, 'domain closure', [all v(1):seq(tv(1)), (eq(v(1):seq(tv(1)),nil\_seq) \/ [ex v(2):tv(1), ex v(3):seq(tv(1)), eq(v(1):seq(tv(1)),seq\_cons(v(2):tv(1), v(3):seq(tv(1))):seq(tv(1))) ])]).* 

Similarly, lemmas are represented by the predicates "*lemma\_sp\_ID/1*" and "*lemma\_sp/4*".
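To make the role of the two predicates concrete, the following Python sketch models a KB holding the single axiom shown above. It is an illustration under stated assumptions, not the authors' implementation: the names `AxiomDef`, `AXIOM_IDS`, `AXIOM_DEFS`, `axioms_for` and `ids_are_consistent` are hypothetical, and the specification is kept as plain text rather than as a parsed term.

```python
from typing import List, NamedTuple


class AxiomDef(NamedTuple):
    # Mirrors axiom_def(Axiom_Id, DT_name, Axiom_name, Axiom_specification).
    axiom_id: int
    dt_name: str
    axiom_name: str
    specification: str  # ground formula, stored as text in this sketch


# Plays the role of axiom_def_ID/1: the identifiers of all axioms in the KB.
AXIOM_IDS: List[int] = [1]

AXIOM_DEFS: List[AxiomDef] = [
    AxiomDef(
        1,
        "sequences",
        "domain closure",
        r"[all v(1):seq(tv(1)), (eq(v(1):seq(tv(1)),nil_seq) \/ "
        r"[ex v(2):tv(1), ex v(3):seq(tv(1)), eq(v(1):seq(tv(1)),"
        r"seq_cons(v(2):tv(1), v(3):seq(tv(1))):seq(tv(1))) ])]",
    ),
]


def axioms_for(dt_name: str) -> List[AxiomDef]:
    # Select the axioms that apply to a given DT.
    return [a for a in AXIOM_DEFS if a.dt_name == dt_name]


def ids_are_consistent() -> bool:
    # Every identifier listed in AXIOM_IDS has a definition, and vice versa.
    return sorted(AXIOM_IDS) == sorted(a.axiom_id for a in AXIOM_DEFS)


print([a.axiom_name for a in axioms_for("sequences")])
print(ids_are_consistent())
```

Keeping the identifier list separate from the definitions, as the two predicates do, lets a tool enumerate the available axioms without inspecting each specification.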

#### **3.5 Representation of first-order logic laws**

The FOL laws are equivalence preserving transformation rules. Each FOL law is specified by a FOL formula. They are represented by predicates *fol\_law\_ID/1* and *fol\_law/3* as follows. The predicate

*"fol\_law\_ID(FOL\_laws\_Ids)"* 

represents the identifiers of all FOL laws in the KB. Its argument "*FOL\_laws\_Ids*" is a list with the identifiers of all FOL laws. The specification of each FOL law is represented by a predicate of the form

 *"fol\_law(FOL\_law\_Id, FOL\_law\_description, FOL\_law\_specification).".* 

The argument "*FOL\_law\_Id*" is the unique identifier of the FOL law, i.e. a positive integer. "*FOL\_law\_description*" is the name of the FOL law. "*FOL\_law\_specification*" is a list which has the ground representation of the specification of the FOL law.

**Example**: (∧ distribution)

**Specification:** 

*P ∧ (Q ∨ R) ↔ (P ∧ Q) ∨ (P ∧ R)* 

**Representation**:

*fol\_law(2,'∧ distribution', [f1 /\ (f2 \/ f3) <-> (f1 /\ f2) \/ (f1 /\ f3)]).* 
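Because FOL laws are equivalence preserving, applying one to a formula must not change its truth value under any assignment. The Python sketch below illustrates this for the ∧ distribution law on propositional formulas encoded as nested tuples. The encoding and the names `distribute_and`, `evaluate`, `atoms` and `equivalent` are assumptions made for this illustration and are not taken from the proof checker.

```python
from itertools import product

# A formula is a proposition name such as "f1", or a tuple
# ("and", left, right) / ("or", left, right).


def distribute_and(formula):
    # Rewrite P and (Q or R) into (P and Q) or (P and R) at the root.
    # Any other formula is returned unchanged.
    if isinstance(formula, tuple) and formula[0] == "and":
        _, p, right = formula
        if isinstance(right, tuple) and right[0] == "or":
            _, q, r = right
            return ("or", ("and", p, q), ("and", p, r))
    return formula


def evaluate(formula, env):
    # Truth value of a formula under an assignment of its propositions.
    if isinstance(formula, str):
        return env[formula]
    op, left, right = formula
    if op == "and":
        return evaluate(left, env) and evaluate(right, env)
    if op == "or":
        return evaluate(left, env) or evaluate(right, env)
    raise ValueError(f"unknown connective: {op}")


def atoms(formula):
    # Set of proposition names occurring in a formula.
    if isinstance(formula, str):
        return {formula}
    _, left, right = formula
    return atoms(left) | atoms(right)


def equivalent(f, g):
    # Compare the two formulas on every truth assignment.
    names = sorted(atoms(f) | atoms(g))
    return all(
        evaluate(f, dict(zip(names, values))) == evaluate(g, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )


lhs = ("and", "f1", ("or", "f2", "f3"))
rhs = distribute_and(lhs)
print(rhs)
print(equivalent(lhs, rhs))
```

The truth-table check is only feasible for propositional formulas; it is used here to show what "equivalence preserving" means, not as a proof method for FOL.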

#### **3.6 Representation of theorems**

#### **3.6.1 Initial theorem**

The theorems that have to be proved must also be represented in the KB. Each theorem is specified by a FOL formula. They are represented by the predicates "*theorem\_ID/1*" and "*theorem/4*" as follows. The predicate

*"theorem\_ID(Theorems\_Ids)"* 

represents the identifiers of all theorems that are available in the KB. Its argument "*Theorems\_Ids*" is a list with the identifiers of all theorems. The specification of each theorem is represented by a predicate of the form

*"theorem(Theorem\_Id, Program\_Id, Spec\_struct\_Id, Theorem\_specification)."* 

The argument "*Theorem\_Id*" is the unique identifier of the theorem, i.e. a positive integer. The arguments "*Program\_Id*" and "*Spec\_struct\_Id*" are the unique identifiers of the program and of the structured specification respectively. "*Theorem\_specification*" is a list which has the representation of the specification of the theorem.

Example: The predicate *sum(x1, x2)*, where *Type(sum) = seq(Z) × Z*, is true iff *x2* is the sum of the sequence of integers *x1*. The correctness theorem for predicate *sum/2* and the theory which is used to prove it are as follows.

*Comp(Pr) ∪ Spec ∪ A |= ∀x1/seq(Z), ∀x2/Z (sum(x1,x2) ↔ sumS(x1, x2))* 

*Pr* is the logic program for predicate *sum*/2, excluding the DT definitions. *Comp(Pr)* is the completion of the program *Pr*. *Spec* is the specification of predicate *sum*/2, i.e. *sumS(x1,x2)*. *A* is the theory for sequences, i.e. the underlying DTs for predicate *sum*/2, including the specifications of the DT operations.

#### **Theorem Specification**:

*∀x1/seq(Z), ∀x2/Z (sum(x1,x2) ↔ sumS(x1,x2))* 

#### **Representation**:


> *theorem(1, progr1, spec\_struct1, [all v(1):seq(int),all v(2):int, (sum(v(1):seq(int), v(2):int):int <-> sum\_s(v(1):seq(int), v(2):int))]).*
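A ground term of this shape can be assembled mechanically from its parts. The following Python sketch shows one way to do so with string builders; the helper names `var`, `atom`, `iff`, `forall` and `theorem_fact` are hypothetical. The sketch leaves the *sum* atom untyped, so its output differs from the listing above in that detail and in spacing.

```python
def var(n: int, type_term: str) -> str:
    # Typed object variable, e.g. v(1):seq(int).
    return f"v({n}):{type_term}"


def atom(predicate: str, *args: str) -> str:
    # Atomic formula, e.g. sum(v(1):seq(int), v(2):int).
    return f"{predicate}({', '.join(args)})"


def iff(left: str, right: str) -> str:
    # Equivalence written with the <-> connective.
    return f"{left} <-> {right}"


def forall(variables, body: str) -> str:
    # Universally quantified formula as a list: [all V1, all V2, (Body)].
    quantifiers = ", ".join(f"all {v}" for v in variables)
    return f"[{quantifiers}, ({body})]"


def theorem_fact(theorem_id: int, program_id: str, spec_id: str, spec: str) -> str:
    # theorem(Theorem_Id, Program_Id, Spec_struct_Id, Theorem_specification).
    return f"theorem({theorem_id}, {program_id}, {spec_id}, {spec})."


x1 = var(1, "seq(int)")
x2 = var(2, "int")
spec = forall([x1, x2], iff(atom("sum", x1, x2), atom("sum_s", x1, x2)))
print(theorem_fact(1, "progr1", "spec_struct1", spec))
```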

#### **3.6.2 Theorems in structured form**

The specification of a theorem may need to be transformed into structured form before the proof can proceed. The structured form of theorems facilitates the proof task. Each theorem in structured form is specified by a FOL formula. They are represented by the predicates "*theorem\_struct\_ID/1*" and "*theorem\_struct/4*" as follows. The predicate

*"theorem\_struct\_ID(Theorems\_Ids)"* 

represents the identifiers of all theorems available in the KB with the specification part in structured form. Its argument "*Theorems\_Ids*" is a list with the identifiers of all theorems in structured form. The specification of each theorem is represented by a predicate of the form

*"theorem\_struct(Theorem\_struct\_Id,Program\_Id,Spec\_struct\_Id, Theorem\_specification).".* 

The argument "*Theorem\_struct\_Id*" is the unique identifier, i.e. a positive integer, of the theorem whose specification part is in structured form. The arguments "*Program\_Id*" and "*Spec\_struct\_Id*" are the unique identifiers of the program and of the structured specification respectively. "*Theorem\_specification*" is a list which has the representation of the specification of the theorem.

#### **Example**

In order to construct the theorem in structured form, the predicate specification must first be transformed into structured form. The initial logic specification for predicate *sum*/2 is the following.

$$\forall \mathit{x1}/\mathit{seq}(Z),\ \forall \mathit{x2}/Z\ \Big(\mathit{sum}^{S}(\mathit{x1}, \mathit{x2}) \leftrightarrow \mathit{x2} = \sum_{i=1}^{\#\mathit{x1}} \mathit{x1}_i\Big)$$

The logic specification of *sumS(x1,x2)* in structured form and its representation are as follows.

#### **Theorem Specification**:

*∀x1/seq(Z), ∀x2/Z (sumS(x1,x2) ↔ [x1 = < > ∧ x2 = 0 ∨ [∃x4/Z, ∃x5/Z, ∃x6/seq(Z), [x1 = x5::x6 ∧ x2 = x5 + x4 ∧ sumS(x6,x4)]]])* 

#### **Representation**:

```
theorem_struct(1, progr1, spec_struct1,
 [all v(1):seq(int), all v(2):int, (sum(v(1):seq(int), v(2):int) <-> 
 ((eq(v(1):seq(int),nil_seq:seq(tv(1))) /\eq(v(2):int,0:int)) \/ [ex v(3):int, 
 ex v(4):int, ex v(5):seq(int), (eq(v(1):seq(int), seq_cons(v(4):int, 
 v(5):seq(int)):seq(int) ) /\ eq(v(2):int,plus(v(4):int, v(3):int)) /\ 
 sum_s(v(5):seq(int), v(3):int) )]) )]).
```
#### **3.7 An example theory and theorem**

Throughout this chapter we use the correctness theorem and theory for predicate *sum/2*, that is,

*Comp(Pr) ∪ Spec ∪ A |= ∀x1/seq(Z), ∀x2/Z (sum(x1,x2) ↔ sumS(x1, x2))* 

The ground representation of the theory is also illustrated.

#### **Theory**

The logic program completion *Comp(Pr)* of *Pr* is as follows.

```
∀x1/seq(Z), ∀x2/Z, [sum(x1,x2) ↔ (p1(x1) ∧ p2(x1,x2) ∨ [∃x3/Z, ∃x4/seq(Z),
 ∃x5/Z, [~p1(x1) ∧ p3(x1,x3,x4) ∧ p4(x1,x3,x5,x2) ∧ sum(x4,x5)]])]
∀x1/seq(Z), [p1(x1) ↔ empty_seq(x1)]
∀x1/seq(Z), ∀x2/Z, [p2(x1,x2) ↔ neutral_add_subtr_int(x2)]
∀x1/seq(Z), ∀x2/Z, ∀x3/seq(Z), [p3(x1,x2,x3) ↔ p5(x1,x2,x3) ∧ p6(x1,x2,x3)]
∀x1/seq(Z), ∀x2/Z, ∀x3/seq(Z), [p5(x1,x2,x3) ↔ head(x1,x2)]
∀x1/seq(Z), ∀x2/Z, ∀x3/seq(Z), [p6(x1,x2,x3) ↔ tail(x1,x3)]
∀x1/seq(Z), ∀x2/Z, ∀x3/Z, ∀x4/Z, [p4(x1,x2,x3,x4) ↔ plus_int(x3,x2,x4)]
```
#### **Representation**:

*progr\_clause(progr1, 1,[all v(1):seq(int), all v(2):int, [sum(v(1):seq(int),v(2):int) <-> ((p1(v(1):seq(int)) /\p2(v(1):seq(int), v(2):int)) \/ [ex v(3):int, ex v(4):seq(int), ex v(5):int, [~p1(v(1):seq(int)) /\ p3(v(1):seq(int), v(3):int,v(4):seq(int)) /\ p4(v(1):seq(int),v(3):int, v(5):int, v(2):int) /\ sum(v(4):seq(int),v(5):int)]])]]).* 

*progr\_clause(progr1, 2, [all v(6):seq(int),[p1(v(6):seq(int)) <-> empty\_seq(v(6):seq(int))]]).* 

*progr\_clause(progr1, 3, [all v(7):seq(int), all v(8):int,[p2(v(7):seq(int), v(8):int) <-> neutral\_add\_subtr\_int(v(8):int)]]).* 

*progr\_clause(progr1, 4, [all v(9):seq(int), all v(10):int, all v(11):seq(int), [p3(v(9):seq(int), v(10):int, v(11):seq(int))<-> p5(v(9):seq(int), v(10):int, v(11):seq(int)) /\ p6(v(9):seq(int), v(10):int, v(11):seq(int))]]).* 

*progr\_clause(progr1, 5, [all v(12):seq(int), all v(13):int, all v(14):seq(int),[p5(v(12):seq(int), v(13):int, v(14):seq(int)) <-> head(v(12):seq(int), v(13):int)]]).* 

*progr\_clause(progr1, 6, [all v(15):seq(int), all v(17):int, all v(16):seq(int),[p6(v(15):seq(int), v(17):int, v(16):seq(int))<-> tail(v(15):seq(int), v(16):seq(int))]]).* 

*progr\_clause(progr1, 7, [all v(18):seq(int), all v(19):int, all v(20):int, all v(21):int,[p4(v(18):seq(int), v(19):int, v(20):int, v(21):int) <-> plus\_int(v(20):int, v(19):int, v(21):int)]]).* 

The logic specification (*Spec*) and its ground representation are shown in Section 3.6.

The theory *A* of the DT operations, including the specification of the DT operations, is as follows.

*Axioms* 

Domain closure axiom for sequences

*∀x1/seq(α2), [x1 = < > ∨ (∃x3/α2, ∃x4/seq(α2), [x1 = x3::x4])]* 

Its ground representation is shown in Section 3.4.

Uniqueness axioms for sequences

*i*

*∀x1/α2, ∀x3/seq(α2), [~[x1::x3/α2 = < >]]* 

**Representation**:

*axiom\_def(2, sequences, "uniqueness i", [all v(1):tv(1), all v(2):seq(tv(1)), [~[eq(seq\_cons(v(1):tv(1), v(2):seq(tv(1))):tv(1),nil\_seq:seq(tv(1)))]]]).* 

*ii*

*∀x1/α2, ∀x3/α2, ∀x4/seq(α2), ∀x5/seq(α2), [x1::x4 = x3::x5 → x1 = x3 ∧ x4 = x5]* 

**Representation**:

*axiom\_def(3, sequences, "uniqueness ii", [all v(1):tv(1), all v(2):tv(1), all v(3):seq(tv(1)), all v(4):seq(tv(1)),[eq(seq\_cons(v(1):tv(1), v(3):seq(tv(1))):seq(tv(1)), seq\_cons(v(2):tv(1), v(4):seq(tv(1))):seq(tv(1))) -> (eq(v(1):tv(1), v(2):tv(1)) /\ eq(v(3):seq(tv(1)), v(4):seq(tv(1))))]]).* 

*Lemmas* 

Definition of summation operation over 0 entities

*∀x1/seq(Z), [x1 = < > → Σ((i=1 to #x1) x1i) = 0]* 

**Representation**:

*axiom\_def(4, sequences, "summation over 0 entities", [all v(1):seq(int), [eq(v(1):seq(int),nil\_seq) -> eq(sum(1:int,len(v(1):seq(int)):int, v(1,v(3):nat):int), 0:int)]]).* 
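The correctness statement for *sum/2* can be spot-checked on concrete sequences before a proof is attempted. The Python sketch below is not part of the proof checker. It follows the two branches of *Comp(Pr)* and compares the result with the specification *x2 = Σ x1i*. It assumes that *neutral\_add\_subtr\_int(x2)* holds exactly when *x2 = 0* and that *plus\_int(a, b, c)* means *c = a + b*; the function names `sum_program` and `sum_spec` are hypothetical.

```python
def sum_program(x1):
    # Operational reading of Comp(Pr) for sum(x1, x2): returns x2.
    if not x1:
        # Branch p1(x1) /\ p2(x1, x2): empty sequence, x2 is the
        # neutral element of addition (assumed to be 0).
        return 0
    # Branch ~p1(x1) /\ p3(x1, x3, x4): x3 is the head, x4 is the tail.
    x3, x4 = x1[0], x1[1:]
    # Recursive call sum(x4, x5).
    x5 = sum_program(x4)
    # p4(x1, x3, x5, x2) <-> plus_int(x5, x3, x2), assumed to mean x2 = x5 + x3.
    return x5 + x3


def sum_spec(x1):
    # Specification: x2 is the sum of x1i for i = 1 .. #x1.
    return sum(x1[i - 1] for i in range(1, len(x1) + 1))


for seq in ([], [7], [1, 2, 3], [-4, 4, 10]):
    assert sum_program(seq) == sum_spec(seq)
print(sum_program([1, 2, 3]))
```

Agreement on a few test sequences does not establish the theorem; it only catches gross mismatches between program and specification. The empty-sequence case corresponds to the lemma on summation over 0 entities.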

