**Meet the editor**

Prof. Elmer P. Dadios earned his doctoral degree at Loughborough University, United Kingdom, in 1996. He is a recipient of the Philippines Department of Science and Technology (DOST) 50 Men and Women of Science and Technology award; the DOST Scholar Achievers award; the National Research Council of the Philippines Basic Research Achievement Award; the National Academy of Science and Technology (NAST) Outstanding Scientific Paper Award; and the De La Salle University Miguel Febres Cordero Research Award. Currently, Dr. Dadios is a University Fellow of De La Salle University and holds the University's highest faculty rank of Full Professor 10. He is the president of NEURONEMECH Inc.

He has been a consultant on robotics and automation for the Philippine government and private corporations. He is the founder of the IEEE Computational Intelligence Society - Philippines Chapter, and the founder and president of the Mechatronics and Robotics Society of the Philippines.

Contents

**Preface IX**

**Part 1 Hybrid Fuzzy Logic Algorithms 1**

Chapter 1 **Ambiguity and Social Judgment: Fuzzy Set Model and Data Analysis 3**
Kazuhisa Takemura

Chapter 2 **From Fuzzy Datalog to Multivalued Knowledge-Base 25**
Agnes Achs

Chapter 3 **Resolution Principle and Fuzzy Logic 55**
Hashim Habiballa

Chapter 4 **Standard Fuzzy Sets and some Many-Valued Logics 75**
Jorma K. Mattila

Chapter 5 **Parametric Type-2 Fuzzy Logic Systems 97**
Arturo Tellez, Heron Molina, Luis Villa, Elsa Rubio and Ildar Batyrshin

Chapter 6 **Application of Adaptive Neuro Fuzzy Inference System in Supply Chain Management Evaluation 115**
Thoedtida Thipparat

Chapter 7 **Fuzzy Image Segmentation Algorithms in Wavelet Domain 127**
Heydy Castillejos and Volodymyr Ponomaryov

**Part 2 Techniques and Implementation 147**

Chapter 8 **Fuzzy Logic Approach for QoS Routing Analysis 149**
Adrian Shehu and Arianit Maraj

### Preface

*Algorithm* is used to define the notion of decidability. An algorithm is a set of rules that precisely defines a sequence of operations, which is essential for computers to process information. Computer programs contain algorithms that detail the specific instructions a computer should perform to carry out a specified task. A traditional computer program performs specific instructions sequentially and uses crisp values of information, which do not support uncertainty. Thus, as problems become harder and more complex, alternative algorithms are required in order to obtain accurate solutions. To this date, the quest to discover new algorithms continues, and the fuzzy logic algorithm is one of the strongest contenders because fuzzy logic exhibits reasoning power similar to human reasoning. Fuzzy logic is able to process incomplete data and provide approximate solutions to problems other methods find difficult to solve. Fuzzy logic was first proposed by Lotfi A. Zadeh of the University of California at Berkeley in 1965. It is based on the idea that humans do not think in terms of crisp numbers, but rather in terms of concepts. The degree of membership of an object in a concept may be partial, with an object being partially related to many concepts. By characterizing the idea of partial membership in concepts, fuzzy logic is better able to convert natural language control strategies into a form usable by machines.

This book presents Algorithms, Techniques, and Implementations of fuzzy logic. It is organized into two sections.


In section one, there are seven chapters that focus on hybrid fuzzy logic algorithms and methodology:

	- Application of Adaptive Neuro Fuzzy Inference System in Supply Chain Management Evaluation
	- Fuzzy Image Segmentation Algorithms in Wavelet Domain

In section two, there are seven chapters that focus on fuzzy logic modeling and implementations.


The contributions to this book clearly present fuzzy logic models, techniques, and implementations that are very important for the development of new technologies. I hope the readers of this book will find it a unique and significant source of knowledge and reference for years to come.

> **Elmer P. Dadios**  University Fellow and Full Professor, Department of Manufacturing Engineering and Management, De La Salle University, Philippines

## **Part 1**

**Hybrid Fuzzy Logic Algorithms**


## **Ambiguity and Social Judgment: Fuzzy Set Model and Data Analysis**

Kazuhisa Takemura *Waseda University, Japan* 

### **1. Introduction**

Comparative judgment is essential in human social lives. Comparative judgment is a type of human judgment procedure in which the evaluator is asked which alternative is preferred (e.g., "Do you prefer Brand A to Brand B?" or "How do you estimate the probability of choosing Brand A over Brand B when you compare the two brands?"). This type of judgment is distinguished from absolute judgment, in which the evaluator is asked to assess the attractiveness of an object (e.g., "How much do you like this brand on a scale of 0 to 100?").

The ambiguity of social judgment has been conceptualized by fuzzy set theory, which provides a formal framework for the representation of ambiguity. Fuzzy sets were defined by Zadeh (1965), who also outlined how they could be used to characterize complex systems and decision processes (Zadeh, 1973). Zadeh argues that the capacity of humans to manipulate fuzzy concepts should be viewed as a major asset, not a liability. The complexities of the real world often defy precise measurement; fuzzy logic defines such concepts, and its techniques provide a mathematical method able to deal with thought processes that are often too imprecise and ambiguous to handle with classical mathematical techniques.

This chapter introduces a model of ambiguous comparative judgment (Takemura, 2007), provides a method of data analysis for the model, and then shows some examples of the data analysis of social judgments. Comparative judgments in social situations often involve ambiguity with regard to confidence, and people may be unable to make judgments without some confidence intervals. To measure the ambiguity (or vagueness) of human judgment, the fuzzy rating method has been proposed and developed (Hesketh, Pryor, Gleitzman, & Hesketh, 1988). In fuzzy rating, respondents select a representative rating point on a scale and indicate higher or lower rating points, depending on the relative ambiguity of their judgment. For example, fuzzy rating would be useful for perceived temperature, with the evaluator indicating a representative value and lower and upper values. This rating scale allows for asymmetries and overcomes the problem, identified by Smithson (1987), of researchers arbitrarily deciding the most representative value from a range of scores. By making certain simplifying assumptions (which is not uncommon in fuzzy set theory), the rating can be viewed as an L-R fuzzy number, thereby making the use of fuzzy set


theoretical operations possible (Hesketh et al., 1988; Takemura, 2000). Lastly, numerical illustrations of psychological experiments are provided to examine the ambiguous comparative judgment model (Takemura, 2007) using the proposed data analysis.

### **2. Model of ambiguous comparative judgment**

### **2.1 Overview of ambiguous comparative judgment and the judgment model**

Social psychological theory and research have demonstrated that comparative evaluation has a crucial role in the cognitive processes and structures that underlie people's judgments, decisions, and behaviors (e.g., Mussweiler, 2003). Comparison processes are almost ubiquitous in human social cognition. For example, people tend to compare their performance with that of others in situations that are ambiguous (Festinger, 1954). It is also obvious that such comparisons are critical in forming personal evaluations and making purchase decisions (Kühberger, Schulte-Mecklenbeck, & Ranyard, 2011; Takemura, 2011).

Ambiguity or vagueness is inherent in people's comparative social judgment. Traditionally, psychological and philosophical theories implicitly assumed the ambiguity of thought processes (Smithson, 1987, 1989). For example, Wittgenstein (1953) pointed out that lay categories were better characterized by a "family resemblance" model, which assumed vague boundaries of concepts, rather than by a classical set-theoretic model. Rosch (1975) and Rosch & Mervis (1975) also suggested the vagueness of lay categories in the prototype model and reinterpreted the family resemblance model. Moreover, the social judgment theory (Sherif & Hovland, 1961) and the information integration theory (Anderson, 1988) for describing judgment and decision making assumed that people evaluate objects using natural languages, which are inherently ambiguous. However, psychological theories did not explicitly treat the ambiguity in social judgment, with the exception of using random error of judgment.

Takemura (2007) proposed fuzzy set models that explain ambiguous comparative judgment in social situations. Because ambiguous comparative judgment may not always satisfy the transitivity and comparability properties, the models assume parameters based on biased responses that need not satisfy these properties. The models consist of two types of fuzzy set components for ambiguous comparative judgment. The first is a fuzzy theoretical extension of the additive difference model for preference, which is used to explain ambiguous preference strength and does not always assume judgment scale boundaries, such as a willingness-to-pay (WTP) measure. The second is a fuzzy logistic model of the additive difference preference, which is used to explain ambiguous preference in which preference strength is bounded, such as a probability measure (e.g., a certain interval within a bounded interval from 0 to 100%).

Because judgment on a bounded scale, such as a probability judgment, causes a methodological problem when fuzzy linear regression is used, a fuzzy logistic function was proposed to prevent this problem. In both models, the multi-attribute weighting parameters and all attribute values are assumed to be asymmetric fuzzy L-R numbers. For each model, a method of parameter estimation using fuzzy regression analysis was proposed. That is, a fuzzy linear regression model using the least squares method (Takemura, 1999, 2005) was applied for the analysis of the former model, and a fuzzy logistic regression model (Takemura, 2004) was proposed for the analysis of the latter model.

### **2.2 Assumptions of the model**


### **2.2.1 Definition 1: Set of multidimensional alternatives**

Let X = *X1* × *X2* × … × *Xn* be a set of multidimensional alternatives with elements of the form *X1* = (*X11, X12, …, X1n*), *X2* = (*X21, X22, …, X2n*), …, *Xm* = (*Xm1, Xm2, …, Xmn*), where *Xij* (*i* = 1, …, *m*; *j* = 1, …, *n*) is the value of alternative *Xi* on dimension *j*. Note that the components of *Xi* may be ambiguous linguistic variables rather than crisp numbers.

### **2.2.2 Definition 2: Classic preference relation**

Let ≻ be a binary relation on X; that is, ≻ is a subset of X × X.

The relational structure < X, ≻ > is a weak order if, and only if, for all *Xa, Xb, Xc*, the following two axioms are satisfied.

1. Connectedness (comparability): *Xa* ≻ *Xb* or *Xb* ≻ *Xa*,
2. Transitivity: if *Xa* ≻ *Xb* and *Xb* ≻ *Xc*, then *Xa* ≻ *Xc*.
However, the weak order relation is not always assumed in this paper. That is, transitivity or connectedness may be violated in the preference relations.

### **2.2.3 Definition 3: Fuzzy preference relation**

A classical preference relation ≻, being a subset of X × X, is a classical set often viewed as a characteristic function *c* from X × X to {0,1} such that:

$$c(X_a \succ X_b) = \begin{cases} 1 & \text{iff } X_a \succ X_b \\ 0 & \text{iff not}(X_a \succ X_b) \end{cases}$$

Note that "iff" is short for "if and only if" and {0,1} is called the valuation set. If the valuation set is allowed to be the real interval [0,1], ≻ is called a fuzzy preference relation. That is, the membership function *µa* is defined as:

$$\mu_a : X \times X \to [0,1].$$
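The contrast between the crisp characteristic function *c* and a fuzzy preference relation *µa* can be sketched as follows; this is a minimal illustration, and the alternatives and preference values are hypothetical, not taken from the chapter:

```python
# Sketch: crisp vs. fuzzy preference relations over a set of alternatives.
# All pairwise values below are hypothetical illustrations.

# Crisp relation: c(Xa > Xb) is either 0 or 1.
crisp_pref = {("A", "B"): 1, ("B", "A"): 0}

# Fuzzy relation: mu_a(Xa, Xb) may take any value in [0, 1],
# expressing partial strength of the preference Xa > Xb.
fuzzy_pref = {("A", "B"): 0.8, ("B", "A"): 0.2}

def c(x, y):
    """Characteristic function of the crisp preference relation."""
    return crisp_pref.get((x, y), 0)

def mu(x, y):
    """Membership function mu_a: X x X -> [0, 1]."""
    return fuzzy_pref.get((x, y), 0.0)

print(c("A", "B"))   # crisp: 1
print(mu("A", "B"))  # fuzzy: 0.8
```

The fuzzy relation carries strictly more information: the crisp relation records only which brand wins, while the fuzzy one records how strongly.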

### **2.2.4 Definition 4: Ambiguous preference relation**

Ambiguous preference relations are defined as a fuzzy set of *X* × *X* × *S*, where *S* is a subset of one-dimensional real number space. *S* is interpreted as a domain of preference strength. *S* may be bounded, for example, *S* = [0,1]. The membership function *µβ* is defined as:

$$\mu_\beta : X \times X \times S \to [0,1].$$


An ambiguous preference relation is interpreted as a fuzzified version of the classical characteristic function *c*(*Xa* ≻ *Xb*).

Therefore, the ambiguous preference relation for *Xa* ≻ *Xb* is represented as the fuzzy set *v*(*Xa* ≻ *Xb*). For simplicity, *v*(*Xa* ≻ *Xb*) will be assumed to be an asymmetrical L-R fuzzy number (see Figure 1).

Fig. 1. Example of Ambiguous Preference Relation

### **2.2.5 Additive difference model of ambiguous comparative judgement**

The ambiguous preference relation *v*(*Xa* ≻ *Xb*) for *Xa* ≻ *Xb* is represented as the following additive difference model using L-R fuzzy numbers:

$$v(X_a \succ X_b) = A_{ab0} \oplus A_{ab1} \otimes (X_{a1} \ominus X_{b1}) \oplus \dots \oplus A_{abn} \otimes (X_{an} \ominus X_{bn}) \tag{1}$$

where ⊗, ⊕, and ⊖ are the product, addition, and difference operations based on the extension principle for fuzzy sets, respectively.

The parameter *Aab0* involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. The parameter *Aab0* would be a fuzzy variable, and would be larger if *Xa* were more salient than *Xb*. This model reduces to the Fuzzy Utility Difference Model (Nakamura, 1992) if the multi-attribute weighting parameters are assumed to be crisp numbers, and to the Additive Difference Model (Tversky, 1969) if the multi-attribute weighting parameters and the values of the multi-attributes are assumed to be crisp numbers.
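A minimal sketch of model (1) can be given with triangular fuzzy numbers (lower, mode, upper), a simple special case of L-R numbers. The weights and attribute values are hypothetical, and the product uses the usual endpoint (interval-arithmetic) approximation of the extension principle:

```python
# Sketch of the additive difference model (1) with triangular fuzzy
# numbers (lower, mode, upper). All parameter values are hypothetical.

def f_add(a, b):
    # Fuzzy addition of triangular numbers: componentwise.
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def f_sub(a, b):
    # Fuzzy difference: lower minus upper, mode minus mode, upper minus lower.
    return (a[0] - b[2], a[1] - b[1], a[2] - b[0])

def f_mul(a, b):
    # Fuzzy product, approximated by the extreme endpoint products.
    ends = [a[i] * b[j] for i in (0, 2) for j in (0, 2)]
    return (min(ends), a[1] * b[1], max(ends))

def additive_difference(a0, weights, xa, xb):
    """v(Xa > Xb) = A0 (+) sum_j A_j (x) (Xa_j (-) Xb_j)."""
    v = a0
    for w, va, vb in zip(weights, xa, xb):
        v = f_add(v, f_mul(w, f_sub(va, vb)))
    return v

# Hypothetical two-attribute example.
a0 = (-0.5, 0.0, 0.5)                       # fuzzy response-bias term A_ab0
weights = [(0.4, 0.5, 0.6), (0.2, 0.3, 0.4)]
xa = [(6.0, 7.0, 8.0), (4.0, 5.0, 6.0)]     # fuzzy attribute values of Xa
xb = [(3.0, 4.0, 5.0), (5.0, 6.0, 7.0)]     # fuzzy attribute values of Xb

print(additive_difference(a0, weights, xa, xb))
```

The resulting triple is itself a fuzzy number: its spread reflects how the ambiguity of the weights, the attribute values, and the bias term accumulates through the additive difference.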

### **2.2.6 Logistic model of ambiguous comparative judgement**

Let an ambiguous preference relation that is bounded (e.g., a fuzzy probability in [0,1]) be *p*(*Xa* ≻ *Xb*) for *Xa* ≻ *Xb*; *p*(*Xa* ≻ *Xb*) can be represented as the following logistic model using L-R fuzzy numbers:

$$\log\left( p(X_a \succ X_b) \oslash (1 \ominus p(X_a \succ X_b)) \right) = A_{ab0} \oplus A_{ab1} \otimes (X_{a1} \ominus X_{b1}) \oplus \dots \oplus A_{abn} \otimes (X_{an} \ominus X_{bn}) \tag{2}$$

where log, ⊘, ⊗, ⊕, and ⊖ are the logarithm, division, product, addition, and difference operations based on the extension principle for fuzzy sets, respectively.

Equation (2) is the model for the interval [0,1]; however, it can be applied not only to [0,1] but also to any finite interval [a,b] (a < b). Equation (2) is therefore considered a special case of the finite-interval model.
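The log-odds transform on the left-hand side of model (2) can be sketched for a hypothetical triangular fuzzy probability: since the logit function is increasing, under the extension principle it maps each endpoint of the fuzzy number directly, turning a bounded fuzzy probability into an unbounded fuzzy preference strength:

```python
# Sketch of the fuzzy log-odds transform in model (2). The fuzzy
# probability below is a hypothetical triangular number inside (0, 1).
import math

def logit(p):
    # Crisp log-odds: log(p / (1 - p)), defined for 0 < p < 1.
    return math.log(p / (1.0 - p))

def fuzzy_logit(p_triple):
    # logit is monotone increasing, so by the extension principle it
    # maps the lower, modal, and upper points of the fuzzy number directly.
    lo, mode, hi = p_triple
    return (logit(lo), logit(mode), logit(hi))

p = (0.6, 0.7, 0.8)      # fuzzy probability of choosing Xa over Xb
print(fuzzy_logit(p))    # unbounded fuzzy preference strength
```

This is exactly why the logistic link avoids the methodological problem of fitting a bounded scale with fuzzy linear regression: the transformed values may range over the whole real line.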

### **2.2.7 Explaining non-comparability and intransitivity**

Non-comparability and intransitivity properties are explained if a threshold of comparative judgment is assumed: intransitivity is indicated by the necessity measure of the fuzzy comparative relation resulting from the existence of the threshold, and a necessity measure for a fuzzy relation does not always lead to comparability. That is,

$$X_a \succ X_b \text{ iff } Nes\left( v(X_a \succ X_b) \right) \ge \theta \tag{3}$$

or

$$X_a \succ X_b \text{ iff } Nes\left( p(X_a \succ X_b) \oslash (1 \ominus p(X_a \succ X_b)) \right) \ge P_\theta \tag{4}$$

where *Nes*(・) is a necessity measure, and *θ* and *Pθ* are threshold parameters for the additive difference model and the logistic regression model, respectively. Assuming relation (3) or (4), it is clear that intransitivity and non-comparability can hold in the comparative judgment.
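The threshold rule of (3) can be sketched for triangular fuzzy numbers. This is a simplified illustration: it assumes the necessity is taken with respect to the event that the preference strength is positive, and all numbers are hypothetical rather than from the chapter:

```python
# Sketch of the threshold rule in (3): assert Xa > Xb only when the
# necessity that the fuzzy preference strength v(Xa > Xb) is positive
# reaches a threshold theta. For a triangular number (l, m, r),
# Nes(A >= t) = 1 - sup{ mu_A(x) : x < t }.

def necessity_geq(tri, t):
    """Necessity that the triangular fuzzy number tri is at least t."""
    l, m, r = tri
    if t <= l:
        return 1.0            # every possible value lies at or above t
    if t >= m:
        return 0.0            # values below t reach full membership
    return (m - t) / (m - l)  # 1 - mu_A(t) on the rising (left) branch

v = (-0.5, 1.0, 2.5)   # hypothetical fuzzy preference strength v(Xa > Xb)
theta = 0.6
print(necessity_geq(v, 0.0))           # necessity that the strength is >= 0
print(necessity_geq(v, 0.0) >= theta)  # is Xa > Xb asserted?
```

Because the necessity can fall below the threshold for some pairs, neither comparability nor transitivity is guaranteed, which is the point of section 2.2.7.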

### **3. Fuzzy data analysis for the ambiguous comparative judgment model**

### **3.1 Fuzzy rating data and fuzzy set**

Traditional approaches to the measurement of social judgment have involved methods such as the semantic differential, the Likert scale, or the Thurstone scale. Although insights into the ambiguous nature of social judgment were identified early in the development of the measurement of social judgment, the subsequent methods failed to capture this ambiguity, no doubt because traditional mathematics was not well developed for dealing with vagueness of judgment (Hesketh et al., 1988).

In order to measure the vagueness of human judgment, the fuzzy rating method has been proposed and developed (Hesketh et al., 1988; Takemura, 1996). In the fuzzy rating method, respondents select a representative rating point on a scale and indicate lower or upper rating points if they wish, depending upon the relative vagueness of their judgment (see Figure 2). For example, the fuzzy rating method would be useful for measuring perceived temperature, with the evaluator indicating the representative value and the lower or upper values. This rating scale allows for asymmetries and overcomes the problem, identified by Smithson (1987), of researchers arbitrarily deciding the most representative value from a range of scores. By making certain simplifying assumptions (not uncommon within fuzzy set theory), the rating can be viewed as an L-R fuzzy number, hence making possible the use of fuzzy set theoretic operations.
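A fuzzy rating item can be sketched as follows; the scale, the validation helper `fuzzy_rating`, and the temperature values are hypothetical illustrations, not part of the chapter's method:

```python
# Sketch of the fuzzy rating method: the respondent marks a representative
# point on a 0-100 scale and, when the judgment feels ambiguous, lower and
# upper points too; the triple is read as an asymmetric triangular
# (L-R) fuzzy number.

def fuzzy_rating(lower, representative, upper, scale=(0.0, 100.0)):
    """Validate a rating and return it as an (l, m, r) fuzzy number."""
    lo, hi = scale
    if not (lo <= lower <= representative <= upper <= hi):
        raise ValueError("inconsistent fuzzy rating")
    return (lower, representative, upper)

# Perceived temperature: representative 70, ambiguity skewed upward,
# so the left and right spreads of the L-R number differ.
rating = fuzzy_rating(65.0, 70.0, 82.0)
left_spread = rating[1] - rating[0]    # 5.0
right_spread = rating[2] - rating[1]   # 12.0
print(rating, left_spread, right_spread)
```

The unequal spreads are exactly the asymmetry the text says ordinary single-point scales cannot record.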

Ambiguity and Social Judgment: Fuzzy Set Model and Data Analysis 9


Fig. 2. Example of Fuzzy Rating

A fuzzy set A is defined as follows. Let X denote a universal set, such as X = {x1, x2, ...., xn}. Then, the membership function μA by which a fuzzy set A is defined has the form

$$\mu_A : X \to [0, 1],$$

where [0,1] denotes the interval of real numbers from 0 to 1, inclusive.

The concept of a fuzzy set is the foundation for analysis where fuzziness exists (Zadeh, 1965). A fuzzy set may be expressed as:

$$A = \mu_A(x_1)/x_1 \oplus \mu_A(x_2)/x_2 \oplus \dots \oplus \mu_A(x_n)/x_n = \sum_{i=1}^{n} \mu_A(x_i)/x_i$$

where μA(xi) represents the "grade of membership" of xi in A, or the degree to which xi satisfies the properties of the set A. It should be noted that here the symbol "⊕" does not refer to ordinary addition.

μA is called a membership function, or a possibility function. The xi values are drawn from a global set of all possible values, X. Grades of membership take values between 0 and 1. The membership function has a value of 0 when the properties of the fuzzy set are not at all satisfied, and 1 when the properties of the fuzzy set are completely satisfied.
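As a concrete illustration (not from the chapter), a finite fuzzy set over a small universe can be sketched as a mapping from elements to membership grades; the set name and values below are purely hypothetical:

```python
# A fuzzy set over a finite universe X, represented as a dict mapping
# each element x_i to its membership grade mu_A(x_i) in [0, 1].
hot_days = {
    "mon": 0.2,  # barely satisfies the property "hot"
    "tue": 0.7,
    "wed": 1.0,  # completely satisfies "hot"
    "thu": 0.0,  # does not satisfy "hot" at all
}

def membership(fuzzy_set, x):
    """Grade of membership of x; elements outside the support get 0."""
    return fuzzy_set.get(x, 0.0)

# All grades must lie in [0, 1].
assert 0.0 <= min(hot_days.values()) and max(hot_days.values()) <= 1.0
print(membership(hot_days, "tue"))  # 0.7
```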

Hesketh et al. (1988) pointed out that fuzzy rating data can be represented as fuzzy sets by making certain simplifying assumptions, which are not uncommon within fuzzy set theory. According to Hesketh et al. (1988), those assumptions are:

1. The fuzzy set has a convex membership function.
2. The global set X is represented along the horizontal axis.
3. The fuzzy membership function takes its maximum value, one, at the point on the fuzzy support represented by the representative point.
4. The extent of the fuzzy support is represented by the horizontal lines to either side of the evaluated point.
5. The fuzzy membership function tapers uniformly from its value of one at the representative point to a value of zero beyond the fuzzy support or the left and right extensions. The membership value of the lower point and the upper point is 0.
Making those assumptions, fuzzy rating data in this study can be expressed as a fuzzy number which is a kind of fuzzy set. The concept of the fuzzy number can be defined from the concept of the fuzzy subset(Kaufman & Gupta,1985). The properties of fuzzy numbers are the convexity and the normality of a fuzzy subset.

Firstly, the convexity of the fuzzy subset is defined as follows: a fuzzy subset A ⊆ R is convex if and only if every ordinary subset

$$A_{\alpha} = \{ x \mid \mu_A(x) \ge \alpha \}, \quad \alpha \in [0, 1],$$

is convex (that is, a closed interval of R).

Secondly, the normality of the fuzzy subset is defined as follows: a fuzzy subset A ⊆ R is normal if and only if

$$\max_{x \in \mathbb{R}} \mu_A(x) = 1.$$
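The two properties above can be checked numerically when the membership function is sampled on a grid. The following sketch (function names are illustrative, not the chapter's) treats convexity as every α-cut being contiguous on the grid:

```python
def alpha_cut(xs, mu, alpha):
    """A_alpha = {x | mu(x) >= alpha} on a sampled grid xs."""
    return [x for x in xs if mu(x) >= alpha]

def is_normal(xs, mu, tol=1e-9):
    """Normal: the maximum membership grade equals 1."""
    return abs(max(mu(x) for x in xs) - 1.0) < tol

def is_convex(xs, mu, alphas=(0.25, 0.5, 0.75)):
    """Convex: every alpha-cut is an interval (contiguous grid points)."""
    for a in alphas:
        inside = [i for i, x in enumerate(xs) if mu(x) >= a]
        if inside and inside != list(range(inside[0], inside[-1] + 1)):
            return False
    return True

# Triangular membership function peaking at 5, sampled on [0, 10].
tri = lambda x: max(0.0, 1.0 - abs(x - 5) / 3.0)
grid = [i / 10 for i in range(0, 101)]
print(is_normal(grid, tri), is_convex(grid, tri))  # True True
```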

One of the most well-known fuzzy numbers is the L-R fuzzy number (Dubois & Prade, 1980). The L-R fuzzy number is defined as follows: for all x ∈ R,

$$\mu_A(x) = \begin{cases} L((x - m)/u), & -\infty < x < m, \\ 1, & x = m, \\ R((x - m)/v), & m < x < \infty, \end{cases}$$

where L((x − m)/u) is an increasing monotonic function, R((x − m)/v) is a decreasing monotonic function, u > 0, and v > 0.

An example of the fuzzy rating scale and of the representation of the rating data by an L-R fuzzy number is shown in Figure 3. Note in Figure 3 that the variables are abbreviated as follows: $x_{ij}^{L}$ for $x_{ij(0)}^{L}$, $x_{ij}^{R}$ for $x_{ij(0)}^{R}$, and $x_{ij}^{M}$ for $x_{ij(1)}^{L} = x_{ij(1)}^{R}$.

Fig. 3. Fuzzy Rating Data and Its Representation by L-R Fuzzy Numbers
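A fuzzy rating (lower value $x^{L}$, representative value $x^{M}$, upper value $x^{R}$) can be read as a triangular L-R fuzzy number, with the L and R branches linear. A minimal sketch of its membership function and α-level sets (illustrative code, not the chapter's):

```python
def triangular_mu(x, lo, mode, hi):
    """Membership of a triangular L-R fuzzy number (x^L, x^M, x^R).
    Per the rating assumptions, the lower and upper points get grade 0."""
    if x <= lo or x >= hi:
        return 1.0 if x == mode else 0.0  # covers the degenerate crisp case
    if x <= mode:
        return (x - lo) / (mode - lo)     # increasing L branch
    return (hi - x) / (hi - mode)         # decreasing R branch

def alpha_level(lo, mode, hi, alpha):
    """Closed interval [x^L(alpha), x^R(alpha)] of the triangular number."""
    return (lo + alpha * (mode - lo), hi - alpha * (hi - mode))

# A hypothetical fuzzy temperature rating: lower 20, representative 25, upper 32.
print(triangular_mu(25, 20, 25, 32))  # 1.0 at the representative point
print(alpha_level(20, 25, 32, 0.5))   # (22.5, 28.5)
```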


### **3.2 Analysis of the additive difference type model**

The set of fuzzy input-output data for the *k*-th observation is defined as:

$$\left( Y_{abk};\; X_{a1k}, X_{a2k}, \dots, X_{ank};\; X_{b1k}, X_{b2k}, \dots, X_{bnk} \right) \tag{5}$$

where *Yabk* indicates the *k*-th observation's ambiguous preference for the *a*-th alternative (a) over the *b*-th alternative (b), which is represented by an L-R fuzzy number, and *Xajk* and *Xbjk* are the *j*-th attribute values of the alternatives (a and b) for observation *k*.

Let *Xabjk* be *Xajk* ⊖ *Xbjk*, where ⊖ is a difference operator based on the fuzzy extension principle, and denote *Xk* as the abbreviation of *Xabk* in the following section. Therefore, a set of fuzzy input-output data for the *k*-th observation is re-written as:

$$\left( Y_{k};\; X_{1k}, X_{2k}, \dots, X_{nk} \right), \quad k = 1, 2, \dots, N \tag{6}$$

where *Yk* is a fuzzy dependent variable, and *Xjk* is a fuzzy independent variable represented by L-R fuzzy numbers. For simplicity, assume that *Yk* and *Xjk* are positive for any membership value, α ∈ (0,1).

The fuzzy linear regression model (where both input and output data are fuzzy numbers) is represented as follows:

$$\overline{\mathbf{Y}}\_{\mathbf{k}} = \mathbf{A}\_0 \oplus \mathbf{A}\_1 \otimes \mathbf{X}\_{1\mathbf{k}} \oplus \dots \oplus \mathbf{A}\_n \otimes \mathbf{X}\_{n\mathbf{k}} \tag{7}$$

where $\overline{Y}_k$ is a fuzzy estimated variable, $A_j$ ($j = 0, 1, \dots, n$) is a fuzzy regression parameter represented by an L-R fuzzy number, ⊕ is the additive operator, and ⊗ is the product operator based on the extension principle.

It should be noted that although the explicit form of the membership function of Yk cannot be directly obtained, the α-level set of Yk can be obtained from Nguyen's theorem (Nguyen, 1978).

Let $z_{k(\alpha)}^{L}$ be a lower value of the α-level set of $\overline{Y}_k$, and $z_{k(\alpha)}^{R}$ be an upper value.

Then,

$$Z_{k(\alpha)} = \left[ z_{k(\alpha)}^{L},\; z_{k(\alpha)}^{R} \right], \quad \alpha \in (0, 1] \tag{8}$$

Where

$$z_{k(\alpha)}^{L} = \sum_{j=0}^{n} \min\left( a_{j(\alpha)}^{L}\, x_{jk(\alpha)}^{L},\; a_{j(\alpha)}^{L}\, x_{jk(\alpha)}^{R} \right) \tag{9}$$

$$z_{k(\alpha)}^{R} = \sum_{j=0}^{n} \max\left( a_{j(\alpha)}^{R}\, x_{jk(\alpha)}^{L},\; a_{j(\alpha)}^{R}\, x_{jk(\alpha)}^{R} \right) \tag{10}$$
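Equations (9) and (10) can be computed directly once the α-level intervals of the coefficients and inputs are available. A sketch under the chapter's convention that the bias input is crisp 1 (names such as `z_bounds` are illustrative):

```python
def z_bounds(coef_ivals, x_ivals):
    """alpha-level bounds of the fuzzy linear model, per Eqs. (9)-(10).

    coef_ivals / x_ivals: lists of (lower, upper) endpoints of the
    alpha-level intervals [a^L, a^R] and [x^L, x^R] for j = 0, ..., n;
    index 0 is the bias term, whose input interval is the crisp (1, 1).
    """
    z_lo = z_hi = 0.0
    for (a_lo, a_hi), (x_lo, x_hi) in zip(coef_ivals, x_ivals):
        z_lo += min(a_lo * x_lo, a_lo * x_hi)  # Eq. (9): min of a^L products
        z_hi += max(a_hi * x_lo, a_hi * x_hi)  # Eq. (10): max of a^R products
    return z_lo, z_hi

coefs = [(0.5, 1.0), (-0.2, 0.3)]  # alpha-level intervals of A_0, A_1
xs = [(1.0, 1.0), (2.0, 4.0)]      # x_0k = 1 (crisp); X_1k spans [2, 4]
print(z_bounds(coefs, xs))         # approximately (-0.3, 2.2)
```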


$$x_{0k(\alpha)}^{L} = x_{0k(\alpha)}^{R} = 1 \tag{11}$$

In the above Equation (9), $a_{j(\alpha)}^{L} x_{jk(\alpha)}^{L}$ is a product between the lower value of the α-level fuzzy coefficient for the *j*-th attribute and the α-level set of fuzzy input data *Xjk*; $a_{j(\alpha)}^{L} x_{jk(\alpha)}^{R}$, $a_{j(\alpha)}^{R} x_{jk(\alpha)}^{L}$, and $a_{j(\alpha)}^{R} x_{jk(\alpha)}^{R}$ are defined in the same manner, respectively. $x_{0k(\alpha)}^{L}$ and $x_{0k(\alpha)}^{R}$ are assumed to be 1 (a crisp number) for the purpose of estimation of the fuzzy bias parameter *A0*.

To define the dissimilarity between the predicted and observed values of the dependent variable, the following indicator $D_{k(\alpha)}^{2}$ was adopted:

$$D_{k(\alpha)}^{2} = \left( y_{k(\alpha)}^{L} - z_{k(\alpha)}^{L} \right)^{2} + \left( y_{k(\alpha)}^{R} - z_{k(\alpha)}^{R} \right)^{2} \tag{12}$$

The definition in Equation (12) can be applied to interval data as well as to L-R fuzzy numbers. That is, Equation (12) represents the sum of squares for the distance between interval data.

To generalize, a dissimilarity indicator representing the square of the distance for L-R fuzzy numbers can be written as follows:

$$D_{k}^{2} = \sum_{j=0}^{n} w_{j} \left( \left( y_{k(\alpha_j)}^{L} - z_{k(\alpha_j)}^{L} \right)^{2} + \left( y_{k(\alpha_j)}^{R} - z_{k(\alpha_j)}^{R} \right)^{2} \right) \tag{13}$$

where α*j = jh/n, j = 0,...,n*, *h* is an equal interval, and *wj* is a weight for the *j*-th level.

In the case of a triangular fuzzy number with *wj =* 1, the above equation is approximately represented as:

$$D_{k}^{2} = \left( y_{k(0)}^{L} - z_{k(0)}^{L} \right)^{2} + \left( y_{k(1)}^{L} - z_{k(1)}^{R} \right)^{2} + \left( y_{k(0)}^{R} - z_{k(0)}^{R} \right)^{2} \tag{14}$$
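For triangular numbers with *wj* = 1, the dissimilarity reduces to three squared endpoint differences, as in Equation (14). A quick sketch (illustrative names; note that for a triangular number the α = 1 cut is a single point, so its L and R values coincide):

```python
def dk_squared(y, z):
    """Squared distance between two triangular fuzzy numbers, each given
    as (lower at alpha=0, peak at alpha=1, upper at alpha=0)."""
    (y_lo, y_m, y_hi), (z_lo, z_m, z_hi) = y, z
    return (y_lo - z_lo) ** 2 + (y_m - z_m) ** 2 + (y_hi - z_hi) ** 2

observed = (2.0, 3.0, 5.0)   # y^L, y^M, y^R
predicted = (1.0, 3.5, 4.0)  # z^L, z^M, z^R
print(dk_squared(observed, predicted))  # 1 + 0.25 + 1 = 2.25
```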

The proposed method is to estimate the fuzzy coefficients by minimizing the sum of $D_k^{2}$ over *k*. That is,

$$\text{Objective function: } \operatorname{Min} \sum_{k=1}^{N} D_{k}^{2} \tag{15}$$

$$\text{Subject to: } a_{j(h)}^{L} \ge 0, \quad j \in J_1 \tag{16}$$

$$a_{j(h)}^{L} \le 0, \; a_{j(h)}^{R} \ge 0, \quad j \in J_2 \tag{17}$$

$$a_{j(h)}^{R} \le 0, \quad j \in J_3 \tag{18}$$

$$-a_{j(h)}^{L} + a_{j(h)}^{R} \ge 0 \tag{19}$$


Where $j \in \{0, \dots, n\} = J_1 \cup J_2 \cup J_3$, $J_1 \cap J_2 = \phi$, $J_2 \cap J_3 = \phi$, $J_3 \cap J_1 = \phi$, and

$$z_{k(\alpha)}^{L} = \sum_{j \in J_1} a_{j(\alpha)}^{L}\, x_{jk(\alpha)}^{L} + \sum_{j \in J_2 \cup J_3} a_{j(\alpha)}^{L}\, x_{jk(\alpha)}^{R} \tag{20}$$

$$z_{k(\alpha)}^{R} = \sum_{j \in J_1 \cup J_2} a_{j(\alpha)}^{R}\, x_{jk(\alpha)}^{R} + \sum_{j \in J_3} a_{j(\alpha)}^{R}\, x_{jk(\alpha)}^{L} \tag{21}$$

The estimated coefficients can be derived through quadratic programming. The proposed fuzzy least squares method is also shown in Figure 4.

Fig. 4. Fuzzy Least Squares Regressions Analysis for Fuzzy Input and Output Data
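The chapter derives the coefficients by quadratic programming. As a deliberately simplified stand-in (not the authors' procedure), the sketch below fits a single nonnegative coefficient (*j* ∈ *J1*) to interval observations with crisp positive inputs, using ordinary least squares on each endpoint and then clipping to satisfy constraints (16) and (19); `fit_fuzzy_coef` is a hypothetical name:

```python
def fit_fuzzy_coef(xs, y_lo, y_hi):
    """Toy fuzzy least squares for one coefficient with crisp inputs xs > 0,
    so z^L = a^L x and z^R = a^R x (Eqs. 20-21 collapse to single terms).
    Minimizes sum_k (y^L_k - a^L x_k)^2 + (y^R_k - a^R x_k)^2, then enforces
    0 <= a^L <= a^R (Eqs. 16 and 19) by clipping -- a crude substitute for
    the quadratic programming used in the chapter.
    """
    sxx = sum(x * x for x in xs)
    a_lo = sum(x * y for x, y in zip(xs, y_lo)) / sxx  # endpoint-wise OLS
    a_hi = sum(x * y for x, y in zip(xs, y_hi)) / sxx
    a_lo = max(a_lo, 0.0)                    # Eq. (16): a^L >= 0
    if a_lo > a_hi:                          # Eq. (19): a^R - a^L >= 0
        a_lo = a_hi = (a_lo + a_hi) / 2.0    # pool if the order is violated
    return a_lo, a_hi

xs = [1.0, 2.0, 3.0]      # crisp inputs (hypothetical data)
y_lo = [0.9, 2.1, 2.9]    # lower endpoints of the observed intervals
y_hi = [1.5, 2.8, 4.6]    # upper endpoints
print(fit_fuzzy_coef(xs, y_lo, y_hi))
```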

### **3.3 Analysis of the logistic type model**

Although the fuzzy linear regression analysis in the fuzzy additive difference model can give satisfactory results, these fuzzy regression analyses may fail to interpret psychological judgment data that have bounds on a psychological scale. For example, a perceived purchase probability lies in the [0,1] interval and cannot be greater than 1 or less than 0. For such data, these fuzzy regression analyses may predict values that are greater than 1 or less than 0. It may happen that the predicted values are greater than the highest bound or less than the lowest bound, and this causes a significant problem if the predicted values are used in a subsequent analysis. Therefore, the present study also attempted to solve this problem by constraining predicted values to be greater than the lowest value (such as 0) and less than the highest value (such as 1). The present study develops the concept of logistic regression for crisp numbers, and then proposes the fuzzy version of logistic regression analysis for fuzzy input and output data.

The set of fuzzy input-output data for the *k-*th observation is defined as:

$$\left( P_{abk};\; X_{a1k}, X_{a2k}, \dots, X_{ank};\; X_{b1k}, X_{b2k}, \dots, X_{bnk} \right) \tag{22}$$

where *Pabk* indicates the *k*-th observation's ambiguous preference for the *a-*th alternative (a) over the *b-*th alternative (b), which is represented by fuzzy L-R numbers, and *Xajk* and *Xbjk* are the *j*-th attribute values of the alternatives (a and b) for observation *k*.

Let *Xabjk* be *Xajk* ⊖ *Xbjk*, where ⊖ is the difference operator based on the fuzzy extension principle, and denote *Xk* as the abbreviation of *Xabk* in the following section. Therefore, a set of fuzzy input-output data for the *k*-th observation is re-written as:

$$\left(\mathbb{P}\_{\mathbf{k}}; \mathbb{X}\_{1\mathbf{k}}, \mathbb{X}\_{2\mathbf{k}}, \dots, \mathbb{X}\_{n\mathbf{k}}\right), \mathbf{k} = \mathbf{1}, \mathbf{2}, \dots, \mathbf{N} \tag{23}$$

where *Pk* is a fuzzy dependent variable, and *Xjk* is a fuzzy independent variable represented by L-R fuzzy numbers. For simplicity, I assume that *Pk* and *Xjk* are positive for any membership value, α ∈ (0,1).

The fuzzy logistic regression model (where both input and output data are fuzzy numbers) is represented as follows:

$$\log\!\left( P_{k} \oslash (1 \ominus P_{k}) \right) = A_0 \otimes X_{i0} \oplus A_1 \otimes X_{i1} \oplus \dots \oplus A_m \otimes X_{im} \tag{24}$$

where $\log(P_k \oslash (1 \ominus P_k))$ is the estimated fuzzy log odds, and ⊘ is the division operator, ⊖ is the difference operator, ⊗ is the product operator, and ⊕ is the additive operator, each based on the extension principle for fuzzy sets.

It should be noted that although the explicit form of the membership function of $\log(P_k \oslash (1 \ominus P_k))$ cannot be directly obtained, its α-level set can be obtained using Nguyen's theorem (Nguyen, 1978).

Let $P_{k(\alpha)}^{L}$ be the lower bound of the dependent fuzzy variable and $P_{k(\alpha)}^{R}$ be the upper bound. Then, the α-level set of the fuzzy dependent variable $P_k$ can be represented as $P_{k(\alpha)} = \left[ P_{k(\alpha)}^{L},\, P_{k(\alpha)}^{R} \right]$, $\alpha \in (0, 1]$.

Therefore, the α-level set of the left-hand term in Equation (24) is as follows:

$$\left[ \log\!\left( P_{k} \oslash (1 \ominus P_{k}) \right) \right]_{\alpha} = \left[ \min\!\left( \log\frac{P_{k(\alpha)}^{L}}{1 - P_{k(\alpha)}^{L}},\; \log\frac{P_{k(\alpha)}^{R}}{1 - P_{k(\alpha)}^{R}} \right),\; \max\!\left( \log\frac{P_{k(\alpha)}^{L}}{1 - P_{k(\alpha)}^{L}},\; \log\frac{P_{k(\alpha)}^{R}}{1 - P_{k(\alpha)}^{R}} \right) \right] \tag{25}$$
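The α-level interval of the fuzzy log odds only requires transforming the two endpoints of the probability interval. A small sketch (illustrative names; it assumes both endpoints lie strictly between 0 and 1 so the log odds are finite):

```python
import math

def logit(p):
    """Log odds of a crisp probability: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def log_odds_level(p_lo, p_hi):
    """alpha-level interval of the fuzzy log odds from the endpoints
    [P^L, P^R] of the fuzzy probability's alpha-level set (Eq. 25)."""
    a, b = logit(p_lo), logit(p_hi)
    return min(a, b), max(a, b)

# Since logit is increasing on (0, 1), the min/max keep the endpoint order.
print(log_odds_level(0.2, 0.6))
```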

Ambiguity and Social Judgment: Fuzzy Set Model and Data Analysis 15

The participant also rated the desirability of the attribute information for each computer using a fuzzy rating method. The fuzzy rating scale of desirability ranged from 0 point to 100 points. (see Figure 6). That is, the participant answered the lower value, the

0 100

100

The fuzzy coefficients were obtained by fuzzy linear regression analysis using the least squares under constraints, as shown in Tables 1 and 2. The dependent variable of Table 1 was the same as that in Table 2. However, the independent variables in Table 1 are objective values measured by crisp numbers, whereas in Table 2 the independent variables are fuzzy rating values measured by an L-R fuzzy number. The parameter of *Ajk*0 involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. The parameter *Ajk*0 would be a fuzzy variable and larger than *Aab0* if *Xa* were more salient than *Xb*. This model can be reduced to the Fuzzy Utility Difference Model (Nakamura, 1992) if multi-attribute weighting parameters are assumed to be crisp numbers, and reduced to the Additive Difference Model (Tversky, 1969) if multi-attribute weighting parameters and the values of multi-attributes are assumed to be crisp numbers as explained before. According to Tables 1 and 2, the preference strength concerning comparative judgment was influenced most by whether the target computer was new or used. The impact of the hard disks' attributes was smaller than that of the new-used dimension.

The participant was a 43-year-old adult. The participant rated the ambiguous probability of preferring a certain computer (DELL brand) out of seven different computers. Three types of attribute information (hard disk: 100 or 60 GB; memory: 2.80 or 2.40 GHz; new or used product) were manipulated in the same manner as in the previous judgment task.. That is, the participant answered the lower value, the representative value , and upper value for the probability that superior alternative is preferred to inferior alternative. The participant used the fuzzy rating method to provide representative, lower, and upper values of probabilities

representative value , and upper value for each attribute value.

Fig. 6. Example of a Fuzzy Desirability Rating.

1) Low ambiguity

2)High ambiguity 0

**4.1.1.2 Analysis and results** 

**4.1.2 Example of the logistic model 4.1.2.1 Participant and procedure** 

(see Figure 7 ).

Let ( ) L <sup>k</sup> z <sup>α</sup> be a lower value of [*log*(*Pk*○÷ (1○-*Pk*))]α, and ( ) R <sup>k</sup> z <sup>α</sup> be an upper value of [*log*(*Pk*○÷ (1○-*Pk*))]<sup>α</sup>

where

$$\mathbf{z}\_{\mathbf{k}(\alpha)}^{\mathcal{L}} = \sum\_{\mathbf{j}=0}^{n} \left[ \min \left( \mathbf{a}\_{\mathbf{j}(\alpha)}^{\mathcal{L}} \mathbf{x}\_{\mathbf{jk}(\alpha)}^{\mathcal{L}} \mathbf{a}\_{\mathbf{j}(\alpha)}^{\mathcal{L}} \mathbf{x}\_{\mathbf{jk}(\alpha)}^{\mathcal{R}} \right) \right] \tag{26}$$

$$\mathbf{z}\_{\mathbf{k}\left(\alpha\right)}^{\mathrm{R}} = \sum\_{\mathbf{j}=0}^{\mathrm{n}} \left[ \max \left( \mathbf{a}\_{\mathbf{j}\left(\alpha\right)}^{\mathrm{R}} \mathbf{x}\_{\mathbf{jk}\left(\alpha\right)}^{\mathrm{L}} \mathbf{a}\_{\mathbf{j}\left(\alpha\right)}^{\mathrm{R}} \mathbf{x}\_{\mathbf{jk}\left(\alpha\right)}^{\mathrm{R}} \right) \right] \tag{27}$$

$$\mathcal{X}\_{0k}^{L}(\alpha) = \mathcal{X}\_{0k}^{R}(\alpha) = \mathbf{1} \tag{28}$$

In the above Equation (26), is a product between the lower value of the �-level fuzzy coefficient for the *j*-th attribute and the α-level set of fuzzy input data *Xjk*, , or is defined in the same manner, respectively. and are assumed to be 1 (a crisp number) for the purpose of estimation for the fuzzy bias parameter *A0*. The parameter estimation method is basically the same as the fuzzy logistic regression method and a more concrete procedure is described in Takemura (2004).

### **4. Numerical example of the data analysis method**

To demonstrate the appropriateness of the proposed data analysis methods, the detail numerical examples are shown for the individual level analysis (Takemura,2007) and group level analysis (Takemura, Matsumoto, Matsuyama, & Kobayashi, 2011) of ambiguous comparative judgments.

### **4.1. Individual level analysis of ambiguous comparative model**

### **4.1.1 Example of additive difference model**

### **4.1.1.1 Participant and procedure**

The participant was a 43-year-old faculty member of Waseda University. The participant rated differences in WTP for two different computers (DELL brand) with three types of attribute information (hard disk: 100 or 60 GB; memory: 2.80 or 2.40 GHz; new or used product). The participant compared a certain alternative with seven different alternatives. The participant provided representative values and lower and upper WTP values using a fuzzy rating method. (see Figure 5)

The participant was asked the amount of money he would be willing to pay to upgrade the inferior from inferior alternative to superior alternative using fuzzy rating method. That is, the participant answered the lower value, the representative value, and upper value for the amount of money he would be willing to pay.

$$\begin{array}{cccc} \text{Lower Value} & \text{Representative Value} & \text{Upper Value} \\\\ \text{(} & ) \text{ Yen} & \text{(} & ) \text{ Yen} & \text{(} & ) \text{ Yen} \end{array}$$

Fig. 5. Example of a Fuzzy Rating in WTP Task.

The participant also rated the desirability of the attribute information for each computer using a fuzzy rating method. The fuzzy rating scale of desirability ranged from 0 point to 100 points. (see Figure 6). That is, the participant answered the lower value, the representative value , and upper value for each attribute value.

Fig. 6. Example of a Fuzzy Desirability Rating.

### **4.1.1.2 Analysis and results**

14 Fuzzy Logic – Algorithms, Techniques and Implementations

Let $z_k^L(\alpha)$ be a lower value of $[\log(P_k \div (1-P_k))]_{\alpha}$ and $z_k^R(\alpha)$ be an upper value of $[\log(P_k \div (1-P_k))]_{\alpha}$. Then

$$z_k^L(\alpha) = \sum_{j=0}^{n} \min\left\{ a_j^L(\alpha)\, x_{jk}^L(\alpha),\ a_j^L(\alpha)\, x_{jk}^R(\alpha) \right\} \tag{26}$$

$$z_k^R(\alpha) = \sum_{j=0}^{n} \max\left\{ a_j^R(\alpha)\, x_{jk}^L(\alpha),\ a_j^R(\alpha)\, x_{jk}^R(\alpha) \right\} \tag{27}$$

where

$$x_{0k}^L(\alpha) = x_{0k}^R(\alpha) = 1 \tag{28}$$

In Equation (26) above, $a_j^L(\alpha)\, x_{jk}^L(\alpha)$ is the product of the lower value of the α-level fuzzy coefficient for the *j*-th attribute and the α-level set of the fuzzy input data *Xjk*; $a_j^L(\alpha)\, x_{jk}^R(\alpha)$, $a_j^R(\alpha)\, x_{jk}^L(\alpha)$, and $a_j^R(\alpha)\, x_{jk}^R(\alpha)$ are defined in the same manner. $x_{0k}^L(\alpha)$ and $x_{0k}^R(\alpha)$ are assumed to be 1 (a crisp number, Equation (28)) for the purpose of estimating the fuzzy bias parameter *A0*. The parameter estimation method is basically the same as in the fuzzy logistic regression method, and a more concrete procedure is described in Takemura (2004).

To demonstrate the appropriateness of the proposed data analysis methods, detailed numerical examples are shown below for the individual level analysis (Takemura, 2007) and the group level analysis (Takemura, Matsumoto, Matsuyama, & Kobayashi, 2011) of ambiguous comparative judgments.

The fuzzy coefficients were obtained by fuzzy linear regression analysis using least squares under constraints, as shown in Tables 1 and 2. The dependent variable in Table 1 was the same as that in Table 2. However, the independent variables in Table 1 are objective values measured by crisp numbers, whereas in Table 2 the independent variables are fuzzy rating values measured by L-R fuzzy numbers. The parameter *Ajk*0 involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. The parameter *Ajk*0 would be a fuzzy variable and larger than *Aab*0 if *Xa* were more salient than *Xb*. This model can be reduced to the Fuzzy Utility Difference Model (Nakamura, 1992) if the multi-attribute weighting parameters are assumed to be crisp numbers, and to the Additive Difference Model (Tversky, 1969) if the multi-attribute weighting parameters and the values of the multi-attributes are assumed to be crisp numbers, as explained before. According to Tables 1 and 2, the preference strength concerning comparative judgment was influenced most by whether the target computer was new or used. The impact of the hard disk attribute was smaller than that of the new-used dimension.
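The estimation idea can be sketched in code. The following is a minimal illustration, not the chapter's actual procedure: the least squares under constraints is approximated here by solving one ordinary least-squares problem per bound of the triangular coefficients and then restoring the ordering constraint, and all data and names are hypothetical.

```python
import numpy as np

# Sketch of fuzzy linear regression with triangular (L-R) fuzzy
# coefficients and crisp, nonnegative inputs.  Each observed WTP
# difference y_k is a (lower, representative, upper) triple, and the
# model predicts each bound from the matching coefficient bound:
#   y^L = X a^L,  y^M = X a^M,  y^R = X a^R,  with  a^L <= a^M <= a^R.

def fit_fuzzy_regression(X, y_lower, y_mid, y_upper):
    """Fit one least-squares problem per bound, then enforce
    a^L <= a^M <= a^R by sorting each coefficient triple (a crude
    stand-in for a true constrained least-squares solver)."""
    aL = np.linalg.lstsq(X, y_lower, rcond=None)[0]
    aM = np.linalg.lstsq(X, y_mid, rcond=None)[0]
    aR = np.linalg.lstsq(X, y_upper, rcond=None)[0]
    # rows after sorting: lower, representative, upper
    return np.sort(np.vstack([aL, aM, aR]), axis=0)

# Toy data: intercept column (bias A0) plus three 0/1 attribute codes
# (hard disk, memory, new/used), six comparison trials.
X = np.array([[1, 1, 0, 1],
              [1, 0, 1, 1],
              [1, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 1, 0, 0],
              [1, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
y_mid = X @ np.array([0.5, 1.0, 1.8, 2.1]) + rng.normal(0.0, 0.05, 6)
y_lower, y_upper = y_mid - 0.3, y_mid + 0.4

aL, aM, aR = fit_fuzzy_regression(X, y_lower, y_mid, y_upper)
assert np.all(aL <= aM) and np.all(aM <= aR)
print("lower:", aL, "\nrepresentative:", aM, "\nupper:", aR)
```

A production analysis would instead impose the ordering (and any sign) constraints inside the optimizer, as the least-squares-under-constraints method cited in the text does.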

### **4.1.2 Example of the logistic model**

### **4.1.2.1 Participant and procedure**

The participant was a 43-year-old adult. The participant rated the ambiguous probability of preferring a certain computer (DELL brand) out of seven different computers. Three types of attribute information (hard disk: 100 or 60 GB; memory: 2.80 or 2.40 GHz; new or used product) were manipulated in the same manner as in the previous judgment task. That is, the participant answered the lower value, the representative value, and the upper value for the probability that the superior alternative is preferred to the inferior alternative. The participant used the fuzzy rating method to provide representative, lower, and upper values of probabilities (see Figure 7).

Ambiguity and Social Judgment: Fuzzy Set Model and Data Analysis 17

1) Low ambiguity: rating scale from 0% to 100%
2) High ambiguity: rating scale from 0% to 100%

Fig. 7. Example of Fuzzy Probability Rating.

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Hard Disk | 78.5 | 85.7 | 986.8 |
| Memory | 0.0 | 0.0 | 0.0 |
| New or Used | 22332.5 | 22332.5 | 22332.5 |
| *Ajk*0 | 36082.1 | 36082.1 | 33111.2 |

Note: The independent variables are crisp numbers.

Table 1. Coefficients of Fuzzy Regression Analysis

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Hard Disk | 33.9 | 33.9 | 33.9 |
| Memory | 0.0 | 0.0 | 0.0 |
| New or Used | 446.1 | 446.1 | 446.1 |
| *Ajk*0 | 25450.8 | 29420.1 | 48004.0 |

Note: The independent variables are fuzzy L-R numbers.

Table 2. Coefficients of Fuzzy Regression Analysis

### **4.1.2.2 Analysis and results**

The fuzzy coefficients were obtained by fuzzy logistic regression analysis using least squares under constraints, as shown in Tables 3 and 4. In Table 3 the independent variables are objective values measured by crisp numbers, whereas in Table 4 the independent variables are fuzzy rating values measured by L-R fuzzy numbers. The parameter *Ajk*0 involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. According to Tables 3 and 4, the bounded preference strength was influenced most by whether the target computer was new or used. Interestingly, the impact of the memory attribute was slightly greater than was the case in Tables 1 and 2.

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Hard Disk | 0.000 | 0.000 | 0.009 |
| Memory | 1.781 | 1.781 | 1.881 |
| New or Used | 1.791 | 2.097 | 2.777 |
| *Ajk*0 | 0.847 | 1.201 | 1.443 |

Note: The independent variables are crisp numbers.

Table 3. Coefficients of Fuzzy Logistic Regression Analysis
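The bounded preference strength of the logistic model can be illustrated numerically. The sketch below is hypothetical code (the coefficient bounds are borrowed from Table 3 purely for illustration): it evaluates α-level log-odds bounds in the spirit of Equations (26) and (27) and maps them to probability bounds with the logistic transform.

```python
import numpy as np

# Alpha-level bounds of the fuzzy logistic model:
#   zL = sum_j min(aL_j * xL_j, aL_j * xR_j)     (cf. Eq. (26))
#   zR = sum_j max(aR_j * xL_j, aR_j * xR_j)     (cf. Eq. (27))
# and the bounded preference strength P = 1 / (1 + exp(-z)).

def log_odds_bounds(aL, aR, xL, xR):
    """Lower/upper bounds of the predicted log-odds at one alpha level."""
    zL = np.sum(np.minimum(aL * xL, aL * xR))
    zR = np.sum(np.maximum(aR * xL, aR * xR))
    return zL, zR

def preference_bounds(zL, zR):
    """Map log-odds bounds to probability bounds (monotone transform)."""
    logistic = lambda z: 1.0 / (1.0 + np.exp(-z))
    return logistic(zL), logistic(zR)

# Toy alpha-cuts: bias term first (its input is fixed at 1, Eq. (28)),
# then 0/1 codes for hard disk, memory, and new/used.  Coefficient
# bounds are taken from Table 3 for illustration only.
aL = np.array([0.847, 0.000, 1.781, 1.791])  # lower coefficient bounds
aR = np.array([1.443, 0.009, 1.881, 2.777])  # upper coefficient bounds
xL = np.array([1.0, 1.0, 1.0, 0.0])          # crisp inputs: xL == xR
xR = xL.copy()

zL, zR = log_odds_bounds(aL, aR, xL, xR)
pL, pR = preference_bounds(zL, zR)
assert zL <= zR and 0.0 < pL <= pR < 1.0
print(f"log-odds in [{zL:.3f}, {zR:.3f}], preference in [{pL:.3f}, {pR:.3f}]")
```

Because the logistic function is monotone, applying it to the log-odds bounds directly yields the bounds of the fuzzy preference probability.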

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Hard Disk | 0.000 | 0.000 | 0.000 |
| Memory | 0.008 | 0.008 | 0.008 |
| New or Used | 0.043 | 0.043 | 0.043 |
| *Ajk*0 | 1.806 | 1.806 | 1.806 |

Note: The independent variables are fuzzy L-R numbers.

Table 4. Coefficients of Fuzzy Logistic Regression Analysis

### **4.2 Group level analysis of ambiguous comparative model**

### **4.2.1 Example of additive difference model**

### **4.2.1.1 Participants and procedure**

The participants were 100 undergraduate university students (68 female and 32 male) enrolled in an economic psychology class at Waseda University. They were recruited for an experiment investigating "consumer preference". Their average age was 21.3 years. The participants rated differences in WTP for two different digital cameras with three types of attribute information (weight: 130 or 160 grams; memory: 25 or 50 MB; display size: 2.5 or 5.0 inches). The participants compared a certain alternative with seven different alternatives. The participants also rated differences in WTP for two different mobile phones with three types of attribute information (weight: 123 or 132 grams; pixel number: 3,200,000 or 5,070,000 pixels; display size: 2.8 or 3.0 inches). The participants compared a certain alternative with seven different mobile phones. The participants provided representative values and lower and upper WTP values using a fuzzy rating method. They were asked the amount of money they would be willing to pay to upgrade from the inferior alternative to the superior alternative. That is, the participants answered the lower value, the representative value, and the upper value for the amount of money they would be willing to pay. An example of a fuzzy WTP rating is illustrated in Figure 8.

### **Question:**

|  | Brand A | Brand B |
|---|---|---|
| Weight | 130 g | 160 g |
| Memory | 25 MB | 50 MB |
| Display | 5.0 inches | 2.5 inches |

Which alternative do you prefer? Please circle the superior alternative.

Then, please estimate the amount of money you would be willing to pay to upgrade from the inferior alternative to the superior alternative using the fuzzy rating method. That is, answer the lower value, the representative value, and the upper value for the amount of money you would be willing to pay.

**Difference**

> Minimum: 2,000 yen ----- Representative Value: 5,000 yen ----- Maximum: 10,000 yen

Fig. 8. Example of Fuzzy WTP Rating

### **4.2.1.2 Analysis and results**

The fuzzy coefficients were obtained by fuzzy linear regression analysis using least squares under constraints, as shown in Table 5 for the digital camera data and Table 6 for the mobile phone data. The independent variables in Tables 5 and 6 are objective values measured by crisp numbers. The parameter *Ajk*0 involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. According to Table 5, the preference strength concerning comparative judgment was influenced most by whether the target digital camera's display was 2.5 or 5.0 inches. The impact of the memory attribute was smaller than those of the display size and weight dimensions. According to Table 6, the preference strength concerning comparative judgment was influenced most by whether the target mobile phone's display was 2.8 or 3.0 inches. The impact of the pixel number attribute was smaller than those of the display size and weight dimensions. The participants also rated the desirability of the attribute information for each alternative using a fuzzy rating method. The fuzzy rating scale of desirability ranged from 0 to 100 points (see Figure 6). That is, the participants answered the lower value, the representative value, and the upper value for each attribute value.

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Weight | 48.57 | 48.57 | 68.33 |
| Memory | 8.29 | 8.29 | 14.62 |
| Display Size | 223.10 | 4791.98 | 4791.98 |
| *Ajk*0 | 11361.25 | 11361.25 | 15447.54 |

Note: The independent variables are crisp numbers.

Table 5. Coefficients of Fuzzy Regression Analysis for Digital Camera Data

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Weight | 28.84 | 28.84 | 53.44 |
| Pixel Number | -12.12 | 28.55 | 28.55 |
| Display Size | -233.73 | 190.29 | 190.29 |
| *Ajk*0 | 7758.98 | 8234.94 | 12569.35 |

Note: The independent variables are crisp numbers.

Table 6. Coefficients of Fuzzy Regression Analysis for Mobile Phone Data

### **4.2.2 Example of the logistic model**

### **4.2.2.1 Participants and procedure**

The participants were 100 undergraduate university students (68 female and 32 male). Their average age was 21.3 years. The participants rated the ambiguous probability of preferring a certain digital camera out of seven different digital cameras. The three types of attribute information (weight: 130 or 160 grams; memory: 25 or 50 MB; display size: 2.5 or 5.0 inches) were manipulated in the same manner as in the previous individual judgment task. They also rated the ambiguous probability of preferring a certain mobile phone out of seven different mobile phones. The three types of attribute information (weight: 123 or 132 grams; pixel number: 3,200,000 or 5,070,000 pixels; display size: 2.8 or 3.0 inches) were manipulated in the same manner as in the previous judgment task. The participants provided representative values and lower and upper values of probabilities. That is, the participants answered the lower value, the representative value, and the upper value for the probability that the superior alternative is preferred to the inferior alternative. The participants used the fuzzy rating method to provide representative, lower, and upper values of probabilities (see Figure 7).

### **4.2.2.2 Analysis and results**

The fuzzy coefficients were obtained by fuzzy logistic regression analysis using least squares under constraints, as shown in Table 7 for the digital camera data and Table 8 for the mobile phone data. The independent variables in Tables 7 and 8 are objective values measured by crisp numbers. The parameter *Ajk*0 involves a response bias owing to presentation order, context effects, and the scale parameter of the dependent variables. According to Table 7, the bounded preference strength was influenced most by whether the target digital camera's display was 2.5 or 5.0 inches. The impact of the memory attribute was smaller than those of the display size and weight dimensions. According to Table 8, the bounded preference strength was influenced most by whether the target mobile phone's display was 2.8 or 3.0 inches. The impact of the weight attribute was smaller than those of the display size and pixel number dimensions.

| Fuzzy Coefficient | Lower | Representative | Upper |
|---|---|---|---|
| Weight | 0.035 | 0.038 | 0.054 |
| Memory | 0.003 | 0.003 | 0.003 |
| Display Size | 2.625 | 2.625 | 2.625 |
| *Ajk*0 | -0.122 | 0.459 | 1.072 |

Note: The independent variables are crisp numbers.

Table 7. Coefficients of Fuzzy Logistic Regression Analysis for Digital Camera Data


marketing research, risk perception research, and human judgment and decision-making research. Empirical research using possibilistic analysis and least squares analysis will be

Results of these applications to psychological study indicated that the parameter estimated in the proposed analysis was meaningful for social judgment study. This study has a methodological restriction on statistical inferences for fuzzy parameters. Therefore, we plan further work on the fuzzy theoretic analysis of social judgment directed toward the statistical study of fuzzy regression analysis and fuzzy logistic regression analysis such as

This work was supported in part by Grants in Aids for Grant-in-Aid for Scientific Research on Priority Area, The Ministry of Education, Culture, Sports, Science and Technology(MEXT). I thank Matsumoto,T., Matsuyama,S.,and Kobayashi,M.. for their

Anderson,N.H.(1988). A functional approach to person cognition. In T.K.Srull & R.S. Wyer

Dubois D. & Prade,H. (1980). Fuzzy sets and systems: Theory and applications, New York:

Festinger, L. (1954). A theory of social comparison processes. *Human Relations,* 7, 114–140. Hesketh, B., Pryor, R., Gleitzman, M., & Hesketh, T. (1988). Practical applications and

(Ed.), *Fuzzy sets in psychology* (pp. 425–454). New York: North Holland. Kühberger,A.,.Schulte-Mecklenbeck,M. & Ranyard,R. (2011). Introduction: Windows for

Mussweiler, T. (2003). Comparison processes in social judgment: Mechanisms and

Nakamura, K. (1992). On the nature of intransitivity in human referential judgments. In V.

Nguyen, H. T. (1978). A note on the extension principle for fuzzy sets. *Journal of Mathematical* 

Rosch,E. (1975). Cognitive representation of semantic categories. Journal of Experimental

Rosch,E., & Mervis,C.B. (1975). Family resemblances: Studies in the internal structure of

Sakawa, M., & Yano, H. (1992). Multiobjective fuzzy linear regression analysis for fuzzy

(Eds.), *Advances in social cognition.* vol.1. Hiisdale, New Jersey: Lawrence Erlbaum

psychometric evaluation of a computerised fuzzy graphic rating scale. In T. Zetenyi

understanding the mind, In M.Schulte-Mecklenbeck, A.Kühberger, & R. Ranyard(Eds.), A handbook of process tracing methods for decision research,New

Novak (Ed.), *Fuzzy approach to reasoning and decision making, academia* (pp. 147–162).

statistical tests of parameters, outlier detection, and step-wise variable selection.

assistance, and the editor and the reviewers for their valuable comments.

needed to examine the validity of these models.

**6. Acknowledgment** 

**7. References** 

Associates, pp.37-51.

Yorrk: Psychologgy Press, pp.3-17.

Prague: Kluwer Academic Publishers.

*Analysis and Application,* 64, 369–380.

Psychology: General, 104, 192-233.

categories. *Cognitive Psychology,* 7, 573-603.

input-output data. *Fuzzy Sets and Systems,* 47, 173–181.

consequences*. Psychological Review,* 110, 472–489.

Academic Press.


Note: The independent variables are crisp numbers.

Table 8. Coefficients of Fuzzy Logistic Regression Analysis for Mobile Phone Data

### **5. Conclusion**

This chapter introduce fuzzy set models for ambiguous comparative judgments, which do not always hold transitivity and comparability properties. The first type of model was a fuzzy theoretical extension of the additive difference model for preference that is used to explain ambiguous preference strength. This model can be reduced to the Fuzzy Utility Difference Model (Nakamura, 1992) if multi-attribute weighting parameters are assumed to be crisp numbers, and can be reduced to the Additive Difference Model (Tversky, 1969) if multi-attribute weighting parameters and the values of multi-attributes are assumed to be crisp numbers. The second type of model was a fuzzy logistic model for explaining ambiguous preference in which preference strength is bounded, such as a probability measure.

In both models, multi-attribute weighting parameters and all attribute values were assumed to be asymmetric fuzzy L-R numbers. For each model, parameter estimation method using fuzzy regression analysis was introduced. Numerical examples for comparison were also demonstrated. As the objective of the numerical examples was to demonstrate that the proposed estimation might be viable, further empiric studies will be needed. Moreover, because the two models require different evaluation methods, comparisons of the psychological effects of the two methods must be studied further.

In this chapter, the least squares method was used for data analyses of the two models. However, the possibilistic linear regression analysis (Sakawa & Yano, 1992) and the possibilistic logistic regression analysis (Takemura, 2004) could also be used in the data analysis of the additive difference type model and the logistic type model, respectively. The proposed models and the analyses for ambiguous comparative judgments will be applied to marketing research, risk perception research, and human judgment and decision-making research. Empirical research using possibilistic analysis and least squares analysis will be needed to examine the validity of these models.

Results of these applications to psychological study indicated that the parameter estimated in the proposed analysis was meaningful for social judgment study. This study has a methodological restriction on statistical inferences for fuzzy parameters. Therefore, we plan further work on the fuzzy theoretic analysis of social judgment directed toward the statistical study of fuzzy regression analysis and fuzzy logistic regression analysis such as statistical tests of parameters, outlier detection, and step-wise variable selection.

### **6. Acknowledgment**

22 Fuzzy Logic – Algorithms, Techniques and Implementations

Weight (M) Representative 0.002 Weight (R) Upper 0.009

Pixel Number(R) Upper 0.024 Display Size(R) Lower 0.161 Display Size(M) Representative 0.165 Display Size(L) Upper 0.232

(L) Lower -0.871

(M) Representative 0.030

Weight(L) Lower 0.002

Fuzzy Pixel Number(L) Lower 0.012 Coefficient Pixel Number(M) Representative 0.017

(R) Upper 0.887

This chapter introduce fuzzy set models for ambiguous comparative judgments, which do not always hold transitivity and comparability properties. The first type of model was a fuzzy theoretical extension of the additive difference model for preference that is used to explain ambiguous preference strength. This model can be reduced to the Fuzzy Utility Difference Model (Nakamura, 1992) if multi-attribute weighting parameters are assumed to be crisp numbers, and can be reduced to the Additive Difference Model (Tversky, 1969) if multi-attribute weighting parameters and the values of multi-attributes are assumed to be crisp numbers. The second type of model was a fuzzy logistic model for explaining ambiguous preference in which preference strength is bounded, such as a probability

In both models, multi-attribute weighting parameters and all attribute values were assumed to be asymmetric fuzzy L-R numbers. For each model, parameter estimation method using fuzzy regression analysis was introduced. Numerical examples for comparison were also demonstrated. As the objective of the numerical examples was to demonstrate that the proposed estimation might be viable, further empiric studies will be needed. Moreover, because the two models require different evaluation methods, comparisons of the

In this chapter, the least squares method was used for data analyses of the two models. However, the possibilistic linear regression analysis (Sakawa & Yano, 1992) and the possibilistic logistic regression analysis (Takemura, 2004) could also be used in the data analysis of the additive difference type model and the logistic type model, respectively. The proposed models and the analyses for ambiguous comparative judgments will be applied to

psychological effects of the two methods must be studied further.

Table 8. Coefficients of Fuzzy Logistic Regression Analysis for Mobile Phone Data

Attribute Value

*Ajk* <sup>0</sup> *Ajk* <sup>0</sup>

*Ajk*<sup>0</sup>

Note: The independent variables are crisp numbers.

**5. Conclusion** 

measure.

This work was supported in part by Grants in Aids for Grant-in-Aid for Scientific Research on Priority Area, The Ministry of Education, Culture, Sports, Science and Technology(MEXT). I thank Matsumoto,T., Matsuyama,S.,and Kobayashi,M.. for their assistance, and the editor and the reviewers for their valuable comments.

### **7. References**


**2** 

Agnes Achs

 *Hungary* 

**From Fuzzy Datalog to** 

*University of Pecs Faculty of Engineering,* 

**Multivalued Knowledge-Base** 



## **From Fuzzy Datalog to Multivalued Knowledge-Base**

Agnes Achs
*University of Pecs, Faculty of Engineering, Hungary*

### **1. Introduction**


Despite the fact that people have very different and ambiguous concepts and knowledge, they are able to talk to one another. How does the human mind work? How can people answer questions? Modelling human conversation and knowledge demands dealing with uncertainty and deduction.

Human knowledge consists of static and dynamic knowledge chunks. The static ones include so-called lexical knowledge and the ability to sense similarities between facts and between predicates. Through dynamic attainments one can make deductions and answer questions. There are several very different approaches to modelling human knowledge, but one of the most common and widespread fields of research is based on fuzzy logic.

Fuzzy set theory, proposed by Zadeh (1965), is a realistic and practical means to describe the world we live in. The method has been applied successfully in various fields, among others decision making, logic programming, and approximate reasoning. In the last decade, a number of papers have dealt with this subject, e.g. (Formato et al 2000, Sessa 2002, Medina et al 2004, Straccia et al 2009). They deal with different aspects of modelling and handling uncertainty. (Straccia 2008) gives a detailed overview of this topic with extensive references. Our investigations began independently of these works and have run parallel to them. There are of course some similar features, but our model differs from the others detailed in the literature.

As a generalization of fuzzy sets, intuitionistic fuzzy sets were introduced by Atanassov (Atanassov 1983); they allow uncertainty and information to be treated from a much broader perspective. Another well-known generalization of an ordinary fuzzy set is the interval-valued fuzzy set, first introduced by Zadeh (Zadeh 1975). These generalizations make descriptions and models of the world more realistic and practical.

In the beginning, our knowledge-base model was based on the concept of fuzzy logic; later it was extended to intuitionistic and interval-valued logic. In this model, the static part is a background knowledge module, while the dynamic part consists of a Datalog-based deduction mechanism. To develop this mechanism, it was necessary to generalize the Datalog language and to extend it in the fuzzy and intuitionistic directions (Achs 1995, 2007, 2010).


In many frameworks, in order to answer a query, we have to compute the whole intended model by a bottom-up fixed-point computation and then answer the query by evaluating it in this model. This always requires computing a whole model, even if not all the facts and rules are needed to determine the answer. Therefore a top-down-like evaluation algorithm has been developed for our model. This algorithm is not a pure top-down one but a combination of top-down and bottom-up evaluation. Our aim is to improve this algorithm and perhaps to develop a pure top-down evaluation based on a fuzzy or multivalued unification algorithm. Fuzzy unification algorithms are described, for example, in (Alsinet et al 1998, Formato et al 2000, Virtanen 1994), but they are inappropriate for evaluating our knowledge-base.

However, the concept of (Julian-Iranzo et al 2009, 2010) is similar, though not identical, to one of our former ideas about evaluating special fuzzy Datalog programs (Achs 2006). Reading these papers led to the assumption that this former idea may be the basis of a top-down-like evaluation strategy in special multivalued cases as well. Based on this idea, a multivalued unification algorithm was developed and used to determine the consequence of a multivalued knowledge-base.

In this chapter a possible model for handling uncertain information is provided. The model is based on multivalued extensions of Datalog. Starting from fuzzy Datalog, the concepts of intuitionistic Datalog and bipolar Datalog are described; this is the first pillar of the knowledge-base. The second pillar deals with the similarities of facts and concepts, which are handled with proximity relations. The third component connects the first two with each other. In the final part of the chapter an evaluation algorithm is presented. It is discussed in general, but in special cases it is based on fuzzy or multivalued unification, which is also discussed.

### **2. Extensions of Datalog**

When one builds a knowledge-base, it is very important to deal with a database management system, which is based on the relational data model developed by Codd in 1970. This model is very useful, but it cannot handle every problem. For example, the standard query language for relational databases (SQL) is not Turing-complete; in particular it lacks recursion, and therefore concepts like the transitive closure of a relation cannot be expressed in SQL. Along with other problems, this is why extensions of the relational data model, or the development of other kinds of models, are necessary. A more complete alternative is the world of deductive databases. A deductive database consists of facts and rules, and a query is answered by building chains of deductions. The term deductive database therefore highlights the ability to use a logic programming style for expressing deductions concerning the contents of a database. One of the best known deductive database query languages is Datalog.
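The transitive-closure limitation can be made concrete. The short Python sketch below (with an invented edge relation, purely for illustration) applies the two usual Datalog rules for *path* repeatedly until no new facts appear — exactly the kind of recursive deduction described above:

```python
# Transitive closure of a relation, computed Datalog-style:
#   path(X, Y) <- edge(X, Y).
#   path(X, Z) <- edge(X, Y), path(Y, Z).
def transitive_closure(edges):
    path = set(edges)            # first rule: every edge is a path
    changed = True
    while changed:               # naive bottom-up evaluation to a fixed point
        changed = False
        for (x, y) in list(path):
            for (y2, z) in edges:
                if y == y2 and (x, z) not in path:
                    path.add((x, z))
                    changed = True
    return path

edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(transitive_closure(edges)))
```

The loop is the naive evaluation strategy: it terminates because only finitely many pairs over the given constants exist and the set of derived facts only grows.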

As any deductive database, a Datalog program consists of facts and rules, which can be regarded as first-order logic formulas. Using these rules, new facts can be inferred from the program's facts so that the consequence of the program is logically correct. This means that when the program is evaluated, the result is a model of the formulas belonging to the rules. On the other hand, it is also important that this model contain only those true facts which are consequences of the program; that is, the minimality of this model is expected, i.e. in this model it is impossible to make any true fact false and still have a model consistent with the database.

An interpretation assigns truth or falsehood to every possible instance of the program's predicates. An interpretation is a model if it makes the rules true, no matter what assignment of values from the domain is made for the variables in each rule. Although there are infinitely many interpretations, it has been proved that it is enough to consider only the Herbrand interpretations, defined on the Herbrand universe and the Herbrand base.

The Herbrand universe of a program *P* (denoted by *HP*) is the set of all possible ground terms constructed by using constants and function symbols occurring in *P*. The Herbrand base of *P* (*BP*) is the set of all possible ground atoms whose predicate symbols occur in *P* and whose arguments are elements of *HP*.

In general, a term is a variable, a constant or a complex term of the form *f(t1, …, tn)*, where *f* is a function symbol and *t1, …, tn* are terms. An atom is a formula of the form *p(t)*, where *p* is a predicate symbol of finite arity (say *n*) and *t* is a sequence of terms of length *n* (its arguments). A literal is either an atom (positive literal) or its negation (negative literal). A term, atom or literal is ground if it is free of variables. As the fuzzy extension does not deal with function symbols, in our case the ground terms are the constants of the program.

In the case of Datalog programs there are several equivalent approaches to defining the semantics of a program. In the fuzzy extension we mainly rely on the fixed-point-based approach. The above concepts are detailed in classical works such as (Ceri et al 1990, Lloyd 1990, Ullman 1988).

### **2.1 Fuzzy Datalog**

In fuzzy Datalog (fDATALOG) each fact is completed with an uncertainty level, and each rule with an uncertainty level and an implication operator. With this operator and these levels, deductions can be made. As in the classical case, logical correctness is extremely important: the consequence must be a model of the program. This means that for each rule of the program, the truth value of the fuzzy implication associated with the rule has to be at least as large as the given uncertainty level.

### **2.1.1 Syntax and semantics of fuzzy Datalog**

More precisely, the notion of fuzzy rule is the following:

**Definition 1**. An fDATALOG rule is a triplet *r;* β*; I*, where r is a formula of the form

$$A \gets A\_1, \ldots, A\_n \qquad \quad (n \ge 0),$$

where *A* is an atom (the head of the rule), *A1,…,An* are literals (the body of the rule); *I* is an implication operator and β∈ (0,1] (the level of the rule).

For getting a finite result, all the rules in the program must be safe. An fDATALOG rule is safe if all variables occurring in the head also occur in the body, and all variables occurring in a negative literal also occur in a positive one. An fDATALOG program is a finite set of safe fDATALOG rules.
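The safety condition lends itself to a mechanical check. The following Python sketch is illustrative only — the tuple encoding of literals and the uppercase-variable convention are assumptions of the sketch, not notation from this chapter:

```python
# A body literal is (positive, predicate, args); the head is an atom in the
# same encoding. Variables are uppercase strings, constants anything else.
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def is_safe(head, body):
    """Safe: every head variable occurs in the body, and every variable of a
    negative literal also occurs in some positive literal."""
    pos_vars = {t for (pos, _, args) in body if pos for t in args if is_var(t)}
    neg_vars = {t for (pos, _, args) in body if not pos for t in args if is_var(t)}
    head_vars = {t for t in head[2] if is_var(t)}
    return head_vars <= pos_vars and neg_vars <= pos_vars

# p(X) <- r(X), not q(X)  -- every variable is bound by the positive r(X)
head = (True, "p", ("X",))
body = [(True, "r", ("X",)), (False, "q", ("X",))]
print(is_safe(head, body))   # prints True
```

An unsafe rule such as *p(X, Y)* ← *r(X)* fails the check, since *Y* never occurs in the body.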


There is a special type of rule, called a fact. A fact has the form *A* ←*;* β*; I*. From now on, facts are referred to as (*A,* β), because according to the implication *I* the level of *A* can easily be computed, and for the implication operators detailed in this chapter it is β.

For defining the meaning of a program, we again need the concepts of Herbrand universe and Herbrand base, but this time based on fuzzy logic. A ground instance of a rule *r;* β*; I* in *P* is a rule obtained from *r* by replacing every variable in *r* with a constant of *HP*. The set of all ground instances of *r;* β*; I* is denoted by *ground(r);* β*; I*. The ground instance of *P* is *ground(P) =* ∪*(r;* β*; I)*∈*P (ground(r);* β*; I)*.
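Grounding can be sketched in a few lines. Since we work function-free, *HP* is just the set of constants; the tuple encoding of atoms below is an assumption of the sketch:

```python
from itertools import product

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def ground_rule(head, body, constants):
    """All ground instances of a rule: substitute constants for variables
    in every possible way."""
    atoms = [head] + body
    variables = sorted({t for (_, _, args) in atoms for t in args if is_var(t)})
    instances = []
    for combo in product(constants, repeat=len(variables)):
        sub = dict(zip(variables, combo))
        inst = [(pos, pred, tuple(sub.get(t, t) for t in args))
                for (pos, pred, args) in atoms]
        instances.append((inst[0], inst[1:]))
    return instances

# p(X) <- r(X) over H_P = {a, b} yields two ground instances
print(ground_rule((True, "p", ("X",)), [(True, "r", ("X",))], ["a", "b"]))
```

A rule without variables yields exactly one ground instance: itself.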

An interpretation of a program *P* is a fuzzy set of the program's Herbrand base, *BP*, i.e. it is: ∪ *<sup>A</sup>*∈*BP (A;* α*A).* An interpretation is a model of *P* if for each *(A* ← *A1,…,An;* β*; I)* ∈ *ground(P)*

$$I(\alpha\_{A\_1 \wedge \ldots \wedge A\_n}, \alpha\_A) \ge \beta.$$

A model *M* is the least model if *M* ≤ *N* for every model *N*. A model *M* is minimal if there is no model *N* ≠ *M* with *N* ≤ *M*.

For brevity, α*A1*∧*...*∧*An* will be denoted by α*body*, and α*A* by α*head*.

In the extensions of Datalog several implication operators are used, but in all cases we restrict ourselves to min-max conjunction and disjunction, and to the complement to 1 as negation. So α*A*∧*B = min(*α*A,* α*B)*, α*A*∨*B = max(*α*A,* α*B)* and α¬*A = 1* − α*A*.

The semantics of fDATALOG is defined by the fixed points of consequence transformations. Depending on the evaluation sequence, two semantics can be defined: a deterministic and a nondeterministic one. From here on only the nondeterministic semantics is discussed; the deterministic one is detailed in (Achs 2010). It was proved that the two semantics are equivalent for negation- and function-free fDATALOG programs, but they differ if the program contains negation. In that case only the nondeterministic semantics is applicable. The nondeterministic transformation is as follows:

**Definition 2.** Let *BP* be the Herbrand base of the program *P*, and let *F(BP)* denote the set of all fuzzy sets over *BP*. The consequence transformation *NTP* : *F(BP)* → *F(BP)* is defined as

$$NT\_P(X) = \{(A, \alpha\_A)\} \cup X, \tag{1}$$

where *(A* ← *A1,…,An;* β*; I)* ∈ *ground(P)*, *(|Ai|,* α*|Ai|)* ∈ *X (1* ≤ *i* ≤ *n)*; α*body = min(*α*A1,…,* α*An)*, with α*Ai =* α*|Ai|* for a positive and α*Ai = 1* − α*|Ai|* for a negative literal *Ai*, and

$$\alpha\_A = \max(0, \min\{\gamma \mid I(\alpha\_{body}, \gamma) \ge \beta\}).$$


It can be proved that this transformation has a fixed point. To prove it, let us define the powers of a transformation:

For any transformation *T : F(BP)* → *F(BP)*, let

$$T\_0 = \cup\{(A, \alpha\_A) \mid (A \gets;\ \beta;\ I) \in ground(P),\ \alpha\_A = \max(0, \min\{\gamma \mid I(1, \gamma) \ge \beta\})\} \cup \{(A, 0) \mid \exists\,(B \gets \ldots \neg A \ldots;\ \beta;\ I) \in ground(P)\}$$

and let


$$T\_1 = T(T\_0)$$

$$T\_n = T(T\_{n-1})$$

$$\dots$$

$$T\_\omega = \text{least upper bound of } \{ T\_n \mid n < \omega \} \text{ if } \omega \text{ is a limit ordinal.}$$

An ordering relation can be defined over *F(BP)*: for *G, H : BP* → [*0, 1*], *G* ≤ *H* iff *G(d)* ≤ *H(d)* for all *d* ∈ *BP*. It can easily be seen that *L = (F(BP),* ≤*)* is a complete lattice.

It is clear that *NTP* is an inflationary transformation over *L*, and if *P* is negation-free, then *NTP* is monotone as well. (A transformation *T* is inflationary if *X* ≤ *T(X)* for every *X* ∈ *L*, and monotone if *T(X)* ≤ *T(Y)* whenever *X* ≤ *Y*.)

In (Ceri et al 1990) it is shown that an inflationary transformation over a complete lattice has a fixed point, and if it is monotone then it has a least fixed point (Lloyd 1990). Therefore *NTP* has a fixed point, i.e. there exists an *X* ∈ *F(BP)* for which *NTP(X)* = *X*. If *P* is positive, then *X* is the least fixed point, that is, *X* ≤ *Z* for any *Z = T(Z)*.

The fixed point of the transformation will be denoted by *lfp(NTP)*. It can be shown (Achs 1995) that this fixed point is a model of *P*.

**Theorem 1**. *lfp(NTP)* is a model of *P*.

**Proof** In *ground(P)* the rules are of the following forms:

a/ *(A* ←*;* β*; I)*.

b/ *(A* ← *A1, ..., An;* β*; I)*; *(A,* α*A)* ∈ *lfp(NTP)* and *(|Ai|,* α*Ai)* ∈ *lfp(NTP)*, *1* ≤ *i* ≤ *n*.

c/ *(A* ← *A1, ..., An;* β*; I)*; ∃*i : (|Ai|,* α*Ai)* ∉ *lfp(NTP)*.

In cases a/ and b/, the condition *I(*α*body,* α*<sup>A</sup>)* ≥ β holds because of the construction of α*A*. In case c/, because of the construction of *T0*, *Ai* is not negative, i.e. *Ai* is not among the facts, so α*Ai = 0* and therefore α*body = 0*; hence *I(*α*body,* α*<sup>A</sup>) = 1* ≥ β. Thus *lfp(NTP)* is a model.

Moreover, the next proposition is true as well.

**Proposition 1**. For a negation-free fDATALOG program *P*, *lfp(NTP)* is the least model.

**Proof** In the case of a positive Datalog program, the least fixed point is the least model (Ceri et al 1990, Ullman 1988). In the case of fuzzy Datalog, according to the definition of the consequence transformation, the level of a rule's head is the least value satisfying the criterion of modelness. Only one problem may arise when applying the transformation: a lower level might be assigned to the same rule head, but according to the definitions the higher value must be accepted. Such a case can arise only in programs containing negation. Therefore the proposition is true.

According to the above statements, the meaning of a program can be defined by this fixed point:

**Definition 3.** *lfp(NTP)* is the nondeterministic semantics of fDATALOG *P* program.
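For a negation-free program the fixed point can be computed bottom-up. The Python sketch below assumes already-ground rules and the Gödel implication, for which the head level is *min(*α*body,* β*)* (see the uncertainty-level functions later in the chapter); it also keeps the higher level when the same head is derived twice, as required by the transformation:

```python
# Bottom-up fixed point of a negation-free fDATALOG program under the
# Goedel implication, where f(I_G, alpha, beta) = min(alpha, beta).
def lfp(facts, rules):
    """facts: {ground_atom: level}; rules: list of (head, body_atoms, beta)
    over ground atoms. Returns the least fixed point of the consequence
    transformation; levels only grow, so the iteration terminates."""
    levels = dict(facts)
    changed = True
    while changed:
        changed = False
        for head, body, beta in rules:
            if all(a in levels for a in body):
                body_level = min(levels[a] for a in body)   # alpha_body
                new = min(body_level, beta)                 # f(I_G, alpha_body, beta)
                if new > levels.get(head, 0):               # keep the higher level
                    levels[head] = new
                    changed = True
    return levels

facts = {"r(a)": 0.8}
rules = [("q(a)", ["r(a)"], 0.5), ("p(a)", ["q(a)"], 0.8)]
print(lfp(facts, rules))   # prints {'r(a)': 0.8, 'q(a)': 0.5, 'p(a)': 0.5}
```

Here *q(a)* gets level *min(0.8, 0.5) = 0.5* and *p(a)* gets *min(0.5, 0.8) = 0.5*, matching the head-level rule above.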

From Fuzzy Datalog to Multivalued Knowledge-Base 31



To compute the levels of rule-heads, we need the concept of the uncertainty-level function.

**Definition 4**. The uncertainty-level function is:

$$f(I, \alpha, \beta) = \min\,(\{\, \gamma \mid I(\alpha, \gamma) \geq \beta \,\}).$$

According to this function the level of a rule-head is: α*head = f(I,*α*body,* β*).*

It is an extremely important question whether the fixed-point algorithm terminates or not. This depends on the following property of the uncertainty-level function:

**Proposition 2**. If *f(I,* α*,* β*)* ≤ α for every α ∈ [0; 1], then the fixed-point algorithm terminates.

**Proof.** As *P* is finite, the fixed point contains only finitely many ground atoms. The only problem may occur with the levels of recursive predicates, but according to the above property of the uncertainty-level function, the level of a rule's head cannot be greater than any former one, so the algorithm must terminate.

In former papers (Achs 1995, Achs 2006) several implications were detailed (the operators themselves are discussed in (Dubois et al, 1991)); for now, three of them are chosen. The values of their uncertainty-level functions can be easily computed. They are the following:

$$\begin{array}{lll}
\text{G\"odel} & I_G(\alpha,\gamma) = \begin{cases} 1 & \alpha \le \gamma \\ \gamma & \text{otherwise} \end{cases} & f(I_G,\alpha,\beta) = \min(\alpha,\beta) \\[2ex]
\text{Lukasiewicz} & I_L(\alpha,\gamma) = \begin{cases} 1 & \alpha \le \gamma \\ 1-\alpha+\gamma & \text{otherwise} \end{cases} & f(I_L,\alpha,\beta) = \max(0,\ \alpha+\beta-1) \\[2ex]
\text{Kleene-Dienes} & I_K(\alpha,\gamma) = \max(1-\alpha,\ \gamma) & f(I_K,\alpha,\beta) = \begin{cases} 0 & \alpha+\beta \le 1 \\ \beta & \alpha+\beta > 1 \end{cases}
\end{array}$$

It is obvious that *IG* and *IL* satisfy the condition of Proposition 2, and it is easy to see that in the case of *IK* the fixed-point algorithm terminates as well. (Among the operators of (Dubois et al, 1991) there is one for which the algorithm does not terminate and one for which the uncertainty-level function does not exist.)
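The three uncertainty-level functions above are simple enough to compute directly. A minimal Python sketch (the function names are mine, not the chapter's):

```python
# Sketch of the uncertainty-level functions f(I, alpha, beta) tabulated above.
# alpha is the level of the rule body, beta the level of the rule.

def f_godel(alpha, beta):
    # f(I_G, alpha, beta) = min(alpha, beta)
    return min(alpha, beta)

def f_lukasiewicz(alpha, beta):
    # f(I_L, alpha, beta) = max(0, alpha + beta - 1)
    return max(0.0, alpha + beta - 1)

def f_kleene_dienes(alpha, beta):
    # f(I_K, alpha, beta) = 0 if alpha + beta <= 1, else beta
    return 0.0 if alpha + beta <= 1 else beta
```

Note that `f_godel` and `f_lukasiewicz` never exceed their first argument, which is exactly the termination condition of Proposition 2; `f_kleene_dienes` may exceed it, yet the algorithm still terminates for *IK*, as stated above.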

**Example 1.** Let us consider the next program:

$$\begin{array}{l}
(p(a),\ 0.8). \\
(r(b),\ 0.6). \\
q(x,y) \leftarrow p(x),\ r(y);\ 0.7;\ I_G. \\
q(x,y) \leftarrow q(y,x);\ 0.9;\ I_L. \\
s(x) \leftarrow q(x,y);\ 0.7;\ I_K.
\end{array}$$

Then *T0* = {*(p(a), 0.8), (r(b), 0.6)* } and the computed atoms are:

$$\begin{array}{l}
(q(a,b),\ \min(\min(0.8, 0.6),\ 0.7) = 0.6); \\[1ex]
(q(b,a),\ \max(0,\ 0.6 + 0.9 - 1) = 0.5); \\[1ex]
(s(a),\ \begin{cases} 0 & 0.6+0.7 \le 1 \\ 0.7 & 0.6+0.7 > 1 \end{cases} = 0.7); \\[2ex]
(s(b),\ \begin{cases} 0 & 0.5+0.7 \le 1 \\ 0.7 & 0.5+0.7 > 1 \end{cases} = 0.7).
\end{array}$$

So *lfp(NTP) = {(p(a), 0.8), (r(b), 0.6), (q(a, b), 0.6), (q(b, a), 0.5), (s(a), 0.7), (s(b), 0.7) }.*

As the next examples show, problems can arise if the program contains negation.

**Example 2.** Look at the next one-rule program:

$$p(a) \leftarrow \neg q(b);\ I_G;\ 0.7.$$

This program has no least model, only two minimal ones:

*M1 = {(p(a), 0.7) } and M2 = {(q(b), 1) }.* 

(The result of the above fixed point algorithm is *M1*.)

**Example 3.** This example shows that there is a difference between the fixed points.

1. (r(a), 0.8).
2. p(x) ← r(x), ¬q(x); 0.6; IG.
3. q(x) ← r(x); 0.5; IG.
4. p(x) ← q(x); 0.8; IG.

30 Fuzzy Logic – Algorithms, Techniques and Implementations



The result depends on the evaluation order. If it is 1., 2., 3., 4., then

$$\text{lfp}(\text{NT}\_P) = \{ (r(a), 0.8), (p(a), 0.6), (q(a), 0.5) \}.$$

while in the order 1., 3., 2., 4.

$$\text{lfp}(\text{NT}\_P) = \{(r(a), 0.8), (p(a), 0.5), (q(a), 0.5)\}.$$

According to the above examples, in the case of programs containing negation there are problems with the model's minimality. However, the nondeterministic semantics – *lfp(NTP)* – is minimal under certain conditions. These conditions are referred to as stratification. Stratification gives an evaluation sequence in which literals are evaluated before being negated.
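The order dependence of Example 3 can be reproduced with a small Python sketch (helper names are mine; *a* is the only constant, and a single pass in the given rule order already reaches the fixed point for this program). The level of a negated atom is taken as 1 minus its current level:

```python
# Sketch of Example 3 under the Godel implication, showing how the result
# of the evaluation depends on the order in which the rules are applied.
def f_G(a, b): return min(a, b)

def evaluate(order):
    lvl = {'r': 0.8, 'p': 0.0, 'q': 0.0}       # rule 1. is the fact (r(a), 0.8)
    for rule in order:
        if rule == 2:                          # p(x) <- r(x), not q(x); 0.6; I_G
            body = min(lvl['r'], 1 - lvl['q'])
            lvl['p'] = max(lvl['p'], f_G(body, 0.6))
        elif rule == 3:                        # q(x) <- r(x); 0.5; I_G
            lvl['q'] = max(lvl['q'], f_G(lvl['r'], 0.5))
        elif rule == 4:                        # p(x) <- q(x); 0.8; I_G
            lvl['p'] = max(lvl['p'], f_G(lvl['q'], 0.8))
    return lvl
```

Evaluating in the order 1., 2., 3., 4. gives *p(a)* the level 0.6, while the order 1., 3., 2., 4. gives 0.5, exactly as above.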

### **2.1.2 Stratified fuzzy datalog**

To stratify a program, it is necessary to define the concept of dependency graph. This is a directed graph, whose nodes are the predicates of *P*. There is an arc from predicate *p* to predicate *q* if there is a rule whose body contains *p* or ¬*p* and whose head predicate is *q*. A program is recursive, if its dependency graph has one or more cycles. A program is stratified if whenever there is a rule with head predicate *p* and a negated body literal ¬q, there is no path in the dependency graph from *p* to *q*.

The stratification of a program *P* is a partition of the predicate symbols of *P* into subsets *P1,..., Pn* such that the following conditions are satisfied:

a/ if *p* ∈ *Pi* and *q* ∈ *Pj* and there is an edge from *q* to *p* then *i* ≥ *j*;

b/ if *p* ∈ *Pi* and *q* ∈ *Pj* and there is a rule with the head *p* whose body contains ¬*q*, then *i* > *j*.

Stratification specifies an order of evaluation. The rules whose head-predicates are in P*1* are evaluated first, then those whose head-predicates are in P*2* and so on. The sets *P1,..., Pn* are


called the strata of the stratification. A program *P* is called stratified if and only if it admits stratification. There is a very simple method for finding stratification for a stratified program in (Ceri et al 1990, Ullman 1988). Because this algorithm groups the predicates of the program, this is suitable for the fDATALOG programs as well.

The first of the following Datalog programs is not stratified, while the other one has several distinct stratifications.

**Example 4**. Consider the one-rule program:

$$p(\mathbf{x}) \leftarrow \neg p(\mathbf{x}).$$

This is not stratified.

The next program has more than one stratification (Abiteboul et al 1995):

1. *s(x)* ← *r1(x),* ¬*r(x).*
2. *t(x)* ← *r2(x), r(x).*
3. *u(x)* ← *r3(x), t(x).*
4. *v(x)* ← *r4(x), s(x), u(x).*

The program has five distinct stratifications, namely:

```
 {1.}, {2.}, {3.}, {4.} 
 {2.}, {1.}, {3.}, {4.} 
 {2.}, {3.}, {1.}, {4.} 
 {1., 2.}, {3.}, {4.} 
 {2.}, {1., 3.}, {4.}
```
These lead to five different ways of reading the program. As will be seen later, each of them yields the same semantics.
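The simple stratification-finding method mentioned earlier can be sketched as an iterative algorithm (my own illustrative representation, not the chapter's: each rule is a triple of the head predicate, the set of positive body predicates, and the set of negated body predicates):

```python
# Sketch of a simple stratification algorithm in the spirit of
# (Ceri et al 1990, Ullman 1988): raise strata until conditions a/ and b/
# hold, or give up if no stable assignment exists.
def stratify(rules):
    preds = {r[0] for r in rules} | {q for r in rules for q in r[1] | r[2]}
    stratum = {p: 1 for p in preds}
    for _ in range(len(preds) + 1):
        changed = False
        for head, pos, neg in rules:
            for q in pos:                       # condition a/: i >= j
                if stratum[head] < stratum[q]:
                    stratum[head] = stratum[q]
                    changed = True
            for q in neg:                       # condition b/: i > j
                if stratum[head] < stratum[q] + 1:
                    stratum[head] = stratum[q] + 1
                    changed = True
        if not changed:
            return stratum                      # stable: a valid stratification
    return None                                 # not stratified

# the second program of Example 4:
rules = [('s', {'r1'}, {'r'}),
         ('t', {'r2', 'r'}, set()),
         ('u', {'r3', 't'}, set()),
         ('v', {'r4', 's', 'u'}, set())]
```

Applied to this program it returns a valid stratum assignment; applied to the one-rule program *p(x)* ← ¬*p(x)* it reports failure, since no stable assignment exists.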

Let *P* be a stratified fDATALOG program with stratification *P1,..., Pn*. Let *Pi\** denote the set of all rules of *P* corresponding to stratum *Pi*, that is, the set of all rules whose head-predicate is in *Pi*. Let

$$L_1 = lfp(NT_{P_1^*}),$$

where the starting point of the computation is *T0* defined earlier.

$$L_2 = lfp(NT_{P_2^*}),$$

where the starting point of the computing is *L1*.

$$L\_n = l! p \text{ (NT } {}\_{P\_n}\text{)}\_{\prime}$$

where the starting point is *Ln-1*.

In other words: the least fixed point - *L1* - corresponding to the first stratum of *P* is computed at first. Once this fixed point has been computed, we can take a step to the next strata.


By induction it will be shown that *Ln* is a minimal model of *P*. For this purpose, we need the next definition and lemma.

**Definition 5**. An fDATALOG program *P* is semi-positive if its negated predicates are solely facts.

**Lemma 1.** A semi-positive program *P* has a minimal model: *L = lfp(NTP).* 

**Proof.** A semi-positive program is almost the same as a positive one: if *p* is a negated predicate of a rule-body, it can be replaced by the fact *q* = ¬*p*. Since *p* is a fact predicate, the uncertainty level of *q* can easily be calculated. So the negation can be eliminated from the program, and the resulting program has a least fixed point, which is the least model.

According to the lemma, *L1* is the least fixed point for *P1\**. Generally, *Li-1* ∪ *Pi\** is semi-positive, because according to the stratification each negative literal of the *i*-th stratum belongs to a predicate of a lower-level stratum. So *Li* is the least fixed point for *Pi\**, which is a minimal model for the given stratification. Therefore the next theorem is true:

**Theorem 2**. If *P* is a stratified fDATALOG program then *Ln* is a minimal model of *P*.

This means that evaluating the rules in the order of stratification, the least fixed point of the program's nondeterministic transformation is the minimal model of the program as well. So:

**Proposition 3.** For stratified fDATALOG program *P*, there is an evaluation sequence, in which *lfp(NTP)* is a minimal model of *P*.

As shown in Example 4, a program can have more than one stratification. Will the different stratifications yield the same semantics? Fortunately, the answer is yes. (Ceri et al 1990) states, and (Abiteboul et al 1995) proves, the theorem according to which for stratified Datalog programs the resulting minimal model is independent of the actual stratification. That is, two stratifications of a Datalog program yield the same semantics on all inputs. As the order of stratification depends only on the predicates of the program and is not influenced by the uncertainty levels, this theorem holds in the case of fDATALOG programs as well.

**Theorem 3.** Let *P* be a stratifiable fDATALOG program. The least fixed point according to an arbitrary order of stratification is a unique minimal model of the program.

**Example 5.** In Example 3, the correct stratified order is 1., 3., 2., 4.; so the least fixed point of the program is: *lfp(NTP)* = {*(r(a), 0.8), (p(a), 0.5), (q(a), 0.5)*}*.*

### **2.2 Multivalued datalog**

In fuzzy theory, uncertainty is measured by a single value between zero and one, and negation can be calculated as its complement to 1. However, human beings sometimes hesitate to express these values; that is, there may be some degree of hesitation. This reflects a well-known psychological fact: linguistic negation does not always correspond to the logical one. Based on this observation, the concept of intuitionistic fuzzy sets was introduced and developed, as a generalization of fuzzy sets, by Atanassov in 1983 and later (Atanassov 1983, 1999, Atanassov & Gargov 1989). In the next paragraphs some possible multivalued extensions of Datalog will be discussed.


### **2.2.1 Intuitionistic- and interval-valued extensions of datalog**

In intuitionistic fuzzy systems (IFS) and interval-valued systems (IVS) the uncertainty is represented by two values, μ = *(*μ*1,* μ*2)* instead of a single one. In the intuitionistic case the two elements must satisfy the condition μ*1+*μ*<sup>2</sup>* ≤ *1*, while in the interval-valued case the condition is μ*<sup>1</sup>* ≤ μ*2*. If μ = *(*μ*1,* μ*2)* belonging to a predicate *p* is an IFS level, then *p* is definitely true on level μ*1* and definitely false on level μ*<sup>2</sup>*, while in IVS the truth value is between μ*1* and μ*2*. It is obvious that the relation μ*'1* = μ*1*, μ*'2 = 1* − μ*<sup>2</sup>* creates a mutual connection between the two systems. (The equivalence of IVS and IFS was stated first in (Atanassov & Gargov 1989).)

The fixed-point theory of programming is based on the theory of lattices, and so is the theory of fuzzy Datalog, which is based on the lattice of fuzzy sets. Extending the programs in an intuitionistic and interval-valued direction requires extending the lattices as well.

**Definition 6.** *LF* and *LV* are lattices of IFS and IVS respectively, where

$$L_F = \{(x_1, x_2) \in [0,1]^2 \mid x_1 + x_2 \le 1\}, \quad (x_1, x_2) \le_F (y_1, y_2) \Leftrightarrow x_1 \le y_1 \text{ and } x_2 \ge y_2,$$

$$L_V = \{(x_1, x_2) \in [0,1]^2 \mid x_1 \le x_2\}, \quad (x_1, x_2) \le_V (y_1, y_2) \Leftrightarrow x_1 \le y_1 \text{ and } x_2 \le y_2.$$

It can be proved that both *LF* and *LV* are complete lattices (Cornelis et al 2004), so it can be the base of intuitionistic Datalog (ifDATALOG) and interval-valued Datalog (ivDATALOG) as well. (If the distinction is not important, both of them will be denoted by iDATALOG.)
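For concreteness, the two orderings of Definition 6 can be written out as a small Python sketch (helper names are mine), together with the IFS-to-IVS correspondence μ*'1* = μ*1*, μ*'2* = 1 − μ*2* mentioned above:

```python
# Sketch of the lattice orderings of Definition 6.
def in_LF(x):       # intuitionistic pairs: x1 + x2 <= 1
    return 0 <= x[0] and 0 <= x[1] and x[0] + x[1] <= 1

def in_LV(x):       # interval pairs: x1 <= x2
    return 0 <= x[0] <= x[1] <= 1

def leq_F(x, y):    # (x1,x2) <=_F (y1,y2)  iff  x1 <= y1 and x2 >= y2
    return x[0] <= y[0] and x[1] >= y[1]

def leq_V(x, y):    # (x1,x2) <=_V (y1,y2)  iff  x1 <= y1 and x2 <= y2
    return x[0] <= y[0] and x[1] <= y[1]

def f_to_v(x):      # the IFS -> IVS map: (m1, m2) |-> (m1, 1 - m2)
    return (x[0], 1 - x[1])
```

The map `f_to_v` carries *LF* onto *LV* and turns ≤*F* into ≤*V*, which is the mutual connection between the two systems noted above.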

The so-called i-extended DATALOG is defined on these lattices, and the necessary concepts are generalizations of the ones presented in Definition 1 and Definition 2. Let us continue to denote by *BP* the Herbrand base of the program *P*, and let *FV(BP)* be the set of all IFS or IVS sets over *BP*.

**Definition 7.** The i-extended Datalog program (iDATALOG) is a finite set of safe iDATALOG rules (*r;* β*; IFV*). The nondeterministic consequence transformation *iNTP* is defined as *NTP* in (1), except that the level of a rule's head is:


$$\alpha_A = \max\nolimits_{FV}(0_{FV},\ \min\nolimits_{FV}\{\gamma \mid I_{FV}(\alpha_{body}, \gamma) \geq_{FV} \beta\});$$


The uncertainty-level function is:

$$f(I_{FV}, \alpha, \beta) = \min\nolimits_{FV}(\{\gamma \mid I_{FV}(\alpha, \gamma) \geq_{FV} \beta\}),$$

where α, β, γ are elements of *LF*, *LV* respectively, *IFV = IF* or *IV* is an implication of *LF* or *LV*; *maxFV* = *maxF* or *maxV*; *minFV* = *minF* or *minV* are the max or min operator of *LF* or *LV*; *0FV* is *0F = (0,1)* or *0V = (0,0)* and ≥*FV* is ≥*F* or ≥*V*.

As *iNTP* is an inflationary transformation over the complete lattice *LF* or *LV*, according to (Ceri et al 1990) it has an inflationary fixed point, denoted by *lfp(iNTP)*. If *P* is positive (without negation), *iNTP* is a monotone transformation, so *lfp(iNTP)* is the least fixed point.


The next question is whether this fixed point is a model of *P*. The fixed point is an interpretation of *P*, which is a model, if for each

$$(A \leftarrow A_1, \dots, A_n;\ \beta;\ I_{FV}) \in ground(P): \quad I_{FV}(\alpha, \gamma) \geq_{FV} \beta.$$

Similarly to the proof of Theorem 1, it can easily be proved that this fixed point is a model of the program. For negation-free iDATALOG this is the least model of the program (Achs 2010).

In fDATALOG a fact can be negated by completing its membership degree to 1. In iDATALOG the uncertainty level of a negated fact can be computed according to negators. A negator on *LF* or *LV* is a decreasing mapping ordering *0FV* and *1FV* together (Cornelis et al 2004). The applied negators are relevant for the computational meaning of a program, but they have no influence on the stratification. So for a stratified iDATALOG program *P* there is an evaluation sequence in which *lfp(iNTP)* is a unique minimal model of *P*. Therefore *lfp(iNTP)* can be regarded as the semantics of iDATALOG.

After defining the syntax and semantics of extended fuzzy Datalog, it is necessary to examine the properties of possible implication operators and the extended uncertainty-level functions. A number of intuitionistic implications are discussed in (Cornelis et al 2004, Atanassov 2005, 2006) and other papers; four of them are extensions of the above three fuzzy implication operators. These operators will now be presented, completed by the suitable interval-valued operators and the uncertainty-level functions. The computations will not be shown here; only the starting points and results are presented.

The coordinates of the intuitionistic and interval-valued implication operators can be determined from each other, and the uncertainty-level functions can be computed according to the applied implication. The connection between *IF* and *IV* and the extended versions of the uncertainty-level functions are given below. For α*' = (*α*1, 1−*α*2)* and γ*' = (*γ*1, 1−*γ*2)*:

$$I_V(\alpha, \gamma) = (I_{V1}, I_{V2}); \qquad I_{V1} = I_{F1}(\alpha', \gamma'); \qquad I_{V2} = 1 - I_{F2}(\alpha', \gamma');$$

$$f(I_F, \alpha, \beta) = (\min(\{\gamma_1 \mid I_{F1}(\alpha, \gamma) \geq \beta_1\}),\ \max(\{\gamma_2 \mid I_{F2}(\alpha, \gamma) \leq \beta_2\}))$$

$$f(I_V, \alpha, \beta) = (\min(\{\gamma_1 \mid I_{V1}(\alpha, \gamma) \geq \beta_1\}),\ \min(\{\gamma_2 \mid I_{V2}(\alpha, \gamma) \geq \beta_2\}))$$

The studied operators and the related uncertainty-level functions are the following:

### **2.2.1.1 Extension of Kleene-Dienes implication**

One possible extension of Kleene-Dienes implication for IFS is:

$$I_{FK}(\alpha, \gamma) = (\max(\alpha_2,\ \gamma_1),\ \min(\alpha_1,\ \gamma_2))$$

The appropriate computed elements are the following:

$$I_{VK}(\alpha, \gamma) = (\max(1-\alpha_2,\ \gamma_1),\ \max(1-\alpha_1,\ \gamma_2))$$
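As a quick sanity check, the stated *IF* → *IV* connection can be verified on the Kleene-Dienes extension with a short Python sketch (function names are mine):

```python
# Sketch checking IV1 = IF1(a', g') and IV2 = 1 - IF2(a', g')
# on the Kleene-Dienes extension, with a' = (a1, 1 - a2), g' = (g1, 1 - g2).
def I_FK(a, g):
    return (max(a[1], g[0]), min(a[0], g[1]))

def I_VK(a, g):
    return (max(1 - a[1], g[0]), max(1 - a[0], g[1]))

def iv_from_if(I_F, a, g):
    ap, gp = (a[0], 1 - a[1]), (g[0], 1 - g[1])
    r = I_F(ap, gp)
    return (r[0], 1 - r[1])
```

For any pair of levels, `iv_from_if(I_FK, a, g)` agrees with the direct formula `I_VK(a, g)`.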

From Fuzzy Datalog to Multivalued Knowledge-Base 37


$$f\_1(\underline{I}\_{FK}, \underline{\alpha}, \underline{\beta}) = \begin{cases} 0 & \alpha\_2 \ge \beta\_1 \\ \beta\_1 & \text{otherwise} \end{cases} \qquad f\_2(\underline{I}\_{FK}, \underline{\alpha}, \underline{\beta}) = \begin{cases} 1 & \alpha\_1 \le \beta\_2 \\ \beta\_2 & \text{otherwise} \end{cases}$$

$$f\_1(\underline{I}\_{VK}, \underline{\alpha}, \underline{\beta}) = \begin{cases} 0 & 1-\alpha\_2 \ge \beta\_1 \\ \beta\_1 & \text{otherwise} \end{cases} \qquad f\_2(\underline{I}\_{VK}, \underline{\alpha}, \underline{\beta}) = \begin{cases} 0 & 1-\alpha\_1 \ge \beta\_2 \\ \beta\_2 & \text{otherwise} \end{cases}$$
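The computed elements above amount to a small case analysis on the head's level. The following sketch transcribes them directly for the Kleene-Dienes extension; the body and rule levels in the example are invented for illustration:

```python
# Sketch: the Kleene-Dienes computed elements as a decision procedure.
# An IFS level is a pair (membership, non-membership).

def f1_fk(alpha, beta):
    # f1(IFK, alpha, beta): 0 if alpha2 >= beta1, else beta1
    return 0.0 if alpha[1] >= beta[0] else beta[0]

def f2_fk(alpha, beta):
    # f2(IFK, alpha, beta): 1 if alpha1 <= beta2, else beta2
    return 1.0 if alpha[0] <= beta[1] else beta[1]

# invented example: body level (0.6, 0.3), rule level (0.8, 0.1)
alpha, beta = (0.6, 0.3), (0.8, 0.1)
head_level = (f1_fk(alpha, beta), f2_fk(alpha, beta))
```

Here `head_level` is `(0.8, 0.1)`: the body's non-membership 0.3 does not reach the rule's membership threshold 0.8, so the head inherits the rule levels.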

### **2.2.1.2 Extension of Lukasiewicz implication**

One possible extension of Lukasiewicz implication for IFS is:

$$\underline{I}\_{FL}(\underline{\alpha}, \underline{\gamma}) = (\min(1, \alpha\_2 + \gamma\_1), \max(0, \alpha\_1 + \gamma\_2 - 1))$$

The appropriate computed elements are as follows:

$$\begin{array}{rcl} \underline{I}\_{VL}(\underline{\alpha}, \underline{\gamma}) &=& (\min(1, 1 - \alpha\_2 + \gamma\_1), \min(1, 1 - \alpha\_1 + \gamma\_2)) \\ f\_1(\underline{I}\_{FL}, \underline{\alpha}, \underline{\beta}) &=& \min(1 - \alpha\_2, \max(0, \beta\_1 - \alpha\_2)) \\ f\_2(\underline{I}\_{FL}, \underline{\alpha}, \underline{\beta}) &=& \max(1 - \alpha\_1, \min(1, 1 - \alpha\_1 + \beta\_2)) \\ f\_1(\underline{I}\_{VL}, \underline{\alpha}, \underline{\beta}) &=& \max(0, \alpha\_2 + \beta\_1 - 1) \\ f\_2(\underline{I}\_{VL}, \underline{\alpha}, \underline{\beta}) &=& \max(0, \alpha\_1 + \beta\_2 - 1) \end{array}$$

### **2.2.1.3 Extensions of Gödel implication**

There are several alternative extensions of Gödel implication, two of them are presented here:

$$\underline{I}\_{FG1}(\underline{\alpha}, \underline{\gamma}) = \begin{cases} (1,0) & \alpha\_1 \le \gamma\_1 \\ (\gamma\_1, 0) & \alpha\_1 > \gamma\_1, \alpha\_2 \ge \gamma\_2 \\ (\gamma\_1, \gamma\_2) & \alpha\_1 > \gamma\_1, \alpha\_2 < \gamma\_2 \end{cases} \qquad \underline{I}\_{FG2}(\underline{\alpha}, \underline{\gamma}) = \begin{cases} (1,0) & \alpha\_1 \le \gamma\_1, \alpha\_2 \ge \gamma\_2 \\ (\gamma\_1, \gamma\_2) & \text{otherwise} \end{cases}$$

The appropriate computed elements are:

$$\begin{aligned} \underline{I}\_{VG1}(\underline{\alpha},\underline{\gamma}) &= \begin{cases} (1,1) & \alpha\_1 \le \gamma\_1 \\ (\gamma\_1,1) & \alpha\_1 > \gamma\_1, \alpha\_2 \ge \gamma\_2 \\ (\gamma\_1,\gamma\_2) & \alpha\_1 > \gamma\_1, \alpha\_2 < \gamma\_2 \end{cases} & \underline{I}\_{VG2}(\underline{\alpha},\underline{\gamma}) &= \begin{cases} (1,1) & \alpha\_1 \le \gamma\_1, \alpha\_2 \le \gamma\_2 \\ (\gamma\_1,\gamma\_2) & \text{otherwise} \end{cases} \\ f\_1(\underline{I}\_{FG1},\underline{\alpha},\underline{\beta}) &= \min(\alpha\_1,\beta\_1) & f\_2(\underline{I}\_{FG1},\underline{\alpha},\underline{\beta}) &= \begin{cases} 1 & \alpha\_1 \le \beta\_2 \\ \max(\alpha\_2,\beta\_2) & \text{otherwise} \end{cases} \\ f\_1(\underline{I}\_{VG1},\underline{\alpha},\underline{\beta}) &= \min(\alpha\_1,\beta\_1) & f\_2(\underline{I}\_{VG1},\underline{\alpha},\underline{\beta}) &= \begin{cases} 0 & \alpha\_1 \le \beta\_2 \\ \min(\alpha\_2,\beta\_2) & \text{otherwise} \end{cases} \\ f\_1(\underline{I}\_{FG2},\underline{\alpha},\underline{\beta}) &= \min(\alpha\_1,\beta\_1) & f\_2(\underline{I}\_{FG2},\underline{\alpha},\underline{\beta}) &= \max(\alpha\_2,\beta\_2) \\ f\_1(\underline{I}\_{VG2},\underline{\alpha},\underline{\beta}) &= \min(\alpha\_1,\beta\_1) & f\_2(\underline{I}\_{VG2},\underline{\alpha},\underline{\beta}) &= \min(\alpha\_2,\beta\_2) \end{aligned}$$

An important question is whether the resulting degrees satisfy the conditions referring to IFS and IVS respectively. Unfortunately, for implications other than G2, the answer is "no", or rather "not in all cases". For example, in the case of the Kleene-Dienes and the Lukasiewicz intuitionistic operators the levels of a rule-head satisfy the condition of intuitionism only if the sum of the levels of the rule-body is at least as large as the sum of the levels of the rule. That is, the solution is inside the scope of IFS if the level of the rule-body is less "intuitionistic" than the level of the rule. In the case of the first Gödel operator, the solution is inside the scope of IFS only if the level of the rule-body is more certain than the level of the rule (Achs 2010). However, for the second Gödel operator the next proposition can easily be proven:

**Proposition 4**. For *α = (α1, α2), β = (β1, β2)*

$$\begin{array}{l} \text{if } \alpha\_1 + \alpha\_2 \le 1,\ \beta\_1 + \beta\_2 \le 1 \text{ then } f\_1(\underline{I}\_{FG2}, \underline{\alpha}, \underline{\beta}) + f\_2(\underline{I}\_{FG2}, \underline{\alpha}, \underline{\beta}) \le 1 \\ \text{if } \alpha\_1 \le \alpha\_2,\ \beta\_1 \le \beta\_2 \text{ then } f\_1(\underline{I}\_{VG2}, \underline{\alpha}, \underline{\beta}) \le f\_2(\underline{I}\_{VG2}, \underline{\alpha}, \underline{\beta}) \end{array}$$
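Proposition 4 can be spot-checked numerically from the computed elements of the G2 operators, f1 = min(α1, β1), f2(IFG2) = max(α2, β2) and f2(IVG2) = min(α2, β2). The sketch below samples random admissible levels; it is a sanity check, not a proof:

```python
# Random spot-check of Proposition 4 for the G2 operators (a sketch, not a proof).
import random

def check_prop4(trials=10000, seed=0):
    rnd = random.Random(seed)
    for _ in range(trials):
        # IFS case: alpha1 + alpha2 <= 1 and beta1 + beta2 <= 1
        a1 = rnd.random(); a2 = rnd.uniform(0.0, 1.0 - a1)
        b1 = rnd.random(); b2 = rnd.uniform(0.0, 1.0 - b1)
        # f1(IFG2) + f2(IFG2) <= 1 (tiny tolerance for rounding)
        if min(a1, b1) + max(a2, b2) > 1.0 + 1e-12:
            return False
        # IVS case: alpha1 <= alpha2 and beta1 <= beta2
        a1 = rnd.random(); a2 = rnd.uniform(a1, 1.0)
        b1 = rnd.random(); b2 = rnd.uniform(b1, 1.0)
        # f1(IVG2) <= f2(IVG2)
        if min(a1, b1) > min(a2, b2):
            return False
    return True
```

The IFS inequality also follows directly: max(α2, β2) ≤ max(1−α1, 1−β1) = 1−min(α1, β1).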

Similarly to Proposition 2, it can be seen that the fixed-point algorithm terminates if *f(IFV,* α*,* β*)* ≤*FV* α for each α ∈ *LFV* (Achs 2010). G2 satisfies this condition, so:

**Proposition 5**. In the case of G2 operator the fixed-point algorithm terminates.

### **2.2.2 Bipolar extension of Datalog**

The intuitive background of intuitionistic levels is some psychological perception. Experiments have shown that when making decisions people deal with positive and negative facts in different ways (Dubois et al 2000, 2005). Continuing this idea, it can be stated that there would be differences not only in the scaling of truth values, but in the way of concluding as well. This means that in a way similar to the facts, positive and negative inferences can be separated. The idea of bipolar Datalog is based on the above observation: two kinds of ordinary fuzzy implications are used for positive and negative deduction, namely, a pair of consequence transformations is defined instead of a single one. Since in the original transformations lower bounds are used with degrees of uncertainty, therefore starting from IFS or IVS facts, the resulting degrees will be lower bounds of membership and non-membership respectively, instead of the upper bound for non-membership. However, if each non-membership value μ is transformed into membership value μ' = 1 − μ, then both members of head-level can be deduced similarly. So the appropriate concepts are as follows.

**Definition 8.** The bipolar Datalog program (bDATALOG) defined on *LF* or *LV* is a finite set of safe bDATALOG rules *r; (*β*1,* β*2); (I1,I2).*

The elements of the bipolar nondeterministic consequence transformation *bNTP = (NTP1, NTP2)* are similar to *NTP* in (1) except in *NTP2* the level of rule's head is:

> α*'2 = max FV2 (0, min FV2 {*γ*'2 | I2(*α*'body2,* γ*'2)* ≥ β*'2});*

where α'body2=min FV2 (α'A12,…,α'An2).

The uncertainty-level function is: *f = (f1, f2)* where

*f1 = min FV2 ({*γ*1 | I1(*α*1,* γ*1)* ≥ β*1});* 


$$f\_2 = 1 - \min\_{FV2} \left( \{ 1 - \underline{\gamma}\_2 \mid I\_2(1 - \alpha\_2, 1 - \underline{\gamma}\_2) \ge 1 - \underline{\beta}\_2 \} \right).$$

It is evident that applying the transformation μ'1 = μ1, μ'2 = 1 - μ2, for all IFS levels of the program, the above definition can be applied to IVS degrees as well. As a simple computation can show, contrary to the results of iDATALOG, the resulting degrees for most variants of bipolar Datalog satisfy the conditions referring to IFS:

**Proposition 5.** For α = (α1, α2), β = (β1, β2) and for implication-pairs I = (IG, IG); I = (IL, IL); I = (IL, IG); I = (IK, IK); I = (IL, IK);

$$\text{if } \alpha\_1 + \alpha\_2 \le 1, \beta\_1 + \beta\_2 \le 1 \text{ then } f\_1(I\_1, \underline{\alpha}, \underline{\beta}) + f\_2(I\_2, \underline{\alpha}, \underline{\beta}) \le 1.$$

The following proposition follows from the construction of the bipolar consequence transformations:

**Proposition 6.** The nondeterministic bipolar consequence transformation has a least fixed point, which is a model of program *P* in the following sense: for each *A*←*A1,...,An;* β*; I* ∈ *ground(P)* 

$$I(\alpha\_{\text{body1}}, \alpha\_1) \ge \beta\_1; \quad I(\alpha'\_{\text{body2}}, \alpha'\_2) \ge \beta'\_2$$

As the termination of the consequence transformations based on these three implication operators was proven in the case of fDATALOG (Achs 2006) and since this property does not change in bipolar case, the bipolar consequence transformations terminate as well.

The bipolar extension of Datalog has no influence on the stratification, so the propositions detailed in the case of stratified fDATALOG programs are true in the case of bipolar fuzzy Datalog programs as well, that is, for a stratified bDATALOG program *P*, there is an evaluation sequence, in which *lfp(bNTP )* is a unique minimal model of *P*.

**Example 6.** Consider the next IFS program:

$$\begin{array}{l} (p(a), (0.7, 0.2)). \\ (q(b), (0.65, 0.3)). \\ (r(a, b), (0.7, 0.2)). \\ r(x, y) \leftarrow p(x), q(y); (0.75, 0.2); I. \end{array}$$

Let I = IFG2, then according to the rule r(a,b) is inferred and uncertainty can be computed as follows: αbody = minF((0.7, 0.2), (0.65, 0.3)) = (0.65, 0.3), f1(IFG2, α, β) = min(α1, β1) = min(0.65, 0.75) = 0.65, f2(IFG2, α, β) = max(α2, β2) = 0.3, that is, the level of the rule's head is (0.65, 0.3). There is a fact for r(a,b) as well, so the resulting level is the union of the level of the rule's head and the level of the fact: maxF ((0.65, 0.3), (0.7, 0.2)) = (0.7, 0.2). So the fixed point of the program is:

$$\{(p(a), (0.7, 0.2)), (q(b), (0.65, 0.3)), (r(a, b), (0.7, 0.2))\}.$$

Let us see the bipolar evaluation of the program. Let *I = (IL, IG)*. That is, let the first element of the uncertainty level be computed according to the Lukasiewicz operator and the second one according to the Gödel operator. The Lukasiewicz operator defines the uncertainty level function *f(IL,* α*,* β*) = max(0,* α *+* β − *1)*. Then α*body1 = min(0.7, 0.65) = 0.65*, α*'body2 = min(1*−*0.3, 1*−*0.2) = 0.7*; *f1(IL,* α*1, β1) = max(0,* α*1 + β1* − *1) = 0.65 + 0.75* − *1 = 0.4*, *f2(IG,* α*'2, β'2) = 1* − *min(*α*'2, 1*−*β2) = 1* − *min(0.7, 0.8) = 0.3*. So the uncertainty level of the rule's head is (0.4, 0.3). Considering the other level of *r(a,b)*, its resulting level is *(max(0.4, 0.7), 1*−*max(1*−*0.3, 1*−*0.2)) = (0.7, 0.2)*, so the fixed point is:

*{(p(a), (0.7, 0.2)), (q(b), (0.65, 0.3)), (r(a,b), (0.7, 0.2)) }.* 
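The bipolar computation above can be reproduced step by step. This is a sketch covering only the implication pair *I = (IL, IG)* used in the example; the IFS levels are (membership, non-membership) pairs:

```python
# Sketch: bipolar head level for Example 6 with I = (IL, IG).

def bipolar_head_level(body_levels, rule_level):
    a1 = min(m for m, n in body_levels)          # alpha_body1
    a2p = min(1.0 - n for m, n in body_levels)   # alpha'_body2 (transformed)
    b1, b2 = rule_level
    f1 = max(0.0, a1 + b1 - 1.0)                 # Lukasiewicz: f1(IL, ., .)
    f2 = 1.0 - min(a2p, 1.0 - b2)                # Goedel: f2(IG, ., .)
    return (f1, f2)

def ifs_union(x, y):
    # union of IFS levels: max of memberships, min of non-memberships
    return (max(x[0], y[0]), min(x[1], y[1]))

# r(a,b) derived from p(a) and q(b) via the rule level (0.75, 0.2),
# then joined with the stored fact (r(a,b), (0.7, 0.2)):
head = bipolar_head_level([(0.7, 0.2), (0.65, 0.3)], (0.75, 0.2))
final = ifs_union(head, (0.7, 0.2))
```

`head` evaluates to (0.4, 0.3) and `final` to (0.7, 0.2), matching the text.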

**Example 7.** Consider the next (recursive) IVS program:

*(p(a, b), (0.7, 0.8)). (p(a, c), (0.8, 0.9)). (p(b, d), (0.75, 0.8)). (p(d, e), (0.9, 0.95)).*
*q(x, y)* ← *p(x, y); (0.85, 0.95); I1.*
*q(x, y)* ← *p(x, z), q(z, y); (0.8, 0.9); I2.*

Let *I1 = IVG2*, *I2 = IVL* , that is the appropriate uncertainty level functions are:

*f1(IVG2,* α*, β) = min(*α*1, β1)*  *f2(IVG2,* α*, β) = min(*α*2, β2)*
*f1(IVL,* α*, β) = max(0,* α*2 + β1* − *1)*  *f2(IVL,* α*, β) = max(0,* α*1 + β2* − *1)*

Before showing the fixed point algorithm, two computations are set out. According to the first rule for *q(x, y)* the uncertainty level of *q(a, b)* is: *f(IVG2,* α*, β) = (min(*α*1, β1), min(*α*2, β2)) = (min(0.7, 0.85), min(0.8, 0.95)) = (0.7, 0.8).* 

According to the second rule, the uncertainty of *q(a, d)* can be computed in this way:

The body of the rule is: *p(a, b), q(b, d)*, so α*body = min((0.7, 0.8), (0.75, 0.8)) = (0.7, 0.8)*; *f(IVL,* α*, β) = (max(0,* α*2 + β1* − *1), max(0,* α*1 + β2* − *1)) = (max(0, 0.8 + 0.8* − *1), max(0, 0.7 + 0.9* − *1)) = (0.6, 0.6).*

The steps of fixed point algorithm are:

X*0* = {*(p(a, b), (0.7, 0.8)), (p(a, c), (0.8, 0.9)), (p(b, d), (0.75, 0.8)), (p(d, e), (0.9, 0.95))*}
X*1* = X*0* ∪ {*(q(a, b), (0.7, 0.8)), (q(a, c), (0.8, 0.9)), (q(b, d), (0.75, 0.8)), (q(d, e), (0.85, 0.95)), (q(a, d), (0.6, 0.6)), (q(b, e), (0.6, 0.65))*}
X*2* = X*1* ∪ {*(q(a, e), (0.45, 0.5))*}

X*2* is fixed point, so it is the result of the program.
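The whole iteration can be sketched as a naive bottom-up evaluation. This is an illustrative implementation, assuming (as in the computations above) that rule bodies are combined by componentwise min and repeated derivations of the same atom by componentwise max:

```python
# Sketch: naive bottom-up fixed-point evaluation of the recursive IVS program
# of Example 7. Interval levels are pairs (lower bound, upper bound).

def minv(u, v):
    return (min(u[0], v[0]), min(u[1], v[1]))

def maxv(u, v):
    return (max(u[0], v[0]), max(u[1], v[1]))

def f_ivg2(a, b):   # f(IVG2, a, b) = (min(a1, b1), min(a2, b2))
    return (min(a[0], b[0]), min(a[1], b[1]))

def f_ivl(a, b):    # f(IVL, a, b) = (max(0, a2 + b1 - 1), max(0, a1 + b2 - 1))
    return (max(0.0, a[1] + b[0] - 1.0), max(0.0, a[0] + b[1] - 1.0))

p = {('a', 'b'): (0.7, 0.8), ('a', 'c'): (0.8, 0.9),
     ('b', 'd'): (0.75, 0.8), ('d', 'e'): (0.9, 0.95)}
q = {}
changed = True
while changed:
    changed = False
    derived = {}
    for (x, y), a in p.items():              # q(x,y) <- p(x,y); (0.85, 0.95); IVG2
        derived[(x, y)] = f_ivg2(a, (0.85, 0.95))
    for (x, z), ap in p.items():             # q(x,y) <- p(x,z), q(z,y); (0.8, 0.9); IVL
        for (z2, y), aq in q.items():
            if z == z2:
                lvl = f_ivl(minv(ap, aq), (0.8, 0.9))
                prev = derived.get((x, y))
                derived[(x, y)] = lvl if prev is None else maxv(prev, lvl)
    for key, lvl in derived.items():
        new = lvl if key not in q else maxv(q[key], lvl)
        if q.get(key) != new:
            q[key] = new
            changed = True
```

After three iterations `q` stabilizes at the seven atoms of X2, including q(a, e) with level (0.45, 0.5).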

As fuzzy Datalog is a special case of its multivalued extensions, from now on both fDATALOG and any of the above extensions will be called multivalued Datalog (mDATALOG).

### **3. Multivalued knowledge-base**

The facts of an mDATALOG program can be regarded as any kind of lexical knowledge including uncertainty as well, and from this knowledge other facts can be deduced according to the rules. Therefore a multivalued Datalog program is suitable to be the deduction mechanism of a knowledge base. Sometimes, however, it is not enough for getting an answer to a question. For example, if there are rules describing the options of loving a good composer, and there is a fact declaring that Vivaldi is a good composer, what is the possible answer to the question inquiring about liking Bach? Getting an answer needs the use of synonyms and similarities. For handling this kind of information, our model includes a background knowledge module.

### **3.1 Background knowledge**

Some "synonyms" and "similarities" will be defined between the potential predicates and between the potential constants of a given problem, so it can be examined in a larger context. More precisely, proximity relations will be defined on the sets of the program's predicates and terms. These relations will serve as the basis for the background knowledge.

**Definition 9.** A multivalued proximity on a domain *D* is an IFS or IVS valued relation *RFVD* : *D* × *D* → [*0FV, 1FV*],

> *RFD (x, y) = λF(x, y) = (λ1, λ2), λ1 + λ2 ≤ 1*
> *RVD (x, y) = λV(x, y) = (λ1, λ2), 0 ≤ λ1 ≤ λ2 ≤ 1*,

which satisfies the following properties:

> *RFVD (x, x) = 1FV* ∀*x* ∈ *D* (reflexivity)
> *RFVD (x, y) = RFVD (y, x)* ∀*x, y* ∈ *D* (symmetry).
A proximity is a similarity if it is transitive, that is

$$\underline{R}\_{FV\_D}(x, z) \ge\_{FV} \min{}\_{FV} \left( \underline{R}\_{FV\_D}(x, y), \underline{R}\_{FV\_D}(y, z) \right) \quad \forall x, y, z \in D.$$

In the case of similarity, equivalence classifications can be defined over *D.* The effect of this classification is perhaps a simpler or more effective algorithm, but in many cases the requirement of similarity is a too strict constraint. Therefore this chapter deals only with the more general proximity.
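For a finite domain, the proximity and similarity conditions can be checked mechanically. A sketch on IVS values, with an invented three-word relation for illustration:

```python
# Sketch: checking reflexivity, symmetry and min-transitivity of a finite
# IVS-valued relation given as a dictionary (x, y) -> (lower, upper).

def is_proximity(R, D, one=(1.0, 1.0)):
    reflexive = all(R[(x, x)] == one for x in D)
    symmetric = all(R[(x, y)] == R[(y, x)] for x in D for y in D)
    return reflexive and symmetric

def is_similarity(R, D):
    if not is_proximity(R, D):
        return False
    # R(x,z) >= min(R(x,y), R(y,z)) componentwise, for all x, y, z
    return all(R[(x, z)][i] >= min(R[(x, y)][i], R[(y, z)][i])
               for x in D for y in D for z in D for i in (0, 1))

# invented relation: every two distinct words are (0.8, 0.9)-close
D = ['love', 'like', 'adore']
R = {(x, y): (1.0, 1.0) if x == y else (0.8, 0.9) for x in D for y in D}
```

This particular relation happens to be transitive as well, so it is a similarity; lowering a single pair's value below the others would break transitivity while keeping it a proximity.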

Background knowledge consists of the "synonyms" of each term and each predicate of the program. The "synonyms" of any element form the proximity set of the element, and all of the proximity sets compose the background knowledge. More precisely:

**Definition 10.** Let *d* ∈ *D* be any element of domain *D*. The proximity set of *d* is an IFS or IVS subset over *D*:

$$\mathcal{R}\_{FV\_d} = \{ (d\_1, \underline{\lambda}\_{FV\_1}), (d\_2, \underline{\lambda}\_{FV\_2}), \dots, (d\_n, \underline{\lambda}\_{FV\_n}) \},$$

where *di* ∈ *D and λFVi = RFVD (d, di)* for *i = 1, … ,n.*

**Definition 11.** Let *G* be any set of ground terms and *S* any set of predicate symbols. Let *RGFVG* and *RSFVS* be any proximity over *G* and *S* respectively. The background knowledge is the set of proximity sets:

$$Bk = \{ \mathcal{RG}\_{FV\_g} \mid g \in G \} \cup \{ \mathcal{RS}\_{FV\_s} \mid s \in S \}$$

### **3.2 Computing uncertainties**

Up to now, the deduction mechanism and the background knowledge of a multivalued knowledge-base have been defined. Now the question remains: how can the two parts be connected to each other? How can we find the "synonyms"? For example, if it is known that Ann loves the music of Bach very much (*(love(Ann, Bach), (0.9, 0.95))*), the concept of love is similar to the concept of like (*RSVS("love", "like") = (0.8, 0.9)*), and the music of Bach is more or less similar to the music of Vivaldi (*RGVG(Bach, Vivaldi) = (0.7, 0.75)*), then how strongly can it be stated that Ann likes Vivaldi, that is, what is the uncertainty of the predicate *like(Ann, Vivaldi)*?

To solve this problem, the concept of a proximity-based uncertainty function will be introduced. According to this function, the uncertainty levels of "synonyms" can be computed from the levels of the original fact and from the proximity values of the actual predicates and their arguments. It is expected that in the case of identity the level must be unchanged, but in other cases it should be equal to or less than the original level and the proximity values. Furthermore, this function is required to increase monotonically. Such a function will be assigned to each atom of a program.

Let *p* be a predicate symbol with *n* arguments, then *p/n* is called the functor of the atom, characterized by this predicate symbol.

**Definition 12.** A proximity-based uncertainty function of *p/n* is:

$$\varphi\_p(\underline{\alpha}, \underline{\lambda}, \underline{\lambda}\_1, \dots, \underline{\lambda}\_n) : (\underline{0}\_{FV}, \underline{1}\_{FV}]^{n+2} \rightarrow [\underline{0}\_{FV}, \underline{1}\_{FV}]$$

where


$$\varphi\_p(\underline{\alpha}, \underline{\lambda}, \underline{\lambda}\_1, \dots, \underline{\lambda}\_n) \le \min{}\_{FV}(\underline{\alpha}, \underline{\lambda}, \underline{\lambda}\_1, \dots, \underline{\lambda}\_n)$$

ϕ*p(*α*, 1FV, 1FV,…, 1FV) =* α

and ϕ*p(*α*, λ, λ1,…, λn)* is monotonically increasing in each argument.

Any triangular norm obeys the above constraints, so triangular norms are appropriate proximity-based uncertainty functions.

**Example 8**. Let *(p(a), (0.7, 0.2))* be an *IFS* fact and *RSFS(p, q) = (0.8, 0.1), RGFG(a, b) = (0.7, 0.3)* and ϕ*p(*α*, λ, λ1) =* minF*(*α*, λ*⋅*λ1).* 

(In IFS the product is defined as: μ ⋅ *λ* = *(*μ*1*⋅*λ1, 1*−*(1*−μ*2)*⋅*(1*−*λ2))*.)

The uncertainty levels of *p(b)*, *q(a)* and *q(b)* are:

*(p(b), min((0.7, 0.2), (1, 0)*⋅*(0.7, 0.3)) = (min(0.7, 0.7), max(0.2, 0.3)) = (0.7, 0.3));* 

*(q(a), min((0.7, 0.2), (0.8, 0.1)*⋅*(1, 0)) = (min(0.7, 0.8), max(0.2, 0.1)) = (0.7, 0.2));* 

*(q(b), min((0.7, 0.2), (0.8, 0.1)*⋅*(0.7, 0.3)) = (min(0.7, 0.56), max(0.2, 0.37)) = (0.56, 0.37)).* 
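The arithmetic of Example 8 can be checked mechanically. The sketch below is only a numerical illustration (the helper names are invented): IFS levels are represented as (membership, non-membership) pairs, with the product and minimum defined as quoted above.

```python
# Numerical check of Example 8: synonym levels of p(b), q(a), q(b)
# from the IFS fact (p(a), (0.7, 0.2)).

def ifs_prod(u, v):
    # IFS product: (u1*v1, 1 - (1-u2)*(1-v2))
    return (u[0] * v[0], 1 - (1 - u[1]) * (1 - v[1]))

def ifs_min(u, v):
    # IFS minimum: smaller membership, larger non-membership
    return (min(u[0], v[0]), max(u[1], v[1]))

def phi(alpha, lam, lam1):
    # phi_p(alpha, lambda, lambda1) = min_F(alpha, lambda * lambda1)
    return ifs_min(alpha, ifs_prod(lam, lam1))

alpha = (0.7, 0.2)     # level of the fact p(a)
rs_pq = (0.8, 0.1)     # proximity of the predicates p and q
rg_ab = (0.7, 0.3)     # proximity of the terms a and b
one = (1.0, 0.0)       # identity proximity

p_b = phi(alpha, one, rg_ab)     # expected (0.7, 0.3)
q_a = phi(alpha, rs_pq, one)     # expected (0.7, 0.2)
q_b = phi(alpha, rs_pq, rg_ab)   # expected (0.56, 0.37)
```

The three computed pairs reproduce the levels given for *p(b)*, *q(a)* and *q(b)* above.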

**Example 9**. Let *(love(Ann, Bach), (0.9, 0.95))* be an *IVS* fact and *RSVS("love", "like") = (0.8, 0.9), RGVG(Bach, Vivaldi) = (0.7, 0.75)* and ϕ*love(*α*, λ, λ1, λ2) = minV(*α*, λ, λ1*⋅*λ2).* Then

*(love(Ann, Vivaldi), (min(0.9, 1, 1*⋅*0.7), min(0.95, 1, 1*⋅*0.75)) = (0.7, 0.75));*

*(like(Ann, Bach), (min(0.9, 0.8, 1*⋅*1), min(0.95, 0.9, 1*⋅*1)) = (0.8, 0.9));* 

*(like(Ann, Vivaldi), (min(0.9, 0.8, 1*⋅*0.7), min(0.95, 0.9, 1*⋅*0.75)) = (0.7, 0.75)).* 

From Fuzzy Datalog to Multivalued Knowledge-Base 43


As the above examples show, the levels of "synonyms" can be computed according to proximity-based uncertainty functions. To determine all direct or indirect conclusions of the facts and rules of a program, a proximity based uncertainty function has to be ordered to each predicate of the program. The set of these functions will be the decoding-set of the program.

**Definition 13.** Let *P* be a multivalued Datalog program, and *FP* be the set of the program's functors. The decoding-set of *P* is: Φ*P =* {ϕ*p(*α*, λ, λ1,…, λn) |* ∀ *p/n* ∈ *FP* }*.*

### **3.3 Deduction with background knowledge**

The original deducing mechanism makes conclusions according to the rules of the program, but from now on the background knowledge must be considered as well, so the original mechanism has to be modified. This modified deduction consists of two alternating parts: starting from the facts, their "synonyms" are determined; then, applying the suitable rules, other facts are derived, whose "synonyms" are determined in turn, and again the rules are applied, and so on. To define this in a precise manner, the concept of the modified consequence transformation will be introduced.

The consequence transformation of an mDATALOG program *P* is defined over the set of all multivalued sets of *P*'s Herbrand base, that is, over *FV(BP)*. To define the modified transformation's domain, let us extend *P*'s Herbrand universe with all possible ground terms of the background knowledge, obtaining the so-called modified Herbrand universe, *modHP*. The modified Herbrand base, *modBP*, is the set of all ground atoms whose predicate symbols occur in *P* ∪ *Bk* and whose arguments are elements of *modHP*.

However, it is possible that there are some special predicates in *P* which have no alternatives, even if their arguments have "synonyms". These predicates are called bound predicates. For such predicates, the modified Herbrand base only includes atoms that are present in the original Herbrand base.

**Definition 14.** The modified consequence transformation *modNTP : FV(modBP)* → *FV(modBP )* is defined as

$$modNT_P(X) = \{\,(q(s_1,\dots,s_n),\ \varphi_p(\alpha_p,\lambda_q,\lambda_{s_1},\dots,\lambda_{s_n}))\ \mid\ (q,\lambda_q)\in RS_{FV_p};\ (s_i,\lambda_{s_i})\in RG_{FV_{t_i}},\ 1\le i\le n\,\}\ \cup\ X,$$

where

$$(p(t_1,\dots,t_n)\leftarrow A_1,\dots,A_k;\ \beta;\ I)\in ground(P),$$

$$(|A_i|,\alpha_{A_i})\in X,\ 1\le i\le k \qquad (|A_i|\ \text{is the kernel of}\ A_i)$$

and α*p* is computed according to the actual extension of (1).

This transformation is inflationary over *FV(modBP)*, and it is monotone if *P* is positive. So, according to (Ceri et al 1990), it has a fixed point, and if *P* is positive, this is the least fixed point. This fixed point is a model of *P*, but because *lfp(NTP)* ⊆ *lfp(modNTP)*, it is not a minimal one (Achs 2010).
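The alternation of the proximity step and the rule step in the modified consequence transformation, and its iteration to a fixed point, can be sketched in a few lines. The sketch below is a deliberately simplified illustration, not the chapter's full algorithm: levels are single fuzzy values instead of IFS/IVS pairs, predicates are unary, the rule shape is fixed, ϕ is taken to be the plain minimum (which satisfies the constraints of Section 3.2), and all names and numbers are invented.

```python
# Simplified sketch of modNT_P over single-valued fuzzy levels.

def phi(alpha, lam_pred, lam_term):
    # a proximity-based uncertainty function: the minimum satisfies
    # phi <= min and phi(alpha, 1, 1) = alpha
    return min(alpha, lam_pred, lam_term)

def mod_nt(x, pred_prox, term_prox, rules):
    """One application of modNT_P: add the 'synonyms' of the known facts,
    then the rule consequences, keeping the old facts (inflationary)."""
    new = dict(x)
    for (p, t), alpha in x.items():                       # proximity step
        for q, lq in pred_prox.get(p, {p: 1.0}).items():
            for s, ls in term_prox.get(t, {t: 1.0}).items():
                new[(q, s)] = max(new.get((q, s), 0.0), phi(alpha, lq, ls))
    for head, body, beta in rules:                        # rule step
        for (p, t), alpha in x.items():
            if p == body:
                # Goedel-style implication: min of body level and rule level
                new[(head, t)] = max(new.get((head, t), 0.0), min(alpha, beta))
    return new

pred_prox = {"p": {"p": 1.0, "q": 0.8}}    # q is a "synonym" of p
term_prox = {"a": {"a": 1.0, "b": 0.7}}    # b is a "synonym" of a
rules = [("r", "q", 0.6)]                  # r(x) <- q(x); level 0.6

x = {("p", "a"): 0.7}                      # the only original fact
while (nxt := mod_nt(x, pred_prox, term_prox, rules)) != x:
    x = nxt                                # iterate to the least fixed point
```

Starting from the single fact *(p(a), 0.7)*, the iteration first adds the synonyms *p(b)*, *q(a)*, *q(b)*, then the rule consequences *r(a)* and *r(b)*, and stops when a further application changes nothing; since the transformation is inflationary, the set of derived facts can only grow.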

42 Fuzzy Logic – Algorithms, Techniques and Implementations


The modifying algorithm is independent of the evaluation sequence, so stratification can be applied under the same condition. That is, the modified consequence transformation has a least fixed point in the case of stratified programs as well. This transformation connects an mDATALOG program, the background knowledge and the decoding-set of the program, so these four components can form a knowledge-base. However, there may be other transformations connecting the three other parts with each other; therefore the universal concept of a multivalued knowledge-base is defined with an arbitrary deduction algorithm:

**Definition 15.** A multivalued knowledge-base (*mKB*) is a quadruple

$$mKB = (P,\ Bk,\ \Phi_P,\ dA(P,\ Bk,\ \Phi_P)),$$

where *P* is a multivalued Datalog program, *Bk* is a background knowledge, Φ*P* is a decoding-set of *P* and *dA* is any deduction algorithm connecting the three other parts with each other. The least fixed point of the deduction algorithm is called the consequence of the knowledge-base, denoted by

$$C(Bk,\ P,\ \Phi_P,\ dA) = lfp(dA(P,\ Bk,\ \Phi_P)).$$

Because the actual deduction algorithm is the modified consequence transformation, now the consequence is

$$C(Bk,\ P,\ \Phi_P,\ dA) = lfp(modNT_P).$$

Note: if it is important to emphasize that there are bound predicates in *P*, then *mKB* can be denoted by *mKB = (P, Bp, Bk,* Φ*P, dA(P, Bp, Bk,* Φ*P)),* where *Bp* is the set of bound predicates.

**Example 10**. Let us suppose that an internet agent's job is to send a message to its clients if the cinema (*C*) presents a film which its clients like. The agent knows that people generally go (*go*) to the cinema if they can pay (*cp*) for the ticket and are interested in (*in*) the film presented (*pr*) in the cinema. It also knows that people usually want to see (*ws*) a film if they like (*li*) its actor (*ac*). Moreover, it knows that Paul (*P*) has enough money (*hm*) and that he enjoys (*en*) Chaplin (*Ch*) very much. In the cinema, a film of Stan and Pan (*SP*) is presented. Should the agent inform Paul about this film? How much will he want to go to the cinema?

This situation can be modelled, for example, by the following multivalued knowledge-base.

Let the IVS valued mDATALOG program and the background knowledge be as follows:

$$go(x, C) \leftarrow pr(C, f),\ in(x, f),\ cp(x);\ (0.85, 0.95);\ I_{VL}. \tag{R1}$$

$$ws(f, x) \leftarrow ac(f, y),\ li(x, y);\ (0.8, 0.85);\ I_{VG_2}. \tag{R2}$$

 *(hm(P), (0.75, 0.8)). (en(P, Ch), (0.9, 0.95)).* 

 *(pr(C, Film), (1,1)). (ac(Film, SP), (1,1)).* 

According to their role, *pr(C, Film)* and *ac(Film, SP)* have no alternatives.



Let the proximities be:

*RV(in, ws) = (0.7, 0.8).* *RV(Ch, SP) = (0.85, 0.9).*

*RV(li, en) = (0.8, 0.9).*

*RV(cp, hm) = (0.9, 0.95).*

According to the connecting algorithm, it is enough to consider only the proximity-based uncertainty functions of head-predicates. Let these functions be the minimum function:

ϕ*go(*α*, λ, λ1, λ2) =* ϕ*pr(*α*, λ, λ1, λ2) =* ϕ*ac(*α*, λ, λ1, λ2) =* ϕ*en(*α*, λ, λ1, λ2) =* ϕ*ws(*α*, λ, λ1, λ2) := minV(*α*, λ, λ1, λ2),*

ϕ*hm(*α*, λ, λ1) := minV(*α*, λ, λ1).*

The modified consequence transformation has the following steps:

*X0 =* {*(hm(P), (0.75, 0.8)), (en(P, Ch), (0.9, 0.95)), (pr(C, Film), (1,1)), (ac(Film, SP), (1,1))*}

(according to the proximity)

*X1 = modNTP(X0) = X0* ∪ {*(cp(P),* ϕ*hm((0.75, 0.8), (0.9, 0.95), (1,1)) = (min(0.75, 0.9, 1), min(0.8, 0.95, 1)) = (0.75, 0.8)),*
*(en(P, SP),* ϕ*en((0.9, 0.95), (1,1), (1,1), (0.85, 0.9)) = (0.85, 0.9)),*
*(li(P, Ch),* ϕ*en((0.9, 0.95), (0.8, 0.9), (1,1), (1,1)) = (0.8, 0.9)),*
*(li(P, SP),* ϕ*en((0.9, 0.95), (0.8, 0.9), (1,1), (0.85, 0.9)) = (0.8, 0.9))*}

(applying the rules – only (R2) can be applied)

*X2 = modNTP(X1) = X1* ∪ {*(ws(Film, P), f(IVG2,* α*,* β*) = f(IVG2, minV((1,1), (0.8, 0.9)), (0.8, 0.85)) = minV((0.8, 0.9), (0.8, 0.85)) = (0.8, 0.85))*}

(according to the proximity)

*X3 = modNTP(X2) = X2* ∪ {*(in(Film, P),* ϕ*ws((0.8, 0.85), (0.7, 0.8), (1,1), (1,1)) = (0.7, 0.8))*}

(applying the rules – (R1) can be applied)

*X4 = modNTP(X3) = X3* ∪ {*(go(P, C), f(IVL,* α*,* β*) = f(IVL, minV((1,1), (0.7, 0.8), (0.75, 0.8)), (0.85, 0.95)) = (max(0, 0.8 + 0.85 - 1), max(0, 0.7 + 0.95 - 1)) = (0.65, 0.65))*}

X*4* is a fixed point, so it is the consequence of the knowledge-base.

According to this result, the agent will know that the message can be sent, because Paul will probably enjoy Stan and Pan at a likelihood level of 85-90% (level (0.85, 0.9)), and there is a good chance (65%) that Paul will go to the cinema.
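The arithmetic of Example 10 can be replayed step by step. The sketch below is only a numerical check under the conventions used in the example (all variable names are invented): IVS levels are (lower, upper) pairs, minV is the componentwise minimum, and the two implication operators follow the formulas appearing in the computation.

```python
# Step-by-step numerical check of Example 10.

def min_v(*levels):
    # componentwise minimum of IVS (lower, upper) pairs
    return (min(l[0] for l in levels), min(l[1] for l in levels))

def f_ivg2(alpha, beta):
    # Goedel-style implication: minimum of body level and rule level
    return min_v(alpha, beta)

def f_ivl(alpha, beta):
    # Lukasiewicz-style implication, as computed in the example
    return (max(0.0, alpha[1] + beta[0] - 1), max(0.0, alpha[0] + beta[1] - 1))

one = (1.0, 1.0)
hm, en = (0.75, 0.8), (0.9, 0.95)          # levels of hm(P), en(P, Ch)
pr, ac = one, one                          # pr(C, Film), ac(Film, SP)
r_cp_hm, r_li_en = (0.9, 0.95), (0.8, 0.9) # predicate proximities
r_in_ws, r_ch_sp = (0.7, 0.8), (0.85, 0.9)

cp = min_v(hm, r_cp_hm, one)                  # cp(P)
li_p_sp = min_v(en, r_li_en, one, r_ch_sp)    # li(P, SP)
ws = f_ivg2(min_v(ac, li_p_sp), (0.8, 0.85))  # ws(Film, P) via (R2)
in_ = min_v(ws, r_in_ws, one, one)            # in(Film, P)
go = f_ivl(min_v(pr, in_, cp), (0.85, 0.95))  # go(P, C) via (R1)
```

The chain reproduces the fixed-point levels above: *cp(P) = (0.75, 0.8)*, *ws(Film, P) = (0.8, 0.85)* and *go(P, C) = (0.65, 0.65)*.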

### **4. Evaluating algorithms**

The fixed point-query is a bottom-up evaluation algorithm, which may involve many superfluous calculations. However, very often only a particular question is of interest, and only the answer to this question needs to be searched for. If a goal (query) is specified together with the multivalued knowledge-base, it is not necessary to evaluate all facts and rules; it is enough to consider only a part of them. A goal is a pair *(q(t1, t2, …, tn);* α*)*, where *q(t1, t2, …, tn)* is an atom and α is the fuzzy, the intuitionistic, the interval-valued or the bipolar level of the atom. *q* may contain variables, and its levels may be known or unknown values. The evaluation algorithm gives the answer to this query.

In standard Datalog, the most common approach in the top-down direction is called the query – sub-query framework. A goal, together with a program, determines a query. Literals in the body of any one of the rules defining the goal predicate are sub-goals of the given goal. Thus, a sub-goal together with the program yields a sub-query. In order to answer the query, each goal is expanded into a list of sub-goals, which are recursively expanded in their turn. That is, considering a goal, all rule-heads unifying with the goal are selected, and the literals of the rule-bodies become the new sub-goals of the given goal, which are evaluated one by one in an arbitrary order. This procedure continues until the facts have been obtained.
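Ignoring uncertainty levels, variables and proximities, the query – sub-query expansion can be sketched as a recursive procedure over a propositional program; all names below are invented for illustration.

```python
# Toy sketch of the query - sub-query idea: to answer a goal, try its
# defining rules and recursively expand the rule bodies into sub-goals
# until facts are reached.

rules = {                     # head -> list of alternative rule bodies
    "go": [["pr", "in", "cp"]],
    "in": [["ws"]],
}
facts = {"pr", "ws", "cp"}

def prove(goal):
    if goal in facts:                    # a fact answers the sub-query
        return True
    for body in rules.get(goal, []):     # try each defining rule
        if all(prove(sub) for sub in body):
            return True                  # all sub-goals succeeded
    return False
```

Here `prove("go")` expands *go* into the sub-goals *pr*, *in*, *cp*; *in* is expanded in its turn through its own rule, and the recursion stops at the facts.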

The situation is the same with a multivalued knowledge-base, but in this case the algorithm is completed with the computation of the unification levels. However, it is possible that suitable rules do not exist for the goal itself, but they do exist for its synonyms. For example, the goal is to know whether Ann likes Bach, but there are rules only describing the options of loving somebody, and there are facts only about Vivaldi. In such cases, the synonyms are used. Therefore, the algorithm has to consider the proximities and compute the uncertainty levels. It is a bidirectional evaluation: firstly, the uncertainty-free rules and the proximities are evaluated in a top-down manner, obtaining the required starting facts; then the computation of uncertainties is executed in the opposite direction, that is, according to the fixed-point algorithm.

### **4.1 Evaluation of a general knowledge-base**

The uncertainty levels are not required in the top-down part of the evaluation, so this part of the algorithm can be based on the concept of classical substitution and unification (Ceri et al 1990, Ullman 1988, etc.). However, other kinds of substitutions may be necessary as well: to substitute some predicate *p* or term *t* with their proximity sets *RSFVp* and *RGFVt*, and to substitute some proximity sets with their members.

From now on, for the sake of a simpler terminology, the terms "goal", "rule" and "fact" will refer to these concepts without uncertainty levels. An AND/OR tree arises during the evaluation; this is the searching tree. Its root is the goal; its leaves are either YES or NO. The parent nodes of YES are the facts, and uncertainty can be computed moving towards the root. This tree is built up by a periodic change of three kinds of steps: a proximity-based unification, a rule-based unification and a splitting step.

Proximity-based unification unifies the predicate symbols of sub-goals and the members of their proximity sets. Rule-based unification unifies the sub-goals with the head of suitable rules, and continues the evaluating by the bodies of these rules. The splitting step splits the rule-body into sub-goals if the body contains more literals or splits a literal of proximity sets into literals of the suitable ground atoms.

During the construction process, the edges are labelled by the information necessary for computing the uncertainties. The searching graph is built up, according to its depth, in the following way.

If the goal is on depth *0*, then every successor of any node on depth *3k+2 (k = 0, 1, …)* is in AND connection, and the others are in OR connection. The step after depth *3k (k = 0, 1, …)* is a proximity-based unification, after depth *3k+1 (k = 0, 1, …)* a rule-based unification, and after depth *3k+2 (k = 0, 1, …)* a splitting step. In detail:

If the atom *p(t1, t2, …, tn)* is in depth *3k (k = 0, 1, …)*, then let the successor nodes be all possible *p'(t1, t2, …, tn)*, where *p'* ∈ *RSFVp*. The edges starting from these nodes are labelled by the proximity-based uncertainty functions ϕ*p'*.

If the atom *L* is in depth *3k+1 (k = 0, 1, …)*, then the successor nodes will be the unified bodies of the suitable rules or the unified facts.

That is, if the head of rule *M* ← *M1,…,Mn*, *(n>0)* is unifiable with *L*, then the successors of *L* will be *M1*θ*,…,Mn*θ, where θ is the most general unification of *L* and *M*. The edges starting from these nodes are labelled by the uncertainty functions belonging to the implication operator of the applied rules and by the uncertainty level of the rule.

If *n = 0*, that is, there is a fact in the program with the predicate symbol of *L*, then the successors will be the unified facts. If *L = p(t1, t2, …, tn)* and there is a fact in the program with predicate symbol *p*, then let the successor nodes be all possible *p(t'1, t'2,…, t'n)*, where *t'i* = *ti* if *ti* is a ground term, or *t'i* = *RGFVti*θ if *ti* is a variable and θ is a suitable unification. The edges starting from these nodes are not labelled.

According to the previous paragraph, there are three kinds of nodes in depth *3k+2 (k = 0, 1, …)*: a unified body of a rule; a unified fact, the arguments of which are ordinary ground terms or proximity sets; or the symbol NO.

In the first case, the successors are the members of the body. They are in AND connection. The connected edges are not labelled, but because of the AND connection, the minimum value of the successors' levels is taken during the computation.

In the second case, the successors are the so-called facts-sets. This means that if the node is *p(t1, t2,…, tn)*, where *ti* is a ground term or a proximity set, then the facts-set is the set of all possible *p(t'1, t'2,…, t'n)*, where *t'i* ∈ *RGFVti*. The edges starting from these nodes are labelled by the proximity-based uncertainty functions ϕ*p*.

The facts-set has a further successor, the symbol YES.

The NO-node has no successor.

A solution can be achieved in the graph along the path ending in the symbol YES. According to the unification algorithm, one of the literals located at the parent node of YES can also be found among the original facts of the program. Knowing its uncertainty and using the proximity-based uncertainty function of the label leading to this facts-set, the uncertainties of the other members of the facts-set can be computed. However, it is not necessary to compute all of them, only those which are appropriate for the pattern of the literal in the parent node of the facts-set. This means that if the literal is *p(t1, t2,…, tn)* and *ti* is a ground term, then it is enough to consider only *p(t'1, t'2,…, ti ,…, t'n)* from the facts-set, but if *ti* is a proximity set, then it is necessary to deal with all *p(t'1, t'2,…, t'i ,…, t'n)*, where *t'i* ∈ *RGFVti*.

In this way each starting fact can be appointed. Then a solution can be determined by connecting the suitable unifications and computing in succession the uncertainties according to the labels of the edges on the path from the symbol YES to the root of the graph. The union of these solutions is the answer to the given query. From the construction of the searching graph it follows:

**Proposition 7**. For a given goal and in the case of a finite evaluation graph, the top-down evaluation terminates and gives the same answer as the fixed point query.

**Example 11.** Consider the IFS program of Example 6, and let it be completed by proximities and proximity-based uncertainty functions. They are as follows:

 *(p(a), (0.7, 0.2)).* 

 *(q(b), (0.65, 0.3)).* 

 *(r(a, b), (0.7, 0.2)).* 

 *r(x, y)* ← *p(x), q(y); (0.75, 0.2); IFG2.* 

*RF(p, p1) = (0.7, 0.1).* *RF(a, a1) = (0.8, 0.1).*

*RF(q, q1) = (0.8, 0.1).* *RF(b, b1) = (0.6, 0.3).*

*RF(r, r1) = (0.75, 0.2).*

ϕ*p(*α*, λ, λ1)* := α ⋅ *λ* ⋅ *λ1.*

ϕ*q(*α*, λ, λ1)* := *minF(*α*, λ* ⋅ *λ1).*

ϕ*r(*α*, λ, λ1, λ2)* := *minF(*α*, λ, λ1, λ2).*

Let the goal be: *(r1(a1, x);* α*)*, where *x* is a variable. Then the evaluation graph is on the next page.

The three facts-sets can easily be seen in the graph. From the first one, the uncertainty of *r(a1, b)* and *r(a1, b1)* can be computed.

As *(r(a, b), (0.7, 0.2))* is a known member of the set, knowing this uncertainty, the proximity-based function and the proximities of the knowledge-base, the uncertainties can be determined:

*r(a1, b), (minF((0.7, 0.2), (1, 0), (0.8, 0.1), (1, 0)) = (0.7, 0.2)).* 

*r(a1, b1), (minF((0.7, 0.2), (1, 0), (0.8, 0.1), (0.6, 0.3)) = (0.6, 0.3)).* 

Applying the next label of the path, the uncertainties of *r1(a1, b)* and *r1(a1, b1)* can be found:

*(r1(a1, b), (0.7, 0.2))* 

*(r1(a1, b1), (0.6, 0.3))* 

it is necessary to deal with all *p(t'1, t'2,…, t'i ,…, t'n)*, where *t'i* ∈ *RGFVti* .

is the most general unification of *L* and *M*. The edges starting

θ

is a suitable unification. The

If the atom *L* is in depth *3k+1 (k=0, 1, …)*, then the successor nodes will be

• the unified facts if L is unifiable with any fact of the program, or

operator of the applied rules and by the uncertainty level of the rule.

after depth *3k+2 (k=0,1,…)* is a splitting step. In detail:

by the proximity-based uncertainty functions

• NO, if there is not any unifiable rule or fact.

θ

• the bodies of suitable unified rules or

*t'i* = *ti* if *ti* is a ground term or *t'i* = *RGFVti*

terms or proximity sets; or the symbol NO.

by the proximity-based uncertainty functions

The NO-node has no successor.

The facts-set has a further successor, the symbol YES.

edges starting from these nodes are not labelled.

will be *M1*

θ*,…,Mn*θ, where In this way each starting fact can be appointed. Then a solution can be determined by connecting the suitable unifications and computing in succession the uncertainties according to the labels of edges in the path from the symbol YES to the root of the graph. The union of these solutions is the answer to the given query.

From the construction of searching graph follows:

**Proposition 7**. For a given goal and in the case of finite evaluation graph, the top-down evaluation terminates and gives the same answer as the fixed point query.
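The AND/OR structure of the searching graph can be mirrored in a short sketch. The following Python fragment is purely illustrative (the node names, the tiny graph and the levels are invented): uncertainty levels are intuitionistic (μ, ν) pairs, AND connections take the minimum of the successors' levels, and alternative derivations are combined by the union.

```python
def min_f(*pairs):
    # Intuitionistic minimum: least membership, greatest non-membership.
    return (min(p[0] for p in pairs), max(p[1] for p in pairs))

def max_f(*pairs):
    # Union of alternative solutions (OR connection).
    return (max(p[0] for p in pairs), min(p[1] for p in pairs))

def evaluate(node, graph, leaf_levels):
    # Nodes absent from the graph are leaves with a known level.
    kind, children = graph.get(node, ("leaf", []))
    if kind == "leaf":
        return leaf_levels[node]
    levels = [evaluate(c, graph, leaf_levels) for c in children]
    return min_f(*levels) if kind == "and" else max_f(*levels)

# Toy graph: the goal is an OR of two rule bodies; each body is an AND of facts.
graph = {
    "goal": ("or", ["body1", "body2"]),
    "body1": ("and", ["p", "q"]),
    "body2": ("and", ["p", "r"]),
}
leaf_levels = {"p": (0.7, 0.2), "q": (0.65, 0.3), "r": (0.55, 0.4)}
print(evaluate("goal", graph, leaf_levels))   # -> (0.65, 0.3)
```

This only captures the AND/OR combination of levels; the proximity-based edge labels of the real searching graph are ignored here.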

**Example 11.** Consider the IFS program of Example 6, and let it be completed by proximities and proximity-based uncertainty functions:

```
 (p(a), (0.7, 0.2)). 
 (q(b), (0.65, 0.3)). 
 (r(a, b), (0.7, 0.2)). 
 r(x, y) ← p(x), q(y); (0.75, 0.2); IFG2. 
 RF (p, p1) = (0.7, 0.1). ϕp(α, λ, λ1) := α⋅ λ⋅ λ1.
 RF (q, q1) = (0.8, 0.1). ϕq(α, λ, λ1) := minF (α, λ⋅ λ1).
 RF (r, r1) = (0.75, 0.2). ϕr(α, λ, λ1, λ2) := minF (α, λ, λ1, λ2).
 RF (a, a1) = (0.8, 0.1). 
 RF (b, b1) = (0.6, 0.3).
```
Let the goal be *(r1(a1,x);* α*)*, where *x* is a variable. Then the evaluation graph is shown in Fig. 1.

The three facts-sets can easily be seen in the graph. From the first one, the uncertainty of *r(a1, b)* and *r(a1, b1)* can be computed.

As *(r(a, b), (0.7, 0.2))* is a known member of the set, knowing this uncertainty, the proximity-based function and the proximities of the knowledge-base, the uncertainties can be determined:

*r (a1, b), (minF ((0.7, 0.2), (1, 0), (0.8, 0.1), (1,0))= (0.7, 0.2)).* 

*r (a1, b1), (minF ((0.7, 0.2), (1, 0), (0.8, 0.1), (0.6,0.3))= (0.6, 0.3)).* 
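Both values can be checked mechanically. Taking minF on (μ, ν) pairs as the componentwise minimum of memberships and maximum of non-memberships (the reading consistent with the values above), a quick reader's sketch:

```python
def min_f(*pairs):
    # (mu, nu) pairs: least membership, greatest non-membership.
    return (min(p[0] for p in pairs), max(p[1] for p in pairs))

# r(a1, b): known fact level, identity proximities (1, 0), and RF(a, a1).
print(min_f((0.7, 0.2), (1, 0), (0.8, 0.1), (1, 0)))      # -> (0.7, 0.2)
# r(a1, b1): additionally RF(b, b1) = (0.6, 0.3).
print(min_f((0.7, 0.2), (1, 0), (0.8, 0.1), (0.6, 0.3)))  # -> (0.6, 0.3)
```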

Applying the next label of the path the uncertainty of *r1(a1, b)* and *r1(a1, b1)* can be found. They are as follows:

```
(r1(a1, b), (0.7, 0.2)) 
(r1(a1, b1), (0.6, 0.3))
```
From Fuzzy Datalog to Multivalued Knowledge-Base 49


Fig. 1. The evaluation graph of Example 11.

In the second facts-set *(p(a), (0.7, 0.2))* is the known fact; from this, *(p(a1), (0.56, 0.28))* follows. Similarly, from *(q(b), (0.65, 0.3))* one gets *(q(b1), (0.6, 0.3))*. Applying the min, fFG2 and ϕ*r* functions:

```
(r1(a1, b), (0.56, 0.3)) 
(r1(a1, b1), (0.56, 0.3))
```
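The chapter does not spell out how the product in ϕ*p* acts on the non-membership components; reading "⋅" as the usual intuitionistic product (product of memberships, probabilistic sum of non-memberships) reproduces the quoted value *(p(a1), (0.56, 0.28))*. A hedged check, not the authors' code:

```python
def prod_f(*pairs):
    # Intuitionistic product of (mu, nu) pairs (assumed reading of "."):
    # memberships multiply, non-memberships combine by probabilistic sum.
    mu, nu = 1.0, 0.0
    for m, n in pairs:
        mu *= m
        nu = nu + n - nu * n
    return (round(mu, 4), round(nu, 4))

# phi_p(alpha, lambda, lambda1) = alpha . lambda . lambda1 with lambda = (1, 0):
print(prod_f((0.7, 0.2), (1, 0), (0.8, 0.1)))   # -> (0.56, 0.28)
```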
As the answer is the union of the different solutions, the final answer is:

```
(r1(a1, b), (0.7, 0.2)) 
(r1(a1, b1), (0.6, 0.3))
```
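Assuming the standard intuitionistic union (componentwise maximum of memberships, minimum of non-memberships), the final answer follows by merging the per-path solutions for each ground atom:

```python
def max_f(*pairs):
    # Union of solutions: greatest membership, least non-membership.
    return (max(p[0] for p in pairs), min(p[1] for p in pairs))

# Two solutions for r1(a1, b) and r1(a1, b1) arrived on different paths:
print(max_f((0.7, 0.2), (0.56, 0.3)))   # -> (0.7, 0.2)
print(max_f((0.6, 0.3), (0.56, 0.3)))   # -> (0.6, 0.3)
```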
### **4.2 Special evaluation based on multivalued unification**

The necessity of bidirectional evaluation is derived from the generality of the implications and proximity functions, because their values can be computed only from known arguments, namely in a bottom-up manner. However, in special cases the computation can be realized in parallel with the evaluation of rules and proximities, so the algorithm can be a more efficient, purely top-down evaluation. This is the situation if all of the functions are the minimum function, that is, all applied implications are the second Gödel implication (G2) and all proximity-based uncertainty functions are the minimum function.

As mentioned earlier, in the case of standard Datalog the heart of the evaluation algorithm is the unification process. Our special evaluation of multivalued knowledge is based on unification as well, but on multivalued unification. The multivalued unification consists of two parts: one is the alternation of a rule-based unification, and the other is a proximity-based one. Both are extensions of the classical unification algorithm. Now the splitting step is inside these unifications: when evaluating a fact, the last proximity-based unification unifies the fact with its facts-set, and a rule-based unification splits these sets into their members.

### **4.2.1 Rule-based unification**


This unification algorithm is similar to the classical one; that is, the goal can be unified with the body of any one of the rules defining the goal predicate, if the body is not empty. The level of the unification is the level of the rule defining the goal predicate. In this case, a variable can be substituted with another variable or with a constant; a constant can be substituted with itself only. The next sub-goal of the evaluation process will be the first member of the body. It is possible that during the evaluation a variable of a later member is substituted by a proximity set. In such a case, in the course of the later evaluation, this proximity set will be substituted with itself.

If the predicate symbol of the goal is the predicate symbol of a fact, its arguments can be substituted as follows:

- The constants of the goal can be substituted
	- with themselves, if the goal contains any variable, or
	- with their proximity set, if the goal does not contain any variable.
- The proximity set argument of a goal can be substituted with itself.
- The variables of the goal can be substituted with the proximity set of the constants in the corresponding arguments of the fact. (E.g.: if *q(a,b)* is a fact and the goal is *q(x, y)*, then *x* can be substituted with *RFVa* and *y* with *RFVb*.)

The level of the unification is *1FV*.

If there is no fact with the same predicate symbol, the unification process fails.

There is a special case of unification: the facts-set of a predicate is unified with its members in the following way:

According to the previous unifications, among the literals of this facts-set there is one from the facts of the program. Knowing its uncertainty and the level of the proximities, the uncertainty of the other members can be computed. Then these members can be unified respectively:

- with the empty clause, if there is no other literal to evaluate in the parent node;
- with the remaining part of the body to be evaluated.

The level of this unification is the level of the fitting member, and the former proximity-set substitution of a variable is replaced by the suitable member of this set.

(E.g.: if there is a former *x|RFVa* substitution for literal *p(x)*, and *RFVa* = {*a, b*}, then the facts-set of unification is {*p(a), p(b)*}, which is unified with empty clauses. The substitutions for *x* are *x|a* and *x|b*, and the levels are the computed levels of *p(a)* and *p(b)* respectively.)
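The splitting step of this example can be sketched in a few lines; the function name `split` and the dictionary-based representation are invented for illustration only:

```python
# Hypothetical sketch of the splitting step described above: a former
# substitution x |-> RFV_a = {a, b} for literal p(x) is split into the
# members of the facts-set, each with its own computed (mu, nu) level.
def split(var, proximity_set, levels):
    # 'levels' maps each member of the facts-set to its computed level.
    return [({var: c}, levels[c]) for c in proximity_set]

facts_set_levels = {"a": (0.7, 0.2), "b": (0.6, 0.3)}  # levels of p(a), p(b)
print(split("x", ["a", "b"], facts_set_levels))
# -> [({'x': 'a'}, (0.7, 0.2)), ({'x': 'b'}, (0.6, 0.3))]
```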



### **4.2.2 Proximity-based unification**

This unification serves for handling proximities. When these steps are implemented, the following substitutions can be realized:

- A constant or a variable can be substituted with itself only.
- A predicate symbol can be substituted with the elements of its proximity set. The level of the unification is the current proximity value.
- A proximity set can be substituted with itself, except in the last step of the evaluation of a literal. In this case, that is, if each argument of the literal is a proximity set, the literal can be unified with its facts-set. The level of the unification is *1FV*.


### **4.2.3 The unification algorithm**

The evaluation algorithm combines the two kinds of unification. It starts with the proximity-based unification, and after it is finished, the two kinds alternate. The unification algorithm ends either with an empty clause (success) or with a failure. In the first case the variables get the values defined during the substitutions. If all uncertainties are combined by minimum, the actual level of the unification can be computed as the minimum of the former levels, and when the algorithm reaches the empty clause, its level is the level of the goal as well.

If the unification algorithm ends with a failure, there is no answer on this path.

If more answers arise during the evaluation, their union will be the resolution of the query.

According to the construction of unifications, the following proposition is true.

**Proposition 8.** For a given goal, and in the case of a finite evaluation graph, the above top-down evaluation gives the same answer as the fixed point query.

Notes:

- Although this algorithm was described for a knowledge-base based on a negation-free program, it is similar in the case of stratified programs; the only difference is the calculation of the uncertainty of the negated sub-goal, but the computing of the minimum remains the same.
- With a good depth limit this algorithm is suitable for evaluating recursive programs or infinite graphs as well.


**Example 12.** Let us consider a part of Example 10, and let it be completed with a new rule and new facts. Now the internet agent knows that people usually want to see (*ws*) a film if they like (*li*) its actor (*ac*), or they more or less like the subject (*su*) of the film. Moreover, it knows that Paul (P) enjoys (*en*) Chaplin (*Ch*) very much and mostly enjoys historical films (*H*). In the cinema, a film (*F1*) of Stan and Pan (*SP*) is presented. There are two other films (*F2, F3*). Both films' topic (*to*) is the war (*W*), but in a different manner: the first one's central message is the war, while the second one plays in wartime, where the war is only a background. From the former example it is known that the agent wants to know the interest of Paul. Therefore let our goal be *(in(P,x);* α*)*.

Let the IVS-valued mDATALOG program and the background knowledge be as follows:


```
 ws(f, x) ← ac(f, y), li(x, y); (0.8, 0.85); IVG2.    (R1) 
 ws(f, x) ← su(f, y), li(x, y); (0.75, 0.85); IVG2.   (R2) 
 (en(P, Ch), (0.9, 0.95)).  (en(P, H), (0.6, 0.7)). 
 (ac(F1, SP), (1, 1)).  (to(F2, W), (0.9, 0.95)).  (to(F3, W), (0.55, 0.6)).
```

According to its role, *ac(F1, SP)* has no alternatives. Let the other proximities be:

```
 RV (to, su) = (0.9, 1). 
 RV (in, ws) = (0.7, 0.8).  RV (Ch, SP) = (0.8, 0.9). 
 RV (li, en) = (0.8, 0.9).  RV (H, W) = (0.6, 0.8).
```

Let each proximity-based uncertainty function be the minimum function.

Then the evaluation graph is Fig. 2.

Fig. 2. The evaluation graph of Example 12.

According to the left path of the graph one can see that *(in (P, F1), (0.7, 0.8))*, because the applied substitution is *x|F1* and the minimum of the levels is *(0.7, 0.8)*.

The other paths are only half-drawn, and they are continued in a partial graph, because this part of evaluation is similar in all cases. The only differences are in uncertainty levels.
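Since every uncertainty function is the minimum here, each answer is just the componentwise minimum of the interval pairs collected along one path. The label multisets below are reconstructed from the program and the proximities, since Fig. 2 itself is not reproduced, so treat them as an assumption:

```python
def min_iv(*pairs):
    # Interval-valued minimum: componentwise min of (lower, upper) bounds.
    return (min(p[0] for p in pairs), min(p[1] for p in pairs))

# Left path (F1): rule R1, ac(F1, SP), RV(Ch, SP), en(P, Ch), RV(li, en), RV(in, ws).
print(min_iv((0.8, 0.85), (1, 1), (0.8, 0.9), (0.9, 0.95), (0.8, 0.9), (0.7, 0.8)))
# -> (0.7, 0.8)
# F2 path: rule R2, to(F2, W), RV(to, su), en(P, H), RV(H, W), RV(li, en), RV(in, ws).
print(min_iv((0.75, 0.85), (0.9, 0.95), (0.9, 1), (0.6, 0.7), (0.6, 0.8), (0.8, 0.9), (0.7, 0.8)))
# -> (0.6, 0.7)
```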



So, according to these paths the other answers for the query are:

*(in (P, F2), (0.6, 0.7)), (in (P, F3), (0.55, 0.6))* 

Fig. 3. The enlarged evaluation graph of Example 12.

### **5. Conclusion**

In this chapter, a possible multivalued knowledge-base was presented as a quadruple of background knowledge, a deduction mechanism, a decoding set and an algorithm connecting the background knowledge to the deduction mechanism.

The background knowledge is based on proximity relations between terms and between predicates and it serves as a mechanism handling "synonyms".


The deduction mechanism can be any of the extensions of Datalog. These extensions are fuzzy Datalog, based on fuzzy logic; intuitionistic or interval-valued Datalog, based on the suitable logics; and bipolar Datalog, which is a kind of coexistence of the former ones.

The semantics of Datalog is a fixed-point semantics, so the algorithm which connects the two main pillars of the knowledge-base is a generalization of the consequence transformation determining this fixed point. This transformation is defined on the extended Herbrand base of the knowledge-base, which is generated from the ground terms of the knowledge-base and its background knowledge.

Applying this transformation it is necessary to compute the uncertainty levels of "synonyms". The proximity-based uncertainty functions can do it, giving uncertainty values from the levels of the original fact and from the proximity values. The set of this kind of functions is the decoding set.

Two possible evaluation strategies were presented as well. One of them evaluates a general knowledge-base with arbitrary proximity-based uncertainty functions and arbitrary implication operators. The other one allows only minimum functions as proximity-based uncertainty functions, together with the special extension of the Gödel operator, but in this case a multivalued unification algorithm can be determined. This strategy is based on the alternating rule-based and proximity-based unification.

The improvement of this strategy and/or the deduction algorithm and/or the structure of background knowledge is a subject of further investigations.

A well structured multivalued knowledge-base and an efficient evaluating algorithm determining its consequence could be the basis of making decisions based on uncertain information, or it would be useful for handling argumentation or negotiation of agents. An implementation of this model would be an interesting future development as well.



### **Resolution Principle and Fuzzy Logic**

Hashim Habiballa
*University of Ostrava, Czech Republic*

### **1. Introduction**

Fuzzy Predicate Logic with Evaluated Syntax (FPL) (Novák, V.) is a well-studied and widely used logic capable of expressing vagueness. It has many applications based on a robust theoretical background, and it also requires an efficient formal proof theory. However, the most widely applied resolution principle (Dukić, N.) brings several syntactic obstacles, mainly arising from normal-form transformation, and FPL poses even harder problems when one tries to use the resolution principle. Solutions to these obstacles, based on non-clausal resolution (Bachmair, L.), were already proposed in (Habiballa, H.).

This article presents a natural integration of these two formal logical systems into a fully functioning inference system with effective proof-search strategies. It leads to the refutational resolution theorem prover for FPL (*RRTPFPL*). Another issue addressed in the paper concerns the efficiency of the presented inference strategies, developed originally for the proving system; their perspectives in combination with standard proof-search strategies are shown. The main problem for fuzzy logic theorem proving lies in the large number of possible proofs with different degrees, and an algorithm solving this problem is presented (Detection of Consequent Formulas, DCF). The algorithm is based on the detection of such redundant formulas (proofs) with different degrees.

The article presents the method which is the main point of the work on any automated prover. There are many strategies which make proofs more efficient when refutational proving is used; well-known strategies include orderings, the filtration strategy, the set of support, etc. One of the most effective strategies is the elimination of consequent formulas, i.e. checking whether a resolvent is a logical consequence of a formula in the set of axioms or of a previous resolvent. If this condition holds, it is reasonable not to include the resolvent in the set of resolvents, because if the refutation can be deduced from it, then it can also be deduced from the original resolvent which implies it.

### **2. First-order logic**

For the purposes of *RRTPFPL* the generalized principle of resolution, defined in the research report (Bachmair, L.), will be used. A propositional form of the rule is defined first, and it is then lifted into first-order logic. The propositional form of the general resolution is introduced as follows.

**General resolution - propositional version**

$$\frac{F[G] \qquad F'[G]}{F[G/\bot] \vee F'[G/\top]} \tag{1}$$
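The general resolution rule can be exercised on a tiny propositional instance. The sketch below uses a naive string representation of formulas (an invented encoding for illustration, not part of the prover): from F[G] and F'[G] it builds F[G/false] ∨ F'[G/true].

```python
# Illustrative only: formulas as strings, resolving on the subformula G by
# plain textual substitution (no parsing, so G must not clash with other names).
def resolve(f, f2, g):
    return f"({f.replace(g, 'false')}) | ({f2.replace(g, 'true')})"

# F = a | G and F' = ~G | b; the resolvent (a | false) | (~true | b)
# simplifies to a | b, i.e. the classical resolvent on G.
print(resolve("a | G", "~G | b", "G"))   # -> (a | false) | (~true | b)
```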





> Fuzzy Predicate Logic with Evaluated Syntax (FPL) (Novák, V.) is a well-studied and widely used logic capable of expressing vagueness. It has many applications built on a robust theoretical background, and it also requires an efficient formal proof theory. However, the most widely applied resolution principle (Dukić, N.) raises several syntactic obstacles, mainly arising from the transformation into normal form, and FPL poses even harder problems when one tries to use the resolution principle. Solutions to these obstacles, based on non-clausal resolution (Bachmair, L.), were already proposed in (Habiballa, H.).

> This article presents a natural integration of these two formal logical systems into a fully functioning inference system with effective proof-search strategies. It leads to a refutational resolution theorem prover for FPL (*RRTPFPL*). Another issue addressed in the paper concerns the efficiency of the presented inference strategies, originally developed for the proving system; we show their potential in combination with standard proof-search strategies. The main problem for fuzzy logic theorem proving lies in the large number of possible proofs with different degrees, and we present an algorithm (Detection of Consequent Formulas - DCF) solving this problem. The algorithm is based on the detection of such redundant formulas (proofs) with different degrees.

> The article presents the method that is the central point of work on any automated prover. There are many strategies that make proofs more efficient in refutational proving; we consider the well-known ones - orderings, the filtration strategy, the set of support, etc. One of the most effective strategies is the elimination of consequent formulas: checking whether a resolvent is a logical consequence of a formula in the set of axioms or of a previous resolvent. If this condition holds, it is reasonable not to include the resolvent into the set of resolvents, because if a refutation can be deduced from it, then it can also be deduced from the original resolvent that implies it.

### **2. First-order logic**

For the purposes of *RRTPFPL* we will use the generalized principle of resolution defined in the research report (Bachmair, L.). A propositional form of the rule is defined first, and it is then lifted into first-order logic. We introduce the propositional form of general resolution.

### **General resolution - propositional version**

$$\frac{F[G] \quad F'[G]}{F[G/\perp] \lor F'[G/\top]} \tag{1}$$

where the propositional logic formulas *F* and *F*′ are the premises of the inference and *G* is an occurrence of a subformula of both *F* and *F*′. The expression *F*[*G*/⊥] ∨ *F*′[*G*/⊤] is the resolvent of the premises on *G*. Every occurrence of *G* is replaced by false in the first formula and by true in the second one. *F* is also called the positive premise, *F*′ the negative premise, and *G* the resolved subformula.

The proof of the soundness of the rule is similar to that of the clausal resolution rule. Suppose an interpretation I in which both premises are valid. In I, G is either true or false. If G (resp. ¬G) is true in I, then so is *F*′[*G*/⊤] (resp. *F*[*G*/⊥]).
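As an illustration, the propositional rule and the accompanying ⊤/⊥ simplification can be sketched as follows (the tuple encoding and function names are my own, not from the text):

```python
# Sketch of general (non-clausal) resolution, rule (1):
#   F[G]  F'[G]  |-  F[G/bot] v F'[G/top]
# Formulas are nested tuples ('not', X), ('or', X, Y), ('and', X, Y);
# atoms are strings; 'top'/'bot' are the logical constants.

def subst(f, g, c):
    """Replace every occurrence of subformula g in f by constant c."""
    if f == g:
        return c
    if isinstance(f, tuple):
        return (f[0],) + tuple(subst(x, g, c) for x in f[1:])
    return f

def simplify(f):
    """Propagate the top/bot constants (boolean simplification)."""
    if not isinstance(f, tuple):
        return f
    op, args = f[0], [simplify(x) for x in f[1:]]
    if op == 'not':
        return {'top': 'bot', 'bot': 'top'}.get(args[0], ('not', args[0]))
    if op == 'or':
        if 'top' in args:
            return 'top'
        args = [a for a in args if a != 'bot']
        return 'bot' if not args else (args[0] if len(args) == 1 else ('or', *args))
    if op == 'and':
        if 'bot' in args:
            return 'bot'
        args = [a for a in args if a != 'top']
        return 'top' if not args else (args[0] if len(args) == 1 else ('and', *args))
    return (op, *args)

def resolve(f_pos, f_neg, g):
    """Resolvent of positive premise f_pos and negative premise f_neg on g."""
    return simplify(('or', subst(f_pos, g, 'bot'), subst(f_neg, g, 'top')))

# Step 4 of Example 1: premises a and (not a v not b v c), resolved on a
clause1 = ('or', ('not', 'a'), ('or', ('not', 'b'), 'c'))
print(resolve('a', clause1, 'a'))   # -> ('or', ('not', 'b'), 'c')
```

The simplification step is what turns ⊥ ∨ ¬⊤ ∨ ¬b ∨ c into ¬b ∨ c, as in Example 1 below.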

The revised version of the paper, which forms the core of the handbook (Bachmair, L.), is closely related to the notions of selection functions and ordering constraints. By a selection function we mean a mapping *S* that assigns to each clause *C* a (possibly empty) multiset *S*(*C*) of negative literals in *C*. In other words, the function *S* selects a (possibly empty) negative subclause of *C*. We say that an atom *A*, or a literal ¬*A*, is selected by *S* if ¬*A* occurs in *S*(*C*). There are no selected atoms or literals if *S*(*C*) is empty. A lexicographic path ordering over a total precedence can be used as the usual ordering, but in this case the ordering is admissible only if predicate symbols have higher precedence than logical symbols and the constants ⊤ and ⊥ are smaller than the other logical symbols. This means the precedence is *A* ≻ ≡ ≻ ⊃ ≻ ¬ ≻ ∨ ≻ ∧ ≻ ⊤ ≻ ⊥. The handbook also addresses other key issues for automated theorem proving - the efficiency of the proof search, which is closely related to the notion of *redundancy*.

If we want to generalize the notion of resolution and lift it into the first-order case, we first have to define the notion of a selection function for general clauses. General clauses are multisets of arbitrary quantifier-free formulas, denoting the disjunction of their elements. Note that we can also work with the special case of a general clause with one element, which yields a standard quantifier-free formula of first-order logic. A (general selection) function is a mapping *S* that assigns to each general clause *C* a (possibly empty) set *S*(*C*) of non-empty sequences of (distinct) atoms in *C* such that either *S*(*C*) is empty or else, for all interpretations *I* in which *C* is false, there exists a sequence *A*1, ..., *Ak* in *S*(*C*), all atoms of which are true in *I*. A sequence *A*1, ..., *Ak* in *S*(*C*) is said to be *selected* (by *S*).

We have to define the notion of polarity for these reasons, following the handbook (Bachmair, L.). It is based on the observation that a subformula *F*′ in *E*[*F*′] is *positive* (resp. *negative*) if *E*[*F*′/⊤] (resp. *E*[*F*′/⊥]) is a tautology. Thus, if *F*′ is *positive* (resp. *negative*) in *E*, then *F*′ (resp. ¬*F*′) logically implies *E*. Even though it might seem that determining the polarity of a subformula is an NP-hard problem, we can use syntactic criteria for this computation. In this case the complexity of the algorithm is linear (note that we base our theory on similar syntactic criteria below - the structural notions definition).

### **Proposition 1.** *Polarity criteria*

*1. F is a positive subformula of F.*

*2. If ¬G is a positive (resp. negative) subformula of F, then G is a negative (resp. positive) subformula of F.*

*3. If G ∨ H is a positive subformula of F, then G and H are both positive subformulas of F.*

*4. If G ∧ H is a negative subformula of F, then G and H are both negative subformulas of F.*

*5. If G → H is a positive subformula of F, then G is a negative subformula and H is a positive subformula of F.*

*6. If G → ⊥ is a negative subformula of F, then G is a positive subformula of F.*

*7. F is positive in a clause C if it is an element of C.*

Note that this proposition applies both to formulas and clauses and allows us to determine the polarity of any subformula in a formula. It is safe to *select any sequence of negative atoms* in a general clause, since a negative atom cannot be false in an interpretation in which the clause is false. With the notion of polarity as a selection function, it is possible to state another notion of general resolution, based on orderings applied to clauses.
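The syntactic criteria admit a linear-time recursive computation; a minimal sketch (the tuple encoding is an illustrative assumption, and the rules are applied symmetrically for both polarities):

```python
# Polarity of every subformula occurrence, computed syntactically:
# negation flips polarity, and/or preserve it, implication flips the
# antecedent and preserves the consequent.

def polarities(f, pol=1, acc=None, path=()):
    """Map each occurrence (keyed by its child-index path) to +1/-1."""
    if acc is None:
        acc = {}
    acc[path] = pol
    if isinstance(f, tuple):
        op = f[0]
        if op == 'not':
            polarities(f[1], -pol, acc, path + (1,))
        elif op in ('and', 'or'):
            polarities(f[1], pol, acc, path + (1,))
            polarities(f[2], pol, acc, path + (2,))
        elif op == 'imp':
            polarities(f[1], -pol, acc, path + (1,))
            polarities(f[2], pol, acc, path + (2,))
    return acc

# A -> B : Pol(A) = -1, Pol(B) = 1
pols = polarities(('imp', 'A', 'B'))
print(pols[(1,)], pols[(2,)])   # -> -1 1
```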

#### **General ordered resolution with selection** *O*≻*S*

$$\frac{C\_1(A\_1)\,...\,C\_n(A\_n) \quad D(A\_1,...,A\_n)}{C\_1(\bot)\,...\,C\_n(\bot) \quad D(\top,...,\top)}\tag{2}$$

where (i) either *A*1, ..., *An* is selected by *S* in *D*, or else *S*(*D*) is empty, *n* = 1, and *A*1 is maximal in *D*; (ii) each atom *Ai* is maximal in *Ci*; and (iii) no clause *Ci* contains a selected atom.

According to (Bachmair, L.), an inference system based on this rule is refutationally complete. When trying to extend this to the first-order case, we have to use the lifting lemma.

**Lemma 1.** *Lifting lemma*

*Let M be a set of clauses and K* = *G*(*M*) *(set of ground instances). If*

$$\frac{C\_1...C\_n \quad C\_0}{C}$$

*is an inference in O*≻*SM*(*K*)*, then there exist clauses C*′*i in M, a clause C*′*, and a ground substitution σ such that*

$$\frac{C\_1'...C\_n'\quad C\_0'}{C'}$$

*is an inference in O*≻*S*(*M*)*, Ci* = *C*′*iσ, and C* = *C*′*σ.*

#### **Example 1.** *General resolution - polarity based selection*

*1.* ¬*a* ∨ ¬*b* ∨ *c (axiom),*

*2. a (axiom), 3. b (axiom)*

*4.* ⊥ ∨ ¬⊤ ∨ ¬*b* ∨ *c (a is a negative atom in (1) - selected in (1) as negative premise, and (2) as positive premise respectively)* ⇒ ¬*b* ∨ *c*

*5.* ⊥ ∨ ¬⊤ ∨ *c (b is a negative atom in (4) - selected in (4) as negative premise, and (3) as positive premise respectively)* ⇒ *c*

In the example we used the notion of polarity as a selection function. For instance, in line 4 we select the atom a based on its negative polarity in formula 1 (according to criteria 1, 3 and 2 of the proposition, applied level by level), which means that 1. is the negative premise.

Further we can observe the behavior of the rule within the frame of clausal-form resolution. Consider the following table, showing various cases of resolution on clauses.

| Premise 1 | Premise 2 | Resolvent | Simplified | Comments |
|---|---|---|---|---|
| *a* ∨ *b* | *b* ∨ *c* | (*a* ∨ ⊥) ∨ (⊤ ∨ *c*) | ⊤ | no compl. pair |
| *a* ∨ ¬*b* | *b* ∨ *c* | (*a* ∨ ⊤) ∨ (⊤ ∨ *c*) | ⊤ | redundant inference |
| *a* ∨ *b* | ¬*b* ∨ *c* | (*a* ∨ ⊥) ∨ (⊥ ∨ *c*) | *a* ∨ *c* | clausal resolution |
| *a* ∨ ¬*b* | ¬*b* ∨ *c* | (*a* ∨ ⊤) ∨ (⊥ ∨ *c*) | ⊤ | no compl. pair |

Table 1. Clausal resolution in the context of the non-clausal resolution

#### **Example 2.** *General resolution with equivalence*

*1. a ∧ c ↔ b ∧ d (axiom), 2. a ∧ c (axiom), 3. ¬[b ∧ d] (axiom) - negated goal*

*4. [a ∧ ⊥] ∨ [a ∧ ⊤] (resolvent from (2), (2) on c) ⇒ a*

*5. [a ∧ ⊥] ∨ [a ∧ ⊤ ↔ b ∧ d] ((2), (1) on c) ⇒ a ↔ b ∧ d*

*6. ⊥ ∨ [⊤ ↔ b ∧ d] ((4), (5) on a) ⇒ b ∧ d*

*7. ⊥ ∧ d ∨ ⊤ ∧ d ((6), (6) on b) ⇒ d*

*8. b ∧ ⊥ ∨ b ∧ ⊤ ((6), (6) on d) ⇒ b*

*9. ⊥ ∨ ¬[⊤ ∧ d] ((8), (3) on b) ⇒ ¬d*

*10. ⊥ ∨ ¬⊤ ((7), (9) on d) ⇒ ⊥ (refutation)*

When trying to refine the general resolution rule for fuzzy predicate logic, it is important to devise a sound and complete unification algorithm. Standard unification algorithms require variables to be treated only as universally quantified ones. We will present a more general unification algorithm, which can deal with existentially quantified variables without the need for those variables to be eliminated by skolemization. It should be stated that the following unification process does not allow an occurrence of the equivalence connective; equivalence must first be removed by the rewrite rule *A* ↔ *B* ⇔ [*A* → *B*] ∧ [*B* → *A*].
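For contrast with the extended algorithm of Definition 4 below, plain syntactic unification (every variable treated as universal; the occurs check is omitted for brevity) can be sketched as follows. The term encoding and names are my own:

```python
# Standard first-order unification sketch: variables are capitalized
# strings, compound terms are tuples (functor, arg1, ...).

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    """Follow variable bindings in substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(t, u, s=None):
    """Return a substitution dict unifying t and u, or None on failure."""
    s = dict(s or {})
    t, u = walk(t, s), walk(u, s)
    if t == u:
        return s
    if is_var(t):
        s[t] = u
        return s
    if is_var(u):
        s[u] = t
        return s
    if isinstance(t, tuple) and isinstance(u, tuple) \
            and t[0] == u[0] and len(t) == len(u):
        for a, b in zip(t[1:], u[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

# p(X, f(a)) unifies with p(b, f(Y)) under {X: b, Y: a}
print(unify(('p', 'X', ('f', 'a')), ('p', 'b', ('f', 'Y'))))
```

The extension developed in this chapter restricts exactly the variable-to-variable case of this procedure, via the VUR conditions on global quantification.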

We assume that the language and semantics of FOL are standard. We use terms - individuals (*a*, *b*, *c*, ...), functions (with n arguments) (*f*, *g*, *h*, ...), variables (*X*, *Y*, *Z*, ...), predicates (with n arguments) (*p*, *q*, *r*, ...), logical connectives (∧, ∨, →, ¬), quantifiers (∃, ∀) and logical constants (⊥, ⊤). We also work with the standard notions of logical and special axioms (sets LAx, SAx), logical consequence, consistency etc., as they are used in mathematical logic.

### **Definition 1.** *Structural notions of a FOL formula*

*Let F be a formula of FOL; then the structural mappings Sub (subformula), Sup (superformula), Pol (polarity) and Lev (level) are defined as follows:*

*F* = *G* ∧ *H or F* = *G* ∨ *H*: *Sub*(*F*) = {*G*, *H*}*, Sup*(*G*) = *F*, *Sup*(*H*) = *F*, *Pol*(*G*) = *Pol*(*F*)*, Pol*(*H*) = *Pol*(*F*)

*F* = *G* → *H*: *Sub*(*F*) = {*G*, *H*}*, Sup*(*G*) = *F*, *Sup*(*H*) = *F*, *Pol*(*G*) = −*Pol*(*F*)*, Pol*(*H*) = *Pol*(*F*)

*F* = ¬*G*: *Sub*(*F*) = {*G*}*, Sup*(*G*) = *F*, *Pol*(*G*) = −*Pol*(*F*)

*F* = ∃*αG or F* = ∀*αG* (*α is a variable*): *Sub*(*F*) = {*G*}*, Sup*(*G*) = *F*, *Pol*(*G*) = *Pol*(*F*)

*Sup*(*F*) = ∅ ⇒ *Lev*(*F*) = 0, *Pol*(*F*) = 1*,*

*Sup*(*F*) ≠ ∅ ⇒ *Lev*(*F*) = *Lev*(*Sup*(*F*)) + 1

*For mappings Sub and Sup, reflexive and transitive closures Sub*∗ *and Sup*∗ *are defined recursively as follows:*

*1. Sub*∗(*F*) ⊇ {*F*}*, Sup*∗(*F*) ⊇ {*F*}

*2. Sub*∗(*F*) ⊇ {*H* | *G* ∈ *Sub*∗(*F*) ∧ *H* ∈ *Sub*(*G*)}*, Sup*∗(*F*) ⊇ {*H* | *G* ∈ *Sup*∗(*F*) ∧ *H* ∈ *Sup*(*G*)}

Example: *A* → *B* - *Pol*(*A*) = −1, *Pol*(*B*) = 1, *Lev*(*A*) = 1

These structural mappings provide a framework for the assignment of quantifiers to variable occurrences, which is needed for the correct simulation of skolemization (the information about a variable's quantification in the prenex form). The subformula and superformula mappings and their closures encapsulate the essential hierarchical information of a formula's structure. Level gives the ordering with respect to the scope of variables (which is also essential for the simulation of skolemization - unification is restricted for existential variables). Polarity enables us to decide the global meaning of a variable (e.g. globally, an existential variable is universal if its quantification subformula has negative polarity). Sound unification requires further definitions concerning variable quantification. We will introduce the notions of the corresponding quantifier for a variable occurrence, the substitution mapping and the significance mapping (we have to distinguish between the original variables occurring in special axioms and those newly introduced in the proof sequence).
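Assuming a tuple encoding of formulas (my own illustrative convention, not the chapter's), the Lev mapping can be computed by a simple walk:

```python
# Sketch of the Lev mapping from Definition 1: the whole formula has
# level 0, each immediate subformula is one level deeper.  Atoms are
# strings; quantifiers are encoded as ('forall'/'exists', var, body).

def lev_map(f, lev=0, acc=None, path=()):
    """Record Lev for every subformula occurrence, keyed by index path."""
    if acc is None:
        acc = {}
    acc[path] = lev
    if isinstance(f, tuple):
        start = 2 if f[0] in ('forall', 'exists') else 1  # skip bound var
        for i in range(start, len(f)):
            lev_map(f[i], lev + 1, acc, path + (i,))
    return acc

# A -> B : Lev(A) = 1, matching the example after Definition 1
lv = lev_map(('imp', 'A', 'B'))
print(lv[()], lv[(1,)], lv[(2,)])   # -> 0 1 1
```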



### **Definition 2.** *Variable assignment, substitution and significance*

*Let F be a formula of FOL, G* = *p*(*t*1, ..., *tn*) ∈ *Sub*∗(*F*) *be an atom in F and α a variable occurring in ti. The variable mappings Qnt (quantifier assignment), Sbt (variable substitution) and Sig (significance) are defined as follows:*

*Qnt*(*α*) = *QαH, where Q* = ∃ ∨ *Q* = ∀*, H*, *I* ∈ *Sub*∗(*F*)*, QαH* ∈ *Sup*∗(*G*)*, and* ∀*QαI* ∈ *Sup*∗(*G*) ⇒ *Lev*(*QαI*) < *Lev*(*QαH*)*.*

*F*[*α*/*t*′] *is a substitution of term t*′ *into α in F* ⇒ *Sbt*(*α*) = *t*′*.*

*A variable α occurring in F* ∈ *LAx* ∪ *SAx is significant w.r.t. existential substitution: Sig*(*α*) = 1 *iff the variable is significant, Sig*(*α*) = 0 *otherwise.*

Example: ∀*x*(∀*xA*(*x*) → *B*(*x*)) - *Qnt*(*x*) = ∀*xA*(*x*), for *x* in *A*(*x*) and *Qnt*(*x*) = ∀*x*(∀*xA*(*x*) → *B*(*x*)), for *x* in *B*(*x*).

Note that with the Qnt mapping (assignment of the first name-matching quantifier in the formula hierarchy, from the bottom) we are able to distinguish between variables of the same name, and there is no need to rename any variable. The Sbt mapping holds the substituted term in a quantifier, so there is no need to rewrite all occurrences of a variable when working with this mapping within unification. It is also clear that if *Qnt*(*α*) = ∅ then *α* is a free variable; such variables can be simply avoided by introducing new universal quantifiers to F. The significance mapping is important for differentiating between the original universal variables of a formula and those newly introduced during proof search (an existential variable can't be bound to the latter).
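The bottom-up quantifier assignment can be sketched as a walk down the occurrence path, keeping the last matching quantifier seen (tuple encoding and index paths are my own convention):

```python
# Sketch of the Qnt mapping: the binding quantifier of a variable
# occurrence is the nearest enclosing quantifier of that name (the one
# with the highest Lev), so same-named variables need no renaming.

def qnt(f, var, path):
    """Walk from the root down `path`; the last quantifier over `var`
    passed on the way down is Qnt for the occurrence at `path`."""
    binder = None
    node = f
    for i in path:
        if isinstance(node, tuple) and node[0] in ('forall', 'exists') \
                and node[1] == var:
            binder = node
        node = node[i]
    return binder

# forall x (forall x A(x) -> B(x)), as in the example above:
F = ('forall', 'x', ('imp', ('forall', 'x', ('A', 'x')), ('B', 'x')))
print(qnt(F, 'x', (2, 1, 2, 1)))    # x in A(x) -> the inner forall
print(qnt(F, 'x', (2, 2, 1)) is F)  # x in B(x) -> the outer forall: True
```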

Before we can introduce the standard unification algorithm, we should formulate the notion of global universal and global existential variable (it simulates conversion into prenex normal form).

#### **Definition 3.** *Global quantification*

*Let F be a formula without free variables and α be a variable occurrence in a term of F.*

*1. α is a global universal variable* (*α* ∈ *Var*∀(*F*)) *iff* (*Qnt*(*α*) = ∀*αH* ∧ *Pol*(*Qnt*(*α*)) = 1) *or* (*Qnt*(*α*) = ∃*αH* ∧ *Pol*(*Qnt*(*α*)) = −1)

*2. α is a global existential variable* (*α* ∈ *Var*∃(*F*)) *iff* (*Qnt*(*α*) = ∃*αH* ∧ *Pol*(*Qnt*(*α*)) = 1) *or* (*Qnt*(*α*) = ∀*αH* ∧ *Pol*(*Qnt*(*α*)) = −1)

*Var*∀(*F*) *and Var*∃(*F*) *are the sets of global universal and existential variables.*

Example: *F* = ∀*y*(∀*xA*(*x*) → *B*(*y*)) - *x* is a global existential variable, *y* is a global universal variable.

It is clear, w.r.t. the skolemization technique, that an existential variable can be substituted into a universal one only if all global universal variables over the scope of the existential one have already been substituted by a term; Skolem functors work in the same way. Now we can define the most general unification algorithm, based on recursive conditions (extended unification, in contrast to standard MGU).
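Definition 3 reduces to a parity check on the binding quantifier and the polarity of its occurrence; a minimal sketch (function name and encoding are my own):

```python
# A variable is globally universal iff its quantifier is forall with
# positive polarity, or exists with negative polarity; dually for
# globally existential (Definition 3).

def global_kind(quantifier, polarity):
    """quantifier: 'forall'/'exists'; polarity: +1/-1."""
    if (quantifier == 'forall') == (polarity == 1):
        return 'universal'
    return 'existential'

# F = forall y (forall x A(x) -> B(y)), as in the example above:
# the inner forall x sits left of the implication (negative polarity),
# so x is globally existential, while y is globally universal.
print(global_kind('forall', -1))  # existential
print(global_kind('forall', 1))   # universal
```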

**Definition 5.** *General resolution for first-order logic* (*GRFOL*)

<sup>1</sup>,..., *G*�

⇒ *Sig*(*α*) = 1 *in F or F*� *iff Sig*(*α*) = 1 *in Fσ*[*G*/⊥] ∨ *F*�

*A* = {*G*1,..., *Gk*, *G*�

*identical.*

*resolvent of the premises on G.*

refutational theorem prover for FOL.

**Example 3.** *Variable Unification Restriction*

*F*<sup>0</sup> : ∀*X*∃*Yp*(*X*,*Y*)*. F*1(¬*query*) : ¬∃*Y*∀*X p*(*X*,*Y*)*.*

*simply unifiable since the variables are the same. Non-trivial cases:*[*F*1&*F*0] *: no resolution is possible.*

*F0 :* ∃*Y*∀*X p*(*X*,*Y*)*. F1 (*¬*query) :* ¬∀*X*∃*Yp*(*X*,*Y*) *In this case we can simply derive a refutation:*

*R*[*F*1&*F*0] *:* ⊥ ∨ ¬�(*ref utation*)

[*F*0&*F*1] *: no resolution is possible (the same reason as above). No refutation could be derived from F*<sup>0</sup> *and F*<sup>1</sup> *due to VUR.*

*Further we would like to prove* ∃*Y*∀*X p*(*X*,*Y*) � ∀*X*∃*Yp*(*X*,*Y*)*.*

*variable over the scope of X in F*1*; Sbt*(*X*) = *X and Sbt*(*Y*) = *Y.*

of conjunction, disjunction etc. to be bound with Łukasiewicz operators.

**3. Fuzzy predicate logic and refutational proof**

*F*[*G*1, , ..., *Gk*] *F*�

*negative premise, G represents an occurrence of an atom. The expression Fσ*[*G*/⊥] ∨ *F*�

*where σ* = *MGU*(*A*) *is the most general unifier (MGU) of the set of the atoms*

[*G*� <sup>1</sup>, ..., *G*� *n*]

*<sup>n</sup>*} *, G* = *G*1*σ. For every variable α in F or F*�

Resolution Principle and Fuzzy Logic 61

Note that with Qnt mapping we are able to distinguish variables not only by its name (which may not be unique) but also with this mapping (it is unique). Sig property enables to separate variables, which were not originally in the scope of an existential variable. When utilizing the rule it should be set the Sig mapping for every variable in axioms and negated goal to one. We present a very simple example of existential variable unification before we introduce the

*We would try to prove if* ∀*X*∃*Yp*(*X*,*Y*) � ∃*Y*∀*X p*(*X*,*Y*)*? We will use refutational proving and therefore we will construct a special axiom from the first formula and negation of the second formula:*

*There are 2 trivial and 2 non-trivial combinations how to resolve F*<sup>0</sup> *and F*<sup>1</sup> *(combinations with the same formula as the positive and the negative premise could not lead to refutation since they are consistent): Trivial cases: R*[*F*1&*F*1] : ⊥∨� *and R*[*F*0&*F*0] : ⊥∨�*. Both of them lead to* � *and the atoms are*

*<sup>Y</sup>* ∈ *Var*∀(*F*1) *and Y* ∈ *Var*∃(*F*0) *can't unify since VUR for* (*Y*,*Y*) *does not hold - there is a variable <sup>X</sup>* ∈ *Sup*∗(*Qnt*(*Y*))*(over the scope), X* ∈ *Var*∀(*F*0), *Sbt*(*X*) = <sup>∅</sup>*); the case with variable X is*

*<sup>X</sup>* ∈ *Var*∀(*F*0) *and X* ∈ *Var*∃(*F*1) *can unify since VUR for* (*X*, *<sup>X</sup>*) *holds - there is no global universal*

The fuzzy predicate logic with evaluated syntax is a flexible and fully complete formalism, which will be used for the below presented extension (Novák, V.). In order to use an efficient form of the resolution principle we have to extend the standard notion of a proof (provability value and degree) with the notion of refutational proof (refutation degree). Propositonal version of the fuzzy resolution principle has been already presented in (Habiballa, H.). We suppose that set of truth values is Łukasiewicz algebra. Therefore we assume standard notions

*<sup>F</sup>σ*[*G*/⊥] <sup>∨</sup> *<sup>F</sup>*�*σ*[*G*/�] (3)

*,* (*Sbt*(*γ*) = *α*) ∩ *σ* = ∅

*σ*[*G*/�] *is the*

*σ*[*G*/�]*. F is called positive and F' is called*

### **Definition 4.** *Most general unifier algorithm*

*Let G* = *p*(*t*1, ..., *tn*) *and G*′ = *r*(*u*1, ..., *un*) *be atoms. The most general unifier (substitution mapping) MGU(G, G') = σ is obtained by the following atom and term unification steps, or the algorithm returns a fail-state for unification. For the purposes of the algorithm we define the Variable Unification Restriction (VUR).*

### *Variable Unification Restriction*

*Let F*<sup>1</sup> *be a formula and α be a variable occurring in F*1*, F*<sup>2</sup> *be a formula, t be a term occurring in F*<sup>2</sup> *and β be a variable occurring in F*2*. The Variable Unification Restriction (VUR) for (α,t) holds if one of the conditions 1. and 2. holds:*

*1. α is a global universal variable and t* ≠ *β, where β is a global existential variable and α not occurring in t (non-existential substitution).*
*2. α is a global universal variable and t* = *β, where β is a global existential variable and* ∀*F* ∈ *Sup*∗(*Qnt*(*β*))*, F* = *QγG, Q* ∈ {∀, ∃}*, γ is a global universal variable, Sig*(*γ*) = 1 ⇒ (*Sbt*(*γ*) = *r*′) ∈ *σ, r*′ *is a term (existential substitution).*

#### *Atom unification*

*1. if n* = 0 *and p* = *r then σ* = ∅ *and the unifier exists (success-state).*
*2. if n* > 0 *and p* = *r then perform term unification for the pairs* (*t*1, *u*1),...,(*tn*, *un*)*; if for every pair a unifier exists then MGU*(*G*, *G*′) = *σ obtained during term unification (success-state).*
*3. In any other case the unifier does not exist (fail-state).*

#### *Term unification* (*t*′, *u*′)

*1. if u*′ = *α, t*′ = *β are variables and Qnt*(*α*) = *Qnt*(*β*) *then a unifier exists for* (*t*′, *u*′) *(occurrence of the same variable).*
*2. if t*′ = *α is a variable and* (*Sbt*(*α*) = *v*′) ∈ *σ then perform term unification for* (*v*′, *u*′)*; the unifier for* (*t*′, *u*′) *exists iff it exists for* (*v*′, *u*′) *(success-state for an already substituted variable).*
*3. if u*′ = *α is a variable and* (*Sbt*(*α*) = *v*′) ∈ *σ then perform term unification for* (*t*′, *v*′)*; the unifier for* (*t*′, *u*′) *exists iff it exists for* (*t*′, *v*′) *(success-state for an already substituted variable).*
*4. if t*′ = *a, u*′ = *b are individual constants and a* = *b then for (t',u') a unifier exists (success-state).*
*5. if t*′ = *f*(*t*′1, ..., *t*′*m*)*, u*′ = *g*(*u*′1, ..., *u*′*n*) *are function symbols with arguments and f* = *g then perform term unification for* (*t*′1, *u*′1)*, ...,* (*t*′*n*, *u*′*n*)*; the unifier for* (*t*′, *u*′) *exists iff a unifier exists for every pair* (*t*′*i*, *u*′*i*) *(success-state).*
*6. if t*′ = *α is a variable and VUR for* (*t*′, *u*′) *holds then a unifier exists for* (*t*′, *u*′) *and σ* = *σ* ∪ (*Sbt*(*α*) = *u*′) *(success-state).*
*7. if u*′ = *α is a variable and VUR for* (*u*′, *t*′) *holds then a unifier exists for* (*t*′, *u*′) *and σ* = *σ* ∪ (*Sbt*(*α*) = *t*′) *(success-state).*
*8. In any other case the unifier does not exist (fail-state).*

*MGU*(*A*) = *σ for a set of atoms A* = {*G*1,..., *Gk*} *is computed by the atom unification for* (*G*1, *Gi*), *σ<sup>i</sup>* = *MGU*(*G*1, *Gi*), ∀*i*, *σ*<sup>0</sup> = ∅*, where before every atom unification* (*G*1, *Gi*)*, σ is set to σi*−1*.*
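The term-unification steps above can be sketched in code. The following is a minimal illustration for ordinary first-order terms only: the quantifier bookkeeping (*Qnt*, *Sig*) and the Variable Unification Restriction are omitted, and a plain occurs-check stands in for VUR. The term representation (strings for variables, `(symbol, args)` tuples for compound terms) is an assumption of this sketch, not the chapter's data structure.

```python
# Simplified first-order unification, following the atom/term unification
# steps of Definition 4 (without VUR; an occurs-check is used instead).
# Variables are strings starting with an uppercase letter; compound terms
# and atoms are (symbol, [args]) tuples; constants are 0-ary tuples.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, sbt):
    # follow an already substituted variable (steps 2 and 3)
    while is_var(t) and t in sbt:
        t = sbt[t]
    return t

def occurs(v, t, sbt):
    t = walk(t, sbt)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, u, sbt) for u in t[1])
    return False

def unify(t, u, sbt):
    t, u = walk(t, sbt), walk(u, sbt)
    if t == u:                        # step 1 / step 4: same variable or constant
        return sbt
    if is_var(t):                     # step 6: bind a variable (occurs-check
        if occurs(t, u, sbt):         # replaces VUR in this simplified sketch)
            return None
        return {**sbt, t: u}
    if is_var(u):                     # step 7, symmetric
        return unify(u, t, sbt)
    if isinstance(t, tuple) and isinstance(u, tuple):
        (f, ts), (g, us) = t, u
        if f != g or len(ts) != len(us):
            return None               # fail-state
        for a, b in zip(ts, us):      # step 5: unify arguments pairwise
            sbt = unify(a, b, sbt)
            if sbt is None:
                return None
        return sbt
    return None                       # step 8: fail-state

# p(X, f(Y)) and p(a, f(b)) unify with Sbt(X) = a, Sbt(Y) = b
mgu = unify(('p', ['X', ('f', ['Y'])]),
            ('p', [('a', []), ('f', [('b', [])])]), {})
assert mgu == {'X': ('a', []), 'Y': ('b', [])}
```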

With the notions defined above it is simple to state the general resolution rule for FOL (without the equivalence connective). It conforms to the definition from (Bachmair, L.).


**Definition 5.** *General resolution for first-order logic* (*GRFOL*)

$$\frac{F[G\_1, \dots, G\_k] \qquad F'[G'\_1, \dots, G'\_n]}{F\sigma[G/\perp] \lor F'\sigma[G/\top]}\tag{3}$$

*where σ* = *MGU*(*A*) *is the most general unifier (MGU) of the set of atoms A* = {*G*1,..., *Gk*, *G*′1,..., *G*′*n*}*, G* = *G*1*σ. For every variable α in F or F*′*,* (*Sbt*(*γ*) = *α*) ∩ *σ* = ∅ ⇒ *Sig*(*α*) = 1 *in F or F*′ *iff Sig*(*α*) = 1 *in Fσ*[*G*/⊥] ∨ *F*′*σ*[*G*/⊤]*. F is called the positive and F*′ *the negative premise; G represents an occurrence of an atom. The expression Fσ*[*G*/⊥] ∨ *F*′*σ*[*G*/⊤] *is the resolvent of the premises on G.*

Note that with the Qnt mapping we are able to distinguish variables not only by their names (which may not be unique) but also by this mapping (which is unique). The Sig property makes it possible to separate variables that were not originally in the scope of an existential variable. When applying the rule, the Sig mapping should be set to one for every variable in the axioms and the negated goal. We present a very simple example of existential variable unification before we introduce the refutational theorem prover for FOL.

### **Example 3.** *Variable Unification Restriction*

*We try to prove whether* ∀*X*∃*Yp*(*X*,*Y*) ⊢ ∃*Y*∀*X p*(*X*,*Y*)*. We will use refutational proving and therefore we construct a special axiom from the first formula and the negation of the second formula: F*<sup>0</sup> : ∀*X*∃*Yp*(*X*,*Y*)*. F*1(¬*query*) : ¬∃*Y*∀*X p*(*X*,*Y*)*.*

*There are 2 trivial and 2 non-trivial combinations of how to resolve F*<sup>0</sup> *and F*<sup>1</sup> *(combinations with the same formula as the positive and the negative premise cannot lead to refutation since they are consistent). Trivial cases: R*[*F*1&*F*1] : ⊥∨⊤ *and R*[*F*0&*F*0] : ⊥∨⊤*. Both of them lead to* ⊤ *and the atoms are simply unifiable since the variables are the same.*

*Non-trivial cases:*[*F*1&*F*0] *: no resolution is possible.*

*<sup>Y</sup>* ∈ *Var*∀(*F*1) *and Y* ∈ *Var*∃(*F*0) *cannot unify since VUR for* (*Y*,*Y*) *does not hold - there is a variable <sup>X</sup>* ∈ *Sup*∗(*Qnt*(*Y*)) *(over the scope), X* ∈ *Var*∀(*F*0), *Sbt*(*X*) = <sup>∅</sup>*; the case with variable X is identical.*

[*F*0&*F*1] *: no resolution is possible (the same reason as above). No refutation could be derived from F*<sup>0</sup> *and F*<sup>1</sup> *due to VUR.*

*Further we would like to prove* ∃*Y*∀*X p*(*X*,*Y*) ⊢ ∀*X*∃*Yp*(*X*,*Y*)*. F0 :* ∃*Y*∀*X p*(*X*,*Y*)*. F1 (*¬*query) :* ¬∀*X*∃*Yp*(*X*,*Y*)*. In this case we can simply derive a refutation: R*[*F*1&*F*0] *:* ⊥ ∨ ¬⊤ (*refutation*)

*<sup>X</sup>* ∈ *Var*∀(*F*0) *and X* ∈ *Var*∃(*F*1) *can unify since VUR for* (*X*, *<sup>X</sup>*) *holds - there is no global universal variable over the scope of X in F*1*; Sbt*(*X*) = *X and Sbt*(*Y*) = *Y.*

### **3. Fuzzy predicate logic and refutational proof**

The fuzzy predicate logic with evaluated syntax is a flexible and fully complete formalism, which will be used for the extension presented below (Novák, V.). In order to use an efficient form of the resolution principle we have to extend the standard notion of a proof (provability value and degree) with the notion of a refutational proof (refutation degree). A propositional version of the fuzzy resolution principle has already been presented in (Habiballa, H.). We suppose that the set of truth values forms a Łukasiewicz algebra; therefore we assume the standard notions of conjunction, disjunction, etc. to be bound to the Łukasiewicz operators.


We will assume the Łukasiewicz algebra to be

$$\mathcal{L}\_{\text{Ł}} = \langle [0,1], \wedge, \vee, \otimes, \rightarrow, 0, 1 \rangle$$

where [0, 1] is the interval of reals between 0 and 1, which are the smallest and greatest elements respectively. Basic and additional operations are defined as follows:

*a* ⊗ *b* = 0 ∨ (*a* + *b* − 1) *a* → *b* = 1 ∧ (1 − *a* + *b*) *a* ⊕ *b* = 1 ∧ (*a* + *b*) ¬*a* = 1 − *a*

The biresiduation operation ↔ can be defined as *a* ↔ *b* =*d f* (*a* → *b*) ∧ (*b* → *a*), where ∧ is the infimum operation. The following properties of L<sup>Ł</sup> will be used in the sequel: *a* ⊗ 1 = *a*, *a* ⊗ 0 = 0, *a* ⊕ 1 = 1, *a* ⊕ 0 = *a*, *a* → 1 = 1, *a* → 0 = ¬*a*, 1 → *a* = *a*, 0 → *a* = 1
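The Łukasiewicz operations and the listed identities are easy to check mechanically. A minimal sketch (function names are illustrative, not from the chapter's prover); the grid uses dyadic values so the identities hold exactly in floating point:

```python
# Łukasiewicz operations on [0, 1], as defined above.

def conj(a, b):      # strong conjunction  a ⊗ b = 0 ∨ (a + b − 1)
    return max(0.0, a + b - 1.0)

def disj(a, b):      # strong disjunction  a ⊕ b = 1 ∧ (a + b)
    return min(1.0, a + b)

def impl(a, b):      # residuum  a → b = 1 ∧ (1 − a + b)
    return min(1.0, 1.0 - a + b)

def neg(a):          # negation  ¬a = 1 − a
    return 1.0 - a

def equiv(a, b):     # biresiduation  a ↔ b = (a → b) ∧ (b → a)
    return min(impl(a, b), impl(b, a))

# The properties listed in the text, checked on an exactly representable grid:
for a in [0.0, 0.25, 0.5, 0.75, 1.0]:
    assert conj(a, 1.0) == a and conj(a, 0.0) == 0.0
    assert disj(a, 1.0) == 1.0 and disj(a, 0.0) == a
    assert impl(a, 1.0) == 1.0 and impl(a, 0.0) == neg(a)
    assert impl(1.0, a) == a and impl(0.0, a) == 1.0
```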

The syntax and semantics of fuzzy predicate logic are as follows:

• terms *t*1, ..., *tn* are defined as in FOL
• predicates *p*1, ..., *pm* are syntactically equivalent to FOL ones. Instead of 0 we write ⊥ and instead of 1 we write ⊤; connectives: & (Łukasiewicz conjunction), ∇ (Łukasiewicz disjunction), ⇒ (implication), ¬ (negation), ∀*X* (universal quantifier), ∃*X* (existential quantifier); furthermore, by *FJ* we denote the set of all formulas of fuzzy logic in the language *J*
• FPL formulas have the following semantic interpretations (D is the universe): interpretation of terms is equivalent to FOL, D(*pi*(*ti*<sup>1</sup>, ..., *tin*)) = *Pi*(D(*ti*<sup>1</sup>), ..., D(*tin*)) where *Pi* is a fuzzy relation assigned to *pi*, D(a) = *a* for *a* ∈ [0, 1], D(*A* & *B*) = D(*A*) ⊗ D(*B*), D(*A*∇*B*) = D(*A*) ⊕ D(*B*), D(*A*⇒*B*) = D(*A*) → D(*B*), D(¬*A*) = ¬D(*A*), D(∀*X*(*A*)) = ∧{D(*A*[*x*/*d*]) | *d* ∈ *D*}, D(∃*X*(*A*)) = ∨{D(*A*[*x*/*d*]) | *d* ∈ *D*}
• for every subformula the above defined *Sub*, *Sup*, *Pol*, *Lev*, *Qnt*, *Sbt*, *Sig* and other derived properties defined for classical logic hold (where the classical FOL connective is replaced by the Łukasiewicz one with the same mapping value)


Graded fuzzy predicate calculus assigns to every axiom a grade in which the formula is valid. This is written as *a*/*A*, where *A* is a formula and *a* is a syntactic evaluation. We use several standard notions defined in (Novák, V.), namely: inference rule, formal fuzzy theory with sets of logical and special axioms, and evaluated formal proof.

#### **Definition 6.** *Inference rule*

*An n-ary inference rule r in the graded logical system is a scheme*

$$r: \frac{a\_1/A\_1, \dots, a\_n/A\_n}{r^{evl}(a\_1, \dots, a\_n)\,/\,r^{syn}(A\_1, \dots, A\_n)} \tag{4}$$

*by which the evaluated formulas a*1/*A*1, ..., *an*/*An are assigned the evaluated formula revl*(*a*1, ..., *an*)/*rsyn*(*A*1, ..., *An*)*. The syntactic operation rsyn is a partial n-ary operation on FJ and the evaluation operation revl is an n-ary lower semicontinuous operation on L (i.e. it preserves arbitrary suprema in all variables).*

**Definition 7.** *Formal fuzzy theory*

*A formal fuzzy theory T in the language J is a triple*

$$T = \langle \text{LAx}, \text{SAx}, R \rangle$$

*where* LAx ⊂ <sup>∼</sup> *FJ is a fuzzy set of logical axioms,* SAx <sup>⊂</sup> <sup>∼</sup> *FJ is a fuzzy set of special axioms, and R is a set of sound inference rules.*


### **Definition 8.** *Evaluated proof, refutational proof and refutation degree*

*An evaluated formal proof of a formula A from the fuzzy set X* ⊂ <sup>∼</sup> *FJ is a finite sequence of evaluated formulas w* := *a*0/*A*0, *a*1/*A*1, ..., *an*/*An such that An* := *A and for each i* ≤ *n, either there exists an m-ary inference rule r such that ai*/*Ai* :<sup>=</sup> *<sup>r</sup>evl*(*ai*<sup>1</sup>, ..., *aim*)/*rsyn*(*Ai*<sup>1</sup>, ..., *Aim*), *i*1, ..., *im* < *n, or ai*/*Ai* := *X*(*Ai*)/*Ai. We will denote the value of the evaluated proof by Val*(*w*) = *an. An evaluated refutational formal proof of a formula A from X is w, where additionally a*<sup>0</sup>/*A*<sup>0</sup> := 1/¬*A and An* := ⊥*. Val*(*w*) = *an is called the refutation degree of A.*

### **Definition 9.** *Provability and truth*

*Let T be a fuzzy theory and A* ∈ *FJ a formula. We write T* ⊢*<sup>a</sup> A and say that the formula A is a theorem in the degree a, or provable in the degree a, in the fuzzy theory T.*

$$T \vdash\_a A \text{ iff } a = \bigvee \{Val(w) \mid w \text{ is a proof of } A \text{ from LAx} \cup \text{SAx} \}\tag{5}$$

*We write T* |=*<sup>a</sup> A and say that the formula A is true in the degree a in the fuzzy theory T.*

$$\mathcal{D} \models T \text{ if } \forall A \in \text{LAx} : \text{LAx}(A) \leq \mathcal{D}(A),\ \forall A \in \text{SAx} : \text{SAx}(A) \leq \mathcal{D}(A) \tag{6}$$

$$T \models\_a A \text{ iff } a = \bigwedge \{ \mathcal{D}(A) \mid \mathcal{D} \models T \} \tag{7}$$

The fuzzy modus ponens rule could be formulated:

**Definition 10.** *Fuzzy modus ponens*

$$r\_{MP}:\ \frac{a/A,\ b/(A \Rightarrow B)}{a \otimes b\,/\,B} \tag{8}$$

*where from premise A holding in the degree a and premise A*⇒*B holding in the degree b we infer B holding in the degree a* ⊗ *b.*
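The degree propagation of *rMP* is a one-line computation. A small illustrative sketch (the numeric degrees are made up for the example):

```python
# Degree propagation under fuzzy modus ponens r_MP (Definition 10):
# from a/A and b/(A ⇒ B) we infer B in the degree a ⊗ b.

def luk_conj(a, b):
    # Łukasiewicz strong conjunction  a ⊗ b = 0 ∨ (a + b − 1)
    return max(0.0, a + b - 1.0)

a = 0.9                      # degree of the premise A
b = 0.8                      # degree of the premise A ⇒ B
degree_B = luk_conj(a, b)    # degree in which B is inferred
assert abs(degree_B - 0.7) < 1e-9
```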

In classical logic *rMP* can be viewed as a special case of resolution. The fuzzy resolution rule presented below is also able to simulate the fuzzy *rMP*. From this fact the completeness of a system based on resolution can be deduced; it then only remains to prove soundness. It is possible to introduce the following notion of resolution with respect to modus ponens.

### **Definition 11.** *General resolution for fuzzy predicate logic* (*GRFPL*)

$$r\_{GR}:\ \frac{a/F[G\_1, \dots, G\_k],\ b/F'[G'\_1, \dots, G'\_n]}{a \otimes b\,/\,F\sigma[G/\perp]\,\nabla\,F'\sigma[G/\top]} \tag{9}$$

*where σ* = *MGU*(*A*) *is the most general unifier (MGU) of the set of atoms A* = {*G*1,..., *Gk*, *G*′1,..., *G*′*n*}*, G* = *G*1*σ. For every variable α in F or F*′*,* (*Sbt*(*γ*) = *α*) ∩ *σ* = ∅ ⇒ *Sig*(*α*) = 1 *in F or F*′ *iff Sig*(*α*) = 1 *in Fσ*[*G*/⊥]∇*F*′*σ*[*G*/⊤]*. F is called the positive and F*′ *the negative premise; G represents an occurrence of an atom. The expression Fσ*[*G*/⊥]∇*F*′*σ*[*G*/⊤] *is the resolvent of the premises on G.*


### **Lemma 2.** *Soundness of rGR*

*The inference rule rGR for FPL based on* L*Ł is sound i.e. for every truth valuation* D*,*

$$\mathcal{D}(r^{syn}(A\_1, \dots, A\_n)) \geq r^{evl}(\mathcal{D}(A\_1), \dots, \mathcal{D}(A\_n)) \tag{10}$$

*holds true.*

*Proof.* Before we solve the core of *GRFPL* we should prove that the unification algorithm preserves soundness. This can be simply proved, since in the classical FPL with the rule of Modus-Ponens (Novák, V.) from the axiom ⊢ (∀*x*)*A*⇒*A*[*x*/*t*] and ⊢ (∀*x*)*A* we can prove *A*[*x*/*t*]. For *rGR* we may rewrite the values of the left and right parts of equation (10):

$$\mathcal{D}(r^{syn}(A\_1, \dots, A\_n)) = \mathcal{D}[\mathcal{D}(F\_1[G/\perp])\nabla \mathcal{D}(F\_2[G/\top])]$$

$$r^{evl}(\mathcal{D}(A\_1), \dots, \mathcal{D}(A\_n)) = \mathcal{D}(F\_1[G]) \otimes \mathcal{D}(F\_2[G])$$

It is sufficient to prove the equality for ⇒ since all other connectives could be defined by it. By induction on the complexity of formula |*A*|, defined as the number of occurrences of connectives, we can prove:

Let premises *F*<sup>1</sup> and *F*<sup>2</sup> be atomic formulas. Since they must contain the same subformula then *F*<sup>1</sup> = *F*<sup>2</sup> = *G* and it holds

$$\mathcal{D}[\mathcal{D}(F\_1[G/\perp])\nabla \mathcal{D}(F\_2[G/\top])] = \mathcal{D}(\perp \nabla \top) = 0 \oplus 1 = 1 \geq \mathcal{D}(F\_1[G]) \otimes \mathcal{D}(F\_2[G])$$

Induction step: Let premises *F*<sup>1</sup> and *F*<sup>2</sup> be complex formulas, let *A* and *B* be subformulas of *F*1, *C* and *D* be subformulas of *F*<sup>2</sup>, and *G* be an atom, where generally *F*<sup>1</sup> = (*A*⇒*B*) and *F*<sup>2</sup> = (*C*⇒*D*). The complexity |*F*1| = |*A*| + 1 or |*F*1| = |*B*| + 1, and |*F*2| = |*C*| + 1 or |*F*2| = |*D*| + 1. Since they must contain the same subformula and the induction presupposition holds for *A*, *B*, *C*, *D*, it remains to analyze the following cases:

1. *F*<sup>1</sup> = *A*⇒*G*, *F*<sup>2</sup> = *G*⇒*D* : D[D(*F*1[*G*/⊥])∇D(*F*2[*G*/⊤])] = D([*A*⇒⊥]∇[⊤⇒*D*]) = D(¬*A*∇*D*) = 1 ∧ (1 − *a* + *d*)

We have rewritten the expression into Łukasiewicz interpretation. Now we will try to rewrite the right side of the inequality, which has to be proven.

D(*F*1[*G*]) ⊗ D(*F*2[*G*]) = D(*A*⇒*G*) ⊗ D(*G*⇒*D*) = 0 ∨ ((1 ∧ (1 − *a* + *g*)) + (1 ∧ (1 − *g* + *d*)) − 1) ≤ 1 ∧ (1 − *a* + *d*). The left side of equation (10) therefore dominates the right side, and so

$$\mathcal{D}[\mathcal{D}(F\_1[G/\perp])\nabla \mathcal{D}(F\_2[G/\top])] \ge \mathcal{D}(F\_1[G]) \otimes \mathcal{D}(F\_2[G])$$

for this case holds.

2. *F*<sup>1</sup> = *A*⇒*G*, *F*<sup>2</sup> = *C*⇒*G* : D[D(*F*1[*G*/⊥])∇D(*F*2[*G*/⊤])] = D([*A*⇒⊥]∇[*C*⇒⊤]) = 1 ≥ D(*F*1[*G*]) ⊗ D(*F*2[*G*])
3. *F*<sup>1</sup> = *G*⇒*B*, *F*<sup>2</sup> = *G*⇒*D* : D[D(*F*1[*G*/⊥])∇D(*F*2[*G*/⊤])] = D([⊥⇒*B*]∇[⊤⇒*D*]) = 1 ≥ D(*F*1[*G*]) ⊗ D(*F*2[*G*])
4. *F*<sup>1</sup> = *G*⇒*B*, *F*<sup>2</sup> = *C*⇒*G* : D[D(*F*1[*G*/⊥])∇D(*F*2[*G*/⊤])] = D([⊥⇒*B*]∇[*C*⇒⊤]) = 1 ≥ D(*F*1[*G*]) ⊗ D(*F*2[*G*])

By induction we have proven that the inequality holds and the *rR* is sound. The induction of the case where only one of the premises has greater complexity is included in the above solved induction step.
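The key inequality of case 1 can also be spot-checked numerically: with *F*1 = *A*⇒*G* and *F*2 = *G*⇒*D*, the resolvent value 1 ∧ (1 − *a* + *d*) must dominate D(*A*⇒*G*) ⊗ D(*G*⇒*D*). A small illustrative check over a grid of the unit interval (purely a sanity check, not part of the proof):

```python
# Numeric spot-check of the soundness inequality (10), case 1 of the
# induction step, in the Łukasiewicz algebra.

def impl(a, b):
    return min(1.0, 1.0 - a + b)        # a → b

def conj(a, b):
    return max(0.0, a + b - 1.0)        # a ⊗ b

grid = [i / 10 for i in range(11)]
for a in grid:
    for g in grid:
        for d in grid:
            resolvent = impl(a, d)                      # D(¬A ∇ D) = a → d
            premises = conj(impl(a, g), impl(g, d))     # D(A⇒G) ⊗ D(G⇒D)
            assert resolvent >= premises - 1e-12
```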


### **Definition 12.** *Refutational resolution theorem prover for FPL*

*Refutational non-clausal resolution theorem prover for FPL* (*RRTPFPL*) *is the inference system with the inference rule GRFPL and sound simplification rules for* ⊥ *and* ⊤ *(standard equivalences for logical constants). A refutational proof by Definition 8 represents a proof of a formula G (goal) from the set of special axioms N. It is assumed that Sig*(*α*) = 1 *for every α in every formula F* ∈ *N* ∪ {¬*G*}*; every formula in a proof has no free variable and has no quantifier for a variable not occurring in the formula.*

**Definition 13.** *Simplification rules for* ∇, ⇒

$$r\_{s\nabla} \colon \frac{a/\perp \nabla A}{a/A} \quad \text{and} \quad r\_{s\Longrightarrow} \colon \frac{a/\top \Rightarrow A}{a/A}$$
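The two simplification rules rewrite a formula without changing its evaluation degree. A minimal sketch on a toy formula representation (nested tuples; this representation is an assumption of the sketch, not the prover's actual data structure):

```python
# Sound simplification rules for logical constants (Definition 13):
# ⊥ ∇ A simplifies to A, and ⊤ ⇒ A simplifies to A.

BOT, TOP = '⊥', '⊤'

def simplify(f):
    if not isinstance(f, tuple):
        return f
    op, lhs, rhs = f
    lhs, rhs = simplify(lhs), simplify(rhs)
    if op == '∇' and lhs == BOT:     # rs∇ : a/⊥∇A  yields  a/A
        return rhs
    if op == '⇒' and lhs == TOP:     # rs⇒ : a/⊤⇒A  yields  a/A
        return rhs
    return (op, lhs, rhs)

# ⊥ ∇ [⊤ ⇒ B] simplifies all the way down to B, as in the
# replacement proof for r_MP below.
assert simplify(('∇', BOT, ('⇒', TOP, 'B'))) == 'B'
```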

**Lemma 3.** *Provability and refutation degree for GRFPL*

*<sup>T</sup>* ⊢*<sup>a</sup> <sup>A</sup>* iff *<sup>a</sup>* = ∨{*Val*(*w*) | *w* is a refutational proof of *A* from LAx ∪ SAx}

*Proof.* If *<sup>T</sup>* ⊢*<sup>a</sup> <sup>A</sup>* then *<sup>a</sup>* = ∨{*Val*(*w*) | *w* is a proof of *A* from LAx ∪ SAx} and for every such proof we can construct a refutational proof as follows (*Val*(*w*) ≤ *a*): *w* := *a*/*A* {proof of *A*}, 1/¬*A* {member of refutational proof}, *a* ⊗ 1/⊥ {*rGR*}.

If *<sup>a</sup>* = ∨{*Val*(*w*) | *w* is a refutational proof of *A* from LAx ∪ SAx} (*Val*(*w*) ≤ *a*): *w* := *a*0/*A*0, ..., *ai*/*Ai*, 1/¬*A*, ..., *a*/⊥, where *A*0, ..., *Ai* are axioms. There is a proof *w*′ := *a*0/*A*0, ..., *ai*/*Ai*, 1/¬*A*∇*A*, *ai*+2/*Ai*+2∇*A*, ..., *a*/⊥∇*A*. All the schemes of the type *Aj*∇*A*, *j* > *i*, can be simplified by the sound simplification rules and the formula ¬*A*∇*A* may be removed.

The proof *w*′′ := *a*0/*A*0, ..., *ai*/*Ai*, *ai*+2/*Ai*+2∇*A*, ..., *a*/*A* is a correct proof of A in the degree a since the formulas are either axioms or results of application of resolution.

### **Theorem 1.** *Completeness for fuzzy logic with rGR, rs*∇*, rs*<sup>⇒</sup> *instead of rMP*

*A formal fuzzy theory where rMP is replaced with rGR, rs*∇*, rs*<sup>⇒</sup> *is complete, i.e. for every A from the set of formulas T* ⊢*<sup>a</sup> A iff T* |=*<sup>a</sup> A.*

*Proof.* The left-to-right implication (soundness of such a formal theory) follows easily from the soundness of the resolution rule. Conversely, it is sufficient to prove that the rule *rMP* can be replaced by *rGR*, *rs*∇, *rs*⇒. Indeed, let *<sup>w</sup>* be a proof:

*w* := *a A* {proof *wa*}, *b <sup>A</sup>*⇒*<sup>B</sup>* {proof *wA*⇒*B*}, *<sup>a</sup>* <sup>⊗</sup> *<sup>b</sup> B* {*rMP*}. Then we can replace it by the proof:

*w* := *a*/*A* {proof *wa*}, *b*/*A*⇒*B* {proof *wA*⇒*B*}, *a* ⊗ *b*/⊥∇[⊤⇒*B*] {*rGR*}, *a* ⊗ *b*/⊤⇒*B* {*rs*∇}, *a* ⊗ *b*/*B* {*rs*⇒}

Using the last sequence, every proof step made with *rMP* can also be made with the proposed *rR* and simplification rules. Since the usual formal theory with *rMP* is complete, as proved in (Novák, V.), every fuzzy formal theory with these rules is also complete. Note that the non-ground case (requiring unification) can be simulated in the same way as in the proof of soundness.
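The degree bookkeeping in the derivation above can be simulated numerically. The following is a minimal Python sketch, not the chapter's implementation; it assumes the Łukasiewicz t-norm as the interpretation of ⊗ (the theory allows other choices), and all names are illustrative:

```python
# Minimal sketch: simulating the degree bookkeeping when the rMP step
# is replaced by rGR plus the simplification rules rs=> and rs-nabla.

def luk_tnorm(a: float, b: float) -> float:
    """Lukasiewicz t-norm, one common interpretation of the ⊗ operation."""
    return max(0.0, a + b - 1.0)

# evaluated formulas as (degree, formula-string) pairs
premise_a = (0.8, "A")            # a / A
premise_imp = (0.9, "A => B")     # b / A => B

# rGR resolves the premises on the atom A; the resolvent's degree is a ⊗ b
degree = luk_tnorm(premise_a[0], premise_imp[0])
resolvent = "False OR (True => B)"    # the resolvent ⊥ ∇ (⊤ ⇒ B)

# the simplification rules reduce the resolvent step by step
resolvent = "False OR B"              # rs⇒ : (⊤ ⇒ B) simplifies to B
resolvent = "B"                       # rs∇ : (⊥ ∇ B) simplifies to B

print(degree, resolvent)              # same conclusion as rMP: a ⊗ b / B
```

The point of the sketch is only that the degree a ⊗ b attached by *rGR* survives the simplification steps unchanged, which is what the replacement argument requires.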

#### **4. Implementation and efficiency**

The author is also currently implementing the non-clausal theorem prover for fuzzy logic as an extension of a previous prover for FOL (GEneralized Resolution Deductive System - GERDS) (Habiballa, H.). Experiments concerning prospective inference strategies can be performed with this extension. The prover, called Fuzzy Predicate Logic GEneralized Resolution Deductive System (Fig. 1) - FPLGERDS, provides a standard interface for input (knowledge base and goals) and output (proof sequence and results of fuzzy inference, statistics).

Fig. 1. Fuzzy Predicate Logic GEneralized Resolution Deductive System

There are already several efficient strategies proposed by the author (mainly Detection of Consequent Formulas (DCF), adapted for use also in FPL). With these strategies the proving engine can be implemented in real-life applications, since the complexity of theorem proving in FPL is dimensionally harder than in FOL (we need to search for all possible proofs in order to find the best refutation degree). The DCF idea is to forbid the addition of a resolvent which is a logical consequence of any previously added resolvent. For refutational theorem proving it is a sound and complete strategy, and it is empirically very effective. Completeness of such a strategy is also straightforward in FOL:

$$(R\_{old} \vdash R\_{new}) \land (\mathsf{U}, R\_{new} \vdash \bot) \Rightarrow (\mathsf{U}, R\_{old} \vdash \bot)$$

Example: *Rnew* = *p*(*a*), *Rold* = ∀*x*(*p*(*x*)), *Rold* � *Rnew*.
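The DCF filter itself can be sketched as a small bookkeeping routine. This is an illustrative Python fragment, not the FPLGERDS code; the toy `entails` predicate stands in for the refutational consequence check described in the text:

```python
# Illustrative sketch of the DCF idea (not the FPLGERDS implementation):
# a candidate resolvent is discarded when it is a logical consequence of
# an already kept resolvent.

def is_redundant(new, kept, entails):
    """DCF check: True if some kept resolvent already entails `new`."""
    return any(entails(old, new) for old in kept)

def add_resolvent(new, kept, entails):
    """Add `new` to the kept resolvents unless DCF marks it redundant."""
    if not is_redundant(new, kept, entails):
        kept.append(new)
    return kept

# toy consequence test over atoms (pred, term): ∀x p(x) entails every p(t);
# the convention "term 'x' is the universal variable" is an assumption here
def entails(old, new):
    pred_old, term_old = old
    pred_new, term_new = new
    return pred_old == pred_new and (term_old == "x" or term_old == term_new)

kept = [("p", "x")]                               # R_old = ∀x p(x)
kept = add_resolvent(("p", "a"), kept, entails)   # R_new = p(a) is redundant
print(len(kept))                                  # p(a) was not added
```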


DCF could be implemented by the same procedures as General Resolution (we may utilize self-resolution). Self-resolution has the same positive and negative premise and needs to resolve all possible combinations of an atom. It uses the following scheme:

$$R\_{old} \vdash R\_{new} \Leftrightarrow \neg (R\_{old} \to R\_{new}) \vdash \bot$$

Even though the usage of this technique is a semidecidable problem, we can use a time or step limitation of the algorithm, and it will not affect the completeness of the *RRTPFOL*. Example: *Rnew* = *p*(*a*), *Rold* = ∀*x*(*p*(*x*)), ¬(∀*x*(*p*(*x*)) → *p*(*a*)) MGU: *Sbt*(*x*) = *a*, *Res* = ¬(⊥→⊥) ∨ ¬(⊤→⊤) ⇒ ⊥. We have proved that *Rnew* is a logical consequence of *Rold*.
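The ground step of this example — finding the most general unifier *Sbt*(*x*) = *a* for *p*(*x*) and *p*(*a*) — can be sketched in Python. This is a toy unifier for flat atoms only, with an assumed uppercase-variable naming convention, not the prover's unification code:

```python
# Toy sketch of the MGU step used by the consequence test; it handles
# only flat atoms (no nested terms) and assumes uppercase names are variables.

def mgu(pattern, fact):
    """Most general unifier of two flat atoms, or None if none exists."""
    (p, args1), (q, args2) = pattern, fact
    if p != q or len(args1) != len(args2):
        return None
    sbt = {}
    for s, t in zip(args1, args2):
        if s[0].isupper():                 # variable position
            if sbt.get(s, t) != t:         # conflicting binding
                return None
            sbt[s] = t
        elif s != t:                       # constant mismatch
            return None
    return sbt

# R_old = ∀x p(x) vs R_new = p(a): the unifier {X: a} closes the refutation
print(mgu(("p", ("X",)), ("p", ("a",))))
```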

In FPL we have to enrich the DCF procedure by a limitation on the provability degree: if *U* ⊢*a Rold* ∧ *U* ⊢*b Rnew* ∧ *b* ≤ *a*, then we can apply DCF. The DCF Trivial check performs a symbolic comparison of *Rold* and *Rnew*; we use the same provability degree condition. In other cases we have to add *Rnew* into the set of resolvents, and we can apply the DCF Kill procedure. DCF Kill searches for every *Rold* being a logical consequence of *Rnew*, and if *U* ⊢*a Rold* ∧ *U* ⊢*b Rnew* ∧ *b* ≥ *a*, then it kills *Rold* (the resolvent is removed).
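The two degree conditions can be combined into one bookkeeping routine. The following is an illustrative Python sketch, not the FPLGERDS code; `entails` is again a toy stand-in for the real consequence test:

```python
# Illustrative sketch of the degree-aware DCF / DCF Kill bookkeeping.
# `kept` is a list of (formula, provability degree) pairs.

def dcf_add(new, new_deg, kept, entails):
    # DCF: discard `new` if it follows from a kept resolvent whose
    # degree is at least as high (b <= a)
    for old, old_deg in kept:
        if entails(old, new) and new_deg <= old_deg:
            return kept
    # DCF Kill: remove every kept resolvent that follows from `new` and
    # has a degree not above it (b >= a), then keep `new`
    kept = [(old, d) for old, d in kept
            if not (entails(new, old) and d <= new_deg)]
    kept.append((new, new_deg))
    return kept

# toy consequence test over atoms (pred, term): ∀x p(x) entails every p(t)
def entails(old, new):
    pred_old, term_old = old
    pred_new, term_new = new
    return pred_old == pred_new and (term_old == "x" or term_old == term_new)

kept = [(("p", "x"), 0.8)]
kept = dcf_add(("p", "a"), 0.5, kept, entails)   # dropped by DCF
kept = dcf_add(("p", "a"), 0.9, kept, entails)   # higher degree: kept
kept = dcf_add(("p", "x"), 0.95, kept, entails)  # DCF Kill removes both older ones
print(kept)
```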

We will now show some efficiency results concerning many-valued logic, both for Fuzzy Predicate Logic (FPL) and Fuzzy Description Logic (FDL). We have used the above mentioned application FPLGERDS and the originally developed DCF strategy for FPL. It is clear that inference in *RRTPFPL* and *RRTPFDL* on general knowledge bases is a problem solvable in exponential time. Nevertheless, as we would like to demonstrate, the need to search for every possible proof (in contrast to two-valued logic) does not necessarily lead to an inefficient theory in particular cases. We have devised knowledge bases (KB) for the following typical problems related to the use of fuzzy logic.

We have performed experimental measurements concerning the efficiency of the presented non-clausal resolution principle and also the DCF technique. These measurements were done using the FPLGERDS application (Habiballa, H.). Special testing knowledge bases were prepared, and several types of inference were tested on a PC with a standard Intel Pentium 4 processor, as described below.

### **Fuzzy Predicate Logic redundancy-based inefficient knowledge bases**

As was shown above in the theorem proving example, the problem of proof search is quite different in FPL and FDL in comparison with two-valued logic. We have to search for the best refutation degree using refutational theorem proving in order to draw sensible conclusions from the inference process. It means we cannot accept the **first successful** proof; we have to check **"all possible proofs"**, or we have to be sure that every omitted proof is **worse** than some other one. The presented DCF and DCF Kill techniques belong to the third sort of proof search strategies, i.e. they omit proofs that are really worse than some other (see the explication above). Proofs and formulas causing this could be called redundant proofs and redundant formulas. Fuzzy logic makes this redundancy dimensionally harder, since we could produce not only equivalent formulas but also equivalent formulas of different evaluation degree.

**Example 4.** *Redundant knowledge base.* *Consider the following knowledge base (fragment):*

```
...,
0.51 / a ∧ b1 ⇒ z,
0.61 / a ∧ b1 ∧ b1 ⇒ z,
0.71 / a ∧ b1 ∧ b1 ∧ b1 ⇒ z,
0.81 / a ∧ b1 ∧ b1 ∧ b1 ∧ b1 ⇒ z,
0.91 / a ∧ b1 ∧ b1 ∧ b1 ∧ b1 ∧ b1 ⇒ z,
1 / b1,
...,
0.52 / a ∧ b2 ⇒ z,
0.62 / a ∧ b2 ∧ b2 ⇒ z,
0.72 / a ∧ b2 ∧ b2 ∧ b2 ⇒ z,
0.82 / a ∧ b2 ∧ b2 ∧ b2 ∧ b2 ⇒ z,
0.92 / a ∧ b2 ∧ b2 ∧ b2 ∧ b2 ∧ b2 ⇒ z,
1 / b2,
...,
Goal: ? − a⇒z
```
*Searching for the best proof of a goal will produce a lot of logically equivalent formulas with different degrees. These resolvents make the inference process inefficient and one of the essential demands to the presented refutational theorem prover is a reasonable inference strategy with acceptable time complexity.*

We have compared the efficiency of the standard **breadth-first search**, **linear search** and **modified linear search** (starting from every formula in the knowledge base), and also their combinations with the DCF and DCF Kill techniques (Habiballa, H.). We have prepared knowledge bases of the sizes 120, 240, 360, 480 and 600 formulas. The time and space efficiency has been compared on the criterion of 2 redundancy levels. This level represents the number of redundant formulas to which a formula is equivalent (including the original formula). For example, level 5 means the knowledge base contains 5 equivalent redundant formulas for every formula (including the formula itself). The basic possible state space search techniques and DCF heuristics and their combinations are presented in the following tables.

We use standard state space search algorithms in the FPLGERDS application - breadth-first and linear search. The breadth-first method searches for every possible resolvent from the formulas of level 0 (goal and special axioms). These resolvents form the formulas of level 1, and we try to combine them with all formulas of the same and lower levels, continuing by the same procedure until no other non-redundant resolvent can be found. Linear search performs a depth-first search procedure, where every produced resolvent is used as one of the premises in the succeeding step of inference. The first produced resolvents arise from the goal formula. The modified linear search method follows the same procedure as the linear one, but it starts from the goal and also from all the special axioms.
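The level-order generation loop can be sketched as follows. This is an illustrative Python skeleton, not the FPLGERDS code; `resolve` stands in for general resolution (here a toy rule chaining implications), and the `seen` set plays the role of the trivial symbolic redundancy check:

```python
# Skeleton of the breadth-first (level order) resolvent generation
# described above; `resolve` is a stand-in for general resolution.

def breadth_first(level0, resolve):
    levels = [list(level0)]              # level 0: goal + special axioms
    seen = set(level0)                   # trivial (symbolic) redundancy check
    while True:
        everything = [f for level in levels for f in level]
        new_level = []
        for f in levels[-1]:             # combine the newest level ...
            for g in everything:         # ... with all same/lower levels
                for r in resolve(f, g):
                    if r not in seen:
                        seen.add(r)
                        new_level.append(r)
        if not new_level:                # no non-redundant resolvent remains
            return everything
        levels.append(new_level)

# toy resolution: chain implications represented as (antecedent, consequent)
def resolve(f, g):
    out = []
    if f[1] == g[0]:
        out.append((f[0], g[1]))
    if g[1] == f[0]:
        out.append((g[0], f[1]))
    return out

closure = breadth_first([("a", "b"), ("b", "c"), ("c", "d")], resolve)
print(len(closure))
```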

DCF methods for the reduction of the resolvent space are basically three. The simplest is the trivial DCF method, which detects a redundant resolvent only by exact symbolic comparison, i.e. formulas are equivalent only if they are syntactically the same.

| Search method | Code | Description |
|---------------|------|-------------|
| Breadth | B | Level order generation, start: special axioms + goal |
| Linear | L | Resolvent ⇒ premise, start: goal |
| Modified-Linear | M | Resolvent ⇒ premise, start: goal + special axioms |

Table 2. Proof search algorithms

| DCF method | Code | Description |
|------------|------|-------------|
| Trivial | T | Exact symbolic comparison |
| DCF | DC | Potential resolvent is consequent (no addition) |
| DCF Kill | DK | DCF + remove all consequent resolvents |

Table 3. DCF heuristics
| Search | DCF | Code | Description |
|--------|-----|------|-------------|
| Breadth | Trivial | BT | Complete |
| Breadth | DCF | BDC | Complete |
| Breadth | DCF Kill | BDK | Complete |
| Mod. Linear | Trivial | MT | Incomplete (+) |
| Mod. Linear | DCF | MDC | Incomplete (+) |
| Mod. Linear | DCF Kill | MDK | Incomplete (+) |
| Linear | Trivial | LT | Incomplete |
| Linear | DCF | LDC | Incomplete |
| Linear | DCF Kill | LDK | Incomplete |

Table 4. Inference strategies

Even though it is a very rough method, it is computationally very simple and forms a necessary restriction on the possibly infinite inference process. The next method, the DCF technique, makes it possible to detect the equivalency of a formula (a potential new resolvent) by the means described above. The DCF Kill technique additionally tries to remove every redundant resolvent from the set of resolvents. An important aspect of the DCF theorem lies in its simple implementation in an automated theorem prover based on general resolution. The prover handles formulas in the form of a syntactical tree. A procedure performing general resolution of two formulas on an atom is programmed; this procedure is also used for the implementation of the theorem. A "virtual tree" is created from the candidate and a former resolvent (axiom) connected by a negated implication. Then it remains to perform self-resolution on such a formula until a logical value is obtained. Let us compare the efficiency of the standard strategies and the above-defined ones. We have built up 9 combinations of inference strategies from the mentioned proof search and DCF heuristics. They have different computational strength, i.e. their completeness differs for various classes of formulas. Fully complete (as described above) for general formulas of FPL and FDL are only the breadth-first search combinations. Linear search strategies are not complete even for two-valued logic and Horn clauses. Modified linear search has generally bad completeness results when an infinite loop is present in proofs, but for guarded knowledge bases it can assure completeness while preserving better space efficiency than breadth-first search. We tested the presented inference strategies on sample knowledge bases with redundancy level 5, with 20, 40, 60, 80 and 100 groups of mutually redundant formulas (the total number of formulas in the knowledge base is 120, 240, 360, 480 and 600). At first we tested their time efficiency for the inference process.
As can be observed from figure 2, the best results are achieved by the **LDK and LDC** strategies. For simple guarded knowledge bases (not leading to an infinite loop in proof search, and where the goal itself assures the best refutation degree) these two methods are **very efficient**. DCF strategies significantly reduce the proof search even in comparison with the LT strategy (standard); therefore the usage of any non-trivial DCF heuristic is significant. The next important result follows from the comparison of BDK with the MDK and MDC strategies. We can conclude that the MDK and MDC strategies are relatively comparable to BDK, and moreover BDK preserves completeness for general knowledge bases.

Space complexity is affected even more significantly by the DCF heuristics. There is an interesting comparison of trivial and non-trivial DCF heuristics in figure 3. Even the BDK strategy brings a significant reduction of the number of resolvents, while the LDK, LDC, MDK and MDC strategies keep the minimal necessary number of resolvents during the inference process. The second examined redundancy level, 10, also shows an important comparison for increasing redundancy in knowledge bases. The tested knowledge bases contained 10, 20, 30, 40 and 50 groups of 10 equivalent formulas (the total number of formulas was 110, 220, 330, 440 and 550).

Fig. 2. Time complexity for redundancy level 5 (seconds)

Time efficiency results show that a higher redundancy level causes the expected increase in the time necessary for the best proof search (figure 4). The approximate increase is double, while the proportion shows good results for the MDK, MDC and LDK, LDC (linear search based) strategies. This property also holds for space complexity, as shown in figure 5. The performed experiments show the significance of the originally developed DCF strategies in combination with standard breadth-first search (important for general knowledge bases - **BDK**). We also outlined the high efficiency of linear search based strategies (mainly **LDK**). Even though this strategy is not fully complete and can be used only for a guarded fragment of FDL, this problem is already known from classical (two-valued) logic programming and automated theorem proving, where such highly efficient linear search strategies are also used even though they are not complete.

Fig. 3. Space complexity for redundancy level 5 (resolvents)


Fig. 4. Time complexity for redundancy level 10 (seconds)


Fig. 5. Space complexity for redundancy level 10 (resolvents)


### **5. Conclusions and further research**

The *Non-clausal Refutational Resolution Theorem Prover* forms a powerful inference system for automated theorem proving in fuzzy predicate logic. The main advantage in contrast with other inference systems lies in the possibility to utilize various inference strategies for effective reasoning. Therefore it is essential for practically successful theorem proving.

The Detection of Consequent Formulas family of algorithms brings significant improvements in time and space efficiency for the best proof search. We have shown results indicating specific behavior of some combinations of DCF and standard proof search (breadth-first and linear search). DCF strategies (BDC, BDK) show interesting results even for fully general fuzzy predicate logic with evaluated syntax, where the strategy makes the inference process practically manageable (in contrast to unrestricted blind proof search). However, it seems more promising for practical applications to utilize incomplete strategies with high time efficiency, such as LDK (even for large knowledge bases it has very short solving times). This conforms to other successful practical applications in two-valued logic, such as logic programming or deductive databases, where efficient incomplete strategies for fragments of fully general logics are also used.

We have briefly presented some efficiency results for the presented automated theorem prover and inference strategies. They show a significant reduction of time and space complexity for the DCF technique. The experimental application FPLGERDS can be obtained from *http://www1.osu.cz/home/habibal/files/gerds.zip*. The package contains the current version of the application, source codes, examples and documentation. This work was supported by project DAR (1M0572).

### **6. References**




Bachmair, L., Ganzinger, H. (1997). A theory of resolution. Technical report:

Bachmair, L., Ganzinger, H. (2001). Resolution theorem proving. In Handbook of Automated

Duki´c, N., Avdagi´c, Z. (2005). Fuzzy Functional Dependency and the Resolution Principle.

Habiballa, H. (2000). Non-clausal resolution - theory and practice. Research report: University

Habiballa, H., Novák, V. (2002). Fuzzy General Resolution. In Proc. of Intl. Conf. Aplimat 2002.

Habiballa, H. (2006). Resolution Based Reasoning in Description Logic. In Proc. of Intl. Conf.

Habiballa, H.(2006a). Fuzzy Predicate Logic Generalized Resolution Deductive System.

Hájek, P. (2000). Metamathematics of fuzzy logic. Kluwer Academic Publishers - Dordrecht,

Hájek, P. (2005). Making fuzzy description logic more general. Fuzzy Sets and Systems

Novák, V., Perfilieva, I., Moˇckoˇr, J. (1999). Mathematical principles of fuzzy logic. Kluwer, 1999.

of Ostrava, 2000, http://www.volny.cz/habiballa/files/gerds.pdf

rep. at http://ac030.osu.cz/irafm/ps/rep47.ps

http://ac030.osu.cz/irafm/ps/rep66.ps.gz.

In Informatica, Vilnius: Lith. Acad. Sci. (IOSPRESS), 2005, Vol.16, No. 1, pp. 45 - 60,

Bratislava, Slovak Technical University, 2002. pp. 199-206, also available as research

ZNALOSTI 2006, Univ. of Hradec Kralove, 2006, also available as research rep. at

Technical Report, Institute for Research and Application of Fuzzy Modeling,

reasoning. Therefore it is essential for practically successful theorem proving.

**5. Conclusions and further research**

for fragments of fully general logics.

Max-Planck-Institut, 1997.

Reasoning, MIT Press, 2001.

University of Ostrava, 2006.

154(2005),pp. 1-15.

project DAR (1M0572).

2005.

2000.

**6. References**

The aim of this chapter is to consider the relationship between standard fuzzy set theory and some many-valued logics. Prof. Lotfi A. Zadeh introduced his theory of fuzzy sets in the sixties, and his first paper that circulated widely around the world is "Fuzzy Sets" (Zadeh, 1965). In the long run, this theory came to be called the *theory of standard fuzzy sets*.

After Zadeh introduced his theory, many-valued logic attracted new interest. In particular, Łukasiewicz logic became quite closely tied to fuzzy sets. There is a strong opinion that Łukasiewicz infinite-valued logic plays the role of the logic of fuzzy sets, similarly as classical logic plays the role of the logic of crisp sets. Actually, however, it seems that Kleene's 3-valued logic was the logic closest to fuzzy sets when Zadeh created his theory. We will discuss this later. Descriptions of Kleene's logic are given in the books by Rescher (Rescher, 1969) and Bergmann (Bergmann, 2008).

In Section 2 we consider the main concepts of fuzzy set theory. We will not do this exhaustively, because our purpose is not to present the whole theory of standard fuzzy sets. We restrict our consideration to those things we need when "building a bridge" between fuzzy sets and some closely related logics. The section is based on Zadeh (Zadeh, 1965).

In Section 3 we consider De Morgan algebras in general in order to have a formal base for our considerations. There are many sources for this topic; a remarkable one is Rasiowa's book (Rasiowa, 1974).

In Section 4 we introduce an algebraic approach to standard fuzzy set theory by applying De Morgan algebras. From the infinitely large collection of De Morgan algebras we choose one that fits standard fuzzy set theory completely. We call this De Morgan algebra *Zadeh algebra*. The concept "Zadeh algebra" was introduced by the author at the international symposium "Fuzziness in Finland" in 2004, which Prof. Zadeh also attended. In the same year, a more comprehensive article about Zadeh algebra (*cf.* (Mattila, 2004)) was published by the author. This algebra gives a tool for studying connections between standard fuzzy sets and certain many-valued logics, two of which are Kleene's logic and Łukasiewicz logic. Some analysis of Łukasiewicz and Kleene's logic is given, for example, in Mattila (Mattila, 2009); especially, connections to modal logic are considered in that paper.


In Section 5 we analyze the essence of fuzziness from the formal point of view. We try to find the original point where fuzziness appears and how it "moves" from its hiding place, making some concepts fuzzy.

In Section 6 we give the definition of a *propositional language* by introducing its alphabet and how the expressions, i.e., *well-formed formulas* (or *formulas*, for short), can be formed from the alphabet. This formal language can be used for classical propositional logic and for many-valued propositional logic, too. We do not consider any other logical properties here, because they are not necessary for our purpose. In addition to the formal language, only the concepts *valuation* and *truth function* are needed. Concerning truth value evaluation, we consider the features common to several logics. The counterparts are obtainable also from Zadeh algebra. We also construct a *propositional algebra* that turns out to be a Zadeh algebra.

Section 7 is devoted to Kleene's 3-valued logic, an important logic for fuzzy sets, as we already noticed above; hence the consideration of this logic deserves its own section. We tell about Kleene's motivation for constructing his 3-valued logic and give the truth value evaluation rules for the basic connectives. These rules fit completely well with the fuzzy set operations Zadeh introduced. We also explain the connections between standard fuzzy sets and this logic from Zadeh's point of view. At the end of this section, we give a short description of *Kleene-Dienes many-valued logic*, which is an extension of Kleene's 3-valued logic into infinite-valued logic.

In Section 8 we consider the main features of Łukasiewicz infinite-valued logic. Our main problem is included in this section. Łukasiewicz chose negation and implication as primitive connectives and derived conjunction, disjunction, and equivalence from these primitives. This starting point does not fit together with the operations of Zadeh algebra: only the counterpart of negation (the complementarity operation) is included in Zadeh algebra, and implication does not appear in it. In Łukasiewicz logic the two other connectives, disjunction and conjunction, belong to the derived connectives. But they have such a form that their truth value evaluation rules are exactly the same as those of the corresponding operations in Zadeh algebra. So, starting from the negation, disjunction, and conjunction among Łukasiewicz logic's connectives, we have to derive the Łukasiewicz implication. Actually, for this task we need only negation and disjunction, as is seen in Proposition 8.2 and its proof. Our final result is presented in Proposition 8.3. Some considerations on this topic can be found in Mattila (Mattila, 2005).

In Section 9 we consider briefly MV-algebras and give some hints on how the connection between standard fuzzy sets and Łukasiewicz logic can be found. MV-algebras and their applications to fuzzy set theory and soft computing are widely studied, and the study of this topic actually forms a mainstream in this research area. Three books representing this topic are mentioned in the References, namely M. Bergmann (Bergmann, 2008), R. L. O. Cignoli et al. (Cignoli et al., 2000), and P. Hájek (Hájek, 1998). These books belong to the quite central literature on the topic.

MV-algebras are more general than De Morgan algebras; formally it can be proved that De Morgan algebras belong to MV-algebras as a special case. But for our problem, the usual ways of applying general MV-algebras seem to give a circuitous route rather than a straightforward bridge between standard fuzzy set theory and Łukasiewicz logic.

In Section 10 we point out the main results and other concluding remarks.

### **2. Zadeh's theory of standard fuzzy sets**

For considering the standard system of fuzzy sets, the range of fuzzy sets (i.e., that of membership functions) is the unit interval **I** = [0, 1]. We give the definition of the concept *fuzzy set* using Zadeh's original definition. However, some symbols have been changed. Usually, the symbol of a fuzzy set, in general, is denoted by *μ*. A membership function of a fuzzy set *A* in a reference set *X* can be written as *μA*(*x*) or A(*x*) where *x* ∈ *X*.

**Definition 2.1** (Standard fuzzy set)**.** A *fuzzy subset A* of a set *X* is characterized by a *membership function* A(*x*) which associates with each point *x* in *X* a real number in the interval [0, 1], with the value of A(*x*) at *x* representing the "grade of membership" of *x* in *A*. Thus, the nearer the value of A(*x*) to unity, the higher the grade of membership of *x* in *A*.

This definition means that a fuzzy subset *A* of a universe of discourse *X* is represented by a function

$$\mathcal{A}: X \longrightarrow \mathbf{I}.$$

The *power set* of all fuzzy subsets of the set *X* is

$$\mathbf{I}^{X} = \{ \mathcal{A} \mid \mathcal{A} : X \longrightarrow \mathbf{I} \}\tag{2.1}$$

An important subset of the set of all membership functions (2.1) is the set of functions taking only values 1 or 0, i.e., the set of all characteristic functions of the crisp subsets of *X*

$$\mathbf{2}^X = \{ f \mid f: X \longrightarrow \{ 0, 1 \} \}$$

as a special case.
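To make the inclusion **2***<sup>X</sup>* ⊂ **I***<sup>X</sup>* concrete, here is a small Python sketch (our own illustration; all names are invented for this example) that models membership functions as ordinary functions, with a characteristic function as the special case whose only values are 0 and 1:

```python
# Membership functions A : X -> [0, 1] modelled as plain Python functions.
# A characteristic function f_A of a crisp set A is the special case
# taking only the values 0 and 1.

def characteristic(crisp_set):
    """Return the characteristic function f_A : X -> {0, 1} of a crisp set."""
    return lambda x: 1.0 if x in crisp_set else 0.0

def close_to_five(x):
    """A genuinely fuzzy membership function on X = [0, 10]: 'close to 5'."""
    return max(0.0, 1.0 - abs(x - 5) / 5.0)

f_A = characteristic({1, 2, 3})
print(f_A(2), f_A(7))                        # 1.0 0.0 -- only crisp values
print(close_to_five(5), close_to_five(7.5))  # 1.0 0.5 -- graded membership
```

Every characteristic function is thus already a membership function, which is exactly the sense in which **2***<sup>X</sup>* sits inside **I***<sup>X</sup>*.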


It is also a well-known fact that **I** and **I***<sup>X</sup>* are partially ordered sets. (Actually, **I** is a totally ordered set, and hence also partially ordered.) In fact, they are also distributive complete lattices. Generally, some main properties of **I** can be embedded into **I***<sup>X</sup>* (*cf.* e.g. Lowen (Lowen, 1996)).

We consider operations, properties, and some concepts involved in fuzzy sets given by Zadeh (Zadeh, 1965).

**Definition 2.2** (Basic operations)**.** Let <sup>A</sup>, <sup>B</sup> <sup>∈</sup> **<sup>I</sup>***<sup>X</sup>* and *<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>*. In **<sup>I</sup>***<sup>X</sup>* the following operations are defined:

(A ∨ B)(*x*) = max{A(*x*), B(*x*)} *union*

(A ∧ B)(*x*) = min{A(*x*), B(*x*)} *intersection*

(¬A)(*x*) = 1 − A(*x*) *complementarity*
Two fuzzy sets <sup>A</sup>, <sup>B</sup> <sup>∈</sup> **<sup>I</sup>***<sup>X</sup>* are *equal*, denoted by <sup>A</sup> <sup>=</sup> <sup>B</sup>, if

$$\forall x \in X, \quad \mathcal{A}(x) = \mathcal{B}(x).$$

A fuzzy set *A* is *contained* in a fuzzy set *B*, i.e., *A* is a *subset* of *B*, denoted by A ⊆ B, if their membership functions satisfy the condition

$$\forall x \in X, \quad \mathcal{A}(x) \le \mathcal{B}(x)$$
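The operations of Definition 2.2 and the pointwise containment order can be sketched in a few lines of Python (a minimal illustration over a finite reference set; the names are ours):

```python
# Fuzzy subsets of a finite reference set X, stored as dicts x -> A(x) in [0, 1].
X = ["a", "b", "c"]
A = {"a": 0.2, "b": 0.7, "c": 1.0}
B = {"a": 0.5, "b": 0.7, "c": 0.4}

def union(A, B):         # (A v B)(x) = max{A(x), B(x)}
    return {x: max(A[x], B[x]) for x in X}

def intersection(A, B):  # (A ^ B)(x) = min{A(x), B(x)}
    return {x: min(A[x], B[x]) for x in X}

def complement(A):       # (not A)(x) = 1 - A(x)
    return {x: 1 - A[x] for x in X}

def contained(A, B):     # A subset of B  iff  A(x) <= B(x) for all x
    return all(A[x] <= B[x] for x in X)

print(union(A, B))                          # {'a': 0.5, 'b': 0.7, 'c': 1.0}
print(contained(intersection(A, B), A))     # True: A ^ B is always contained in A
# De Morgan's law, pointwise: not(A v B) = (not A) ^ (not B)
print(complement(union(A, B)) == intersection(complement(A), complement(B)))
```

On this example the De Morgan check prints `True`, anticipating the laws (2.2) and (2.3) below.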


Zadeh also shows that the operations max and min are associative and distributive over each other, and that De Morgan's laws hold; the latter have the form

$$1 - \min\{\mathcal{A}(\mathbf{x}), \mathcal{B}(\mathbf{x})\} = \max\{1 - \mathcal{A}(\mathbf{x}), 1 - \mathcal{B}(\mathbf{x})\}\tag{2.2}$$

$$1 - \max\{\mathcal{A}(\mathbf{x}), \mathcal{B}(\mathbf{x})\} = \min\{1 - \mathcal{A}(\mathbf{x}), 1 - \mathcal{B}(\mathbf{x})\}\tag{2.3}$$

Actually, Zadeh provides the building materials for an algebra in his paper (Zadeh, 1965). However, he was not thinking of algebras when he wrote "Fuzzy Sets"; he approached the problem from another point of view. We return to this matter at the end of Section 4.

Finally, we present the following theorem due to C. V. Negoiță and D. A. Ralescu (Negoiță & Ralescu, 1975).

**Theorem 2.1.** *The set* **I***<sup>X</sup> is a complete distributive lattice.*

*Proof.* The reference set *X* has the membership function

$$
\mu\_X(\mathfrak{x}) = 1, \quad \mathfrak{x} \in X
$$

and the empty set ∅ the membership function

$$
\mu\_{\mathcal{D}}(\mathfrak{x}) = 0, \quad \mathfrak{x} \in X.
$$

This corresponds to the fact that **<sup>1</sup>**, **<sup>0</sup>** <sup>∈</sup> **<sup>I</sup>***<sup>X</sup>* where **<sup>1</sup>**(*x*) = 1 and **<sup>0</sup>**(*x*) = 0 for any *<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>*. Hence, the result follows by the definition of complete lattice and the order properties of the unit interval.

### **3. On De Morgan algebras**

To get an algebra of standard fuzzy sets we start by considering the concept of De Morgan algebras. The main source is Helena Rasiowa's book (Rasiowa, 1974).

**Definition 3.1** (De Morgan algebra)**.** An abstract algebra A = ⟨*A*, ∨, ∧, ¬, **1**⟩ is called a *De Morgan algebra* if (*A*, ∨, ∧) is a distributive lattice with unit element **1** (the neutral element of the ∧ operation), and ¬ is a unary operation on *A* satisfying the following conditions:

(DM1) for all *a* ∈ *A*, ¬¬*a* = *a*,

$$\text{(DM2)}\qquad\text{for all }a,b\in A,\quad\neg(a\lor b)=\neg a\land\neg b.$$

It is easy to prove that in any De Morgan algebra ⟨*A*, ∨, ∧, ¬, **1**⟩ the following properties hold:

(DM3) there is a zero element **0** (the neutral element of ∨ operation),

(DM4) ¬**0** = **1** and ¬**1** = **0**,

(DM5) ¬(*a* ∧ *b*) = ¬*a* ∨ ¬*b*.

The unit element is the greatest element and the zero element the least element of *A*. By (DM3), we sometimes add the zero element of a De Morgan algebra to the list of components of the algebra: A = ⟨*A*, ∨, ∧, ¬, **0**, **1**⟩.
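As an illustration of the "easy to prove" claim, (DM5) follows from (DM1) and (DM2) in one line (our own sketch):

$$\neg(a \wedge b) \overset{\text{(DM1)}}{=} \neg(\neg\neg a \wedge \neg\neg b) \overset{\text{(DM2)}}{=} \neg\neg(\neg a \vee \neg b) \overset{\text{(DM1)}}{=} \neg a \vee \neg b$$

The middle step applies (DM2) to the elements ¬*a* and ¬*b*.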


Consider the unit interval lattice **I** = ([0, 1], ≤). Sometimes we write **I** = [0, 1], for short. As is well known, the order relation ≤ and the operations ∨ and ∧ have the connection

$$\forall x, y \in \mathbf{I}, \; x \le y \iff \begin{cases} x \lor y = y \\ x \land y = x \end{cases} \tag{3.1}$$

Hence, we can write the lattice **I** in the form **I** = ([0, 1], ∨, ∧). That it is a distributive lattice will be shown in the proof of Theorem 3.1, where we prove that **I** forms a De Morgan algebra. Especially, the order relation ≤ is a total order on [0, 1], because it is an order and any two elements of the interval [0, 1] are comparable with each other under it, i.e., for any *x*, *y* ∈ **I**, we can state whether the order *x* ≤ *y* holds or not.

The interval [0, 1] is a metric space with the natural metric *distance* between two points of [0, 1] given by the condition

$$d(x, y) = |x - y|\,, \quad x, y \in [0, 1] \tag{3.2}$$

We will see that this equality measure can be used in Łukasiewicz infinite-valued logic as the evaluation rule for the connective *equivalency*.
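With the standard Łukasiewicz evaluation of equivalence, the truth value of *a* ↔ *b* is 1 − |*a* − *b*|, i.e., one minus the distance (3.2); a quick numeric sketch (our own illustration):

```python
def lukasiewicz_equiv(a, b):
    """Truth value of a <-> b in Lukasiewicz logic: 1 - |a - b| = 1 - d(a, b)."""
    return 1 - abs(a - b)

print(lukasiewicz_equiv(1.0, 1.0))  # 1.0: identical truth values, full equivalence
print(lukasiewicz_equiv(1.0, 0.0))  # 0.0: opposite truth values
print(lukasiewicz_equiv(0.7, 0.4))  # approximately 0.7: close values are "almost equivalent"
```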

**Theorem 3.1.** *The system L***<sup>I</sup>** = ⟨**I**, ∨, ∧, ¬, 0, 1⟩ *is a De Morgan algebra, where for all x* ∈ [0, 1]*,* ¬*x* = 1 − *x.*

*Proof.* First, we show that **I** is a distributive lattice. It is clear that **I** is a lattice. For showing distributivity, we choose arbitrarily elements *a*, *b*, *c* ∈ [0, 1]. Without loss of generality, we can suppose that *a* ≤ *b* ≤ *c*. Then, by (3.1) we have

$$\begin{cases} a \lor (b \land c) = a \lor b = b\\ (a \lor b) \land (a \lor c) = b \land c = b \end{cases} \implies a \lor (b \land c) = (a \lor b) \land (a \lor c)$$

Similarly, we have *a* ∧ (*b* ∨ *c*)=(*a* ∧ *b*) ∨ (*a* ∧ *c*). Hence, **I** = ([0, 1], ∨, ∧) is a distributive lattice.

(DM1) holds because for all *a* ∈ [0, 1],

$$\neg\neg a = 1 - (1 - a) = a$$

(DM2) holds because for all *a*, *b* ∈ [0, 1],

$$\begin{cases} \neg(a \lor b) = 1 - (a \lor b) = 1 - b & \text{if } a \le b \\\neg a \land \neg b = (1 - a) \land (1 - b) = 1 - b & \text{if } a \le b \end{cases} \implies \neg(a \lor b) = \neg a \land \neg b$$

Hence, by Def. 3.1, *L***<sup>I</sup>** is a De Morgan algebra.

From the ordering property (3.1) it follows that for all *x*, *y* ∈ **I**

$$\mathbf{x} \lor \mathbf{y} = \max\{\mathbf{x}, \mathbf{y}\} \tag{3.3}$$

$$x \wedge y = \min\{x, y\} \tag{3.4}$$

Hence, we can express the algebra of Theorem 3.1 in the form

$$L\_{\mathbf{I}} = \langle \mathbf{I}, \max, \min, \neg, 0, 1 \rangle \tag{3.5}$$
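The proof of Theorem 3.1 can be spot-checked numerically by sampling points of [0, 1] and testing distributivity, (DM1), and (DM2) for max, min, and ¬*x* = 1 − *x*. This is a sanity check of the algebra, not a substitute for the proof; the code is our own illustration:

```python
from itertools import product

neg = lambda x: 1 - x
pts = [i / 10 for i in range(11)]  # sample grid of [0, 1]

for a, b, c in product(pts, repeat=3):
    # distributivity: a v (b ^ c) = (a v b) ^ (a v c)
    assert max(a, min(b, c)) == min(max(a, b), max(a, c))
    # (DM1): not not a = a        (DM2): not(a v b) = (not a) ^ (not b)
    assert abs(neg(neg(a)) - a) < 1e-12
    assert abs(neg(max(a, b)) - min(neg(a), neg(b))) < 1e-12

print("L_I passes the De Morgan algebra spot-check")
```

The tolerances guard against floating-point rounding in 1 − (1 − *a*); the lattice identities hold exactly because max and min merely select among the sampled values.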

is given in the proof of Theorem 4.1.) This thing is analogous to the classical set complement expressed by subtraction a set *A* to be complemented from the universe of discourse *X*, i.e.,

Standard Fuzzy Sets and some Many-Valued Logics 81

The operations max and min are clearly commutative. Based on the fact that the algebra (3.7) is De Morgan algebra, the algebra (4.1) is De Morgan algebra, too. We call this algebra *Zadeh algebra* because it is an algebraic description of standard fuzzy set theory, similarly as in classical set theory, a certain Boolean algebra (set algebra or algebra of characteristic functions) is the algebraic description of the system of classical sets. Now, we have the following

**Theorem 4.1.** *Zadeh algebra* <sup>Z</sup> <sup>=</sup> �**I***X*, max, min, <sup>¬</sup>, **<sup>0</sup>**, **<sup>1</sup>**� *is an algebraic approach to standard fuzzy*

(i) The operations max and min are exactly the same as in Zadeh's theory by Def. 2.2.

max{*μ*, *ν*} = max{*ν*, *μ*} and max{*μ*, max{*ν*, *τ*}} = max{max{*μ*, *ν*}, *τ*}

for all *<sup>μ</sup>*, *<sup>ν</sup>*, *<sup>τ</sup>* <sup>∈</sup> **<sup>I</sup>***<sup>X</sup>* because these laws clearly hold for the elements of **<sup>I</sup>**, and these laws can be embedded to **<sup>I</sup>***<sup>X</sup>* by pointwice calculation of values of the functions *<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***<sup>X</sup>* (*cf.*

(max{*μ*, **0**})(*x*) = max{*μ*(*x*), **0**(*x*)} = max{*μ*(*x*), 0} = *μ*(*x*)

(min{*μ*, **0**})(*x*) = *μ*(*x*)

(¬*μ*)(*x*)=(**1** − *μ*)(*x*) = **1**(*x*) − *μ*(*x*) = 1 − *μ*(*x*)

taking values from the unit interval [0, 1]. Hence, <sup>¬</sup>*<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, and <sup>¬</sup> is the complementarity

(vi) Clearly, Zadeh algebra <sup>Z</sup> satisfies the condition **<sup>0</sup>** �<sup>=</sup> **<sup>1</sup>**, by (iv). Hence, **<sup>2</sup>***<sup>X</sup>* <sup>⊂</sup> **<sup>I</sup>***X*. The constant functions **0** and **1** are the zero element and unit element of the algebra.

In classical set theory, an element either is or is not an element of a given set. In fuzzy set theory, we have three possibilities: a membership grade of an element in a given fuzzy set

For practical use, we may postulate Zadeh algebra by collecting the nevessary properties together. This means that we build Theor. 4.1 again using the main laws and properties

(v) For any membership function *<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, there exists <sup>¬</sup>*<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, such that for any *<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>*,

(iii) From Theorem 2.1, distributive laws follows for max and min on **I***<sup>X</sup>* because (**I***X*, max, min) is a distributive lattice and Zaheh-algebra (4.1) is De Morgan algebra.

(ii) The operations max and min are commutative and associative on **I***X*, i.e.,

Lowen (Lowen, 1996)). The same properties hold for min, too.

(iv) For all *<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, max{*μ*, **<sup>0</sup>**} <sup>=</sup> *<sup>μ</sup>* and min{*μ*, **<sup>1</sup>**} <sup>=</sup> *<sup>μ</sup>*, because for any *<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>*,

Zadeh (Zadeh, 1965) has also proved these laws.

Similarly, for any *x* ∈ *X*,

operation of Zadeh's theory.

equals either to zero or one, or is between them.

like postulates. The result is as follows.

This competes the proof.

*<sup>A</sup><sup>c</sup>* <sup>=</sup> *<sup>X</sup> <sup>A</sup>* where *<sup>A</sup>* <sup>⊂</sup> *<sup>X</sup>* and *<sup>A</sup><sup>c</sup>* is the complement of *<sup>A</sup>*.

*set theory.*

*Proof.*

Let *X* be a nonempty set. Consider a set of functions *μ* : *X* −→ **I**, i.e., the function set

$$\mathbb{I}^{X} = \{ \mu \mid \mu: X \longrightarrow \mathbb{I} \}\tag{3.6}$$

We extend the algebra of Theorem 3.1 into an *algebra of functions* (3.6)

$$L\_{\mathbf{I}^X} = \langle \mathbf{I}^X, \vee, \wedge, \neg, \mathbf{0}, \mathbf{1} \rangle \tag{3.7}$$

by pointwise calculation. Here **0** and **1** are constant functions, such that

$$\forall \mathbf{x} \in \mathbf{X}, \quad \mathbf{0}: \mathbf{x} \mapsto \mathbf{0}, \quad \mathbf{1}: \mathbf{x} \mapsto \mathbf{1} \tag{3.8}$$

The algebra (3.7) is a De Morgan algebra by its construction. This means that we calculate expressions *μ*(*x*) ∨ *ν*(*x*), *μ*(*x*) ∧ *ν*(*x*), ¬*μ*(*x*) etc. pointwise for any *x* ∈ *X*. Hence, the formulas (3.3) and (3.4) are applicable also in the function algebra (3.7).

As a special case, the algebra (3.7) has a subalgebra

$$L\_{\{0,1\}^X} = \langle \{0,1\}^X, \max, \min, \neg, \mathbf{0}, \mathbf{1}\rangle \tag{3.9}$$

being an algebra of characteristic functions of classical sets, *f* : *X* −→ {0, 1}. Sometimes we write **2** instead of {0, 1}, so, especially,

$$\mathbf{2}^{X} = \{ f\_{A} \mid f\_{A} : X \longrightarrow \{ 0, 1 \}, A \subset X \}$$

is the classical power set of a set *X* expressed by characteristic functions. The characteristic function of a given set *A* ⊂ *X*, *fA*, is the function

$$f\_A(x) = \begin{cases} 1 & \text{if} \quad x \in A, \\ 0 & \text{if} \quad x \notin A. \end{cases}$$

This function indicates by the value *fA*(*x*) = 1 that the element *x* ∈ *X* is an element of *A* and all the elements of *X* having the value *fA*(*x*) = 0 are elements of the complement of *A*. As a subalgebra of the algebra (3.7), the algebra (3.9) is a special De Morgan algebra, namely a *Boolean algebra*.
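As an aside, the characteristic-function view above can be sketched in a few lines of code (a minimal illustration, not from the chapter; the universe *X* and the set *A* are arbitrary example choices):

```python
# Characteristic function f_A of a crisp subset A of a finite universe X,
# viewed as a {0, 1}-valued membership function.
X = {1, 2, 3, 4, 5}          # universe of discourse (example choice)
A = {2, 4}                   # a crisp subset A of X (example choice)

def f_A(x):
    """Characteristic function: 1 if x is in A, 0 otherwise."""
    return 1 if x in A else 0

# The complement A^c = X \ A consists of the elements with value f_A(x) = 0:
complement = {x for x in X if f_A(x) == 0}
assert complement == X - A
```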

### **4. Algebra of standard fuzzy sets**

Consider the algebra (3.7). We may give a new label to it and use operation symbols max and min instead of ∨ and ∧, respectively, by the formulas (3.3) and (3.4). Hence, we have

$$\mathcal{Z}\_{\aleph\_1} = \langle \mathbf{I}^X, \max, \min, \neg, \mathbf{0}, \mathbf{1}\rangle \tag{4.1}$$

The subscript ℵ1 denotes the cardinality of the continuum; **I***<sup>X</sup>* is a continuum because **I** is a continuum, too. For short, we may refer to Zℵ1 by Z, without the subscript, if there is no possibility of confusion. The complementarity operation ¬ is a mapping

$$\neg : \mathbb{I}^X \longrightarrow \mathbb{I}^X \text{, } \mu \mapsto \mathbf{1} - \mu \tag{4.2}$$

Hence, the complement function of a function *μ* is **1** − *μ*, such that for all *x* ∈ *X*, (**1** − *μ*)(*x*) = **1**(*x*) − *μ*(*x*) = 1 − *μ*(*x*). (The proof that ¬ defined in this way really yields a membership function is given in the proof of Theorem 4.1.) This is analogous to the classical set complement, obtained by subtracting a set *A* from the universe of discourse *X*, i.e., *A<sup>c</sup>* = *X* \ *A*, where *A* ⊂ *X* and *A<sup>c</sup>* is the complement of *A*.
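The pointwise operations and the complement **1** − *μ* can be illustrated with a small sketch (the membership functions and sample points are example choices, not from the chapter); it also spot-checks one De Morgan law, ¬max{*μ*, *ν*} = min{¬*μ*, ¬*ν*}:

```python
# Membership functions as callables X -> [0, 1], with the Zadeh-algebra
# operations defined pointwise.
def f_max(mu, nu):            # pointwise max (join)
    return lambda x: max(mu(x), nu(x))

def f_min(mu, nu):            # pointwise min (meet)
    return lambda x: min(mu(x), nu(x))

def f_neg(mu):                # complement: (1 - mu)(x) = 1 - mu(x)
    return lambda x: 1 - mu(x)

mu = lambda x: x / 10         # example membership functions on X = {0, ..., 10}
nu = lambda x: 1 - x / 10

lhs = f_neg(f_max(mu, nu))            # ¬ max{mu, nu}
rhs = f_min(f_neg(mu), f_neg(nu))     # min{¬mu, ¬nu}
assert all(abs(lhs(x) - rhs(x)) < 1e-12 for x in range(11))
```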

The operations max and min are clearly commutative. Based on the fact that the algebra (3.7) is a De Morgan algebra, the algebra (4.1) is a De Morgan algebra, too. We call this algebra *Zadeh algebra* because it is an algebraic description of standard fuzzy set theory, just as, in classical set theory, a certain Boolean algebra (set algebra, or the algebra of characteristic functions) is the algebraic description of the system of classical sets. Now, we have the following

**Theorem 4.1.** *Zadeh algebra* Z = ⟨**I***<sup>X</sup>*, max, min, ¬, **0**, **1**⟩ *is an algebraic approach to standard fuzzy set theory.*

*Proof.*


(ii) The operations max and min are commutative and associative on **I***<sup>X</sup>*, i.e.,

max{*μ*, *ν*} = max{*ν*, *μ*} and max{*μ*, max{*ν*, *τ*}} = max{max{*μ*, *ν*}, *τ*}

for all *μ*, *ν*, *τ* ∈ **I***<sup>X</sup>*, because these laws clearly hold for the elements of **I**, and they can be embedded into **I***<sup>X</sup>* by pointwise calculation of the values of the functions *μ* ∈ **I***<sup>X</sup>* (*cf.* Lowen (Lowen, 1996)). The same properties hold for min, too.

(iii) From Theorem 2.1, the distributive laws follow for max and min on **I***<sup>X</sup>*, because (**I***<sup>X</sup>*, max, min) is a distributive lattice and Zadeh algebra (4.1) is a De Morgan algebra. Zadeh (Zadeh, 1965) has also proved these laws.

(iv) For all *μ* ∈ **I***<sup>X</sup>*, max{*μ*, **0**} = *μ* and min{*μ*, **1**} = *μ*, because for any *x* ∈ *X*,

$$(\max\{\mu, \mathbf{0}\})(x) = \max\{\mu(x), \mathbf{0}(x)\} = \max\{\mu(x), 0\} = \mu(x)$$

Similarly, for any *x* ∈ *X*,

$$(\min\{\mu, \mathbf{1}\})(x) = \mu(x)$$

(v) For any membership function *<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, there exists <sup>¬</sup>*<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, such that for any *<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>*,

$$(\neg \mu)(\mathbf{x}) = (\mathbf{1} - \mu)(\mathbf{x}) = \mathbf{1}(\mathbf{x}) - \mu(\mathbf{x}) = 1 - \mu(\mathbf{x})$$

taking values from the unit interval [0, 1]. Hence, <sup>¬</sup>*<sup>μ</sup>* <sup>∈</sup> **<sup>I</sup>***X*, and <sup>¬</sup> is the complementarity operation of Zadeh's theory.

(vi) Clearly, Zadeh algebra Z satisfies the condition **0** ≠ **1**, by (iv). Hence, **2***<sup>X</sup>* ⊂ **I***<sup>X</sup>*. The constant functions **0** and **1** are the zero element and unit element of the algebra.

This completes the proof.

In classical set theory, an element either is or is not an element of a given set. In fuzzy set theory, we have three possibilities: the membership grade of an element in a given fuzzy set equals zero, equals one, or lies strictly between them.

For practical use, we may postulate Zadeh algebra by collecting the necessary properties together. This means that we rebuild Theorem 4.1, using its main laws and properties as postulates. The result is as follows.



**Proposition 4.1.** *Let* **I***<sup>X</sup>* = {*μ* | *μ* : *X* −→ **I**} *be the set of all functions from X to* **I***, where the operations* max *and* min *are pointwise defined between membership functions, and* ¬*μ* =<sub>def</sub> **1** − *μ. Then* Z = ⟨**I***<sup>X</sup>*, max, min, ¬, **0**, **1**⟩ *is Zadeh algebra if it satisfies the conditions*

(Z1) *The operations* max *and* min *are commutative on* **I***<sup>X</sup>;*

(Z2) *The operations* max *and* min *are associative on* **I***<sup>X</sup>;*

(Z3) *The operations* max *and* min *are distributive to each other;*

(Z4) *The neutral elements of the operations* max *and* min *are* **0** *and* **1***, respectively, i.e., for all μ* ∈ **I***<sup>X</sup>,* max{*μ*, **0**} = *μ and* min{*μ*, **1**} = *μ;*

(Z5) *For any membership function μ* ∈ **I***<sup>X</sup>, there exists* ¬*μ* ∈ **I***<sup>X</sup>, such that* ¬*μ* = **1** − *μ;*

(Z6) **0** ≠ **1***.*

**Definition 4.1** (Kleene algebra)**.** A De Morgan algebra is a *Kleene algebra* if it satisfies the additional condition

(K) *x* ∧ ¬*x* ≤ *y* ∨ ¬*y*.

**Theorem 4.2.** *Zadeh algebra (4.1) is a Kleene algebra.*

*Proof.* Zadeh algebra is De Morgan algebra. The condition (K) in Zadeh algebra has the form

$$\min\{\mu, \neg \mu\} \le \max\{\nu, \neg \nu\}$$

for all *<sup>μ</sup>*, *<sup>ν</sup>* <sup>∈</sup> **<sup>I</sup>***X*.

To prove this, we can easily show that always min{*μ*, ¬*μ*} ≤ 1/2 and 1/2 ≤ max{*ν*, ¬*ν*} for arbitrary *μ*, *ν* ∈ **I***<sup>X</sup>*, from which the result follows immediately.

An alternative way is the easy task of checking the four cases (1°) *μ* ≤ 1/2, *ν* ≤ 1/2, (2°) *μ* ≤ 1/2, *ν* > 1/2, (3°) *μ* > 1/2, *ν* ≤ 1/2, and (4°) *μ* > 1/2, *ν* > 1/2, and finding that each of these cases satisfies the condition (K).
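The argument can also be spot-checked numerically (an illustration only; the sample grid is an arbitrary choice): for any membership degrees *m*, *n* ∈ [0, 1], min{*m*, 1 − *m*} ≤ 1/2 ≤ max{*n*, 1 − *n*}, which gives the condition (K) pointwise.

```python
# Spot check of the Kleene condition (K) at sampled membership degrees:
# min(m, 1 - m) <= 1/2 <= max(n, 1 - n) for all m, n in [0, 1].
samples = [i / 20 for i in range(21)]        # 0.0, 0.05, ..., 1.0
for m in samples:
    for n in samples:
        assert min(m, 1 - m) <= 0.5 <= max(n, 1 - n)
        assert min(m, 1 - m) <= max(n, 1 - n)   # condition (K) itself
```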

Zadeh algebra Zℵ<sup>1</sup> has subalgebras which are Zadeh algebras, too. A range of membership functions can be a suitable subset of the unit interval [0, 1], such that the postulates of Prop. 4.1 are satisfied. Here the suitability means that the set is closed under the operations of the algebra.

**Example 4.1.** Consider a set *A* = {0, 1} which is a subset of [0, 1] consisting of the extreme cases of the unit interval. The algebra ⟨*A<sup>X</sup>*, max, min, ¬, **0**, **1**⟩ satisfies the conditions of Zadeh algebra. This algebra is really an extreme case, because it is the Boolean algebra of characteristic functions of strict (i.e., usual) sets. It is a subalgebra of Zℵ1.

**Example 4.2.** Consider a set *A* = {0, 1/2, 1} being a subset of [0, 1]. The set *A* is the range of functions *μ* : *X* −→ *A*, where *X* ≠ ∅ is a set. These functions belong to the set **I***<sup>X</sup>*, by means of which *A<sup>X</sup>* is a subset of **I***<sup>X</sup>*. The conditions of Prop. 4.1 are clearly satisfied. Hence, Z<sub>3</sub> = ⟨*A<sup>X</sup>*, max, min, ¬, **0**, **1**⟩ is a 3-valued Zadeh algebra, and hence, a subalgebra of Zℵ1.

**Example 4.3.** Consider a set *A* consisting of all the rationals from the unit interval [0, 1]. The number of elements of *A* is countably infinite; hence, the cardinality of *A* is ℵ<sub>0</sub>. Making similar considerations as in the previous example, we verify that Zℵ0 = ⟨*A<sup>X</sup>*, max, min, ¬, **0**, **1**⟩ is a subalgebra of Zℵ1.
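The closure requirement behind these examples can be checked mechanically for the finite range of Example 4.2 (a sketch; exact rational arithmetic is used to avoid rounding issues):

```python
# Closure check: the truth-value set A = {0, 1/2, 1} of Example 4.2 is closed
# under max, min, and the complement a -> 1 - a, so the pointwise operations
# on A-valued membership functions stay inside A^X.
from fractions import Fraction

A = {Fraction(0), Fraction(1, 2), Fraction(1)}

assert all(max(a, b) in A and min(a, b) in A for a in A for b in A)
assert all(1 - a in A for a in A)      # ¬a = 1 - a stays in A

# The rationals in [0, 1] (Example 4.3) are likewise closed: max, min, and
# 1 - a of rationals in [0, 1] are again rationals in [0, 1].
```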


Zadeh algebra, as a special case of De Morgan algebras, invites closer analysis, and here we have carried out part of it. The author thinks that Prof. Zadeh did not necessarily have De Morgan algebras in mind when he wrote his seminal paper "Fuzzy Sets" (Zadeh, 1965). He approached the problem from another point of view, as can be seen in the construction of the paper: his leading idea was to model things in the eventful real world. In any case, it was a happy event that Prof. Zadeh's ideas met the mathematical frame we have considered here. No one else has been so successful in finding such a *right interpretation* for formal tools for modeling real-world incidences. At the same time, the *problem of interpretation* of many-valued logic got a solution: many-valued logic began to provide meaningful tools for analyzing and modeling things in the real world. The role of many-valued logics was very nominal before Prof. Zadeh invented fuzzy set theory; after this, the study of many-valued logic saw a new rise. Fuzzy set theory and fuzzy logic have helped researchers to find new aspects of already existing mathematical theories. This kind of work is now going on very strongly.

### **5. Where is the hiding-place of fuzziness?**

For example, imagine a set of beautiful women, and denote this set by *A*. There are women who do not belong to *A* with the highest grade 1. Such a woman *does not* have some features which would make her beautiful, but she may nevertheless have some of those features. An intuitive hint about a possible answer to the question "Where is the hiding-place of fuzziness?" can be found in the words "... *does not* ..." just above: it seems that a partial complementarity is somehow involved in this problem.

Let us compare Zadeh algebra with a general Boolean algebra, under the supposition that the binary operations are associative, because associativity holds in Zadeh algebra. The definition of this kind of Boolean algebra can be postulated as follows.

**Definition 5.1.** Let ∧ (*meet*) and ∨ (*join*) be binary operations, and ′ (*complement*) a unary operation on a set *B* (≠ ∅), and let **0** and **1** be elements of *B*, such that the following axioms hold:

(BA1) ∧ and ∨ are commutative in *B*, i.e., ∀*x*, *y* ∈ *B*, *x* ∨ *y* = *y* ∨ *x* and *x* ∧ *y* = *y* ∧ *x*;

(BA2) The operations ∧ and ∨ are associative in *B*;

(BA3) The operations ∧ and ∨ are distributive, i.e.,

$$\begin{aligned} \forall x, y, z \in B, \quad & x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z),\\ & x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z);\end{aligned}$$

(BA4) ∀*x* ∈ *B*, *x* ∨ **0** = *x* and *x* ∧ **1** = *x*, i.e., **0** and **1** are the neutral elements (or identity elements) of the operations ∨ and ∧;

(BA5) For every element *x* ∈ *B* there exists an element *x*′ ∈ *B*, such that *x* ∨ *x*′ = **1** and *x* ∧ *x*′ = **0**;

(BA6) For the elements **0** and **1** of *B* the condition **0** ≠ **1** holds.

Hence, the set *B* together with these operations forms a *Boolean algebra* B = (*B*, ∨, ∧, ′, **0**, **1**).

The only *structural difference* between these algebras is that between the axioms Z5 and BA5. BA5 is characteristic of the complement operation, but Z5 does not satisfy the conditions of complement. So, fuzziness lies in the axiom Z5. The influence of this axiom is that also other


values than only 0 and 1 can be considered as membership degrees. In Boolean algebras with the universe of discourse {0, 1}, the postulate BA5 does not cause conflicts, as the intermediate values may do if they are added to the universe.

Because the complement operation satisfies the conditions of strong negation, a Boolean algebra B = ({0, 1}, ∨, ∧, ′, **0**, **1**) is a special case of Z, i.e., the classical case is included in Z. Trivially, {0, 1} ⊂ [0, 1]. This means that crisp sets are special cases of fuzzy sets, as they should also be according to Zadeh's own theory. See also the proof of Theorem 2.1.

We may conclude that, formally, the core hiding-place of fuzziness is the statement Z5 in Proposition 4.1. In a fuzzy concept there is always something that *does not* hold, i.e., some missing particle the concept does not have. Hence, complementarity is somehow involved in a fuzzy concept.

### **6. Common features of many-valued logics based on Zadeh algebra**

We consider here some preliminary things common to several many-valued logics. The main purpose is to find a connection between Zadeh algebra and the structures of some many-valued logics. We consider only some propositional logics, because the main concepts we consider here are basic to higher-order many-valued logics, too. We restrict our considerations to structural properties only.

First, we need a formal language for our considerations. This language is that of propositional logic.

**Definition 6.1.** A propositional language L consists of

1. a set of propositional letters *p*0, *p*1, ..., *pk*, ... and
2. the truth-functional connectives '∧', '∨', and '¬'.

These symbols are the *alphabet* of the propositional language.

Usually only the so-called *primitive connectives* belong to the alphabet, but it is possible to choose some other connectives to the alphabet, too. Hence, we could drop either conjunction or disjunction from the alphabet if we like. Primitive connectives are connectives from which we can derive the other connectives.

In a standard nonclassical propositional language, the meanings of the connectives '∧' (*conjunction*), '∨' (*disjunction*), and '¬' (*negation*) can be given as follows: negation ¬ is a strong negation (i.e., it is a negation with the involution property ¬¬*p* ≡ *p*), conjunction ∧ is *glb* (greatest lower bound), and disjunction ∨ is *lub* (least upper bound). The symbols '≡' and '⇐⇒' are used as *meta-symbols* of equivalence, i.e., they do not belong to the alphabet of the *object language*, which is the language of the formal logic under consideration.

**Definition 6.2.** Well-formed formulas of L are given as follows:

$$\alpha ::= p\_k \mid \neg\varphi \mid \varphi \wedge \psi \mid \varphi \vee \psi. \tag{6.1}$$

In this recursive production system of well-formed formulas (wff's, for short), the symbol *pk* represents any propositional letter and lower case Greek letters are labels of any atomic or connected wff's. Hence, *α* is a label for any wff, and similarly *ϕ* and *ψ* represent any


propositional wff's. Starting with propositional letters, we can combine them by connectives in the way shown by the production system (6.1). And finally, we can combine any formulas according to the production system.

We can refer to the propositional letters also by lower case letters in general, and to combined formulas by lower case Greek letters or by usual capital letters. These letters belong to the metalanguage we use when we discuss and describe the object language. Here we use English, equipped with some formal symbols (so-called meta-symbols), as a metalanguage.

The definitions 6.1 and 6.2 above defines the language of propositional logic. Second, we consider some central semantical concepts being necessary for our consideration. This means that we will not present the whole machinery of formal semantics of standard many-valued logics.

We have two important functions, *valuation* and *truth-function* we need in our consideration. They are defined as follows.

**Definition 6.3.** *Valuation v* is a function

$$v: \mathbf{Prop} \longrightarrow \mathbb{I} \tag{6.2}$$

that associates truth values to propositional letters. **Prop** is a set of propositional variables.

*Truth-function* is a function

$$V\_{\mathfrak{n}} : \mathbb{I}^{\mathfrak{n}} \longrightarrow \mathbb{I}, \quad \mathfrak{n} = 1, 2, 3, \dots, \tag{6.3}$$

where *n* is the number of propositional variables *p*, *q*, . . . in the formula defining a truth function.

In general, truth-functions are functions of several variables defined on the *n*-tuple of the set of truth values, [0, 1] *<sup>n</sup>* where the independent variables are proposition letters. The subindices *n* = 1, 2, 3, . . . are usually dropped. Actually, a propositional formula *ϕ* itself is a truth-function. Suppose that a formula *ϕ* consists of the propositional letters *p*, *q*, and *r*. Then we may write *ϕ* = *V*(*p*, *q*,*r*). The equality sign is used only between truth-functions and truth values.

In connected formulas, valuations of propositional variables give the values for the variables of the corresponding truth-function presented by the connected formula. Hence, a "valuation" of a connected formula is the value of the corresponding truth-function. Thus, to evaluate the truth value of a whole connected formula corresponding to a given valuation of the propositional variables, we calculate the value of the truth-function, where the given valuation *v* first determines the values of the arguments of the truth-function. We may denote the truth value of a connected formula *ϕ* by *V*(*ϕ*), being like a valuation depending on a given valuation *v* for propositional letters.

**Example 6.1.** Evaluate the truth value of the formula *p* ∧ (¬*q* ∨ *r*) with regard to a given valuation *v* for *p*, *q*, and *r*. Actually the formula is a truth-function *f*(*p*, *q*, *r*) = *p* ∧ (¬*q* ∨ *r*) where *p*, *q*, and *r* obtain their values from [0, 1]. These values are *v*(*p*), *v*(*q*), and *v*(*r*). Now, the truth value of the formula, given by this valuation *v*, is

$$V(p \land (\neg q \lor r)) = v(p) \land (\neg v(q) \lor v(r))$$


Suppose that *v* is a valuation where *v*(*p*) = 0.5, *v*(*q*) = 0.3, and *v*(*r*) = 1. Hence,

$$V(p \land (\neg q \lor r)) = 0.5 \land ((1 - 0.3) \lor 1) = 0.5 \land (0.7 \lor 1) = 0.5 \land 1 = 0.5$$

If we have two wff's representing the same state of affairs, we use the meta-equivalence sign '≡' or '⇐⇒' between them, because the formulas are equivalent to each other, not identical. For example, let the wff *ϕ* be the formula *p* ∧ (¬*q* ∨ *r*). So, we can write *ϕ* ≡ *p* ∧ (¬*q* ∨ *r*), or *ϕ* ⇐⇒ *p* ∧ (¬*q* ∨ *r*), to denote that we use the abbreviation *ϕ* for the formula *p* ∧ (¬*q* ∨ *r*). Another case is that we have two formulas equivalent to each other, for example, ¬*p* ∨ ¬*q* ≡ ¬(*p* ∧ *q*). This equivalency describes one of De Morgan's laws. However, the expression

$$\forall x, y \in [0, 1], \quad \neg x \lor \neg y = \neg(x \land y)$$

emphasizes that two truth-functions ¬*x* ∨ ¬*y* and ¬(*x* ∧ *y*) are identical.
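Since the identity above is a claim about truth-functions on [0, 1], it can be spot-checked numerically; the grid below is only an illustration, not a proof:

```python
# Spot-check (illustrative, not a proof) that the truth-functions
# ¬x ∨ ¬y and ¬(x ∧ y) agree on a grid of points in [0, 1],
# with ∨ = max, ∧ = min, and ¬x = 1 - x.
for i in range(11):
    for j in range(11):
        x, y = i / 10, j / 10
        assert max(1 - x, 1 - y) == 1 - min(x, y)
```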

Instead of propositional letters, we prefer to use "usual" variable symbols as variables of a truth-function, to avoid possible confusion.

We are interested in the logics where the *evaluation rules* for these connectives are

$$V(p \lor q) = \max\{v(p), v(q)\}\tag{6.4}$$

$$V(p \wedge q) = \min\{v(p), v(q)\}\tag{6.5}$$

$$V(\neg p) = 1 - v(p) \tag{6.6}$$

where *v*(*p*), *v*(*q*) ∈ [0, 1], or *v*(*p*), *v*(*q*) ∈ *A* where *A* is a suitable subset of [0, 1].
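Under these rules, the evaluation in Example 6.1 can be replayed mechanically. The following sketch (the function names are illustrative, not from the chapter) encodes (6.4), (6.5), and (6.6) as Python functions and reproduces the value 0.5:

```python
# Evaluation rules (6.4)-(6.6): disjunction = max, conjunction = min,
# negation = 1 - x. Function names are illustrative only.
def f_or(x, y):   # V(p ∨ q) = max{v(p), v(q)}
    return max(x, y)

def f_and(x, y):  # V(p ∧ q) = min{v(p), v(q)}
    return min(x, y)

def f_not(x):     # V(¬p) = 1 - v(p)
    return 1 - x

# Example 6.1: V(p ∧ (¬q ∨ r)) with v(p) = 0.5, v(q) = 0.3, v(r) = 1
v = {"p": 0.5, "q": 0.3, "r": 1.0}
value = f_and(v["p"], f_or(f_not(v["q"]), v["r"]))
# value == 0.5
```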

Evaluation rules are rules for evaluating truth values to connected logical formulas in a given logic.

We must remember that not all logics are truth-functional. For example, modal logics are non-truth-functional.

In practice, we need some other connectives, too. Two of them are the connectives *implication* and *equivalency*.

The way to choose implication separates the logics based on Zadeh algebra or any De Morgan algebra. Hence, implication must be presented by means of disjunction and negation, or by means of conjunction and negation. There are several ways to define different implications from other connectives, depending on the logic in question. We consider these things in the case of each logic to be considered below.

The formulas (6.4), (6.5), and (6.6) somehow emphasize the relationship to algebraic construction. We have two binary operations and one unary operation defined on a nonempty set, just as in a usual algebraic system. Additionally, the binary operations are combined together, for example, being distributive. The formulas (6.4), (6.5), and (6.6) are the same as Zadeh's operations defined on the set of fuzzy sets. The bridge between standard fuzzy sets and some many-valued logics seems to be obvious. Having this kind of motivation, we continue our construction of the bridge between standard fuzzy sets and some many-valued logics.

Consider Zadeh algebra $\mathcal{Z}_{\aleph_1} = \langle \mathbf{I}^X, \max, \min, \neg, \mathbf{0}, \mathbf{1} \rangle$ (*cf.* formula (4.1)). Now the question is whether there is a counterpart to this algebra in the scope of many-valued logic. According to


the evaluation rules (6.4), (6.5), and (6.6), the operations max, min, and ¬ exist at least in the logics having these evaluation rules. Let us compare the power set of fuzzy sets of the set *X*, i.e., the set $\mathbf{I}^X$, to the set of all valuations $v : \mathbf{Prop} \longrightarrow \mathbf{I}$. Hence, the set of all valuations is $\mathbf{I}^{\mathbf{Prop}}$. Especially, **1** and **0** are constant valuations, such that **1** gives the truth value 1 to every propositional letter, and similarly, **0** gives the truth value 0. Hence, we also have the neutral elements corresponding to those in Zadeh algebra. It seems that if we replace the set $\mathbf{I}^X$ by $\mathbf{I}^{\mathbf{Prop}}$ then we have a special Zadeh algebra, namely, say, *propositional algebra*

$$\mathcal{L}\_{\aleph\_1} = \langle \mathbf{I}^{\mathbf{Prop}}, \mathbf{max}, \min, \neg, \mathbf{0}, \mathbf{1} \rangle. \tag{6.7}$$

The values of valuations are truth values and those of membership functions are membership grades. Can these two interpretations of the elements of [0, 1] be considered similar? According to formal consideration, we say *yes*. Both values are obtained from the same set, namely the unit interval [0, 1], and the constructions of both algebras are exactly the same. On the other hand, membership grades are in principle subjective opinions about the membership of an element in a given set. As for truth values, a *degree of truth* of a given propositional letter in a given situation depends on the state of affairs associated with this situation. But there is a valuation for every state of affairs in any situation, representing a suitable degree of truth expressed by a number from [0, 1]. Hence, these degrees of truth correspond to suitable membership grades, even so that for any valuation there exists a membership function that is identical with the valuation. Hence, these two apparently different interpretations can be considered the same. This means that we can interpret the values of the functions of the algebra (6.7) as truth values, or more accurately, degrees of truth.

### **7. Description of Kleene's logic**

For historical reasons, we consider a piece of Kleene's 3-valued logic. S. C. Kleene was Zadeh's logic teacher, and it is natural that Zadeh compared his concept of fuzzy set with Kleene's 3-valued logic. Zadeh (Zadeh, 1965, pp. 341-342) gives the following comment:

"Note that the notion of 'belonging', which plays a fundamental role in the case of ordinary sets, does not have the same role in the case of fuzzy sets. Thus, it is not meaningful to speak of a point *x* 'belonging' to a fuzzy set *A* except in the trivial sense of *fA*(*x*) being positive. Less trivially, one can introduce two levels *α* and *β* (0 < *α* < 1, 0 < *β* < 1, *α* > *β*) and agree to say that (1) "*x* belongs to *A*" if *fA*(*x*) ≥ *α*; (2) "*x* does not belong to *A*" if *fA*(*x*) ≤ *β*; and (3) "*x* has an intermediate status relative to *A*" if *β* < *fA*(*x*) < *α*. This leads to a three-valued logic (Kleene, 1952) with three truth values *T* (*fA*(*x*) ≥ *α*), *F* (*fA*(*x*) ≤ *β*), and *U* (*β* < *fA*(*x*) < *α*)."

The symbols of the truth values of Kleene's 3-valued logic are *T* (*true*), *U* (*unknown*), and *F* (*false*). In the literature, there are also some alternative symbols for the intermediate truth value. For example, Rescher (Rescher, 1969) uses the symbol *I*.

Kleene introduced his 3-valued logic in 1938. We denote it by **K**3. In order to describe Kleene's logic, we refer to Rescher (Rescher, 1969), pp. 34-36. He writes:

"In Kleene's system, a proposition is to bear the third truth-value *I* not for fact-related, ontological reasons but for knowledge-related, epistemological ones: it is not to be excluded that the proposition may *in fact* be true or false, but it is merely *unknown* or undeterminable what its specific truth status may be."


In **K**3, we have the following truth value evaluation rules for the connectives negation ¬, conjunction ∧, and disjunction ∨:

$$V(\neg p) = T - v(p), \tag{7.1}$$

$$V(p \wedge q) = \min\{v(p), v(q)\},\tag{7.2}$$

$$V(p \lor q) = \max\{v(p), v(q)\}. \tag{7.3}$$

Two of these connectives can form a set of *primitive connectives*. One of the primitives must be negation. Hence, for example, the connectives negation and disjunction can be chosen as primitives, and all the other connectives can be defined by means of these primitive connectives. The alternative choice of primitives is negation and conjunction. Hence, we can define the nonprimitive one by means of negation and the fact that disjunction and conjunction are dual (i.e., by using a suitable De Morgan's law). The implication defined by means of negation and disjunction is given by the formula (7.4).

We see immediately that there is a strong analogy between the basic operations of fuzzy sets (*cf.* Def. 2.2) and these three connectives of Kleene's 3-valued logic. In addition to this, an analogy can be found between Kleene's valuations and Zadeh's membership functions, too, although they have different ranges. Zadeh's comment above connects these two concepts "membership" (*μ* : *X* −→ **I**) and "valuation" (*v* : **Prop** −→ {*F*, *U*, *T*}) together. We can compare the set {*F*, *U*, *T*} with the set {0, 1/2, 1} where *F* = 0, *U* = 1/2, and *T* = 1. Hence, by analogy of the sets, we can understand some arithmetic operations in some evaluation rules.

Kleene defined the implication of his 3-valued logic, denoted by '⊃', analogously to material implication:

$$p \supset q \stackrel{\text{def}}{\iff} \neg p \lor q \tag{7.4}$$

hence, the evaluation rule of *p* ⊃ *q* is

$$p \supset q = \max\{T - v(p), v(q)\}. \tag{7.5}$$

We may construct the truth tables according to the evaluation rules. Rescher tells us that Kleene motivated the construction of his truth tables in terms of a mathematical application. He has in mind the case of a mathematical predicate *P* (i.e., a propositional function) of a variable *x* ranging over a domain *D* where "*P*(*x*)" is defined for only a part of this domain. For example, we might have the condition

$$P(x) \quad \text{iff} \quad 1 \le \frac{1}{x} \le 2.$$

Here *P*(*x*) will be:

(1) *true* if *x* lies within the range from 1/2 to 1,

(2) *undefined* (or undetermined) if *x* = 0,

(3) *false* in all other cases.
Kleene presented his truth tables to formulate the rules of combination by logical connectives for such propositional functions. He writes:

"From this standpoint, the meaning of *Q* ∨ *R* is brought out clearly by the statement in words: *Q* ∨ *R* is true, if *Q* is true (here nothing is said about *R*) or if *R* is true (similarly); false, if *Q* and *R* are both false; defined only in these cases (and hence undefined, otherwise)."<sup>1</sup>

<sup>1</sup> Kleene, *Introduction to Metamathematics* (1952).

Clearly, the algebraic approach to **K**<sup>3</sup> is Kleene algebra, i.e., a 3-valued Zadeh algebra with the property (K).
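Under the encoding *F* = 0, *U* = 1/2, *T* = 1 discussed above, the rules (7.1)-(7.3) and (7.5) become ordinary arithmetic. The following sketch (illustrative names, assumed encoding) reproduces Kleene's verbal rule for *Q* ∨ *R*:

```python
# Kleene's strong 3-valued connectives with F, U, T encoded as 0, 0.5, 1,
# so that (7.1)-(7.3) and (7.5) are plain arithmetic on these numbers.
F, U, T = 0.0, 0.5, 1.0

def k_not(x):     return T - x           # (7.1)
def k_and(x, y):  return min(x, y)       # (7.2)
def k_or(x, y):   return max(x, y)       # (7.3)
def k_impl(x, y): return max(T - x, y)   # (7.5), the rule for p ⊃ q

# Q ∨ R: true if either disjunct is true, false if both are false,
# and undefined (U) otherwise, matching Kleene's verbal rule.
assert k_or(U, T) == T
assert k_or(F, F) == F
assert k_or(U, F) == U
```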


*Kleene-Dienes many-valued logic* is an extension of **<sup>K</sup>**<sup>3</sup> into **<sup>K</sup>**ℵ<sup>1</sup> having the set of truth values [0, 1]. The evaluation rules for conjunction, disjunction, and negation are the same as in the formulas (6.4), (6.5), and (6.6) above.

Implication of Kleene-Dienes many-valued logic is defined by

$$p \to q \stackrel{\text{def}}{\iff} \neg p \lor q \tag{7.6}$$

being in accordance with the implication of **K**3. This means that the evaluation rule for implication is as follows. For any *x*, *y* ∈ [0, 1],

$$x \to y = \max\{1 - x, y\} \tag{7.7}$$

Now, the connective *equivalence* is defined in the usual way:

$$p \leftrightarrow q \stackrel{\text{def}}{\Longleftrightarrow} (p \to q) \land (q \to p) \tag{7.8}$$

Hence, the evaluation rule for equivalence is

$$\forall x, y \in [0, 1], \quad x \leftrightarrow y = \min\{x \to y, y \to x\} \tag{7.9}$$

The equations (6.4), (6.5), (6.6), (7.7), and (7.9) are the truth value evaluation rules for disjunction, conjunction, negation, implication, and equivalence, respectively, of Kleene-Dienes many-valued logic with the set of truth values [0, 1].

The implication operation of Kleene-Dienes many-valued logic is a typical example of a so-called *S-implication*. Another example is the implication operation of classical logic. The general principle for S-implication is just the formula (7.6).
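As a quick numerical illustration (the function names are mine, not the chapter's), the S-implication (7.7) and the equivalence (7.9) can be computed directly:

```python
# Kleene-Dienes S-implication (7.7) and equivalence (7.9) on [0, 1].
def kd_impl(x, y):   # x -> y = max{1 - x, y}
    return max(1 - x, y)

def kd_equiv(x, y):  # x <-> y = min{x -> y, y -> x}
    return min(kd_impl(x, y), kd_impl(y, x))

r1 = kd_impl(0.8, 0.4)   # max(0.2, 0.4) = 0.4
r2 = kd_equiv(0.8, 0.4)  # min(0.4, max(0.6, 0.8)) = 0.4
```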

### **8. On Łukasiewicz' many-valued logic**

We begin with Łukasiewicz' many-valued logic Łℵ<sup>1</sup> having the closed unit interval [0, 1] as the set of truth values.2

As we know, Łukasiewicz chose the connectives of *negation* and *implication* as primitives. This is a remarkable difference, for example, between Kleene's logic and Łukasiewicz logic. Hence, the connection between standard fuzzy set theory and Łℵ<sup>1</sup> cannot be seen immediately.

Let *v* be any valuation of Łℵ<sup>1</sup>; then the truth value evaluation rules for negation and implication are

$$v(\neg p) = 1 - v(p) \tag{\text{Neg.}}$$

$$v(p \to q) = \min\{1, 1 - v(p) + v(q)\}\tag{\text{Impl.}}$$

<sup>2</sup> *Cf.* Rescher (Rescher, 1969), p.36, and 337.
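A small sketch of the rules (Neg.) and (Impl.) follows; the names are illustrative, and the sample truth values are chosen so that the floating-point arithmetic stays exact:

```python
# Łukasiewicz negation and implication on [0, 1].
def l_not(x):      # v(¬p) = 1 - v(p)
    return 1 - x

def l_impl(x, y):  # v(p → q) = min{1, 1 - v(p) + v(q)}
    return min(1, 1 - x + y)

r1 = l_impl(0.75, 0.25)  # min(1, 0.5) = 0.5
r2 = l_impl(0.25, 0.75)  # min(1, 1.5) = 1
```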


By means of these connectives, Łukasiewicz defined the other connectives by the rules

$$p \lor q \stackrel{\text{def}}{\iff} (p \to q) \to q \tag{Disj.}$$

$$p \land q \stackrel{\text{def}}{\iff} \neg(\neg p \lor \neg q) \tag{Conj.}$$

$$p \leftrightarrow q \stackrel{\text{def}}{\Longleftrightarrow} (p \to q) \land (q \to p) \tag{Eq.}$$

The truth value evaluation rules for these derived connectives are

$$\max\{v(p), v(q)\} \quad \text{for} \quad p \lor q \tag{8.1}$$

$$\min\{v(p), v(q)\} \quad \text{for} \quad p \land q \tag{8.2}$$

$$1 - |v(p) - v(q)| \quad \text{for} \quad p \leftrightarrow q \tag{8.3}$$

for any valuation *<sup>v</sup>* of Łℵ<sup>1</sup> .
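The rules (8.1) and (8.3) can be checked against the defining rules (Disj.) and (Eq.) by direct computation; the grid test below is an illustration under the (Impl.) rule, not a proof:

```python
# Check on a grid that, under Łukasiewicz implication, (p → q) → q
# evaluates to max{v(p), v(q)} (8.1), and (p → q) ∧ (q → p) to
# 1 - |v(p) - v(q)| (8.3), with ∧ evaluated as min.
def l_impl(x, y):
    return min(1, 1 - x + y)

for i in range(11):
    for j in range(11):
        x, y = i / 10, j / 10
        assert abs(l_impl(l_impl(x, y), y) - max(x, y)) < 1e-9
        assert abs(min(l_impl(x, y), l_impl(y, x)) - (1 - abs(x - y))) < 1e-9
```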

In Zadeh algebra we have the operations representing disjunction, conjunction, and negation as given. Negation in Zadeh algebra has the same construction as that in Łukasiewicz' logic Łℵ<sup>1</sup>, so we need not do anything with it. Now our task is to derive algebraically the *implication* of Łℵ<sup>1</sup> by means of these three other connectives. For this we use the operations of Zadeh algebra. Actually, we need only the complementarity and max operations in our solution. After solving this problem we know that standard fuzzy sets and Łℵ<sup>1</sup> fit together completely, i.e., we can derive all the connectives of Łℵ<sup>1</sup> in terms of Zadeh algebra. The final result is given in Proposition 8.3. This is the main task in this section.

Consider again the special case of Zadeh algebra (6.7)

$$\mathcal{L}\_{\aleph\_1} = \langle \mathbf{I}^{\mathbf{Prop}}, \mathbf{max}, \min, \neg, \mathbf{0}, \mathbf{1} \rangle.$$

From the considerations above, we know that

• The unary operation ¬ is a complementarity operation with the property of involution.
We observed in Section 3 that [0, 1] is a metric space with the natural metric distance (3.2)

$$d(x, y) = |x - y|, \quad x, y \in [0, 1].$$

This formula satisfies the general definition of the concept *metric*. We need it in the following consideration where we manipulate expressions involving maxima and minima.

In manipulating maxima and minima, the consideration can sometimes be carried out more easily by using the following expressions for the max and min operations:

$$\max\{x, y\} = \frac{x + y + |x - y|}{2}, \qquad \min\{x, y\} = \frac{x + y - |x - y|}{2} \tag{8.4}$$

These formulas hold on the set of real numbers **R**, and especially on the unit interval [0, 1].
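These identities are easy to check numerically. The sketch below (the grid step is an arbitrary choice) verifies the expressions (8.4) on a sample of [0, 1]:

```python
# Numerical check of the arithmetical expressions (8.4) for max and min.
xs = [i / 100 for i in range(101)]  # sample points of [0, 1]

for x in xs:
    for y in xs:
        # max{x, y} = (x + y + |x - y|) / 2
        assert abs(max(x, y) - (x + y + abs(x - y)) / 2) < 1e-12
        # min{x, y} = (x + y - |x - y|) / 2
        assert abs(min(x, y) - (x + y - abs(x - y)) / 2) < 1e-12
```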

First, consider the case where the operations min and max are used in the form of the formulas (8.4), and ¬ is defined in the usual way: ¬*x* = 1 − *x*.


**Proposition 8.1.** *Suppose x*, *y* ∈ [0, 1]*. Then De Morgan's laws hold for* min{*x*, *y*} *and* max{*x*, *y*}*.*

*Proof.* Consider the operations min{*x*, *y*} and max{*x*, *y*}, where *x* and *y* are variables taking their values from the interval [0, 1]. Using the arithmetical formula for the min operation (i.e., the expression for min in (8.4)), we have

$$\begin{split} \min\{x, y\} &= \frac{x + y - |x - y|}{2} \\ &= \frac{2 - 1 - 1 + x + y - |1 - 1 + x - y|}{2} \\ &= 1 - \frac{1 + 1 - x - y + |1 - 1 + x - y|}{2} \\ &= 1 - \frac{(1 - x) + (1 - y) + |(1 - y) - (1 - x)|}{2} \\ &= 1 - \max\{(1 - x), (1 - y)\}. \end{split} \tag{8.5}$$

From the formula (8.5), replacing *x* by 1 − *x* and *y* by 1 − *y* and then solving for max{*x*, *y*}, we obtain the formula

$$\max\{x, y\} = 1 - \min\{1 - x, 1 - y\}. \tag{8.6}$$

The formulas (8.5) and (8.6) show that De Morgan's laws hold for max and min, and that they are duals of each other.

This completes the proof.

Łukasiewicz knew that the operations max and min are dual to each other. This property is easily observed in the classical special case, i.e., when crisp sets are presented by characteristic functions. A general proof is easily given using the expressions (8.4) for max and min whenever a distance metric is defined in the universe of discourse, which always holds at least for the real numbers.
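The dualities (8.5) and (8.6) can also be confirmed numerically on a sample grid of [0, 1]:

```python
# Check of the De Morgan dualities (8.5) and (8.6) on a grid of [0, 1].
xs = [i / 50 for i in range(51)]

for x in xs:
    for y in xs:
        # (8.5): min{x, y} = 1 - max{1 - x, 1 - y}
        assert abs(min(x, y) - (1 - max(1 - x, 1 - y))) < 1e-12
        # (8.6): max{x, y} = 1 - min{1 - x, 1 - y}
        assert abs(max(x, y) - (1 - min(1 - x, 1 - y))) < 1e-12
```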

Second, consider the connection between max and the Łukasiewicz implication using Zadeh algebra (similar considerations are made in Mattila (Mattila, 2005), but there the following Proposition 8.2 is not completely proved).

**Proposition 8.2.** *For all x*, *y* ∈ [0, 1]*,*

$$\max(x, y) = (x \underset{\text{Ł}}{\to} y) \underset{\text{Ł}}{\to} y, \tag{8.7}$$

*where x* →<sup>Ł</sup> *y is the Łukasiewicz implication.*

*Proof.* Consider disjunction operation *x* ∨ *y* = max(*x*, *y*). Because 0 ≤ *x*, *y* ≤ 1, using the arithmetical formula (8.4) for max, we have


$$\begin{split} \max(x, y) &= \min\{1, \max(x, y)\} = \min\left\{1, \frac{x + y + |x - y|}{2}\right\} \\ &= \min\left\{1, \frac{2 - 1 - 1 + x + 2y - y + |1 - 1 + x - y|}{2}\right\} \\ &= \min\left\{1, 1 - \frac{1 + (1 - x + y) - |1 - (1 - x + y)|}{2} + y\right\} \\ &= \min\left\{1, 1 - \min(1, 1 - x + y) + y\right\} \end{split} \tag{8.8}$$

On the other hand, in Łℵ<sup>1</sup> disjunction is defined by (Disj.), i.e., by the formula

$$x \lor y = (x \underset{\text{Ł}}{\to} y) \underset{\text{Ł}}{\to} y \tag{8.9}$$

When we apply the evaluation rule of implication (Impl.) to the right side of the equation (8.9) we get the equation

$$x \lor y = (x \underset{\text{Ł}}{\to} y) \underset{\text{Ł}}{\to} y = \min\left\{1, 1 - \min(1, 1 - x + y) + y\right\} = \max(x, y) \tag{8.10}$$

by (8.8). Hence, the assertion (8.7) follows, and the proof is complete.
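Proposition 8.2 is easy to check numerically; the sketch below samples [0, 1] and compares both sides of (8.7) (the helper name `l_impl` is illustrative):

```python
# Numerical check of Proposition 8.2: max(x, y) = (x ->_Ł y) ->_Ł y,
# with the Łukasiewicz implication x ->_Ł y = min(1, 1 - x + y).
def l_impl(x, y):
    return min(1.0, 1.0 - x + y)

xs = [i / 100 for i in range(101)]
for x in xs:
    for y in xs:
        assert abs(max(x, y) - l_impl(l_impl(x, y), y)) < 1e-12
```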

Of course, Łukasiewicz must have known the connection between the maximum operation and his truth evaluation formula (Impl.) of the implication, because without this knowledge he would not have been sure that everything fits well together in his logic. How he inferred it, however, is not known. Perhaps he verified it in some special cases by truth tables with *n* truth values, *n* finite.

The result of the proof of the formula (8.7) shows that from the join operation max of our algebra we deduce a formula that expresses the rule of Łukasiewicz' implication, and this formula is the truth value evaluation rule in Łℵ<sup>1</sup> . Hence, we have shown that from our algebra (6.7) it is possible to derive similar rules as the truth value evaluation rules in Łℵ<sup>1</sup> .

Hence, we may conclude our main result in a formal way:

**Proposition 8.3.** *If the cases*

*1.* ¬*x* = 1 − *x;*
*2. x* ∨ *y* = max(*x*, *y*)*,*

*hold, then the other cases*

*3. x* ∧ *y* = min(*x*, *y*)*;*
*4. x* → *y* = min(1, 1 − *x* + *y*)*;*
*5. x* ↔ *y* = (*x* → *y*) ∧ (*y* → *x*) = 1 − |*x* − *y*|*,*

*can be derived based on Zadeh algebra (6.7).*

*Proof.* The case 3 follows from the case 2 by duality. (Actually, this operation already belongs to Zadeh algebra, and hence to Łℵ<sup>1</sup> .)

The case 4 follows from the case 2 by Prop. 8.2 as follows. When we consider the equation (8.10) in the proof of Prop. 8.2, we find two min-structures corresponding to the evaluation rule of implication, one of them nested inside the whole formula. If we denote the inner min-structure min(1, 1 − *x* + *y*) by *z*, then the outer min-structure is min(1, 1 − *z* + *y*), i.e., the min-structures are formally the same. The implication operations in (8.9) are situated in the same way. Hence, min(1, 1 − *x* + *y*) must be the evaluation rule of *x* →<sup>Ł</sup> *y*, by (Impl.).

The case 5 is deduced as follows:


$$\begin{split} x \leftrightarrow y &= \min\{(x \to y), (y \to x)\} \\ &= \frac{(x \to y) + (y \to x) - |(x \to y) - (y \to x)|}{2} \\ &= \frac{\min(1, 1 - x + y) + \min(1, 1 - y + x)}{2} \\ &\quad - \frac{|\min(1, 1 - x + y) - \min(1, 1 - y + x)|}{2} \\ &= \frac{4 - 2|x - y| - \big|{-2x + 2y} - |x - y| + |x - y|\big|}{4} \\ &= \frac{4 - 4|x - y|}{4} = 1 - |x - y|. \end{split}$$

Hence, the connectives of Łℵ<sup>1</sup> can be created by Zadeh algebra.
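As a check, the derivation chain of Proposition 8.3 can be replayed numerically: starting from negation and max alone, the sketch below (the helper names are illustrative, not from the text) builds the remaining connectives and compares them with the evaluation rules:

```python
# Sketch: derive the connectives of Ł-aleph-1 from the Zadeh algebra
# operations neg and max only, then compare with the evaluation rules.
def neg(x):                        # case 1: complementation
    return 1.0 - x

def disj(x, y):                    # case 2: join = max
    return max(x, y)

def conj(x, y):                    # case 3: min, by De Morgan duality (8.5);
    return neg(disj(neg(x), neg(y)))   # the formula works for any reals

def impl(x, y):                    # case 4: min(1, 1 - x + y), min built from max
    return conj(1.0, 1.0 - x + y)

def equiv(x, y):                   # case 5: (x -> y) ∧ (y -> x)
    return conj(impl(x, y), impl(y, x))

xs = [i / 50 for i in range(51)]
for x in xs:
    for y in xs:
        assert abs(conj(x, y) - min(x, y)) < 1e-12
        assert abs(impl(x, y) - min(1.0, 1.0 - x + y)) < 1e-12
        assert abs(equiv(x, y) - (1.0 - abs(x - y))) < 1e-12
```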

These cases are similar to the truth value evaluation rules for compound formulas in Łℵ<sup>1</sup>. Hence, if we want to use an algebraic approach to Łℵ<sup>1</sup>, we need not follow the mainstream described in Section 9, which uses the operations of MV-algebras to study the connections between the connectives in Łℵ<sup>1</sup>.

However, in the next section we also give a very brief description of the alternative approach, starting from the definition of a general MV-algebra. It is the mainstream in this research topic, but a circuitous route in the case of Łukasiewicz logic.

### **9. The relationship between Łukasiewicz logic and MV-algebras**

We now briefly open another way of creating an algebra for Łukasiewicz logic. We adopt the definition and some properties of MV-algebras from Cignoli et al. (Cignoli et al., 2000). Other sources are Bergmann (Bergmann, 2008) and Hájek (Hájek, 1998).

**Definition 9.1.** An *MV-algebra* is an algebra ⟨*A*, ⊕, ¬, 0⟩ with a binary operation ⊕, a unary operation ¬ and a constant 0 satisfying the following equations:

(MV1) *x* ⊕ (*y* ⊕ *z*) = (*x* ⊕ *y*) ⊕ *z*
(MV2) *x* ⊕ *y* = *y* ⊕ *x*
(MV3) *x* ⊕ 0 = *x*
(MV4) ¬¬*x* = *x*
(MV5) *x* ⊕ ¬0 = ¬0
(MV6) ¬(¬*x* ⊕ *y*) ⊕ *y* = ¬(¬*y* ⊕ *x*) ⊕ *x*

A non-empty set *A* is the universe of the MV-algebra ⟨*A*, ⊕, ¬, 0⟩.


In particular, axioms (MV1)–(MV3) state that ⟨*A*, ⊕, 0⟩ is an *abelian monoid*.

Given an MV-algebra *A* and a set *X*, the set *A<sup>X</sup>* of all functions *f* : *X* −→ *A* becomes an MV-algebra if the operations ⊕ and ¬ and the element 0 are defined pointwise. It is obvious that the unit interval [0, 1] is an MV-algebra. The continuous functions from [0, 1] into [0, 1] form a subalgebra of the MV-algebra [0, 1]<sup>[0,1]</sup>.

On each MV-algebra *A* we define the constant 1 and the operations ⊙ and ⊖ as follows:

$$1 \stackrel{\text{def}}{=} \neg 0\,, \tag{9.1}$$

$$x \odot y \stackrel{\text{def}}{=} \neg(\neg x \oplus \neg y)\,, \tag{9.2}$$

$$x \ominus y \stackrel{\text{def}}{=} x \odot \neg y\,. \tag{9.3}$$

An MV-algebra is nontrivial if and only if 0 ≠ 1. The following identities are immediate consequences of (MV4):

(MV7) ¬1 = 0 ,
(MV8) *x* ⊕ *y* = ¬(¬*x* ⊙ ¬*y*).

Axioms (MV5) and (MV6) can now be written as:

(MV5′) *x* ⊕ 1 = 1 ,
(MV6′) (*x* ⊖ *y*) ⊕ *y* = (*y* ⊖ *x*) ⊕ *x* .

Setting *y* = ¬0 in (MV6) we obtain

(MV9) *x* ⊕ ¬*x* = 1 ,

In the MV-algebra ⟨[0, 1], ⊕, ¬, 0⟩ we have

$$x \odot y = \max(0, x + y - 1) \tag{9.4}$$

$$x \ominus y = \max(0, x - y) \tag{9.5}$$

*Notation*: Following common usage, we consider the ¬ operation more binding than any other operation, and the ⊙ operation more binding than ⊕ and ⊖.
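As a sanity check, the axioms can be verified numerically on the MV-algebra ⟨[0, 1], ⊕, ¬, 0⟩. The concrete sum *x* ⊕ *y* = min(1, *x* + *y*) is an assumption in this sketch; it is the choice consistent with (9.2) and (9.4), since ¬(¬*x* ⊙ ¬*y*) = 1 − max(0, 1 − *x* − *y*) = min(1, *x* + *y*).

```python
# Sketch: the MV-algebra on [0, 1], with the (assumed) standard sum
# x ⊕ y = min(1, x + y); ⊙ and ⊖ are (9.4) and (9.5) from the text.
def neg(x): return 1.0 - x
def oplus(x, y): return min(1.0, x + y)          # assumed standard sum
def odot(x, y): return max(0.0, x + y - 1.0)     # (9.4)
def ominus(x, y): return max(0.0, x - y)         # (9.5)

pts = [i / 10 for i in range(11)]
for x in pts:
    for y in pts:
        assert abs(oplus(x, y) - oplus(y, x)) < 1e-12               # (MV2)
        assert abs(oplus(x, 0.0) - x) < 1e-12                       # (MV3)
        assert abs(neg(neg(x)) - x) < 1e-12                         # (MV4)
        assert abs(oplus(x, neg(0.0)) - neg(0.0)) < 1e-12           # (MV5)
        assert abs(oplus(neg(oplus(neg(x), y)), y)
                   - oplus(neg(oplus(neg(y), x)), x)) < 1e-12       # (MV6)
        assert abs(ominus(x, y) - odot(x, neg(y))) < 1e-12          # (9.3)
        assert abs(oplus(x, neg(x)) - 1.0) < 1e-12                  # (MV9)
        for z in pts:
            assert abs(oplus(x, oplus(y, z))
                       - oplus(oplus(x, y), z)) < 1e-12             # (MV1)
```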

Consider the question about the connection between Łukasiewicz implication and operations in MV-algebra.

Given an MV-algebra ⟨*A*, ⊕, ¬, 0⟩ and a set *X*, the set

$$A^X = \{ f \mid f: X \longrightarrow A \}$$

becomes an MV-algebra if the operations ⊕ and ¬ and the element 0 are defined pointwise (Cignoli et al. (Cignoli et al., 2000), p. 8). To define 0 pointwise means here that the result is the constant function **0** : *x* ↦ 0 for any *x* in the universe of that algebra.

Further, Łukasiewicz implication

$$x \to y \stackrel{\text{def}}{=} \min\{1, 1 - x + y\}$$

can be expressed in MV-algebra in the form (*cf.* Cignoli et al. (Cignoli et al., 2000), p. 78)

$$x \to y = \neg x \oplus y \tag{9.6}$$

whence,


$$x \oplus y = \neg x \to y. \tag{9.7}$$

The equation (9.6) shows that between the Łukasiewicz implication and the operation ⊕ there is a connection similar to that in S-implications, which are defined by means of disjunction. But ⊕ is not disjunction in Łukasiewicz logic. In many cases ⊕ is interpreted as disjunction, but on the unit interval it gives values different from those of the Łukasiewicz disjunction operation max. It is possible to define the operations max and min by means of the operations of MV-algebra, but the result is then usually a logic with additional operations having no reasonable interpretations (*cf.*, for example, the logic Fuzzy*<sup>L</sup>* in Bergmann's book (Bergmann, 2008); some comments on Fuzzy*<sup>L</sup>* are given in (Mattila, 2010)).
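The difference can be made concrete in a few lines (again assuming the standard sum *x* ⊕ *y* = min(1, *x* + *y*) on [0, 1]): (9.6) and (9.7) hold, while ⊕ and max disagree, e.g. 0.25 ⊕ 0.5 = 0.75 but max(0.25, 0.5) = 0.5.

```python
# Sketch: ⊕ (assumed x ⊕ y = min(1, x + y)) is not the disjunction max,
# although (9.6) and (9.7) hold for the Łukasiewicz implication.
def neg(x): return 1.0 - x
def oplus(x, y): return min(1.0, x + y)
def l_impl(x, y): return min(1.0, 1.0 - x + y)   # Łukasiewicz implication

xs = [i / 100 for i in range(101)]
for x in xs:
    for y in xs:
        assert abs(l_impl(x, y) - oplus(neg(x), y)) < 1e-12   # (9.6)
        assert abs(oplus(x, y) - l_impl(neg(x), y)) < 1e-12   # (9.7)

print(oplus(0.25, 0.5), max(0.25, 0.5))   # 0.75 0.5 -- different operations
```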

Wajsberg created an algebra, called a *Wajsberg algebra* (*W-algebra*, for short), which is known to serve as an algebraic approach to Łukasiewicz infinite-valued logic. The following lemma (*cf.* Cignoli et al. (Cignoli et al., 2000), p. 83) gives the connection between Wajsberg algebras and MV-algebras.

**Lemma 9.1.** *Let A be an MV-algebra, and put x* → *y* := ¬*x* ⊕ *y and* 1 := ¬0*. Then* ⟨*A*, →, 1⟩ *is a W-algebra.*

The binary operation of a W-algebra is an implication operation. The algebra has the unary operation ¬ because it is needed to create the unit element 1, and the zero element 0 belongs to the algebra because it yields the unit element by means of negation. Hence, the algebra is in a form suitable for Łukasiewicz logic: we have counterparts of the primitive connectives of Łukasiewicz logic as the operations of W-algebra. The other connectives can be created in W-algebra in the same way as Łukasiewicz introduced them.

One consequence of this consideration is that in MV-algebras the operations max and min can be created from the operations of W-algebra, i.e., from the primitive connectives of Łukasiewicz logic, by means of the evaluation rule (9.6).
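This consequence is easy to replay numerically; the sketch below (helper names `join` and `meet` are illustrative) builds max and min from the W-algebra operations → and ¬ alone, using (9.6) and Proposition 8.2:

```python
# Sketch: max and min built from the W-algebra primitives -> and ¬ alone,
# in the MV-algebra on [0, 1].
def neg(x): return 1.0 - x
def impl(x, y): return min(1.0, 1.0 - x + y)   # x -> y = ¬x ⊕ y, rule (9.6)

def join(x, y):                  # max, by Proposition 8.2: (x -> y) -> y
    return impl(impl(x, y), y)

def meet(x, y):                  # min, by De Morgan duality
    return neg(join(neg(x), neg(y)))

xs = [i / 100 for i in range(101)]
for x in xs:
    for y in xs:
        assert abs(join(x, y) - max(x, y)) < 1e-12
        assert abs(meet(x, y) - min(x, y)) < 1e-12
```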

### **10. Conclusion**

The main problem we considered here is to find connections between standard fuzzy sets and the Łukasiewicz logic Łℵ<sup>1</sup> and to find a suitable algebra for it, especially because the primitive connectives are negation and implication. In De Morgan algebras, the counterparts of the logical connectives disjunction, conjunction, and negation appear as the algebraic operations. It cannot immediately be seen how the Łukasiewicz implication, which belongs to the primitive connectives, is derived from disjunction (max) and negation (¬*x* = 1 − *x*, *x* ∈ **I**). We have done it here using a special De Morgan algebra, namely Zadeh algebra. Hence, the connection between standard fuzzy sets and the Łukasiewicz logic Łℵ<sup>1</sup> becomes clear. The key result, where the Łukasiewicz implication is derived algebraically from disjunction and negation, is given in Proposition 8.3.

Kleene's logic is considered because of its close connection to standard fuzzy sets, already motivated by Zadeh. Sections 2, 3, 4, 6, and 8 give the method we have used for creating our results.


Section 9 tells very briefly how others approach this topic. That way is an alternative to ours. MV-algebra is quite general, and many algebras, like Boolean algebras and also De Morgan algebras, belong to its scope. The reader may become familiar with this topic, for example, by reading the books of Bergmann, Cignoli et al., and Hájek mentioned in the References. A lot of other material is available, too.

The alternative way in which we have considered the topic here is not totally new, because these things are considered in (Mattila, 2004), (Mattila, 2005), and (Mattila, 2010), but our key result, Proposition 8.3, is. Proposition 8.2 is the core of this result; it makes the connection between the max operation and the Łukasiewicz implication clear by means of Zadeh algebra. The expressions (8.4) for the operations max and min are not generally used, but they make the consideration easy. As can be seen from the references, De Morgan algebras have been well known for a relatively long time, having, in the long run, different alternative names, like "quasi-Boolean algebras" and "soft algebras".

H. Rasiowa has considered *implicative algebras* and *implication algebras* in her book (Rasiowa, 1974). Hence, future research may be based on these algebras. The connections between implicative/implication algebras and De Morgan algebras or MV-algebras, restricted to many-valued or modal logics, are also included in the future research.

### **11. References**


Bergmann, M. (2008). *An Introduction to Many-Valued and Fuzzy Logic*, Cambridge University Press, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi.

Cignoli, L. & D'Ottaviano, M. & Mundici, D. (2000). *Algebraic Foundations of Many-valued Reasoning*, Kluwer Academic Publishers, Dordrecht, Boston, London.

Hájek, P. (1998). *Metamathematics of Fuzzy Logic*, Kluwer Academic Publishers, Dordrecht,

Lowen, R. (1996). *Fuzzy Set Theory. Basic Concepts, Techniques and Bibliography*, Kluwer

Mattila, J. K. (2004). Zadeh algebras as a syntactical approach to fuzzy sets, *Current Issues in Data and Knowledge Engineering*, Springer-Verlag, Warszawa, Poland, pp. 343–349.

Mattila, J. K. (2005). On Łukasiewicz modifier logic, *Journal of Advanced Computational*

Mattila, J. K. (2009). Many-valuation, modality, and fuzziness, *in* Seising R. (ed.), *Views on*

Mattila, J. K. (2010). On Łukasiewicz' infinite-valued logic and Fuzzy*L*, *in*, *KES 2010*

Negoiţă, C. V. & Ralescu, D. A. (1975). *Applications of Fuzzy Sets to Systems Analysis*,


The use of Fuzzy Logic Systems (FLS) for control applications has increased since they became popular in the 80's. After Mendel showed in the 90's how uncertainty can be computed in order to achieve more robust systems, Type-2 Fuzzy Logic Systems (T2FLS) came into the focus of researchers and recently became a new research topic.

At the same time, Batyrshin et al. demonstrated that parametric conjunctions can be useful for tuning an FLS in order to achieve better performance beyond set-parameter tuning. In signal processing and system identification, this lets the designer add degrees of freedom for adjusting a general FLS.

This chapter presents the parametric T2FLS and shows that this new FLS is a very useful option for sharper approximations in control. In order to verify the advantages of the parametric T2FLS, the Ball and Plate System is used as a testbench. This case study helps us to understand how a parametric conjunction affects the controller behavior in measures like response time or overshoot. This application also lets us observe how the controller works in the presence of noise.

### **2. Parametric T2FLS**

A Parametric Type-2 Fuzzy Logic System (PT2FLS) is a general FLS which can be fully adjusted through a single parameter or multiple parameters in order to improve its overall performance. This means that a PT2FLS has several options to adjust set parameters (i.e. membership function parameters), rule parameters and output set parameters. Fig. 1 shows the structure of a PT2FLS, which is almost identical to that of a general T2FLS.

In this figure, the Defuzzification stage comprises the Output Processing block and the Defuzzifier, as Mendel stated in (Karnik, Mendel et al. 1999). For Interval Type-2 Fuzzy Logic Systems (IT2FLS), this block represents only the centroid calculation, for example using the WM Algorithm (Wu and Mendel 2002). As can be seen, a dashed arrow crosses every stage; this means that every stage is tunable for optimization purposes.

A general fuzzy system is a function in which all input variables are mapped to the output variables according to the knowledge base defined by the rules. The rule set represents the configuration of the T2FLS.

$$\begin{aligned} F^l &= \left[ \overline{F}^l;\, \underline{F}^l \right] = \left[ \overline{\mu}_{\tilde{A}_i^l}(x_i);\, \underline{\mu}_{\tilde{A}_i^l}(x_i) \right] \\ \overline{F}^l &= \overline{\mu}_{\tilde{A}_1^l}(x_1) \wedge \overline{\mu}_{\tilde{A}_2^l}(x_2) \wedge \dots \wedge \overline{\mu}_{\tilde{A}_m^l}(x_m) \\ \underline{F}^l &= \underline{\mu}_{\tilde{A}_1^l}(x_1) \wedge \underline{\mu}_{\tilde{A}_2^l}(x_2) \wedge \dots \wedge \underline{\mu}_{\tilde{A}_m^l}(x_m) \end{aligned}$$

$$\overline{F}^l = T\left(\overline{\mu}_{\tilde{A}_1^l}(x_1), \overline{\mu}_{\tilde{A}_2^l}(x_2), \dots, \overline{\mu}_{\tilde{A}_m^l}(x_m); \mathbf{p}_r^l\right) = T\left(\overline{\mu}_{\tilde{A}_i^l}(x_i); \mathbf{p}_r^l\right) \tag{1}$$

$$\underline{F}^l = T\left(\underline{\mu}_{\tilde{A}_1^l}(x_1), \underline{\mu}_{\tilde{A}_2^l}(x_2), \dots, \underline{\mu}_{\tilde{A}_m^l}(x_m); \mathbf{p}_r^l\right) = T\left(\underline{\mu}_{\tilde{A}_i^l}(x_i); \mathbf{p}_r^l\right) \tag{2}$$

$$\overline{\mu}_{\tilde{B}^l}(y) = \sqcup_{\mathbf{x} \in X} \left( T\left( \overline{\mu}_{\tilde{A}_i^l}(x_i); \mathbf{p}_r^l \right) \right) \tag{3}$$

$$\underline{\mu}_{\tilde{B}^l}(y) = \sqcup_{\mathbf{x} \in X} \left( T\left( \underline{\mu}_{\tilde{A}_i^l}(x_i); \mathbf{p}_r^l \right) \right) \tag{4}$$
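As a concrete illustration of the firing interval in (1)-(2), the following Python sketch computes the upper and lower firing degrees of one rule with T = min. The Gaussian membership bounds with uncertain width and the numeric values are illustrative assumptions, not the chapter's exact configuration.

```python
# Sketch: upper/lower firing degrees of one rule in an interval type-2 FLS,
# following (1)-(2) with T = min. Membership bounds are illustrative:
# interval type-2 Gaussians with uncertain width (sigma_lower < sigma_upper).
import math

def gauss(x, c, sigma):
    """Gaussian membership value centered at c with width sigma."""
    return math.exp(-0.5 * ((x - c) / sigma) ** 2)

def firing_interval(inputs, centers, sig_lower, sig_upper):
    """Return the [lower, upper] firing degree of a rule whose antecedent
    sets are interval type-2 Gaussians, using min as the t-norm."""
    upper = min(gauss(x, c, sig_upper) for x, c in zip(inputs, centers))
    lower = min(gauss(x, c, sig_lower) for x, c in zip(inputs, centers))
    return lower, upper

lo, up = firing_interval([0.2, -0.1], [0.0, 0.0], 0.3, 0.5)
assert 0.0 < lo <= up < 1.0   # the interval is always ordered
```

For a Gaussian, a larger width always yields a larger membership value, so the lower bound never exceeds the upper bound.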


$$T(a,b) = T_{ij}(a,b) \quad \text{if } (a,b) \in D_{ij};\ i,j \in G \tag{5}$$

$$T_{ij} = T_{ji} \tag{6}$$

$$T(a,b,p) = \begin{cases} T\_{11}(a,b), & (a \le p) \land (b \le p) \\ T\_{21}(a,b), & (a > p) \land (b \le p) \\ T\_{12}(a,b), & (a \le p) \land (b > p) \\ T\_{22}(a,b), & (a > p) \land (b > p) \end{cases} \tag{7}$$
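The 2×2 case in (7) can be sketched directly in Python. The choice of component t-norms below (product in $D_{11}$, min elsewhere) is an illustrative assumption; any t-norms satisfying the monotone-sum conditions could be substituted.

```python
# Sketch of the 2x2 (p)-monotone sum in (7): the unit square is split
# at threshold p and a different t-norm acts on each region D_ij.

def t_product(a, b):
    return a * b

def t_min(a, b):
    return min(a, b)

def monotone_sum_2x2(a, b, p, t11=t_product, t21=t_min, t12=t_min, t22=t_min):
    """Parametric conjunction built as a (p)-monotone sum of t-norms."""
    if a <= p and b <= p:
        return t11(a, b)          # region D11
    if a > p and b <= p:
        return t21(a, b)          # region D21
    if a <= p and b > p:
        return t12(a, b)          # region D12
    return t22(a, b)              # region D22: a > p and b > p

assert monotone_sum_2x2(0.2, 0.3, 0.5) == 0.2 * 0.3   # D11 -> product
assert monotone_sum_2x2(0.8, 0.9, 0.5) == 0.8         # D22 -> min
```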

$$T(a,b,p) = \begin{cases} T_{11}(a,b), & (a \le p) \land (b \le p) \\ T_{21}(a,b), & (a > p) \land (a \le 1-p) \land (b \le p) \\ T_{31}(a,b), & (a > 1-p) \land (b \le p) \\ T_{12}(a,b), & (a \le p) \land (b > p) \land (b \le 1-p) \\ T_{22}(a,b), & (a > p) \land (a \le 1-p) \land (b > p) \land (b \le 1-p) \\ T_{32}(a,b), & (a > 1-p) \land (b > p) \land (b \le 1-p) \\ T_{13}(a,b), & (a \le p) \land (b > 1-p) \\ T_{23}(a,b), & (a > p) \land (a \le 1-p) \land (b > 1-p) \\ T_{33}(a,b), & (a > 1-p) \land (b > 1-p) \end{cases} \tag{8}$$
 
(Figure: partition of the unit square into the regions $D_{ij}$ delimited by the thresholds $p$ and $1-p$, over which the component t-norms $T_{ij}$ of the monotone sum (8) act.)


| **Error \ Change** | **NA** | **NM** | **Z** | **PM** | **PA** |
|---|---|---|---|---|---|
| **NA** | NM | Z | PM | PA | PA |
| **NM** | NM | NM | PA | PA | PA |
| **Z** | NA | NA | Z | PA | PA |
| **PM** | NA | NA | NA | PM | PM |
| **PA** | NA | NA | NM | Z | PM |

Table 1. Optimal rule set of the B&P System with T1FLC described in (Moreno-Armendariz, Rubio-Espino et al. 2010), used for the PT2FLC; rows give the *error*, columns the position *change*, and cells the *tilt change* output.


in axis Y. The servomotors perform the adequate tilt over both axes. Every tilt value is calculated by its corresponding PT2FLC using the position error and the position change; the position change is the differential of the plant feedback, i.e. of the current position.

It is noteworthy that the PT2FLC hardware has not been implemented and tested for this application; only simulations are performed in order to show the advantages of using a PT2FLC in control applications. The mechanical model proposed in (Moreno-Armendariz, Rubio-Espino et al. 2010) allows designing and testing new and improved controllers, which is suitable future work, thanks to the flexibility of the FPGA.

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -9.81 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ -6.1313 \times 10^4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -9.81 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -6.1313 \times 10^4 & 0 & 0 & 0 \end{bmatrix}$$

$$B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 5.6618 \times 10^4 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 5.6618 \times 10^4 \end{bmatrix} \tag{9}$$

$$C = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix} \qquad D = \begin{bmatrix} 0 \end{bmatrix}$$

$$x' = Ax + Bu$$

$$y = Cx$$

The B&P System in (9) is a linearized state-space model, the same as described in (Moreno-Armendariz, Rubio-Espino et al. 2010). With (10), the current velocity, acceleration and position in axis X can be calculated.

$$\begin{aligned} v(k) &= \frac{\{\mathbf{x}(k) - \mathbf{x}(k-1)\}}{T} \\ a(k) &= v(k) - v(k-1) \\ \mathbf{x}\_e(k+1) &= \mathbf{x}(k) + \frac{v(k)}{T} + \frac{a(k)T^2}{2} \\ e(k+1) &= \mathbf{x}\_d - \mathbf{x}\_e(k+1) \end{aligned} \tag{10}$$
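The estimator in (10) can be sketched in Python, following the equations exactly as written in the text; the numeric inputs below are illustrative, and only the sampling period T = 50 ms is taken from the chapter.

```python
# Sketch of the estimator in (10), as written in the text:
# T is the sampling period (50 ms) and x_d the desired position.
T = 0.05   # sampling period in seconds (50 ms)

def estimate(x_k, x_km1, v_km1, x_d):
    """One step of the position/error estimator of (10)."""
    v = (x_k - x_km1) / T                 # current velocity component
    a = v - v_km1                         # current acceleration component
    x_e = x_k + v / T + a * T ** 2 / 2    # predicted next position, per (10)
    e = x_d - x_e                         # estimated error
    return v, a, x_e, e

# Illustrative values: ball moved from 0.09 to 0.10, previous velocity 0.15.
v, a, x_e, e = estimate(x_k=0.10, x_km1=0.09, v_km1=0.15, x_d=0.0)
```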

The vision system described in (Moreno-Armendariz, Rubio-Espino et al. 2010) uses a sampling time $T$ of 50 ms, capturing and processing a single image in that period. After the vision system processes the image, the FPGA calculates the current position of the ball in axis X, $x(k)$, where $k$ is the current sample. Once the position is known, it is possible to find the current velocity component $v(k)$ and the current acceleration component $a(k)$ of the


ball over axis X. If it is assumed that the velocity and the acceleration of the ball are constant for small values of $T$, it is possible to estimate the next position of the ball, $x_e(k+1)$, and finally, given the desired position $x_d$, the estimated error $e(k+1)$.



Fig. 7. IT2FLS Simulator for B&P System

It is proved that the B&P System is decoupled over its two axes (Moreno-Armendariz, Rubio-Espino et al. 2010), so the equations in (10) take the same form for axis Y. Fig. 5 shows that the B&P system block has two inputs and two outputs for our control purposes; each in-out pair corresponds to one axis.

The T1FLC proposed by (Moreno-Armendariz, Rubio-Espino et al. 2010) has two inputs and one output. Every variable has 5 FS (Fig. 6) associated with the linguistic labels "high positive" (PA), "medium positive" (PM), "zero or null" (Z), "medium negative" (NM), and "high negative" (NA). The rule set is described in Table 1.
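For concreteness, five such labels can be modeled as a triangular partition in Python. The universe, supports and peaks below are illustrative assumptions; the exact shapes of Fig. 6 are not reproduced here.

```python
# Illustrative stand-in for five linguistic labels: a uniform triangular
# partition of a normalized universe [-1, 1]. The actual supports used in
# the cited work may differ.

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

LABELS = {                      # label: (left foot, peak, right foot)
    "NA": (-1.5, -1.0, -0.5),
    "NM": (-1.0, -0.5,  0.0),
    "Z":  (-0.5,  0.0,  0.5),
    "PM": ( 0.0,  0.5,  1.0),
    "PA": ( 0.5,  1.0,  1.5),
}

def fuzzify(x):
    """Membership degree of x in every label."""
    return {name: tri(x, *abc) for name, abc in LABELS.items()}

m = fuzzify(0.25)               # halfway between the peaks of Z and PM
assert abs(m["Z"] - 0.5) < 1e-12 and abs(m["PM"] - 0.5) < 1e-12
```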

The T1FLC controls the tilt of the plate using the information that the FPGA takes from the camera, calculating the current position with (10) under perfect environment conditions. But what happens when some external force (e.g. weather) compromises the system stability? Some equivalent phenomena may be introduced to the plate. For example, the illumination


variation due to light incidence over the plate, an unbalanced motor tied to the plate, a low-quality image sensor or interference noise added to the processed image may be introduced as external disturbances.



In the initial experiments, a noise-free optimization is performed, and similar results are achieved so the system can be compared with the T1FLC. For the noise tests, only an unbalanced motor tied to the plate is considered, which makes the plate tremble while a sine trajectory is performed, analyzing a single axis. This experiment helps us to verify the noise-proof ability of the T2FLC.

### **4. Experimental results**

The FS distribution, i.e., the FS shape parameters, may give rise to several characteristic phenomena that the expert must take into account when designing applied-to-control fuzzy systems, so-called Fuzzy Logic Controllers (FLC). As described in (Moreno-Armendariz, Rubio-Espino et al. 2010), the authors found an optimal FS distribution where the FLC shows a great performance, converging in 3.8 seconds. However, when this same configuration is used, some phenomena arise once T2FS are introduced.

Starting from the initial optimal set distribution and without considering any possible noise influence, several configurations were tested by modifying the FOU of every set, either starting from a T1FS (without FOU) and widening it as much as possible, or starting from a very wide FOU and collapsing it until it becomes a T1FS. The phenomena related to these configurations are described in Table 2. In general, when a T2FS is introduced a certain level of overshoot appears, no matter which variable is modified; so, if every variable has a T2FS, the expert has to deal with a nonlinear aggregation of overshoot, steady-state error or offset (SSE) and ripple when tuning a PT2FLC, which might be a complicated task.

Every experiment used an implemented simulator for IT2FLS. With a few instructions, any parametric IT2FLS can be constructed, and the expert may choose the set shapes, several parametric conjunctions and defuzzification options (Fig. 7).

| **Experiment** | **Overshoot** | **SSE** | **Ripple** |
|---|---|---|---|
| When all sets in every variable are T1, except the variable whose set FOUs are increasing from zero. | Yes | No | No |
| When all sets in input variable *error* are T2 and their FOUs are decreasing until they become T1. All other variable sets are as wide as possible. | No | No | Yes |
| When all sets in input variable *change* are T2 and their FOUs are decreasing until they become T1. All other variable sets are as wide as possible. | Yes | No | No |
| When all sets in output variable *tilt* are T2 and their FOUs are decreasing until they become T1. All other variable sets are as wide as possible. | No | Yes | No |

Table 2. Phenomena associated with the FOU of every set in the system

Fig. 8. Second approximation of PT2FLC modifying the FOU of sets

Fig. 9. Parametric Fuzzy Conjunction using (p) −Monotone Sum with parameter p = 0.25

In the first experiments (Fig. 6), the FOU of every set was re-adjusted while leaving the set distribution intact, and it was found that only a very thin FOU in every input set yields a good convergence without overshoot and other phenomena. But what is the sense of having a very short FOU, like a T1FS, if it will not capture the associated uncertainties of the system? So, there should be a way of tuning the PT2FLC without changing this initial optimal set distribution.

| **Rule** | **Parameter Description** | **Overshoot** | **SSE** | **Ripple** |
|---|---|---|---|---|
| 1, 2, 5, 6, 10, 16, 17, 20, 21, 22, 25 | These rules have no influence on the final response, so their parameter values may be arbitrary. These rules may be quantified using non-parametric conjunctions. | No | No | No |
| 3, 4, 9, 11, 12, 15 | These rules have a very slight influence on the final response. Some of them reduce the ripple, but the effect is negligible. These rules may also be quantified using non-parametric conjunctions. | No | No | Yes |
| 13b, 14c, 18a, 23b, 24b | These rules have a very positive influence on the final response, especially the parameter value of rule 18. | Yes | No | No |
| 7b, 19c | These rules increase or decrease the offset of the final response, but could add some overshoot. | No | Yes | No |
| 8b | This rule helps to stretch the ripple slightly but might also be useful to reduce small ripple. | No | No | Yes |

Table 3. Phenomena associated with the rule operator of every implication in the inference

In the second experiments, the FOU of every set in every variable was modified, and a very close approximation of the time response was found, as described in Fig. 8. This configuration has FOUs in every input and output variable as wide as necessary (with uniform spread) to support variations in error of up to 0.0075 radians, in change of up to 0.01 radians per second and in tilt of up to 0.004 radians, all around the mean of every point of its corresponding set and variable.

As can be seen, every set exhibits a wider FOU and the response time has increased to over 5 seconds. Also, some overshoot and ripple are present, but the reference is reached, so the SSE is eliminated. This is the first best approximation using the same optimal distribution of sets, although this does not mean that there could not be another set distribution for this application.

As stated in (Batyrshin, Rudas et al. 2009), a parametric operator may help to tune a T1FLC through the inference step, so every rule of the knowledge base related to the implication of the premises might use a parametric conjunction. In the third experiments, the commutative $(p)$-monotone sum of conjunctions (11) is used, where the following conjunctions are assigned to the sections: $T_{11} = T_d$ is the drastic intersection, $T_{21} = T_{12} = T_p$ is the product and $T_{22} = T_m$ is the minimum, using (7) as follows:

$$T(\mathbf{x}, \mathbf{y}, p) = \begin{cases} T\_d(\mathbf{x}, \mathbf{y}), \ (\mathbf{x} \le p) \land (\mathbf{y} \le p) \\ T\_p(\mathbf{x}, \mathbf{y}), \ [(\mathbf{x} > p) \land (\mathbf{y} \le p)] \lor [(\mathbf{x} \le p) \land (\mathbf{y} > p)] \\ T\_m(\mathbf{x}, \mathbf{y}), \ (\mathbf{x} > p) \land (\mathbf{y} > p) \end{cases} \tag{11}$$
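The conjunction in (11) can be sketched directly in Python; this is a minimal illustration of the drastic/product/min monotone sum, not the chapter's simulator code.

```python
# Sketch of the commutative (p)-monotone sum in (11): drastic intersection
# below p, product in the mixed regions, minimum above p.

def t_drastic(a, b):
    """Drastic intersection: min(a, b) if max(a, b) == 1, else 0."""
    return min(a, b) if max(a, b) == 1.0 else 0.0

def t_param(a, b, p):
    """Parametric conjunction of (11)."""
    if a <= p and b <= p:
        return t_drastic(a, b)    # T_d region
    if a > p and b > p:
        return min(a, b)          # T_m region
    return a * b                  # T_p: exactly one operand exceeds p

# p = 0 collapses to min; p = 1 collapses to the drastic intersection,
# matching the limiting behaviors discussed in the text.
assert t_param(0.4, 0.7, 0.0) == 0.4
assert t_param(0.4, 0.7, 1.0) == 0.0
```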

From (11), it can be assured that when the parameter $p=0$ the conjunction behaves as the minimum t-norm, and when $p=1$ it behaves as the drastic product t-norm, as can be seen in Fig. 9. If $p$ takes any other value in the interval (0,1), the conjunction shows a drastic, product or minimum t-norm behavior depending on the membership values of the operands. The resulting behavior of this monotone sum might help to diminish the fuzzy implication between two membership degrees of premises, and therefore to reduce the resulting overshoot of the system and reach the reference faster. The next task is to choose the value of every conjunction parameter.

Besides the optimal FS distribution, the same rule set of (Moreno-Armendariz, Rubio-Espino et al. 2010), shown in Table 1, is used in order to show that any T1FLC can be extended to a PT2FLC. So, $r = 25$ rules define the T1FLC configuration, which means that there are 25 parametric conjunctions and therefore 25 parameters. When searching for an optimal value of every $p$, it is recommended to use an optimization algorithm, in order to obtain optimal values and avoid the waste of time of calculating them manually.

According to (11), the initial values of $\mathbf{p}_r$ make the conjunctions behave like min, i.e.

$$\mathbf{p}_r = \{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0\}$$

The following values are proposed, obtained when the optimization was performed with heuristics to get optimal rule parameters, i.e.

$$\mathbf{p}_r = \{0, 0.25, 0.25, 0, 0, 0, 0, 0.2, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0.2, 0, 0, 0, 0, 0.25, 0.25, 0\} \tag{12}$$




Fig. 10. Rule 18 parameter distribution for 43 experiments


Fig. 12. Histograms for transient measures (overshoot, rise time, peak time and settling time) for rule parameter 18

Also, it was found that every rule parameter has a full, medium or null influence on the final response. Table 3 shows the analysis made for every implication. For example, with rule 18 the overshoot can be diminished when the PT2FLC is simply trying to drive the system to a specific plate tilt.


| **Transient Values** | **Min** | **Max** | **Mean** | **Variance** |
|---|---|---|---|---|
| Overshoot (rads) | 0.0038 | 0.0529 | 0.0086 | 6.5267e-05 |
| Delay Time (s) | 0.95 | 0.95 | 0.95 | 0 |
| Rise Time (s) | 1.05 | 1.2 | 1.1465 | 8.5992e-04 |
| Peak Time (s) | 2.05 | 3.95 | 3.2162 | 0.1249 |
| Settling Time (s) | 2.35 | 3.95 | 3.2465 | 0.0774 |

Table 4. Transient characteristics for parameter variation of rule 18


Fig. 11. Transient response for several values of parameter of rule 18

Table 4. Transient characteristics for parameter variation of rule 18

for rule parameter 18

reach a specific tilt of plate.

Suppose a PT2FLC where it is only modified the parameter value of rule 18 and a set of parameters that can be spread randomly around the mean of its value ��(18) = 0.7. For this experiment, it was performed 43 iterations in order to show how the variation of ��(18) affects the overshoot attenuation and also other phenomena (Fig. 10-11).

Table 4 shows some results about the transient when trying to reach a tilt = 0.125 rads. Other phenomena can be analyzed for all 43 iterations. Also, in Fig.12 it can be seen that overshoot is attenuated drastically when ��(18) → 1, if it is only modified this rule. Time response (rise time, peak time and settling time) is also compromised due to parametric conjunctions. It can be seen also that drastic attenuation of overshoot occurs for ��(18) ≲ 0.7. Greater values do not affect it meaningfully. As it can be seen in (12), rule parameter proposed as the optimal for rule 18 is near to 1, which might be different with other configurations. This is because of the influence of the rest of rule parameters. However, this optimal configuration does not compromise the response time but it does eliminate the overshoot completely.

Fig. 13. Final approximation of T2FLC modifying rule parameters

Fig. 14. Comparison of response between T1FLC and parametric T2FLC when reference is a noisy sine signal
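The transient measures reported in Table 4 can be extracted from a sampled step response. The sketch below uses common textbook definitions (50% delay time, 90% rise time, a 2% settling band) applied to a synthetic underdamped signal; the helper `transient_measures` and the toy response are illustrative assumptions, not code or data from this chapter.

```python
import math

# Illustrative extraction of the transient measures listed in Table 4
# from a sampled step response (textbook-style definitions; the chapter's
# exact measurement procedure may differ).

def transient_measures(t, y, ref, settle_band=0.02):
    """Overshoot and delay/rise/peak/settling times for a step to `ref`."""
    overshoot = max(0.0, max(y) - ref)                  # peak excursion above ref
    peak_time = t[y.index(max(y))]
    delay_time = next(ti for ti, yi in zip(t, y) if yi >= 0.5 * ref)
    rise_time = next(ti for ti, yi in zip(t, y) if yi >= 0.9 * ref)
    tol = settle_band * ref                             # 2% settling band
    settling_time = t[0]
    for ti, yi in zip(t, y):
        if abs(yi - ref) > tol:
            settling_time = ti                          # last excursion outside band
    return overshoot, delay_time, rise_time, peak_time, settling_time

# Synthetic underdamped response approaching ref = 0.125 rad
ref = 0.125
t = [0.1 * k for k in range(60)]
y = [ref * (1.0 - math.exp(-ti) * math.cos(3.0 * ti)) for ti in t]
ov, dly, rise, peak, settle = transient_measures(t, y, ref)
# for this synthetic signal: overshoot ≈ 0.046 rad, peak time ≈ 0.9 s
```

Repeating such measurements over the 43 iterations of the p(18) sweep yields the min, max, mean and variance columns of Table 4.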

Parametric Type-2 Fuzzy Logic Systems 113


Once the right parameter values have been chosen for every rule, it is possible to see that the influence of the premises over a consequent can be regulated using a parametric conjunction. The overshoot and ripple have then been completely removed, and the time response has also been improved, as can be seen in Fig. 13.

Finally, Fig. 14 depicts the response of the T1FLC and the PT2FLC, using the optimal set and rule parameters, when the reference cannot be determined in the presence of noise. In this last experiment, the signal to follow is a noisy sine signal with a noise frequency of 500 Hz (applied to a single axis of the plate). The PT2FLC follows this shape very similarly to the T1FLC, but it can be seen that the PT2FLC filters out all drastic changes of the noisy signal, unlike the T1FLC.

### **5. Discussion**

Some of the problems encountered, and their solutions, are listed below.

### **5.1 Overshoot**

The best results were obtained when the FOU of every set was reduced, but reducing the FOU to zero converts the T2FLC into a T1FLC, and such a system could not deal with the uncertainties that may exist in the feedback of the control system (e.g. sensor noise or noise due to the illumination of the room). Using parametric conjunction operators instead of the common t-norm operators (e.g. min) is the best way to reduce the remaining overshoot once modification of the FOU of the sets has been considered. Because the overshoot occurs when the ball is near the reference, inertia pulls the ball past the reference and no suitable control action can be applied. To smooth this action it is possible to decrease its effect by diminishing the influence of the premises with a parametric conjunction: a suitable value of the parameter p of a given rule drops that excessive control action and therefore decreases the overshoot. The parameters of rules 8 and 18 have the major influence on the overshoot.
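To make the role of a per-rule parameter concrete, the sketch below shows one simple generalized conjunction: a convex combination of the min t-norm and the product t-norm. This is an illustrative assumption, not the chapter's (p)-monotone sum of conjunctions, but it satisfies the monotonicity and boundary conditions of a generalized conjunction and shows how the parameter p modulates a rule's firing strength.

```python
# Sketch of a parametric conjunction (illustrative; the chapter's actual
# operator is the (p)-monotone sum of conjunctions, defined differently).
# A convex combination of two t-norms is a generalized conjunction: it is
# monotone and satisfies T(0, y) = T(x, 0) = 0 and T(1, 1) = 1.

def parametric_and(a, b, p):
    """Blend the min t-norm with the product t-norm via rule parameter p."""
    return p * min(a, b) + (1.0 - p) * a * b

# Firing strength of a rule "IF error is A AND change is B" for given
# membership degrees; since product <= min, lowering p attenuates the rule.
mu_error, mu_change = 0.8, 0.6
w_full = parametric_and(mu_error, mu_change, 1.0)   # pure min, ≈ 0.6
w_attn = parametric_and(mu_error, mu_change, 0.7)   # blended, ≈ 0.564
w_prod = parametric_and(mu_error, mu_change, 0.0)   # pure product, ≈ 0.48
```

A rule whose excessive contribution causes overshoot can thus be softened by lowering its p, without changing the set distribution or the rule base.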

### **5.2 Steady-State Error**

There is no precise solution for decreasing the SSE, but the expert can play with the FOU widths of the variables. For example, keeping a big FOU in the sets of the variable error while decreasing all the FOUs of the variable change is a good option for reducing the SSE. It is also possible to reduce it by modifying the centroids of the output variable tilt. Unfortunately, those actions may generate additional nonlinearities, so an expert must evaluate this situation.

### **5.3 Ripple**

Ripple can be controlled by considering the FOU width of the variable error; keeping a big FOU in the sets of the variable change can also help to reduce the ripple.

### **5.4 Response time**

A simpler approximation is possible by considering the values of the parameters of rules 8 and 18. If p(8) = 1 then all the remaining ripple is cleared, and if p(18) = 1 then almost all the overshoot is eliminated, but the time response is increased. Hence, if the expert has no timing constraints, the use of these rule parameters may help to reduce the undesired phenomena, considering this compromise.

### **6. Conclusion**


This chapter introduced a PT2FLC suitable for control system implementation, using a new class of parametric conjunctions called the (p)-monotone sum of conjunctions.

Several phenomena appear when trying to tune a fuzzy system. The original B&P T1FLC was tuned to obtain the best results, as in (Moreno-Armendariz, Rubio-Espino et al. 2010). When a B&P PT2FLC was implemented with the same set distribution in input and output and the same rule set as its counterpart, it was found that some of these phenomena appeared again. The final system response is related to all of the design variables, such as the set distribution, the FOU width and the conjunction parameters, and each carries an implicit phenomenon that might be controlled, depending on the characteristics of the plant and the rule set proposed for a particular solution.

A parametric conjunction used to perform the implication can be applied to any fuzzy system, whether type-1 or type-2. The use of parametric conjunctions in the inference helps to weight the influence of the premises, and therefore the system can be forced to produce a desired crisp value. Finally, an optimal result was obtained when controlling the B&P system, reaching the reference without overshoot, SSE or ripple in 2.65 seconds.

When the PT2FLC is subjected to external perturbations, i.e. when an extra level of uncertainty is added to the system, the PT2FLC exhibits a better response than its T1 counterpart. Therefore, uncertain variations in the inputs of a general FLC require sets with an appropriate FOU that can capture and support them.

Therefore, the use of a PT2FLS for control purposes gives additional options for improving control precision, and the use of the monotone sum of conjunctions gives an opportunity to implement the PT2FLC in hardware for real-time applications.

Future research needs to examine the use of other parametric classes of conjunctions built from simple functions. Moreover, this work can be extended with optimization techniques for selecting better rule parameters as well as other parameters such as the set distribution and the rule set. A hardware implementation would also be convenient in order to validate the behavior in real-time applications.

### **7. Acknowledgements**

This work was supported by the Instituto de Ciencia y Tecnologia del Distrito Federal (ICyTDF) under project number PICCT08-22. We also thank the support of the Secretaria de Investigacion y Posgrado of Instituto Politecnico Nacional (SIP-IPN) under project number SIP-20113813 and project number SIP-20113709, COFFA-IPN and PIFI-IPN. Any opinions, findings, conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the sponsoring agency.

### **8. References**

Batyrshin, I. Z. and O. Kaynak (1999). "Parametric Classes of Generalized Conjunction and Disjunction Operations for Fuzzy Modeling." IEEE Transactions on Fuzzy Systems **7**(5): 586-596.

Batyrshin, I. Z., I. J. Rudas, et al. (2009). "On Generation of Digital Fuzzy Parametric Conjunctions." Studies in Computational Intelligence **243**: 79-89.

Cortes, P., I. Z. Batyrshin, et al. (2010). FPGA Implementation of (p)-Monotone Sum of Basic t-norms. International Conference on Fuzzy Systems.

Karnik, N. N., J. M. Mendel, et al. (1999). "Type-2 Fuzzy Logic Systems." IEEE Transactions on Fuzzy Systems **7**(6): 643-658.

Klement, E. P., R. Mesiar, et al. (2000). Triangular Norms. Dordrecht, Kluwer.

Moreno-Armendariz, M. A., E. Rubio-Espino, et al. (2010). Design and Implementation of a Visual Fuzzy Control in FPGA for the Ball and Plate System. IEEE International Conference on Reconfigurable Computing and FPGAs, Cancun, Mexico.

Rudas, I. J., I. Z. Batyrshin, et al. (2009). Digital Fuzzy Parametric Conjunctions for Hardware Implementation of Fuzzy Systems. IEEE International Conference on Computational Cybernetics, Budapest, Hungary.

Wu, H. and J. M. Mendel (2002). "Uncertainty Bounds and their Use in the Design of Interval Type-2 Fuzzy Logic Systems." IEEE Transactions on Fuzzy Systems **10**(5): 622-639.

## **Application of Adaptive Neuro Fuzzy Inference System in Supply Chain Management Evaluation**

Thoedtida Thipparat *Faculty of Management Sciences, Prince of Songkla University, Kohong Hatyai, Songkhla, Thailand*

### **1. Introduction**

Each construction project has unique features that differentiate it from even similar projects. Construction techniques, design, contract types, liabilities, weather, soil conditions, the politico-economic environment and many other aspects may differ for every new commitment. Uncertainty is a reality of the construction business. Leung et al. (2007) developed a model to deal with uncertain demand by considering a multi-site production planning problem. The inventory control problem and the value of advance demand information were examined by Ozer and Wei (2004). Mula et al. (2010) proposed mathematical programming models to address supply chain production and transport planning problems. A model for multi-criteria decision making was developed for both manufacturers and distributors by Dong et al. (2005). A stochastic planning model was constructed for a two-echelon supply chain of a petroleum company by Al-Othman et al. (2008). Weng and McClurg (2003) and Ray et al. (2005) focused on supply uncertainty along with demand uncertainty in supply chains. Bollapragada et al. (2004) examined uncertain lead time for random demand and supply capacity in assembly systems.

A number of methods have been developed by researchers to solve problems associated with uncertainties, including scenario programming (Wullink et al., 2004; Chang et al., 2007), stochastic programming (Popescu, 2007; Santoso et al., 2005), fuzzy approaches (Petrovic et al., 1999; Schultmann et al., 2006; Liang, 2008), and computer simulation and intelligent algorithms (Kalyanmoy, 2001; Coello, 2005). However, each method is suitable for particular situations, and decision makers have to select the appropriate method for a given problem.

For an uncertain construction project, the fuzzy atmosphere has been described with the terms 'uncertainty' or 'risk' by construction managers and researchers, who have tried to control it systematically through risk management and analysis methods since the early 1990s (Edwards L., 2004). Some researchers, such as Flanagan and Norman (1993) and Pilcher (1985), differentiated between these two terms. They stated that uncertainty represents situations in which there is no historical data, whereas risk can be used for situations where success or failure is determined in probabilistic quantities by benefiting from the previously available data. Since such a separation is regarded as meaningless in the construction literature, risk turns out to be the most consistent term to use for construction projects, because some probability values can be attached intuitively and judgmentally to even the most uncertain events (Flanagan and Norman, 1993). Uncertainty represented quantitatively at some level is not uncertainty any more; rather, it is henceforth risk, and it needs to be managed.

Construction companies are trying to make their supply chains more effective and more efficient. Supply chain management has the potential to make construction projects less fragmented, improve project quality, and reduce project duration and hence total project cost, while creating more satisfied customers. Construction companies need to respond to an uncertain environment by using the concept of flexibility, and they have recognized that flexibility is crucial for their survival and competitiveness. Several definitions of flexibility have been proposed, since the construct is still in the initial stage of its application to organizational phenomena. Flexibility is defined as "the agility of a supply chain to respond to market changes in demand in order to gain or maintain its competitive advantage" (Bolstorff, P., Rosenbaum, R., 2007). The combination of Supply Chain Management (SCM) and flexibility is a significant source of competitiveness which has come to be named the Agile Supply Chain (ASC). This paper argues that it is important to establish the flexibility of the construction supply chain. After embracing the ASC, an important question must be asked: how can construction companies evaluate flexibility in supply chains? This evaluation is essential for construction managers, as it assists in achieving flexibility effectively by performing a gap analysis between the existing flexibility level and the desired one, and it also provides more informative and reliable information for decision making. Therefore, this study attempts to answer this question with a particular focus on measuring flexibility.

An approach based on the Adaptive Neuro Fuzzy Inference System (ANFIS) for the measurement of agility in a supply chain was developed by Seyedhoseini et al. (2010). The researchers used ANFIS to deal with the complexity and vagueness of agility in global markets. ANFIS was applied in order to inject different and complicated agility capabilities (that is, flexibility, competency, cost, responsiveness and quickness) into the model in an ambiguous environment. In addition, that study developed different potential attributes of ANFIS. Membership functions for each agility capability were constructed, and the collected data were trained with these functions through an adaptive procedure, using fuzzy concepts in order to model objective attributes. The proposed approach was useful for surveying real-life problems, and the procedure was applied efficiently to a large-scale automobile manufacturing company in Iran. Statistical analysis illustrated that there was no meaningful difference between the experts' opinions and the proposed procedure for supply chain agility measurement.

A procedure with the aforementioned functionality must be developed to cope with the uncertain environment of construction projects and the lack of an efficient tool for measuring the flexibility of a supply chain system. This study applies fuzzy concepts and aggregates this powerful tool with Artificial Neural Network concepts, obtaining an ANFIS able to handle the imprecise nature of the attributes associated with flexibility. ANFIS is considered an efficient tool for developing and surveying the novel procedure; to the best of our knowledge this combination has never been reported in the literature before. This paper is organized as follows. Section 2 reviews the literature on the construction supply chain, supply chain performance evaluation and the Agile Supply Chain (ASC); Section 3 presents the conceptual model using the capabilities of the construction supply chain, such as reliability, flexibility, responsiveness, cost and asset; Section 4 contains an adaptive neuro fuzzy inference system (ANFIS) model proposed to evaluate flexibility in construction supply chains, whose applicability has been tested using construction companies in Thailand. Finally, Section 5 discusses the main conclusions of this study.

### **2. Construction supply chain**


Considering the construction industry, the client represents a unique customer with unique requirements. Stakeholders in the supply chain will provide these requirements, and they must have the required primary competencies to make their fulfilment possible.

### **2.1 Construction supply chain**

In reality, the organisations within a supply network delivering an office development will differ from those required to deliver a residential project. It may be useful to consider the chain as a network of organisations operating within the same market or industry to satisfy a variety of clients. Stakeholders involved in the construction supply chain have been classified into five categories related to the construction stages (H. Ismail & Sharif, 2005). The contract is the predominant approach for managing the relationship between the organisations that operate in a construction project to deliver the client's required project. Although contracts are a sufficient basis for the delivery of a completed project, they are not sufficient to deliver a construction efficiently, at minimum cost, and 'right first time'.

### **2.2 Flexibility supply chain**

The definition of flexibility is still fuzzy, mainly because it largely deals with things already being addressed by industry and covered by existing research projects and programs. Many researchers provide conceptual overviews and different reference and maturity models of flexibility. For instance, Siemieniuch and Sinclair (2000) proposed that, to become a truly agile supply chain, the key enablers fall into four categories: collaborative relationships as the supply chain strategy, process integration as the foundation of the supply chain, information integration as the infrastructure of the supply chain, and customer/marketing sensitivity as the mechanism of the supply chain. Current approaches can be criticized in that they have not considered the impact of enablers in assessing supply chain flexibility, and the scale used to aggregate the flexibility capabilities has limitations.

Several papers present applications of measurement-system theory for managing the performance of a supply chain. However, there is no measurement system for managing the performance of the entire supply chain. The adoption of metrics that cross the borders of the organization, considering dimensions of performance related to inter- and intra-organizational processes, is required (Lapide, L., 2000).

Application of Adaptive Neuro Fuzzy Inference System in Supply Chain Management Evaluation 119

| Reliability | Flexibility | Responsiveness | Cost | Asset |
|---|---|---|---|---|
| Perfect order fulfillment | Upside supply chain flexibility | Order fulfillment cycle time | Total supply chain cost | Cash-to-cash cycle time |
| Delivery to commit day | Upside source flexibility | Source cycle time | Finance and planning cost | Days sales outstanding |
| Orders in full | Upside make flexibility | Make cycle time | Inventory carrying cost | Days payable outstanding |
| Accurate documentation | Upside delivery flexibility | Delivery cycle time | IT cost for supply chain | Inventory days of supply |
| Perfect condition | | | Material acquisition cost | Return on asset |
| | | | Order management cost | Asset turns |
| | | | | Net profit |

Table 1. Input/Output indicators


The metrics developed by the SCOR model (Supply-Chain Council (SCC), 2011) were proposed to analyze a supply chain from three perspectives: process, metrics and best practice. The connections between the inter-organizational processes in each company in a supply chain are created based on the SCOR framework. A common and standardized language among the companies within a supply chain is developed in order to compare supply chain performance as a whole.

There are five performance attributes in the top-level SCOR metrics, namely reliability, responsiveness, flexibility, cost and asset management efficiency (Bolstorff, P., Rosenbaum, R., 2007). Reliability is defined as the performance related to delivery, i.e., whether the correct product (according to specifications) is delivered to the correct place, in the correct quantity, at the correct time, with the correct documentation, and to the correct customer. Responsiveness is the speed at which a supply chain provides products to customers. Flexibility is the agility of a supply chain in responding to changes in market demand in order to gain or maintain competitive advantage. All the costs related to the operation of the supply chain are included in the cost attribute. Asset management efficiency is the efficiency of an organization in managing its resources to meet demand; the management of all resources (i.e., fixed and working capital) is considered.

The first limitation of supply chain flexibility evaluation is that existing techniques do not consider the ambiguity and the multiple possibilities associated with mapping an individual judgment to a number. The second limitation is that the subjective judgment, selection and preference of evaluators have a significant influence on these methods. Because qualitative and ambiguous attributes are linked to flexibility assessment, most measures are described subjectively using linguistic terms and cannot be handled effectively by conventional assessment approaches. Fuzzy logic provides an effective means of handling problems involving imprecise and vague phenomena. Fuzzy concepts enable assessors to rate indicators with linguistic terms in natural-language expressions, and each linguistic term can be associated with a membership function. In addition, fuzzy logic has found significant applications in management decisions. This study applies a fuzzy inference system for mapping the input space (tangible and intangible) to the output space in order to assist construction companies in achieving a more flexible supply chain. The proposed Fuzzy Inference System (FIS) is based on the experience of experts in evaluating the flexibility of construction supply chains.
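To make the linguistic-term idea concrete, here is a minimal sketch (ours, not the chapter's implementation) in which assessor terms such as "low", "medium" and "high" are tied to Gaussian membership functions on a hypothetical 0–100 scoring scale; the term names, centers and widths are illustrative assumptions:

```python
import math

def gaussian_mf(x, center, sigma):
    """Degree of membership of x in a fuzzy set with a Gaussian shape."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical linguistic terms for a flexibility score on a 0-100 scale.
terms = {"low": (15.0, 12.0), "medium": (50.0, 12.0), "high": (85.0, 12.0)}

def fuzzify(score):
    """Map a crisp assessor score to a membership degree for each term."""
    return {name: gaussian_mf(score, c, s) for name, (c, s) in terms.items()}

print(fuzzify(70))  # a score of 70 partially activates both "medium" and "high"
```

A crisp score thus never has to be forced into a single category; the overlap between terms is exactly what lets the inference system represent an assessor's vagueness.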

### **3. Methodology**

To evaluate the flexibility of the construction supply chain, two main steps are performed. In the first step, measurement criteria are identified. A conceptual model is developed based on a literature review. Capabilities of the supply chain are employed to define supply chain performance in three basic segments: sourcing, construction and delivery. In this study the conceptual model involves five attributes: reliability, flexibility, responsiveness, cost, and asset. Twenty-seven sub-attributes form the basis of the conceptual model, as shown in Table 1. In the second step, an ANFIS architecture is designed by constructing an input-output mapping based on both human knowledge, in the form of fuzzy if-then rules with appropriate membership functions, and stipulated input-output data, for deriving performance in supply chains.



### **4. Neurofuzzy model**

118 Fuzzy Logic – Algorithms, Techniques and Implementations


The neuro-fuzzy system attempts to model the uncertainty in the factor assessments, accounting for their qualitative nature. A combination of classic stochastic simulations and fuzzy logic operations on the ANN inputs is employed as a supplement to the artificial neural network. Artificial Neural Networks (ANNs) have the capability of self-learning, while a fuzzy logic inference system (FLIS) is capable of dealing with fuzzy language information and of simulating the judgment and decision making of the human brain. Combining an ANN with an FLIS to produce a fuzzy network system is currently a research focus. ANFIS is an example of such a readily available system, which uses an ANN to accomplish fuzzification, fuzzy inference and defuzzification of a fuzzy system. ANFIS utilizes the ANN's learning mechanisms to draw rules from input-output data pairs. The system possesses not only the function of adaptive learning but also the functions of describing and processing fuzzy information and of judgment and decision making. ANFIS differs from an ANN in that an ANN uses connection weights to describe a system, while ANFIS uses fuzzy language rules from fuzzy inference to describe a system.

The ANFIS approach adopts Gaussian functions (or other membership functions) for the fuzzy sets, linear functions for the rule outputs, and Sugeno's inference mechanism (R.E. Spekman, J.W. Kamauff Jr., N. Myhr, 1998). The parameters of the network are the means and standard deviations of the membership functions (antecedent parameters) and the coefficients of the output linear functions (consequent parameters). The ANFIS learning algorithm is used to obtain these parameters. It is a hybrid algorithm consisting of gradient descent and the least-squares estimate. Using this hybrid algorithm, the rule parameters are recursively updated until an acceptable level of error is reached. Each iteration includes two passes, forward and backward. In the forward pass, the antecedent parameters are fixed and the consequent parameters are obtained using linear least-squares estimation. In the backward pass, the consequent parameters are fixed, the error signals propagate backward, and the antecedent parameters are updated by the gradient descent method.

An ANFIS architecture is equivalent to a two-input first-order Sugeno fuzzy model with nine rules, where each input is assumed to have three associated membership functions (MFs) (Z. Zhang, D. Ding, L. Rao, and Z. Bi, 2006). Sub-attributes associated with reliability, flexibility, responsiveness, cost, and asset are used as input variables; simultaneously, construction supply chain performance is considered as the output variable. These input variables were used in the measurement of supply chain performance by (G.M.D. Ganga, L.C.R. Carpinetti, 2011). Fig 1 shows such an ANFIS architecture (J. Jassbi, S.M. Seyedhosseini, and N. Pilevari, 2010).

Fig. 1. The ANFIS architecture for two input variables

To demonstrate the applicability of the model, it was applied in twenty-five construction companies in Thailand. The first step in applying the model was to construct the decision team. The stakeholders involved in the construction stage formed the decision team, including the main contractor, domestic subcontractors, nominated subcontractors, the project manager, material suppliers, plant/equipment suppliers, designers, financial institutions, insurance agencies, and regulatory bodies. For training the ANFIS, a questionnaire was designed including the identified criteria. The decision team was asked to score the criteria based on their knowledge of the construction stage. A Matlab programme was generated and compiled. The pre-processed input/output matrix, which contained all the necessary representative features, was used to train the fuzzy inference system. Fig 2 shows the structure of the ANFIS; a Sugeno fuzzy inference system was used in this investigation. Based on the collected data, 150 data sets were used to train the ANFIS and the rest (50) for checking and validation of the model. For rule generation, subtractive clustering was employed, with the range of influence, squash factor, acceptance ratio, and rejection ratio set at 0.5, 1.25, 0.5 and 0.15, respectively. The trained fuzzy inference system includes 20 rules (clusters), as presented in Fig 3, because subtractive clustering categorized the input space into 20 clusters. Each input has 20 built-in Gaussian-curve membership functions. During training in ANFIS, sets of processed data were used to conduct 260 cycles of learning.

By inserting the ANFIS output into the system, the flexibility level of supply chain management can be derived. In addition, the trends of the training error and checking error are shown in Fig 4. The researchers continued the training process to 500 epochs, because the trend of the checking error started to increase afterward and overfitting occurred. The value of the checking error at 500 epochs was 1.45, which is acceptable. The value of supply chain flexibility is then derived by the trained ANFIS, and the ANFIS output for the Thai construction companies is calculated.

Fig. 2. Network of innovation performance by the ANFIS
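The hybrid learning loop described above (least squares for the consequent parameters in the forward pass, gradient descent for the antecedent parameters in the backward pass) can be sketched from scratch as follows. This is an illustrative toy, not the Matlab ANFIS used in the study: the class name `TinyANFIS` and all parameter choices are ours, and a finite-difference gradient stands in for the backpropagated error signals:

```python
import numpy as np

def gauss(x, c, s):
    """Gaussian membership function."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

class TinyANFIS:
    """Two-input first-order Sugeno ANFIS: 3 Gaussian MFs per input, 9 rules."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        self.c = rng.uniform(0.0, 1.0, (2, 3))  # antecedent MF centers
        self.s = np.full((2, 3), 0.4)           # antecedent MF widths
        self.p = np.zeros((9, 3))               # consequent [p, q, r] per rule

    def _norm_firing(self, X):
        m1 = gauss(X[:, [0]], self.c[0], self.s[0])          # (N, 3)
        m2 = gauss(X[:, [1]], self.c[1], self.s[1])          # (N, 3)
        w = (m1[:, :, None] * m2[:, None, :]).reshape(len(X), 9)
        return w / w.sum(axis=1, keepdims=True)              # normalized strengths

    def predict(self, X):
        wb = self._norm_firing(X)
        f = X @ self.p[:, :2].T + self.p[:, 2]               # rule outputs (N, 9)
        return (wb * f).sum(axis=1)

    def _mse(self, X, y):
        return float(np.mean((self.predict(X) - y) ** 2))

    def fit(self, X, y, epochs=5, lr=0.05):
        for _ in range(epochs):
            # Forward pass: antecedents fixed, consequents by least squares.
            wb = self._norm_firing(X)
            A = np.hstack([wb * X[:, [0]], wb * X[:, [1]], wb])  # (N, 27)
            theta, *_ = np.linalg.lstsq(A, y, rcond=None)
            self.p = theta.reshape(3, 9).T
            # Backward pass: consequents fixed, antecedents by gradient descent
            # (central finite differences approximate the backpropagated gradient).
            for arr in (self.c, self.s):
                grad = np.zeros_like(arr)
                for idx in np.ndindex(arr.shape):
                    old, eps = arr[idx], 1e-4
                    arr[idx] = old + eps
                    e_plus = self._mse(X, y)
                    arr[idx] = old - eps
                    e_minus = self._mse(X, y)
                    arr[idx] = old
                    grad[idx] = (e_plus - e_minus) / (2 * eps)
                arr -= lr * grad
            np.clip(self.s, 0.05, None, out=self.s)  # keep MF widths positive
        return self._mse(X, y)
```

Because the consequents are re-solved exactly in every forward pass, the error drops quickly even with very few epochs; the backward pass then nudges the membership functions so the rule partitioning itself fits the data better.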



The rates of the sub-attributes associated with reliability, flexibility, responsiveness, cost and asset, and the corresponding output of the ANFIS, are shown in Tables 2 and 3, respectively. Twenty-five scenarios were used to test the performance of the proposed method. The results indicate that the output values obtained from the ANFIS are close to the values given by the experts in most of the scenarios tested. The average and standard deviation of the differences between the values estimated by the ANFIS and the output values obtained from the experts are calculated to be 12.6% and 8.75%, respectively. The biggest advantage of ANFIS is that there is no need to know the concrete functional relationship between outputs and inputs. Any relationship, linear or nonlinear, can be learned and approximated by an ANFIS, such as a five-layer network with a sufficiently large number of neurons in the hidden layer. When the functional relationship between outputs and inputs is not known or cannot be determined, ANFIS clearly outperforms regression, which requires the relationship between output and inputs to be known or specified. Another remarkable advantage of ANFIS is its capability of modelling data with multiple inputs and multiple outputs; ANFIS has no restriction on the number of outputs, and the relationships can be learned simultaneously by an ANFIS with multiple inputs and multiple outputs.

| No. | Reliability | Flexibility | Responsiveness | Cost | Asset |
|---|---|---|---|---|---|
| 1 | 23 | 54 | 31 | 31 | 20 |
| 2 | 20 | 65 | 65 | 28 | 26 |
| 3 | 19 | 75 | 75 | 30 | 23 |
| : | : | : | : | : | : |
| 98 | 20 | 65 | 65 | 28 | 31 |
| 99 | 26 | 70 | 70 | 32 | 65 |
| 100 | 23 | 70 | 25 | 31 | 75 |

Table 2. Input values for the trained ANFIS
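The agreement statistics quoted above (mean and standard deviation of the expert-vs-ANFIS percent differences) reduce to a short computation. A sketch with invented numbers, since the full data set is not reproduced here:

```python
import numpy as np

# Hypothetical expert scores and ANFIS outputs (illustrative values only,
# NOT the study's data).
expert = np.array([72.0, 75.0, 75.0, 66.0, 30.0, 55.0])
anfis = np.array([61.0, 64.0, 60.0, 59.0, 21.0, 50.0])

# Percent difference of each ANFIS output relative to the expert score.
pct_diff = np.abs(anfis - expert) / expert * 100.0

print(round(float(pct_diff.mean()), 2), round(float(pct_diff.std()), 2))
```

On the real data this pair of numbers is what the chapter reports as 12.6% and 8.75%.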


Fig. 3. Trained main ANFIS surface of supply chain performance

Fig. 4. Trend of errors of trained fuzzy system
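The subtractive clustering used above for rule generation (range of influence 0.5, squash factor 1.25, acceptance ratio 0.5, rejection ratio 0.15) can be sketched as follows. This is a simplified Chiu-style version written for illustration, not the Matlab routine used in the study; in particular, the gray zone between the acceptance and rejection thresholds is handled by simply stopping, where the full algorithm applies an additional distance test:

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, squash=1.25, accept=0.5, reject=0.15):
    """Simplified subtractive clustering on data scaled to [0, 1]."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (squash * ra) ** 2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    P = np.exp(-alpha * d2).sum(axis=1)                  # initial potentials
    centers, P_first = [], float(P.max())
    while True:
        k = int(P.argmax())
        if P[k] < reject * P_first or len(centers) >= len(X):
            break                       # remaining potential too low: stop
        if not centers or P[k] >= accept * P_first:
            centers.append(X[k])        # clearly above threshold: new cluster
        else:
            break                       # gray zone: full algorithm tests distance
        P = P - P[k] * np.exp(-beta * d2[k])  # suppress potential near new center
    return np.array(centers)
```

Each accepted center suppresses the potential of its neighbourhood (the squash factor widens that neighbourhood), so well-separated regions of the input space each contribute one cluster, and hence one fuzzy rule.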

Fig 5 depicts a three-dimensional plot that represents the mapping from reliability (in1) and flexibility (in2) to supply chain performance (out1). As reliability and flexibility increase, the predicted supply chain performance increases in a non-linear, piecewise manner, largely due to the non-linear characteristics of the input vector matrix derived from the collected data. This assumes that the collected data are fully representative of the features of the data that the trained FIS is intended to model. However, the data are inherently insufficient, and the training data cannot cover all the features of the data that should be presented to the trained model. The accuracy of the model, therefore, is affected under such circumstances.


Fig. 5. Network of construction supply chain performance by the ANFIS


| No. | Expert's value of output | ANFIS value of output | Percent difference |
|---|---|---|---|
| 1 | 11 | 72 | 15.4 |
| 2 | 20 | 75 | 15.08 |
| 3 | 19 | 75 | 19.67 |
| : | : | : | : |
| 98 | 18 | 66 | 10.17 |
| 99 | 26 | 30 | 28.46 |
| 100 | 28 | 55 | 8.77 |

Table 3. Possible values obtained from the ANFIS method

### **5. Conclusion**

This paper has discussed the need for flexibility assessment of the construction supply chain. The particular features of construction supply chains were highlighted. The need for, and potential benefits of, construction supply chain flexibility assessment were then examined, and the conceptual model of a flexibility assessment model for the supply chain was presented. Case studies of the use of the model in assessing construction organizations were also presented. The following conclusions can be drawn from the work presented in this paper. Improving the way the construction supply chain delivers projects is necessary to achieve client satisfaction, efficiency, effectiveness and profitability. It is important to perform a flexibility assessment of the construction supply chain in order to ensure that maximum benefit can be obtained. Since an agile supply chain has been considered a dominant competitive advantage in recent years, evaluating supply chain flexibility can be useful and applicable for managers in making more informative and reliable decisions in the face of anticipated changes in construction markets. The development of an appropriate flexibility assessment tool or model for the construction supply chain is necessary, as existing models are not appropriate in their present form. The results reveal that the ANFIS model improves flexibility assessment by using fuzzy rules to generate the adaptive neuro-fuzzy network, as well as a rotation method of training and testing data selection, which is designed to enhance the reliability of the sampling process before constructing the training and testing model. The ANFIS model can explain the training procedure of the outcome and how to simulate the rules for prediction, and it can provide more accurate predictions.

Further research is necessary to compare the efficiency of different models for measuring flexibility in the supply chain. Although this study was performed in construction companies, the proposed methodology is applicable to other companies, e.g. consulting companies. Enablers in flexibility evaluation should be determined, and their impact on capabilities should be studied in further research. In addition, the relations between enablers should be considered in order to design a dynamic system for supply chain management evaluation.

### **6. References**

Al-Othman, W.B.E., Lababidi, H.M.S., Alatiqi, I.M., Al-Shayji, K. (2008) Supply chain optimization of petroleum organization under uncertainty in market demands and prices. *European Journal of Operational Research*, Vol. 189, No. 3, pp. 822–840.

Bollapragada, R., Rao, U.S., Zhang, J. (2004) Managing inventory and supply performance in assembly systems with random supply capacity and demand. *Management Science*, Vol. 50, No. 12, pp. 1729–1743.

Bolstorff, P., Rosenbaum, R. (2007) *Supply Chain Excellence: A Handbook for Dramatic Improvement Using the SCOR Model*. AMACOM, New York.

Coello, C.A.C. (2005) An introduction to evolutionary algorithms and their applications. In: *Advanced Distributed Systems*. Springer, Berlin, Heidelberg.

Dong, J., Zhang, D., Yan, H., Nagurney, A. (2005) Multitiered supply chain networks: multicriteria decision-making under uncertainty. *Annals of Operations Research*, Vol. 135, No. 1, pp. 155–178.

Edwards, L. (2004) *Practical Risk Management in the Construction Industry*. London: Thomas Telford.

Flanagan, R., Norman, G. (1993) *Risk Management and Construction*. Cambridge: Blackwell Scientific.

Ganga, G.M.D., Carpinetti, L.C.R. (2011) A fuzzy logic approach to supply chain performance management. *International Journal of Production Economics*.

Ismail, H. and Sharif (2005) Supply chain design for supply chain: A balanced approach to building agile supply chain. *Proceedings of the International Conference on Flexibility*.

Jassbi, J., Seyedhosseini, S.M., Pilevari, N. (2010) An adaptive neuro fuzzy inference system for supply chain flexibility evaluation. *International Journal of Industrial Engineering & Production Research*, Vol. 20, pp. 187–196.

Kalyanmoy, D. (2001) *Multi-objective Optimization Using Evolutionary Algorithms*. John Wiley and Sons, New York.

Lapide, L. (2000) What about measuring supply chain performance? *Achieving Supply Chain Excellence through Technology*, Vol. 2, pp. 287–297.

Leung, S.C.H., Tsang, S.O.S., Ng, W.L., Wu, Y. (2007) A robust optimization model for multi-site production planning problem in an uncertain environment. *European Journal of Operational Research*, Vol. 181, No. 1, pp. 224–238.

Liang, T.F. (2008) Integrating production-transportation planning decision with fuzzy multiple goals in supply chains. *International Journal of Production Research*, Vol. 46, No. 6, pp. 1477–1494.

Mula, J., Peidro, D., Díaz-Madroñero, M., Vicens, E. (2010) Mathematical programming models for supply chain production and transport planning. *European Journal of Operational Research*, Vol. 204, No. 3, pp. 377–390.

Özer, O., Wei, W. (2004) Inventory control with limited capacity and advance demand information. *Operations Research*, Vol. 52, No. 6, pp. 988–1000.

Petrovic, D., Roy, R., Petrovic, R. (1999) Supply chain modeling using fuzzy sets. *International Journal of Production Economics*, Vol. 59, No. 1–3, pp. 443–453.

Pilcher, R. (1985) *Project Cost Control in Construction*. London: Collins.

Popescu, I. (2007) Robust mean-covariance solutions for stochastic optimization. *Operations Research*, Vol. 55, No. 1, pp. 98–112.

Ray, S., Li, S., Song, Y. (2005) Tailored supply chain decision making under price-sensitive stochastic demand and delivery uncertainty. *Management Science*, Vol. 51, No. 12, pp. 1873–1891.



### **Fuzzy Image Segmentation Algorithms in Wavelet Domain**

Heydy Castillejos and Volodymyr Ponomaryov *National Polytechnic Institute of Mexico, Mexico*

### **1. Introduction**


Images are considered one of the most important means of information transmission; therefore, image processing has become an important tool in a variety of fields such as video coding, computer vision and medical imaging. Within image processing there is the segmentation process, which involves partitioning an image into a set of homogeneous and meaningful regions, such that the pixels in each partitioned region possess an identical set of properties or attributes (Gonzalez & Woods, 1992). The sets of properties of the image may include gray levels, contrast, spectral values, or texture properties. The result of segmentation is a number of homogeneous regions, each having a unique label. Image segmentation is often considered to be the most important task in computer vision. However, segmentation of images is a challenging task for several reasons: irregular and dispersive lesion borders, low contrast, artifacts in the image, and the variety of colors within the region of interest. Therefore, numerous methods have been developed for image segmentation within computer vision applications. Image segmentation methods can be classified into three categories: A) *Supervised*: these methods require interactivity, in which the pixels belonging to the same intensity range are pointed out manually and segmented; B) *Automatic*: also known as unsupervised methods, where the algorithms need some a priori information, so these methods are more complex; and C) *Semi-automatic*: the combination of manual and automatic segmentation.
Some practical applications of image segmentation are: medical imaging tasks consisting of the location of tumors and other pathologies; recognition of objects in remote sensing images obtained via satellite or aerial platforms; automated recognition systems to inspect electronic assemblies; biometrics; automatic traffic control systems; machine vision; separating and tracking the regions appearing in consecutive frames of a sequence; and, finally, real-time mobile robot applications employing vision systems.<sup>1</sup>

### **2. Related work**

Many methods have been developed for image segmentation. Let us present a brief description of several promising frameworks.

<sup>1</sup> (Gonzalez & Woods, 1992)


### **2.1** *Adaptive thresholding (AT)*

In (Argenziano & Soyer, 1996), automatic adaptive thresholding (AT) performs the image segmentation by comparing the color of each pixel with a threshold. The pixel is classified as a lesion if it is darker than the threshold, finally presenting the output as a binary image. Morphological post-processing is then applied to fill the holes and to select the largest connected component in the binary image. For color images, an automatic selection of the color component based on the entropy of the color component *i* is used:

$$S(i) = -\sum\_{k=0}^{L-1} h\_i(k) \log[h\_i(k)],\tag{1}$$

where *hi*(*k*) is the histogram of the color component *i*. It is assumed that the image *Ii*(*x*, *y*) varies in the range 0, . . . , 255 and the histogram is computed using bins of length *L* = 25. The block diagram in Fig. 1 explains in detail the operation of the AT method.

Fig. 1. Block diagram of Adaptive thresholding.
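To make the entropy criterion (1) concrete, the following Python sketch computes it per color channel; the choice of keeping the maximal-entropy component and all function names are our assumptions for this illustration, not part of the original description:

```python
import numpy as np

def channel_entropy(channel, bins=25):
    """Entropy S(i) = -sum_k h_i(k) log h_i(k) of one color component,
    with h_i(k) the normalized histogram over 0..255 using L = 25 bins."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 255))
    h = hist / hist.sum()          # normalize so the bins sum to 1
    h = h[h > 0]                   # drop empty bins to avoid log(0)
    return -np.sum(h * np.log(h))

def select_channel(rgb):
    """Assumed selection rule: pick the component with maximal entropy."""
    entropies = [channel_entropy(rgb[..., i]) for i in range(3)]
    return int(np.argmax(entropies))

rng = np.random.default_rng(0)
img = np.stack([rng.integers(0, 256, (64, 64)),   # high-entropy channel
                np.full((64, 64), 128),           # constant channel
                rng.integers(100, 140, (64, 64))],
               axis=-1)
print(select_channel(img))  # → 0 (the uniform-random channel has maximal entropy)
```

A constant channel yields zero entropy, so it is never selected under this assumed rule.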

### **2.2** *Statistical region merging*

In (M. Celebi, 2008), the authors use a variant of the region growing and merging technique called statistical region merging (SRM). The authors propose the following strategy:

• Regions are defined as sets of pixels with homogeneous properties, which are then iteratively grown by combining smaller regions.

• Region growing/merging techniques are used, employing a statistical test to decide the merging of regions.

The SRM framework uses the image generation homogeneity property and performs as shown in Fig. 2:

Fig. 2. Block diagram of Statistical region merging.

Ideally, the order of testing in region merging is such that, when any test between two true regions occurs, all tests inside each of the two true regions have previously occurred.
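A toy Python sketch of this idea follows; it is a deliberate simplification, not the SRM method itself: a fixed intensity-difference threshold stands in for SRM's statistical merging predicate, and sorting the 4-neighbor tests by intensity difference approximates the ideal test order described above.

```python
import numpy as np

def simple_region_merging(img, thresh=8.0):
    """Greedy region merging on a 2-D gray image (simplified stand-in for SRM).

    4-neighbor pixel pairs are tested in order of increasing intensity
    difference; two regions merge when their mean intensities differ by
    less than `thresh`. Regions are tracked with a union-find structure."""
    h, w = img.shape
    parent = list(range(h * w))

    def find(i):                       # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    total = img.astype(float).ravel().copy()   # per-region intensity sum
    size = np.ones(h * w)                      # per-region pixel count
    # all 4-neighbor edges, to be visited in order of |intensity difference|
    edges = [(abs(float(img[y, x]) - float(img[y2, x2])), y * w + x, y2 * w + x2)
             for y in range(h) for x in range(w)
             for y2, x2 in ((y, x + 1), (y + 1, x))
             if y2 < h and x2 < w]
    for _, a, b in sorted(edges, key=lambda e: e[0]):
        ra, rb = find(a), find(b)
        if ra != rb and abs(total[ra] / size[ra] - total[rb] / size[rb]) < thresh:
            parent[rb] = ra
            total[ra] += total[rb]
            size[ra] += size[rb]
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)
```

On an image made of two flat halves (values 0 and 100), this produces exactly two regions; SRM's real predicate (M. Celebi, 2008) replaces the fixed threshold with a statistical test.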

### **Clustering based segmentation**


The most promising approach to segmentation of images in general is the one based on clustering. Cluster-oriented segmentation uses multidimensional data to partition the image pixels into clusters. This kind of technique may be more appropriate than histogram-oriented ones for segmenting images where each pixel has several attributes and is represented by a vector. Cluster analysis has attracted much attention since the 1960s and has been applied in many fields, such as OCR (*Optical Character Recognition*) systems. Below, we present the three most successful frameworks based on this technique that we apply in segmentation applications.

### **2.3 K-Means clustering algorithm**

K-Means is an unsupervised clustering algorithm that classifies the input data points into multiple classes based on their inherent distance from each other. The algorithm assumes that the data features form a vector space and tries to find a natural clustering in them (Hartigan & Wong, 1979). It works in an iterative manner according to the following steps:

1. Choose initial centroids *m*1, . . . , *mK* of the clusters *C*1, . . . , *CK*.

2. Calculate the new cluster memberships. A feature vector *xj* is assigned to the cluster *Ci* if and only if:

$$i = \arg\min\_{k=1,\dots,K} \|\mathbf{x}\_j - m\_k\|^2. \tag{2}$$

3. Recalculate the centroids for the clusters according to

$$m\_i = \frac{1}{|C\_i|} \sum\_{\mathbf{x}\_j \in C\_i} \mathbf{x}\_j, \tag{3}$$

where *xj* belongs to the data set *X* = {*x*1, . . . , *xN*}.

4. If none of the cluster centroids has changed, finish the algorithm. Otherwise, go to step 2.

In Fig. 3, the segmentation process using the K-Means algorithm is shown:

Fig. 3. Block diagram for K-Means algorithm.
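Steps 1–4 of the algorithm can be sketched in Python as follows (a generic illustration with our own variable names; the initial centroids of step 1 are passed in by the caller):

```python
import numpy as np

def kmeans(X, init_centroids, max_iter=100):
    """K-Means following steps 1-4 above.

    X: (N, d) data; init_centroids: (K, d) initial centers (step 1).
    Assign each point to its nearest centroid (step 2), recompute centroids
    as cluster means (step 3), and stop when no centroid changes (step 4)."""
    m = np.asarray(init_centroids, dtype=float)
    for _ in range(max_iter):
        # step 2: x_j joins cluster argmin_k ||x_j - m_k||^2
        d2 = ((X[:, None, :] - m[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # step 3: m_i = (1/|C_i|) sum of x_j in C_i (empty clusters keep their center)
        new_m = np.array([X[labels == k].mean(axis=0) if np.any(labels == k) else m[k]
                          for k in range(len(m))])
        # step 4: finish if no centroid changed, otherwise repeat from step 2
        if np.allclose(new_m, m):
            break
        m = new_m
    return labels, m

X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels, m = kmeans(X, [[0.0, 0.0], [10.0, 10.0]])
print(labels)  # → [0 0 1 1]
```

The two well-separated pairs of points converge to centroids (0, 0.5) and (10, 10.5) in a single update.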


### **Image segmentation using fuzzy methods**

### **Preliminaries and background**

The conventional set theory is based on a binary-valued membership, which implies that a particular element either belongs to a particular set or it does not. A crisp set is defined as one whose elements fully belong to the set and possess well-defined common attributes, which can be measured quantitatively. In a crisp set the common attributes are equally shared by all the elements of the set. In fuzzy sets, on the other hand, the degree of membership of an element in the set is indicated by a membership value, which signifies the extent to which the element belongs to the set. The membership value lies between 0 and 1, with membership "0" indicating no membership and "1" indicating full membership of the element in the set. In a crisp set, the membership values of its elements are either 0 or 1. The membership of an element *x* in a fuzzy set is obtained using a membership function *μ*(*x*) that maps every element belonging to the fuzzy set *XF* to the interval [0, 1]. Formally, this mapping can be written as:

$$
\mu(x): X\_F \to [0, 1] \tag{4}
$$

The membership assignment is primarily subjective in the sense that the users specify the membership values.

*Selection of the membership function.* The assignment of the membership function may be performed in several ways:

• *Membership based on a visual model*: the membership function may be assigned in accordance with the human visual perceptual model. We may model the variation of the membership values of the pixels in a linear fashion as the pixel gray value changes from 0 to L − 1 (for an L-level image).

• *Membership based on a statistical distribution*: the membership values of the pixels may be assigned on the basis of the image statistics as a whole, or on the basis of local information at a pixel calculated from the surrounding pixels. The probability density function of the Gaussian or gamma distribution may be used for the assignment of membership values (Chaira & Ray, 2003).
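These two assignment schemes can be sketched as follows; the exact Gaussian parameterization (how the "Variance" values of Tables 1–3 enter the formula) is our assumption for this illustration:

```python
import numpy as np

def linear_membership(gray, L=256):
    """Visual-model assignment: membership grows linearly from 0 at gray
    level 0 to 1 at gray level L-1 (for an L-level image)."""
    return np.asarray(gray, dtype=float) / (L - 1)

def gauss_membership(x, center, sigma):
    """Statistical assignment: Gaussian-shaped membership around `center`.
    `sigma` plays the role of the 'Variance' parameter of Tables 1-3
    (an assumed parameterization)."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))
```

For example, the "Minimum" set of Table 1 (center 15, variance 16) gives full membership 1.0 at intensity 15 and a membership below 0.1 at intensity 53, the center of the neighboring "Shorter" set.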


### **2.4 Fuzzy C-Means algorithm**

Details of the fuzzy approach to supervised pattern classification and clustering may be found in (Bezdek, 1981). In fuzzy clustering, a pattern is assigned a degree of belonging to each cluster in a partition. Here, let us present the most popular and efficient fuzzy clustering algorithm: the *Fuzzy C-Means Algorithm*. The algorithm finds the centers of '*n*' clusters, iteratively adjusting their positions via evaluation of an objective function. Additionally, it permits more flexibility by introducing partial membership in the other clusters. The classical variant of this algorithm uses the following objective function:

$$E = \sum\_{j=1}^{C} \sum\_{i=1}^{N} \mu\_{ij}^{k} ||\mathbf{x}\_{i} - \mathbf{c}\_{j}||^{2} \tag{5}$$

where *μ<sup>k</sup><sub>ij</sub>* is the fuzzy membership of the pixel *xi*; here, the cluster is identified by its center *cj*, and *k* ∈ [1, ∞] is an exponent weight factor. There is no fixed rule for choosing the exponent


weight factor; however, in many applications *k* = 2 is a common choice, and in the case of crisp clustering *k* may be chosen as 1. The membership value is proportional to the probability that a pixel belongs to a specific cluster, where the probability depends only on the distance between the pixel and each independent cluster center. Thus, the criterion E reaches its minimal value when higher membership values are assigned to the pixels that are near the corresponding cluster center, while lower membership values are assigned to the pixels that are far from a center. The algorithm starts from a given number of clusters and given initial center positions, and then determines how many pixels belong to each cluster. The membership function and centers are determined as follows:

$$\mu\_{ij} = \frac{1}{\sum\_{m=1}^{C} \left( \frac{\|\mathbf{x}\_i - \mathbf{c}\_j\|}{\|\mathbf{x}\_i - \mathbf{c}\_m\|} \right)^{\frac{2}{k-1}}},\tag{6}$$

$$c\_i = \frac{\sum\_{j=1}^{N} \mu\_{ij}^k \mathbf{x}\_j}{\sum\_{j=1}^{N} \mu\_{ij}^k}.\tag{7}$$

The FCM algorithm runs four simple steps:

1. The centers are initialized with the first values of the data, and the value '*t*' is set equal to zero; it is used as a counter for the number of iterations.

2. The fuzzy partition membership functions *μij* are initialized according to (6).

3. The value is changed as *t* = *t* + 1, and novel centers are computed using (7).

4. Steps 2 and 3 run until criterion E converges.


Criterion E approaches its minimum value when its variations decrease below a restriction that the user should decide. The algorithm can also be interrupted if the user determines that only a certain number of iterations are to be done (Bezdek, 1981).
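A compact Python sketch of updates (6) and (7) follows; the random initialization of the memberships and all variable names are our own choices for this generic illustration:

```python
import numpy as np

def fcm(X, C, k=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy C-Means: alternate the membership update (6) and the center
    update (7) until criterion E in (5) stops changing by more than `tol`.
    X: (N, d) data; C: number of clusters; k: exponent weight factor."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), C))
    u /= u.sum(axis=1, keepdims=True)      # memberships of each point sum to 1
    prev_E = np.inf
    for _ in range(max_iter):
        w = u ** k
        # eq. (7): centers are membership-weighted means of the data
        c = (w.T @ X) / w.sum(axis=0)[:, None]
        # distances ||x_i - c_j||; a small epsilon avoids division by zero
        d = np.linalg.norm(X[:, None, :] - c[None, :, :], axis=2) + 1e-12
        # eq. (6): u_ij = 1 / sum_m (d_ij / d_im)^(2/(k-1))
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (k - 1.0)), axis=2)
        E = np.sum(w * d ** 2)             # criterion (5), used as the stopping check
        if abs(prev_E - E) < tol:
            break
        prev_E = E
    return u, c
```

With *k* = 2, the membership update simplifies to inverse-squared-distance weights, so each row of `u` sums to 1 by construction.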

### **2.5 Cluster pre-selection fuzzy C-Means**

The FCM algorithm, one of the most commonly used procedures, has the following drawback: the number of clusters should be pre-determined by the user before it starts to work. Therefore, the correct number of clusters for a concrete application may sometimes not coincide with the number chosen by the user. We therefore propose a method that adds a fuzzy-logic-based process to find the number of clusters to be used. To realize this, we take into consideration the difference between the maximum (*Vmax*) and minimum (*Vmin*) intensity values in an image, *D* = *Vmax* − *Vmin*; this proportion determines the number of clusters. The obtained data are then applied in the determination of the centers, reducing the operational time of the FCM algorithm. This value is the first input of our fuzzy system, called 'Distance', which has six fuzzy sets: 'minimum', 'shorter', 'short', 'regular', 'large' and 'maximum' (see Tab. 1). The second input, called 'Size', has five fuzzy sets: 'Min', 'Small', 'Medium', 'Big' and 'Max' (see Tab. 2). Finally, the output, called 'Cluster', has five fuzzy sets: 'Very few', 'Few', 'Some', 'Many' and 'Too Many' (see Tab. 3).



| Fuzzy set | Function | Center | Variance |
|---|---|---|---|
| Minimum | Gauss | 15 | 16 |
| Shorter | Gauss | 53 | 24 |
| Short | Gauss | 105 | 30 |
| Regular | Gauss | 150 | 30 |
| Large | Gauss | 222 | 45 |
| Maximum | Gauss | 255 | 15 |

Table 1. Membership functions of "Distance"

| Fuzzy set | Function | Center | Variance |
|---|---|---|---|
| Min | Gauss | 9000 | 1.789e+005 |
| Small | Gauss | 3.015e+005 | 1.626e+005 |
| Medium | Gauss | 6.53e+005 | 1.968e+005 |
| Big | Gauss | 9.728e+005 | 2.236e+005 |
| Max | Gauss | 1.44e+006 | 2.862e+005 |

Table 2. Membership functions of "Size"

| Fuzzy set | Function | Center | Variance |
|---|---|---|---|
| Very few | Gauss | 2 | 3 |
| Few | Gauss | 7 | 3 |
| Some | Gauss | 16 | 5 |
| Many | Gauss | 23 | 5 |
| Too many | Gauss | 33 | 7 |

Table 3. Membership functions of "Clusters"

Fig. 4. Pre-selection of the Number of Clusters.

In the second phase, the number of clusters and their centers are already known; the centers are obtained simply by dividing the difference *D* into the '*N*' clusters and determining each center:

$$c\_{j} = j\frac{D}{N}, \qquad j = 1, 2, 3, \ldots, N, \tag{8}$$

where '*N*' represents the number of clusters to be created and '*j*' is a counter that defines all the centers. This looks like a hard type of algorithm, but the centers are still somewhat far from the final ones; therefore, a certain number of iterations still has to be applied to find them, although far fewer than for the original system, which reduces the computation time. The RGB image is decomposed into its three color channels, and the Euclidean distance (Zadeh, 1965) is employed to determine which of the three pairwise distances is the smallest:


$$d\_1(\mathbf{x}\_{red}, \mathbf{x}\_{blue}) = \sqrt{\sum\_{k=1}^{P} (\mathbf{x}\_{red}^k - \mathbf{x}\_{blue}^k)^2},\tag{9}$$

$$d\_2(\mathbf{x}\_{red}, \mathbf{x}\_{green}) = \sqrt{\sum\_{k=1}^{P} (\mathbf{x}\_{red}^k - \mathbf{x}\_{green}^k)^2},$$

$$d\_3(\mathbf{x}\_{green}, \mathbf{x}\_{blue}) = \sqrt{\sum\_{k=1}^{P} (\mathbf{x}\_{green}^k - \mathbf{x}\_{blue}^k)^2}.$$

The two distances that are most alike should be combined into one gray-scale image, which is then processed as the correct image; the proposed method is then used to determine the number of clusters to be created. The CPSFCM consists of the next steps:

1. Divide the RGB image into three different images; use (9) to find the two images that are most similar to each other and use them to create a new gray-scale image.

2. Calculate the distance between intensity levels in the image, *D*, and obtain the size of the image.

3. Feed the fuzzy pre-selective system with these data and obtain the number of centers to be created.

4. Use (8) to obtain the approximate centers. The initial value '*t*' is equal to zero and is used as a counter for the number of iterations.

5. The fuzzy partition membership functions *μij* are initialized according to (6).

6. Let the value be *t* = *t* + 1 and compute the new centers using (7).

7. Steps 5 and 6 should be repeated until criterion E converges.
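Steps 1–2 can be sketched in Python as follows; since the text does not fully specify how the two most similar channels are combined, averaging them is our assumption here:

```python
import numpy as np

def preselect_gray(rgb):
    """Step 1: use the Euclidean distances (9) between the R, G, B channels
    to find the two most similar channels and combine them (here: averaged)
    into one gray-scale image. Step 2: compute D = Vmax - Vmin, the
    'Distance' input of the fuzzy pre-selective system."""
    r, g, b = (rgb[..., i].astype(float).ravel() for i in range(3))
    d1 = np.linalg.norm(r - b)          # d1(x_red, x_blue)
    d2 = np.linalg.norm(r - g)          # d2(x_red, x_green)
    d3 = np.linalg.norm(g - b)          # d3(x_green, x_blue)
    pairs = [(d1, r, b), (d2, r, g), (d3, g, b)]
    _, u, v = min(pairs, key=lambda p: p[0])   # most similar channel pair
    gray = (u + v) / 2.0
    D = gray.max() - gray.min()                 # feeds the 'Distance' sets of Tab. 1
    return gray.reshape(rgb.shape[:2]), D
```

The returned *D* is then fuzzified with the "Distance" sets of Tab. 1, and the image size with the "Size" sets of Tab. 2, to infer the number of clusters.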


### **3. Wavelet texture analysis**

#### **3.1 Continuous wavelet transform**

The proposed segmentation frameworks are based on wavelet analysis, so let us present a brief introduction here. The continuous wavelet transform (CWT) (Grossman & Morlet, 1985) can be written as follows:

$$\mathcal{W}(a,b) = \int\_{-\infty}^{+\infty} \mathbf{x}(t) \frac{1}{\sqrt{|a|}} \psi^\* \left(\frac{t-b}{a}\right) dt. \tag{10}$$

where *b* acts to translate the function across *x*(*t*), and the variable *a* acts to vary the time scale of the probing function *ψ*. If the value *a* is greater than one, the wavelet function *ψ* is stretched

itself can be defined from the scaling function (Rao & Bopardikar, 1998):

$$\psi(t) = \sum\_{n=-\infty}^{\infty} d(n)\sqrt{2}\,\phi(2t - n), \tag{15}$$

where *d*(*n*) are the series of scalars that are related to the waveform *x*(*t*) and that define the discrete wavelet in terms of the scaling function. While the DWT can be implemented using the above equations, it is usually implemented using filter bank techniques. The use of a group of filters to divide up a signal into various spectral components is termed sub-band coding. The most used implementation of the DWT for a 2-D signal applies only two filters for

Fig. 5. Structure of the analysis filter bank for 2-D image.

### **4. Wavelet based texture analysis**

A recent overview of methods applied to segmentation of skin lesions in dermoscopic images (M. Celebi & Stoecker, 2009) results that clustering is the most popular segmentation technique, probably due to their robustness. In the image analysis, texture is an important characteristic, including natural scenes and medical images. It has been noticed that the wavelet transform (WT)provides an ideal representation for texture analysis presenting spatial-frequency properties via a pyramid of tree structures, which is similar to sub-band decomposition. The hierarchical decomposition allows analyzing the high frequencies in the image, which features are importantin the segmentation task. Several works beneficially use the image features within a WT domain during the segmentation process.In paper (Bello, 1994), the image data firstly are decomposed into channels for a selected set of resolution levels using wavelet packets transform, then the Markov random field (MRF) segmentation is applied to the sub-bands coefficients for each scale, starting with the coarsest level, and propagating the segmentation process from current level to segmentation at the next level. Strickland et al. (Strickland & Hahn, 2009) apply the image features extracted in the WT

**N**

2*d*(*n*)*φ*(2*t* − *n*), (15)

**L***<sup>y</sup>*

**2**

**LL**<sup>1</sup>

**LH**<sup>1</sup>

**HL**<sup>1</sup>

**HH**<sup>1</sup>

**2**

**2**

**2**

**H***<sup>y</sup>*

**L***<sup>y</sup>*

**H***<sup>y</sup>*

*ψ*(*t*) =

rows and columns, as in the filter bank, which is shown in 5.

**M**

**N**

along the time axis, and if it is less than one (but still positive) it contacts the function. Wavelets are functions generated from one single function (basis function) called the prototype or mother wavelet by dilations (scalings) and translations (shifts) in time (frequency) domain. If the mother wavelet is denoted by *ψ*(*t*) , the other wavelets *ψa*,*b*(*t*) can be represented as:

$$
\psi\_{a,b}(t) = \frac{1}{\sqrt{|a|}} \psi^\* \left( \frac{t-b}{a} \right). \tag{11}
$$

The variables *a* and *b* represent the parameters for *dilations* and *translations*, respectively, along the time axis. If the wavelet function *ψ*(*t*) is appropriately chosen, then it is possible to reconstruct the original waveform from the wavelet coefficients, just as in the Fourier transform. Since the CWT decomposes the waveform into coefficients of two variables, *a* and *b*, a double summation in the discrete case (or integration in the continuous case) is required to recover the original signal from the coefficients (Meyers, 1993):

$$\mathbf{x}(t) = \frac{1}{C} \int\_{a=-\infty}^{+\infty} \int\_{b=-\infty}^{+\infty} \mathcal{W}(a,b)\,\psi\_{a,b}(t)\,da\,db,\tag{12}$$

where $C = \int\_{-\infty}^{+\infty} \frac{|\Psi(\omega)|^2}{|\omega|}\,d\omega$ and $0 < C < +\infty$ (the so-called *admissibility* condition). In fact, reconstruction of the original waveform is rarely performed using the CWT coefficients because of their redundancy.
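Eq. (10) can be discretized directly as a Riemann sum. The sketch below uses the Mexican-hat wavelet as *ψ* (our choice of mother wavelet; it is real-valued, so *ψ*\* = *ψ*, and the function names are ours):

```python
import numpy as np

def mexican_hat(u):
    """Mexican-hat (Ricker) mother wavelet, a common real-valued psi."""
    return (1.0 - u**2) * np.exp(-u**2 / 2.0)

def cwt(x, t, scales):
    """Direct discretization of Eq. (10):
    W(a, b) = sum_t x(t) * psi((t - b) / a) / sqrt(|a|) * dt."""
    dt = t[1] - t[0]
    W = np.empty((len(scales), len(t)))
    for i, a in enumerate(scales):
        for j, b in enumerate(t):
            W[i, j] = np.sum(x * mexican_hat((t - b) / a)) / np.sqrt(abs(a)) * dt
    return W
```

Because the mother wavelet has zero mean, the response to a constant signal vanishes away from the boundaries, which is a quick sanity check on the implementation.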

### **3.2 Discrete wavelet transforms**

The CWT has one serious problem: it is highly redundant. The CWT provides an oversampling of the original waveform: many more coefficients are generated than are actually needed to uniquely specify the signal. The discrete wavelet transform (DWT) achieves parsimony by restricting the variation in translation and scale, usually to powers of two, which is the case of the dyadic wavelet transform. The basic analytical expression for the DWT, which is usually implemented using filter banks (Mallat, 1989), is:

$$\mathbf{x}(t) = \sum\_{k=-\infty}^{\infty} \sum\_{l=-\infty}^{\infty} d(k, l) 2^{-k/2} \psi(2^{-k}t - l). \tag{13}$$

Here, *k* is related to *a* as *a* = 2*<sup>k</sup>*; *b* is related to *l* as *b* = 2*<sup>k</sup>l*; and *d*(*k*, *l*) is a sampling of *W*(*a*, *b*) at the discrete points *k* and *l*. The DWT introduces the scaling function, a function that facilitates its computation. To implement the DWT efficiently, the finest resolution is computed first. The computation then proceeds to coarser resolutions but, rather than starting over on the original waveform, it uses a smoothed version of the fine-resolution waveform. This smoothed version is obtained with the help of the scaling function. The definition of the scaling function uses a dilation, or two-scale difference, equation:

$$\phi(t) = \sum\_{n=-\infty}^{\infty} \sqrt{2}c(n)\phi(2t - n). \tag{14}$$

Here, *c*(*n*) are the series of scalars that define the specific scaling function. This equation involves two time scales (*t* and 2*t*) and can be quite difficult to solve. In the DWT, the wavelet itself can be defined from the scaling function (Rao & Bopardikar, 1998):

$$\psi(t) = \sum\_{n = -\infty}^{\infty} \sqrt{2}d(n)\phi(2t - n),\tag{15}$$

where *d*(*n*) are the series of scalars that are related to the waveform *x*(*t*) and that define the discrete wavelet in terms of the scaling function. While the DWT can be implemented using the above equations, it is usually implemented using filter bank techniques. The use of a group of filters to divide a signal into various spectral components is termed sub-band coding. The most common implementation of the DWT for 2-D signals applies two filters to the rows and columns, as in the filter bank shown in Fig. 5.

Fig. 5. Structure of the analysis filter bank for 2-D image.
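The analysis filter bank of Fig. 5 can be sketched with the Haar filter pair (our choice of wavelet family for illustration; the structure is the same for any family): the rows are low- and high-pass filtered and downsampled by two, then the same is done along the columns, yielding the LL, LH, HL and HH sub-bands:

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D DWT implemented as the filter bank of
    Fig. 5, using the Haar filters as an example: filter and
    downsample the rows with the low-pass (L) and high-pass (H)
    filters, then the columns, producing the LL (approximation)
    and LH, HL, HH (detail) sub-bands."""
    lo = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass (scaling) filter
    hi = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (wavelet) filter

    def filt_rows(a, h):
        # convolve every row with h, keep every second sample (downsample by 2)
        return np.stack([np.convolve(row, h)[1::2] for row in a])

    L = filt_rows(x, lo)        # rows low-passed,  M x N/2
    H = filt_rows(x, hi)        # rows high-passed, M x N/2
    LL = filt_rows(L.T, lo).T   # M/2 x N/2 approximation image
    LH = filt_rows(L.T, hi).T   # detail sub-band
    HL = filt_rows(H.T, lo).T   # detail sub-band
    HH = filt_rows(H.T, hi).T   # detail sub-band
    return LL, LH, HL, HH
```

Since the Haar pair is orthonormal, the total energy of the four sub-bands equals that of the input image, which makes a convenient correctness check.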

### **4. Wavelet based texture analysis**

A recent overview of methods applied to the segmentation of skin lesions in dermoscopic images (M. Celebi & Stoecker, 2009) finds that clustering is the most popular segmentation technique, probably due to its robustness. In image analysis, texture is an important characteristic, both in natural scenes and in medical images. It has been noticed that the wavelet transform (WT) provides an ideal representation for texture analysis, presenting spatial-frequency properties via a pyramid of tree structures similar to sub-band decomposition. The hierarchical decomposition allows analyzing the high frequencies in the image, whose features are important in the segmentation task. Several works beneficially use the image features within a WT domain during the segmentation process. In the paper (Bello, 1994), the image data are first decomposed into channels for a selected set of resolution levels using the wavelet packet transform; then Markov random field (MRF) segmentation is applied to the sub-band coefficients at each scale, starting with the coarsest level and propagating the segmentation from the current level to the next. Strickland et al. (Strickland & Hahn, 2009) apply image features extracted in the WT domain for the detection of microcalcifications in mammograms, using a matching process and some a priori knowledge of the target objects.

Zhang et al. (Zhang & Desai, 2001) employ a Bayes classifier on wavelet coefficients to determine an appropriate scale and threshold that can separate segmentation targets from other features.

### **5. Proposed framework**

The idea of our approach consists of employing feature extraction in the WT space before the segmentation process; the main difference from other algorithms presented in the literature is the use of the information from the three color channels in the WT space, gathering the color channels via nearest neighbour interpolation (NNI). The developed approach uses a procedure that consists of the following stages: a digital color image I[n,m] is separated into its R, G and B channels in the color space, and each color channel is decomposed by calculating its wavelet coefficients using Mallat's pyramid algorithm (Mallat, 1989). For the chosen wavelet family, the original image is decomposed into four sub-bands (Fig. 5). The sub-bands labeled LH, HL and HH represent the finest-scale wavelet coefficients (detail images), while the sub-band LL corresponds to the coarse-level coefficients (approximation image); they are denoted below as $D\_h^{(2^j)}$, $D\_v^{(2^j)}$, $D\_d^{(2^j)}$ and $A^{(2^j)}$, respectively, at a given scale $2^j$, for *j* = 1, 2, . . . , *J*, where *J* is the number of scales used in the DWT (Kravchenko, 2009). Finally, the DWT can be represented as follows:

$$\mathcal{W}\_i = |\mathcal{W}\_i| \exp(j\Theta\_i),\tag{16}$$

$$|\mathcal{W}\_{i}| = \left(\sqrt{|D\_{h,i}|^2 + |D\_{v,i}|^2 + |D\_{d,i}|^2}\right)^2,\tag{17}$$

$$\Theta\_i = \begin{cases} \alpha\_i & \text{if } D\_{h,i} > 0, \\ \pi - \alpha\_i & \text{if } D\_{h,i} < 0, \end{cases}\tag{18}$$

where

$$\alpha\_i = \tan^{-1} \left( \frac{D\_{v,i}}{D\_{h,i}} \right).$$
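The feature computation of Eqs. (16)-(18) can be sketched element-wise from the three detail sub-bands of one decomposition level. The use of absolute values inside the arctangent is our choice, to keep the angle in [0, π/2] before the quadrant correction of Eq. (18); the function name is ours:

```python
import numpy as np

def wavelet_feature(Dh, Dv, Dd):
    """Magnitude |W_i| (Eq. 17) and phase Theta_i (Eq. 18), computed
    element-wise from the horizontal (Dh), vertical (Dv) and diagonal
    (Dd) detail sub-bands of one decomposition level."""
    # Eq. (17): the square root and the outer square cancel,
    # leaving the sum of squared detail coefficients.
    mag = Dh**2 + Dv**2 + Dd**2
    # alpha_i = atan(|Dv| / |Dh|); arctan2 also handles Dh == 0 safely.
    alpha = np.arctan2(np.abs(Dv), np.abs(Dh))
    # Eq. (18): quadrant correction based on the sign of Dh.
    theta = np.where(Dh > 0, alpha, np.pi - alpha)
    return mag, theta
```
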

Therefore, *Wi* is considered as a new image for each color channel. The subsequent processing in the wavelet transform space consists of the following stages: the classic segmentation method is applied to the images; the segmented image corresponding to the red channel is interpolated with the segmented image corresponding to the green channel; the image found after applying the *NNI* process is interpolated with the segmented image corresponding to the blue channel, again using *NNI*; finally, this image is considered the output of the segmentation procedure. Fig. 6 shows the block diagram of the above. Considering the information of the three color channels is an advantage in the segmentation process, as can be judged from the clusters formed in each of them.
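The inter-channel stage just described can be sketched as follows, with a generic `segment` callable standing in for K-Means, FCM or CPSFCM. The combination rule (nearest integer of the mean label) is a simplifying assumption of ours; the chapter only specifies that NNI is applied twice:

```python
import numpy as np

def nni_combine(seg_a, seg_b):
    """Combine two label maps; stands in for the NNI step
    (our simplified assumption: nearest integer of the mean label)."""
    return np.rint((seg_a + seg_b) / 2.0).astype(int)

def segment_color_image(rgb, segment):
    """Per-channel segmentation followed by two NNI combinations,
    as in the block diagram of Fig. 6. `segment` is any 2-D
    segmentation routine (K-Means, FCM, CPSFCM, ...)."""
    seg_r = segment(rgb[..., 0])   # red channel
    seg_g = segment(rgb[..., 1])   # green channel
    seg_b = segment(rgb[..., 2])   # blue channel
    return nni_combine(nni_combine(seg_r, seg_g), seg_b)
```

With identical channels and a simple mean-threshold segmenter, the combined output reduces to the single-channel segmentation, as expected.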

The block diagram in Fig. 7 explains the operations for: a) image segmentation when the K-Means algorithm is used with the WT, named WK-Means; b) image segmentation when the FCM algorithm is used with the WT, named W-FCM; and, finally, c) image segmentation when the CPSFCM algorithm is used with the WT, named W-CPSFCM.

Fig. 7. Block diagram of the proposed algorithms: a) segmentation with WK-MEANS; b) segmentation with W-FCM; c) segmentation with W-CPSFCM.


Fig. 6. Block diagram of proposed framework.

### **6. Evaluation criteria**

In this section, let us present the evaluation criteria, focusing on the segmentation of dermoscopic images. The same measures can be used for segmentation in other applications. Different objective measures are used in the literature to evaluate the segmentation performance in dermoscopic images. Objective measures require


the ground truth (GT) image, which is determined by a dermatologist manually drawing the border around the lesion.

Employing the GT image, Hance *et al*. (Hance, 1996) calculated the exclusive disjunction (XOR) measure; other metrics used for segmentation performance are presented in (Garnavi, 2011): *sensitivity* and *specificity*, *precision* and *recall*, *true positive rate*, *false positive rate*, *pixel misclassification probability*, and the *weighted performance index*, among others. Below, let us consider the *sensitivity* and *specificity* measures. Sensitivity and specificity are statistical measures of the performance of a binary classification test, commonly used in medical studies. In the context of segmentation of skin lesions, sensitivity measures the proportion of actual lesion pixels that are correctly identified as such. Specificity measures the proportion of background skin pixels that are correctly identified. Given the following definitions:

**TP** true positive, object pixels that are correctly classified as interest object.

**FP** false positive, background pixels that are incorrectly identified as interest object.

**TN** true negative, background pixels that are correctly identified as background.

**FN** false negative, object pixels that are incorrectly identified as background.

In each of the above categories, the sensitivity and specificity are given by:

$$sensitivity = \frac{TP}{TP + FN} \tag{19}$$

$$specificity = \frac{TN}{FP + TN} \tag{20}$$
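Given boolean masks for the segmentation result and the GT, the four counts and Eqs. (19)-(20) can be computed as follows (the function name is ours):

```python
import numpy as np

def sensitivity_specificity(seg, gt):
    """Sensitivity = TP/(TP+FN) and specificity = TN/(FP+TN),
    Eqs. (19)-(20), for boolean lesion masks `seg` and `gt`."""
    seg, gt = np.asarray(seg, bool), np.asarray(gt, bool)
    tp = np.sum(seg & gt)     # lesion pixels correctly classified
    fp = np.sum(seg & ~gt)    # background wrongly marked as lesion
    tn = np.sum(~seg & ~gt)   # background correctly classified
    fn = np.sum(~seg & gt)    # lesion pixels missed
    return tp / (tp + fn), tn / (fp + tn)
```
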

We also apply *Receiver Operating Characteristic* (ROC) analysis (Fig. 8), which permits evaluating the image segmentation quality in terms of the ability of a human observer or a computer algorithm, using the image data, to classify patients as "*positive*" or "*negative*" with respect to a particular disease. This characteristic represents the second level of diagnostic efficacy in the hierarchical model described by Fryback and Thornbury (Fryback DG, 1991). Fig. 8 presents the points of the ROC curve, which are obtained by sweeping the classification threshold from the most positive classification value to the most negative. These points produce a quantitative summary measure of the ROC curve, called the area under the ROC curve (AUC).

Fig. 8. ROC curve.
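The threshold sweep and the AUC summary measure just described can be sketched as follows: for each threshold over a soft classifier score, the true- and false-positive rates are recorded, and the AUC is the trapezoidal area under the resulting curve (the function and variable names are ours):

```python
import numpy as np

def roc_auc(scores, labels):
    """Sweep the classification threshold from the most positive score
    to the most negative, record (FPR, TPR) points, and return the
    area under the ROC curve by trapezoidal integration."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    fpr, tpr = [0.0], [0.0]
    for thr in np.sort(np.unique(scores))[::-1]:
        pred = scores >= thr
        tpr.append(np.sum(pred & labels) / np.sum(labels))
        fpr.append(np.sum(pred & ~labels) / np.sum(~labels))
    fpr.append(1.0)
    tpr.append(1.0)
    # trapezoidal area under the (FPR, TPR) curve
    area = 0.0
    for i in range(1, len(fpr)):
        area += (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
    return area
```

A classifier whose scores perfectly separate the two classes yields an AUC of one, which matches the target discussed in the simulation results.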

### **7. Dermoscopic images**

In the area of biomedical image processing, we applied the developed and existing segmentation techniques to dermoscopic images. Let us present some definitions of terms commonly used in this application area. The term "skin cancer" refers to three different conditions, which, from the least to the most dangerous, can be presented as follows:

• Squamous cell carcinoma (the first stage of which is called *actinic keratosis*)

• Basal cell carcinoma (or basal cell *carcinoma epithelioma*)

• *Melanoma*

The two most common forms of skin cancer are basal cell carcinoma and squamous cell carcinoma. Together, these two are also referred to as nonmelanoma skin cancer.


Melanoma is generally the most serious form of skin cancer because it tends to spread (metastasize) throughout the body quickly. For a diagnosis, doctors usually remove all or part of the growth by performing a biopsy, but this is considered an invasive technique. Alternatively, dermatoscopy reduces the need for a biopsy by applying a dermatoscope, a device that magnifies the sub-surface structures with the use of oil and illumination, also called epiluminescence. Dermatoscopy is a particularly helpful standard method for diagnosing the malignancy of skin lesions (Argenziano & Soyer, 2001). A major advantage is that the accuracy of dermatoscopy is increased by up to 20% in the case of sensitivity and up to 10% in the case of specificity, compared with naked-eye examination, permitting a reduction in the frequency of unnecessary surgical excisions of benign lesions (Vestergaard, 2001). Several instruments designed for computer-aided diagnosis (CAD) (Fig. 9) of skin lesions have been proposed, which usually work in four steps: data acquisition of the skin (dermoscopic images), segmentation, feature extraction and classification. The most relevant step is the segmentation process, because it provides fundamental information to the next stages. Image segmentation is the process of adequately grouping pixels into a few regions whose pixels share some similar characteristics. Automated analysis of the edges, colors and shape of the lesion relies upon an accurate segmentation and is an important first step in any CAD system, but irregular shape, nonuniform color and ambiguous structures make the problem difficult.

Fig. 9. Block diagram of CAD system.

### **8. Simulation results**

This section presents numerous experimental results in segmentation obtained by the developed and existing techniques. The segmentation algorithms were evaluated on a set of 50 dermoscopic images obtained from http://www.dermoscopyatlas.com and http://www.wisdom.weizmann.ac.il. The GT images were found via human-based segmentation. The dataset presents 24-bit color images in JPEG format with 600 x 600 pixel size. Below, we expose only five different images with different texture characteristics, where the sensitivity and specificity are used as the evaluation criteria for segmentation accuracy. We also plotted the ROC curves to examine the classifier performance. Additionally, the diagnostic performance was quantified by the AUC measure. Fig. 10 shows the images of different nature used in this study.

Fig. 10. Images used in this study: a) *Flower*; b) *Sea shell*; c) *Tree*; d) *Melanoma* (lesion 1); e) *Melanoma* (lesion 2).

The simulation results in Table 4 present the values of AUC for the proposed framework based on different wavelet families, confirming their better performance in comparison with the classical techniques. The maximum value of AUC is obtained when the WF Daubechies 4 is used, followed by the WAF *π*6. According to (Fryback DG, 1991), the AUC measure should have values greater than 0.8 to be considered a good test, but our study is focused on the best approximation of the segmented image to the GT, which means getting a value of AUC close to one.

Based on the objective quantitative metrics and the subjective visual results presented in Fig. 11, one can see that the W-FCM presents borders that characterize the lesion (green color); in Fig. 11 c-f, it is easy to note that the segmentation procedure has performed only around the lesion. On the other hand, in Fig. 11 g-j, where the WAF results are presented, one can see that, together with the segmentation of the lesion border, some areas inside the lesion are also segmented.

Fig. 11. Image segmentation results under different algorithms using: a) Melanoma, b) Ground Truth, c) FCM, d) W-FCM with WF Coiflets 3, e) W-FCM with Daubechies 4, f) W-FCM with WF biorthogonal 6.8, g) W-FCM with WAF *up*2, h) W-FCM with WAF *π*6, i) W-FCM with WAF *fup*2, j) W-FCM with WAF *e*2.

Table 4. AUC simulation results using different segmentation algorithms

Figure 12 presents the ROC curves for lesion 1, comparing the classic and proposed algorithms. In particular, Fig. 12 c) exposes the ROC curves for the WK-Means and K-Means algorithms, where one can see the superiority of the proposed WK-Means algorithm that uses the WAF *π*6 (see the ROC curve in light green); Fig. 12 d) presents the ROC curves for the W-FCM and FCM algorithms, where it is easy to observe the better performance of the W-FCM that employs the WF biorthogonal 6.8 (see the ROC curve in red); and, finally, in Fig. 12 e), the ROC curves for the W-CPSFCM and CPSFCM algorithms confirm the better performance of the former when the WF biorthogonal 6.8 is used (see the ROC curve in red).


Fig. 12. a) Lesion 1 (melanoma); b) ground-truth image; ROC curves for c) the WK-Means algorithm, d) the FCM algorithm, e) the W-CPSFCM algorithm: WF Daubechies 4 (dark blue), WF biorthogonal 6.8 (red), WF Coiflets 3 (purple), WAF *up*2 (dark green), WAF *fup*2 (aqua), WAF *π*6 (light green); FCM (black).

### **9. Conclusion**

The segmentation process partitions an image into a set of homogeneous and meaningful regions, allowing the detection of an object of interest in a specific task, and it is an important stage in problems such as computer vision, remote sensing, medical imaging, etc. In this chapter we presented a review of existing promising methods of image segmentation; some of them are popular because they are used in various applications. The novel approach to segmentation exposed here has generated several frameworks that combine traditional and fuzzy logic techniques (WK-Means, W-FCM, W-CPSFCM); all of them operate in the wavelet transform space and use approximation procedures for inter-color-channel processing, permitting better extraction of the image features. Numerous simulation results summarize the performance of all investigated algorithms on images of different nature, reporting quality in the form of ROC curves (sensitivity-specificity parameters) and AUC values. These results justify the substantially better performance of the developed frameworks (WK-Means, W-FCM, and W-CPSFCM), which apply different classic wavelet families and WAFs, in comparison with traditional techniques.

### **10. Acknowledgement**

The authors thank the National Polytechnic Institute of Mexico and CONACYT (grant 81599) for their support in realizing this work.

### **11. References**

Abramovich, F. & Benjamini, Y. (1996). Adaptive thresholding of wavelet coefficients, *Computational Statistics & Data Analysis* 22(4): 351–361.

Argenziano, G. & Soyer, H. P. (2001). Dermoscopy of pigmented skin lesions: a valuable tool for early diagnosis of melanoma, *The Lancet Oncology* 2(7): 443–449.

Bello, M. (1994). A combined Markov random field and wave-packet transform-based approach for image segmentation, *IEEE Trans. Image Processing* 3(6): 834–846.

Bezdek, J. (1981). *Pattern Recognition with Fuzzy Objective Function Algorithms*, Plenum Press, New York.

Chaira, T. & Ray, A. K. (2003). Fuzzy approach to color region extraction, *Pattern Recognition Letters* 24(12): 1943–1950.

Fryback, D. G. & Thornbury, J. R. (1991). The efficacy of diagnostic imaging, *Medical Decision Making* 11(1): 88–94.

Garnavi, R., Aldeen, M. & Celebi, M. E. (2011). Weighted performance index for objective evaluation of border detection methods in dermoscopy images, *Skin Research and Technology* 17(1): 33–44.

Gonzalez, R. C. & Woods, R. E. (1992). *Digital Image Processing*, Addison-Wesley.

Hance, G. A., Umbaugh, S. E., Moss, R. H. & Stoecker, W. V. (1996). Unsupervised color image segmentation with application to skin tumor borders, *IEEE Engineering in Medicine and Biology Magazine* 15(1): 104–111.



Celebi, M. E., Kingravi, H., Iyatomi, H. et al. (2008). Border detection in dermoscopy images using statistical region merging, *Skin Research and Technology* 14(3): 347–353.

Celebi, M. E., Iyatomi, H., Schaefer, G. & Stoecker, W. (2009). Lesion border detection in dermoscopy images, *Computerized Medical Imaging and Graphics* 33(2): 148–153.

Grossman, A. & Morlet, J. (1985). *Mathematics and Physics: Lectures on Recent Results*, L. Streit (ed.).

Hartigan, J. A. & Wong, M. A. (1979). A k-means clustering algorithm, *Applied Statistics* 28(1): 100–108.

Kravchenko, V., Perez-Meana, H. & Ponomaryov, V. (2009). *Adaptive Digital Processing of Multidimensional Signals with Applications*, FizMatLit, Moscow.

Mallat, S. (1989). A theory for multiresolution signal decomposition: the wavelet representation, *IEEE Trans. on Pattern Analysis and Machine Intelligence* 11(7): 674–693.

Meyer, Y. (1993). *Wavelets: Algorithms and Applications*, SIAM.

Rao, R. M. & Bopardikar, A. S. (1998). *Wavelet Transforms: Introduction to Theory and Applications*, Addison-Wesley.

Strickland, R. N. & Hahn, H. I. (1995). Wavelet transform matched filters for the detection and classification of microcalcifications in mammography, *Proceedings of the International Conference on Image Processing*, Washington, pp. 422–425.

Vestergaard, M. E., Macaskill, P., Holt, P. E. & Menzies, S. W. (2008). Dermoscopy compared with naked eye examination for the diagnosis of primary melanoma: a meta-analysis of studies performed in a clinical setting, *British Journal of Dermatology* 159(3): 669–676.

Zadeh, L. A. (1965). Fuzzy sets, *Information and Control* 8(3): 338–353.

Zhang, X. & Desai, M. (2001). Segmentation of bright targets using wavelets and adaptive thresholding, *IEEE Trans. Image Processing* 10(7): 1020–1030.

## **Part 2**

**Techniques and Implementation**


### **Fuzzy Logic Approach for QoS Routing Analysis**

Adrian Shehu and Arianit Maraj *Polytechnic University of Tirana, Albania* 

### **1. Introduction**

One of the main challenges nowadays in managing IP networks is guaranteeing quality of service. One of the proposed solutions is traffic management with the MPLS protocol. However, characterizing the requirements and the network state are very difficult tasks, taking into account that the requirements of different services are random and that, as a result, the network condition varies dynamically and randomly. This is the reason why researchers have used fuzzy logic for solving many of the problems that can occur in very dynamic networks. In this chapter we will analyze MPLS network routing metrics using fuzzy logic. We will pay attention to the most appropriate defuzzification methods for finding the path that fulfills the QoS requirements of multimedia services.

One of the key issues in providing end-to-end quality of service (QoS) guarantees in today's networks is how to determine a feasible route that satisfies a set of constraints. In general, finding a path subject to multiple constraints is an NP-complete problem that cannot be exactly solved in polynomial time. Accordingly, several heuristics and approximation algorithms have been proposed for this problem. Many of these algorithms suffer from either excessive computational cost or low performance.
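To make the cost of exact multi-constrained routing concrete, here is a brute-force sketch; the graph, QoS bounds, and function names are invented for illustration and are not an algorithm from this chapter. It enumerates all simple paths and keeps those that satisfy both an additive delay bound and a bottleneck-bandwidth floor; this exhaustive enumeration is precisely what the heuristics and approximation algorithms try to avoid on large networks.

```python
# Toy illustration (our own example): find routes from 'a' to 'd' whose
# total delay stays under a bound and whose bottleneck bandwidth stays
# above a floor, by brute-force enumeration of simple paths.

# graph[node] = list of (neighbor, delay_ms, bandwidth_mbps)
graph = {
    "a": [("b", 10, 100), ("c", 5, 30)],
    "b": [("d", 10, 80)],
    "c": [("d", 5, 30)],
}

def feasible_paths(src, dst, max_delay, min_bw):
    """Return every loop-free path meeting both QoS constraints."""
    results = []

    def walk(node, path, delay, bw):
        if delay > max_delay or bw < min_bw:
            return                       # prune: a constraint is already violated
        if node == dst:
            results.append((path, delay, bw))
            return
        for nxt, d, b in graph.get(node, []):
            if nxt not in path:          # keep the path simple (no loops)
                walk(nxt, path + [nxt], delay + d, min(bw, b))

    walk(src, [src], 0, float("inf"))
    return results

print(feasible_paths("a", "d", max_delay=25, min_bw=50))
# → [(['a', 'b', 'd'], 20, 80)]  (a-c-d has lower delay but too little bandwidth)
```

With a handful of nodes this is instant, but the number of simple paths grows exponentially with network size, which is why polynomial-time heuristics are the practical choice.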

Selecting feasible paths that satisfy the various QoS requirements of applications in a network is known as QoS routing [1]. In general, two issues are related to QoS routing: state distribution and routing strategy. State distribution addresses the issue of exchanging state information throughout the network. The routing strategy is used to find a feasible path that meets the QoS requirements. In this chapter we will present the fuzzy logic approach for QoS routing analysis in a network able to offer multimedia services, such as an MPLS network [2] [3] [4] [5] [6]. Fuzzy sets offer a powerful mathematical structure for dealing with the imprecision and uncertainty of the real world. Linguistic variables allow the representation of numerical values with fuzzy sets. Since networks nowadays are very dynamic, with parameters affected by unexpected overloads, failures and other concerns, fuzzy logic offers a promising approach for addressing different network problems [7] [8]. The application of fuzzy logic in telecommunication networks is recent, and it has proved to be a very economic and efficient method compared with other methods used in automatic control. Recent research on the application of fuzzy logic in telecommunication networks deals with packet queuing, buffer management, call acceptance, QoS routing, channel capacity sharing, traffic management, etc.


Some problems can occur during the transmission of multimedia services; therefore it is a good idea to design control mechanisms for solving them. As a result of the complex nature of such control mechanisms, more and more is being done in designing intelligent control techniques. One of the intelligent control techniques that will be part of this chapter is the Fuzzy Logic Controller (FLC), a technique based on fuzzy logic. In this chapter we will use the main metrics of the MPLS network as input parameters of the FLC, and we will try to choose the most appropriate defuzzification method for finding better crisp values, in terms of link utilization, at the output of the FLC.

In this chapter we will first briefly explain the QoS routing principle and MPLS technology in terms of QoS routing metrics. We will also give the main attention to the fuzzy logic approach, especially the FLC used for QoS routing analysis in the MPLS network. In this respect we will try to find the best defuzzification method for obtaining better crisp values for link optimization in the MPLS network.

### **2. QoS routing**

The main goal of QoS-based routing is to select the most suitable path according to the traffic requirements of multimedia applications. The selection of suitable transmission paths is done through routing mechanisms based on the existing network resources and the QoS requirements. Multimedia applications might suffer quality degradation in traditional networks such as the Internet [9]. This problem can be solved in networks that offer dynamic path creation with guaranteed bandwidth and constrained delays [10]. Real-time applications impose strict QoS requirements. These requirements are expressed by parameters such as acceptable end-to-end delays, necessary bandwidth and acceptable losses. For example, audio and video transmissions have strict requirements for delay and losses. Wide bandwidth must be guaranteed for high-capacity transmission. Real-time traffic, video in particular, quite often consumes significant quantities of network resources. Efficient management of network resources will reduce network service cost and will allow more applications to be transmitted simultaneously. The task of finding suitable paths through networks is handled by routing protocols. Since common routing protocols are reaching their acceptable complexity limits, it is important that the complexity introduced by QoS-based routing [11] should not damage the scalability of routing protocols. MPLS is a versatile solution to many of the current problems faced by the Internet [12]. With its wide support for QoS and traffic engineering, MPLS is establishing itself as a standard of the next generation's network.

### **3. MPLS network**

MPLS is a data transmission technology which brings some features of circuit-switched networks to packet-switched networks. MPLS actually works at both Layer 2 and Layer 3 of the OSI model, and it is often referred to as a Layer 2.5 technology. It is designed to provide data transport for all users. MPLS techniques can be used as a more efficient tool for traffic engineering than standard routing in IP networks. MPLS can also be used for path control of traffic flows, in order to utilize network resources in an optimal way. Network paths can be defined for sensitive traffic, high-security traffic, etc., guaranteeing different CoS (Class of Service) and QoS levels. The main MPLS feature is virtual circuit configuration through the IP network. These virtual circuits are called LSPs.

Fig. 1. MPLS network

MPLS supports traffic engineering for QoS provision and traffic prioritization, for example the provision of wider bandwidth and lower delays for gold customers who are able to pay more for better quality services. As another example, many paths can be defined through edge points, ensuring lower levels of interference and backup services in case of any network failure. This is like using routing metrics in an IP network to force traffic to flow in one or another direction, but MPLS is much more powerful. An important aspect of MPLS is the priority concept of the LSP [13]. LSPs can be configured with higher or lower priority. LSPs with higher priority have advantages in finding new paths compared with those of lower priority.

Figure 1 shows an MPLS network and its corresponding elements. The core part represents the MPLS network. MPLS combines the advantages of packet forwarding, which is based on Layer 2, with the routing properties of Layer 3. MPLS also offers traffic engineering (TE). TE is the process of selecting suitable routes for data transmission on the network; it has to do with the efficient use of network resources and the improvement of network performance, thus increasing network revenue and QoS. One of the main goals of TE is the efficient and reliable functioning of the network. TE calculates the route from the source to the destination based on different metrics such as channel capacity (bandwidth), delays and other administrative requirements.

### **4. Routing metrics in MPLS network**

Routing metrics have a significant role, not just in the complexity of route calculation but also in QoS. Using multiple metrics can model the network in a more precise way, but the problem of finding an appropriate path can become very complex [9] [10]. In general, there are three types of metrics:

- additive,
- multiplicative and
- concave.


They are defined as follows. For any path *p* = (*i, j, k, …, l, m*), where *d*(*i*, *j*) denotes the metric of link (*i*, *j*), the metric *d* is additive if

*d(p) = d(i, j) + d(j, k) + … + d(l, m)* (1)

multiplicative if

*d(p) = d(i, j) × d(j, k) × … × d(l, m)* (2)

and concave if

*d(p) = min[d(i, j), d(j, k), …, d(l, m)]* (3)

In the MPLS network there are many metrics that we could take into consideration but, in this chapter, for the sake of simplicity, we will consider the three main metrics: delay, losses and bandwidth. These metrics play a direct role in the quality of service of the MPLS network. In order to consider multiple metrics simultaneously, we will use the fuzzy logic controller. The FLC is an intelligent technique that can manipulate two or more input parameters simultaneously without any problem.
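Expressed directly in code, the three compositions of equations (1)-(3) amount to a sum, a product and a minimum over the links of a path. A minimal sketch, with invented per-link values:

```python
from functools import reduce
from operator import mul

# Per-link values along one path (invented numbers for illustration)
delays = [10, 5, 8]                   # ms: additive metric, eq. (1)
reliabilities = [0.99, 0.98, 0.995]   # success probability: multiplicative, eq. (2)
bandwidths = [100, 40, 80]            # Mbps: concave metric, eq. (3)

path_delay = sum(delays)                       # end-to-end delay: 23 ms
path_reliability = reduce(mul, reliabilities)  # ≈ 0.9653
path_bandwidth = min(bandwidths)               # bottleneck: 40 Mbps

print(path_delay, path_reliability, path_bandwidth)
```

The concave case explains why bandwidth is also called a "bottleneck" metric: the path is only as wide as its narrowest link.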

### **5. Soft computing**

Soft computing is more tolerant of uncertainty and partial truth than hard computing. The model on which soft computing is based is the human mind. The main components of soft computing are fuzzy logic, neural networks, probabilistic reasoning and genetic algorithms. The most important component of soft computing is fuzzy logic, which will be the subject of this chapter. Applications of fuzzy logic in telecommunications networks are recent and are organized into three main efforts: modeling and control, management and forecasting, and performance estimation.

### **5.1 Fuzzy logic**

The idea of fuzzy logic was born in 1965, when Lotfi Zadeh published the seminal paper that marked its beginning [14]. Fuzzy logic is tolerant of imprecise data and nonlinear functions, and it can be mixed with other techniques for solving different problems. The main principle of fuzzy logic is the use of fuzzy sets, which have no crisp boundaries.

### **6. QoS routing analysis using FLC – Fuzzy Logic Controller**

As we have mentioned above, for QoS routing analysis we will use the FLC as an intelligent control technique. A Fuzzy Logic Controller [15] is a rule-based system in which a fuzzy rule represents a control mechanism. In this case, a fuzzy controller uses fuzzy logic to simulate human thinking. In particular, the FLC is useful in two special cases [15]:

- when the control processes are too complex to analyze by conventional quantitative techniques;
- when the available sources of information are interpreted qualitatively or uncertainly.


A fuzzy logic controller consists of a fuzzifier, a rule base, a fuzzy inference engine and a defuzzifier (see Figure 2).

Fig. 2. Fuzzy Logic Controller

**Fuzzifier**: A fuzzifier operator has the effect of transforming crisp value to fuzzy sets. Fuzzifier is presented with *x*=*fuzzifier*(*x*0), where *x*0 is input crisp value; *x* is a fuzzy set and fuzzifier represents a fuzzification operator.

**Rule-Base (Linguistic Rules): C**ontains IF-THEN rules that are determined through fuzzy logic.

*Example*: if *x* is *Ai* and *Y* is *Bi* the *Z* is *Ci,* Where *x* and *y* are inputs and *z* is controlled output; Ai, Bi and Ci are linguistic terms, like: low, medium, high etc.

**Fuzzy Inference:** Is a process of converting input values into output values using fuzzy logic. Converting is essential for decision making. Fuzzy Inference process includes: membership functions and logic operations

**Defuzzifier:** can be expressed by: *y*ou=*defuzzifier*(*y*), where *y* identifies fuzzy controller action, *y*ou identifies crisp value of control action and defuzzifier presents defuzzifier operator. Converting process of fuzzy terms in crisp values is called defuzzification. There are some defuzzification methods: COG (Centre of Gravity), COGS (Centre of Gravity for Singletons), COA (Centre of Area), LM (Left Most Maximum) and RM (Right Most Maximum).
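The three composition rules in Eqs. (1)–(3) can be sketched in a short example (an illustrative Python sketch, not from the chapter; the three-hop path and all metric values are hypothetical):

```python
# Composing a path metric from per-link values according to Eqs. (1)-(3).
from functools import reduce
import operator

def additive(link_values):        # Eq. (1): d(p) = d(i,j) + d(j,k) + ...
    return sum(link_values)

def multiplicative(link_values):  # Eq. (2): d(p) = d(i,j) * d(j,k) * ...
    return reduce(operator.mul, link_values, 1.0)

def concave(link_values):         # Eq. (3): d(p) = min(d(i,j), d(j,k), ...)
    return min(link_values)

# Hypothetical three-hop path: delay is additive, the probability of
# *not* losing a packet is multiplicative, and bandwidth is concave
# (the path is only as wide as its narrowest link).
delays_ms = [10.0, 25.0, 15.0]
success_prob = [0.99, 0.98, 0.995]
bandwidth_mbps = [800.0, 450.0, 950.0]

print(additive(delays_ms))         # total path delay
print(multiplicative(success_prob))
print(concave(bandwidth_mbps))     # bottleneck bandwidth
```

This is why delay accumulates hop by hop while bandwidth is decided by the bottleneck link alone.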

### **7. MPLS network metrics and membership functions**

For solving the QoS routing problem, we will use a fuzzy logic approach. Fuzzy logic has proved to be very effective in many applications, such as intelligent control and decision making. Fuzzy logic is based on a set of metrics which may or may not be connected with each other. Calculating the best route is not done using complex mathematical solutions, but is based on intuitive rules.

Fuzzy Logic Approach for QoS Routing Analysis 155

154 Fuzzy Logic – Algorithms, Techniques and Implementations

Fuzzy logic is applied to all the routes that are candidates for being chosen, and the path chosen in this way is the path with the better quality. In this chapter we will use the fuzzy logic controller for solving QoS routing problems, and the routing algorithm refers to fuzzy logic (a fuzzy routing algorithm). This algorithm is able to choose the path with the better transmission parameters. For solving such a problem using fuzzy logic, we first have to take into consideration some input parameters acting at the entrance of the FLC, which in our case must be the MPLS network metrics. These input variables can be crisp or fuzzy values. Since the main impairments of an MPLS network are losses, delay and bandwidth, these three metrics will be taken as the network parameters. They match the main factors which affect the choice of the best route for the transmission of multimedia services. Each network metric takes a value from 0 to 1. These metrics are:

### **a. Channel capacity (B)**

Channel capacity is one of the main MPLS network parameters. In this chapter, channel capacity is combined with linguistic data for selecting the optimal route from the source to the destination. In this particular case we took three membership functions that indicate the potential scale of the channel capacity: LOW, MEDIUM and HIGH. Channel capacity is represented by triangular membership functions. The triangular number *B* = (*b*1, *b*2, *b*3) is bounded on its left-hand side by the value *b*1 and on its right-hand side by the value *b*3. In this way, the decision maker can conclude that the channel capacity of a certain link cannot be smaller than *b*1 or greater than *b*3. Figure 3 shows the fuzzy set of linguistic data for channel capacity.

Fig. 3. Fuzzy Triangular number for channel capacity - B

It can be seen from Figure 3 that each value has an upper and a lower limit.

### **b. Delays (D)**

For most applications, especially real-time applications, the delay in transmitting information between two points is one of the most important parameters for meeting QoS requirements. For delay we set 3 membership functions in triangular form to show the potential scale of the delays: ACCEPTABLE, TOLERABLE and INTOLERABLE. Figure 4 represents the membership function for the delay; the membership value is greatest (= 1) at the peak of each triangle.

### **c. Losses (L)**

For losses we will use 3 membership functions. For two of them we will use a triangular form (ACCEPTABLE and TOLERABLE), while for the third we will use a rectangular form (INTOLERABLE).

Fig. 4. Fuzzy triangular number for delay-D

Table 1 gives details about the metrics at the input of the fuzzy system and the fuzzy sets corresponding to these inputs.

| **Input parameters for MPLS network** | **Fuzzy sets** |
|---|---|
| Channel capacity (bandwidth) | {LOW, MEDIUM, HIGH} - *Mbps* |
| Delays | {ACCEPTABLE, TOLERABLE, INTOLERABLE} - *ms* |
| Losses | {ACCEPTABLE, TOLERABLE, INTOLERABLE} - % |

Table 1. Input parameters and fuzzy sets

The mathematical relation between the 3 input parameters is given by the expression below:

$$f\left(p\right) = \frac{B\left(p\right)}{D\left(p\right) \times L\left(p\right)}\tag{4}$$

Where *p* is the calculated path, *B(p)* is the channel capacity, *D(p)* is the delay along the path, and *L(p)* is the probability of packet loss.

### **8. Limits of fuzzy sets for MPLS network parameters**

Packet switching networks are commonly used for the transmission of multimedia services, and this trend continues in MPLS networks. Real-time traffic such as voice and video is sensitive to delays and constitutes an important part of network traffic. Such traffic has stricter requirements for quality of service (QoS), especially with respect to the delay between two end points and packet losses. The table below shows the standard QoS requirements for multimedia services.

|  | **Maximum rate** | **Average rate** | **Probability of packet loss** |
|---|---|---|---|
| Voice | 32 KBits/sec | 11.2 KBits/sec | 0.05 |
| Voice | 11.6 MBits/sec | 3.85 MBits/sec | 10⁻⁵ |

Table 2. Bit rate for voice transmission
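The score of Eq. (4) can be used to rank candidate paths. A minimal sketch with hypothetical per-path values follows (higher bandwidth and lower delay and loss give a higher score):

```python
# Ranking candidate paths with the score of Eq. (4):
# f(p) = B(p) / (D(p) * L(p)). All paths and values are hypothetical.
def f(bandwidth_mbps, delay_ms, loss_prob):
    # Bandwidth in the numerator rewards capacity; delay and loss in the
    # denominator penalize slow or lossy paths.
    return bandwidth_mbps / (delay_ms * loss_prob)

candidates = {
    "p1": (800.0, 40.0, 0.02),   # (B, D, L)
    "p2": (500.0, 20.0, 0.01),
    "p3": (950.0, 120.0, 0.05),
}

best = max(candidates, key=lambda p: f(*candidates[p]))
print(best)  # p2: its low delay and loss outweigh p1's and p3's bandwidth
```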

According to ITU recommendations for delay, packet loss and channel capacity, we have defined the boundaries of the fuzzy sets.

| **Delays in one direction** | **Characterization of quality** |
|---|---|
| 0 to 150 *ms* | Acceptable for most applications |
| 150 to 400 *ms* | May impact some applications |
| Above 400 *ms* | Unacceptable for most applications |

Table 3. ITU recommendation for delays

### **Delays**

Delays up to 150 ms are acceptable. Delays between 150 and 400 ms are tolerable for special applications. Delays higher than 400 ms are intolerable.

### **Packet loss percentage**

Lower than 2% - acceptable. From 2–6% - tolerable. Higher than 6% - intolerable.

### **Channel capacity:**

Low: from 0 *Mbps* to 200 *Mbps.* Medium: from 180 *Mbps* to 500 *Mbps.* High: from 470 *Mbps* to 1000 *Mbps.*

### **9. Fuzzy logic toolbox in Matlab software**

The fuzzy logic toolbox in Matlab is used for solving different problems dealing with fuzzy logic. Fuzzy logic is a very valuable tool here because it works well for problems that have high importance but do not require high precision – something that human beings have been doing for a long time. The toolbox allows users to do important jobs, but most importantly it allows users to create fuzzy conclusions (fuzzy inference). It is also possible to use the toolbox through the command line, but in general it is easier to build a system through the GUI. There are five primary GUI tools for building, editing and reviewing systems in the fuzzy logic toolbox:

1. FIS (Fuzzy Inference System) editor
2. Membership function editor
3. Rules editor
4. Rule Viewer
5. Surface viewer

The interactions of these tools can be seen in the figure below (Figure 5). The Rule Viewer and Surface Viewer are used for inspection and, in contrast with the FIS editor, are read-only tools. These GUIs are dynamically connected with each other, and changes made to the FIS can be seen in the other open GUIs.

Fig. 5. Fuzzy system and its integral components in MATLAB software

Based on the above limits for the fuzzy sets, we can use Matlab to create the membership functions for the main parameters of the MPLS network. The membership functions for channel capacity, delays, losses and the output of the fuzzy system are defined as follows (the values are taken for real-time applications):

```
[System]
Name='fuzzy_link'
Type='mamdani'
Version=2.0
NumInputs=3
NumOutputs=1
NumRules=4
AndMethod='min'
OrMethod='max'
ImpMethod='min'
AggMethod='max'
DefuzzMethod='Centre of Gravity'

[Input1]
Name='bandwidth'
Range=[0 1000]
NumMFs=3
MF1='low':'trimf',[-400 0 400]
MF2='medium':'trimf',[100 500 900]
MF3='high':'trimf',[600 1000 1400]

[Input2]
Name='delays'
Range=[0 600]
NumMFs=3
MF1='acceptable':'trimf',[-240 0 240]
MF2='tolerable':'trimf',[60 300 540]
MF3='highintolerable':'trimf',[360 600 840]

[Input3]
Name='losses'
Range=[0 10]
NumMFs=3
MF1='acceptable':'trimf',[-4 0 4]
MF2='tolerable':'trimf',[1 5 9]
MF3='intolerable':'trimf',[6 10 14]

[Output1]
Name='link optimization'
Range=[0 100]
NumMFs=3
MF1='low':'trimf',[-40 0 40]
MF2='medium':'trimf',[10 50 90]
MF3='high':'trimf',[60 100 140]

[Rules]
1 1 1, 1 (1) : 1
2 1 1, 2 (1) : 1
3 1 1, 3 (1) : 1
3 2 3, 1 (1) : 1
```

Fig. 6. a) Membership function of channel capacity (bandwidth, in *Mbps*), b) delays (in *ms*), c) packet loss percentage and d) output of the fuzzy system
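Each `'trimf'` entry in the FIS file is an ordinary triangular membership function. The following sketch (illustrative Python, not the chapter's MATLAB) shows how a crisp bandwidth value is fuzzified with the parameter triples listed for the `bandwidth` input:

```python
# Triangular membership function, equivalent to MATLAB's trimf(x,[a b c]):
# rises linearly from a to the peak at b, then falls linearly to c.
def trimf(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Parameter triples taken from the [Input1] section of the FIS file.
bw_mfs = {
    "low":    (-400.0, 0.0, 400.0),
    "medium": (100.0, 500.0, 900.0),
    "high":   (600.0, 1000.0, 1400.0),
}

# Fuzzify a hypothetical 946 Mbps link: it is "high" to degree
# (946 - 600) / (1000 - 600) = 0.865 and belongs to no other set.
degrees = {name: trimf(946.0, *p) for name, p in bw_mfs.items()}
print(degrees)
```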


This simple program creates the membership functions for the abovementioned parameters; they are shown in Figure 6 (a, b, c and d).

Once the variables and membership functions are assigned, fuzzy rules can be written for the corresponding variables.

### **Some of the fuzzy rules derived from Rule editor (Matlab) are listed as below:**

*Rule 1: If (bandwidth is low) and (delay is acceptable) and (loss is acceptable) then (link optimization is Low)* 

*Rule 2: If (bandwidth is Medium) and (delay is acceptable) and (loss is acceptable) then (link optimization is Medium)* 

*Rule 3: If (bandwidth is High) and (delay is Acceptable) and (loss is Acceptable) then (link optimization is High)* 

*Rule 4: If (bandwidth is Low) and (delay is Tolerable) and (loss is Acceptable) then (link optimization is Low)* 

*Rule 5: If (bandwidth is Medium) and (delay is Tolerable) and (loss is Tolerable) then (link optimization is Medium)* 

*Rule 6: If (bandwidth is High) and (delay is Tolerable) and (loss is Intolerable) then (link optimization is Medium)* 

*Rule 7: If (bandwidth is Low) and (delay is Intolerable) and (loss is Acceptable) then (link optimization is Low)* 

*Rule 8: If (bandwidth is Medium) and (delay is Intolerable) and (loss is Tolerable) then (link optimization is Low)* 

*Rule 9: If (bandwidth is High) and (delay is Intolerable) and (loss is Intolerable) then (link optimization is Low)* 
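Rules of this form are evaluated in the usual Mamdani way: a rule's firing strength is the min (AND) of its antecedent membership degrees, and rules sharing a consequent are aggregated with max. A minimal sketch (illustrative Python with a subset of the rules and assumed membership degrees):

```python
# A subset of the rule base above, as data:
# (bandwidth term, delay term, loss term) -> link-optimization term.
RULES = [
    (("high", "acceptable", "acceptable"), "high"),    # Rule 3
    (("medium", "tolerable", "tolerable"), "medium"),  # Rule 5
    (("low", "intolerable", "acceptable"), "low"),     # Rule 7
]

def fire(rules, bw, delay, loss):
    # bw/delay/loss map linguistic terms to membership degrees in [0, 1].
    out = {}
    for (b, d, l), target in rules:
        # AND of the antecedents via min (Mamdani AndMethod='min').
        strength = min(bw.get(b, 0.0), delay.get(d, 0.0), loss.get(l, 0.0))
        # Aggregate rules with the same consequent via max (AggMethod='max').
        out[target] = max(out.get(target, 0.0), strength)
    return out

# Assumed fuzzified inputs for one hypothetical link:
bw = {"high": 0.87, "medium": 0.1}
delay = {"acceptable": 0.9, "tolerable": 0.2}
loss = {"acceptable": 0.83, "tolerable": 0.3}

print(fire(RULES, bw, delay, loss))
```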

### **10. Fuzzy relations for QoS routing analysis**

Here we will illustrate the relation between the fuzzy input values and the required output. The figure below shows the structure of the proposed solution: a fuzzy controller comprising inputs, fuzzy rules and outputs. The parameters acting on the input of this controller are channel capacity (bandwidth), delays and losses.

Three input parameters are noted as:

$$\mu_{A}\left(B\right)\tag{5}$$

Where *A* ∈ {*Low*, *Medium*, *High*}

$$\mu_{B}\left(D\right)\tag{6}$$

Where *B* ∈ {*Acceptable*, *Tolerable*, *Intolerable*}


$$\mu_{C}\left(L\right)\tag{7}$$

Where *C* ∈ {*Acceptable*, *Tolerable*, *Intolerable*}

Here, $\mu_{A}(B)$, $\mu_{B}(D)$ and $\mu_{C}(L)$ are the membership functions for channel capacity, delays and losses.

Fig. 7. The structure of the fuzzy controller system for MPLS network analysis

The role of the linguistic rules is to connect these input parameters with the output of the fuzzy system, which in our case is link optimization. The output comprises three membership functions: LOW, MEDIUM and HIGH. Each rule determines one fuzzy relation; in our case, each rule represents the relation between the 3 input parameters and the required output, which is written as:

$$\mu_{O}\left(L\right)$$

Where *O* ∈ {*Low*, *Medium*, *High*} and $\mu_{O}(L)$ is the membership function for the output of the fuzzy system.

The fuzzy controller considered here is of the Mamdani type and consists of a fuzzifier, fuzzy inference, linguistic rules and a defuzzifier. This fuzzy controller is shown in Figure 7.

### **11. Defuzzification process**

It is well known that in some cases the output of a fuzzy process needs to be a single scalar value. Defuzzification is the process of converting such a fuzzy quantity into a precise value. The output of a fuzzy process can be the union of two or more fuzzy membership functions. To see this better, consider an example: suppose a fuzzy output comprises two triangular membership shapes. The union of these two membership functions is formed with the max operator, which graphically looks like the figure below.

Many methods have been proposed for defuzzification. We will briefly explain each of these methods and analyze which one is the best for obtaining more accurate output values.
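The max-operator union described above can be sketched numerically (hypothetical triangular shapes over a sampled universe, illustrative Python only):

```python
# Union of two triangular output shapes via the max operator,
# over a discretized universe of discourse 0..100.
def trimf(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

xs = [i / 10.0 for i in range(0, 1001)]              # 0.0, 0.1, ..., 100.0
shape1 = [trimf(x, 10.0, 50.0, 90.0) for x in xs]    # e.g. "medium"
shape2 = [trimf(x, 60.0, 100.0, 140.0) for x in xs]  # e.g. "high"

# Pointwise max: the aggregated output membership function.
union = [max(m1, m2) for m1, m2 in zip(shape1, shape2)]
print(max(union))
```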


Fig. 8. a) Triangular membership shape, b) triangular membership shape, c) the union of the two memberships (a and b)

### **12. Selection of defuzzification method for finding crisp value for link optimization**

In finding the appropriate path for the transmission of multimedia services, an important role is played by the selection of the defuzzification method. Using the fuzzy logic technique, the "not accurate" data are represented by linguistic values which depend on user preferences.

There are 5 defuzzification methods: Centre of Gravity (COG), bisectorial, LOM (largest of maximum), MOM (middle of maximum) and SOM (smallest of maximum). The three most important methods are COG, MOM and LOM. It is important to find which method gives better results in terms of link optimization in the MPLS network. To see which method is most suitable for defuzzification, we will first briefly explain each of the abovementioned methods.

### **12.1 Centre of gravity**

This method determines the centre of the zone that is obtained from the membership functions with the AND and OR logic operators. The formula with which we can calculate the defuzzified crisp output *U* is:


$$U = \frac{\int_{Min}^{Max} u\,\mu\left(u\right)\,du}{\int_{Min}^{Max} \mu\left(u\right)\,du}\tag{8}$$

Where *U* is the defuzzification result, *u* is the output variable, *μ(u)* is the membership function, *Min* is the minimum limit for defuzzification and *Max* is the maximum limit for defuzzification.

With formula (8) we can calculate the surface of the zone shown in the figure below, and we can also find the central point of this zone. Projecting this point onto the abscissa axis determines the crisp value after defuzzification.

Fig. 9. COG method
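In practice Eq. (8) is evaluated over a sampled universe; a discrete centre-of-gravity sketch (illustrative Python, with a hypothetical symmetric output shape):

```python
# Discrete centre of gravity, the sampled analogue of Eq. (8):
# U = sum(u * mu(u)) / sum(mu(u)) over the output universe.
def cog(xs, mu):
    num = sum(x * m for x, m in zip(xs, mu))
    den = sum(mu)
    return num / den if den else None

xs = [i / 10.0 for i in range(0, 1001)]  # universe 0..100
# A symmetric triangle peaking at 50: its COG lands at the centre.
mu = [max(0.0, 1.0 - abs(x - 50.0) / 40.0) for x in xs]

print(cog(xs, mu))
```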

### **12.2 Bisectorial method**

This method divides a given zone into two equal regions by a vertical line, as can be seen in the figure below (Figure 10).

### **12.3 Middle, smallest and largest of maximum methods**

In some cases the MOM and LOM methods are better than the COG method, but in general, for most cases, no matter what zone we have, the COG method shows better results. In this chapter we will analyze which method is better using MATLAB in 3D. The LOM method determines the largest of the maximum values in the zone obtained from the membership functions with the AND and OR logic operators, whereas the MOM method determines the middle of the maximum values in that zone.
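The three maximum-based methods can be sketched on a sampled membership function with a flat maximum (illustrative Python; the trapezoidal plateau between −8 and −2 is assumed, chosen to resemble the MATLAB example later in this section):

```python
# Smallest/middle/largest of maximum on a sampled membership function.
def max_methods(xs, mu):
    peak = max(mu)
    arg = [x for x, m in zip(xs, mu) if m == peak]  # the maximizing plateau
    som = min(arg)               # smallest position attaining the maximum
    lom = max(arg)               # largest position attaining the maximum
    mom = (som + lom) / 2.0      # middle of the maximum
    return som, mom, lom

xs = [i / 10.0 for i in range(-100, 101)]   # universe -10..10

def trap(x):                     # trapezoid flat at 0.9 between -8 and -2
    if x < -10.0 or x > 2.0:
        return 0.0
    if x < -8.0:
        return 0.9 * (x + 10.0) / 2.0
    if x <= -2.0:
        return 0.9
    return 0.9 * (2.0 - x) / 4.0

mu = [trap(x) for x in xs]
print(max_methods(xs, mu))   # (-8.0, -5.0, -2.0)
```

With the plain "position on the axis" convention used here, som = −8.0 and lom = −2.0 (and mom = −5.0, matching the MATLAB output quoted below); the MATLAB output reports som = −2 and lom = −8, so the toolbox evidently applies a different convention for which maximum counts as "smallest".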


Fig. 10. Bisectorial Method

These three methods have to do with the maximum value of the aggregated membership functions. In this example, since the graph has a flat curve at the maximum point, the three methods give different values. When there is a single maximum point, the three methods give the same value.

Finding the defuzzification values using the above-mentioned methods can be done as below:

```
% Aggregated output membership function (assumed setup, following the
% Fuzzy Logic Toolbox defuzzification demo; x is the output universe):
x = -10:0.1:10;
mf1 = 0.9*trapmf(x, [-10 -8 -2 2]);   % flat maximum of 0.9 on [-8, -2]
plot(x, mf1, 'LineWidth', 2); hold on;
gray = 0.7*[1 1 1];
% h2/t2: handles of a previously drawn marker, stubbed here so that the
% set(...) call below has a valid target.
h2 = line([0 0], [-0.5 1.2], 'Color', 'k');
t2 = text(0, -0.5, ' previous', 'FontWeight', 'bold');

x3 = defuzz(x, mf1, 'mom')
x4 = defuzz(x, mf1, 'som')
x5 = defuzz(x, mf1, 'lom')
set([h2 t2], 'Color', gray)
h3 = line([x3 x3], [-0.7 1.2], 'Color', 'k');
t3 = text(x3, -0.7, ' MOM', 'FontWeight', 'bold');
h4 = line([x4 x4], [-0.8 1.2], 'Color', 'k');
t4 = text(x4, -0.8, ' SOM', 'FontWeight', 'bold');
h5 = line([x5 x5], [-0.6 1.2], 'Color', 'k');
t5 = text(x5, -0.6, ' LOM', 'FontWeight', 'bold');
% Results printed in the chapter:
% x3 = -5
% x4 = -2
% x5 = -8
```
These values are represented graphically like in figure 11.

The methods most used for defuzzification are COG, MOM and LOM. Although in some cases the MOM and LOM methods give very favorable values, the COG method gives better results whatever the case being analyzed. The performance comparison of the three methods can be better seen through examples with the surface viewer in 3D.


Fig. 11. LOM, MOM and SOM defuzzification methods

### **13. Analysis and examples for different defuzzification methods**

Here we will use some examples with different rules derived from the rule editor of Matlab's Fuzzy Logic Toolbox. We will also analyze which defuzzification method is better and which of the rules gives a better result in terms of link utilization.

If we use rule number 3 (derived from rule editor):

*If (bandwidth is High) AND (delay is Acceptable) AND (loss is Acceptable) THEN (link optimization is High)* 

Fig. 12. Rule viewer when the COG method is used

Whereas, if we take the values for channel capacity (bandwidth) as "high" (946 Mbps), delay as 25.3 ms and loss as 0.663 %, then the link optimization for the MPLS network will be 86.4 %. This can be shown graphically using the rule viewer, from which it is clearly seen that link optimization is high (according to the determination of link optimization using membership functions). This value is obtained using the COG method. Based on the results obtained here, we conclude that this method is very effective.

The 86.4 % value represents the value after defuzzification. The 3D surface viewer for the three main defuzzification methods is shown in the figure below.

Fig. 13. Surface viewer for 3 defuzzification methods: a) COG, b) MOM and c) LOM

If we use rule number 9 (derived from rule editor):

*If (bandwidth is High) AND (delay is Intolerable) AND (loss is Intolerable) THEN (link optimization is Low)* 

Then, link optimization will be 20 % (see Figure 14). As we can see, link optimization is very low in this case, so we can conclude that when we have intolerable delays and intolerable losses, we will not have high QoS, because multimedia applications are very sensitive to delays and losses.
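For illustration, the firing strength of such a rule can be computed with the standard Mamdani min-AND. The sketch below uses hypothetical triangular membership functions (the chapter's actual functions live in the Matlab FIS and are not reproduced here); only the mechanics of rule evaluation are shown:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical term definitions (assumed ranges, not the chapter's):
def bandwidth_high(bw_mbps):   return tri(bw_mbps, 500, 1000, 1500)
def delay_intolerable(d_ms):   return tri(d_ms, 300, 600, 900)
def loss_intolerable(l_pct):   return tri(l_pct, 2.0, 5.0, 8.0)

def rule9_strength(bw, d, l):
    # Mamdani AND: the rule fires with the minimum antecedent degree.
    return min(bandwidth_high(bw), delay_intolerable(d), loss_intolerable(l))

strength = rule9_strength(946, 600, 5.0)
# strength = min(0.892, 1.0, 1.0) = 0.892: the rule fires strongly, so its
# consequent "link optimization is Low" dominates the aggregated output.
```

A complete FLC would aggregate the clipped consequents of all fired rules and defuzzify the result, exactly as the rule viewer does.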


Fig. 14. Rule viewer when the COG method is used for rule 9

Meanwhile, the 3D surface viewer is the same for the three main defuzzification methods and is depicted in the figure below. From this graph, it is clearly seen that we have high bandwidth, but delays and the percentage of packet losses are also high, resulting in low link optimization.

In this case we have very low link usage, and this is the reason why the surface viewer is approximately the same for the three defuzzification methods (low link optimization).

Fig. 15. Surface viewer for rule 9, for defuzzification methods: COG, MOM and LOM

If we use rule number 5 (derived from the rule editor):

*If (bandwidth is Medium) and (delay is Tolerable) and (loss is Tolerable) then (link optimization is Medium)*

Fig. 16. Rule viewer when the COG method is used for rule 5

In this case, the value for bandwidth is 548 Mbps, delay = 437 ms (tolerable) and losses are tolerable (2.83 %). The analysis shows that the link optimization is medium; this means that with the parameters above we can transmit almost every multimedia service.

Meanwhile, the 3D surface viewer for the three main defuzzification methods will look like below:


Fig. 17. Surface viewer for rule 5 when using: a) COG, b) MOM and c) LOM

From the above examples it can be clearly seen that the COG defuzzification method gives the better result in terms of link utilization. This can be seen with the surface viewer and the rule viewer, as presented in the above examples using different rules.

### **14. Conclusion**

In the past, the routing problem in communication networks was relatively simple. Applications used a modest percentage of bandwidth, and none of them had QoS requirements. However, the existing routing protocols should be improved or replaced with algorithms that meet different QoS requirements. Thus, it is necessary to present an architecture that supports new services in the Internet and guarantees QoS for multimedia applications.

In this chapter we introduced MPLS technology as a solution to many of the current problems faced by the Internet. With its wide support for QoS and its traffic engineering capability, MPLS is establishing itself as a standard for next-generation networks. In an MPLS network some problems can occur during multimedia service transmission, so it is a good idea to design control mechanisms for solving such problems. Because of the complex nature of control mechanisms, in this chapter we used intelligent control techniques, one of which is the Fuzzy Logic Controller (FLC), a technique based on fuzzy logic. We have shown that the fuzzy logic approach is suitable for QoS routing analysis in an MPLS network. We used the main metrics of the MPLS network as inputs of the FLC and found the most appropriate defuzzification method for obtaining better crisp values in terms of link utilization in the MPLS network. We have also shown that the most important part of the FLC is the defuzzifier, which converts fuzzy values into crisp values. We explained the methods used most for defuzzification: COG, MOM and LOM. The performance comparison of the three methods was described through examples with the 3D surface viewer. Through these analyses using Matlab's toolbox, we have shown that the COG method gives better results in all analyzed cases.

## **Term Weighting for Information Retrieval Using Fuzzy Logic**

Jorge Ropero, Ariel Gómez, Alejandro Carrasco, Carlos León and Joaquín Luque *Department of Electronic Technology, University of Seville, Spain* 

### **1. Introduction**


The rising quantity of available information has constituted an enormous advance in our daily life. At the same time, however, some problems emerge from the difficulty of distinguishing the necessary information among the high quantity of unnecessary data. Information Retrieval has become a capital task for retrieving useful information. At first it was mainly used for document retrieval, but lately its use has been generalized to the retrieval of any kind of information, such as the information contained in a database, a web page, or any set of accumulated knowledge. In particular, the so-called Vector Space Model is widely used. The Vector Space Model is based on the use of index terms, which represent pieces of knowledge, or Objects. Index terms have associated weights, which represent their importance in the considered set of knowledge.

It is important that the assignment of weights to every index term - called Term Weighting - is automatic. The so-called TF-IDF method is mainly used for determining the weight of a term (Lee et al., 1997). Term Frequency (TF) is the frequency of occurrence of a term in a document, and Inverse Document Frequency (IDF) varies inversely with the number of documents to which the term is assigned (Salton, 1988). Although the TF-IDF method for Term Weighting has worked reasonably well for Information Retrieval and has been a starting point for more recent algorithms, aspects of index terms other than TF and IDF may also be important for determining term weights. First of all, we should consider the degree of identification of an object if only the considered index term is used. This parameter has a strong influence on the final value of a term weight when the degree of identification is high: the more an index term identifies an object, the higher the value of the corresponding term weight. Secondly, we should also consider the existence of join terms.

These aspects are especially important when the information is abundant, imprecise, vague and heterogeneous. In this chapter, we define a new Term Weighting model based on Fuzzy Logic. This model tries to replace the traditional Term Weighting method, TF-IDF. In order to show the efficiency of the new method, the Fuzzy Logic-based method has been tested on the website of the University of Seville. Web pages are usually a perfect example of heterogeneous and disordered information. We demonstrate the improvement introduced by the new method when extracting the required information. Besides, it is also possible to extract related information, which may be of interest to the users.

### **2. Vector Space Model and Term Weighting**

In the Vector Space Model, the contents of a document are represented by a multidimensional space vector. Later, the proper classes of the given vector are determined by comparing the distances between vectors. The procedure of the Vector Space Model can be divided into three stages, as seen in Figure 1 (Raghavan & Wong, 1986):


Fig. 1. Vector Space Model procedure

In this chapter we focus on the second stage. The idea of text retrieval came up in the late 1950s - a concept that was later extended to general information retrieval. Text retrieval was founded on an automatic search based on textual content through a series of identifiers. It was Gerard Salton who, during the 1970s and 1980s, laid the foundations for linking these identifiers and the texts they represent. Salton suggested that every document could be represented by a term vector of the form D = (ti, tj,…, tp), where every tk identifies a term assigned to document D. A formal representation of the vector D leads us not only to consider the terms in the vector, but to add a set of weights representing each term's weight, that is to say, its importance in the document.
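The third stage of the procedure, comparing the distances between vectors, is commonly done with cosine similarity. A minimal sketch (illustrative only; the term weights below are assumed values, not computed from a real collection):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

d1 = {"library": 0.8, "book": 0.6}   # hypothetical weighted document vectors
d2 = {"library": 0.8, "card": 0.6}
# Only "library" is shared, so cosine(d1, d2) = 0.64 / (1.0 * 1.0) ≈ 0.64
```

Documents whose weighted vectors point in similar directions are assigned to the same class, which is why the quality of the weights in stage two is decisive.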

A Term Weighting system should improve efficiency on two main factors, recall and precision. Recall takes into account that the objects relevant to the user should be retrieved; precision, that the objects not wanted by the user should be rejected. In principle, it is desirable to build a system that rewards both high recall - retrieving all that is relevant - and high precision - discarding all unwanted objects (Ruiz & Srinisavan, 1998). Recall improves by using high-frequency index terms, i.e. terms which occur in many documents of the collection; this way, many documents including such terms - and thus many of the relevant documents - are expected to be retrieved. Precision, however, improves when using more specific index terms that are capable of isolating the few relevant items from the mass of irrelevant ones. In practice, compromises are made, using terms frequent enough to achieve a reasonable level of recall without causing too low a value of precision. The exact definitions of recall and precision are shown in Equations 1 and 2.

$$\text{Recall} = \frac{\text{retrieved relevant objects}}{\text{total number of relevant objects}}$$

Equation 1. Definition of recall

$$\text{Precision} = \frac{\text{retrieved relevant objects}}{\text{total number of retrieved objects}}$$

Equation 2. Definition of precision
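As a concrete check of the two definitions, a small Python helper (illustrative; not part of the chapter):

```python
def recall_precision(retrieved, relevant):
    """Recall and precision of a retrieval run, per Equations 1 and 2."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # retrieved relevant objects
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Example: 3 of the 4 relevant objects are retrieved among 6 results.
r, p = recall_precision(retrieved=["d1", "d2", "d3", "d5", "d7", "d9"],
                        relevant=["d1", "d2", "d3", "d4"])
# r = 3/4 = 0.75, p = 3/6 = 0.5
```

The example makes the trade-off visible: adding more results can only raise recall, while it tends to lower precision.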


So, firstly, terms that are mentioned frequently in individual documents, or in extracts from a document, appear to be useful for improving recall. This suggests the use of a factor known as Term Frequency (TF) as part of a Term Weighting system, measuring the frequency of occurrence of a term in a document. The TF factor has been used for Term Weighting for years in automatic indexing environments. Secondly, the TF factor alone does not ensure acceptable retrieval: when the high-frequency terms are not concentrated in specific documents but are frequent in the entire collection, all documents tend to be retrieved, and this affects precision. Thus, there is a need to introduce a new factor that favours terms concentrated in only a few documents of the collection. The Inverse Document Frequency (IDF) is the factor that considers this aspect. The IDF factor is inversely proportional to the number of documents (n) to which a term is assigned in a collection of N documents; a typical IDF factor is log (N / n) (Salton & Buckley, 1996). So the best index terms to identify the contents of a document are those able to distinguish certain individual documents from the rest of the set. This implies that the best terms should have high term frequencies but low overall collection frequencies. A reasonable measure of the importance of a term can therefore be obtained by the product of term frequency and inverse document frequency (TF x IDF). It is usual to describe the weight of a term *i* in a document *j* as shown in Equation 3.

$$w_{ij} = \mathrm{tf}_{ij} \times \mathrm{idf}_{i}$$

Equation 3. Calculation of term weights; general formula

This formula was originally designed for the retrieval and extraction of documents. Eventually, it has also been used for the retrieval of any object in any set of accumulated knowledge, and has been revised and improved by other authors in order to obtain better results in Information Retrieval (Lee et al., 1997), (Zhao & Karypis, 2002), (Lertnattee & Theeramunkong, 2003), (Liu & Ke, 2007).
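Equation 3 is straightforward to implement. A minimal sketch (illustrative; it uses the unshifted idf = log(N/n), whereas the chapter's working formula with the 0.01 shift appears in Equation 4):

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """w_ij = tf_ij * idf_i, with idf_i = log(N / n_i), per Equation 3."""
    N = len(docs)
    # n_i: number of documents to which term i is assigned
    df = Counter(t for doc in docs for t in set(doc))
    return [{t: tf * math.log(N / df[t]) for t, tf in Counter(doc).items()}
            for doc in docs]

docs = [["library", "book", "book"],
        ["library", "card"],
        ["sports", "card"]]
w = tfidf_weights(docs)
# "book" appears twice in the first document and nowhere else, so
# w[0]["book"] = 2 * log(3/1); "library" occurs in 2 of the 3 documents,
# so its idf - and therefore its weight - is lower.
```

A term present in every document gets idf = log(1) = 0, which is exactly the "frequent in the entire collection" case the text flags as harmful to precision.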

In short, term weights must be related somehow to the importance of an index term in the corresponding set of knowledge. There are two options for defining these weights:

- assigning them manually for every index term, which is subjective and not able to be automated;
- calculating them automatically, in order to improve the retrieval relevant to the user.

To calculate the weight of a term, the TF-IDF approach considers two factors:

- the term frequency tfik, i.e. the frequency of occurrence of the term Tk in document i;
- the inverse document frequency, given by the expression log (N / nk + 0.01).

Introducing standardization to simplify the calculations, the formula finally obtained for the calculation of the weights is defined in Equation 4 (Liu et al., 2001).

$$W_{ik} = \frac{\mathrm{tf}_{ik} \times \log(N/n_k + 0.01)}{\sqrt{\sum_{k=1}^{m} \left[\mathrm{tf}_{ik} \times \log(N/n_k + 0.01)\right]^2}}$$

Equation 4. Calculation of term weights; formula used

A third factor that is commonly used is the document length normalization factor. Long documents usually have a much larger set of extracted terms than short documents. This fact makes it more likely that long documents are retrieved (Van Rijsbergen, 1979), (Salton & Buckley, 1996). The term weight obtained using a length normalization factor is given by Equation 5.

$$W\_{ik} = \frac{w\_{ik}}{\sqrt{\sum\_{i=1}^{m} (w\_i)^2}}$$

Equation 5. Calculation of term weights using a length normalization factor

In Equation 5, wi correspond to the weights of the other components of the vector.

All Term Weighting tasks are shown in Figure 2.

Fig. 2. Term Weighting tasks
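As a sketch of how Equations 4 and 5 combine term frequency, inverse document frequency and length normalization, the following Python fragment computes the normalized weights for one object. The function and the example figures are our own illustration, not data from the chapter; note that the logarithm base does not matter after normalization.

```python
import math

def tfidf_weights(tf, N, n):
    """Normalized TF-IDF weights as in Equation 4.

    tf -- term frequencies tf_ik of each index term k in object i
    N  -- total number of objects in the set of knowledge
    n  -- n_k, the number of objects in which term k appears
    """
    raw = [tf_k * math.log(N / n_k + 0.01) for tf_k, n_k in zip(tf, n)]
    # Denominator of Equation 4: the Euclidean norm of the raw weights,
    # which also plays the role of the length normalization of Equation 5.
    norm = math.sqrt(sum(w * w for w in raw))
    return [w / norm if norm else 0.0 for w in raw]

# Hypothetical frequencies for three index terms in a 253-object collection.
print([round(w, 3) for w in tfidf_weights(tf=[3, 1, 2], N=253, n=[16, 120, 5])])
```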

### **3. Term Weighting method comparison**

### **3.1 Term Weighting methods**

176 Fuzzy Logic – Algorithms, Techniques and Implementations



The TF-IDF method works reasonably well, but has the disadvantage of not considering two aspects that we believe are key:

- the degree of identification of the object when only the considered index term is used;
- the existence of join terms associated with different objects.


This chapter describes, firstly, the operation of the TF-IDF method. Then, the new Term Weighting Fuzzy Logic-based method is introduced. Finally, both methods are implemented for the particular case of Information Retrieval for the University of Seville web portal, obtaining specific results of the operation of both of them. A web portal is a typical example of a disordered, vague and heterogeneous set of knowledge. With this aim, an intelligent agent was designed to allow an efficient retrieval of the relevant information. This system should be valid for any set of knowledge. The system was designed to enable users to find possible answers to their queries in a set of knowledge of a great size. The whole set of knowledge was classified into different objects. These objects represent the possible answers to user queries and were organized into hierarchical groups (called Topic, Section and Object). One or more standard questions are assigned to every object and some index terms are extracted from them.
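The Topic/Section/Object hierarchy described above can be sketched as a simple nested structure. This is a hypothetical illustration (the class and variable names are ours); the standard question and index terms are taken from the chapter's own example.

```python
# Hypothetical sketch of the hierarchical organization of the set of
# knowledge: Topics contain Sections, Sections contain Objects, and each
# Object carries its standard questions and extracted index terms.
from dataclasses import dataclass, field

@dataclass
class Obj:
    url: str
    standard_questions: list = field(default_factory=list)
    index_terms: list = field(default_factory=list)

knowledge = {
    "Topic 6: Library": {
        "Section 3: Online services": [
            Obj("http://bib.us.es/index-ides-idweb.html",
                ["What online services are offered by the Library "
                 "of the University of Seville?"],
                ["Library", "services", "online"]),
        ],
    },
}

print(knowledge["Topic 6: Library"]["Section 3: Online services"][0].index_terms)
```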

The last step is Term Weighting; the assigned weight depends on the importance of an index term for the identification of the object. The way in which these weights are assigned is the main issue of this chapter. The whole process is shown in Figure 3.

As an example of how the classical TF-IDF Term Weighting method works, we use the term 'library' from the example shown in Table 1.

Term Weighting for Information Retrieval Using Fuzzy Logic 179

Fig. 3. Information Retrieval process.

As well, an example of the followed methodology is shown in Table 1.

| STEP | EXAMPLE |
|------|---------|
| Step 1: Web page identified by standard question(s) | http://bib.us.es/index-ides-idweb.html: *What online services are offered by the Library of the University of Seville?* |
| Step 2: Locate the standard question(s) in the hierarchical structure | Topic 6: Library; Section 3: Online services; Object 1 |
| Step 3: Extract index terms | Index terms: 'Library', 'services', 'online' |
| Step 4: Term weighting | Explained below |

Table 1. Example of the followed methodology.

At Topic hierarchic level: 'library' appears often in Topic 6 - TF factor - and seldom in other Topics - IDF factor -.

At Section hierarchic level: 'library' is not very relevant to distinguish the desired Section inside the Topic.

At Object hierarchic level: 'library' appears only once in an Object.

Consequently, 'Library' is relevant to find out that the Object is in Topic 6, but not very relevant to find out the definite Object, which should be found according to other terms in a user consultation.

As said above, TF-IDF has the disadvantage of not considering the degree of identification of the object when only the considered index term is used, nor the existence of join terms. The FL-based method provides a solution for these problems: the solution is to create a table of all the index terms and their corresponding weights for each object. This table will be created in the process of extracting the index words from the standard questions. Imprecision practically does not affect the method, due to the fact that Term Weighting is based on fuzzy logic. This fact minimizes the effect of possible variations of the assigned weights.

Furthermore, the Fuzzy Logic-based method provides two important advantages:


For example, in the case of a website, the web page developer himself may define the standard questions. These questions are associated with the object - the web page -. He should also define the index terms for each object and answer the two questions proposed above. This greatly simplifies the process and leaves open the possibility of using collaborative intelligence.

The Fuzzy Logic-based Term Weighting method is defined below. Four questions must be answered to determine the weight of an Index Term:

- Q1: How often does the index term appear in other subsets of knowledge?
- Q2: How often does the index term appear within the considered subset?
- Q3: Does the index term define the object by itself?
- Q4: Are there any join terms tied to the considered index term?



With the answers to these questions, a set of values is obtained. These values are the inputs to a fuzzy logic system, a Term Weight Generator. The Fuzzy Logic system output sets the weight of an index term for each hierarchical level (Figure 4).

Fig. 4. Term Weighting using Fuzzy Logic.

Next, it is described how to define the system input values associated with each of the four questions (Qi). The Qi are the inputs to the Fuzzy Logic system.

### *Question 1*

Term weight is partly associated to the question 'How often does an index term appear in other subsets?'. It is given by a value between 0 – if it appears many times – and 1 - if it does not appear in any other subset -. To define weights, we are considering the times that the most used terms in the whole set of knowledge appear. The list of the most used index terms is shown in Table 2.


| Number of order | Index term | Number of appearances in the accumulated set of knowledge |
|---|---|---|
| 1 | Service | 31 |
| 2 | Services | 18 |
| 3 | Library | 16 |
| 4 | Research | 15 |
| 5 | Address | 14 |
| | Student | 14 |
| 7 | Mail | 13 |
| | Access | 13 |
| 9 | Electronic | 12 |
| | Computer | 12 |
| | Resources | 12 |
| 12 | Center | 10 |
| | Education | 10 |
| | Registration | 10 |
| | Program | 10 |

Table 2. List of the most used words.


Provided that there are 1114 index terms defined in our case, we think that 1 % of these words must mark the border for the value 0 (11 words). Therefore, whenever an index term appears more than 12 times in other subsets, we will give it the value of 0. Associated values for every Topic are defined in Table 3.


| Number of appearances | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | ≥ 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Associated value | 1 | 0.9 | 0.8 | 0.7 | 0.64 | 0.59 | 0.53 | 0.47 | 0.41 | 0.36 | 0.3 | 0.2 | 0.1 | 0 |

Table 3. Input values associated to Q1 for topic hierarchic level.

Between 0 and 3 appearances - approximately a third of the possible values -, we consider that an index term belongs to the so-called HIGH set. Therefore, it is defined in its corresponding fuzzy set with uniformly distributed values between 0.7 and 1, as may be seen in Figure 5. Analogously, we distribute all values uniformly according to the different fuzzy sets. The fuzzy sets are defined by the linguistic variables LOW, MEDIUM and HIGH. The fuzzy sets are triangular, on the one hand for simplicity and on the other hand because we tested other more complex types of sets (Gaussian, Pi type, etc.), but the results did not improve at all.

Fig. 5. Input fuzzy sets.
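Triangular input sets like those in Figure 5 are straightforward to compute. The sketch below is our own approximation: the chapter only fixes HIGH as roughly covering the range 0.7 to 1, so the exact breakpoints here are assumptions.

```python
def triangular(x, a, b, c):
    """Triangular membership: rises from a to the peak at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical LOW / MEDIUM / HIGH sets over the [0, 1] input range.
# The tiny offsets give LOW and HIGH full membership at the endpoints.
def memberships(x):
    return {
        "LOW": triangular(x, -0.001, 0.0, 0.45),
        "MEDIUM": triangular(x, 0.15, 0.5, 0.85),
        "HIGH": triangular(x, 0.55, 1.0, 1.001),
    }

print(memberships(0.8))
```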

On the other hand, given that a different term weight is defined at each hierarchical level, it is necessary to consider other scales to calculate the fuzzy system input values for the other hierarchical levels. As the Topic level took the whole set of knowledge as its reference, for the Section level we consider the number of occurrences of an index term in a given topic. Keeping in mind that all topics are considered, we take as reference the


value of the topic in which the index term appears more often. The process is analogous to the above described, obtaining the values shown in Table 4.


| Number of appearances | 0 | 1 | 2 | 3 | 4 | 5 | ≥ 6 |
|---|---|---|---|---|---|---|---|
| Associated value | 1 | 0.7 | 0.6 | 0.5 | 0.4 | 0.3 | 0 |

Table 4. Input values associated to Q1 for section hierarchic level.

To find the term weight associated with the object level, the method is slightly different. It is also based on the definition of fuzzy sets, but we do not take into account the maximum number of words per section; instead, the value associated to Q1 directly crosses the border between fuzzy sets when the number of objects in which the term appears increases by one unit, as seen in Table 5.


| Number of appearances | 0 | 1 | 2 | ≥ 3 |
|---|---|---|---|---|
| Associated value | 1 | 0.7 | 0.3 | 0 |

Table 5. Input values associated to Q1 for object hierarchic level.

### *Question 2*

To find the input value to the FL system associated with Question 2, the reasoning is analogous to the one for Q1. However, we only have to consider the frequency of occurrence of an index term within a single subset of knowledge, and not the frequency of occurrence in other subsets. Logically, the more times a term appears in a subset, the greater the probability that the query is related to it. Question Q2 corresponds to the TF factor.

Looking again at the list of index terms used in a topic, we obtain the values shown in Tables 6 and 7. It has been taken into account that the more times an index term appears in a topic or section, the greater should be the input value. These tables correspond to the values for the hierarchical levels of Topic and Section, respectively.


| Number of appearances | 1 | 2 | 3 | 4 | 5 | ≥ 6 |
|---|---|---|---|---|---|---|
| Associated value | 0 | 0.3 | 0.45 | 0.6 | 0.7 | 1 |

Table 6. Input values associated to Q2 for topic hierarchic level.


| Number of appearances | 1 | 2 | 3 | 4 | 5 | ≥ 6 |
|---|---|---|---|---|---|---|
| Associated value | 0 | 0.3 | 0.45 | 0.6 | 0.7 | 1 |

Table 7. Input values associated to Q2 for section hierarchic level.

Q2 is meaningless to determine the input value for the last hierarchical level. At this level, an index term appears only once on every object.
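The mappings in Tables 3 to 7 are simple staircase functions from an appearance count to a crisp input value. A possible implementation (the function and constant names are ours, not the chapter's) is a threshold lookup, shown here with the values of Tables 5 and 6:

```python
def lookup(count, table):
    """Map an appearance count to its associated input value.
    `table` lists (threshold, value) pairs in ascending order; the last
    entry covers all larger counts."""
    for threshold, value in table:
        if count <= threshold:
            return value
    return table[-1][1]

# Table 5: Q1 at object level (0, 1, 2, >= 3 appearances).
Q1_OBJECT = [(0, 1.0), (1, 0.7), (2, 0.3), (float("inf"), 0.0)]
# Tables 6/7: Q2 at topic/section level (1..5, >= 6 appearances).
Q2_TOPIC = [(1, 0.0), (2, 0.3), (3, 0.45), (4, 0.6), (5, 0.7), (float("inf"), 1.0)]

print(lookup(2, Q1_OBJECT), lookup(6, Q2_TOPIC))
```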

### *Question 3*

For Question 3, the answer is completely subjective. In this chapter, we propose the values "Yes", "Rather" and "No". Table 8 shows the input values associated with Q3. This value is independent of the hierarchical level.


| Answer (Does the term itself define the Object?) | Yes | Rather | No |
|---|---|---|---|
| Associated value | 1 | 0.5 | 0 |

Table 8. Input values associated to Q3.

For example, the developer of a web page would only have to answer "Yes", "Rather" or "No" to Question 3, without complicated mathematical formulas to describe it.

### *Question 4*


Finally, question 4 deals with the number of index terms joined to another one. If an index term is joined to another one, its weight is lower. This is due to the fact that the term must be a join term to refer to the object in question. We propose term weight values for this question in Table 9. Again, the values 0.7 and 0.3 are a consequence of considering the border between fuzzy sets.


| Joined terms to an index term | 0 | 1 | 2 | ≥ 3 |
|---|---|---|---|---|
| Associated value | 1 | 0.7 | 0.3 | 0 |

Table 9. Input values associated to Q4.

After considering all these factors, fuzzy rules must be defined. In the case of the Topic and Section hierarchical levels, we must consider the four input values that are associated with questions Q1, Q2, Q3 and Q4. Four output fuzzy sets have also been defined: HIGH, MEDIUM-HIGH, MEDIUM-LOW and LOW. For the definition of the fuzzy rules for the Term Weighting system, we have basically used the following criteria:


The combination of the four inputs and the three input fuzzy sets provides 81 possible combinations, which are summarized in Table 10.

In the object level (the last hierarchic level), Question 2 is discarded. Therefore, there is a change in the rules, although the criteria for the definition of the fuzzy rules are similar to the previous case. One input fewer reduces the number of rules to twenty-seven.
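A few of the Table-10-style rules can be evaluated with standard min/max fuzzy inference: min for AND, max for aggregation. The sketch below is a simplified illustration of ours, covering only rules R1, R2, R5 and R8; the membership values for the inputs would come from the input fuzzy sets and are assumed here.

```python
def evaluate_rules(mu):
    """mu[q][s]: membership of input q in set s (LOW/MEDIUM/HIGH)."""
    out = {"LOW": 0.0, "MEDIUM-LOW": 0.0, "MEDIUM-HIGH": 0.0, "HIGH": 0.0}

    # R1: IF Q1 = HIGH and Q2 != LOW THEN at least MEDIUM-HIGH
    r1 = min(mu["Q1"]["HIGH"], 1.0 - mu["Q2"]["LOW"])
    # R2: IF Q1 = MEDIUM and Q2 = HIGH THEN at least MEDIUM-HIGH
    r2 = min(mu["Q1"]["MEDIUM"], mu["Q2"]["HIGH"])
    # R5: IF Q3 = HIGH THEN at least MEDIUM-HIGH
    r5 = mu["Q3"]["HIGH"]
    out["MEDIUM-HIGH"] = max(r1, r2, r5)

    # R8: IF (R1 and R2) or (R1 and R5) or (R2 and R5) THEN HIGH
    out["HIGH"] = max(min(r1, r2), min(r1, r5), min(r2, r5))
    return out

# Assumed memberships for one index term at one hierarchical level.
mu = {
    "Q1": {"LOW": 0.0, "MEDIUM": 0.2, "HIGH": 0.8},
    "Q2": {"LOW": 0.1, "MEDIUM": 0.6, "HIGH": 0.3},
    "Q3": {"LOW": 0.0, "MEDIUM": 0.4, "HIGH": 0.6},
}
print(evaluate_rules(mu))
```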

### **3.2 Example of the followed methodology**

An example of the followed methodology is shown below. A comparison with the classical TF-IDF is done, starting from the definition of an object in the database of the Web portal of

the University of Seville. The following example shows the difference between applying the TF-IDF method and applying the Fuzzy Logic-based one.

| Rule number | Rule definition | Output |
|---|---|---|
| R1 | IF Q1 = HIGH and Q2 ≠ LOW | At least MEDIUM-HIGH |
| R2 | IF Q1 = MEDIUM and Q2 = HIGH | At least MEDIUM-HIGH |
| R3 | IF Q1 = HIGH and Q2 = LOW | Depends on other Questions |
| R4 | IF Q1 = HIGH and Q2 = LOW | Depends on other Questions |
| R5 | IF Q3 = HIGH | At least MEDIUM-HIGH |
| R6 | IF Q4 = LOW | Descends a level |
| R7 | IF Q4 = MEDIUM | If the Output is MEDIUM-LOW, it descends to LOW |
| R8 | IF (R1 and R2) or (R1 and R5) or (R2 and R5) | HIGH |
| R9 | In any other case | MEDIUM-LOW |

Table 10. Fuzzy rules.

In the Web portal database, Object 6.3.1 (http://bib.us.es/index-ides-idweb.html) is defined by the following standard question:

*What online services are offered by the Library of the University of Seville?*

If we consider the term 'library':

At Topic hierarchic level: the value associated to Q1 is 0.53; the value associated to Q3 is a weighted average: (7 × 0.5 + 3 × 0)/10 = 0.35.

At Section hierarchic level: the value associated to Q1 is 0; the value associated to Q3 is (3 × 0.5 + 1 × 0)/4 = 0.375; the value associated to Q4 is 0.63.

At Object hierarchic level: the value associated to Q1 is 0; the value associated to Q4 is 0.57.

A summary of the values for the index term 'library' is shown in Table 11.


| Hierarchic level | Method | Q1 value | Q2 value | Q3 value | Q4 value | Term Weight |
|---|---|---|---|---|---|---|
| Topic level (Topic 6) | TF-IDF | - | - | - | - | 1 |
| | Fuzzy Logic-based method | 0.53 | 1 | 0.35 | 0.66 | 0.56 |
| Section level (Section 3) | TF-IDF | - | - | - | - | 0.01 |
| | Fuzzy Logic-based method | 0 | 0.6 | 0.375 | 0.63 | 0.13 |
| Object level (Object 1) | TF-IDF | - | - | - | - | 0.01 |
| | Fuzzy Logic-based method | 0 | - | 0.5 | 0.57 | 0.33 |

Table 11. Comparison of Term Weight values.

We may see the difference with the corresponding weight for the TF-IDF method - a value Wik = 0.01 had been obtained -, but this is just what we were looking for: not only is the desired object found, but also the ones that are most closely related to it. The word 'library' has a small weight for the TF-IDF method because it cannot distinguish between the objects of Section 6.3. However, in this case all the objects will be retrieved, as they are interrelated. The weights of other terms determine the object which has a higher level of certainty.

### **4. Tests and results**

### **4.1 General tests**

Tests were held on the website of the University of Seville. 253 objects were defined, and grouped in a hierarchical structure, with 12 topics. Every topic has a variable number of sections and objects. From these 253 objects, 2107 standard questions were extracted. More


than half of them were not used for these tests, as they were similar to others and did not contribute much to the results. Finally, the number of standard questions used for the tests was 914. Also, several types of standard questions were defined.

Depending on the nature of the considered object, we defined different types of standard questions, such as:



For our tests, we considered the types of standard questions shown in Table 12.

| Type of standard question | Number of questions |
|---|---|
| Main standard questions | 252 |
| Synonym standard questions | 308 |
| Imprecise standard questions | 125 |
| Specific standard questions | 229 |
| Feedback standard questions | 0 |
| **Total standard questions** | **914** |

Table 12. Types of standard questions.

The standard questions were used as inputs in a Fuzzy Logic-based system. The outputs of the system are the objects with a degree of certainty greater than a certain threshold. To compare results, we considered the position in which the correct answer appears among the total number of answers identified as probable.
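The thresholding and ranking step described here can be sketched in a few lines. This is our own hypothetical fragment (the object identifiers, certainty values and threshold are assumptions, not the chapter's data):

```python
def retrieve(candidates, threshold=0.5, max_answers=5):
    """Keep objects whose certainty meets the threshold and return the
    top-ranked ones for the user (ideally between one and five answers)."""
    kept = [(obj, c) for obj, c in candidates.items() if c >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:max_answers]

# Hypothetical certainties for objects related to one user query.
candidates = {"6.3.1": 0.87, "6.3.2": 0.61, "6.3.3": 0.34, "6.2.1": 0.55}
print(retrieve(candidates))
```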

First of all, we shall define the thresholds to overcome in the Fuzzy Logic system. Thus, topics and sections that are not related to the object to be identified are removed. This is one of the advantages of using a hierarchical structure. Processing time is better as many subsets of knowledge are discarded. Anyway, it is desirable not to discard too many objects, in order to also obtain the related ones. The ideal is to retrieve between one and five answers for the user. The results of the consultation were sorted in 5 categories:


Results are shown in Table 13.


| Method | Cat1 | Cat2 | Cat3 | Cat4 | Cat5 | Total |
|---|---|---|---|---|---|---|
| TF-IDF method | 466 (50.98%) | 223 (24.40%) | 53 (5.80%) | 79 (8.64%) | 93 (10.18%) | 914 |
| FL-based method | 710 (77.68%) | 108 (11.82%) | 27 (2.95%) | 28 (3.06%) | 41 (4.49%) | 914 |

Table 13. Information Retrieval results of using both Term Weighting methods.

The results obtained with the TF-IDF method are quite reasonable: 81.18% of the objects are retrieved among the top 5 choices and more than half of the objects are retrieved in the first place. The Fuzzy Logic-based method is clearly better: 92.45% of the objects are retrieved and more than three-quarters are retrieved in the first place.

### **4.2 Tests according to the type of standard questions**

In order to refine the conclusions about both Term Weighting methods, it is important to make a more thorough analysis of the results. We submitted both Term Weighting methods to a comprehensive analysis according to the type of standard question. Results are shown in Table 14.

According to the results, the TF-IDF method works relatively well considering the number of objects retrieved. Though, the Fuzzy Logic-based method is more precise, retrieving 91.67% of the objects in the first place. On the other hand, good results for this type of questions are logical, since questions correspond to supposedly well-made user queries.

For synonymous standard questions, the conclusions are similar: the results obtained using the Fuzzy Logic-based method are better than those achieved with the TF-IDF method, especially in regard to precision. However, the TF-IDF method also ensures good results. Still, queries are not precise, so the performance is worse for the TF-IDF method than it is for the Fuzzy Logic-based method. This fact gives an idea of fuzzy logic as an ideal tool for adding more flexibility to the system. Anyway, the results are quite similar to those obtained for the main standard questions. They are only slightly worse, since synonym standard questions are similar to the main standard questions.

The difference is even more noticeable in regard to imprecise standard questions and specific standard questions. Imprecise standard questions are detected nearly as well as the main standard questions in the case of the Fuzzy Logic-based method. This is another reason to confirm the appropriateness of using Fuzzy Logic. As for the specific standard questions, we get the worst result by far among all classes of standard questions. This is logical, considering that these questions are associated with a main standard question but are more concrete. In fact, it is usual for such specific questions to belong to a list within a whole, so there may be objects that are more related to the query than the required object itself. This is hardly a drawback, since both objects are retrieved to the user (the more specific one and the more general one), and the user must choose which one is the most accurate. This case shows more clearly that the use of Fuzzy Logic allows the user to extract a larger number of objects.

Term Weighting for Information Retrieval Using Fuzzy Logic 189

| Type of standard question | Method | Cat1 | Cat2 | Cat3 | Cat4 | Cat5 | Total |
|---|---|---|---|---|---|---|---|
| Main standard questions | TF-IDF | 171 (67.86%) | 58 (23.02%) | 6 (2.38%) | 6 (2.38%) | 11 (4.37%) | 252 |
| Main standard questions | Fuzzy Logic-based | 231 (91.67%) | 13 (5.16%) | 2 (0.79%) | 0 (0.00%) | 6 (2.38%) | 252 |
| Synonym standard questions | TF-IDF | 177 (57.46%) | 86 (27.92%) | 13 (4.22%) | 15 (4.87%) | 17 (5.52%) | 308 |
| Synonym standard questions | Fuzzy Logic-based | 252 (81.82%) | 41 (13.31%) | 3 (0.97%) | 5 (1.62%) | 7 (2.27%) | 308 |
| Imprecise standard questions | TF-IDF | 46 (20.08%) | 49 (21.40%) | 26 (11.35%) | 55 (24.01%) | 52 (22.71%) | 229 |
| Imprecise standard questions | Fuzzy Logic-based | 107 (46.72%) | 53 (23.14%) | 24 (10.48%) | 23 (10.04%) | 22 (9.61%) | 229 |
| Specific standard questions | TF-IDF | 74 (59.20%) | 32 (25.60%) | 6 (4.80%) | 1 (0.80%) | 12 (9.60%) | 125 |
| Specific standard questions | Fuzzy Logic-based | 111 (88.80%) | 5 (4.00%) | 0 (0.00%) | 0 (0.00%) | 9 (7.20%) | 125 |

Table 14. Information Retrieval results of using both Term Weighting methods, according to the type of standard question.

### **4.3 Tests according to the number of standard questions**

Another aspect to consider in the analysis of the results is the number of standard questions assigned to every object. Obviously, an object that is well defined by a single standard question is very specific. Thus, it is easy to extract the object from the complete set of knowledge. However, there are objects that contain very vague or imprecise information, making it necessary to define several standard questions for every object. For this study, the objects are grouped according to the number of standard questions per object, as shown in Table 15.

188 Fuzzy Logic – Algorithms, Techniques and Implementations



Obviously, groups 1 and 2 are more numerous, since it is less common for many questions to have the same response. However, the objects from groups 3 and 4 correspond to a wide range of standard questions, so they are equally important. Table 15 shows the number of objects for each of these groups.


| Group number | Number of standard questions per object | Number of objects |
|---|---|---|
| Group 1 | 1 | 95 |
| Group 2 | 2-5 | 108 |
| Group 3 | 6-10 | 22 |
| Group 4 | > 10 | 28 |

Table 15. Groups according to the number of standard questions per object.

To analyze the results, the position in which the required object is retrieved must be considered. We take the position achieved for most of the standard questions that define the object. For example, if an object is defined by 15 standard questions and, for 10 of them, the object is retrieved in second place, the object is considered to have actually been retrieved in second place.
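The majority rule above can be sketched in a few lines of Python; the function name and the example split of positions are illustrative, not part of the chapter's system:

```python
from collections import Counter

def object_position(positions):
    """Effective retrieval position of an object, given the position in
    which it was retrieved for each of its standard questions: the
    position achieved for most of the questions wins."""
    counts = Counter(positions)
    position, _ = counts.most_common(1)[0]
    return position

# The chapter's example: an object defined by 15 standard questions,
# retrieved in second place for 10 of them (the 1/3/2 split is invented).
positions = [2] * 10 + [1] * 3 + [3] * 2
print(object_position(positions))  # -> 2
```

With ties, `Counter.most_common` falls back to insertion order, so a real evaluation would need an explicit tie-breaking rule.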

In short, this study does not focus on the answers to standard questions, but on the correctly retrieved objects. This provides a new element for the system analysis. Results are shown in Table 16.

For group 1, the results are almost perfect for the Fuzzy Logic-based method, as nearly all the objects are retrieved in the first place (about 94%). The TF-IDF method, though not as accurate, stands up to the comparison. This behaviour is repeated in group 2: the objects are often retrieved by both methods among the top three items, though the Fuzzy Logic-based method is more accurate, retrieving over 92% of the objects in the first place. In view of the tests, we conclude that the results are very good for both methods when up to five standard questions are defined. Although the results are better for the novel Fuzzy Logic-based Term Weighting method, they are also quite reasonable for the classical TF-IDF Term Weighting method.

However, the largest advantage of using Fuzzy Logic for Term Weighting occurs when many standard questions per object are defined, i.e. when the information is confusing, disordered or imprecise. For group 3, where objects are defined by between six and ten standard questions, we observe a significant difference between the classical TF-IDF method and the proposed Fuzzy Logic-based method. Although both methods retrieve all the objects, there is a big difference in how they are retrieved, especially in the accuracy of the information extraction: 86% of the objects are retrieved in first place using the Fuzzy Logic-based method, but only 45% using the classical TF-IDF method.




| Group | Method | Cat1 | Cat2 | Cat3 | Cat4 | Cat5 | Total |
|---|---|---|---|---|---|---|---|
| Group 1 | TF-IDF | 74 (77.89%) | 16 (16.84%) | 1 (1.05%) | 1 (1.05%) | 3 (3.16%) | 95 |
| Group 1 | Fuzzy Logic-based | 89 (93.68%) | 3 (3.16%) | 2 (2.10%) | 0 (0.00%) | 1 (1.05%) | 95 |
| Group 2 | TF-IDF | 86 (79.63%) | 21 (19.44%) | 1 (0.93%) | 0 (0.00%) | 0 (0.00%) | 108 |
| Group 2 | Fuzzy Logic-based | 100 (92.59%) | 7 (6.48%) | 0 (0.00%) | 0 (0.00%) | 1 (0.93%) | 108 |
| Group 3 | TF-IDF | 10 (45.45%) | 9 (40.91%) | 3 (13.63%) | 0 (0.00%) | 0 (0.00%) | 22 |
| Group 3 | Fuzzy Logic-based | 19 (86.36%) | 3 (13.63%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 22 |
| Group 4 | TF-IDF | 10 (35.71%) | 10 (35.71%) | 3 (10.71%) | 2 (7.14%) | 3 (10.71%) | 28 |
| Group 4 | Fuzzy Logic-based | 21 (75.00%) | 4 (14.29%) | 1 (3.57%) | 1 (3.57%) | 1 (3.57%) | 28 |

Table 16. Information Retrieval results of using both Term Weighting methods, according to the number of standard questions per object.

The difference is even more marked when more than ten standard questions per object are defined. In this case, it is obvious that none of the questions clearly defines the object, so the information is clearly vague. Using the Fuzzy Logic-based method, more than 96% of the objects are retrieved, 75% of them in the first place; with the TF-IDF method, only 82% of the objects are correctly retrieved, and only 35.7% of these objects are extracted in the first place.

In view of the table, we observe that the more standard questions per object, the better the results of the Fuzzy Logic-based method compared with those obtained with the classical TF-IDF method. Therefore, the obvious conclusion is that the more convoluted, messy and confusing the information is, the better the Fuzzy Logic-based Term Weighting method performs compared to the classical one. This makes Fuzzy Logic-based Term Weighting an ideal tool for information extraction in a web portal.

### **5. Future research directions**

We suggest the application of other Computational Intelligence techniques apart from Fuzzy Logic for Term Weighting. Among these techniques, we believe that the so-called neuro-fuzzy techniques represent a very interesting field, as they combine human reasoning provided by Fuzzy Logic and the connection-based structure of Artificial Neural Networks, taking advantage of both techniques. One possible application is the creation of fuzzy rules by means of an Artificial Neural Network system.

Another possible future direction is to check the validity of this method in other environments containing inaccurate, vague and heterogeneous data.

### **6. Conclusion**


The difficulty of distinguishing the necessary information from the huge quantity of unnecessary data has recently increased the use of Information Retrieval. In particular, the so-called Vector Space Model is widely used. The Vector Space Model is based on the use of index terms. These index terms are associated with certain weights, which represent the importance of these terms in the considered set of knowledge. In this chapter, we propose a novel automatic Fuzzy Logic-based Term Weighting method for the Vector Space Model. This method improves on the classic TF-IDF Term Weighting method thanks to its flexibility. The use of Fuzzy Logic is very appropriate in heterogeneous, vague, imprecise, or disordered information environments.
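As a minimal, generic sketch of the Vector Space Model just described (index terms, weights, query matching), and not of the chapter's actual system, TF-IDF weights and cosine similarity can be implemented as follows; the toy documents and query are invented:

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per
    document, with TF-IDF weights: tf(t, d) * log(N / df(t))."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

docs = [["fuzzy", "logic", "weighting"],
        ["motor", "torque", "control"],
        ["fuzzy", "sets"]]
vectors = tf_idf_vectors(docs)
query = {"fuzzy": 1.0, "logic": 1.0}
ranking = sorted(range(len(docs)),
                 key=lambda d: cosine(query, vectors[d]), reverse=True)
print(ranking)  # -> [0, 2, 1]: document 0 matches the query best
```

Note that a term occurring in every document gets weight zero under this IDF, which is the usual behaviour of the plain TF-IDF scheme.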

The Fuzzy Logic-based method is similar to TF-IDF, but it also considers two aspects that TF-IDF does not: the degree of identification of the object if a given index term alone is used in a query, and the existence of join index terms. Term Weighting is automatic, and the level of expertise required is low, so the operator does not need any knowledge about Fuzzy Logic. The operator only has to know how many times an index term appears in a certain subset and the answers to two simple questions.
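The chapter does not reproduce its membership functions or rule base here, so the following is only a generic Mamdani-style sketch of how a term weight could be inferred from a normalized frequency and an operator-supplied identification degree; all sets, rules and consequent values are invented for illustration:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c, peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_term_weight(freq, ident):
    """Generic Mamdani-style weight (NOT the chapter's actual rule base).
    freq : normalized frequency of the index term in the subset, in [0, 1]
    ident: degree to which the term alone identifies the object, in [0, 1]
    Three invented rules: both inputs low -> low weight, medium -> medium,
    high -> high; defuzzified as a weighted average of the consequents."""
    low = lambda x: tri(x, -0.5, 0.0, 0.5)
    med = lambda x: tri(x, 0.0, 0.5, 1.0)
    high = lambda x: tri(x, 0.5, 1.0, 1.5)
    r_low = min(low(freq), low(ident))    # fuzzy AND = min
    r_med = min(med(freq), med(ident))
    r_high = min(high(freq), high(ident))
    total = r_low + r_med + r_high
    if total == 0.0:
        return 0.0
    return (0.2 * r_low + 0.5 * r_med + 0.9 * r_high) / total

# High on both inputs -> weight between the medium (0.5) and high (0.9)
# consequents, closer to high.
print(fuzzy_term_weight(0.9, 0.8))
```

The weight varies smoothly between the low and high consequents, which is the kind of flexibility the text attributes to the fuzzy approach.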

Although the results obtained with the TF-IDF method are quite reasonable, the Fuzzy Logic-based method is clearly superior. Especially when user queries do not match the standard query, or when they are imprecise, we observe that performance declines more for the TF-IDF method than for the Fuzzy Logic-based method. This shows how suitable Fuzzy Logic is for adding more flexibility to an Information Retrieval system.

### **7. References**


Lee, D.L., Chuang, H. & Seamons, K. (1997). Document ranking and the vector-space model. *IEEE Software*, Vol.14, Issue 2, pp. 67-75.

Lertnattee, V. & Theeramunkong, T. (2003). Combining homogenous classifiers for centroid-based text classification. *Proceedings of the 7th International Symposium on Computers and Communications*, pp. 1034-1039.

Liu, S., Dong, M., Zhang, H., Li, R. & Shi, Z. (2001). An approach of multi-hierarchy text classification. *Proceedings of the International Conferences on Info-tech and Info-net*, Beijing, Vol.3, pp. 95-100.

Raghavan, V.V. & Wong, S.K. (1986). A critical analysis of vector space model for information retrieval. *Journal of the American Society for Information Science*, Vol.37 (5), pp. 279-287.

Ruiz, M. & Srinivasan, P. (1998). Automatic Text Categorization Using Neural Networks. *Advances in Classification Research vol. 8: Proceedings of the 8th ASIS SIG/CR Classification Research Workshop*, Ed. Efthimis Efthimiadis, Information Today, Medford: New Jersey, pp. 59-72.

Salton, G. (1988). *Automatic Text Processing*. Addison-Wesley Publishing Company.

Salton, G. & Buckley, C. (1996). Term Weighting Approaches in Automatic Text Retrieval. *Information Processing and Management*, Vol.32 (4), pp. 431-443. (Also: Technical Report TR87-881, Department of Computer Science, Cornell University, 1987.)

Van Rijsbergen, C.J. (1979). *Information Retrieval*. Butterworths.

Zhao, Y. & Karypis, G. (2002). Improving precategorized collection retrieval by using supervised term weighting schemes. *Proceedings of the International Conference on Information Technology: Coding and Computing*, pp. 16-21.

## **Artificial Intelligence Techniques of Estimating of Torque for 8:6 Switched Reluctance Motor**

Amin Parvizi *University of Malaya, Malaysia* 

### **1. Introduction**


The switched reluctance motor (SRM) is one of the best candidates for industrial and household applications. Owing to superior characteristics such as a high torque-to-inertia ratio, easy cooling, high-speed capability and ease of repair, the SRM has received considerable attention from researchers. One of the major difficulties is the nonlinear relation between current, rotor position and flux linkage. Due to this nonlinearity, it is essential to have an accurate model of the nonlinear characteristics of the SRM. The essence of this research work is to develop an SRM model based on artificial intelligence (AI) techniques such as fuzzy logic and adaptive neuro-fuzzy systems. In the papers (Chancharoensook & Rahman, 2002; Geldhof, Van den Bossche, Vyncke & Melkebeek, 2008; Mirzaeian-Dehkordi & Moallem, 2006; Gobbi & Ramar, 2008; Rajapakse, Gole, Muthumuni, Wilson & Perregaux, 2004; Wai-Chuen Gan, Cheung & Li Qiu, 2008), SRM models are presented based on look-up tables.

In this short communication, rule-based systems are considered in order to find a model of the nonlinear characteristics of the SRM. We call the approach rule-based because it relies on fixed data points. Fuzzy logic and adaptive neuro-fuzzy techniques are employed to develop a comprehensive model of the nonlinear characteristics of an 8:6 SRM. The torque profile is simulated with fuzzy logic and adaptive neuro-fuzzy techniques in MATLAB. In addition, an error analysis is conducted for these models; the data are tabulated and compared with published data. The result of the error analysis reflects the precision of the method and the capability of the approach for further simulation.
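The look-up-table models cited above can be sketched as plain bilinear interpolation over a fixed grid of (rotor angle, current) torque samples; the grid values below are invented, not measured SRM data:

```python
import bisect

def bilinear(thetas, currents, table, theta, i):
    """Bilinear interpolation of torque on a fixed (angle, current) grid.
    thetas, currents: sorted axis values; table[r][c] holds the torque at
    (thetas[r], currents[c]); (theta, i) must lie inside the grid."""
    r = min(bisect.bisect_right(thetas, theta) - 1, len(thetas) - 2)
    c = min(bisect.bisect_right(currents, i) - 1, len(currents) - 2)
    t = (theta - thetas[r]) / (thetas[r + 1] - thetas[r])
    u = (i - currents[c]) / (currents[c + 1] - currents[c])
    return ((1 - t) * (1 - u) * table[r][c]
            + t * (1 - u) * table[r + 1][c]
            + (1 - t) * u * table[r][c + 1]
            + t * u * table[r + 1][c + 1])

# Invented 2x2 torque grid: angle in degrees, current in amperes.
thetas, currents = [0.0, 10.0], [0.0, 5.0]
table = [[0.0, 2.0],
         [4.0, 6.0]]
print(bilinear(thetas, currents, table, 5.0, 2.5))  # -> 3.0
```

A fuzzy or neuro-fuzzy model plays the same role as this table but interpolates through learned membership functions instead of straight lines.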

### **2. Background theory**

Switched reluctance motor (SRM) is a type of synchronous machine. Figure 1 shows the classification of the SRM. This initial classification is made by considering the method of movement.

The stator and rotor are the two basic parts of the SRM. One of the most important features of the SRM is its simple structure: this type of electrical machine has no windings or magnets in the rotor. Both the stator and the rotor have salient poles; thus, it is called a double-salient machine. Figure 2 shows the typical structure of the SRM.


Fig. 1. Classification of the SRM

The number under the configuration (6/4 or 8/6) means an SRM with 6 or 8 poles on the stator and 4 or 6 poles on the rotor.

Fig. 2. SRM configuration
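For the pole counts mentioned (6/4 and 8/6), a commonly used expression for the stroke (step) angle is ε = 360°·(Ns − Nr)/(Ns·Nr); this standard SRM relation is assumed here, as the chapter does not state it:

```python
def step_angle_deg(n_stator, n_rotor):
    """Stroke (step) angle of an SRM from its pole counts:
    eps = 360 * (Ns - Nr) / (Ns * Nr) degrees."""
    return 360.0 * (n_stator - n_rotor) / (n_stator * n_rotor)

print(step_angle_deg(6, 4))  # -> 30.0 degrees for the 6/4 machine
print(step_angle_deg(8, 6))  # -> 15.0 degrees for the 8/6 machine
```

The smaller step angle of the 8/6 machine is one reason it produces a smoother torque profile than the 6/4 configuration.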

### **3. Operation of the SRMs**

The key to understanding rotor movement is the tendency of the rotor to move to the position of minimum reluctance at the instant of excitation. When two rotor poles directly face two stator poles, this is called the aligned position. In the aligned position, another set of rotor poles is out of alignment; therefore, another set of stator poles is excited to move the rotor poles until minimum reluctance is reached. Figure 3 shows a 6:4 SRM. In the first situation, suppose that two rotor poles are in the aligned position with the stator poles *c* and *c*′. When phase *a* is excited in the direction shown, the stator poles tend to pull the nearest rotor poles toward themselves, so those rotor poles come to face *a* and *a*′, respectively. After they are aligned, the stator current is turned off; the corresponding situation is shown in Figure 3(b). Now, *b* is excited and pulls the rotor poles toward *b* and *b*′, respectively. Hence, the rotor rotates in a clockwise direction.

Fig. 3. Operation of switched reluctance motor: a) phase c aligned, b) phase a aligned

The following figure shows the lamination profile of the 8:6 SRM in the aligned position with magnetic potential contours.

Fig. 4. Lamination profile of 8:6 SRM (Parvizi, Hassani, Mehbodnia, Makhilef, & Tamjis, 2009)

### **4. Single-phase SRMs**


During the past years, single-phase SRMs have attracted much attention due to their resemblance to universal and single-phase induction machines; in addition, single-phase SRMs are as cheap to manufacture as induction and universal machines. Specific applications of single-phase SRMs arise where high-speed motors are needed. When the stator and rotor poles are in front of each other, which is the aligned position, the current in the corresponding stator phase is turned off, and the rotor keeps moving toward the adjacent stator pole due to the stored kinetic energy. The adjacent stator phase is then energized to attract the rotor pole toward itself, and this process continues.

The major problem of single-phase SRM operation arises when the rotor and stator are in the aligned position at the instant of starting, or when the rotor is at a position where the load torque at starting is greater than the produced torque. A permanent magnet has been used as a solution: it pulls the rotor away from the stator, or to a position in which the motor can produce a torque greater than the load torque. As figure 5 shows, the rotor and stator are in the unaligned position.


Fig. 5. Single-phase SRM with permanent magnet

The maximum duty cycle of a single-phase SRM is 0.5; thus, noise and high torque ripple result from the torque discontinuity that this duty cycle causes. Applications in which torque ripple and noise are not important, such as home appliances and hand tools, are therefore a good fit for this machine.

### **5. Magnetization characteristic of SRM**

Due to saturation and the variation of reluctance with rotor position, there is no simple analytical expression for the field produced by a phase winding. An energy conversion approach is therefore used to analyze the energy conversion.

Figure 6 shows a typical magnetization curve. Flux linkage is a nonlinear function of both rotor position and excitation current. One of the most important parameters affecting the flux linkage is the air gap. As can be seen clearly, in the unaligned position the flux linkage is a linear function because the air gap is large; in other words, the gap between the stator pole and the rotor pole is big. In contrast, in the aligned position, due to the small air gap, the magnetization curve is heavily saturated.

Fig. 6. Magnetization curve for SRM

Figure 7 shows a magnetization curve under a specific condition: the rotor angle is locked somewhere between the aligned and unaligned positions. Energy and co-energy at any point of the curve are defined, respectively, by:

$$W_f = \int_0^{\lambda_a} i \, d\lambda(i, \theta) \tag{1}$$

$$W' = \int_0^{i_a} \lambda(\theta, i) \, di \tag{2}$$

$\lambda(i, \theta)$ represents the flux linkage as a nonlinear function of rotor position and current. As figure 7 shows, the area between the magnetization curve and the flux-linkage axis, up to $\lambda_a$, is called the stored field energy ($W_f$); this energy is stored in the iron core (rotor and stator) and in the air gap. The area under the magnetization curve, up to $i_a$, is called the co-energy ($W'$).

Fig. 7. Concept of stored field energy and co-energy

In the next step, suppose that the rotor is released. In this situation the rotor moves toward the adjacent stator pole until it reaches the aligned position. For an infinitesimal movement $\Delta\theta$, suppose that the current $i_a$ is held constant; the flux linkage then changes from point A to point B, as shown in figure 8. By conservation of energy, the mechanical work $\Delta W_m$ done by the rotor during the $\Delta\theta$ movement is balanced against the change in the stored field energy ($\Delta W_f$).

Fig. 8. Mechanical work area

196 Fuzzy Logic – Algorithms, Techniques and Implementations


The area $\Delta W_m$ equals the change in co-energy due to the $\Delta\theta$ movement. Thus, the mechanical energy can be stated as follows:

$$\Delta W_m = \Delta W' = \int_0^{i_a} \lambda(\theta_B, i) \, di - \int_0^{i_a} \lambda(\theta_A, i) \, di \tag{3}$$

$$\Delta W_m = T \, \Delta\theta \tag{4}$$

$$T = \frac{\Delta W_m}{\Delta\theta} = \frac{\int_0^{i_a} \lambda(\theta_B, i) \, di - \int_0^{i_a} \lambda(\theta_A, i) \, di}{\Delta\theta} \tag{5}$$

$$T = \frac{\partial}{\partial\theta} \int_0^{i} \lambda(\theta, i') \, di' \tag{6}$$

$$\lambda(\theta, i) = L(\theta) \, i \tag{7}$$

$$T = \int_0^{i} \frac{\partial \lambda(\theta, i')}{\partial\theta} \, di' = \int_0^{i} \frac{dL}{d\theta} \, i' \, di' = \frac{dL}{d\theta} \int_0^{i} i' \, di' = \frac{1}{2} i^2 \frac{dL}{d\theta} \tag{8}$$
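Equations (5) and (8) can be cross-checked numerically for the linear (unsaturated) case $\lambda(\theta, i) = L(\theta)i$, where the co-energy is $W' = \tfrac{1}{2}L(\theta)i^2$. The inductance values and slope below are illustrative assumptions:

```python
import math

# Cross-check of eq. (5) (torque as a co-energy difference over d-theta) against
# eq. (8) (T = 0.5 * i^2 * dL/d-theta) for linear magnetics, lambda = L(theta)*i.
L_MIN, L_MAX = 0.008, 0.060                   # illustrative inductances (H)
SLOPE = (L_MAX - L_MIN) / math.radians(15.0)  # assumed constant dL/d-theta (H/rad)

def L(theta):                                 # inductance in the rising region
    return L_MIN + SLOPE * theta

def coenergy(theta, i):                       # W'(theta, i) = 0.5 * L(theta) * i^2
    return 0.5 * L(theta) * i**2

i, theta_a, dtheta = 10.0, 0.05, 1e-6
t_eq5 = (coenergy(theta_a + dtheta, i) - coenergy(theta_a, i)) / dtheta
t_eq8 = 0.5 * i**2 * SLOPE
print(t_eq5, t_eq8)
```

Because the magnetics are linear here, the finite-difference torque of eq. (5) matches the closed form of eq. (8) up to floating-point error.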


Artificial Intelligence Techniques of Estimating of Torque for 8:6 Switched Reluctance Motor 201


### **6.2 Formation of Fuzzy Inference System (FIS)**

For the formation of the fuzzy inference system, the fuzzy logic toolbox of MATLAB is used. Figure 11 shows the FIS structure, in which current and rotor angle are the inputs and torque is the output.

Fig. 11. Fuzzy SRM FIS structure

### **6.3 Assigning the FIS membership functions**

Once the FIS structure is completed, membership functions for each of the inputs and the output are formed. The MATLAB toolbox has 11 built-in membership functions (MFs), among them trimf, gbellmf, gaussmf, gauss2mf, sigmf and psigmf. The MFs formed by straight lines are the triangular MFs; they are used here on account of their simple structure and because they are well suited to the modeling.

Current and rotor angle as the inputs have 8 MFs and 13 MFs, respectively, and torque as the output has 21 MFs for the 8:6 SRM. Figure 12, figure 13 and figure 14 show the MFs for current, rotor angle and torque, respectively.
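A triangular MF of the kind used above can be sketched in a few lines; the breakpoints below are hypothetical, not the chapter's actual MF parameters:

```python
# Minimal triangular membership function (the 'trimf' shape named above).
def trimf(x, a, b, c):
    """Degree of membership of x in a triangle with feet a, c and peak 1.0 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Degree of membership of a 9 A current in a hypothetical MF peaking at 8 A:
print(trimf(9.0, 6.0, 8.0, 10.0))   # 0.5
```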

Fig. 12. Membership functions for current

200 Fuzzy Logic – Algorithms, Techniques and Implementations

|    | s4  | s3  | s2  | s1  | M   | b1  | b2  | b3  |
|----|-----|-----|-----|-----|-----|-----|-----|-----|
| s6 | s10 | s10 | s10 | s10 | s10 | s10 | s10 | s10 |
| s5 | s10 | s10 | s10 | s9  | s9  | s8  | s8  | s8  |
| s4 | s10 | s9  | s9  | s8  | s8  | s7  | s7  | s6  |
| s3 | s9  | s8  | s8  | s7  | s6  | s5  | s3  | s2  |
| s2 | s8  | s7  | s5  | s2  | b1  | b3  | b6  | b9  |
| s1 | s8  | s7  | s4  | s1  | b2  | b5  | b7  | b10 |
| M  | s8  | s7  | s4  | s1  | b2  | b5  | b7  | b10 |
| b1 | s8  | s7  | s4  | s1  | b1  | b4  | b7  | b10 |
| b2 | s8  | s7  | s5  | s2  | b1  | b4  | b6  | b8  |
| b3 | s8  | s7  | s5  | s2  | M   | b2  | b4  | b6  |
| b4 | s8  | s7  | s5  | s4  | s2  | M   | b1  | b2  |
| b5 | s9  | s8  | s6  | s5  | s5  | s4  | s3  | s3  |
| b6 | s10 | s10 | s10 | s10 | s10 | s10 | s10 | s10 |

Table 2. Fuzzy rule base table for 8:6 SRM

It should be noted that a wrong interpretation of any of the rules will influence the overall output and, as a result, a wrong model will come up. Therefore, this part of the work should be done carefully and without any wrong rule.


Fig. 13. Membership functions for rotor angle

Fig. 14. Membership functions for torque

### **6.4 Constructing FIS rule**

The most important part of the modeling is constructing the FIS rules, because the outcome of this step defines the output fuzzy set. In other words, torque as the output is a fuzzy set that is formed by the results of the rule construction. Table 2 is used to construct the FIS rules. In a Mamdani-type system the rules are a set of if-then conditional statements. For instance,

Rule 1: If current is s6 and rotor angle is s4 then torque is s10

The degree of a membership function is a value between 0 and 1 which is the output of the membership function. The degree of a rule can then be defined as the product of the degrees of the inputs and the output. For example, the degree of the rule above is:

Degree (Rule 1) = µ (s6).µ (s4).µ (s10)

The FIS rule editor is used to produce the rules, as shown in figure 15. Figure 15 shows a part of the rules for the 8:6 SRM; the total number of rules is 80.
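The rule-degree product above can be illustrated directly. The triangular MF parameters below are hypothetical placeholders standing in for the chapter's s6/s4/s10 sets, not their actual definitions:

```python
# Degree of a Mamdani rule as the product of the membership degrees of its
# antecedents and consequent: Degree(Rule 1) = mu(s6) * mu(s4) * mu(s10).
def trimf(x, a, b, c):
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# hypothetical triangular MFs standing in for s6 (current), s4 (angle), s10 (torque)
def mu_s6(i):      return trimf(i, 0.0, 2.0, 4.0)
def mu_s4(theta):  return trimf(theta, 4.0, 8.0, 12.0)
def mu_s10(t):     return trimf(t, -3.0, -2.5, -2.0)

def rule_degree(i, theta, t):
    return mu_s6(i) * mu_s4(theta) * mu_s10(t)

print(rule_degree(2.0, 8.0, -2.5))   # every MF at its peak -> 1.0
```

The product form means a rule fires fully only when current, angle and torque all sit at the peaks of their MFs; any partial membership scales the degree down multiplicatively.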



Fig. 15. Constructing rules using rule editor

The surface viewer is used to show the dependency of the output on both inputs; it therefore generates a torque surface map. Figure 16 shows the torque surface for the 8:6 SRM.

Fig. 16. Surface viewer of the FIS for 8:6 SRM

### **7. Torque estimation model using Adaptive Neuro-Fuzzy Inference System (ANFIS)**

ANFIS is an acronym for Adaptive Neuro-Fuzzy Inference System. Adaptive neuro-fuzzy is a technique which provides a learning method, from the desired inputs and outputs, to adjust the MF parameters. During this process two algorithms, back propagation and hybrid, are used so that the best parameters for the MFs are achieved.

### **7.1 Forming ANFIS**


Considering the graphical representation of torque (figure 9), rotor angle and current are defined as the inputs and torque as the output. Figure 17 shows the FIS editor for ANFIS, in which the two inputs and one output are clearly shown.

Fig. 17. Neuro-Fuzzy SRM FIS structure

### **7.2 Training scheme of FIS**

Once the data set is obtained from the torque curve, data loading starts. The loaded data set should be in a three-column matrix format: the first and second columns belong to the inputs and the third one presents the torque data.
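The three-column layout can be sketched as follows. The sampling grids assumed here (currents 2–16 A in steps of 2 A, angles 0–30° in steps of 2.5°) are consistent with the 104 data points and the column sums (936, 1560) in the appendices; the torque column is a placeholder for measured values:

```python
import numpy as np

# Assemble the training data as a three-column matrix: column 1 = current,
# column 2 = rotor angle, column 3 = measured torque (placeholder zeros here).
currents = np.arange(2.0, 17.0, 2.0)    # 8 current samples: 2, 4, ..., 16 A
angles = np.arange(0.0, 31.0, 2.5)      # 13 angle samples: 0, 2.5, ..., 30 deg

rows = [[i, th, 0.0] for i in currents for th in angles]
train = np.array(rows)                  # shape (104, 3): the 104 data points
print(train.shape, train[:, 0].sum(), train[:, 1].sum())
```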

The training data appears in the plot in the center of the ANFIS editor as a set of circles, as shown in figure 18 for the 8:6 SRM.

Fig. 18. ANFIS editor with training data loaded for 8:6 SRM

### **7.3 Initializing and generating FIS**

Once the data set is loaded, the next step is initializing the MFs. There are two partitioning methods for initializing the MFs:

1. Grid partition
2. Subtractive clustering



The second method is employed on account of being a one-pass algorithm that estimates the number of clusters.

Figure 19 shows the cluster parameters, which are:

1. Range of influence
2. Squash factor
3. Accept ratio
4. Reject ratio


For both inputs, "range of influence" is set to 0.15. The other parameters keep their previous values, because those values are acceptable for the training scheme. Once the parameters are set, the resulting FIS generates 104 MFs for both of the inputs and for the output.
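A minimal sketch of subtractive clustering, the one-pass method named above, is given below. It follows the usual formulation (point potentials from Gaussian neighbourhoods, then iterative potential revision); the parameter names mirror the dialog in figure 19, the gray-zone acceptance test is simplified to a hard threshold, and the toy data set is purely illustrative:

```python
import numpy as np

# Simplified subtractive clustering: estimate cluster centres (and hence the
# number of MFs) from data density in a single pass over candidate points.
def subclust(X, range_of_influence=0.5, squash=1.25, accept=0.5, reject=0.15):
    X = np.asarray(X, dtype=float)
    ra, rb = range_of_influence, squash * range_of_influence
    # potential of each point = sum of Gaussian weights to every other point
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    P = np.exp(-4.0 * d2 / ra**2).sum(1)
    centers, p_first = [], None
    while True:
        k = int(P.argmax())
        if p_first is None:
            p_first = P[k]
        if P[k] < reject * p_first or len(centers) >= len(X):
            break
        if P[k] < accept * p_first:
            break   # simplified: the full algorithm adds a distance test here
        centers.append(X[k].copy())
        # revise potentials: subtract the influence of the newly chosen centre
        P = P - P[k] * np.exp(-4.0 * ((X - X[k]) ** 2).sum(1) / rb**2)
    return np.array(centers)

# two well-separated blobs -> the method should find 2 cluster centres
pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
centers = subclust(pts, range_of_influence=1.0)
print(len(centers))
```

Shrinking the range of influence (as the chapter does, to 0.15) makes each centre's neighbourhood smaller, so more centres survive the potential revision and more MFs are generated.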


Fig. 19. Parameters set for subtractive clustering

### **7.4 ANFIS training**

In order to optimize the obtained parameters, two methods are available:

1. Hybrid method: a combination of least squares and back propagation.
2. Back propagation: steepest descent applied to the MF parameters.


The first method is used for data training. An error tolerance is established to create the halt criterion: training stops after the set number of epochs. The number of epochs for the 8:6 SRM is 150.

The final training error is 3.014e-7, which is shown in figure 20 after 150 epochs.

Fig. 20. ANFIS training with hybrid method

### **7.5 Viewing ANFIS structure**


Figure 21 shows the ANFIS model structure for the 8:6 SRM. There are two inputs (rotor angle and current) and one output (torque). There are 104 MFs in total for each of the inputs.

Fig. 21. ANFIS model structure for 8:6 SRM

The summarized modeling description is shown in table 3 for 8:6 SRM.


| Modeling Description | Setting                |
|----------------------|------------------------|
| Number of Inputs     | 2                      |
| Number of Outputs    | 1                      |
| Number of MFs        | 104                    |
| Method               | Subtractive Clustering |
| Optimized Method     | Hybrid                 |
| Epochs               | 150                    |

Table 3. Summarized modeling description for the 8:6 SRM

The mapping surface of the 8:6 SRM using the neuro-fuzzy technique is shown in figure 22.


Fig. 22. Surface view of 8:6 SRM

### **8. Result and discussion**

### **8.1 Error analysis for torque estimation model using fuzzy logic technique**

Torque estimation based on the fuzzy logic technique has been presented: 8 and 13 membership functions were formed for the inputs and 21 for the torque as the output for the 8:6 SRM. Error analysis is conducted to obtain the accuracy of the model. Appendix A shows the computed torque values in comparison with the desired measured values.

### **From the results in appendix A:**

I = phase current (A)
$\theta$ = rotor angle in mechanical degrees
$T_f$ = measured torque in newton-meters
$T_m$ = computed torque in newton-meters
N = number of data points = 104
$\sum T_f$ = sum of the measured torque = 270.95
$\sum T_m$ = sum of the computed torque = 282.4614
$\sum|\varepsilon|$ = calculated total absolute error = 16.2222


Thus,

Mean $T_f = \frac{282.4614}{104} = 2.716$

$$\text{Average \% error} = \left[\frac{\sum|\varepsilon|}{\text{Mean } T_f \times N}\right] \times 100\% = \frac{16.2222}{2.716 \times 104} \times 100\% = 5.7431\%$$
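The error figure can be reproduced directly from the sums quoted above:

```python
# Recomputing the fuzzy-model error figure from the appendix A sums.
def average_percent_error(sum_abs_err, mean_torque, n_points):
    """Average % error = [sum|eps| / (mean_T * N)] * 100%."""
    return sum_abs_err / (mean_torque * n_points) * 100.0

err = average_percent_error(16.2222, 2.716, 104)   # values quoted above
print(round(err, 4))                               # 5.7431
```

The same function applied to the ANFIS sums (4.61e-11 total absolute error, mean torque 2.6058) reproduces the 1.7011e-11% figure of the next subsection.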

### **8.2 Error analysis for torque estimation model using adaptive neuro-fuzzy inference system technique**

ANFIS is one of the best approaches due to its capability of learning without depending on human knowledge. In other words, in the fuzzy logic approach the membership functions are formed from human knowledge, whereas ANFIS, because it has a training algorithm and is independent of human knowledge, is more capable of producing accurate data. In this section, error analysis shows the preciseness of the model:

### **From the results in appendix B:**

I = phase current (A)
$\theta$ = rotor angle in mechanical degrees
$T_f$ = measured torque in newton-meters
$T_m$ = computed torque in newton-meters
N = number of data points = 104
$\sum T_f$ = sum of the measured torque = 270.95
$\sum T_m$ = sum of the computed torque = 2.71E+02
$\sum|\varepsilon|$ = calculated total absolute error = 4.61e-011


Thus,


Mean $T_f = \frac{2.71\times 10^{2}}{104} = 2.6058$

$$\text{Average \% error} = \left[\frac{\sum|\varepsilon|}{\text{Mean } T_f \times N}\right] \times 100\% = \frac{4.61\times 10^{-11}}{2.6058 \times 104} \times 100\% = 1.7011\times 10^{-11}\%$$

### **9. Conclusion**

Error analysis is conducted for the two approaches. Table 4 reflects the average percentage error of each model.


|                          | FIS     | ANFIS       |
|--------------------------|---------|-------------|
| Average Percentage Error | 5.7431% | 1.701e-011% |

Table 4. Error analysis result

As can be seen clearly, the table above shows that the ANFIS model is the better of the two. The ANFIS technique is used to develop a predictive model that obtains a precise outcome. This approach can be used for any nonlinear function with arbitrary accuracy.

The torque profile of a switched reluctance motor is a nonlinear function, and its inherent nonlinear characteristics lead us toward artificial intelligence approaches. Due to the mentioned nonlinearity a predictive model is needed, and the ANFIS model is chosen owing to its predictive abilities. The reason is that the ANFIS modeling approach possesses a learning capability that allows it to learn from the data values through the training scheme, thus avoiding dependency on human knowledge of the system (Parvizi, Hassani, Mehbodnia, Makhilef & Tamjis, 2009). Besides, the ANFIS method does not have the complexity of the FIS method, which makes it much easier to understand and utilize. The average percentage error shows that the outcome is in good agreement with the published data. The torque profile is simulated and the results reveal that the ANFIS model is a trustworthy model for further research. In addition, this approach can be used to control the turn-off angle of the SRM, which leads to an SRM with low torque ripple.


### **10. Acknowledgment**

This work is dedicated to my parents, Mohammad and Fatemeh, for their kindness and support. The author would like to thank Dr. Aris Ramlan, Mr. Peter Nicoll and Dr. M. Beikzadeh for reviewing this work and for their right-on-target comments.

### **11. Appendix A**

*Error analysis for torque using the fuzzy logic technique*

| Current | Rotor Angle | Measured Torque | Computed Torque | \|ε\| |
|---------|-------------|-----------------|-----------------|---------|
| 2       | 0           | 0               | 0.1723          | 0.1723  |
| 4       | 0           | 0               | 0.1723          | 0.1723  |
| 6       | 0           | 0               | 0.1723          | 0.1723  |
| 8       | 0           | 0               | 0.1723          | 0.1723  |
| 10      | 0           | 0               | 0.1723          | 0.1723  |
| 12      | 0           | 0               | 0.1723          | 0.1723  |
| 14      | 0           | 0               | 0.1723          | 0.1723  |
| 16      | 0           | 0               | 0.1723          | 0.1723  |
| …       | …           | …               | …               | …       |
| 2       | 30          | 0               | 0.1723          | 0.1723  |
| 4       | 30          | 0               | 0.1723          | 0.1723  |
| 6       | 30          | 0               | 0.1723          | 0.1723  |
| 8       | 30          | 0               | 0.1723          | 0.1723  |
| 10      | 30          | 0               | 0.1723          | 0.1723  |
| 12      | 30          | 0               | 0.1723          | 0.1723  |
| 14      | 30          | 0               | 0.1723          | 0.1723  |
| 16      | 30          | 0               | 0.1723          | 0.1723  |
| **936** | **1560**    | **270.95**      | **282.4614**    | **16.2222** |


### **12. Appendix B**

*Error analysis for the ANFIS model of 8:6 SRM*

| Current | Rotor Angle | Measured Torque | Computed Torque | \|ε\| |
|---------|-------------|-----------------|-----------------|----------|
| 2       | 0           | 0               | 7.99e-014       | 7.99E-14 |
| 4       | 0           | 0               | 6.49e-014       | 6.49E-14 |
| 6       | 0           | 0               | 6.61e-014       | 6.61E-14 |
| 8       | 0           | 0               | 3.72e-14        | 3.72E-14 |
| 10      | 0           | 0               | 2.14e-13        | 2.14E-13 |
| 12      | 0           | 0               | 2.36e-13        | 2.36E-13 |
| 14      | 0           | 0               | 3.11e-13        | 3.11E-13 |
| 16      | 0           | 0               | 3.08e-014       | 3.08E-14 |
| …       | …           | …               | …               | …        |
| 2       | 30          | 0               | -1.18e-013      | 1.18E-13 |
| 4       | 30          | 0               | -4.62e-012      | 4.62E-12 |
| 6       | 30          | 0               | -6.92e-012      | 6.92E-12 |
| 8       | 30          | 0               | 3.25e-012       | 3.25E-12 |
| 10      | 30          | 0               | 1.15e-012       | 1.15E-12 |
| 12      | 30          | 0               | 3.34e-012       | 3.34E-12 |
| 14      | 30          | 0               | 1.36e-012       | 1.36E-12 |
| 16      | 30          | 0               | 2.43e-011       | 2.43E-11 |
| **936** | **1560**    | **270.95**      | **2.71E+02**    | **4.61E-11** |


### **13. References**

Bhiwapurkar, N.; Jain, A.K. & Mohan, N., "Study of new stator pole geometry for improvement of SRM torque profile," *Electric Machines and Drives, 2005 IEEE International Conference on*, pp. 516-520, 15 May 2005

Chancharoensook, P. & Rahman, M.F., "Dynamic modeling of a four-phase 8/6 switched reluctance motor using current and torque look-up tables," *IECON 02 [Industrial Electronics Society, IEEE 2002 28th Annual Conference of the]*, vol. 1, pp. 491-496, 5-8 Nov. 2002

Geldhof, K.R.; Van den Bossche, A.; Vyncke, T.J. & Melkebeek, J.A.A., "Influence of flux penetration on inductance and rotor position estimation accuracy of switched reluctance machines," *Industrial Electronics, 2008. IECON 2008. 34th Annual Conference of IEEE*, pp. 1246-1251, 10-13 Nov. 2008

Gobbi, R. & Ramar, K., "Practical current control techniques for torque ripple minimization in SR motors," *Power and Energy Conference, 2008. PECon 2008. IEEE 2nd International*, pp. 743-748, 1-3 Dec. 2008

Mirzaeian-Dehkordi, B. & Moallem, P., "Genetic Algorithm Based Optimal Design of Switching Circuit Parameters for a Switched Reluctance Motor Drive," *Power Electronics, Drives and Energy Systems, 2006. PEDES '06. International Conference on*, pp. 1-6, 12-15 Dec. 2006

Parvizi, A.; Hassani, M.; Mehbodnia, A.; Makhilef, S. & Tamjis, M.R., "Adaptive Neuro-Fuzzy Approach of Estimating of Torque for 8:6 Switched Reluctance Motor," *International Conference for Technical Postgraduates (TECHPOS), Malaysia*, IEEE conference proceedings, pp. 1-4, 2009

Rajapakse, A.D.; Gole, A.M.; Muthumuni, D.; Wilson, P.L. & Perregaux, A., "Simulation of switched reluctance motors embedded in large networks," *Power System Technology, 2004. PowerCon 2004. 2004 International Conference on*, vol. 1, pp. 695-700, 21-24 Nov. 2004

Wai-Chuen Gan; Cheung, N.C. & Li Qiu, "Short distance position control for linear switched reluctance motors: a plug-in robust compensator approach," *Industry Applications Conference, 2001. Thirty-Sixth IAS Annual Meeting. Conference Record of the 2001 IEEE*, vol. 4, pp. 2329-2336, 30 Sep-4 Oct. 2001




## **Engine Knock Detection Based on Computational Intelligence Methods**

Adriana Florescu1, Claudiu Oros1 and Anamaria Radoi2

*1University Politehnica of Bucharest, 2Ecole Politechnique Federale de Laussane, 1Romania 2Switzerland* 

### **1. Introduction**


*Artificial intelligence* emerged from human thinking, which has both logical and intuitive or subjective sides. The logical side has been developed and utilized, resulting in advanced von Neumann-type computers and expert systems, both constituting the *hard computing* domain. However, hard computing cannot by itself solve very complicated problems. In order to cope with this difficulty, the intuitive and subjective thinking of the human mind was explored, resulting in the *soft computing* domain (also called *computational intelligence*). It includes *neural networks*, *fuzzy logic* and *probabilistic reasoning*, the last gathering *evolutionary computation* (including *genetic algorithms* with related efforts in *genetic programming* and *classifier systems*, *evolution strategies* and *evolutionary programming*), *immune networks*, *chaos computing* and parts of *learning theory*. In different kinds of applications, all the pure artificial intelligence methods mentioned above proved to be complementary rather than competitive, so combined methods appeared in order to gather the advantages and cope with the disadvantages of each pure method. The scope of this chapter is to study and finally compare some representative classes of pure and combined computational intelligence methods applied in engine knock detection.

The internal-combustion engine is one of the most used vehicle power generators in the world today. Among the characteristics of a vehicle, and therefore of the engine that drives it, some of the most important are emissions, fuel economy and efficiency. All three of these variables are affected by a phenomenon that occurs in the engine called knock. *Engine knock* (also known as *knocking*, *self-combustion*, *detonation*, *spark knock* or *pinging*) in spark-ignition internal combustion engines occurs when combustion of the mixture of fuel and air in the cylinder starts off correctly because of the ignition by the spark plug, but one or more pockets of the mixture explode outside the normal combustion front. The importance of knock detection comes from the effects it generates; these can range from increased fuel consumption and pollution, through decreased engine power, up to partial or complete destruction of the cylinders, pistons, rods, bearings and many other damages around the engine bay.

Internal combustion engines present an optimum working cycle that is right on the edge of self-combustion or knock. If engine knock occurs and is detected in a cycle, then the ignition timing (spark angle) needs to be modified so that the next cycle does not suffer from the same phenomenon. This is why the detection needs to be done in under a cycle (Bourbai, 2000; Li & Karim, 2004; Hamilton & Cowart, 2008; Erjavec, 2009).

Engine knock can be detected using a series of devices placed in and around the engine bay, such as pressure sensors mounted inside each cylinder, devices that measure the ionization current in the spark plug, or accelerometers mounted on the engine to measure vibrations. The best and most accurate information on knock is given by the *pressure sensors*, but the easiest and least expensive way to detect it is by using *vibration sensors* mounted on the engine (Erjavec, 2009; Gupta, 2006; Bosch, 2004; Thomas et al., 1997; Ettefag, 2008; Fleming, 2001). The knock detection methods used so far for extracting information from the engine sensors include *time*, *frequency (spectrum)* or a diversity of *time-frequency analysis (Wavelet)* based solutions (Adeli & Karim, 2005; Park & Jang, 2004; Radoi et al., 2009; Midori et al., 1999; Lazarescu et al., 2004; Jonathan et al., 2006). The restriction of average detection rates and the complexity of information needed for the Wavelet analysis support further developments and hybridization with mixed techniques that proved useful in other fields of application than the one explored in this chapter: *wavelet-fuzzy* (Borg et al., 2005), *wavelet-neural* (Zhang & Benveniste, 1992; Billings & Wei, 2005; Wu & Liu, 2009; Banakar & Azeem, 2008) and *wavelet-neuro-fuzzy* (Ylmaz & Oysal, 2010).

Engine Knock Detection Based on Computational Intelligence Methods 213

The first layer represents the input and is built with fuzzy input neurons, each one selecting a characteristic of the original sample vector. In the case of a two-dimensional sample containing N1×N2 vector elements, we will have a first layer with N1×N2 neurons. For the neuron on the (i, j) position the equations are:

$$s_{ij}^{[1]} = z_{ij} = x_{ij} \tag{1}$$

$$y_{ij}^{[1]} = \frac{s_{ij}^{[1]}}{P_{v\max}} \tag{2}$$

for i = 1, 2, …, N1; j = 1, 2, …, N2, where $z_{ij}$ is its input value, $x_{ij}$ is the value of the element (i, j) in the input sample pattern, $s_{ij}^{[1]}$ represents the state of the neuron on the (i, j) position for the first layer, $y_{ij}^{[1]}$ is its output value and $P_{v\max}$ is the maximum value of all the input elements. The notation will be kept for neurons belonging to all the following layers.

Fig. 1. The Fuzzy Kwan-Cai Neural Network structure

The second layer is built of N1×N2 neurons and its purpose is to perform the fuzzification of the input patterns by means of the weight function $w(m, n)$, also called the fuzzification function, defined as:

$$w(m, n) = e^{-\beta^2 (m^2 + n^2)} \tag{3}$$

where the parameters run over m = -(N1-1), …, +(N1-1), n = -(N2-1), …, +(N2-1) and β determines how much of the sample vector each fuzzy neuron sees. Each neuron from the second layer has M outputs, one for each neuron in the third layer. The output of the second-layer neuron on position (p, q) is:

Internal combustion engines present an optimum working cycle that is right on the edge of self-combustion or knock. If engine knock occurs and is detected in a cycle then the ignition timing (spark angle) needs to be modified so that the next cycle does not suffer from the same phenomenon. This is why the detection needs to be done in under a cycle (Bourbai, 2000; Li&Karim, 2004; Hamilton&Cowart, 2008; Erjavec, 2009).

Engine knock can be detected using a series of devices placed in and around the engine bay, such as pressure sensors mounted inside each cylinder, devices that measure the ionization current in the spark plug, or accelerometers mounted on the engine to measure vibrations. The best and most accurate information on knock is given by the *pressure sensors*, but the easiest and least expensive way to detect it is by using *vibration sensors* mounted on the engine (Erjavec, 2009; Gupta, 2006; Bosch, 2004; Thomas et al., 1997; Ettefag, 2008; Fleming, 2001). The knock detection methods used so far for extracting information from the engine sensors include *time*, *frequency (spectrum)* and a diversity of *time-frequency analysis (Wavelet)* based solutions (Adeli&Karim, 2005; Park&Jang, 2004; Radoi et al., 2009; Midori et al., 1999; Lazarescu et al., 2004; Jonathan et al., 2006). The limited average detection rates and the complexity of information needed for the Wavelet analysis support further developments and hybridization with mixed techniques that proved useful in fields of application other than the one explored in this chapter: *wavelet-fuzzy* (Borg et al., 2005), *wavelet-neural* (Zhang&Benveniste, 1992; Billings&Wei, 2005; Wu&Liu, 2009; Banakar&Azeem, 2008) and *wavelet-neuro-fuzzy* (Ylmaz&Oysal, 2010).

Among the pure computational intelligence methods described in (Wang&Liu, 2006; Prokhorov, 2008; Mitchell, 2010; Wehenkel, 1997), different types of neural network applications have been employed with better detection rates than the previous non-neural methods, but no clear comparative analysis results have been presented so far for engine knock detection. The methods taken into account and finally compared in this chapter start with the *Fuzzy Kwan-Cai Neural Network* (Kwan&Cai, 1994), for the application of which other neuro-fuzzy or fuzzy logic models were studied (Zhang&Liu, 2006; Ibrahim, 2004; Liu&Li, 2004; Hui, 2011; Chen, 2005), expand to the *Kohonen Self-Organizing Map (SOM)* (Kohonen, 2000, 2002; Hsu, 2006; Lopez-Rubio, 2010) and end with the *Bayes Classifier* (Larose, 2006), for which the results of this chapter, in agreement with other work published so far (Auld et al., 2007), show that hybridization is needed.

Work started using two sizes of training and testing sample groups, both belonging to the Bosch Group database, in order to see how data size can affect the results. The applications were built to handle both pressure and vibration samples in order to see which of them can supply the most valuable information. In addition, due to the lack of publications available on this subject, through the analysis of the results we can get a better impression of the nature of these types of signals, the coherence of the samples and the evolution of detection rates with every new sample added. Also, to complete the analysis, a comparison of the responses from the pressure and vibration families of samples is made for the three methods.

### **2. Mathematical background of used computational intelligence methods**

### **2.1 Fuzzy Kwan-Cai neural network**

The Fuzzy Kwan-Cai neural network shown in Fig.1 has four layers, each of them being a fuzzy block represented by a different type of fuzzy neurons with their own specific purpose and functions (Kwan&Cai, 1994).

212 Fuzzy Logic – Algorithms, Techniques and Implementations


Fig. 1. The Fuzzy Kwan-Cai Neural Network structure

The first layer represents the input and is built with fuzzy input neurons, each one selecting a characteristic of the original sample vector. In the case of a two-dimensional sample containing N1xN2 vector elements we will have a first layer that has N1xN2 neurons. For the neuron on the (i, j) position the equations are:

$$s\_{ij}^{[1]} = z\_{ij}^{[1]} = x\_{ij}\,, \tag{1}$$

$$y\_{ij}^{[1]} = s\_{ij}^{[1]} \big/ P\_{v\max}\,, \tag{2}$$

for i=1, 2, …, N1; j=1, 2, …, N2, where $s_{ij}^{[1]}$ represents the state of the neuron on the (i, j) position for the first layer, $z_{ij}^{[1]}$ is its input value, $x_{ij}$ is the value of the element (i, j) in the input sample pattern ($x_{ij} \ge 0$), $y_{ij}^{[1]}$ is its output value and $P_{v\max}$ is the maximum value of all the input elements. The notation will be kept for neurons belonging to all the following layers.

The second layer is built of N1xN2 neurons and its purpose is to perform the fuzzification of the input patterns by means of the weight function $w(m,n)$, also called the fuzzification function, defined as:

$$w(m,n) = e^{-\beta^2 \left(m^2 + n^2\right)},\tag{3}$$

where parameters m=-(N1-1), …, +(N1-1), n=-(N2-1), …, +(N2-1) and β determines how much of the sample vector each fuzzy neuron sees. Each neuron from the second layer has M outputs, one for each neuron in the third layer. The output for the second layer neuron on position (p, q) is:


$$y\_{pqm}^{[2]} = q\_{pqm}\left(s\_{pq}^{[2]}\right), \tag{4}$$

for p=1, …, N1; q=1, …, N2; m=1, …, M, where $y_{pqm}^{[2]}$ is the $m$-th output of the second layer neuron on position (p, q), sent to the $m$-th third level neuron. The output function $q_{pqm}$ is determined within the training algorithm. For a more simplified approach, we can choose isosceles triangles with base α and height 1, mathematically defined as:

$$y\_{pqm}^{[2]} = q\_{pqm}\left(s\_{pq}^{[2]}\right) = \begin{cases} 1 - \dfrac{2\left|s\_{pq}^{[2]} - \theta\_{pqm}\right|}{\alpha}, & \text{for } \left|s\_{pq}^{[2]} - \theta\_{pqm}\right| \le \dfrac{\alpha}{2} \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

where $\alpha > 0$, p=1, …, N1; q=1, …, N2; m=1, …, M. Parameter $\theta_{pqm}$ is the center of the base of the isosceles triangle. The values of α and $\theta_{pqm}$ corresponding to each p, q and m are determined by means of the training algorithm.
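As a toy illustration (not part of the chapter), the isosceles-triangle output function of Eq. (5) can be written out directly; the function name and parameter values below are arbitrary:

```python
def triangle_membership(s, theta, alpha):
    """Isosceles-triangle output function q_pqm of Eq. (5):
    base alpha, height 1, centered at theta, zero outside the base."""
    if abs(s - theta) <= alpha / 2:
        return 1.0 - 2.0 * abs(s - theta) / alpha
    return 0.0

# 1 at the center, linear fall-off inside the base, 0 outside it.
peak = triangle_membership(0.5, 0.5, 2.0)         # 1.0
halfway = triangle_membership(1.0, 0.5, 2.0)      # 0.5
outside = triangle_membership(2.0, 0.5, 2.0)      # 0.0
```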

The third layer is made up of M neurons, each of them representing a learned pattern, and so the value of M can only be determined at the end of the learning process. It can be seen as a fuzzy deduction (inference) layer. The output of the third layer neuron is:

$$y\_m^{[3]} = s\_m^{[3]} = \min\_{p=1,\dots,N1}\left(\min\_{q=1,\dots,N2}\left(y\_{pqm}^{[2]}\right)\right), \tag{6}$$

for m=1,…, M.

The fourth and final layer is the network's output layer and is made up of competitive neurons, one for each pattern that is learned; it is the defuzzification layer. If an input pattern is most similar to the m-th pattern that was learned, then the output of the m-th competitive neuron is attributed the value 1 and the others the value 0:

$$s\_m^{[4]} = z\_m^{[4]} = y\_m^{[3]}\,, \tag{7}$$

$$y\_m^{[4]} = q\left[s\_m^{[4]} - T\right] = \begin{cases} 0, & \text{if } s\_m^{[4]} < T \\ 1, & \text{if } s\_m^{[4]} = T \end{cases} \tag{8}$$

$$T = \max\_{m=1,\dots,M}\left(y\_m^{[3]}\right), \tag{9}$$

for m=1, …, M, where T is defined as the activation threshold for all the neurons in the fourth layer.

The flowchart in Fig.2 summarizes the procedure of adapting and implementing the Fuzzy Kwan-Cai algorithm to the application proposed in the chapter. The differences from the standard theoretical algorithm are that the sample databases are first imported and validated for integrity and then separated into pressure and vibration classes, each further split into training and testing sets. The standard classification steps follow and the algorithm ends with the calculation of the detection rate.
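To make the four layers concrete, the following is a minimal one-dimensional forward pass (N2 = 1). This is a sketch only: the chapter does not spell out the layer-2 state, so the weighted combination of the layer-1 outputs with the fuzzification weights of Eq. (3) used below is an assumed reading, as are the helper names and all parameter values.

```python
import math

def kwan_cai_forward(x, prototypes, beta=0.5, alpha=2.0):
    """One forward pass through the four Kwan-Cai layers, Eqs. (1)-(9),
    for a 1-D sample. `prototypes` stands in for the learned centers
    theta (one list per third-layer neuron); training is not shown."""
    n = len(x)

    # Layer 1, Eqs. (1)-(2): state = input element, output = state / max.
    p_max = max(x)                       # assumes positive-valued samples
    y1 = [v / p_max for v in x]

    # Layer 2, Eq. (3): fuzzification weights; combining them with the
    # layer-1 outputs as a weighted sum is an assumption of this sketch.
    def w(m):
        return math.exp(-(beta ** 2) * (m ** 2))

    s2 = [sum(w(p - i) * y1[i] for i in range(n)) for p in range(n)]

    # Eq. (5): isosceles-triangle output function with base alpha.
    def q(s, theta):
        d = abs(s - theta)
        return 1.0 - 2.0 * d / alpha if d <= alpha / 2 else 0.0

    # Layer 3, Eq. (6): fuzzy inference, min over all positions.
    y3 = [min(q(s2[p], th[p]) for p in range(n)) for th in prototypes]

    # Layer 4, Eqs. (7)-(9): competitive output, the best match gets 1.
    t = max(y3)
    return [1 if ym == t else 0 for ym in y3]

# The sample resembles the first stored prototype far more than the second.
winner = kwan_cai_forward([1.0, 2.0, 4.0], [[1.0, 1.5, 1.5], [3.0, 3.0, 3.0]])
```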


Fig. 2. Flowchart for implemented Kwan-Cai algorithm

### **2.2 Kohonen Self-Organizing Map (SOM)**

The Kohonen Self-Organizing Map (SOM) with the structure presented in Fig.3 is a neural network characterized by the fact that neighboring neurons (cells) communicate among themselves by mutual-lateral interactions, transforming into detectors of specific classes when given input patterns. The learning can be unsupervised or supervised (Kohonen, 2000, 2002; Hsu, 2006; Lopez-Rubio, 2010). In this chapter the supervised learning algorithm was used.


The network transforms similarities among vectors into neural vicinities (the similar input patterns will be found as neighbors).

Fig. 3. The SOM neural network

From a structural point of view, the Kohonen neural network is composed of two layers out of which the first one is an input layer made of transparent neurons with no processing functions. Its purpose is to receive the input pattern and send it to the second layer. This first layer has the same size as the input pattern.

The second layer contains M output neurons, a number equal to or higher than the number of classes desired in order to classify the entry patterns. They can be arranged planar, linear, circular, as a torus or sphere, the training and performances being dependent on the network shape. The planar network can also be rectangular or hexagonal depending on the placement of neurons.

An input vector $X_p \in R^n$ is applied in parallel to all the neurons of the network, each of them being characterized by a weight vector:

$$W\_j = \left(w\_{0j}, w\_{1j}, \dots, w\_{n-1,j}\right)^T \in R^n\,, \tag{10}$$

for j=0, 1, …, M-1.

In order to choose the winning neuron j\* with its associated weight vector $W_{j^*}$ for an input pattern, we must calculate the Euclidean distance $d_j$ between that pattern and each of the neurons' weight vectors. The winner will be chosen by the lowest distance $d_{j^*}$ of all:

$$d\_j = \left\| X\_p - W\_j \right\|, \tag{11}$$

$$d\_{j^*} = \min\_{j}\left\{d\_j\right\}, \tag{12}$$

for j=0, 1, …, M-1.
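A minimal sketch of the winner search of Eqs. (10)-(12); the function name and example values are illustrative only:

```python
def find_winner(x_p, weights):
    """Winner selection, Eqs. (10)-(12): the neuron whose weight vector
    W_j lies closest to the input pattern X_p wins."""
    def dist(w):
        # Eq. (11): norm of the difference between pattern and weights.
        return sum((xi - wi) ** 2 for xi, wi in zip(x_p, w)) ** 0.5

    d = [dist(w) for w in weights]                        # one distance per neuron
    j_star = min(range(len(weights)), key=d.__getitem__)  # Eq. (12)
    return j_star, d[j_star]

# Neuron 1 holds the weight vector nearest to the pattern.
j_star, d_star = find_winner([1.0, 0.0], [[0.0, 0.0], [1.0, 0.1], [5.0, 5.0]])
```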


After the winner determination process has finished, the weight refining process is started; it must not affect all the neurons but only those in a certain vicinity $V_{j^*}$ around the winner j\*. Outside this perimeter the influence of the process is considered null. The radius of this vicinity starts out large and keeps getting smaller as the refining process advances.

The learning rate can have many expressions. In this application, the chosen expression was:

$$\eta(t) = \eta\_0 \exp\left[-\left\|r\_k - r\_{j^*}\right\|\big/\sigma^2\right], \tag{13}$$

where $r_{j^*}$ and $r_k$ are position vectors in the network, characterizing the neural center of the vicinity and the neuron with index k for which the refining process is taking place. Function η0 = η0(t) decreases in time, representing the value of the learning rate in the center of the vicinity:

$$\eta\_0(t) = a \big/ t^p\,, \tag{14}$$

The parameter σ controls the speed at which the learning rate decreases with the radius of the vicinity.

After the refining process for the current input vector is finished the next one is selected and so on until all the input vectors are used and the stop training condition is inspected. A useful stopping condition is the moment when the weights of the network cease being refined (are no longer being modified):

$$\left\| w\_{ij}(t+1) - w\_{ij}(t) \right\| < \varepsilon\,, \tag{15}$$

where i=0, 1, …, n-1 and j=0, 1, …, M-1.
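The refinement step of Eqs. (13)-(14) can be sketched as follows. The chapter does not state the weight-update rule itself, so the usual SOM step w ← w + η(x − w) is assumed here, as are the grid layout and all parameter values:

```python
import math

def refine_weights(x_p, weights, j_star, positions, t,
                   a=0.9, p=0.5, sigma=1.0, radius=1.0):
    """One refinement pass around the winner j*, Eqs. (13)-(14).
    `positions` holds the grid coordinates r_k of each neuron; neurons
    outside the vicinity V_j* (grid radius `radius`) are left untouched."""
    eta0 = a / (t ** p)                                # Eq. (14)
    r_win = positions[j_star]
    refined = []
    for r_k, w in zip(positions, weights):
        d_grid = math.dist(r_k, r_win)
        if d_grid > radius:                            # outside the vicinity
            refined.append(list(w))
            continue
        eta = eta0 * math.exp(-d_grid / sigma ** 2)    # Eq. (13)
        refined.append([wi + eta * (xi - wi) for wi, xi in zip(w, x_p)])
    return refined

# The winner (index 0) moves toward the pattern; the far neuron is untouched.
new_w = refine_weights([1.0, 0.0], [[0.0, 0.0], [1.0, 1.0]], 0,
                       [(0, 0), (5, 0)], t=1.0)
```

Training would repeat this pass over all input vectors and stop once the change of every weight falls below the ε of Eq. (15).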

The flowchart in Fig.4 summarizes the procedure of adapting and implementing the Kohonen Self-Organizing Map algorithm to the application proposed in the chapter. The differences from the standard theoretical algorithm are the same as those described for the Fuzzy Kwan-Cai algorithm in Fig. 2.

### **2.3 Bayes classifier**

For the Bayes classifier working with Gaussian classes (Larose, 2006), consider first the case of two (R=2) one-dimensional classes (n=1). The probability density being of Gaussian nature, the discriminant function can be defined as:

$$g\_r(x) = p(x \mid \omega\_r) \cdot P(\omega\_r) = \frac{1}{\sqrt{2\pi}\,\sigma\_r} \cdot e^{-\frac{\left(x - m\_r\right)^2}{2\sigma\_r^2}} \cdot P(\omega\_r), \tag{16}$$

where parameter $r \in \{1, 2\}$.
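For the one-dimensional case, Eq. (16) translates almost literally into code; the function name and example values are arbitrary:

```python
import math

def discriminant_1d(x, m_r, sigma_r, prior_r):
    """Eq. (16): Gaussian class-conditional density times the prior,
    for one 1-D class with mean m_r and standard deviation sigma_r."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma_r)
    return coeff * math.exp(-(x - m_r) ** 2 / (2.0 * sigma_r ** 2)) * prior_r

# With equal priors, a sample near the mean of class 1 scores higher there.
score_1 = discriminant_1d(0.2, 0.0, 1.0, 0.5)
score_2 = discriminant_1d(0.2, 3.0, 1.0, 0.5)
```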


Fig. 4. Flowchart for implemented Kohonen Self-Organizing Map algorithm

Making an expansion to the n-dimensional case, the formula (16) for the Gaussian dispersion becomes:


$$p(x \mid \omega\_r) = \frac{1}{(2\pi)^{n/2} \left| C\_r \right|^{1/2}}\, e^{-\frac{1}{2} (x - m\_r)^T C\_r^{-1} (x - m\_r)}\,, \tag{17}$$

where $m_r = E_r\{x\}$ represents the mean of the vectors in class r, $C_r = E_r\{(x - m_r)(x - m_r)^T\}$ is the covariance matrix of the vectors in class r and $E_r\{\cdot\}$ is the mean-value operator, used to make estimations concerning $m_r$ and $C_r$ based on a finite number $N_r$ of patterns from $\omega_r$. Their formulae are:

$$m\_r = \frac{1}{N\_r} \sum\_{x \in \omega\_r} x\,, \tag{18}$$

$$C\_r = \frac{1}{N\_r} \sum\_{x \in \omega\_r} x x^T - m\_r m\_r^T\,, \tag{19}$$

$C_r$ being a positive semi-definite symmetrical matrix. The discriminant function based on the Gaussian probability density will be:

$$g\_r(x) = \ln P(\omega\_r) - \frac{1}{2} \ln \left| C\_r \right| - \frac{1}{2} \left[(x - m\_r)^T C\_r^{-1} (x - m\_r)\right]. \tag{20}$$

The flowchart in Fig.5 summarizes the procedure of adapting and implementing the Bayes Classifier algorithm to the application proposed in the chapter. The differences from the standard theoretical algorithm are the same as those described for the Fuzzy Kwan-Cai algorithm in Fig.2 and the Kohonen Self-Organizing Map in Fig.4.
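A dependency-free sketch of Eqs. (18)-(20) for the two-dimensional case; the class data below are invented toy vectors, not samples from the Bosch database:

```python
import math

def estimate_stats(samples):
    """Eqs. (18)-(19): sample mean m_r and covariance C_r for one class."""
    n_r = len(samples)
    dim = len(samples[0])
    m = [sum(s[i] for s in samples) / n_r for i in range(dim)]
    c = [[sum(s[i] * s[j] for s in samples) / n_r - m[i] * m[j]
          for j in range(dim)] for i in range(dim)]
    return m, c

def discriminant_2d(x, m, c, prior):
    """Eq. (20) for n = 2; the 2x2 inverse and determinant are written
    out by hand to keep the sketch free of external libraries."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    inv = [[c[1][1] / det, -c[0][1] / det],
           [-c[1][0] / det, c[0][0] / det]]
    d = [x[0] - m[0], x[1] - m[1]]
    quad = (d[0] * (inv[0][0] * d[0] + inv[0][1] * d[1])
            + d[1] * (inv[1][0] * d[0] + inv[1][1] * d[1]))
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * quad

# Classification picks the class with the larger discriminant g_r(x).
knock = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2], [1.0, 0.9]]
clean = [[-1.0, -1.0], [-1.2, -0.8], [-0.8, -1.2], [-1.0, -0.9]]
(m1, c1), (m2, c2) = estimate_stats(knock), estimate_stats(clean)
x = [0.9, 1.1]
label = ("knock" if discriminant_2d(x, m1, c1, 0.5) > discriminant_2d(x, m2, c2, 0.5)
         else "no knock")
```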

### **3. Experimental results for each method**

### **3.1 Methodology description, results and analysis**

The algorithms treated in this chapter were tested on a Bosch Group database using two sizes of vector sample groups: one of 100 vectors and one of 1000, both of them containing pressure and vibration samples. In each case two thirds of the group was used for training and one third for testing.

The vectors that make up the database represent samples taken from petrol engines, some corresponding to knock situations and some not. The signals related to these samples were taken from pressure and vibration sensors mounted in and around the engine bay. The change in pressure caused by knock activity is seen as an immediate rise in pressure due to causes outside the normal engine piston cycle. On the other hand, the vibration sensors will detect vibrations – knocking noises – representing abnormal combustion fields being generated inside the pistons.

The applications have to declare knock or no knock for every sample vector received and, after testing the database, reach a verdict on the error of the process or in this case the identification rate. When knock is encountered actions can be taken to return the engine to a non-knock state.
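The final verdict of each application reduces to a simple identification-rate computation over the test vectors; the labels below are hypothetical:

```python
def detection_rate(predicted, actual):
    """Identification rate used as the verdict of the testing stage:
    the percentage of test vectors whose knock / no-knock label was
    declared correctly."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * hits / len(actual)

# Hypothetical verdicts for six test vectors (1 = knock, 0 = no knock).
rate = detection_rate([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 0, 1])
```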

Engine Knock Detection Based on Computational Intelligence Methods 221

The testing method for both algorithms (Fuzzy Kwan-Cai and Kohonen Self-Organizing Map) is the following: one parameter varies between its theoretical limits whereas the others remain constant. It is obvious that the difference between the bigger training group and the

The following tables contain only the significant part of the experimental results in order to


The testing method for both algorithms (Fuzzy Kwan-Cai and Kohonen Self-Organizing Map) is the following: one parameter varies between its theoretical limits while the others remain constant. The expectation is that the larger training group should yield a higher detection rate than the smaller one.
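
This one-at-a-time testing method can be sketched as follows; `detect` is a hypothetical stand-in for running either network on the test batch, and the parameter names and ranges merely mirror the chapter's tables.

```python
# Sketch of one-at-a-time parameter testing: each parameter sweeps its
# theoretical range while the others stay fixed at default values.
# detect() is a hypothetical stand-in for training/running a network
# and returning a detection rate.

def sweep(detect, defaults, ranges):
    """Vary one parameter at a time; return a (params, rate) pair per trial."""
    results = []
    for name, values in ranges.items():
        for v in values:
            params = dict(defaults, **{name: v})   # override one parameter
            results.append((params, detect(**params)))
    return results

# Example with a dummy detector whose rate peaks at alpha = 3.5:
rates = sweep(lambda alpha, beta, tf: round(100 - 10 * abs(3.5 - alpha), 2),
              defaults={"alpha": 3.5, "beta": 1.0, "tf": 0.15},
              ranges={"alpha": [3.5, 3.0, 2.5],
                      "tf": [0.35, 0.25, 0.15]})
best = max(rates, key=lambda r: r[1])
```

Only the significant (highest-rate) trials would then be reported, as the tables below do.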

The following tables contain only the significant part of the experimental results in order to outline the highest detection rates obtained.

### **3.2 Fuzzy Kwan-Cai neural network results**

220 Fuzzy Logic – Algorithms, Techniques and Implementations

Fig. 5. Flowchart for implemented Bayes Classifier

This type of neural network does not need training cycles: it learns as it examines the vectors it receives and builds its own classes during the testing process. To avoid confusion from the start, the high number of classes observed in Table I and Table II for this network is due to the second nature of the application, which acts like a "focusing lens" examining the internal structure of the two main classes; the number of classes we are actually interested in for this experiment is two. The parameters given in Table I and Table II, with their proper operating ranges, are: α, the base of the isosceles membership triangles (α ∈ [1.5; 3.5]); β, which determines how much of the sample vector each fuzzy neuron sees (β ∈ [0.1; 1.6]); and Tf, which represents the neural network's sensitivity to errors (Tf ∈ [0.1; 0.35]).
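
As an illustration of the α parameter, here is a minimal sketch of an isosceles-triangle membership function with base width α; the exact formulation inside the Kwan-Cai network may differ.

```python
# Illustrative isosceles-triangle membership function: membership is 1 at
# the triangle's centre c and falls linearly to 0 at a distance of alpha/2
# (so alpha is the full base width, as in the chapter's parameter tables).

def tri_membership(x, c, alpha):
    half = alpha / 2.0
    return max(0.0, 1.0 - abs(x - c) / half)

print(tri_membership(0.0, 0.0, 3.5))    # -> 1.0 (at the centre)
print(tri_membership(1.75, 0.0, 3.5))   # -> 0.0 (at the edge of the base)
```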

The first vector generates a class of its own; each subsequent vector is either found to be related to one of the earlier vectors, and is therefore placed in the same class, or starts a new class. The maximum detection results in the tables mentioned above are outlined in bold. Unsatisfactory results with high detection rates are presented in italics.
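
The incremental class-building rule can be sketched roughly as below. This is not the authors' fuzzy formulation: a plain Euclidean distance against class prototypes stands in for the fuzzy similarity, with a hypothetical threshold `tf` playing the role of the sensitivity parameter that decides when a new class is opened.

```python
# Rough sketch of incremental class creation: the first vector founds a
# class; each later vector joins the nearest existing class if it is close
# enough, otherwise it founds a new class.
import math

def build_classes(vectors, tf):
    prototypes, labels = [], []
    for v in vectors:
        dists = [math.dist(v, p) for p in prototypes]
        if dists and min(dists) <= tf:
            labels.append(dists.index(min(dists)))   # join an existing class
        else:
            prototypes.append(v)                     # too far: new class
            labels.append(len(prototypes) - 1)
    return labels

labels = build_classes([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)], tf=0.5)
# -> [0, 0, 1]: the first two vectors share a class, the third starts one
```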

Tables Ia and Ib present the pressure sample detection rate results for the Fuzzy Kwan-Cai neural network using the small sample database and the large sample database. According to Table Ia, the highest detection rate value of 68% was obtained for combination (3.5; 0.15; 1) where parameters Tf and β are kept constant whereas α varies.

The same method has been used for Table Ib, which shows combinations obtained by varying the parameter Tf while keeping the other two constant. Combinations run from (3.5; 0.35; 1) down to (3.5; 0.15; 1). A maximum correct detection rate of 93.40% was obtained for the (3.5; 0.22; 1) group.

The detection rate results in Tables Ia and Ib show that, from this point of view, the Fuzzy Kwan-Cai neural network is very stable: small variations of its parameters do not affect the experimental outcome. The results also make clear that enlarging the sample database increases the detection rates, the network not being affected by sample vectors that are not cohesive in nature with the rest of their class.

Tables IIa and IIb contain the vibration sample detection results, Table IIa for the small sample database and Table IIb for the large one. Table IIa uses the same method of parameter variation as Tables Ia and Ib, but no valid results are achieved, because a result must be higher than 50% to be considered satisfactory.

The first part of Table IIb contains results obtained using combinations built in the same way as in Tables Ia, Ib and IIa, Tf being the varying parameter while the others are kept constant. The combinations used start at (3.5; 0.35; 1) and end at (3.5; 0.15; 1). In this first set the

Engine Knock Detection Based on Computational Intelligence Methods 223

| α | Tf | β | Rate [%] | No. classes |
|-----|------|---|------|----|
| 3.5 | 0.15 | 1 | 68% | 2 |
| 3.4 | 0.15 | 1 | 64% | 4 |
| 3.3 | 0.15 | 1 | 64% | 4 |
| 3.2 | 0.15 | 1 | 64% | 4 |
| 3.1 | 0.15 | 1 | 64% | 4 |
| 3 | 0.15 | 1 | 48% | 5 |
| 2.9 | 0.15 | 1 | 48% | 5 |
| 2.8 | 0.15 | 1 | 48% | 7 |
| 2.7 | 0.15 | 1 | 48% | 7 |
| 2.6 | 0.15 | 1 | 48% | 7 |
| 2.5 | 0.15 | 1 | 72% | 9 |
| 2.4 | 0.15 | 1 | 60% | 9 |
| 2.3 | 0.15 | 1 | 58% | 11 |
| 2.2 | 0.15 | 1 | 58% | 11 |
| 2.1 | 0.15 | 1 | 58% | 11 |
| 2 | 0.15 | 1 | 68% | 12 |

Table Ia. Pressure detection rates – small database

Fig. 6. Plot of α (blue), Rate [%] (green) and No. classes (red) (Table Ia)

| α | Tf | β | Rate [%] | No. classes |
|-----|------|---|--------|----|
| 3.5 | 0.35÷0.23 | 1 | 93.40% | 1 |
| 3.5 | 0.22 | 1 | 93.40% | 2 |
| 3.5 | 0.21 | 1 | 93.40% | 3 |
| 3.5 | 0.2 | 1 | 93.40% | 3 |
| 3.5 | 0.19 | 1 | 93.40% | 4 |
| 3.5 | 0.18 | 1 | 93.40% | 5 |
| 3.5 | 0.17 | 1 | 93.40% | 5 |
| 3.5 | 0.16 | 1 | 93.40% | 8 |
| 3.5 | 0.15 | 1 | 93.40% | 10 |

Table Ib. Pressure detection rates – large database

| α | Tf | β | Rate [%] | No. classes |
|-----|------|---|------|----|
| 3.5 | 0.35÷0.29 | 1 | 48% | 2 |
| 3.5 | 0.28 | 1 | 48% | 3 |
| 3.5 | 0.27 | 1 | 62% | 6 |
| 3.5 | 0.26 | 1 | 62% | 6 |
| *3.5* | *0.25* | *1* | *68%* | *9* |
| *3.5* | *0.24* | *1* | *68%* | *9* |
| 3.5 | 0.23 | 1 | 62% | 9 |
| 3.3 | 0.3 | 1 | 48% | 3 |
| 3.3 | 0.3 | 0.9 | 48% | 3 |
| 3.3 | 0.3 | 0.8 | 48% | 3 |
| 3.3 | 0.3 | 0.7÷0.3 | 48% | 2 |
| 3.3 | 0.3 | 0.2 | 48% | 1 |

Table IIa. Vibration detection rates – small database

Fig. 7. Plot of Tf (blue), Rate [%] (red) and No. classes (green) (Table IIa)

| α | Tf | β | Rate [%] | No. classes |
|-----|------|---|--------|-----|
| 3.5 | 0.35÷0.24 | 1 | 93.40% | 1 |
| **3.5** | **0.24** | **1** | **93.40%** | **2** |
| 3.5 | 0.23 | 1 | 93.40% | 3 |
| 3.5 | 0.22 | 1 | 93.40% | 5 |
| 3.5 | 0.21 | 1 | 93.40% | 9 |
| 3.5 | 0.2 | 1 | 82.05% | 10 |
| 3.5 | 0.19 | 1 | 93.13% | 32 |
| 3.5 | 0.18 | 1 | 92.34% | 57 |
| 3.5 | 0.17 | 1 | 90.23% | 107 |
| 3.5 | 0.16 | 1 | 85.75% | 143 |
| 3.5 | 0.15 | 1 | 85.75% | 200 |
| 3.3 | 0.3 | 1 | 85.10% | 3 |
| 3.3 | 0.3 | 0.9 | 85.10% | 3 |
| 3.3 | 0.3 | 0.8 | 85.10% | 3 |
| **3.3** | **0.3** | **0.7÷0.3** | **85.10%** | **2** |
| 3.3 | 0.3 | 0.2 | 85.10% | 1 |

Table IIb. Vibration detection rates – large database



maximum correct detection rate of 93.40% is achieved for (3.5; 0.24; 1) – values shown in bold. Set two contains combinations from (3.3; 0.3; 1) down to (3.3; 0.3; 0.2), parameter β varying between 1 and 0.2 while the other two stay constant. A detection value not as high as, but equally as important as, the maximum obtained in the previous set is shown for the combinations from (3.3; 0.3; 0.7) to (3.3; 0.3; 0.3). The value, 85.10%, is of interest because it is much higher than the values constantly obtained and also corresponds to a correct class detection of two classes.

The vibration results presented in Tables IIa and IIb lead to the same conclusion revealed by Tables Ia and Ib: an increase in the database size leads to a substantial increase in the detection rate.

For the large sample group shown in Tables Ia and Ib and in Tables IIa and IIb, the neural network shows no difference in maximum detection rates; differences are observed only for the small sample group. Both table pairs also present the same maximum detection rate, showing that the network can learn to identify both types of samples with the same accuracy.

Table III presents the timing results: the average detection times for both pressure and vibration samples, for both the small and the large database. The large database obtains better results with almost equally small detection times – 0.0022 s for pressure and 0.0046 s for vibration – and pressure vectors tend to be detected faster than vibration ones because the pressure group is more coherent and homogeneous than the vibration group.


| Average detection time [s] | Pressure | Vibration |
|---|---|---|
| Large database | **0.0022** | **0.0046** |
| Small database | 0.0052 | 0.0056 |

Table III. Average detection times representing pressure and vibration for both small and large databases

Fig. 8. Plot of Tf (blue), Rate [%] (red) and No. classes (green) (Table IIb)

What can be observed from the start is that the bigger sample group has detection times almost equal, in both the pressure and the vibration case, to those of the smaller group, while showing a significant increase in detection rates. The average detection times in Table III show that, with optimization, the network can be used in real-time knock applications with very good detection rates and with no prior in-factory learning process.

One can observe for the Fuzzy Kwan-Cai algorithm that different combinations of parameters can produce the same detection rates, so that a linear variation in any of the parameters will not always lead to a linear variation in the detection rate.

### **3.3 Kohonen Self–Organizing Map neural network results**

The Kohonen Self-Organizing Map has a separate learning stage, composed of epochs, which takes place before the detection process begins. Once the learning stage has ended it does not need to be repeated, and the processing of the test batch begins.
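
A minimal sketch of such a separate, epoch-based learning stage is given below, using a simple 1-D map with a bubble neighborhood and a linearly decaying rate. The map size and parameter names mirror the tables that follow, but the data and implementation details here are illustrative assumptions, not the authors' exact code.

```python
# Minimal SOM learning stage: repeated epochs of best-matching-unit (BMU)
# search and neighborhood updates; afterwards the weights are frozen and
# only ever used for classification (learning never repeats at test time).
import random

def train_som(data, n_neurons=9, epochs=100, lr=0.2, radius=2):
    dim = len(data[0])
    rng = random.Random(0)                       # fixed seed: reproducible map
    w = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        # both the learning rate and the neighborhood shrink over the epochs
        a = lr * (1 - epoch / epochs)
        r = max(1, round(radius * (1 - epoch / epochs)))
        for x in data:
            bmu = min(range(n_neurons),
                      key=lambda i: sum((w[i][d] - x[d]) ** 2 for d in range(dim)))
            for i in range(max(0, bmu - r), min(n_neurons, bmu + r + 1)):
                for d in range(dim):
                    w[i][d] += a * (x[d] - w[i][d])  # pull neighborhood to x
    return w

weights = train_som([(0.1, 0.1), (0.9, 0.9)], n_neurons=9, epochs=50)
```

At test time a sample is simply assigned to the class associated with its best-matching neuron.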

For this neural network three sizes of neural map were used – nine, one hundred and four hundred neurons – as shown in Tables IV, V and VI. They were tested on both pressure and vibration samples.

Table IVa contains only the pressure sample detection rate results for the small vector database using the one-hundred-neuron configuration. By keeping the number of epochs constant at 100 and the learning rate at 0.2, and varying the neighborhood size from 90 down to 10, we obtained the following spike values: a detection rate of 80%, marked bold-italic, for the (100; 0.2; 83) group, and the maximum detection rate for the small database, 82.85%, marked bold, for the (100; 0.2; 82) combination.

Table IVb contains the pressure sample detection rates using the large database. From the start, using the nine-neuron map, an important fact appears: the nine-neuron map cannot cope with the large database, because its small number of neurons has to remember a large number of samples, leading to confusion and very low detection rates. The variation methods are the same as in the complete version of Table IVa but, even when varying each parameter while keeping the other two constant, we cannot obtain a spike value higher than 29.78% (marked italic), resulting from the combination (100; 0.4; 5). Performing the same variation techniques as in Table IVa, the maximum detection rate in Table IVb is 90.57%, from the (400; 0.2; 400) and (500; 0.2; 400) combinations – both marked bold – with lower but still important spikes of 89.66% for (100; 0.2; 400) and (100; 0.3; 400), marked bold-italic.

Table Va contains the vibration sample detection rates for the small database. The same variation methods as those in Tables IVa and IVb were used for the exact same values. The one-hundred-neuron network encounters its top value of 80% for the (100; 0.2; 95) combination and also a smaller spike of 74.28% for (100; 0.2; 60). The four-hundred-neuron network tops out at the 82.85% detection rate for the (300; 0.2; 400) combination of parameters. The same marking methods as in the previous tables were also used here and in the following ones.

The large database results for the vibration sample vectors are found in Table Vb. These values have come from the same methods of testing and values used in Tables IVa, IVb and Va. As in the case of the complete Table IVa (from which only the one-hundred neuron

Engine Knock Detection Based on Computational Intelligence Methods 227

| No. neurons | Epochs | Learning rate | Neighborhood | Rate [%] |
|-----|-----|-----|----|-------|
| 100 | 100 | 0.2 | 90 | 68.57 |
| *100* | *100* | *0.2* | *83* | *80* |
| **100** | **100** | **0.2** | **82** | **82.85** |
| 100 | 100 | 0.2 | 80 | 77.14 |
| 100 | 100 | 0.2 | 70 | 74.28 |
| 100 | 100 | 0.2 | 60 | 68.57 |
| 100 | 100 | 0.2 | 50 | 71.42 |
| 100 | 100 | 0.2 | 40 | 71.42 |
| 100 | 100 | 0.2 | 30 | 71.42 |
| 100 | 100 | 0.2 | 20 | 77.14 |
| 100 | 100 | 0.2 | 10 | 65.71 |

Table IVa. Pressure detection rates – small database

Fig. 9. Plot of No. Neurons (blue), Neighborhood (red) and Rate [%] (green) (Table IVa)

| No. neurons | Epochs | Learning rate | Neighborhood | Rate [%] |
|-----|-----|-----|-----|-------|
| 9 | 100 | 0.2 | 5 | 23.03 |
| 9 | 100 | 0.3 | 5 | 27.35 |
| *9* | *100* | *0.4* | *5* | *29.78* |
| *9* | *100* | *0.5* | *5* | *28.57* |
| 9 | 100 | 0.6 | 5 | 19.75 |
| 9 | 100 | 0.7 | 5 | 20.06 |
| *400* | *100* | *0.2* | *400* | *89.66* |
| *400* | *100* | *0.3* | *400* | *89.96* |
| 400 | 100 | 0.4 | 400 | 87.84 |
| 400 | 100 | 0.5 | 400 | 88.75 |
| 400 | 100 | 0.6 | 400 | 89.96 |
| 400 | 100 | 0.7 | 400 | 88.75 |
| *400* | *100* | *0.2* | *400* | *89.66* |
| 400 | 200 | 0.2 | 400 | 89.36 |
| 400 | 300 | 0.2 | 400 | 88.75 |
| **400** | **400** | **0.2** | **400** | **90.57** |
| **400** | **500** | **0.2** | **400** | **90.57** |

Table IVb. Pressure detection rates – large database

Fig. 10. Plot of No. Neurons (blue), Neighborhood (red) and Rate [%] (green) (Table IVb)

| No. neurons | Epochs | Learning rate | Neighborhood | Rate [%] |
|-----|-----|-----|----|-------|
| **100** | **100** | **0.2** | **95** | **80** |
| 100 | 100 | 0.2 | 90 | 65.71 |
| 100 | 100 | 0.2 | 80 | 65.71 |
| 100 | 100 | 0.2 | 70 | 57.14 |
| **100** | **100** | **0.2** | **60** | **74.28** |
| 100 | 100 | 0.2 | 50 | 65.71 |
| 100 | 100 | 0.2 | 40 | 65.71 |
| 100 | 100 | 0.2 | 30 | 68.57 |
| 100 | 100 | 0.2 | 20 | 60 |
| 400 | 100 | 0.2 | 400 | 65.71 |
| 400 | 200 | 0.2 | 400 | 71.42 |
| **400** | **300** | **0.2** | **400** | **82.85** |
| 400 | 400 | 0.2 | 400 | 65.71 |
| 400 | 500 | 0.2 | 400 | 62.85 |
| 400 | 600 | 0.2 | 400 | 71.42 |

Table Va. Vibration detection rates – small database

Fig. 11. Plot of No. Neurons (blue), Neighborhood (red) and Rate [%] (green) (Table Va)


| Average detection time [s] | Pressure (small DB) | Vibration (small DB) | Pressure (large DB) | Vibration (large DB) |
|---|---|---|---|---|
| SOM - 400 neurons | 0.0023 | 0.0027 | 0.0024 | 0.0028 |
| SOM - 100 neurons | **0.000193** | **0.000478** | **0.000538** | **0.000498** |
| SOM - 9 neurons | 0.0000576 | 0.0000579 | 0.0000535 | 0.0000872 |

Table VI. Pressure and vibration average detection times for both small and large sample databases



Fig. 12. Plot of No. Neurons (blue), Neighborhood (red) and Rate[%] (green) (Table Vb)

section has been presented in this chapter), the nine-neuron network in the complete Table Vb is not suited to working with such a large database, the network becoming confused. This shows in constant results under 50%, which cannot be considered valid experimental results; these values can only serve as examples of exceptional cases. The one-hundred-neuron network section presented in Table Vb obtains a maximum detection rate of 81.76% for the combination (100; 0.2; 50), another important value over 80% being 81.15% for (100; 0.2; 70). The four-hundred-neuron network tops out at 89.66% for the combination (100; 0.2; 250) and presents other important values of 89.36% for (100; 0.2; 325) and 89.05% for (100; 0.2; 375).

| No. neurons | Epochs | Learning rate | Neighborhood | Rate [%] |
|-----|-----|-----|-----|-------|
| 100 | 100 | 0.2 | 95 | 79.63 |
| 100 | 100 | 0.2 | 90 | 79.93 |
| 100 | 100 | 0.2 | 80 | 79.02 |
| *100* | *100* | *0.2* | *70* | *81.15* |
| 100 | 100 | 0.2 | 60 | 78.11 |
| **100** | **100** | **0.2** | **50** | **81.76** |
| 100 | 100 | 0.2 | 40 | 75.98 |
| 100 | 100 | 0.2 | 30 | 79.93 |
| 100 | 100 | 0.2 | 20 | 76.59 |
| 400 | 100 | 0.2 | 400 | 87.53 |
| *400* | *100* | *0.2* | *375* | *89.05* |
| 400 | 100 | 0.2 | 350 | 88.75 |
| *400* | *100* | *0.2* | *325* | *89.36* |
| 400 | 100 | 0.2 | 300 | 86.83 |
| 400 | 100 | 0.2 | 275 | 86.62 |
| **400** | **100** | **0.2** | **250** | **89.66** |
| 400 | 100 | 0.2 | 225 | 88.75 |
| 400 | 100 | 0.2 | 200 | 88.44 |
| 400 | 100 | 0.2 | 175 | 88.44 |

Table Vb. Vibration detection rates – large database

Table VI presents the average detection times using both pressure and vibration vectors for both small and large databases. With values of 0.0023 s (small database) and 0.0024 s (large database), the pressure samples obtain smaller detection times than the vibration samples, at 0.0027 s (small database) and 0.0028 s (large database). This situation is representative of the four-hundred-neuron network, which is also the slowest solution but has the highest detection rates. The nine-neuron network, even though it has the best detection times, cannot be considered for a real application because it is not able to cope with the large database. The one-hundred-neuron network is the best compromise between detection speed and detection rate, as shown in this table.

As with the previously described algorithms, the SOM results in Tables IV and V show that an increase in the sample group size (here, the training set) leads to an increase in detection rates. In this case, the two separate groups are not separated by big detection rate gaps.



As theory predicts, the experimental results in Tables IV, V and VI show that increasing the number of neurons increases the detection rates but also increases the detection times, because more neurons translate into more detail that can be remembered, so the distinction between knock and non-knock situations can be made more precisely – a compromise must therefore be made. Being interested not only in high detection rates but also in detection times coherent with the task at hand (samples must be processed in under an engine cycle, so that modifications can be applied to the next one), the one-hundred-neuron map seems to be the best option of the three tested. The nine-neuron map, even though it produces very short detection times, has a very poor detection rate on both the pressure and the vibration group, making it useless for any further applications.

The four-hundred-neuron map presented the highest detection rates for this neural network, values slightly lower than those of the Fuzzy Kwan-Cai but with very similar detection times, the only difference being that the SOM needs separate training. Looking at the detection times in Table VI, the SOM does not seem to distinguish between pressure and vibration signals, the mean detection times showing very small variations. There is a small difference in detection rates between pressure and vibration samples; the SOM seems to handle both models very well.

A very important factor in the proper operation of the Kohonen Self-Organizing Map is calibrating the number of epochs and the learning rate well. A greater than necessary number of epochs leads to a situation where the network learns within the necessary time period but is left with surplus epochs that are not used for learning. In combination with a high learning rate, this leads to the network learning everything very fast in the first epochs and then forgetting or distorting that knowledge in the following ones.
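
One common way to limit this effect, assumed here rather than taken from the chapter, is to decay the learning rate across the planned number of epochs so that late epochs only make small corrections instead of distorting what was learned early:

```python
# Linearly decaying learning-rate schedule: starts at lr0 and reaches
# lr_min on the final epoch, so late updates cannot overwrite early
# learning. The 0.2 starting value mirrors the tables above; the decay
# itself is an illustrative assumption.

def learning_rate(epoch, total_epochs, lr0=0.2, lr_min=0.01):
    """Learning rate for a given epoch under linear decay."""
    return lr_min + (lr0 - lr_min) * (1.0 - epoch / total_epochs)

print(learning_rate(0, 100))     # -> 0.2 (fast learning at the start)
print(learning_rate(100, 100))   # -> 0.01 (tiny corrections at the end)
```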


| Training vectors | Test vectors | Press. rate [%] | Vib. rate [%] |
|-----|-----|-------|-------|
| 11 | 90 | 65.50 | 55.55 |
| 12 | 89 | 56.17 | 58.42 |
| **13** | **88** | **72.50** | **60.22** |
| 15 | 86 | 51.16 | 53.48 |
| 21 | 80 | 63.75 | 66.25 |
| 28 | 73 | 54.79 | 64.38 |
| 35 | 66 | 59.09 | 60.60 |
| 41 | 60 | 68.33 | 70 |
| 47 | 54 | 59.25 | 68.51 |
| 55 | 46 | 56.52 | 73.91 |
| 61 | 40 | 67.50 | 70 |
| 65 | 36 | 52.77 | 77.77 |
| 67 | 34 | 68.57 | 74.28 |
| 75 | 26 | 50 | 76.92 |
| 80 | 21 | 42.85 | 76.19 |

Table VIIa. Pressure - vibration detection rates – small database

Fig. 13. Test vectors (blue), pressure (red) and vibration rates (green) (Table VIIa)

| Training vectors | Test vectors | Press. rate [%] | Vib. rate [%] |
|-----|-----|-------|-------|
| **371** | **629** | **93.64** | **90.30** |
| 391 | 609 | 93.43 | 90.14 |
| 411 | 589 | 93.20 | 90.32 |
| 431 | 569 | 92.97 | 89.98 |
| 451 | 549 | 92.71 | 89.79 |
| 471 | 529 | 92.43 | 89.60 |
| 491 | 509 | 92.14 | 89.58 |
| 511 | 489 | 91.82 | 89.77 |
| 531 | 469 | 91.42 | 89.55 |
| **551** | **449** | **91.09** | **89.08** |
| **571** | **429** | **90.67** | **89.04** |
| **591** | **409** | **91.44** | **89.48** |
| 611 | 389 | 92.28 | 90.23 |
| 631 | 369 | 93.22 | 91.32 |
| 651 | 349 | 94.26 | 92.26 |

Table VIIb. Pressure - vibration detection rates – large database

### **3.4 Bayes classifier results**

The Bayes Classifier, as its name indicates, is not a neural network, but it has been included in this chapter as a basic reference point for evaluating the two neural networks. It classifies by computing the minimum distance from a sample to one of the knock or non-knock class centers, the classes being considered Gaussian in nature. That is why it presents the worst detection times, as shown in Table VIII.
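
A sketch of that minimum-distance rule follows: each class is summarised by its mean vector and a sample is assigned to the nearest centre. This is the Bayes-optimal decision for Gaussian classes with equal priors and equal spherical covariances – a simplifying assumption made here; the chapter's exact formulation may differ, and the data below is illustrative.

```python
# Minimum-distance classification against class centres: each class
# (knock / non-knock) is reduced to its mean vector, and a new sample is
# labelled by the nearest centre.
import math

def class_centre(samples):
    """Mean vector of a list of equal-length sample tuples."""
    dim = len(samples[0])
    return tuple(sum(s[d] for s in samples) / len(samples) for d in range(dim))

def classify(x, centres):
    """Label of the class centre nearest to sample x."""
    return min(centres, key=lambda c: math.dist(x, centres[c]))

centres = {"knock": class_centre([(1.0, 1.0), (1.2, 0.8)]),
           "no_knock": class_centre([(0.0, 0.1), (0.2, -0.1)])}
label = classify((0.9, 1.1), centres)   # -> "knock"
```

Because every test sample must be compared against the class statistics built from all training vectors, a larger database directly slows the classification step, as noted below.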

Table VIIa presents the combined pressure and vibration detection rates for the small database. Testing for this algorithm progresses from a small comparison group (the batch of samples chosen to represent the known classes) versus a large test group, to a large comparison group versus a small test group.

The process starts out with a balance of 11 training vectors and 90 testing ones, giving detection rates of 65.50% for pressure and 55.55% for vibration, and progressively grows the training group while shrinking the testing group, up to 85 training vectors and 16 testing vectors, where the detection rates end at 43.75% for pressure and 81.25% for vibration. An interesting detail can be observed in this table: the pressure detection rate remains roughly constant, staying approximately between 50% and 72.50% even as more and more vectors are added to the learning group, the last value being the highest pressure detection rate.
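
The progressive train/test balance described above can be sketched as follows; `evaluate` is a hypothetical stand-in for training and scoring the classifier, and the rates produced below are dummy values, not the chapter's results.

```python
# Progressive split evaluation: the training group grows while the test
# group shrinks, and a detection rate is recorded for each balance.

def progressive_splits(vectors, train_sizes, evaluate):
    """Return (n_train, n_test, rate) for each requested training size."""
    results = []
    for n_train in train_sizes:
        train, test = vectors[:n_train], vectors[n_train:]
        results.append((n_train, len(test), evaluate(train, test)))
    return results

# Dummy evaluation whose rate simply grows with the training share:
data = list(range(100))
res = progressive_splits(data, [11, 41, 80],
                         lambda tr, te: round(50 + 40 * len(tr) / 100, 2))
# -> [(11, 89, 54.4), (41, 59, 66.4), (80, 20, 82.0)]
```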

The change of state occurs at the end of the table, where we can observe a decrease in the detection rate for the combinations (80 training vectors; 21 testing vectors), with a detection rate of 42.85%, and (85 training vectors; 16 testing vectors), with a detection rate of 43.75%.

This decrease is due to the inclusion in the learning group of vectors that are radically different from their stated class; therefore, the knock or non-knock distinction cannot be made. In the case of the vibration sample vectors the progression is one of almost uniform growth, from 55.55% to 81.25%, the latter also being the maximum detection rate for the small database experiment.

Table VIIb follows the same type of progression, but using the large database for both pressure and vibration samples. The progression goes from a combination of (371 training vectors; 629 testing vectors), with detection rates of 93.64% for pressure and 90.30% for vibration, to a combination of (671 training vectors; 329 testing vectors), with the maximum detection rates achieved in this table: 95.44% for pressure and 92.40% for vibration. Within this progression it can be seen clearly that the pressure samples are very cohesive in nature and that, given enough samples, the algorithm overcomes the problems it has with radically different sample vectors, maintaining a detection rate over 90% in every case.

Table VIII represents the average detection times for both the small and large databases using both pressure and vibration samples.

Being a simple comparative algorithm, we can see in Table VIII that an increase in the database size leads to a slowing down of the process because the comparison must be made with more vectors. In the case of the small database, pressure vectors are detected faster 230 Fuzzy Logic – Algorithms, Techniques and Implementations

The Bayes Classifier, as described by its name, is not a neural network but has been included in this chapter as a basic reference point for the evaluation of the two neural networks. It uses a method of calculating the minimum distance from a sample to one of the knock or non-knock class centers - classes that are considered Gaussian by nature. That is why it

Table VIIa represents the combined pressure and vibration detection rates status for the small database. The way the testing has been done for this algorithm is by progressively growing from a small comparison group (the batch of samples chosen to represent the known classes for testing) versus large test group situation, to a large comparison group

The process starts out with a balance of 11 training vectors and 90 testing ones, which leads to a detection rate starting from 65.50% for pressure and 55.55% for vibration and grows (for training vectors) versus shrinks (for testing vectors) in a progressive way to 85 training vectors and 16 testing vectors, leading to a detection rate ending at 43.75% for pressure and 81.25% for vibration. An interesting detail can be observed in this table: the pressure vectors seem to present a constant state even though more and more are added to the learning group every time the detection rates stay approximately between 50% and 72.50%, the last

The change of state occurs at the end of the table where we can observe a decrease in the learning rate for the combinations of (80 training vectors; 21 testing vectors) with a detection rate of 42.85% and (85 training vectors; 16 testing vectors) with a detection rate of 43.75%.

This decrease is due to the inclusion in the learning group of vectors that are radically different from their stated class; therefore, the knock or non-knock distinction can not be made. In the case of the vibration sample vectors the progression is of almost uniform growth from 55.55% to 81.25%, the last being also the maximum detection rate for the small

Table VIIb follows the same type of progression, only that the large database is used for both pressure and vibration samples. The progression goes from a combination of (371 training vectors; 629 testing vectors) with a detection rate of 93.64% for pressure and 90.30% for vibration samples to a combination of (671 training vectors; 329 testing vectors) with the maximum detection rate achieved in this table of 95.44% for pressure samples and 92.40% for vibration samples. Within this progression it can be seen more clearly that the pressure samples are very cohesive in nature and that, given enough samples, the algorithm goes past the problems it has with radically different sample vectors, maintaining a detection rate

Table VIII represents the average detection times for both the small and large databases

Being a simple comparative algorithm, we can see in Table VIII that an increase in the database size leads to a slowing down of the process because the comparison must be made with more vectors. In the case of the small database, pressure vectors are detected faster

**3.4 Bayes classifier results** 

versus small test group situation.

database experiment.

over 90% in every case.

using both pressure and vibration samples.

value being the highest pressure detection rate.

presents the worst detection times, as shown in Table VIII.
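The minimum-distance rule used by the Bayes Classifier can be sketched as follows. This is a minimal illustration only; the sample vectors and thresholds are invented for the example and are not taken from the Bosch database:

```python
import math
from statistics import mean

def center(samples):
    """Class center = component-wise mean of the sample vectors
    (the classes are assumed Gaussian, so the mean estimates the center)."""
    return tuple(mean(xs) for xs in zip(*samples))

def classify(vector, centers):
    """Minimum-distance rule: assign the class whose center is nearest."""
    return min(centers, key=lambda c: math.dist(vector, centers[c]))

# Illustrative 2-D feature vectors for the two classes.
centers = {
    "knock": center([(3.0, 3.1), (2.9, 3.0), (3.2, 2.8)]),
    "no-knock": center([(0.1, 0.0), (-0.2, 0.1), (0.0, -0.1)]),
}
print(classify((2.8, 3.0), centers))  # -> knock
```

Because every new vector is compared against class statistics built from the whole comparison group, enlarging the database improves the class estimates but slows classification, matching the behaviour reported above.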


Table VIIa. Pressure and vibration detection rates (small database)

Fig. 13. Test vectors (blue), pressure rates (red) and vibration rates (green) (Table VIIa)


Table VIIb. Pressure and vibration detection rates (large database)

Engine Knock Detection Based on Computational Intelligence Methods 233



Fig. 14. Test vectors (red), training vectors (blue), pressure rates (green) and vibration rates (violet)


| Average detection time [s] | Pressure | Vibration |
|---|---|---|
| Small sample database | **0.0287** | 0.0297 |
| Large sample database | 0.0948 | **0.094** |

Table VIII. Pressure and vibration average detection times for both small and large sample databases


There is little relevance in the detection rates for the small sample group, even though a small variation between pressure and vibration can be seen. The increase in detection rates due to a bigger knowledge database can also be seen from Tables VIIa and VIIb.

The greatest importance of the Bayes Classifier in this chapter comes from its great sensitivity to change. When the knowledge group includes vectors that are incoherent with, or markedly different from, the others, the detection rate drops immediately. In this case the algorithm cannot classify properly, because one or both classes contain vectors that lie very far from their centers, and vectors from one class may get tangled up with the other. In this way the Bayes Classifier acts as a monitor for changes in the constitution of the sample classes, a "magnifying glass" reacting to the internal composition of the data groups.

Given a knowledge database that is big enough and very coherent in the nature of its classes, the detection rates go up and become comparable to those of the neural networks, but at a great cost in speed.

### **4. Comparison among the three tested methods**

The first discussion concerns database size. As can be seen from Fig. 15 and Fig. 16, which summarize the results in Tables I, II, IV, V and VII, the size of the learning, training or comparison database is very important for the good functioning of all three tested algorithms.



Fig. 15. Pressure sample detection rates using the small database (a) and the large database (b) for the Kwan-Cai, SOM neural networks and the Bayes Classifier

An increase in the database size from one hundred to one thousand sample vectors leads to an increase of at least ten percent in the detection rates. For the small database, the Fuzzy Kwan-Cai neural network obtains a maximum detection rate for the pressure samples (68%) that is higher than the one for vibration samples (48%), but with the large data set the maximum pressure and vibration detection rates become equal at 93.40%. The difference in detection rates for the pressure and vibration samples on the small database shows that the pressure samples are more coherent and therefore easier to classify. The same evolution shown by the Fuzzy Kwan-Cai also holds for the Kohonen Self-Organizing Map (SOM). Moreover, the increase in learning database size leads to a theoretical increase in the detection rate of the Bayes Classifier.

The second discussion concerns the detection rates. As shown in Fig. 15 and Fig. 16, the Bayes Classifier seems to show the best detection rates. Its fault is that it needs large amounts of comparison data in order to create classes that are comprehensive enough. Out of the three algorithms tested in this chapter, it is also the least stable, because it computes distances to the centers of the comparison classes: if these classes are not well defined and separated, the detection rates fall dramatically, as can be seen in Table VIIb. The Fuzzy Kwan-Cai obtains the highest valid detection rates of all three algorithms, rates that are not influenced by the nature of the learned vectors, which accounts for the great stability of this method. Its learning method automatically generates learning classes as it goes through the sample set, and the fuzzy logic creates a more organic representation of the knowledge classes than a Boolean one would. The Kohonen Self-Organizing Map (SOM) presents the second-highest detection rates and a more controlled and stable learning and training environment than the other two algorithms. Because the learning is done prior to the start of the testing process, in repeated epochs, the neural network has the chance to go through the data set again and again until a complete image is formed.

The two neural networks show no considerable preference between pressure and vibration samples and present high stability to drastic variations in training samples, which in a non-neural method could cause a decrease in detection rates. The nature of these types of signals and their differences are outlined by the Bayes Classifier's sensitivity to unclear classes and by the way in which the Fuzzy Kwan-Cai neural network reveals the internal structure of the classes.
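The way a SOM forms its "image" of the data over repeated epochs can be sketched with a minimal one-dimensional map. The neuron count, learning-rate schedule and Gaussian neighbourhood below are illustrative choices, not the configuration used in the chapter's experiments:

```python
import math
import random

def train_som(data, n_neurons=10, epochs=50, lr0=0.5, radius0=3.0):
    """Train a minimal 1-D SOM: for each sample, find the best-matching
    unit (BMU) and pull it and its neighbours toward the sample, while
    the learning rate and neighbourhood radius shrink over the epochs."""
    dim = len(data[0])
    rng = random.Random(42)  # fixed seed for a reproducible sketch
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_neurons)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)
        radius = max(radius0 * (1 - epoch / epochs), 0.5)
        for x in data:
            bmu = min(range(n_neurons), key=lambda i: math.dist(weights[i], x))
            for i in range(n_neurons):
                # Gaussian neighbourhood: nearby units move more than distant ones.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                weights[i] = [w + lr * h * (xi - w) for w, xi in zip(weights[i], x)]
    return weights
```

After training, samples from well-separated classes map to different best-matching units, which is what makes the trained map usable as a classifier.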

Fig. 16. Vibration sample detection rates using the small database (a) and the large database (b) for the Kwan-Cai, SOM neural networks and the Bayes Classifier

The third discussion concerns detection times. As presented in Fig. 17 and Fig. 18, which summarize the results in Tables III, VI and VIII, it is clear at first glance that the neural networks are far superior to the non-neural classification algorithm. The Bayes Classifier has the longest detection times, owing to the process of comparing each new vector against the knowledge classes. The best valid detection times are obtained by the Kohonen Self-Organizing Map with the one-hundred-neuron configuration. Given some optimization of the code, this configuration can reach detection times compatible with the engine combustion cycles in which the knock detection needs to take place. With fewer than one hundred neurons the network struggles to give satisfactory detection rates, even though the detection times decrease dramatically. In this chapter we are interested in maximizing the balance between high detection rates and low detection times, not in achieving either extreme at the expense of the other. The second-best detection times, which are also very close to one another, belong to the Fuzzy Kwan-Cai and to the SOM with the four-hundred-neuron configuration.
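The average detection times of Table VIII are wall-clock time per classified sample; a minimal measurement harness might look like the following, where `classify` is a stand-in for any of the three methods:

```python
import time

def average_detection_time(classify, samples, repeats=5):
    """Average wall-clock time per classified sample (cf. Table VIII).
    Repeating the whole pass smooths out timer jitter."""
    start = time.perf_counter()
    for _ in range(repeats):
        for s in samples:
            classify(s)
    elapsed = time.perf_counter() - start
    return elapsed / (repeats * len(samples))
```

Comparing such per-sample times against the duration of an engine combustion cycle is what decides whether a configuration is usable for on-line knock detection.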

These two algorithms also show the highest detection rates among the methods tested in this chapter. In a real-time application the SOM's separate training stage should not pose any problem, because it would be performed only once, at the factory. The Fuzzy Kwan-Cai neural network presents a different advantage: it can learn as it goes along, needing no separate training stage while continuously receiving information and gaining knowledge.
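A greatly simplified sketch of this learn-as-you-go behaviour is leader-style online clustering, where a new class prototype is created whenever a sample lies too far from every existing prototype. The actual Fuzzy Kwan-Cai architecture (Kwan & Cai, 1994) is considerably more elaborate; the distance threshold here is an illustrative parameter:

```python
import math

def online_learn(stream, threshold=1.0):
    """Leader-style online clustering: each incoming sample either joins
    (and incrementally updates) the nearest prototype, or founds a new
    class when no prototype is within the threshold."""
    prototypes, counts = [], []
    for x in stream:
        if prototypes:
            j = min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i], x))
            if math.dist(prototypes[j], x) <= threshold:
                counts[j] += 1
                # Incremental mean update of the winning prototype.
                prototypes[j] = [p + (xi - p) / counts[j]
                                 for p, xi in zip(prototypes[j], x)]
                continue
        prototypes.append(list(x))
        counts.append(1)
    return prototypes
```

The network thus builds its own classes from the sample stream, with no separate training stage, which is the property the paragraph above attributes to the Fuzzy Kwan-Cai.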

It is clear from the information presented in this chapter that the best detection rates combined with very good detection times belong to the Kohonen Self-Organizing Map with a configuration of one hundred neurons.

The SOM with a configuration of four hundred neurons obtains results almost identical to those of the Fuzzy Kwan-Cai. The difference between these two networks is that the SOM requires a separate training stage, in which separated and well-defined learning classes are given to it, whereas the Fuzzy Kwan-Cai learns as it receives sample vectors and builds its own classes.

The Bayes Classifier is very useful for showing the nature of the knock and non-knock classes, and how well they are defined and separated, owing to its sensitivity to drastic variations in sample vectors. However, its detection rate depends on the size of the knowledge database and on its coherence, which makes it unusable in real-world applications.

From a real-world application point of view, the experimental results make it clear that, in order to further maximize detection rates, a parallel process composed of pressure and vibration analysis and detection becomes necessary. Owing to developments in digital signal processing (DSP) technology, this parallel process would not lead to increased detection times.

### **5. Concluding remarks**



In order to avoid overcrowding, this final section contains only general concluding remarks, since detailed and accurate conclusions have already been presented in Sections 3 and 4 above.

Three methods of knock detection were studied and compared in this chapter. Testing was performed on a Bosch Group database. Two of the three algorithms are neural: the Fuzzy Kwan-Cai neural network, with its unsupervised learning approach and fuzzy inference core, and the Kohonen Self-Organizing Map (SOM), with a separate supervised learning stage. The third, the Bayes Classifier, is non-neural.

The three algorithms were trained, or given comparison classes, and then tested on two different database sizes, a small one of one hundred sample vectors and a large one of one thousand samples, in order to show how the database size affects the detection outcome.

Experiments were made on both pressure and vibration sample vectors in order to see which of these are more coherent in nature (the results show an overall greater coherence and slightly higher detection rates for the pressure samples) and how this coherence might affect the algorithms being tested. The experiments led to results that prove the superiority of the neural methods over the conventional classification, viewed from a rate-time perspective as seen in Fig. 15, Fig. 16, Fig. 17 and Fig. 18. The difference between the neural and non-neural methods is represented by an average scale factor of 0.001 s in favour of the neural ones. This superiority should also be seen from the point of view of stability to errors, as in Table VIIb, where a stray vector can distort the judgement of the non-neural Bayes Classifier so that detection rates fall.

Comparisons between the algorithms led to experimental results that allow us to draw conclusions about which methods are superior to the others and in what way, and also about the properties and nature of the database used in the experiments.
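The detection-rate figures used throughout these comparisons come from directly comparing each detection outcome with the known label and counting errors, which can be sketched as:

```python
def detection_rate(classify, test_vectors, true_labels):
    """Detection rate [%]: compare each detection outcome with the known
    label, increment an error counter on mismatch, and report the share
    of correct classifications."""
    errors = sum(1 for x, y in zip(test_vectors, true_labels) if classify(x) != y)
    return 100.0 * (len(true_labels) - errors) / len(true_labels)

# Illustrative stand-in classifier and labels, not real engine data.
print(detection_rate(lambda x: "knock" if x > 0 else "no-knock",
                     [1.2, -0.5, 3.3, 0.7],
                     ["knock", "no-knock", "knock", "no-knock"]))  # -> 75.0
```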


Fig. 17. Pressure sample detection times using the small database (a) and the large database (b) for the Kwan-Cai, SOM neural networks and the Bayes Classifier

Fig. 18. Vibration sample detection times using the small database (a) and the large database (b) for the Kwan-Cai, SOM neural networks and the Bayes Classifier

Suggestions for real-world applications were made in the previous section, leading to further optimizations around the strengths and weaknesses of each algorithm.

The three algorithms, and above all the two neural networks, have long been used in varied applications, showing great robustness and stability. They were used and tested here in their standard form, as presented in (Kohonen, 2000, 2002) and (Kwan & Cai, 1994). Verification was done by directly comparing each detection outcome with the known optimal value for each vector and incrementing an error counter on each mismatch. The databases were verified to be consistent with their description.

### **6. Acknowledgement**

This work was supported by CNCSIS – UEFISCSU, project number PNII – IDEI code 1693/2008.

### **7. References**

Adeli, H. & Karim, A. (2005). Wavelets in Intelligent Transportation Systems (1st edition), Ed. Wiley, ISBN-13: 978-0470867426, England
Auld, T.; Moore, A.W. & Gull, S.F. (2007). Bayesian Neural Networks for Internet Traffic Classification, vol. 18, issue 1, pp. 223-239, ISSN: 1045-9227
Banakar, A. & Azeem, M.F. (2008). Artificial wavelet neural network and its application in neuro-fuzzy models, Appl. Soft Comput., vol. 8, no. 4, pp. 1463-1485, ISSN: 1568-4946
Billings, S.A. & Wei, H.L. (2005). A new class of wavelet networks for nonlinear system identification, vol. 16, issue 4, pp. 862-874, ISSN: 1045-9227
Borg, J.M., Cheok, K.C., Saikalis, G. & Oho, S. (2005). Wavelet-based knock detection with fuzzy logic, in IEEE International Conference on Computational Intelligence for Measurement Systems and Applications – CIMSA 2005, pp. 26-31, ISBN: 978-1-4244-2306-4, Sicily, Italy, 14-16 July 2005
Bosch, R. (2004). Bosch Gasoline-Engine Management, Ed. Robert Bosch GmbH, ISBN-13: 978-0837611006
Boubai, O. (2000). Knock detection in automobile engines, vol. 3, issue 3, pp. 24-28, ISSN: 1094-6969
Chen, P.C. (2005). Neuro-fuzzy-based fault detection of the air flow sensor of an idling gasoline engine, vol. 219, no. 4, pp. 511-524, ISSN: 0954-4070
Erjavec, J. (2009). Automotive Technology: A System Approach (5th edition), Ed. Delmar Cengage Learning, ISBN-13: 978-1428311497, Clifton Park, NY, USA
Ettefagh, M.M., Sadeghi, H., Pirouzpanah, V.H. & Arjmandi, T. (2008). Knock detection in spark ignition engines by vibration analysis of cylinder block: A parametric modeling approach, vol. 22, issue 6, pp. 1495-1514, August 2008, ISSN: 0888-3270
Fleming, W.J. (2001). Overview of Automotive Sensors, vol. 1, issue 4, pp. 296-308, ISSN: 1530-437X
Gupta, H.N. (2006). Fundamentals of Internal Combustion Engines, Ed. Prentice-Hall of India Private Limited, ISBN-13: 978-8120328549, New Delhi, India
Hamilton, L.J. & Cowart, J.S. (2008). The first wide-open throttle engine cycle: transition into knock experiments with fast in-cylinder sampling, vol. 9, no. 2, pp. 97-109, ISSN: 1468-0874
Hsu, C.C. (2006). Generalizing self-organizing map for categorical data, vol. 17, issue 2, pp. 294-304, ISSN: 1045-9227
Hui, C.L.P. (2011). Artificial Neural Networks - Application, Publisher: InTech, ISBN 978-953-307-188-6, Croatia
Ibrahim, A.M. (2004). Fuzzy logic for embedded systems applications, Ed. Elsevier Science, ISBN-13: 978-0750676052, MA, USA
Jonathan, M.B., Saikalis, G., Oho, S.T. & Cheok, K.C. (2006). Knock Signal Analysis Using the Discrete Wavelet Transform, No. 2006-01-0226, DOI: 10.4271/2006-01-0226



**12** 

*Brasil* 

**Fault Diagnostic of Rotating Machines Based** 

 **on Artificial Intelligence: Case Studies of** 

*Centrais Elétricas do Norte do Brasil S/A – ELETROBRAS-ELETRONORTE,* 

The efficiency of the maintenance techniques applied in energy generation power plants is improved when expert diagnosis systems are used to analysis information provided by the continuous monitoring systems used in these installations. There are a large number of equipments available in the power plants of the Centrais Elétricas do Norte do Brazil S/A - ELETROBRAS-ELETRONORTE (known as ELETRONORTE). These equipments operate continuously because are indispensable for the correct functioning of the generation and transmission systems of the company. Anomalies in the operation of these devices can be detected with the use of intelligent diagnosis tools which analysis the information of the continuous monitoring systems and, based in a set of qualitative rules, indicate the best

The best maintenance strategy used in each equipment operated by ELETRONORTE should consider factors as: equipments importance for the production process, acquisition cost and failure rate. To accomplish this task, one of the three maintenance techniques more used nowadays is chosen: corrective, preventive or predictive [1]. In the predictive maintenance, an operational report of the equipment's condition is emitted using the information collected by the continuous monitoring system. The formulation of such report is a task divided in the following stages: 1) Anomaly identification that can be occurring in the equipment; 2) Detection of the anomalous component; 3) Evaluation of the severity of the fault; and 4) Estimation of the remaining life time of the equipment. The predictive maintenance policies is an efficient practice to identify problems in hydrogenerators that will increase reliability, decrease maintenance costs, limit service failures and increase the

There is a vast literature on techniques for detection and identification of faults known to the FDI (Fault Detection and Isolation). A possible classification of these techniques that consider the aspects related to the type of information available about the process analysis defines three categories: methods based on quantitative models, methods based on qualitative models or semi-qualitative, and methods based on historical data [2]. The first two categories are

commonly named Model Based Fault Detection and Isolation (MBFDI) [3].

**1. Introduction** 

life of the machines.

procedures to avoid the fail of the equipments.

**the Centrais Elétricas do Norte do** 

**Brazil S/A – Eletrobras-Eletronorte** 

Marcelo Nascimento Moutinho


## **Fault Diagnostic of Rotating Machines Based on Artificial Intelligence: Case Studies of the Centrais Elétricas do Norte do Brazil S/A – Eletrobras-Eletronorte**

Marcelo Nascimento Moutinho *Centrais Elétricas do Norte do Brasil S/A – ELETROBRAS-ELETRONORTE, Brasil* 

### **1. Introduction**

238 Fuzzy Logic – Algorithms, Techniques and Implementations


The efficiency of the maintenance techniques applied in power generation plants improves when expert diagnosis systems are used to analyze the information provided by the continuous monitoring systems installed there. A large number of devices operate in the power plants of Centrais Elétricas do Norte do Brasil S/A - ELETROBRAS-ELETRONORTE (known as ELETRONORTE). These devices operate continuously because they are indispensable for the correct functioning of the company's generation and transmission systems. Anomalies in their operation can be detected with intelligent diagnosis tools that analyze the information from the continuous monitoring systems and, based on a set of qualitative rules, indicate the best procedures to prevent equipment failures.

The maintenance strategy used for each piece of equipment operated by ELETRONORTE should consider factors such as its importance for the production process, its acquisition cost and its failure rate. To accomplish this task, one of the three most widely used maintenance techniques is chosen: corrective, preventive or predictive [1]. In predictive maintenance, an operational report on the equipment's condition is issued using the information collected by the continuous monitoring system. The formulation of such a report is divided into the following stages: 1) identification of the anomaly that may be occurring in the equipment; 2) detection of the anomalous component; 3) evaluation of the severity of the fault; and 4) estimation of the remaining lifetime of the equipment. Predictive maintenance policies are an efficient practice for identifying problems in hydrogenerators: they increase reliability, decrease maintenance costs, limit service failures and extend the life of the machines.

There is a vast literature on techniques for fault detection and identification, known as FDI (Fault Detection and Isolation). One possible classification of these techniques, based on the type of information available about the process under analysis, defines three categories: methods based on quantitative models, methods based on qualitative or semi-qualitative models, and methods based on historical data [2]. The first two categories are commonly named Model-Based Fault Detection and Isolation (MBFDI) [3].


A MBFDI algorithm consists of two components: a residue generator and a decision-making process. The residue generator compares the current values of the inputs, outputs or states of the process with an estimated model that describes its normal behavior; the decision process is the logic that converts the residue signal (quantitative knowledge) into qualitative information (normal or abnormal operating condition). The foundations of MBFDI algorithms are described in [3], [4] and [5]. The main difficulty in implementing a MBFDI algorithm lies in the fact that the fidelity of the model affects the sensitivity of the fault detection mechanism and the precision of the diagnosis. Many real systems are not amenable to conventional modeling techniques due to the lack of precise knowledge about the system, strongly nonlinear behavior, a high degree of uncertainty, or time-varying characteristics.
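The two components can be sketched in a few lines. The following is a minimal illustration only, with an assumed constant-level model, threshold and signal values; it is not ELETRONORTE's actual implementation:

```python
import numpy as np

def residues(y_measured, y_model):
    """Residue generator: compare measured outputs with the
    normal-behaviour model's predictions."""
    return np.asarray(y_measured) - np.asarray(y_model)

def decide(residue, threshold):
    """Decision logic: convert the quantitative residue into
    qualitative information (normal / abnormal condition)."""
    return "abnormal" if abs(residue) > threshold else "normal"

# Illustrative data: a model that predicts a constant vibration level.
y_model = np.full(5, 20.0)                              # normal behaviour (um)
y_measured = np.array([19.8, 20.3, 20.1, 27.5, 28.2])   # a fault develops
r = residues(y_measured, y_model)
conditions = [decide(ri, threshold=3.0) for ri in r]
print(conditions)  # the last two samples are flagged as abnormal
```

The threshold plays the role of the decision logic's sensitivity: the closer the model is to the real normal behavior, the smaller the threshold can be without false alarms.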

Successful applications of predictive techniques have been reported recently. References [6] to [9] present intelligent systems for predictive maintenance aimed at the real-time diagnosis of industrial processes. In [10] a sensor and actuator fault detection and isolation scheme is presented; the project considers multivariate dynamic systems with uncertainties in the mathematical model of the process. Detailed studies on the robustness of anomaly identification systems in the presence of modeling errors are also reported in the survey paper [2].

Nowadays, the expert diagnosis technologies available on the market are still maturing. The commercially available tools have restrictions on information exchange with a company's legacy systems; users normally cannot change the software structure and do not know the conceptual database model. Due to these limitations, a company that adopts this kind of solution is in a difficult situation when software modifications not considered in the initial project become necessary to adjust it to a specific application.

This chapter describes the procedures for designing and testing a MBFDI system. Two types of models will be used: autoregressive models and fuzzy models. The proposed system is evaluated experimentally using real monitoring data from a synchronous compensator and a synchronous generator. The synchronous compensator analyzed is in operation at the Vila do Conde substation, located in Pará state, Brazil. The synchronous generator studied is in operation at the Tucuruí hydroelectric power plant, also located in Pará state. Both machines are operated by ELETRONORTE.

### **2. Fuzzy system and regression models for use in diagnosis systems**

To design the fault detection system proposed in this work, mathematical models are used to describe the relationships between the variables monitored in the analyzed equipment. Two types of models will be used: autoregressive models and fuzzy models. The purpose of this section is to describe the two structures used.

### **2.1 System identification with regression models**

The following structure, known in the literature as the Autoregressive model with exogenous inputs (ARX), will be used [14]:

$$\begin{aligned} y(k) + a\_1 y(k-1) + \dots + a\_{n\_a} y(k - n\_a) &= \\ b\_0 u(k - d) + b\_1 u(k - 1 - d) + \dots + b\_{n\_b} u(k - d - n\_b) \end{aligned} \tag{1}$$

where *y*(*k*) and *u*(*k*) are, respectively, the values of the output and input signals at the discrete time *k*, an integer multiple of the sampling interval *Ts*; *na* and *nb* are the numbers of regressors of the output and input signals, respectively; and *d* ≥ 1 is the output transport delay of the system, as an integer multiple of the sampling interval. Using the discrete delay operator *q*⁻¹, defined by *q*⁻¹*y*(*k*) = *y*(*k*−1), the following polynomial representation of Eq. (1) can be obtained:

$$A(q^{-1})y(k) = q^{-d}B(q^{-1})u(k)\tag{2}$$

where *A*(*q*⁻¹) and *B*(*q*⁻¹) are as follows:


$$A(q^{-1}) = 1 + a\_1 q^{-1} + a\_2 q^{-2} + \dots + a\_{n\_a} q^{-n\_a} \tag{3}$$

$$B(q^{-1}) = b\_0 + b\_1 q^{-1} + b\_2 q^{-2} + \dots + b\_{n\_b} q^{-n\_b} \tag{4}$$

It is interesting to add stochastic characteristics to the model, representing the nature of the process as realistically as possible. This can be done by considering that the output signal is affected by uncorrelated noise. Thus, the following representation can be obtained:

$$y(k) = \phi^T(k)\theta(k) + e(k) \tag{5}$$

where *e*(*k*) is a Gaussian white noise, φ(*k*) is the vector of regressors and θ(*k*) is the vector of model parameters. The vectors φ(*k*) and θ(*k*) are represented as follows:

$$\phi(k) = \begin{bmatrix} -y(k-1) & \dots & -y(k-n\_a) & u(k-d) & \dots & u(k-d-n\_b) \end{bmatrix}^T \tag{6}$$

$$\theta(k) = \begin{bmatrix} a\_1 & a\_2 & \dots & a\_{n\_a} & b\_0 & b\_1 & \dots & b\_{n\_b} \end{bmatrix}^T \tag{7}$$

The non-recursive least squares method [14] will be used to estimate the vector θ̂(*k*), which approximates the parameter vector θ(*k*) of Eq. (5). The objective of the method is to minimize the sum of the squares of the prediction errors between the estimated model output and the real output of the plant. Substituting *k* = 1, 2, ..., *N* in Eq. (5), we obtain, in matrix notation:

$$\mathbf{y} = \begin{bmatrix} y(1) \\ y(2) \\ \vdots \\ y(N) \end{bmatrix}, \boldsymbol{\Phi} = \begin{bmatrix} \phi(1)^T \\ \phi(2)^T \\ \vdots \\ \phi(N)^T \end{bmatrix}, \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon(1) \\ \varepsilon(2) \\ \vdots \\ \varepsilon(N) \end{bmatrix} \tag{8}$$

$$\mathbf{y} = \boldsymbol{\Phi}\,\hat{\boldsymbol{\theta}} + \boldsymbol{\varepsilon} \tag{9}$$

where ε(*k*) = *y*(*k*) − φᵀ(*k*)θ̂. The following quadratic performance index must be minimized:


$$J(\hat{\theta}) = \frac{1}{2} \sum\_{k=1}^{N} \varepsilon(k)^2 \tag{10}$$

The value of θ̂ that minimizes Eq. (10) is [15]:

$$\hat{\boldsymbol{\theta}}\_{MQ} = (\boldsymbol{\Phi}^{T}\boldsymbol{\Phi})^{-1}\boldsymbol{\Phi}^{T}\mathbf{y} \tag{11}$$
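Equations (5)–(11) translate directly into code. The sketch below is an illustrative example with assumed model orders, parameter values and synthetic data: it simulates a first-order ARX process and recovers its parameters via the normal equations of Eq. (11):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed true ARX model with na = 1, nb = 0, d = 1:
# y(k) = -a1*y(k-1) + b0*u(k-1) + e(k)
a1, b0 = -0.8, 2.0
N = 500
u = rng.standard_normal(N)          # excitation signal
e = 0.01 * rng.standard_normal(N)   # Gaussian white noise e(k)
y = np.zeros(N)
for k in range(1, N):
    y[k] = -a1 * y[k - 1] + b0 * u[k - 1] + e[k]

# Build y and Phi of Eq. (8): phi(k) = [-y(k-1), u(k-1)]^T
Phi = np.column_stack([-y[:-1], u[:-1]])
Y = y[1:]

# Non-recursive least squares, Eq. (11): theta = (Phi^T Phi)^-1 Phi^T y
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)
print(theta_hat)  # close to the true parameter vector [a1, b0] = [-0.8, 2.0]
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit matrix inverse of Eq. (11), which is numerically preferable.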

### **2.2 Identification of predictive models based on fuzzy logic**

In this subsection the structure of the fuzzy model used in this work will be described. The following discrete nonlinear system representation is used:

$$y(k) = f\left[\Psi(k-1)\right] \tag{12}$$

where *f*(.) is a nonlinear function of the *Information Vector* Ψ(*k*−1), defined as:

$$\Psi(k-1) = \begin{bmatrix} y(k-1) & \dots & y(k-n\_a) & u(k-d) & \dots & u(k-d-n\_b) \end{bmatrix}^T \tag{13}$$

where *na* and *nb* represent the numbers of regressors of the discrete output and input signals, *y*(*k*) and *u*(*k*), respectively; *d* is the output transport delay as an integer multiple of the sampling interval *Ts*; and *e*(*k*) is a random signal that is assumed to corrupt the signals when the model is designed in a stochastic environment. This model is known as the Nonlinear Autoregressive model with exogenous inputs (NARX).

We assume the existence of a measurable set of variables that characterizes the operating conditions of system (12) at every moment. Using these variables, one can define a set of rules that describe, approximately, the behavior of the function *y*(*k*):

$$\mathbf{R}^{(l)}\colon \textbf{IF} \; \langle V\_1 \text{ is } V\_{1,i}^l \rangle \; \textbf{AND} \; \langle V\_2 \text{ is } V\_{2,j}^l \rangle \; \textbf{AND} \dots \textbf{AND} \; \langle V\_k \text{ is } V\_{k,p}^l \rangle \; \textbf{THEN} \; y\_l(k) = f\_l(k) \tag{14}$$

where *l* = 1, 2, ..., *M*; *i* = 1, 2, ..., *n*1; *j* = 1, 2, ..., *n*2; and *p* = 1, 2, ..., *nk*. The terms *V*1, *V*2, ..., *Vk* are fuzzy linguistic variables that are part of the vector Ψ and were chosen to describe the system (12). The domain of these variables is uniformly partitioned into *ni* = *n*1, *n*2, ..., *nk* fuzzy sets (for example, the partitions of *Vi* are *Vi*,1, *Vi*,2, ..., *Vi*,*ni*). In this work the function *fl*(.) is represented by the following linear combination:

$$f\_l(k) = c\_l^0 + c\_l^1 V\_1 + c\_l^2 V\_2 + \dots + c\_l^k V\_k \tag{15}$$

where $c\_l^i$, *i* = 0, 1, ..., *k*, are coefficients to be estimated.

At a given instant of discrete time *k*, each linguistic variable *Vi* will have a membership value $\mu\_{V\_{i,j}}[V\_i(k)]$ associated with the fuzzy set *j* (*j* = 1, 2, ..., *ni*). For mathematical simplicity, the membership functions used to represent these sets are triangular and trapezoidal, with the trapezoidal form used only in the two extreme sets, as shown in Figure 2. It is easy to see that, for each fuzzy variable, at most two and at least one fuzzy set has a membership value different from zero, and the sum of these values is always equal to one.


Fig. 2. Membership functions of the *ni* fuzzy sets associated with the Linguistic variable *Vi*(*k*).
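A uniform partition of this kind can be reproduced in a few lines. In the sketch below the domain, the number of sets and the test point are assumptions for illustration; it builds triangular interior sets with trapezoidal (shouldered) end sets and verifies the sum-to-one property numerically:

```python
import numpy as np

def partition(domain_min, domain_max, n_sets):
    """Return the n_sets membership functions of a uniform partition:
    triangular interior sets, trapezoidal (shouldered) end sets."""
    centers = np.linspace(domain_min, domain_max, n_sets)
    w = centers[1] - centers[0]   # spacing between centers (half the triangle base)

    def mu(j):
        c = centers[j]
        def f(x):
            if j == 0 and x <= c:             # left shoulder (trapezoidal end)
                return 1.0
            if j == n_sets - 1 and x >= c:    # right shoulder (trapezoidal end)
                return 1.0
            return max(0.0, 1.0 - abs(x - c) / w)
        return f
    return [mu(j) for j in range(n_sets)]

# Assumed example: a variable on [0, 100] partitioned into 5 fuzzy sets.
mfs = partition(0.0, 100.0, 5)
x = 37.0
values = [f(x) for f in mfs]
print(values, sum(values))  # at most two non-zero memberships, summing to 1
```

The sum-to-one property holds by construction, since each interior point lies on the descending slope of one set and the ascending slope of its neighbor.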

The set of *M* rules defined by (14) describes a Sugeno fuzzy system [16], a mathematical tool that can represent, globally and approximately, the system described in Eq. (12). It is a universal nonlinear approximator: a mathematical function that can represent, with an arbitrary degree of precision, dynamic systems governed by nonlinear relationships.

According to the theory of fuzzy systems [16], the output signal ŷ(*k*) of the fuzzy system defined by the set of rules (14) is obtained as the weighted average of the individual outputs of each of the *M* rules:

$$\hat{y}(k) = \frac{\sum\_{l=1}^{M} \omega\_l\, y\_l(k)}{\sum\_{l=1}^{M} \omega\_l} \tag{16}$$

The weights ω*l* are called Validation Functions. They are calculated in terms of the vector Ψ as follows:

$$\omega\_l = \mu\_{V\_{1,i}^l}(V\_1(k)) \times \mu\_{V\_{2,j}^l}(V\_2(k)) \times \cdots \times \mu\_{V\_{k,p}^l}(V\_k(k)) \tag{17}$$

According to Eq. (16), the value of the output signal of the model is a function of the weights ω*l* and of the functions *fl*(.). Therefore, for a given set of values of the signal *y*(*k*), an optimal setting can be found for the parameters of the fuzzy membership functions defined on each input and for the parameters of the functions *fl*(.) that minimizes the difference ŷ(*k*) − *y*(*k*) over the entire set. The details of the procedure for identifying these parameters are the subject of Section 4, where a procedure is described for identifying models based on real data from the continuous monitoring system presented in the next section.
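Equations (14)–(17) can be combined into a compact inference routine. The sketch below is a toy example with one input and two rules; the membership functions and rule consequents are invented for illustration:

```python
def sugeno_output(rules, inputs):
    """Sugeno inference, Eq. (16): weighted average of rule outputs.
    Each rule is (membership_funcs, consequent):
      - membership_funcs: one mu per input variable; their product
        gives the validation weight of Eq. (17)
      - consequent: f_l(inputs), a linear function as in Eq. (15)
    """
    num = den = 0.0
    for mfs, f in rules:
        w = 1.0
        for mu, v in zip(mfs, inputs):   # validation weight, Eq. (17)
            w *= mu(v)
        num += w * f(inputs)
        den += w
    return num / den

# Toy example: one input V1 in [0, 1], two rules.
low  = lambda v: max(0.0, 1.0 - v)       # membership of fuzzy set "low"
high = lambda v: max(0.0, v)             # membership of fuzzy set "high"
rules = [
    ([low],  lambda x: 1.0 + 0.0 * x[0]),   # IF V1 is low  THEN y = 1
    ([high], lambda x: 5.0 + 0.0 * x[0]),   # IF V1 is high THEN y = 5
]
print(sugeno_output(rules, [0.25]))  # 0.75*1 + 0.25*5 = 2.0
```

With sum-to-one partitions the denominator of Eq. (16) equals one, so the output interpolates smoothly between the rule consequents.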

### **2.3 Prediction techniques based on Adaptive-Network-based Fuzzy Inference Systems (ANFIS)**

### **2.3.1 Synchronous compensator monitoring system – VIBROCOMP**

This monitoring system was designed as a predictive maintenance tool for Synchronous Compensators (SC). These are large 150 MVAr rotating machines whose physical parameters must be evaluated constantly. Eletronorte operates three SC as part of its transmission system in the State of Pará: two are installed in the Vila do Conde substation and one is installed in the Marabá substation. These three machines are monitored by VibroComp. Figure 3 shows a photograph of CPAV-01, one of the SC monitored in the Vila do Conde substation. This equipment, a member of the National Interconnected System (SIN), is used for voltage regulation. The main features of CPAV-01 are presented in Table 1.

Fig. 3. Synchronous Compensator 01 of the substation Vila do Conde.


| Characteristic | Value |
|----------------|-------|
| Power | 150 MVAr |
| Speed | 900 RPM |
| Voltage | 13.8 kV |
| Current | 6,275 A |
| Frequency | 60 Hz |

Table 1. Nominal Characteristics of CPAV-01.

The VibroComp system consists of the following parts:

1. *Hardware:*
	- Sources, sensors, transmitters and signal conditioners;
	- Acquisition computers, database computers, data acquisition cards, serial cards, cables, etc.
2. *Software:*
	- Data Acquisition Module;
	- Database Module;
	- Expert Diagnosis System;
	- Client Module.

The signal conditioning hardware and monitoring software were developed at the Centro de Tecnologia da ELETRONORTE, known as the Laboratório Central (LACEN). Further details on the development of VibroComp can be obtained in [17].

To evaluate the operational condition of a SC, mechanical, electrical and thermal properties are monitored. Table 2 shows some of the signals monitored by VibroComp that are used in this work.


| Tag | Description | Unit | Type |
|-----|-------------|------|------|
| *Mlah* | Vibr. Bearing Ring - Horizontal | μm | Vibration |
| *Mlaa* | Vibr. Bearing Ring - Axial | μm | Vibration |
| *Mlav* | Vibr. Bearing Ring - Vertical | μm | Vibration |
| *Mlbh* | Vibr. Pump Bearing - Horizontal | μm | Vibration |
| *Mlba* | Vibr. Pump Bearing - Axial | μm | Vibration |
| *Mlbv* | Vibr. Pump Bearing - Vertical | μm | Vibration |
| *Ldh1* | Vibr. Left - Horizontal 1 | μm | Vibration |
| *Leh2* | Vibr. Left - Horizontal 2 | μm | Vibration |
| *Ph2* | Pressure of Cooling Hydrogen | bar | Pressure |
| *Rot* | Compensator Speed | RPM | Speed |
| *P* | Active Power | MW | Power |
| *Q* | Reactive Power | MVAr | Power |
| *Tbea87* | Temp. stator bars - slot 87 | °C | Temperature |
| *Tbea96* | Temp. stator bars - slot 96 | °C | Temperature |
| *Tbea105* | Temp. stator bars - slot 105 | °C | Temperature |
| *Taer* | Temp. Cooling Water - Input | °C | Temperature |
| *Thsr* | Temp. Cooling Hydrogen - Output | °C | Temperature |
| *Ther* | Temp. Cooling Hydrogen - Input | °C | Temperature |

Table 2. Some signals monitored by VibroComp.


The Data Acquisition Module is a client/server application that uses the TCP/IP protocol to send information to the Client Module and the Database Module. The Client Module was developed to be the interface between the user and the Acquisition and Database modules. The client can retrieve the waveforms of the measured signals from the acquisition module and also perform trend analysis and event analysis. The Expert Diagnosis System is used to analyze the information stored in the Database Module. This application runs on the client module and provides the analyst with the probability of each fault of the equipment; to do this, a Fuzzy Inference Engine is used. Figures 4 and 5 present some of the interfaces of VibroComp.
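The client/server exchange between the modules can be pictured with a toy loopback sketch. This is not VibroComp's actual protocol (the request format, tag name and reading below are made up for illustration); it only shows the request/response pattern over TCP/IP:

```python
import socket
import threading

def serve_once(host="127.0.0.1"):
    """Toy stand-in for the Acquisition Module: answers one request with a reading."""
    srv = socket.socket()
    srv.bind((host, 0))              # bind to an ephemeral port
    srv.listen(1)
    port = srv.getsockname()[1]

    def handler():
        conn, _ = srv.accept()
        # "GET Mlah" is an invented request format, not VibroComp's real one
        if conn.recv(64) == b"GET Mlah":
            conn.sendall(b"12.5")    # made-up vibration reading, in micrometers
        conn.close()
        srv.close()

    threading.Thread(target=handler, daemon=True).start()
    return port

# A "Client Module" side requesting one measured value
port = serve_once()
cli = socket.socket()
cli.connect(("127.0.0.1", port))
cli.sendall(b"GET Mlah")
reading = cli.recv(64).decode()
cli.close()
```

The real system additionally persists waveforms in the Database Module and layers trend/event analysis on top of this transport.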

The next section presents the procedure for the identification of the predictive models used in this work, applying the modeling techniques presented in Section 2.

Fig. 4. Main Interface of Client Module of VibroComp

Fault Diagnostic of Rotating Machines Based on Artificial Intelligence:

Case Studies of the Centrais Elétricas do Norte do Brazil S/A – Eletrobras-Eletronorte 247


Fig. 5. Interface of the Expert Diagnosis System of VibroComp

### **3. Case studies 1: System modeling of a synchronous compensator**

### **3.1 Synchronous compensator predictive models**

In this section we present a case study where we identified the parameters of mathematical models that describe, approximately, the behavior of a SC operating in a normal condition.

The equipment CPAV-01, located in the Vila do Conde substation, was examined. The models proposed in this work were estimated and validated with real data from the VibroComp monitoring system. The analyzed period runs from 03/01/2008 to 25/03/2008, during which the SC operated under normal conditions, without showing any anomaly.

The identification of mathematical models to describe the behavior of CPAV-01 in this period is a practical procedure that can be divided into the following steps:

1. Statistical analysis of the monitored signals to identify dynamic relationships;
2. Choice of the structure of the models;
3. Estimation and validation of the models.

The objective of the first step is to identify the correlations that exist in the monitored signals. In this work two mathematical functions are used: the Autocorrelation Function (ACF) and the Cross-Correlation Function (CCF).

The ACF was used to identify correlations in time of a discrete signal *y*(*k*). The formulation used is as follows:

$$r\_{\tau} = \frac{\sum\_{k=\tau+1}^{N} [y(k) - \overline{y}][y(k-\tau) - \overline{y}]}{\sum\_{k=1}^{N} [y(k) - \overline{y}]^2} \tag{18}$$

where: *y* is the average value of the signal *y*(*k*) and *k* is the discrete time, an integer multiple of the sampling interval, *Ts* .
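A direct pure-Python implementation of Equation (18) makes the computation concrete. This is a minimal sketch with 0-based indexing; the sample signal is made up for illustration:

```python
def acf(y, tau):
    """Sample auto-correlation function of Equation (18) at delay tau (0-based)."""
    n = len(y)
    y_bar = sum(y) / n
    # Numerator: products of deviations at delay tau, k = tau .. n-1
    num = sum((y[k] - y_bar) * (y[k - tau] - y_bar) for k in range(tau, n))
    # Denominator: total sum of squared deviations
    den = sum((v - y_bar) ** 2 for v in y)
    return num / den

# Illustrative period-4 signal: fully correlated at delay 0,
# anti-correlated at delay 2
signal = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 2.0]
```

For an ARMA-like signal the resulting profile decays as combinations of exponentials and damped sinusoids, which is exactly the pattern discussed below for *Tbea87*(*k*).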

The ACF analysis revealed that some of the monitored signals (*P*(*k*) and *Rot*(*k*)) behave approximately like random, uncorrelated white noise. Other signals (*Taer*(*k*), *Ph2*(*k*), *Mlah*(*k*), *Tbea87*(*k*) and *Ldh1*(*k*)) are auto-correlated and can be characterized by Auto-regressive Moving Average (ARMA) models [18]. The profile of the ACF of these signals shows a fixed pattern for the first time delays, followed by a pattern composed of combinations of exponential and damped sinusoidal functions. In Figure 6, for example, the profile of the auto-correlation function of the signal *Tbea87*(*k*) is shown.

Fig. 6. Profile of the auto-correlation function of the signal *Tbea87*(*k*).

The CCF is used to assess correlations between two discrete signals *u*(*k*) and *y*(*k*). The following formulation was used:

$$r\_{yu} = \frac{\sum\_{k=\tau+1}^{N} [y(k) - \overline{y}][u(k-\tau) - \overline{u}]}{\sum\_{k=1}^{N} [y(k) - \overline{y}]^2} \tag{19}$$

where *u* is the average value of the signal *u*(k).
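Equation (19) admits the same direct treatment as the ACF. A minimal sketch, again with 0-based indexing and illustrative data:

```python
def ccf(y, u, tau):
    """Sample cross-correlation of Equation (19): y(k) against u(k - tau)."""
    n = len(y)
    y_bar = sum(y) / n
    u_bar = sum(u) / n
    num = sum((y[k] - y_bar) * (u[k - tau] - u_bar) for k in range(tau, n))
    den = sum((v - y_bar) ** 2 for v in y)
    return num / den

# u leads y by one sample, so the CCF should peak at delay 1
u = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
y = [-1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
```

Sweeping `tau` over a range of delays produces profiles like the one in Figure 7, from which the most strongly related signals and delay intervals of Table 3 are read off.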

Figure 7 presents the profile of the CCF between the signal *Taer*(*k*) and the signal *Tbea87*(*k*). The analysis of the CCFs of *Taer*(*k*) indicates that this signal is most correlated with the signals *Tbea87*(*k*), *Ph2*(*k*) and *Q*(*k*).

Fig. 7. Profile of the cross-correlation function between the signal *Taer*(*k*) and the signal *Tbea87*(*k*).


An analysis similar to that performed for the signal *Taer*(*k*) was carried out for all the other signals in Table 2. The final result of the statistical analysis is presented in Table 3. The interpretation of this table is as follows: the signals in the left column are related to the central-column signals at the delay intervals specified in the right column. For example, the signal *Taer*(*k*) is self-correlated and is also related to the signals *Tbea87*(*k*), *Tbea87*(*k*-1), *Ph*2(*k*-1), *Ph*2(*k*-2), *Ph*2(*k*-3) and so on. The signals in the center column are listed in order of decreasing intensity of their relationship with the signal in the left column.


| Tag | Correlations | Delays |
|---|---|---|
| *Taer* | *Taer*, *Tbea*87, *Ph2*, *Q* | [1 4], [0 2], [0 2] and [0 2] |
| *Ldh1* | *Mlah*, *Tbea*87 and *Leh2* | [1 7], [1 7] and [1 7] |
| *Ph2* | *Q*, *Tbea*87 and *Taer* | [1 3], [1 3] and [1 3] |
| *Q* | *Ph2*, *Tbea*87 and *Thsr* | [1 3], [1 2] and [1 3] |
| *Tbea*87 | *Tbea*105, *Q* and *Ther* | [1 2], [1 2] and [1 2] |

Table 3. Correlations of Signals Monitored by VibroComp.

The choice of the model structure, the goal of the second step of the identification procedure, was based on the information in Table 3. The statistical characteristics of the signals indicate that Auto-regressive Moving Average with Exogenous Input (ARMAX) models are good alternatives to explain the dynamic relationships of the monitored signals. However, it is suspected that there are nonlinear relationships between the monitored signals; these relationships are better described by a universal nonlinear approximator. For comparison purposes, three types of mathematical models are used in this work: a single-input single-output (SISO) ARMAX model, a multi-input single-output (MISO) ARX model and a MISO Sugeno fuzzy system. For illustration purposes, details of the procedure for the identification of the *Taer*(*k*) model are presented; the other signals in Table 3 can be estimated by a similar procedure.

The first model analyzed for the signal *Taer*(*k*) is the SISO ARMAX model with the following structure:

$$T\_{aer}(k) = \sum\_{i=1}^{4} a\_i T\_{aer}(k-i) + \sum\_{i=0}^{1} b\_i T\_{bea87}(k-i) + \varepsilon(k) + c\_1 \varepsilon(k-1) \tag{20}$$

where *ε*(*k*) is an uncorrelated noise that supposedly corrupts the data, since the model is designed in a stochastic environment. For the sake of structural simplicity, only the signal *Tbea87*(*k*) was chosen as the input for this model. As shown in Table 3, this signal has the highest CCF values with *Taer*(*k*). From an intuitive point of view, it is coherent to suppose that the temperature of the cooling water depends on the temperature values of the stator bars of the SC.
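Dropping the noise terms *ε*(*k*) and *c1ε*(*k*-1), the structure of Equation (20) gives a deterministic one-step-ahead predictor. A sketch using the coefficient values identified for this model (Table 5); note the sign convention is taken from Equation (20) as printed (if the *ai* were instead reported as coefficients of the polynomial *A*(*q*-1), they would enter the regression with flipped signs):

```python
# Coefficients of the SISO model for Taer(k), values from Table 5.
# The noise terms e(k) and c1*e(k-1) are dropped, so this is the
# deterministic one-step-ahead predictor of Equation (20) as printed.
A = [-1.054051, 0.130458, 0.036976, 0.016368]  # a1 .. a4
B = [0.624452, -0.531587]                      # b0, b1

def predict_taer(taer_past, tbea87):
    """taer_past = [Taer(k-1), ..., Taer(k-4)]; tbea87 = [Tbea87(k), Tbea87(k-1)]."""
    return (sum(a * y for a, y in zip(A, taer_past)) +
            sum(b * u for b, u in zip(B, tbea87)))
```

Feeding the predictor with measured past values and comparing against the measured *Taer*(*k*) is exactly the time-domain validation shown later in Figure 8.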

A second, more complex MISO ARX model was proposed to explain the behavior of the signal *Taer*(*k*). In this case, the other relationships identified in Table 3 were also used. The following structure was chosen:


$$\begin{aligned}
A(q^{-1})y(k) &= B(q^{-1})U(k) + \varepsilon(k) \\
y(k) &= T\_{aer}(k) \\
U(k) &= \left[ P\_{h2}(k) \;\; T\_{bea87}(k) \;\; Q(k) \right]^{T} \\
A(q^{-1}) &= 1 + a\_1 q^{-1} + a\_2 q^{-2} + a\_3 q^{-3} + a\_4 q^{-4} \\
B(q^{-1}) &= \left[ B\_0(q^{-1}) \;\; B\_1(q^{-1}) \;\; B\_2(q^{-1}) \right] \\
B\_0(q^{-1}) &= b\_{00} + b\_{01} q^{-1} + b\_{02} q^{-2} \\
B\_1(q^{-1}) &= b\_{10} + b\_{11} q^{-1} + b\_{12} q^{-2} \\
B\_2(q^{-1}) &= b\_{20} + b\_{21} q^{-1} + b\_{22} q^{-2}
\end{aligned} \tag{21}$$

The third proposed model uses a fuzzy inference system to represent the signal *Taer*(*k*). Table 4 presents details of the topologies used. All models are Sugeno fuzzy systems with a weighted average defuzzifier and a number of outputs equal to the number of rules.


| Model | Inputs | Sets | Function | Parameters |
|---|---|---|---|---|
| *MFT1* | *Taer*, *Tbea*87 | 2-3 | Bell | 50 or 135 |
| *MFT2* | *Taer*, *Tbea*87, *Ph2* | 2-3 | Gaussian | 44 or 96 |
| *MFT3* | *Taer*, *Tbea*87 | 2 | Gaussian | 212 |

Table 4. Structure of Fuzzy Models for Signal *Taer*(*k*)

The nomenclature used to identify the models is as follows: *MFT1* represents Fuzzy Model Topology 1. The interpretation of the other fields in Table 4 is as follows: for each model, the inputs, the number of sets on each input and the type of membership function are specified. The model *MFT1*, for example, uses two inputs with two or three Bell fuzzy sets on each input; the number of parameters in the model is 50 or 135, depending on the chosen combination. The Bell and Gaussian functions used are as follows.

$$f\_{Bell}(\mathbf{x}, a, b, c) = \frac{1}{\mathbf{1} + \left| \frac{\mathbf{x} - c}{a} \right|^{2b}} \tag{22}$$

$$f\_{Gauss}(x, \sigma, c) = e^{-\left(\frac{x - c}{\sqrt{2}\,\sigma}\right)^{2}} \tag{23}$$
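Equations (22) and (23) translate directly to code. A minimal sketch; the Gaussian is written with the conventional negative exponent, so that both functions peak at *x* = *c*:

```python
import math

def f_bell(x, a, b, c):
    """Generalized bell membership function, Equation (22)."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def f_gauss(x, sigma, c):
    """Gaussian membership function, Equation (23)."""
    return math.exp(-((x - c) / (math.sqrt(2.0) * sigma)) ** 2)
```

In the Bell function, *a* controls the half-width (membership is 0.5 at *x* = *c* ± *a*), *b* the steepness of the shoulders, and *c* the center; these are the per-set parameters that ANFIS adjusts during training.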

The parameter estimation was performed in the MATLAB environment. To estimate the models of the linear Equations (20) and (21) we used the System Identification Toolbox [19]. The estimation method used was non-recursive least squares. The data set was divided into two parts: the first was used for the estimation of the model and the second for validation. Figure 8 shows the time-domain validation of the model of Equation (20). The sampling interval used in the model is *Ts* = 1 hour. The model can explain the dynamics of the signal in most of the analyzed time interval. The identified parameters are presented in Table 5.
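The non-recursive least squares step performed by the toolbox can be illustrated on a deliberately simplified first-order ARX structure, y(k) = a·y(k-1) + b·u(k), solved through the 2×2 normal equations. This is a stand-in for the larger Equation (20)/(21) structures; the data are synthetic, generated noise-free from known parameters so the estimates recover them exactly:

```python
def arx1_least_squares(y, u):
    """Estimate (a, b) in y[k] = a*y[k-1] + b*u[k] via the 2x2 normal equations."""
    # Regressor rows: [y[k-1], u[k]], target: y[k]
    s11 = s12 = s22 = t1 = t2 = 0.0
    for k in range(1, len(y)):
        p1, p2 = y[k - 1], u[k]
        s11 += p1 * p1; s12 += p1 * p2; s22 += p2 * p2
        t1 += p1 * y[k]; t2 += p2 * y[k]
    det = s11 * s22 - s12 * s12          # assumed nonzero (informative data)
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

# Noise-free data from a known system (a = 0.8, b = 0.5)
a_true, b_true = 0.8, 0.5
u = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0]
y = [0.0]
for k in range(1, len(u)):
    y.append(a_true * y[-1] + b_true * u[k])
a_est, b_est = arx1_least_squares(y, u)
```

With real, noisy data the estimates only approximate the underlying parameters, which is why a separate validation segment (as used above) is essential.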

Fig. 8. Comparison between the output signal of the SISO Model and the real signal *Taer*(*k*).


| Parameter | Value | Parameter | Value |
|---|---|---|---|
| *a1* | -1.054051 | *b0* | 0.624452 |
| *a2* | 0.130458 | *b1* | -0.531587 |
| *a3* | 0.036976 | *c1* | -0.296779 |
| *a4* | 0.016368 | - | - |

Table 5. Coefficients of the Linear SISO Model for Signal *Taer*(*k*).

Figure 9 shows the time-domain validation of the MISO model of Equation (21). Table 6 shows the values of the estimated coefficients.

Fig. 9. Comparison between the output signal of the MISO Model and the real signal *Taer*(*k*).


The fuzzy models presented in Table 4 were estimated with the ANFIS (Adaptive-Network-based Fuzzy Inference System) algorithm proposed by Jyh-Shing [20], available in the Fuzzy Logic Toolbox of MATLAB, MathWorks (2002). ANFIS is an algorithm for the parameter adjustment of Sugeno fuzzy systems based on training data. Figure 10 presents the results of the comparison between the output of the model *MFT2* and the real signal *Taer*(*k*).


| Parameter | Value | Parameter | Value |
|---|---|---|---|
| *a1* | -0.546560 | *b10* | 0.624452 |
| *a2* | -0.194398 | *b11* | -0.531587 |
| *a3* | -0.031782 | *b12* | -0.296779 |
| *a4* | 0.035349 | *b20* | -0.024828 |
| *b00* | 0.829278 | *b21* | 0.002835 |
| *b01* | -0.450037 | *b22* | 0.013416 |
| *b02* | -0.145693 | - | - |

Table 6. Coefficients of the Linear MISO Model for Signal *Taer*(*k*).

Fig. 10. Comparison between the output signal of the *MFT2* model and the real signal *Taer*(*k*).

A procedure similar to that described for the signal *Taer*(*k*) was performed for all the other signals in Table 3. Annex A shows the identified models. The set of models obtained represents the normal behavior of CPAV-01. By comparing the behavior estimated by the standard model with the actual behavior of the equipment, it is possible to identify the occurrence of malfunctions. The performance of the predictive models of the signal *Taer*(*k*) is presented in the next section.

### **3.2 Performance evaluation of predictive models**

In this section we present the results of the performance evaluation of the predictive models estimated in Section 3.1. The criteria used are as follows:

- Structural Complexity (SCO);
- Computational Effort for Estimation (CEE);
- Mean Square Error (EMQ).



The Structural Complexity (SCO) can be evaluated by the total number of adjustable parameters. For the fuzzy models the number of rules and membership sets are also considered.

The Computational Effort for Estimation (CEE) can be measured by the number of training epochs until a good model is estimated. In this work the efficiency of the estimation method is not considered; a simplifying assumption is adopted instead: the estimation cost is associated only with the number of training epochs needed until a certain level of model accuracy is achieved.

The quality of a model depends on the values of the Mean Square Training Error (*EMQT*) and the Mean Square Validation Error (*EMQV*). In this work the following index is used:

$$EMQ\_x = \frac{1}{N} \sum\_{k=0}^{N} \left[ \hat{T}\_{\text{aerx}}(k) - T\_{\text{aerx}}(k) \right]^2 \tag{24}$$

where $\hat{T}\_{aerx}(k)$ is the estimated signal; $T\_{aerx}(k)$ is the real measured signal; and $x \in \{T, V\}$ indicates whether the error is calculated with training or validation data.
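The index of Equation (24) is a plain mean square error between the estimated and the measured signal. A minimal sketch (normalized here by the number of samples N):

```python
def emq(t_est, t_real):
    """Mean square error index of Equation (24) between estimated and measured signals."""
    assert len(t_est) == len(t_real)
    # Average of squared prediction errors over all N samples
    return sum((e - r) ** 2 for e, r in zip(t_est, t_real)) / len(t_real)
```

Evaluating `emq` on the training segment gives *EMQT* and on the held-out segment gives *EMQV*; the gap between the two is what exposes the overfitting discussed below.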

Table 7 presents the results of the training of the fuzzy models. In some situations, increasing the number of membership functions improves performance during training but degrades performance in validation. This can be verified for the model *MFT1* by comparing lines 1 and 2 with lines 3 and 4, and for the model *MFT2* by comparing lines 5 and 6 with lines 7 and 8.

The increase in the number of training epochs can also exert a deleterious effect on the *EMQV*. For the model *MFT1*, this effect is observed by comparing line 1 with line 2 and line 3 with line 4; for the model *MFT2*, the increase in *EMQV* is observed by comparing line 7 with line 8. The cause of this behavior is a decrease in the generalization ability of the model during training, a phenomenon known as overfitting. The best training performance was obtained with the model *MFT3* in line 10, and the best validation performance was observed in line 6 with the model *MFT2*.

Comparing the model MFT3 with the models MFT1 and MFT2, it is observed that increasing the number of inputs improves performance on the training data. However, this relationship was not observed when the validation data are analyzed.

The comparison between the models of Equations (20) and (21) shows that the MISO model is better. In this case the increase in the SCO resulted in better performance.

In all simulations the performance of the fuzzy models was superior to that of the linear models on the training data. However, when the validation data are considered, this relationship does not always hold. An example is the comparison between lines 12 and 4, where an increase in the SCO is accompanied by a degradation of performance on the validation data.



| Model | Structure | Line | Training epochs | *EMQT* | *EMQV* |
|---|---|---|---|---|---|
| *MFT1* | 2 sets, 50 parameters, 8 rules | 1 | 20 | 0.7071 | 1.029 |
| | | 2 | 150 | 0.6009 | 1.231 |
| *MFT1* | 3 sets, 135 parameters, 27 rules | 3 | 10 | 0.4936 | 3.4069 |
| | | 4 | 50 | 0.4762 | 6.1030 |
| *MFT2* | 2 sets, 44 parameters, 8 rules | 5 | 10 | 0.6597 | 1.0083 |
| | | 6 | 250 | 0.6064 | 0.9085 |
| *MFT2* | 3 sets, 96 parameters, 27 rules | 7 | 10 | 0.4692 | 3.8152 |
| | | 8 | 20 | 0.4663 | 4.1179 |
| *MFT3* | 2 sets, 212 parameters, 32 rules | 9 | 10 | 0.3434 | 4.4351 |
| | | 10 | 20 | 0.3405 | 4.3070 |
| *ML1* | 0 sets, 7 parameters, 0 rules | 11 | 1 | 1.7053 | 4.4881 |
| *ML2* | 0 sets, 13 parameters, 0 rules | 12 | 1 | 1.2130 | 3.9079 |

Table 7. Results of Fuzzy Models Training for the Signal *Taer*(*k*).

### **4. Case studies 2: Development of a Fuzzy expert system for a synchronous compensator**

### **4.1 Project of the fuzzy expert system**

This section describes the design of a Fuzzy Expert System used for the fault diagnosis of a SC, based on a Mamdani fuzzy system [16]. The design methodology consists of the following steps:

1. Selection of the input variables: the choice depends on the quantity and quality of the information provided by the monitoring system. The cause-and-effect relationships involved in the operation of the equipment help in this selection. A detailed study of the correlation between variables can help eliminate redundancy of information, simplifying the inference unit;
2. Selection of the output variables: at this stage the following question must be answered: what are the faults to be detected?
3. Selection of the membership functions: for each input and output, acceptable and not acceptable levels should be determined. In addition, the number of sets and the overlap must be specified for each variable.


Fault Diagnostic of Rotating Machines Based on Artificial Intelligence: Case Studies of the Centrais Elétricas do Norte do Brazil S/A – Eletrobras-Eletronorte


In the first stage, two approaches were proposed: the first strategy considers only the global values of the signals monitored by VibroComp as inputs, while the second uses the spectral information of the vibration signals. In this work, only the conventional approach is used, because the database structure of the VibroComp Expert System does not store spectrum information. The input signals are: *Mlah*, *Mlav*, *Mlaa*, *Mlba*, *Mlbh* and *Mlbv*. A description of these abbreviations can be found in Table 2.

The output variables are the faults to be detected. For each fault, the company's expert maintenance engineers defined default probability values. Table 8 shows the outputs of the fuzzy expert system and the probability values defined.


| ID | Fault | Values |
|----|-------|--------|
| F1 | Mechanical Unbalance | 10%, 20%, 70%, 90% |
| F2 | Faulty Bearing | 10%, 20%, 70%, 90% |
| F3 | Rubbing Axis | 20%, 30%, 50%, 70% |
| F4 | Housing/Support Loose | 10%, 20%, 70%, 90% |
| F5 | Oil Whirl | 10%, 20%, 70%, 90% |
| F6 | Bent Shaft | 10%, 30%, 40%, 60% |
| F7 | Misalignment of Bearings | 10%, 20%, 30%, 70% |

Table 8. Outputs of the Synchronous Compensator Fuzzy Expert System

The structure defined for the expert system outputs is peculiar: for each fault, the expected possibilities are defined. Table 8 was determined from the experience of the company's maintenance experts. The validation tests of the proposed fuzzy expert system show that this feature is better exploited if each fault is described by a finite number of fuzzy sets equal to the number of possibilities provided by the experts. From a practical point of view, this design choice is based on the following argument: defining a finite number of fault possibilities ensures that the diagnostic system will present expected results.

This design choice, however, does not guarantee the accuracy of the diagnosis. The distribution of the fuzzy membership sets over the output variables is a very important aspect of the fuzzy expert system. There are significant inconsistencies between the output values of the fuzzy expert system and the expected values when the membership functions are uniformly distributed throughout the universe of discourse of the output variables. Such a uniform distribution of the membership functions, which is common practice in most applications described in the literature [16], did not show satisfactory results for any defuzzifier used. The solution was to specify non-overlapping fuzzy sets, located in a rather narrow band


around the values of precision defined by the expert engineers. Triangular functions with bases no wider than 10% of the universe showed satisfactory results. Figure 11 shows an example of the distribution of membership functions for the output variable F7. It was observed that this distribution has a great influence on the behavior of the diagnostic system.

Fig. 11. Membership functions for variable F7, misalignment of bearings.

Fifteen rules were defined by the expert engineers, so that the diagnostic system can detect the faults described in Table 8. These rules use only vibration and temperature variables. Below is one of the specified rules:

**Rule 1:** IF *Mlah* IS Alarm 1 AND *Mlbh* IS Alarm 1 THEN F1 IS 70% AND F2 IS 30% AND F3 IS 20% AND F4 IS 10% AND F5 IS 10% AND F6 IS 10% AND F7 IS 20%.

In this and all other rules provided by the experts, another distinctive feature can be observed: the antecedents are short combinations of the monitored signals, while the consequents are long combinations of faults. Table 9 shows the characteristics of the membership functions of the input variables.


| Fuzzy Set | Type | Interval |
|-----------|------|----------|
| Normal | Trapezoidal | [0 0 30 40] |
| Alarm 1 | Triangular | [40 45 50] |
| Alarm 2 | Trapezoidal | [50 60 100 100] |

Table 9. Membership Functions of Vibration Inputs *Mlah*, *Mlav*, *Mlaa*, *Mlbh*, *Mlbv* and *Mlba* (universe of discourse [0, 100]).

The changes made to the distribution of the membership functions of the output variables and the choice of the fault precision values resulted in satisfactory performance. In the validation tests, performance differences related to the defuzzifier used were observed. This is a design choice, and the most appropriate defuzzifier depends on the application.

The other operators of the inference unit do not have a great influence on performance. The Fuzzy Expert System, in its current state of development, allows the use of the following methods: Min operator as T-norm, Mamdani implication, and Maximum aggregation.
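The inference chain described above (triangular and trapezoidal input sets from Table 9, the Min T-norm for AND, Mamdani implication, Maximum aggregation) can be sketched as follows. This is a minimal illustration, not the deployed system: the function names, the ±5-unit output triangles and the centroid defuzzifier are assumptions (the chapter leaves the defuzzifier as a design choice), and only Rule 1 is encoded.

```python
def trapmf(x, a, b, c, d):
    # Trapezoidal membership function with knots [a b c d].
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a) if b != a else 1.0
    return (d - x) / (d - c) if d != c else 1.0

def trimf(x, a, b, c):
    # Triangular membership function as a degenerate trapezoid.
    return trapmf(x, a, b, b, c)

alarm1 = lambda x: trimf(x, 40.0, 45.0, 50.0)    # "Alarm 1" set from Table 9

def rule1(mlah, mlbh):
    # Rule 1: IF Mlah IS Alarm 1 AND Mlbh IS Alarm 1 THEN F1 IS 70% AND ...
    w = min(alarm1(mlah), alarm1(mlbh))          # AND via the Min T-norm
    consequents = {"F1": 70, "F2": 30, "F3": 20, "F4": 10,
                   "F5": 10, "F6": 10, "F7": 20}
    return w, consequents

def defuzzify_centroid(clipped_sets, step=0.5):
    # Maximum aggregation of the clipped output sets, then centroid defuzzification
    # over a discretized [0, 100] universe.
    xs = [i * step for i in range(int(100 / step) + 1)]
    num = den = 0.0
    for x in xs:
        mu = max(m(x) for m in clipped_sets)
        num += x * mu
        den += mu
    return num / den if den > 0 else 0.0

w, cons = rule1(45.0, 45.0)                      # both inputs fully in Alarm 1
# Narrow output triangle around the expert value, clipped by the firing strength
# (Mamdani implication):
f1 = lambda x: min(w, trimf(x, cons["F1"] - 5.0, cons["F1"], cons["F1"] + 5.0))
print(round(defuzzify_centroid([f1]), 1))        # -> 70.0
```

With a single fully fired rule, the centroid of the symmetric clipped triangle recovers the expert value exactly; with several partially fired rules the aggregation and defuzzifier choice start to matter, which is the performance difference the validation tests observed.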


### **4.2 Experimental evaluation of the fuzzy expert system**

To evaluate the fault detection methodology proposed in this work, a case study is presented in which anomalous behavior was detected through residual analysis of the SC reference models.

Figure 12 presents the analyzed event, which was monitored by VibroComp on 11/03/2008 at 11:47:56 AM in CPAV-01, at *t* = 1560 hours of operation. In this situation a considerable increase was detected in the stator bar temperatures *Tbea87*(*k*), *Tbea96*(*k*) and *Tbea105*(*k*). An increase in reactive power was also observed, reaching *Q*(*k*) = 148.5 MVAR, close to the nominal apparent power limit of the equipment (150 MVA). Table 10 presents the recorded values and the normal limits of each monitored signal. It is not within the scope of this work to explain the causes of the dynamic behavior observed in CPAV-01 based on the laws of physics and mechanical models. Establishing these cause-and-effect relationships clearly requires mechanical engineering and vibration dynamics know-how, which the author does not have. The main objective of this work is not to explain this behavior, but to describe the mechanical behavior of the studied system, classify its dynamics into patterns or signatures using mathematical models estimated and validated with real monitoring data, and, using these models, design a fault detection system based on MBFDI techniques.

Fig. 12. Experimental evaluation of fuzzy expert system with case study in CPAV-01. Signals monitored during the event and normal limits.


| Tag | Value | Limit of Normality |
|-----|-------|--------------------|
| *Tbea87*(*k*) | 82 ºC | 70 ºC |
| *Tbea96*(*k*) | 82.23 ºC | 70 ºC |
| *Tbea105*(*k*) | 82.53 ºC | 70 ºC |
| *Q*(*k*) | 148.5 MVAR | 150 MVAR |

Table 10. Values monitored by VibroComp during the stator temperature event.

Figures 13 to 15 present the results of the event analysis using the MBFDI technique described in this work, with *MFT2*, the SISO linear model of Equation (20), and the MISO linear model of Equation (21) used as reference models, respectively. In the normal condition, the residual signal has a mean near zero and a low standard


Fig. 13. Increase in the stator bars temperatures - Results of the event analysis using the MBFDI technique with Fuzzy *MFT2*.

Fig. 14. Increase in the stator bars temperatures - Results of the event analysis using MBFDI technique with the SISO linear model.

Fig. 15. Increase in the stator bars temperatures - Results of the event analysis using MBFDI with MISO Linear Model.

deviation. At the beginning of the anomalous behavior, the residual signal increases in all three models, which allows rapid and reliable detection of the failure. In the *MFT2* model the increase is greater, indicating that this model has a higher sensitivity for the detection of such failures. The two linear models have approximately the same residue level during the event.

One of the rules used for residual evaluation is shown below:

$$\begin{aligned}
&\text{IF } \left( |r(k)| > r_{Taer} \right) \text{ AND } \left( T_{bea87}(k)_{\text{Real}} > LT \right)\\
&\text{THEN FAILURE} = \text{Stator Temperature out of Range}\\
&r(k) = T_{aer}(k)_{\text{Real}} - T_{aer}(k)_{\text{Estimated}}
\end{aligned} \tag{25}$$

where: *r*(*k*) represents the residual of signal *Taer*(*k*); *rTaer* is the maximum allowed residue in the normal condition, and *LT* represents the thermal limit of the stator winding.
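The rule of Eq. (25) can be sketched as a plain threshold check. The threshold value `r_taer = 2.0` ºC and the example residual below are illustrative assumptions; only the thermal limit (70 ºC) and the event temperature (82 ºC) come from Table 10.

```python
def stator_temperature_rule(t_aer_real, t_aer_estimated, t_bea87_real,
                            r_taer=2.0, lt=70.0):
    """Eq. (25): flag a failure when |r(k)| exceeds the normal-condition bound
    AND the measured stator bar temperature is above the thermal limit LT."""
    r = t_aer_real - t_aer_estimated   # r(k) = Taer(k) Real - Taer(k) Estimated
    if abs(r) > r_taer and t_bea87_real > lt:
        return "Stator Temperature out of Range"
    return None

# Event of Table 10: Tbea87 = 82 C against a 70 C limit; residual assumed 5 C.
print(stator_temperature_rule(45.0, 40.0, 82.0))  # -> Stator Temperature out of Range
```

Requiring both conditions keeps isolated residual spikes in the normal temperature range from raising an alarm.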

### **5. Conclusions**

In this work we presented the design and experimental evaluation of a MBFDI system for a SC. The predictive models used in the proposed system were estimated from real data obtained by monitoring the studied equipment during normal conditions. With these models it is possible to detect failures at an early stage by comparing the model outputs with the real monitored signals, as presented in a case study where the moment of occurrence of a failure was detected.

### **6. Appendix A: Structure of the models of the signals** *Mlah***,** *Ldh1***,** *Ph2***,** *Q***,** *Tbea87*

The structure of the SISO linear model for the signal *Tbea87* is as follows:

$$T_{bea87}(k) = \sum_{i=1}^{4} a_i T_{bea87}(k-i) + \sum_{i=0}^{3} b_i T_{bea105}(k-i) + \varepsilon(k) + c_1 \varepsilon(k-1) \tag{26}$$

Figure 16 presents the estimation results of the SISO model of Eq. (26).

Fig. 16. Comparison between the signal estimated by the linear SISO model and the real signal *Tbea87*(*k*).
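The ARMAX structure of Eq. (26) can be evaluated as a one-step-ahead predictor, taking the current noise term ε(k) as zero. The sketch below is illustrative only: the function name and all coefficient values are assumptions, since the chapter estimates the real coefficients from plant monitoring data that is not reproduced here.

```python
def predict_tbea87(a, b, c1, y_past, u_past, eps_prev=0.0):
    """One-step-ahead prediction for Eq. (26).
    y_past[i-1] = Tbea87(k-i) for i = 1..4; u_past[i] = Tbea105(k-i) for i = 0..3;
    eps_prev = eps(k-1); the unknown current noise eps(k) is taken as zero."""
    ar = sum(a[i] * y_past[i] for i in range(4))   # sum_{i=1..4} a_i y(k-i)
    x = sum(b[i] * u_past[i] for i in range(4))    # sum_{i=0..3} b_i u(k-i)
    return ar + x + c1 * eps_prev

# Placeholder coefficients and flat operating-point data:
yhat = predict_tbea87(a=[0.5, 0.2, 0.1, 0.05],
                      b=[0.05, 0.04, 0.03, 0.02], c1=0.3,
                      y_past=[70.0, 70.0, 70.0, 70.0],
                      u_past=[68.0, 68.0, 68.0, 68.0])
print(round(yhat, 2))   # -> 69.02
```

The residual used by the MBFDI scheme is then simply the measured value minus this prediction at each sample.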

The structure of the MISO linear model for the signal *Tbea87* is as follows:


$$\begin{aligned}
A(q^{-1})y(k) &= B(q^{-1})U(k) + \varepsilon(k)\\
y(k) &= T_{bea87}(k)\\
U(k) &= \left[T_{bea105}(k)\; T_{aer}(k)\; Q(k)\right]^T\\
A(q^{-1}) &= 1 + a_1 q^{-1} + a_2 q^{-2} + a_3 q^{-3} + a_4 q^{-4}\\
B(q^{-1}) &= \left[B_0(q^{-1}) + B_1(q^{-1}) + B_2(q^{-1})\right]\\
B_0(q^{-1}) &= b_{00} + b_{01} q^{-1} + b_{02} q^{-2}\\
B_1(q^{-1}) &= b_{10} + b_{11} q^{-1} + b_{12} q^{-2}\\
B_2(q^{-1}) &= b_{20} + b_{21} q^{-1} + b_{22} q^{-2}
\end{aligned} \tag{27}$$

Figure 17 presents the estimation results of the MISO model of Eq. (27).

Fig. 17. Comparison between the signal estimated by the linear MISO model and the real signal *Tbea87*(*k*).
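Unfolding the polynomial form of Eq. (27) gives the difference equation y(k) = −∑ a_i y(k−i) + ∑_j ∑_i b_ji u_j(k−i) + ε(k). The sketch below evaluates it once; the function name, coefficients and signal values are placeholder assumptions, not the estimated plant model.

```python
def miso_output(a, B, y_past, u_past, eps=0.0):
    """Difference-equation form of Eq. (27):
    y(k) = -sum_{i=1..4} a_i y(k-i)
           + sum_{j=0..2} sum_{i=0..2} b_ji u_j(k-i) + eps(k).
    y_past = [y(k-1), ..., y(k-4)]; u_past[j] = [u_j(k), u_j(k-1), u_j(k-2)]
    for the inputs Tbea105, Taer and Q, in that order."""
    ar = -sum(ai * yi for ai, yi in zip(a, y_past))
    x = sum(B[j][i] * u_past[j][i] for j in range(3) for i in range(3))
    return ar + x + eps

# Placeholder first-order dynamics: only a1 and the leading b_j0 taps nonzero.
y = miso_output(a=[-0.9, 0.0, 0.0, 0.0],
                B=[[0.05, 0.0, 0.0], [0.03, 0.0, 0.0], [0.001, 0.0, 0.0]],
                y_past=[80.0, 80.0, 80.0, 80.0],
                u_past=[[75.0, 75.0, 75.0], [40.0, 40.0, 40.0],
                        [120.0, 120.0, 120.0]])
print(round(y, 2))   # -> 77.07
```

The same recursion, with different orders and input vectors, covers the MISO structures of Eqs. (29), (31) and (33).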

The structure of the SISO linear model for the signal *Q*(*k*) is as follows:

$$Q(k) = \sum_{i=1}^{7} a_i Q(k-i) + \sum_{i=0}^{1} b_i T_{bea87}(k-i) + \varepsilon(k) + c_1 \varepsilon(k-1) \tag{28}$$

Figure 18 presents the estimation results of the SISO model of Eq. (28).


Fig. 18. Comparison between the signal estimated by the linear SISO model and the real signal *Q*(*k*).

The structure of the MISO linear model for the signal *Q*(*k*) is as follows:

$$\begin{aligned}
A(q^{-1})y(k) &= B(q^{-1})U(k) + C(q^{-1})\varepsilon(k)\\
y(k) &= Q(k)\\
U(k) &= \left[T_{bea87}(k)\; T_{aer}(k)\; T_{hsr}(k)\right]^T\\
A(q^{-1}) &= 1 + a_1 q^{-1}\\
B(q^{-1}) &= \left[B_0(q^{-1}) + B_1(q^{-1}) + B_2(q^{-1})\right]\\
B_0(q^{-1}) &= b_{00} + b_{01} q^{-1} + b_{02} q^{-2}\\
B_1(q^{-1}) &= b_{10} + b_{11} q^{-1} + b_{12} q^{-2}\\
B_2(q^{-1}) &= b_{20} + b_{21} q^{-1} + b_{22} q^{-2}\\
C(q^{-1}) &= 1 + c_1 q^{-1} + c_2 q^{-2}
\end{aligned} \tag{29}$$

Figure 19 presents the estimation results of the MISO model of Eq. (29).


Fig. 19. Comparison between the signal estimated by the linear MISO model and the real signal *Q*(*k*).

The structure of the SISO linear model for the signal *Ph2*(*k*) is as follows:

$$P\_{h2}(k) = \sum\_{i=1}^{5} a\_i P\_{h2}(k - i) + \sum\_{i=0}^{8} b\_i T\_{aer}(k - i) + \varepsilon(k) + c\_1 \varepsilon(k - 1) \tag{30}$$

Figure 20 presents the estimation results of the SISO model of Eq. (30).

Fig. 20. Comparison between the signal estimated by the linear SISO model and the real signal *Ph2*(*k*).

The structure of the MISO linear model for the signal *Ph2*(*k*) is as follows:


$$\begin{aligned}
A(q^{-1})y(k) &= B(q^{-1})U(k) + \varepsilon(k)\\
y(k) &= P_{h2}(k)\\
U(k) &= \left[Q(k)\; T_{bea87}(k)\; T_{aer}(k)\right]^T\\
A(q^{-1}) &= 1 + a_1 q^{-1}\\
B(q^{-1}) &= \left[B_0(q^{-1}) + B_1(q^{-1}) + B_2(q^{-1})\right]\\
B_0(q^{-1}) &= b_{00} + b_{01} q^{-1} + b_{02} q^{-2} + b_{03} q^{-3}\\
B_1(q^{-1}) &= b_{10} + b_{11} q^{-1} + b_{12} q^{-2} + b_{13} q^{-3}\\
B_2(q^{-1}) &= b_{20} + b_{21} q^{-1} + b_{22} q^{-2} + b_{23} q^{-3}
\end{aligned} \tag{31}$$

Figure 21 presents the time-domain comparison results for the estimated MISO model of Eq. (31).

Fig. 21. Comparison between the signal estimated by the linear MISO model and the real signal *Ph2*(*k*).

The structure of the SISO linear model for the signal *Ldh1*(*k*) is as follows:

$$L_{dh1}(k) = \sum_{i=1}^{2} a_i L_{dh1}(k-i) + \sum_{i=0}^{2} b_i T_{bea87}(k-i) + \varepsilon(k) + \sum_{i=1}^{2} c_i \varepsilon(k-i) \tag{32}$$

The structure of the MISO linear model for the signal *Ldh1*(*k*) is as follows:


$$\begin{aligned}
A(q^{-1})y(k) &= B(q^{-1})U(k) + \varepsilon(k)\\
y(k) &= L_{dh1}(k)\\
U(k) &= \left[T_{bea87}(k)\; M_{lah}(k)\; L_{eh2}(k)\right]^T\\
A(q^{-1}) &= 1 + a_1 q^{-1}\\
B(q^{-1}) &= \left[B_0(q^{-1}) + B_1(q^{-1}) + B_2(q^{-1})\right]\\
B_0(q^{-1}) &= \sum_{i=0}^{7} b_{0i} q^{-i}\\
B_1(q^{-1}) &= \sum_{i=0}^{7} b_{1i} q^{-i}\\
B_2(q^{-1}) &= \sum_{i=0}^{7} b_{2i} q^{-i}
\end{aligned} \tag{33}$$

Figures 22 and 23 present the time-domain comparison results for the estimated SISO and MISO models of Eq. (32) and Eq. (33).

Fig. 22. Comparison between the signal estimated by the linear SISO model and the real signal *Ldh1*(*k*).

Fig. 23. Comparison between the signal estimated by the linear MISO model and the real signal *Ldh1*(*k*).


### **7. References**

[1] Nepomuceno, L. X. *Técnicas de Manutenção Preditiva*. Editora Edgard Blücher Ltda, Volume 2, 1989.

[2] Venkatasubramanian, V.; Rengaswamy, R.; Yin, K.; Kavuri, S. N. *A review of process fault detection and diagnosis Part I: Quantitative model-based methods*. Computers and Chemical Engineering, 27, 2003, 293-311.

[3] Patton, R. J.; Frank, P. M.; Clark, R. N. *Fault Diagnosis in Dynamic Systems, Theory and Application*. Control Engineering Series. Prentice Hall, London, 1989.

[4] Chen, J.; Patton, R. J. *Robust Model Based Fault Diagnosis for Dynamic Systems*. Kluwer Academic, 1999.

[5] Basseville, M.; Nikiforov, I. V. *Detection of Abrupt Changes: Theory and Application*. Prentice Hall, 1993.

[6] Garcia, M. C.; Sanz Bobi, M. A.; del Pico, J. *SIMAP - Intelligent System for Predictive Maintenance: Application to the health condition monitoring of a windturbine gearbox*. Computers in Industry, Vol. 57, 2006, 552-568.

[7] Moutinho, M. N. *Sistema de Análise e Diagnóstico de Equipamentos Elétricos de Potência - SADE*. II Semana Eletronorte do Conhecimento e Inovação (II SECI), 21 a 23 de outubro de 2009, São Luís - MA.

[8] Moutinho, M. N. *Fuzzy Diagnostic Systems of Rotating Machineries, some ELETRONORTE's applications*. The 15th International Conference on Intelligent System Application to Power Systems, Curitiba - Brazil, 2009.

[9] Moutinho, M. N. *Classificação de Padrões Operacionais do Atuador Hidráulico do Distribuidor de um Hidrogerador Utilizando Técnicas de Estimação Paramétrica e Lógica Fuzzy - Resultados Experimentais*. XIX SNPTEE - Seminário Nacional de Produção e Transmissão de Energia Elétrica, Florianópolis, SC, 23 a 26 de outubro de 2011.

[10] Han, Z.; Li, W.; Shah, S. L. *Fault detection and isolation in the presence of process uncertainties*. Control Engineering Practice, 13, 2005, 587-599.

[11] Frank, P. M. *Fault Diagnosis in Dynamic Systems Using Analytical and Knowledge-based Redundancy - A Survey and Some New Results*. Automatica, Vol. 26, No. 3, pp. 459-474, 1990.

[12] Eduardo, A. C. *Diagnóstico de Defeitos em Sistemas Mecânicos Rotativos através da Análise de Correlações e Redes Neurais Artificiais*. Tese de doutorado, Faculdade de Engenharia Mecânica, Campinas, S.P. - Brasil, 2003.

[13] VibroSystM. *Zoom 5 Software Guia do Usuário*. P/N: 9476-26M1A-100. VibroSystM Inc., 2005.

[14] Aguirre, L. A. *Introdução à Identificação de Sistemas*. 2ª edição, UFMG, 2004.

[15] Åström, K. J.; Wittenmark, B. *Computer Controlled Systems*. Prentice-Hall, 1984.

[16] Wang, L. X. *A Course in Fuzzy Systems and Control*. Prentice-Hall International, Inc., 1997.

[17] Bramatti, N. *Desenvolvimento e Implantação de um Sistema de Monitoração on-line de Compensadores Síncronos*. Dissertação de Mestrado, Universidade Federal do Pará, Centro Tecnológico, Programa de Pós-graduação em Engenharia Elétrica, 2002.

[18] Ljung, L. *System Identification - Theory for the User*. PTR Prentice Hall, Englewood Cliffs, New Jersey, 1987.

[19] Ljung, L. *System Identification Toolbox 7 User's Guide*.

[20] Jang, J.-S. R. *ANFIS: Adaptive-Network-based Fuzzy Inference Systems*. IEEE Transactions on Systems, Man, and Cybernetics, Vol. 23, No. 3, pp. 665-685, May 1993.

**13**

## **Understanding Driver Car-Following Behavior Using a Fuzzy Logic Car-Following Model**

Toshihisa Sato and Motoyuki Akamatsu

*Human Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Japan*

### **1. Introduction**

Recently, automatic systems that control driving speed and headway distance while following a vehicle have been developed worldwide. Some products, such as adaptive cruise control systems, have already been installed in upper-segment passenger vehicles. Car following is an important operation for safe and comfortable driving on straight and/or curved roads. Rear-end collisions have accounted for the largest share of traffic accidents in Japan over the last decade (Iwashita et al., 2011). A rear-end collision occurs when the distance between two vehicles decreases due to deceleration of the lead vehicle and/or a higher speed of the following vehicle. An automatic vehicle control system maintains a safe headway distance while following a vehicle and controls velocity according to the relative speed of the leading vehicle, in order to avoid a rear-end collision.

If the system's automatic controls do not match the driver's manual controls, driver acceptance of the automatic vehicle control system decreases, and the driver is not likely to use it. For example, when a lead vehicle speeds up and the inter-vehicle distance increases, one driver may accelerate strongly, whereas another driver may accelerate slightly, and other drivers may not accelerate at all. Automatic hard acceleration does not suit drivers who accelerate slightly or not at all, and they may regard such automatic systems as dangerous. Therefore, drivers can be expected to accept longitudinal control systems that operate in a manner similar to their own usual car-following behavior. Drivers' car-following behavior must be investigated in a real road-traffic environment to develop vehicle control and driver support systems that are compatible with drivers' typical car-following behavior.

Car-following behavior consists of two aspects: how much distance drivers allow for a leading vehicle as an acceptable headway distance, and how they control acceleration according to the movements of the leading vehicle. Figure 1 presents an example of a typical following process. This car-following behavior data was recorded using an instrumented vehicle on a real motorway in Southampton (Sato et al., 2009a). The relative distance and speed were detected by microwave radar. The data length was 5 min. Car-following behavior is a goal-seeking process, depicted by several spirals, in which drivers attempt to maintain the desired headway behind the lead vehicle in a car-following situation.

Understanding Driver Car-Following Behavior Using a Fuzzy Logic Car-Following Model 267


In this chapter, the range of headway distances that drivers leave for leading vehicles is termed the "static" aspect of car-following behavior. The driver's acceleration control based on the relationship between the driver's own vehicle and the leading vehicle is termed the "dynamic" aspect. Following distances, Time Headway (THW) (defined as the relative distance to a leading vehicle divided by the driving speed of the driver's own vehicle), and Time to Collision (TTC) (defined as the relative distance to a leading vehicle divided by the relative speed between the leading and the driver's own vehicles) are indicators for evaluating the static aspect. A number of car-following models deal with the dynamic aspect.
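The two static-aspect indicators follow directly from their definitions. The sketch below is illustrative only; the function names and the sign convention for the closing speed (driver's own speed minus lead-vehicle speed) are our assumptions, not the chapter's.

```python
def time_headway(relative_distance_m, own_speed_mps):
    """THW: relative distance to the lead vehicle divided by own driving speed."""
    return relative_distance_m / own_speed_mps

def time_to_collision(relative_distance_m, closing_speed_mps):
    """TTC: relative distance divided by the relative (closing) speed.
    Defined only while the gap is shrinking; otherwise there is no collision course."""
    if closing_speed_mps <= 0:
        return float("inf")
    return relative_distance_m / closing_speed_mps

# A 30 m gap at 25 m/s with a 2 m/s closing speed:
print(time_headway(30.0, 25.0))       # 1.2 (s)
print(time_to_collision(30.0, 2.0))   # 15.0 (s)
```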

Fig. 1. Example of car-following behavior data collected using an instrumented vehicle on an actual road. (For details of the data collection method, please see section 3.2.)

### **1.1 Brief review of car-following models**

Car-following models have been developed since the 1950s (e.g., Pipes, 1953). Many models describe the accelerative behavior of a driver as a function of inter-vehicle separation and relative speed. The following are representative car-following models (for details, please see Brackstone & McDonald, 1999).

General Motors Model:

The fundamental concept behind the General Motors Model is the stimulus-response theory (Chandler et al., 1958). Equation (1) presents a representative formulation.

$$a_F(t+T) = \alpha \left[ \frac{[V_F(t)]^m}{[X_L(t) - X_F(t)]^l} \right] [V_L(t) - V_F(t)] \tag{1}$$

where aF(t+T) is the acceleration or deceleration rate of the following vehicle at time t+T; VL(t) is the speed of the lead vehicle at time t; VF(t) is the speed of the following vehicle at time t; XL(t) is the position of the lead vehicle at time t; XF(t) is the position of the following vehicle at time t; T is the perception-reaction time of the driver; and m, l, and α are constants to be determined.

Basically, the response is the acceleration (deceleration) rate of the following vehicle. This is a function of driver sensitivity and the stimulus. The stimulus is assumed to be the difference between the speed of the lead vehicle and that of the following vehicle. Driver sensitivity is a function of the spacing between the lead and following vehicles and the speed of the following vehicle. Several derived equations have been proposed in the last 20 years (see Mehmood et al., 2001).
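As a concrete illustration, Eq. (1) can be sketched in a few lines. The parameter values (α = 1, m = 0, l = 1) are illustrative placeholders; calibrated values vary across the derived formulations.

```python
def gm_acceleration(v_f, v_l, x_f, x_l, alpha=1.0, m=0, l=1):
    """General Motors stimulus-response model, Eq. (1).

    sensitivity = alpha * V_F^m / spacing^l, stimulus = V_L - V_F;
    the result is a_F at time t + T (the reaction lag is handled by the caller)."""
    spacing = x_l - x_f                       # inter-vehicle separation
    sensitivity = alpha * (v_f ** m) / (spacing ** l)
    stimulus = v_l - v_f                      # relative speed
    return sensitivity * stimulus

# Follower at 20 m/s, leader at 22 m/s, 25 m apart:
print(gm_acceleration(v_f=20.0, v_l=22.0, x_f=0.0, x_l=25.0))  # 0.08 (m/s^2)
```

Note that with `v_l == v_f` the stimulus, and hence the response, is zero regardless of spacing, which is exactly the weakness of the model discussed in the text.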

However, one weakness of the General Motors Model is that the response of the following vehicle is determined by one stimulus, speed relative to the leading vehicle. When the relative speed between the two vehicles is zero, the acceleration or deceleration response is zero. This is not a realistic phenomenon, because a driver decelerates to increase intervehicle separation when the relative speed is zero but the spacing is too short. To overcome this problem, Helly developed a linear model that includes the additional stimulus term of the desired headway distance (Helly, 1959):

$$a_F(t+T) = C_1[V_L(t) - V_F(t)] + C_2\{[X_L(t) - X_F(t)] - D_n(t+T)\}$$

$$D_n(t+T) = \alpha + \beta V_F(t) + \gamma a_F(t) \tag{2}$$

where Dn(t+T) is the desired following distance at time t+T; and α, β, γ, C1, and C2 are calibration constants.
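Helly's linear model is equally compact. The calibration constants below (C1, C2, α, β, γ) are illustrative placeholders, not values fitted in the chapter.

```python
def helly_acceleration(v_f, v_l, x_f, x_l, a_f,
                       C1=0.5, C2=0.125, alpha=5.0, beta=1.0, gamma=0.0):
    """Helly linear model, Eq. (2): a relative-speed term plus the deviation
    of the actual spacing from the desired following distance Dn(t+T)."""
    desired = alpha + beta * v_f + gamma * a_f    # Dn(t+T)
    spacing = x_l - x_f
    return C1 * (v_l - v_f) + C2 * (spacing - desired)

# Equal speeds (zero relative speed) but spacing 5 m short of desired:
print(helly_acceleration(v_f=15.0, v_l=15.0, x_f=0.0, x_l=15.0, a_f=0.0))  # -0.625
```

Unlike the General Motors Model, the follower decelerates here even at zero relative speed, because the spacing term is non-zero when the gap is shorter than desired.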

Another limitation is the assumption of symmetrical behavior under car-following conditions. For example, suppose one lead vehicle has a positive relative speed of a certain magnitude, and another lead vehicle has a negative relative speed of the same magnitude. In these situations, the General Motors Model gives an acceleration rate in the first case equal in magnitude to the deceleration rate in the second case. In a real road-traffic environment, the deceleration in the second case is greater than the acceleration in the first, because drivers decelerate more strongly to avoid risk.

Stopping-Distance Model:

266 Fuzzy Logic – Algorithms, Techniques and Implementations


The Stopping-Distance Model assumes that a following vehicle always maintains a safe following distance in order to bring the vehicle to a safe stop if the leading vehicle suddenly stops. This model is based on a function of the speeds of the following and leading vehicles and the follower's reaction time. The original formulation (Kometani & Sasaki, 1959) is:

$$\Delta x(t-T) = \alpha v_L^2(t-T) + \beta_l v_F^2(t) + \beta v_F(t) + b_0 \tag{3}$$

where Δx is the relative distance between the lead and following vehicles; vL is the speed of the lead vehicle; vF is the speed of the following vehicle; T is the driver's reaction time; and α, β, βl, and b0 are calibration constants.

The Stopping-Distance Model is widely used in microscopic traffic simulations (Gipps, 1981) because it is easy to calibrate from realistic driving behavior, requiring only the maximal deceleration of the following vehicle. However, the "safe headway" concept is not a totally valid starting point, and this assumption is not consistent with empirical observations.
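Eq. (3) can be sketched as a safe-spacing function. The constants below are purely illustrative: a negative α reflects that a faster lead vehicle travels farther before stopping and so demands less clearance, while βvF + b0 plays the role of a reaction-time gap.

```python
def stopping_distance(v_l, v_f, alpha=-0.025, beta_l=0.025, beta=1.0, b0=2.0):
    """Stopping-Distance Model, Eq. (3): the spacing a follower maintains so it
    can stop safely if the lead vehicle suddenly stops. Constants illustrative."""
    return alpha * v_l ** 2 + beta_l * v_f ** 2 + beta * v_f + b0

# At equal speeds of 20 m/s the quadratic terms cancel with these constants,
# leaving the reaction-time gap beta*v_F + b0:
print(stopping_distance(20.0, 20.0))  # 22.0 (m)
```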

Action-Point Model:


The Action-Point Model is the first car-following model to incorporate human perception of motion. The model developed by Michaels suggests that the dominant perceptual factor is changes in the apparent size of the vehicle (i.e., the changing rate of visual angle) (Michaels, 1963):

$$\frac{d\theta}{dt} = \frac{4\ast\mathsf{W}\_{\mathrm{L}}}{4\ast\left[\mathsf{X}\_{\mathrm{L}}(\mathrm{t}) - \mathsf{X}\_{\mathrm{F}}(\mathrm{t})\right]^{2} + \mathsf{W}\_{\mathrm{L}}^{2}} \left[\mathsf{V}\_{\mathrm{L}}(\mathrm{t}) - \mathsf{V}\_{\mathrm{F}}(\mathrm{t})\right] \tag{4}$$

where WL is the width of the lead vehicle.

This model assumes that a driver appropriately accelerates or decelerates if the angular velocity exceeds a certain threshold. Once the threshold is exceeded, the driver chooses to decelerate until he/she can no longer perceive any relative velocity. Thresholds include a spacing-based threshold that is particularly relevant in close headway situations, a relative speed threshold for the perception of closing, and thresholds for the perception of opening and closing at low relative speeds (recent work suggests that the perception of opening and that of closing have different thresholds (Reiter, 1994)). Car-following conditions are further categorized into subgroups: free driving, overtaking, following, and emergency situation. A driver engages in different acceleration behaviors in different situations when the perceived stimulus exceeds the thresholds.

The Action-Point Model takes into account the human threshold of perception, establishing a realistic rationale. However, various efforts have focused on identifying threshold values during the calibration phase, while the adjustment of acceleration above the threshold has not been considered, and the acceleration rate is normally assumed to be a constant. Additionally, the model dynamic (switching between the subgroups) has not been investigated. Finally, the ability to perceive speed differences and estimate distances varies widely among drivers. Therefore, it is difficult to estimate and calibrate the individual thresholds associated with the Action-Point Model.
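A minimal sketch of the perceptual core of Eq. (4): the rate of change of the visual angle, compared against a perception threshold. The vehicle width (1.8 m) and the threshold value are illustrative assumptions; as the text notes, real thresholds vary widely among drivers.

```python
def visual_angle_rate(x_l, x_f, v_l, v_f, w_l=1.8):
    """d(theta)/dt from Eq. (4): theta is the visual angle subtended by a
    lead vehicle of width w_l at spacing X_L - X_F."""
    spacing = x_l - x_f
    return (4.0 * w_l) / (4.0 * spacing ** 2 + w_l ** 2) * (v_l - v_f)

def exceeds_threshold(dtheta_dt, threshold=6e-4):
    """Action-point check: the driver reacts only when the perceived angular
    rate crosses the threshold (rad/s, illustrative value)."""
    return abs(dtheta_dt) > threshold

# Closing at 2 m/s: perceptible at 30 m, imperceptible at 100 m.
print(exceeds_threshold(visual_angle_rate(30.0, 0.0, 18.0, 20.0)))   # True
print(exceeds_threshold(visual_angle_rate(100.0, 0.0, 18.0, 20.0)))  # False
```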

### **2. Fuzzy logic car-following model**

Drivers perform a car-following task with real-time processing of several kinds of information sources. The car-following models discussed above have each established a unique interpretation of drivers' car-following behaviors. A driver in a car-following situation is described as a stimuli-responder in the General Motors Model, a safe distance-keeper in the Stopping-Distance Model, and a state monitor who wants to keep perceptions below the threshold in the Action-Point Model. However, these models include non-realistic constraints to describe car-following behavior in real road-traffic environments: symmetry between acceleration and deceleration, the "safe headway" concept, and constant acceleration or deceleration above the threshold.

The fuzzy logic car-following model describes driving operations under car-following conditions using linguistic terms and associated rules, instead of deterministic mathematical functions. Car-following behavior can be described in a natural manner that reflects the imprecise and incomplete sensory data presented by human sensory modalities. The fuzzy logic car-following model treats the driver as a decision-maker who decides the controls based on sensory inputs using fuzzy reasoning. There are two types of fuzzy inference system that use fuzzy reasoning to map an input space to an output space: Mamdani-type and Sugeno-type. The main difference between the Mamdani and Sugeno types is that the output membership functions are only linear or constant for Sugeno-type fuzzy inference. A typical rule in the Sugeno-type fuzzy inference (Sugeno, 1985) is:

If input *x* is *A* and input *y* is *B* then output *z* is *x*\**p*+*y*\**q*+*r;*

where *A* and *B* are fuzzy sets and *p*, *q*, and *r* are constants.

The constant output membership function is obtained from a singleton spike (*p*=*q*=0).
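To make the Sugeno machinery concrete, here is a toy zero-order Sugeno system in the spirit of the model described next: Gaussian input sets, constant rule outputs, product firing strengths, and weighted-average defuzzification. All centres, widths, and consequents are invented for illustration; they are not the calibrated TRG parameters.

```python
import math

def gaussmf(x, centre, sigma):
    """Gaussian membership function."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

def sugeno_accel(rel_vel, dssd):
    """Toy zero-order Sugeno inference: a 2 x 2 rule grid over relative
    velocity {closing, opening} and DSSD {close, far}."""
    rv = {"closing": gaussmf(rel_vel, -2.0, 1.5),
          "opening": gaussmf(rel_vel, 2.0, 1.5)}
    ds = {"close": gaussmf(dssd, 0.5, 0.3),
          "far": gaussmf(dssd, 1.5, 0.3)}
    # (firing strength, constant consequent z in m/s^2) for each rule
    rules = [(rv["closing"] * ds["close"], -2.0),   # closing and too close -> brake
             (rv["closing"] * ds["far"],   -0.5),
             (rv["opening"] * ds["close"],  0.5),
             (rv["opening"] * ds["far"],    1.5)]   # opening and far -> speed up
    # weighted-average defuzzification
    return sum(w * z for w, z in rules) / sum(w for w, _ in rules)

# Closing in below the desired headway -> a strong braking command:
print(sugeno_accel(rel_vel=-2.0, dssd=0.5))  # about -1.9
```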

### **2.1 Overview**


The fuzzy logic car-following model was developed by the Transportation Research Group (TRG) at the University of Southampton (Wu et al., 2000). McDonald et al. collected car-following behavior data on real roads and developed and validated the proposed fuzzy logic car-following model based on the real-world data (briefly described in 2.2 and 2.3; please see Wu, 2003; Zheng, 2003 for further explanation).

The fuzzy logic model uses relative velocity and distance divergence (DSSD, the ratio of headway distance to a desired headway) as input variables. The output variable is the acceleration-deceleration rate. The desired headway is the average of the headway distance observed when the relative speeds between vehicles are close to zero. This model adopts fuzzy functions (fuzzy sets described by membership functions) as the formula for the input-output relationship. Figure 2 depicts the structure of the fuzzy logic car-following model.

Fig. 2. Structure of the fuzzy inference system in the fuzzy logic car-following model

Specifications of the fuzzy inference system are as follows.

- Type of inference system: Sugeno
- Type of input membership function: Gaussian
- Type of output membership function: Constant
- Number of partitions for input (Relative Velocity): 5 (closing+, closing, about zero, opening, and opening+)
- Number of partitions for input (DSSD): 3 (close, ok, and far)
- Initialization of fuzzy inference system: grid partition method
- Learning algorithm: combination of back-propagation and least square methods
- Defuzzification method: weighted average

The input variables are defined as:

Relative Velocity = velocity of the leading vehicle − velocity of the driver's vehicle

DSSD = headway distance / the desired headway
The parameters of the fuzzy inference system are estimated using the following combination of back-propagation and least square methods. The initial fuzzy inference system adopts the grid partition method, in which the membership functions of each input are evenly assigned over the range of the training data. Next, the membership function parameters are adjusted using the hybrid learning algorithm. The parameters of the output membership functions are updated in a forward pass using the least square method: the inputs are first propagated forward, and the overall output is then a linear combination of the parameters of the output membership functions. The parameters of the input membership functions are estimated using back-propagation in each iteration, where the differences between model output and training data are propagated backward and the parameters are updated by gradient descent. The parameter optimization routines are applied until a given number of iterations or an error-reduction threshold is reached.
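The forward-pass half of this hybrid scheme can be sketched compactly: with the input membership functions held fixed, the model output is linear in the constant consequents, so they are obtained in one least-squares solve. The membership-function centres/widths and the toy target below are our own illustrative choices; a full implementation would add the backward gradient-descent pass over the input-membership parameters.

```python
import numpy as np

def gaussmf(x, centre, sigma):
    return np.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

def fit_consequents(x1, x2, y, mfs1, mfs2):
    """One forward pass: build normalized firing strengths for the rule grid
    (every membership function of input 1 paired with every one of input 2),
    then solve for the constant output of each rule by least squares."""
    W = np.column_stack([gaussmf(x1, c1, s1) * gaussmf(x2, c2, s2)
                         for (c1, s1) in mfs1 for (c2, s2) in mfs2])
    W /= W.sum(axis=1, keepdims=True)        # normalized firing strengths
    z, *_ = np.linalg.lstsq(W, y, rcond=None)
    return z                                 # one constant consequent per rule

# Toy data: relative velocity, DSSD, and a synthetic acceleration target.
rng = np.random.default_rng(0)
x1 = rng.uniform(-4.0, 4.0, 200)
x2 = rng.uniform(0.0, 2.0, 200)
y = 0.4 * x1 - (x2 - 1.0)
z = fit_consequents(x1, x2, y,
                    mfs1=[(-2.0, 2.0), (2.0, 2.0)],   # (centre, sigma)
                    mfs2=[(0.5, 0.7), (1.5, 0.7)])
print(z.shape)  # one consequent for each of the 2 x 2 rules
```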

The input-output mapping specified by the fuzzy inference system has a three-dimensional structure. We focus on relative velocity-acceleration mapping in order to analyze the dynamic aspect of car-following behavior (i.e., drivers' acceleration controls based on the variation in relative speeds).

### **2.2 Input variable validation**

The following eight candidates were applied to the fuzzy inference system estimation in order to obtain satisfactory performance of the fuzzy logic model.


The performance of the fuzzy logic model was evaluated by the Root Mean Square Error (RMSE) of the model prediction:

$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} \left(\hat{Y}_i - Y_i\right)^2} \tag{5}$$

where Ŷi is the predicted value from the fuzzy logic model at time increment i, Yi is the raw data at time increment i, and N is the number of data points.
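Eq. (5) in code form; a direct transcription of the definition.

```python
import math

def rmse(predicted, observed):
    """Root Mean Square Error between model predictions and raw data, Eq. (5)."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Acceleration predictions vs. observations (m/s^2):
print(rmse([0.1, -0.2, 0.0], [0.0, 0.0, 0.0]))  # about 0.129
```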

All possible model formulations (a single variable, a combination of two variables, and a combination of three variables) were tested. The data were collected on real motorways using a TRG instrumented vehicle. Although a three-input model showed better RMSE performance than a one-input or a two-input model, the two-input model using relative speed and distance divergence was adopted because of its simpler model structure and its applicability to a wide range of car-following situations. For details of the input variable validation, refer to Zheng, 2003.

### **2.3 Model validation**


The developed fuzzy logic car-following model was validated in terms of reproducing a single vehicle's car-following behavior, as well as reproducing traffic flow under car-following conditions (a platoon of vehicles).

The single vehicle's car-following behavior was evaluated from empirical data, and the average RMSE of acceleration was 0.20 m/s². The platoon behavior was evaluated using simulation. The response of a platoon of 20 vehicles to step changes in acceleration or deceleration of a lead vehicle was assessed in order to investigate the influence of the movement of the lead vehicle on a line of vehicles. The results validated that the fuzzy logic car-following model could reproduce both stable and unstable traffic behavior. For details of the model validation, refer to Wu et al., 2003 and Zheng, 2003.

### **3. Case study 1: Car-following behavior comparison between the UK and Japan**

### **3.1 Motivation**

This section introduces a case study focusing on a comparison of drivers' car-following behavior in the UK and in Japan (Sato et al., 2009b). The fuzzy logic car-following model was developed using naturalistic data collected in Southampton. We applied this model to behavioral data collected in Japan. One objective is to confirm whether Japanese car-following behavior can be described by the fuzzy logic model with the same structure as the UK model. Another objective is to investigate cross-cultural variations of the car-following behaviors of drivers in the two countries.

With increasing globalization of automotive markets, it is important to understand the differences between driving behavior in different countries. Car-following behavior may differ due to differences in nationality and the road traffic environments of different countries. The findings may contribute to designing human-centered automatic vehicle control systems based on international differences in driving behavior.

### **3.2 Methods**

### **3.2.1 Instrumented vehicles**

An AIST instrumented vehicle and a TRG instrumented vehicle were used for behavioral data collection (Brackstone et al., 1999; Sato & Akamatsu, 2007). Both vehicles are equipped with

various sensors and driving recorder systems in order to detect the vehicle driving status and to measure the driver's operations. Velocity is measured using a speed pulse signal, and acceleration is detected by a G-sensor. The relative distance and relative speed to the leading and following vehicles are recorded with laser radar units (AIST instrumented vehicle) or microwave radar (TRG instrumented vehicle) that are fixed within the front and rear bumpers. Figure 3 presents an overview of the AIST instrumented vehicle. This vehicle collects the following data:

- Driving speed by speed pulse signal,
- Vehicle acceleration by G-sensor,
- Angular velocity by gyro sensor,
- Geographical position by D-GPS sensor,
- Relative distance and speed to the leading and following vehicles by laser radar units,
- Position of driver's right foot by laser sensors,
- Application of gas and brake pedals by potentiometers,
- Steering wheel angle by encoder,
- Turn signal activation by encoder, and
- Visual images (forward and rear scenes, lane positions, and driver's face) by five CCD cameras.

Fig. 3. AIST instrumented vehicle with sensors and a recorder system for detecting nearby vehicles (driving recorder system sampling at 30Hz; laser radar units for relative distance and velocity to the leading and following vehicles)

The velocity of the following vehicle was calculated based on the velocity of the instrumented vehicles and the relative speed. The visual image of the rear scenes was used for better understanding of the traffic conditions while driving and for clarifying uncertainties identified in the radars.

272 Fuzzy Logic – Algorithms, Techniques and Implementations

Understanding Driver Car-Following Behavior Using a Fuzzy Logic Car-Following Model 273

### **3.2.2 Road-traffic environment**

Figure 4 depicts the road environment in the Southampton (UK) and Tsukuba (Japan) routes. The driving route in Tsukuba was 15km long (travel time 30min). This route included urban roads with several left and right turns at intersections, mostly with a single traffic lane, and a bypass that had one and two traffic lanes. The driving route in Southampton included trunk roads and motorways with two and three lanes and roundabout junctions. The driving behavior data in Southampton was collected as part of an EC STARDUST project (Zheng et al., 2006). The field experiments at the two sites were conducted during the morning from 9:00 to 10:45.

Fig. 4. Road environments used for car-following behavior analyses (Tsukuba, Japan: urban road and bypass; Southampton, UK: trunk road and motorway)

### **3.2.3 Variables**

The passive mode was used for the data collected (Fig. 5), reflecting random drivers who followed the instrumented vehicle. The passive mode can collect and evaluate a large population of drivers, rather than just the participating driver in the instrumented vehicle, in a short period and at a lower level of detail (Brackstone et al., 2002). The measured data in the passive mode enable evaluation of car-following behavior trends in each country.

Fig. 5. Active and passive modes in car-following conditions (active mode: the driver participating in the experiments follows a leading vehicle in the instrumented vehicle; passive mode: random drivers who are not involved in the experiments follow the instrumented vehicle)

In the analysis, the car-following condition was defined as a situation in which a driver followed a leading vehicle with relative speeds between -15km/h and +15km/h. The relative distance to a following vehicle under car-following conditions was obtained from the


measured data. The rear distances collected were divided into two sets in terms of the associated driving speeds: 30 to 49km/h and 50 to 69km/h. The speed range of 30 to 49km/h corresponds to driving on an urban road (Tsukuba) and on a trunk road (Southampton), while the speed range of 50 to 69km/h corresponds to driving on a bypass (Tsukuba) and on a motorway (Southampton).

The time headway (THW) of the passive mode (defined as the relative distance between the following vehicle and the instrumented vehicle divided by the driving speed of the following vehicle) was calculated, and the distributions of the THW at each set were compared for analysis of the static aspect of car-following behavior.
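The THW computation just described, together with the binning into 0.5s categories used in Fig. 6, can be sketched as follows; the function names and the toy values are illustrative:

```python
def thw(distance_m, speed_kmh):
    """Time headway in seconds: relative distance divided by the
    following vehicle's speed (converted from km/h to m/s)."""
    return distance_m / (speed_kmh / 3.6)

def thw_distribution(thws, bin_width=0.5, max_thw=5.0):
    """Proportion of samples per THW bin; the last bin collects >= max_thw,
    mirroring the open-ended '5~' category of Fig. 6."""
    n_bins = int(max_thw / bin_width) + 1
    counts = [0] * n_bins
    for t in thws:
        counts[min(int(t / bin_width), n_bins - 1)] += 1
    return [c / len(thws) for c in counts]
```

Splitting the THW samples by the associated driving speed (30 to 49km/h versus 50 to 69km/h) before calling `thw_distribution` reproduces the two panels of Fig. 6.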

In addition to the rear distances, the relative speeds and acceleration of the following vehicle were used for the fuzzy logic car-following model. Although this model can be used to describe individual drivers' acceleration-deceleration behavior, we applied the model to the passive mode data in order to compare general features of the dynamic aspect of car-following behavior between Tsukuba and Southampton. The continuous data for more than 20sec was input to the model specification within the measured car-following data.
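The episode selection applied here (samples with relative speed within ±15km/h, kept only as continuous runs longer than 20sec) might be sketched as follows; the function name and the sampling-rate handling are illustrative assumptions:

```python
def car_following_episodes(rel_speed_kmh, fs=30.0, limit=15.0, min_dur=20.0):
    """Return (start, end) sample-index pairs of car-following episodes.

    rel_speed_kmh: time series of relative speed in km/h
    fs:            sampling rate in Hz (the chapter's recorder runs at 30Hz)
    limit:         car-following threshold on |relative speed|, km/h
    min_dur:       minimum continuous duration in seconds
    """
    mask = [abs(v) <= limit for v in rel_speed_kmh]
    episodes, start = [], None
    for i, ok in enumerate(mask):
        if ok and start is None:
            start = i                        # a candidate run begins
        elif not ok and start is not None:
            if (i - start) / fs > min_dur:   # keep only long-enough runs
                episodes.append((start, i))
            start = None
    if start is not None and (len(mask) - start) / fs > min_dur:
        episodes.append((start, len(mask)))  # run extends to end of record
    return episodes
```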

### **3.3 Results**

### **3.3.1 Static aspect**

Figure 6 presents the distributions of the THW for each speed range, expressed as the proportion of time drivers spend at each THW relative to the total time driving at the corresponding velocity.

In the lower speed range (30 to 49km/h), the proportion of Southampton drivers taking very short THW (0.5 to 1s) exceeds that of Tsukuba drivers. The proportion of Tsukuba drivers taking THW longer than 3s exceeds that of Southampton drivers. In the higher speed range (50 to 69km/h), no difference in THW between the two regions is observed. Both Tsukuba drivers and Southampton drivers spend more time with the short THW (0.5s to 1.5s). As mentioned in previous research (Brackstone et al., 2009), THW tends to decrease as velocity increases.

Fig. 6. Comparison of THW between the two countries (Tsukuba, Japan and Southampton, UK) for each speed range

### **3.3.2 Dynamic aspect**


Figure 7 presents the relative velocity–acceleration mapping obtained from the fuzzy inference specification in Tsukuba and Southampton. The two sites have similar numbers of traces (Southampton, 15; Tsukuba, 14) and data lengths (Southampton, 511.5sec; Tsukuba, 522.9sec). The RMSEs between the predicted acceleration and the measured data in the estimated fuzzy logic model were 0.15m/sec2 in Tsukuba and 0.17m/sec2 in Southampton. These findings indicate a satisfactory model-to-data fit compared to other published works (Wu et al., 2003).

The deceleration of Tsukuba drivers is greater than that of Southampton drivers when their vehicle approaches the leading vehicle. When the distance between vehicles is opening, the acceleration of Southampton drivers is greater than that of Tsukuba drivers. Thus, the acceleration-deceleration rate of Tsukuba drivers indicates a tendency opposite that of Southampton drivers.

Fig. 7. Results of fuzzy logic model specification: Relative velocity–acceleration mapping between Tsukuba and Southampton
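A mapping such as the one in Fig. 7 can be read out of a fitted model by sweeping relative velocity over its observed range at a fixed spacing and recording the predicted acceleration. The saturating stand-in response below is an assumption for illustration, not the estimated fuzzy model:

```python
import numpy as np

def predicted_acceleration(rel_vel, spacing):
    # Stand-in response (assumed, not the chapter's fitted model):
    # accelerate when the gap is opening (rel_vel > 0), decelerate when
    # closing, scaled down when the available spacing is small.
    return 1.5 * np.tanh(0.4 * rel_vel) * min(spacing / 30.0, 1.0)

def mapping_curve(model, spacing, v_min=-5.0, v_max=5.0, n=101):
    """Sweep relative velocity at a fixed spacing; return (rv, acceleration)."""
    rv = np.linspace(v_min, v_max, n)
    acc = np.array([model(v, spacing) for v in rv])
    return rv, acc
```

Repeating the sweep over a grid of spacings yields the full three-dimensional input-output surface; Fig. 7 corresponds to slices of that surface along the relative-velocity axis.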

### **3.4 Discussion**

The low RMSE of the Tsukuba acceleration rate suggests that the proposed fuzzy logic model is well-suited to Japanese car-following behavior. The findings imply that Japanese drivers use relative velocity and distance divergence for adjusting acceleration and deceleration while following a vehicle.

The THW of Tsukuba drivers was longer at slow velocity. When Tsukuba drivers approached a preceding vehicle in the same traffic lane, they decelerated more strongly. In addition, Tsukuba drivers accelerated less as the distance to the leading vehicle increased. Strong deceleration while moving toward the leading vehicle and weak acceleration when following a preceding vehicle led to long headway distances.

Southampton drivers tended to adopt shorter THW when car-following in the low driving speed range. The acceleration rate of Southampton drivers was higher than that of Tsukuba drivers when overtaking a vehicle. It is assumed that such strong acceleration contributes to maintaining short headway distances in car-following situations.


Tsukuba car-following behavior data were collected on urban roads and a bypass. When driving on urban roads, a leading vehicle often has to decelerate suddenly due to other vehicles at crossroads, a change of traffic signals, and the emergence of pedestrians or bicycles. The leading vehicle might also slow down suddenly on the bypass because a merging car may cut in front of it. Drivers adopted longer headway distances and decelerated more strongly in closing inter-vehicle separations when driving on roads where they should pay more attention to the movements of the leading vehicle.

Southampton car-following behavior data were collected on major roads with two or three lanes. In the speed range of 30 to 69km/h, traffic was quite congested in the morning peak when the field experiments were conducted. The drivers kept short headway distances in order to avoid lane changes of vehicles in front of them, leading to strong acceleration with opening inter-vehicle distances.

The road traffic environment in which the behavior data are collected is an important factor in the differences between car-following behavior in Southampton and that in Tsukuba, indicating that the road-traffic environment influences car-following behavior, regardless of the country of data collection. These findings imply that a single operational algorithm would suffice even when using vehicle control and driver support systems in different countries, although different algorithms would be necessary for different road types (e.g., roads in a city and roads connecting cities).

### **4. Case study 2: Longitudinal study of elderly drivers' car-following behavior**

### **4.1 Motivation**

This section introduces another case study focusing on the assessment of elderly drivers' car-following behavior, using the proposed fuzzy logic car-following model. The number of elderly drivers who drive their own passenger vehicles in their daily lives has increased annually. Driving a vehicle expands everyday activities and enriches the quality of life for the elderly. However, cognitive and physical functional changes of elderly drivers may lead to their increased involvement in traffic accidents. Thus, it is important to develop advanced driver assistance and support systems that promote safe driving for elderly drivers. Automatic vehicle control systems are expected to enhance comfort as well as safety when elderly drivers follow a vehicle. Understanding elderly drivers' car-following behavior is essential for developing automatic control systems that adapt to their usual car-following behavior.

Various studies comparing physical and cognitive functions between young and elderly drivers have been conducted in order to investigate the influence of age-related functional decline on driving (e.g., Owsley, 2004). Driving behavior is influenced by several driver characteristics (e.g., driving skill and driving style); and individual drivers' characteristics differ, especially between young and elderly drivers. Thus, a comparison of the driving behaviors of young and elderly drivers includes the influence of drivers' characteristics as well as the impact of the age-related decline of cognitive functions.

We have been involved in a cohort study on the driving behaviors of elderly drivers on an actual road (Sato & Akamatsu, 2011). A cohort study conducted in real road-traffic environments is expected to focus on changes in elderly drivers' cognitive functions because their cognitive functional changes may be greater than changes in their driving skills or driving styles within a few years. One aim of this study is to clarify how elderly drivers follow a lead vehicle, based on analysis of how car-following behavior changes with aging. We collected elderly drivers' car-following behavior data in one year and compared it with data collected five years later. The distributions of THW in the two field experiments were compared in order to investigate the static aspect of car-following behavior. For analysis of the dynamic aspect, the fuzzy logic car-following model was applied to compare elderly drivers' accelerative behavior while following a vehicle.

### **4.2 Methods**


### **4.2.1 Procedures**

Field experiments were conducted in 2003 (first experiment) and in 2008 (second experiment). The two experiments were conducted using the same instrumented vehicle, the same driving route, and the same participants. The AIST instrumented vehicle was used for the data collection (Fig. 3). Almost all the sensors and the driving recorder system were fixed inside the trunk, so that the participating drivers could not see them, in order to encourage natural driving behaviors during the experiment trials.

The experiments were conducted on rural roads around Tsukuba. The route included several left and right turns, and the travel time was 25min (total distance 14km). The participant rode alone in the instrumented vehicle during the experiment trials. Before the recorded drives, the participants performed practice drives from the starting point to the destination without using a map or an in-vehicle navigation system.

Four elderly drivers (three males and one female) with informed consent participated in the two experiments. Their ages ranged from 65 to 70 years (average 67.3 years) in the first experiment and from 70 to 74 years (average 72.0 years) in the second experiment. Their annual distance driven ranged from 5,000 to 8,000km in the first experiment and from 2,000 to 10,000km in the second experiment (average 6,500km in both experiments).

The participants were instructed to drive in their typical manner. In the first experiment, the recorded trip for each elderly participant was made once a day on weekdays, for a total of 10 trips. In the second experiment, the trial was conducted twice a day on weekdays, for a total of 30 trips. The participants took a break between the experiment trials in the second experiment.

### **4.2.2 Data analysis**

Figure 8 depicts a target section for the analysis of elderly drivers' car-following behavior. We focused on a two-lane bypass, the same road environment as that in section 3.2.2. We included only drives with a leading vehicle, excluding drives without a leading vehicle on the target section.

The active mode (distance between the instrumented vehicle and the leading vehicle) was used in the analysis of the elderly drivers' car-following behavior. We also used the passive mode (distance between the instrumented vehicle and the following vehicle) to investigate traffic characteristics on the analyzed road. The latter is expected to indicate whether changes in elderly drivers' car-following behaviors are influenced by their functional declines or by changes in traffic characteristics on the target road.

Understanding Driver Car-Following Behavior Using a Fuzzy Logic Car-Following Model 279

Fig. 8. Road section used for car-following behavior analysis (target section: 1.8 km from origin to destination, about a 2-minute drive)

### **4.3 Results**

### **4.3.1 Static aspect**

Figure 9 presents the distributions of THW for leading and following vehicles. The THW distributions show the proportion of time spent in each category relative to the total time under car-following conditions. There were no differences in the distribution of THW to following vehicles between the first and second experiments.

The peak of the distribution of THW to leading vehicles is found in the category from 1 to 1.5s in the first experiment. However, the peak is found in the category from 1.5 to 2s in the second experiment, indicating that the THW in the second experiment exceeds that in the first experiment.
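The binning behind these distributions is straightforward; the sketch below (illustrative samples, not the experiment's data) groups THW values into the 0.5 s categories of Fig. 9 and reports each category's share of total car-following time. Function and key names are our own, not the authors'.

```python
# Illustrative sketch: binning THW samples into 0.5 s categories and
# computing each category's frequency rate (share of total time).
from collections import Counter

def thw_distribution(thw_samples, bin_width=0.5, max_bin=5.0):
    """Return {category label: frequency rate} for the given THW samples."""
    counts = Counter()
    for t in thw_samples:
        if t >= max_bin:
            counts["5~"] += 1          # open-ended top category, as in Fig. 9
        else:
            lo = int(t / bin_width) * bin_width
            counts[f"{lo:g}~{lo + bin_width:g}"] += 1
    total = len(thw_samples)
    return {k: c / total for k, c in counts.items()}

dist = thw_distribution([0.8, 1.2, 1.6, 1.7, 1.9, 2.1, 6.0])
peak = max(dist, key=dist.get)  # -> "1.5~2"
```

With these made-up samples the peak falls in the 1.5–2 s category, mirroring the shift reported for the second experiment.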

Fig. 9. Comparison of THW to leading and following vehicles between the first and second experiments (frequency rate of THW in 0.5 s categories)

### **4.3.2 Dynamic aspect**


Figure 10 compares the relative velocity–acceleration mapping of the first and second experiments. In the fuzzy logic model specification, there is a total of 27 traces (data length 770.5sec) in the first experiment and 29 traces (data length 1481.0sec) in the second experiment.

The RMSEs between the predicted and measured accelerations in the estimated fuzzy logic model were 0.25m/sec2 in the first experiment and 0.14m/sec2 in the second experiment, which are within the range of errors reported for models estimated from other real-world data (Wu et al., 2003).
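The RMSE criterion used here is the standard one; a minimal sketch follows, with made-up acceleration traces rather than the experimental data.

```python
# Sketch: root-mean-square error between predicted and measured accelerations.
import math

def rmse(predicted, measured):
    assert len(predicted) == len(measured)
    return math.sqrt(
        sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted)
    )

pred = [0.10, -0.20, 0.35, 0.00]   # model output, m/sec2 (illustrative)
meas = [0.15, -0.30, 0.30, 0.05]   # measured acceleration, m/sec2 (illustrative)
print(round(rmse(pred, meas), 3))  # prints 0.066
```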

The deceleration when the elderly participants approached the lead vehicle was the same in the two experiments. However, the elderly drivers accelerated more strongly in the second experiment than in the first experiment when the leading vehicle went faster and the headway distance was opening.

Fig. 10. Results of fuzzy logic model specification of elderly drivers: Relative velocity– acceleration mapping between the first and second experiments
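The mapping in Fig. 10 — decelerate while closing, hold while steady, accelerate while opening — can be sketched as a toy fuzzy rule base. The membership functions and rule outputs below are invented for illustration; they are not the rules actually estimated in the experiments.

```python
# Toy fuzzy rule base over relative velocity (m/sec); negative = closing.
# Membership shapes and output accelerations are illustrative only.

def mu_closing(rel_v):
    return min(1.0, max(0.0, -rel_v / 2.0))   # ramps up as the gap shrinks

def mu_opening(rel_v):
    return min(1.0, max(0.0, rel_v / 2.0))    # ramps up as the gap grows

def mu_steady(rel_v):
    return max(0.0, 1.0 - abs(rel_v) / 2.0)   # triangle centered at zero

def acceleration(rel_v):
    # Rules: IF closing THEN decelerate; IF steady THEN hold;
    #        IF opening THEN accelerate. Defuzzify by weighted average.
    rules = [(mu_closing(rel_v), -1.0),
             (mu_steady(rel_v), 0.0),
             (mu_opening(rel_v), 0.8)]
    w = sum(m for m, _ in rules)
    return sum(m * a for m, a in rules) / w if w else 0.0

acceleration(-2.0)  # fully "closing": -1.0 m/sec2
acceleration(1.0)   # half steady, half opening: 0.4 m/sec2
```

The longitudinal finding above would correspond, in such a rule base, to a larger output value on the "opening" rule in the second experiment while the "closing" rule stays unchanged.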

### **4.4 Discussion**

Comparison of THW to following vehicles between the first and second experiments indicates no change in traffic flow on the target section over the five years. In contrast, THW to leading vehicles is longer in the second experiment than in the first experiment, suggesting that elderly drivers adopt longer THW and that the static aspect of their car-following behavior changed over the five years.

The task-capability interface model (Fuller, 2005) helps clarify why elderly drivers' car-following behavior changes with aging. In this model, drivers adjust task difficulty while driving in order to avoid road accidents. Task difficulty can be described as an interaction between the driver's capability and task demands. When the driver's capability exceeds the task demands, the task is easy and the driver completes the task successfully. When the task demands exceed the driver's capability, the task is difficult and a collision or loss of control occurs because the driver fails to accomplish the task. Here, the driver's capability is determined by the individual's physical and cognitive characteristics (e.g., vision, reaction time, and information processing capacity), personality, competence, skill, and driving style.


Task demands are determined by the operational features of the vehicle (e.g., its control characteristics), environmental factors (e.g., road surface and curve radii), interactions with other road users (e.g., slowing down of a lead vehicle and crossing of pedestrians or bicycles), and human factors (e.g., choice of driving speeds, headway distances, and acceleration control). The longitudinal assessment in this study was conducted with the same participant, the same instrumented vehicle, and the same route. These experimental settings rule out differences in driver personality affecting capability, and in vehicle operational features or road traffic environments influencing task demands. The decline in physical and cognitive functions may lead to a decrease in the elderly driver's capability. Therefore, elderly drivers reduce task demands by adopting longer THW to a leading vehicle, and they seek to maintain capability higher than the reduced task demands.

The results of the fuzzy logic car-following model estimation suggest that the acceleration rate when the inter-vehicle distance is opening became higher after five years, although the deceleration rate while approaching the vehicle in front did not change. The stronger acceleration may be a compensating behavior for maintaining the driver's capability by temporarily increasing the task demand: because the driver's capability interacts with the task demands, drivers can control the task demands by changing their driving behavior in order to improve their capability (e.g., increasing speed to wake themselves up when feeling sleepy while driving).

Our findings imply that when a leading vehicle drives faster and the headway distance is opening while driving on multi-lane roads or while approaching a merging point, information or warnings about the movements of surrounding vehicles are helpful to elderly drivers, because they accelerate more strongly and the temporary task demand is higher in this situation.

### **5. Limitations**

The fuzzy logic car-following model deals mainly with two vehicles: a vehicle in front and the driver's own vehicle. When drivers approach an intersection with a traffic light under car-following conditions, they may pay more attention to the signal in front of the leading vehicle and manage their acceleration based on the traffic light. Drivers allocate their attention to the forward road structure instead of the leading vehicle when they approach a tight curve; thus, they may reduce their driving speed before entering the curve even if the headway distance is opening. The car-following behavior before intersections or tight curves can be influenced by environmental factors other than a lead vehicle. Further car-following models should be developed to reproduce the car-following behavior in these situations.

### **6. Conclusion**

This chapter describes the fuzzy logic car-following model, including a comparison with other car-following models. We introduce two case studies that investigate drivers' car-following behavior using the fuzzy logic car-following model. This model can determine the degree to which a driver controls longitudinal acceleration according to the relationship between the preceding vehicle and his/her vehicle. The fuzzy logic model evaluates the driver's acceleration and deceleration rates using a rule base in natural language. This model contributes to the interpretation of the difference in headway distances between Tsukuba and Southampton and of the changes in elderly drivers' headway distances with aging.

In the cross-cultural study, we compared car-following behavior gathered on roads where driving is on the left side of the road. Further research will compare car-following behavior between left-hand driving and right-hand driving (e.g., in the United States).

In the longitudinal study, we investigated the car-following behavior of small samples. The next step is to collect and analyze more elderly driver car-following behaviors to validate the findings of this study. Additionally, further study should be conducted to examine individual differences in car-following behaviors to clarify which cognitive function influences changes in car-following behavior with aging. We will assess the relationship between car-following behavior on a real road and elderly drivers' cognitive functions (e.g., attention, working memory, and planning (Kitajima & Toyota, 2012)) measured in a laboratory experiment. Analysis of the relationship between driving behavior and a driver's cognitive functions will help determine how driver support systems may assist driving behavior and detect the driver's cognitive functions based on natural driving behavior.

### **7. Acknowledgments**

The authors are grateful to Prof. M. McDonald of the University of Southampton and Prof. P. Zheng of Ningbo University for useful discussions on estimation methodologies and results of the fuzzy logic car-following model.

### **8. References**


Brackstone, M., McDonald, M., & Sultan, B. (1999). Dynamic behavioural data collection using an instrumented vehicle. *Transportation Research Record,* No.1689, pp. 9-17, ISSN 0361-1981

Brackstone, M. & McDonald, M. (1999). Car-following: a historical review. *Transportation Research Part F,* Vol.2, No.4, (December 1999), pp. 181-196, ISSN 1369-8478

Brackstone, M., Sultan, B., & McDonald, M. (2002). Motorway driver behaviour: studies on car following. *Transportation Research Part F,* Vol.5, No.1, (March 2002), pp. 31-46, ISSN 1369-8478

Brackstone, M., Waterson, B., & McDonald, M. (2009). Determinants of following headway in congested traffic. *Transportation Research Part F,* Vol.12, No.2, (March 2009), pp. 131-142, ISSN 1369-8478

Chandler, R.E., Herman, R., & Montroll, E.W. (1958). Traffic dynamics: studies in car following. *Operations Research,* Vol.6, No.2, (March 1958), pp. 165-184, ISSN 0030-364X

Fuller, R. (2005). Towards a general theory of driver behaviour. *Accident Analysis and Prevention,* Vol.37, No.3, (May 2005), pp. 461-472, ISSN 0001-4575

Gipps, P.G. (1981). A behavioural car following model for computer simulation. *Transportation Research Part B,* Vol.15, No.2, (April 1981), pp. 105-111, ISSN 0191-2615

Helly, W. (1959). Simulation of bottlenecks in single lane traffic flow. *Proceedings of the Symposium of Theory of Traffic Flow,* pp. 207-238, New York, USA, 1959

Iwashita, Y., Ishibashi, M., Miura, Y., & Yamamoto, M. (2011). Changes of driver behavior by rear-end collision prevention support system in poor visibility. *Proceedings of First International Symposium on Future Active Safety Technology toward zero-traffic-accident,* Tokyo, Japan, September 5-9, 2011

Kitajima, M. & Toyota, M. (2012). Simulating navigation behaviour based on the architecture model Model Human Processor with Real-Time Constraints (MHP/RT). *Behaviour & Information Technology,* Vol.31, No.1, (November 2011), pp. 41-58, ISSN 1362-3001

Kometani, E. & Sasaki, T. (1959). Dynamic behaviour of traffic with a nonlinear spacing-speed relationship. *Proceedings of the Symposium of Theory of Traffic Flow,* pp. 105-119, New York, USA, 1959

Mehmood, A., Saccomanno, F., & Hellinga, B. (2001). Evaluation of a car-following model using systems dynamics. *Proceedings of the 19th International Conference of the System Dynamics Society,* Atlanta, Georgia, USA, July 23-27, 2001

Michaels, R.M. (1963). Perceptual factors in car following. *Proceedings of the Second International Symposium on the Theory of Road Traffic Flow,* pp. 44-59, Paris, France, 1963

Owsley, C. (2004). Driver capabilities. *Transportation in an Aging Society: A Decade of Experience (Transportation Research Board Conference Proceedings 27),* pp. 44-55, ISSN 1027-1652, Washington, D.C., USA, 2004

Pipes, L.A. (1953). An operational analysis of traffic dynamics. *Journal of Applied Physics,* Vol.24, No.3, (March 1953), pp. 274-281, ISSN 0021-8979

Reiter, U. (1994). Empirical studies as basis for traffic flow models. *Proceedings of the Second International Symposium on Highway Capacity,* pp. 493-502, Sydney, Australia, 1994

Sato, T. & Akamatsu, M. (2007). Influence of traffic conditions on driver behavior before making a right turn at an intersection: analysis of driver behavior based on measured data on an actual road. *Transportation Research Part F,* Vol.10, No.5, (September 2007), pp. 397-413, ISSN 1369-8478

Sato, T., Akamatsu, M., Zheng, P., & McDonald, M. (2009a). Comparison of car-following behavior between four countries from the viewpoint of static and dynamic aspects. *Proceedings of 17th World Congress on Ergonomics (IEA 2009),* Beijing, China, August 9-14, 2009

Sato, T., Akamatsu, M., Zheng, P., & McDonald, M. (2009b). Comparison of car following behavior between UK and Japan. *Proceedings of ICROS-SICE International Joint Conference 2009,* pp. 4155-4160, Fukuoka, Japan, August 18-21, 2009

Sato, T. & Akamatsu, M. (2011). Longitudinal study of elderly driver's car-following behavior in actual road environments. *Proceedings of First International Symposium on Future Active Safety Technology toward zero-traffic-accident,* Tokyo, Japan, September 5-9, 2011

Sugeno, M. (1985). *Industrial Applications of Fuzzy Control,* Elsevier Science Inc., ISBN 0444878297, New York, USA

Wu, J., Brackstone, M., & McDonald, M. (2000). Fuzzy sets and systems for a motorway microscopic simulation model. *Fuzzy Sets and Systems,* Vol.116, No.1, (November 2000), pp. 65-76, ISSN 0165-0114

Wu, J., Brackstone, M., & McDonald, M. (2003). The validation of a microscopic simulation model: a methodological case study. *Transportation Research Part C,* Vol.11, No.6, (December 2003), pp. 463-479, ISSN 0968-090X

Zheng, P. (2003). *A Microscopic Simulation Model of Merging Operation at Motorway On-ramps,* PhD Thesis, University of Southampton, Southampton, UK

Zheng, P., McDonald, M., & Wu, J. (2006). Evaluation of collision warning-collision avoidance systems using empirical driving data. *Transportation Research Record,* No.1944, (2006), pp. 1-7, ISSN 0361-1981

### *Edited by Elmer P. Dadios*

Fuzzy Logic is becoming an essential method of solving problems in all domains. It has a tremendous impact on the design of autonomous intelligent systems. The purpose of this book is to introduce hybrid algorithms, techniques, and implementations of Fuzzy Logic. The book consists of thirteen chapters highlighting models and principles of fuzzy logic and issues in its techniques and implementations. The intended readers of this book are engineers, researchers, and graduate students interested in fuzzy logic systems.

Fuzzy Logic - Algorithms, Techniques and Implementations

Fuzzy Logic

Algorithms, Techniques and Implementations

*Edited by Elmer P. Dadios*

Photo by Nattakit / iStock