Consider a car shape optimization problem: the ANN is required to estimate the shape parameters needed to achieve a certain air resistance during car motion. Constraints on the shape parameters exist; however, no clear rule database exists relating the shape parameters to the desired performance.

The present work proposes a new approach that combines the advantages of fuzzy systems and ANNs through a simple modification of the ANN's activation calculations. The proposed approach yields weights that are readily interpretable as logically consistent fuzzy rules because it includes the "semantics" of both input and output variables in the learning/optimization process.

The rest of the chapter is organized as follows. Section II describes the proposed framework. Section III demonstrates its effectiveness through a case study. Section IV shows how it can be generalized to solve optimization problems; an illustrative example is given for this purpose. Finally, Section V concludes the chapter with a summary of the advantages of the proposed approach.

**2. The proposed approach**

Fig. 1 shows a typical feed-forward ANN with a single hidden layer of sigmoid neurons. Conventionally, the output of such an ANN is given by:

$$o\_k = \sum\_{j=1}^{N\_h} w\_{jk} \text{sig}\left(\sum\_{i=1}^{N\_i} w\_{hij} x\_i^p + b\_j\right) \tag{1}$$

where *whij*, *bj*, *wjk*, *xi<sup>p</sup>*, *Nh*, *Ni*, and *ok* are the weight of the connection between the *i*th input and the *j*th hidden neuron, the bias of the *j*th hidden neuron, the weight of the connection between the *j*th hidden neuron and the *k*th output neuron, the *i*th element of the *p*th input pattern, the number of hidden neurons, the number of inputs (elements in each input pattern), and the *k*th output, respectively. The number of output neurons is *No*.

Fig. 1. Architecture of a typical feed-forward ANN

It has been proved in [1] that the sigmoid response to a sum of inputs is equivalent to combining the sigmoid responses to the individual inputs using the fuzzy logic operator "ior" (interactive-or). The truth table of the "ior" operator is shown in Table 1; it can be readily generalized to an arbitrary number of inputs. Eq. (1) can thus be interpreted as the output of a fuzzy inference system in which the weight *wjk* is the action recommended by fuzzy rule *j*. However, this *wjk* does not contribute directly to the ANN output. Instead, its contribution to the output is weighted by the sigmoid term $\text{sig}\left(\sum\_{i=1}^{N\_i} w\_{hij} x\_i^p + b\_j\right)$.

The sigmoid term corresponds to the degree of firing of the rule, which judges to what extent rule *j* should participate in the ANN's final decision. Moreover, the inference is based on "ior" rather than the product/"and" fuzzy operator used in ANFIS. It is clear from the "ior" truth table that the operator lets a rule participate fully in the final decision if all of its inputs satisfy their corresponding constraints, or if some of them do while the others are neutral. On the other hand, it decides that the rule should not participate if one or more of the inputs violate their constraints while the others are neutral. In the case where some of the inputs completely satisfy their constraints while others completely violate them, the rule becomes neutral, contributing its recommended action with half weight to the final ANN output. The mathematical expression for "ior" is as follows [1]:

$$\text{ior}(a\_1, a\_2, \dots, a\_n) = \frac{a\_1 a\_2 \cdots a\_n}{(1 - a\_1)(1 - a\_2)\cdots(1 - a\_n) + a\_1 a\_2 \cdots a\_n} \tag{2}$$
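Eq. (2) can be sketched directly in Python. Note that at arguments such as (1, 0) the formula is indeterminate (0/0); following the truth table (Table 1), that case is taken as 0.5. This is a minimal illustrative sketch, not code from the chapter:

```python
# Interactive-or ("ior") of an arbitrary number of membership values, Eq. (2).
from functools import reduce

def ior(*values):
    num = reduce(lambda p, a: p * a, values, 1.0)                # a1*a2*...*an
    den = reduce(lambda p, a: p * (1.0 - a), values, 1.0) + num  # (1-a1)...(1-an) + a1...an
    # 0/0 arises when some inputs are fully true and others fully false;
    # the truth table assigns 0.5 (a neutral rule) to this case.
    return 0.5 if den == 0 else num / den

# Spot-check against Table 1
assert ior(0, 0) == 0
assert ior(1, 1) == 1
assert ior(0.5, 0.5) == 0.5
assert ior(1, 0) == 0.5   # indeterminate case, defined by the truth table
assert ior(0, 0.5) == 0
assert ior(1, 0.5) == 1
```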

In linguistic terms, an antecedent formed by "ior-ing" several conditions is equivalent to replacing the conventional phrase "*if* A & B & --- then" with "So long as none of the conditions A, B, … are violated --- then". Throughout the chapter we will use the mnemonic "SLANCV" as a shortcut for this phrase. Thus, Eq. (1) can be restated as a set of rules of the following format:

$$\text{SLANCV}\ x\_i^p > -\left(b\_j / N\_i\right) / w\_{hij}\ \text{then}\ o\_{jk} = w\_{jk}, \quad i = 1, 2, \dots, N\_i$$

Despite the successful deployment of "ior"-based rule extraction in several applications ([1], [6] and [7]), it has several disadvantages. For example, the weights and biases of a hidden neuron have no direct, clear logical interpretation. This makes the incorporation of available knowledge difficult, even though such knowledge is of great use in accelerating the ANN training procedure. Besides, leaving the weight and bias values unconstrained often leads to implausible rules (rules with impossible antecedents) that need pruning. Therefore, to overcome these disadvantages, our approach is to modify Eq. (1) as follows:

$$o\_k = \sum\_{j=1}^{N\_h} w\_{jk}^c \text{sig}\left(\sum\_{i=1}^{N\_i} w\_{hij}^c (x\_i^p - b\_{ij}^c)\right) \tag{3}$$

A Framework for Bridging the Gap Between Symbolic and Non-Symbolic AI 27
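The forward pass of Eq. (3) can be sketched in plain Python; the layer sizes and weight values below are hypothetical toy numbers, chosen only to show one crisp rule firing:

```python
import math

def sig(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def ann_output(x, w_h, b, w_o):
    """Eq. (3): o[k] = sum_j w_o[j][k] * sig( sum_i w_h[j][i] * (x[i] - b[j][i]) ).

    Each hidden neuron j acts as one fuzzy rule: the sigmoid is its degree
    of firing, and w_o[j][k] is its recommended action.
    """
    n_out = len(w_o[0])
    o = [0.0] * n_out
    for j in range(len(w_h)):
        z = sum(w_h[j][i] * (x[i] - b[j][i]) for i in range(len(x)))
        firing = sig(z)                      # degree of firing of rule j
        for k in range(n_out):
            o[k] += w_o[j][k] * firing       # weighted recommended action
    return o

# One rule with steep weights, roughly "SLANCV x0 > 0.5, x1 > 0.5 then o = 1.0"
w_h = [[10.0, 10.0]]
b   = [[0.5, 0.5]]
w_o = [[1.0]]
print(ann_output([1.0, 1.0], w_h, b, w_o))   # rule fires: output close to 1
print(ann_output([0.0, 0.0], w_h, b, w_o))   # rule silent: output close to 0
```

The per-input biases `b[j][i]` are what give each antecedent condition its own threshold, which is the key difference from Eq. (1).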


where *whij<sup>c</sup>* are the weights joining input *i* to hidden neuron *j*, and *wjk<sup>c</sup>* are the weights joining hidden neuron *j* to output neuron *k*.

The superscript '*c*' denotes that these weights are constrained. In general, a constrained variable *Par<sup>c</sup>* that directly appears in Eq. (3) is related to its corresponding free optimization variable *Par* by the following transformation:

$$Par^c = \left(Parmx - Parmn\right) \cdot \text{sig}\left(\frac{Par}{\max\left(Parmx, Parmn\right)}\right) + Parmn \tag{4}$$

where *Parmx*, *Parmn* are the maximum and minimum values of the parameter, respectively.

Comparing Eqs. (1) and (3), it is clear that our approach introduces two simple, yet effective, modifications:

 First, in Eq. (3), *whij* is taken as a common factor of the bracket containing the input and bias. Second, there is a bias corresponding to each input (*bij*). With these two modifications, Eq. (3) has a simple, direct fuzzy interpretation:

$$\text{SLANCV}\ x\_i^p > b\_{ij}^c\ (\text{if}\ w\_{hij}^c > 0),\ x\_i^p < b\_{ij}^c\ (\text{if}\ w\_{hij}^c < 0)\ \text{then}\ o\_{jk} = w\_{jk}^c, \text{ where } i = 1, 2, \dots, N\_i$$

This direct interpretation makes it easy for the designer to incorporate available knowledge through appropriate weight initialization, as will be made clear in the adopted case study.

 The weights and biases included in Eq. (3) are constrained according to limits defined by the system designer. This ensures that the deduced rules are logically sound and consistent.

Furthermore, the nature of a problem often poses constraints on the ANN output. Two approaches are possible to satisfy this requirement:

 Modifying Eq. (3) by replacing the sigmoid with a normalized sigmoid:

$$o\_k = \sum\_{j=1}^{N\_h} w\_{jk}^c \frac{\text{sig}\left(\sum\_{i=1}^{N\_i} w\_{hij}^c (x\_i^p - b\_{ij}^c)\right)}{\sum\_{m=1}^{N\_h} \text{sig}\left(\sum\_{i=1}^{N\_i} w\_{him}^c (x\_i^p - b\_{im}^c)\right)} \tag{5}$$

 Adding a penalty term to the objective function used in the ANN training so as to impose a maximum limit on its output.

In this research, we adopted the first approach.

To apply the proposed approach to a particular design problem, there are essentially three phases:

1. Initialization and knowledge incorporation: In this phase, the designer defines the number of rules (hidden neurons) and chooses suitable weight and bias constraints.
2. Training phase.
3. Rule Analysis and Post-Rule-Analysis Processing: The weights are interpreted as fuzzy rules. A suitable method is used to analyse the rules and improve the system performance based on the insight gained from this rule analysis.
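The bound-enforcing transformation of Eq. (4), and the way the weights of Eq. (3) read back as SLANCV rules, can be sketched as follows. The function and variable names are illustrative, and the sign-based reading of each condition follows the rule format given above:

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

def constrain(par, par_mn, par_mx):
    """Eq. (4): squash a free optimization variable into [par_mn, par_mx]."""
    return (par_mx - par_mn) * sig(par / max(par_mx, par_mn)) + par_mn

# Whatever value the optimizer proposes, the constrained parameter stays in range.
for free in (-100.0, -3.0, 0.0, 3.0, 100.0):
    assert -2.0 <= constrain(free, -2.0, 2.0) <= 2.0

def extract_rules(w_h, b, w_o):
    """Read the (constrained) weights of Eq. (3) back as SLANCV rules:
    a positive w_h[j][i] gives 'x_i > b_ij', a negative one 'x_i < b_ij'."""
    lines = []
    for j in range(len(w_h)):
        conds = [f"x{i} {'>' if w_h[j][i] > 0 else '<'} {b[j][i]:g}"
                 for i in range(len(w_h[j]))]
        lines.append(f"SLANCV {', '.join(conds)} then o = {w_o[j][0]:g}")
    return lines

print(extract_rules([[2.0, -1.5]], [[0.3, 0.7]], [[0.9]]))
# -> ['SLANCV x0 > 0.3, x1 < 0.7 then o = 0.9']
```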


26 Recurrent Neural Networks and Soft Computing


Table 1. Truth Table of the IOR Operator.

| First Input | Second Input | IOR Output |
|:-----------:|:------------:|:----------:|
| 0           | 0            | 0          |
| 0           | 0.5          | 0          |
| 0           | 1            | 0.5        |
| 0.5         | 0            | 0          |
| 0.5         | 0.5          | 0.5        |
| 0.5         | 1            | 1          |
| 1           | 0            | 0.5        |
| 1           | 0.5          | 1          |
| 1           | 1            | 1          |


Fig. 2. Parameters used to describe the path tracking problem.
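The robot kinematics simulator referred to below, whose role is to predict the robot location one time step ahead from the current control inputs, can be sketched with the standard unicycle (differential-drive) model. This is a minimal Euler-integration sketch under that standard model, not the authors' implementation:

```python
import math

def simulate_step(x, y, theta, v, omega, dt=0.05):
    """One Euler step of the unicycle model:
    x' = v*cos(theta),  y' = v*sin(theta),  theta' = omega."""
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

# Drive straight along x for 1 s (20 steps of 0.05 s) at unit speed.
state = (0.0, 0.0, 0.0)
for _ in range(20):
    state = simulate_step(*state, v=1.0, omega=0.0)
print(state)  # approximately (1.0, 0.0, 0.0)
```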


$$v\_{ref}(t) = \sqrt{\dot{x}^2(t) + \dot{y}^2(t)}, \qquad \omega\_{ref}(t) = \frac{d}{dt}\sin^{-1}\left(\frac{\dot{y}(t)}{v\_{ref}(t)}\right) \tag{9}$$

Applying the control inputs (*vref* and *ωref*) (or, equivalently, (*vr* and *vl*)) to the robot would enable it, in the absence of noise and other types of inaccuracies, to follow the required path. This open-loop design will be called the *"direct forcing case"*.

However, to assist the ANN in learning the concept of a path (not an instance of a path), as well as to make it robust against disturbances, we need to find a closed-loop control law. This is not straightforward because the kinematics model is nonlinear. In what follows, our objective is to show how the proposed ANN-based framework can provide reliable closed-loop control, of the form shown in Fig. (3), compared to the direct forcing case. In Fig. (3), the role of the robot kinematics simulator is to predict the robot location at the current time step given its current control inputs.

For the purpose of illustration, we will restrict our case study to the following family of paths:

$$x(t) = bt,\ 0 < b < 1 \quad \text{and} \quad y(t) = ct^3,\ -1 < c < 1$$

where *t* is the time vector = [0 (start time) : 0.05 (time step) : 1 (final time)].

The ANN is trained on 11 randomly chosen members of this family and tested on 11 different members of the same family. To demonstrate the robustness of the proposed approach, an additive disturbance (of uniform distribution) is added to both *v* and *ω*. The value of the disturbance can reach up to 200% of the *v* value and 100% of the *ω* value.

**3.3 Choice of the ANN's inputs and outputs**

Several possible input-output choices exist. The first has been reported in [11]: the time is the input and the speeds are the outputs. An alternative choice is to consider the coordinates (x, y) of each point on the path as inputs and the corresponding actions (speeds) as outputs. The third choice is to input the path as a whole as a single input vector and the corresponding sequence of actions (speeds) as a single output vector. All these choices share two fundamental disadvantages. First, it is impossible to interpret the trained weights as fuzzy rules. Furthermore, the ANN does not learn the "concept" of path tracking in general; instead, it learns to track a single path only. In addition, the first and second choices do not explicitly capture the relation between consecutive path points. To overcome these limitations, we investigated different combinations of ANN inputs and outputs. Only the two input-output combinations that produced the best results are discussed:

i. Case "A":

The inputs to the ANN are chosen to be: