self-organization or a vector quantization technique to determine the initial assignment of parameters [11–15, 19]. In particular, learning methods combining vector quantization (VQ) with the steepest descent method (SDM) are known to be superior to other methods in the number of rules (parameters) [16, 19]. Why, then, is it effective to combine VQ with SDM in fuzzy modeling? First, consider combining SDM with methods other than VQ. (1) The generation method has a short learning time but low test accuracy, whereas the reduction method has high test accuracy but a long learning time [2]. (2) Methods using GA and PSO show high accuracy when the input dimension and the number of rules are small, but they are known to have a scalability problem [3]. (3) The SIRM and DIRM methods are excellent in scalability, but their learning accuracy is not always sufficient [9]. As described above, many methods are not necessarily effective models, because learning becomes difficult as the input dimension and the number of rules increase, and because of low accuracy. On the other hand, the method combining VQ with SDM can conduct SDM learning efficiently by suitably arranging the initial parameters of the fuzzy rules using VQ [1, 16]. However, since VQ is unsupervised learning, it easily reflects the input part of the learning data, but it is difficult to capture the output information during learning. Among such studies, the first learning method uses VQ only to determine the initial parameters of the antecedent part of the fuzzy rules from the input part of the learning data [1, 11–14]. The second method determines the same parameters using both the input and output parts of the learning data [15, 19]. The third method iterates the learning processes of VQ and SDM in the second method; Kishida and Pedrycz proposed methods based on this third approach [13, 15].

These methods determine only the antecedent parameters by VQ. Therefore, we introduced the generalized inverse matrix (GIM) to determine the initial assignment of the weight parameters of the consequent part of the fuzzy rules as a fourth method, and showed its effectiveness in a previous paper [16, 17]. In this paper, improved methods for the SDM learning process in learning methods using VQ, GIM, and SDM are introduced, and numerical simulations show that the proposed method is superior to other methods in the number of rules.

2. Preliminaries

#### 2.1. The conventional fuzzy inference model

The conventional fuzzy inference model using SDM is described [1]. Let Zj = {1, …, j} and Zj∗ = {0, 1, …, j}. Let R be the set of real numbers. Let x = (x1, …, xm) and y be input and output variables, respectively, where xj ∈ R for j ∈ Zm and y ∈ R. Then, the rule of the simplified fuzzy inference model is expressed as

$$R\_i \colon \text{if } x\_1 \text{ is } M\_{i1} \text{ and } \cdots \text{ and } x\_m \text{ is } M\_{im}, \text{ then } y \text{ is } w\_i \tag{1}$$

where i ∈ Zn is a rule number, j ∈ Zm is a variable number, Mij is a membership function of the antecedent part, and wi is the weight of the consequent part.

A membership value μ<sub>i</sub> of the antecedent part for input x is expressed as

$$\mu\_i = \prod\_{j=1}^{m} M\_{ij}(x\_j) \tag{2}$$

Then, the output y<sup>∗</sup> of the fuzzy inference method is obtained as

$$y^\* = \frac{\sum\_{i=1}^n \mu\_i \cdot w\_i}{\sum\_{i=1}^n \mu\_i} \tag{3}$$

If a Gaussian membership function is used, then Mij is expressed as

$$M\_{ij}(x\_j) = \exp\left(-\frac{1}{2} \left(\frac{x\_j - c\_{ij}}{b\_{ij}}\right)^2\right) \tag{4}$$

where cij and bij denote the center and the width values of Mij, respectively.
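To make Eqs. (2)–(4) concrete, the following sketch computes the inference output y<sup>∗</sup> with numpy; the rule centers, widths, and weights below are illustrative values chosen for the example, not parameters from the chapter:

```python
import numpy as np

def membership(x, c, b):
    """Gaussian membership values M_ij(x_j) for all rules i and inputs j, Eq. (4)."""
    # x: (m,) input vector; c, b: (n, m) centers and widths of the n rules
    return np.exp(-0.5 * ((x - c) / b) ** 2)

def infer(x, c, b, w):
    """Output y* of the simplified fuzzy inference model, Eqs. (2) and (3)."""
    mu = np.prod(membership(x, c, b), axis=1)  # mu_i, product over inputs, Eq. (2)
    return np.sum(mu * w) / np.sum(mu)         # weighted average of w_i, Eq. (3)

# toy setup with n = 2 rules and m = 2 inputs
c = np.array([[0.0, 0.0], [1.0, 1.0]])  # centers c_ij
b = np.array([[0.5, 0.5], [0.5, 0.5]])  # widths  b_ij
w = np.array([0.0, 1.0])                # consequent weights w_i
print(infer(np.array([0.0, 0.0]), c, b, w))  # close to w_1 = 0.0 (nearer rule dominates)
```

An input at a rule's center yields μ<sub>i</sub> = 1 for that rule, so the output is pulled toward that rule's weight.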

The objective function E is defined to evaluate the inference error between the desired output y<sup>r</sup> and the inference output y<sup>∗</sup>.

Let D = {(x<sub>1</sub><sup>p</sup>, …, x<sub>m</sub><sup>p</sup>, y<sub>p</sub><sup>r</sup>) | p ∈ Z<sub>P</sub>} and D<sup>∗</sup> = {(x<sub>1</sub><sup>p</sup>, …, x<sub>m</sub><sup>p</sup>) | p ∈ Z<sub>P</sub>} be the set of learning data and the set of input parts of D, respectively. The objective of learning is to minimize the following mean square error (MSE):

$$E = \frac{1}{P} \sum\_{p=1}^{P} \left( y\_p^\* - y\_p^r \right)^2 \tag{5}$$

where y<sub>p</sub><sup>∗</sup> and y<sub>p</sub><sup>r</sup> denote the inference and desired outputs for the pth input x<sup>p</sup>.
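As a small numerical check of Eq. (5), the inference and desired outputs below are hypothetical values, used only to illustrate the arithmetic:

```python
# MSE of Eq. (5) for illustrative output pairs
y_star = [0.9, 0.4, 0.1]  # hypothetical inference outputs y_p*
y_des  = [1.0, 0.5, 0.0]  # hypothetical desired outputs y_p^r
P = len(y_star)
E = sum((ys - yr) ** 2 for ys, yr in zip(y_star, y_des)) / P
print(E)  # each squared error is 0.01, so E ≈ 0.01 (up to floating-point rounding)
```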

In order to minimize the objective function E, each parameter α ∈ {c<sub>ij</sub>, b<sub>ij</sub>, w<sub>i</sub>} is updated based on SDM as α(t + 1) = α(t) − K<sub>α</sub> · ∂E/∂α, where the partial derivatives are given by the following relations:

$$\frac{\partial E}{\partial w\_i} = \frac{\mu\_i}{\sum\_{l=1}^n \mu\_l} \cdot (y^\* - y^r) \tag{6}$$

$$\frac{\partial E}{\partial c\_{ij}} = \frac{\mu\_i}{\sum\_{l=1}^n \mu\_l} \cdot (y^\* - y^r) \cdot (w\_i - y^\*) \cdot \frac{x\_j - c\_{ij}}{b\_{ij}^2} \tag{7}$$

$$\frac{\partial E}{\partial b\_{ij}} = \frac{\mu\_i}{\sum\_{l=1}^n \mu\_l} \cdot (y^\* - y^r) \cdot (w\_i - y^\*) \cdot \frac{\left(x\_j - c\_{ij}\right)^2}{b\_{ij}^3} \tag{8}$$

where t is the iteration time and K<sub>α</sub> is a learning constant [1].
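The update loop above can be sketched as follows; this is a minimal per-datum SDM sweep using the gradients of Eqs. (6)–(8), where the toy target y = x<sub>1</sub>, the rule count, and the learning constants K<sub>c</sub>, K<sub>b</sub>, K<sub>w</sub> are all illustrative choices, not values from the chapter:

```python
import numpy as np

def infer(x, c, b, w):
    """Inference output y* and membership values mu_i for input x, Eqs. (2)-(4)."""
    mu = np.prod(np.exp(-0.5 * ((x - c) / b) ** 2), axis=1)
    return np.sum(mu * w) / np.sum(mu), mu

def sdm_step(x, y_r, c, b, w, Kc=0.01, Kb=0.01, Kw=0.1):
    """One steepest-descent update for a single datum (x, y_r), Eqs. (6)-(8)."""
    y, mu = infer(x, c, b, w)
    g = mu / np.sum(mu) * (y - y_r)                       # common factor in Eqs. (6)-(8)
    dw = g                                                # Eq. (6)
    dc = (g * (w - y))[:, None] * (x - c) / b ** 2        # Eq. (7)
    db = (g * (w - y))[:, None] * (x - c) ** 2 / b ** 3   # Eq. (8)
    return c - Kc * dc, b - Kb * db, w - Kw * dw          # alpha(t+1) = alpha(t) - K*dE/dalpha

def mse(data, c, b, w):
    """Mean square error over the learning data, Eq. (5)."""
    return np.mean([(infer(x, c, b, w)[0] - y_r) ** 2 for x, y_r in data])

# toy learning data for the target y = x1 on a 3x3 grid, n = 4 rules, m = 2 inputs
rng = np.random.default_rng(0)
c = rng.uniform(0.0, 1.0, (4, 2))
b = np.full((4, 2), 0.5)
w = rng.uniform(0.0, 1.0, 4)
data = [(np.array([u, v]), u) for u in (0.0, 0.5, 1.0) for v in (0.0, 0.5, 1.0)]

mse0 = mse(data, c, b, w)
for _ in range(200):                 # repeated sweeps over the learning data
    for x, y_r in data:
        c, b, w = sdm_step(x, y_r, c, b, w)
mse1 = mse(data, c, b, w)
print(mse0, mse1)                    # the error should decrease after training
```

Note that Eqs. (6)–(8) are per-datum gradients, so the sketch updates the parameters after every training pair rather than once per full pass over D.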

The learning algorithm for the conventional fuzzy inference model is shown as follows:
