### 4. Numerical simulations

To compare the abilities of Learning Algorithms (a'), (b'), (c'), and (d') with those of Learning Algorithms (a), (b), (c), and (d), numerical simulations of function approximation and pattern classification are performed.

#### 4.1. Function approximation

Figure 3. Flowchart of Learning Algorithm D' corresponding to Figure 2(d').


Figure 4. The optimum values M<sup>∗</sup> and n<sup>∗</sup> for M and n.

The target systems are identified by fuzzy inference systems. This simulation uses four systems specified by the following functions (Eqs. (25)–(28)), each with the two-dimensional input space [0, 1]<sup>2</sup> and one output with the range [0, 1]:

$$y = \sin\left(\pi x\_1^3\right) x\_2\tag{25}$$

$$y = \frac{\sin\left(2\pi x\_1^3\right)\cos\left(\pi x\_2\right) + 1}{2} \tag{26}$$

$$y = \frac{1.9\left(\left(1.35 + \exp\left(x\_1\right)\right)\sin\left(13\left(x\_1 - 0.6\right)^2\right)\exp\left(-x\_2\right)\sin\left(7 x\_2\right)\right)}{2} \tag{27}$$

$$y = \frac{\sin\left(10\left(x\_1 - 0.5\right)^2 + 10\left(x\_2 - 0.5\right)^2\right) + 1}{2} \tag{28}$$
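For reference, the four target functions can be transcribed directly into code. The sketch below is only an illustrative NumPy transcription of Eqs. (25)–(28); the function names and vectorized inputs are assumptions, not part of the chapter.

```python
import numpy as np

# Illustrative transcription of Eqs. (25)-(28); x1 and x2 may be scalars
# or arrays with values in [0, 1].
def f25(x1, x2):
    return np.sin(np.pi * x1**3) * x2

def f26(x1, x2):
    return (np.sin(2 * np.pi * x1**3) * np.cos(np.pi * x2) + 1) / 2

def f27(x1, x2):
    return 1.9 * ((1.35 + np.exp(x1)) * np.sin(13 * (x1 - 0.6)**2)
                  * np.exp(-x2) * np.sin(7 * x2)) / 2

def f28(x1, x2):
    return (np.sin(10 * (x1 - 0.5)**2 + 10 * (x2 - 0.5)**2) + 1) / 2
```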

In this simulation, Tmax1 = 100000 and Tmax2 = 50000 for (a), and Tmax1 = 10000 and Tmax2 = 5000 for (b), (c), and (d); further, θ = 1.0 × 10<sup>−4</sup>, K0 = 100, Kmax = 190, K = 10, Kc = 0.01, Kb = 0.01, and Kw = 0.1 are used. The number of learning data is 200 and the number of test data is 2500.
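A minimal sketch of how the learning and test sets for each function might be prepared is shown below. The uniform random sampling of the 200 learning points, the 50 × 50 grid for the 2500 test points, the random seed, and the helper names are assumptions made for illustration; the chapter does not state how the points are drawn.

```python
import numpy as np

def make_data(f, n_learn=200, n_grid=50, seed=0):
    """Return (x_learn, y_learn, x_test, y_test) for one target function f."""
    rng = np.random.default_rng(seed)
    x_learn = rng.random((n_learn, 2))    # 200 learning points in [0, 1]^2 (assumed uniform)
    y_learn = f(x_learn[:, 0], x_learn[:, 1])
    g = np.linspace(0.0, 1.0, n_grid)     # 50 x 50 grid gives the 2500 test points (assumed layout)
    x1, x2 = np.meshgrid(g, g)
    x_test = np.column_stack([x1.ravel(), x2.ravel()])
    y_test = f(x_test[:, 0], x_test[:, 1])
    return x_learn, y_learn, x_test, y_test

def mse(y_pred, y_true):
    """Mean squared error of the inference output, compared against the threshold θ."""
    return float(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))
```

Learning for each function would then proceed, as described below, until the MSE on the learning data falls below the threshold θ or the iteration limits are reached.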

Table 1 shows the results of these simulations. In Table 1, the number of rules, the MSEs for the learning and test data, and the learning time (in seconds) are shown, where the number of rules means the number of rules at the point where the threshold θ of inference error is achieved in learning. Each result is the average over 20 trials. As shown in Table 1, the results of (a'), (b'), (c'), and (d') are almost the same as those of (a), (b), (c), and (d). It seems that there is no difference in ability for the regression problem.

#### 4.2. Classification problems for UCI database

Iris, Wine, Sonar, and BCW data from the UCI database, shown in Table 2, are used for the second numerical simulation [20]. In this simulation, fivefold cross-validation is used. As the initial conditions for the classification problems, Kc = 0.001, Kb = 0.001, Kw = 0.05, εinit = 0.1, εfin = 0.01, and λ = 0.7 are used. Further, Tmax = 50000, M = 100, and θ = 1.0 × 10<sup>−2</sup> are used for Iris and Wine; Tmax = 50000, M = 200, and θ = 2.0 × 10<sup>−2</sup> for BCW; and Tmax = 5000, M = 100, and θ = 5.0 × 10<sup>−2</sup> for Sonar.
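The fivefold cross-validation can be organized with a simple index split such as the sketch below; the shuffling, the seed, and the helper name are generic assumptions, since the chapter only states that fivefold cross-validation is used.

```python
import numpy as np

def five_fold_splits(n_data, seed=0):
    """Split indices 0..n_data-1 into 5 folds; each fold serves once as test data."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_data)
    folds = np.array_split(indices, 5)
    for k in range(5):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train_idx, test_idx

# Example: the Iris data set has 150 samples (see Table 2).
# for train_idx, test_idx in five_fold_splits(150):
#     ...train a fuzzy inference system on train_idx, evaluate RM on test_idx...
```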

| | Iris | Wine | BCW | Sonar |
|---|---|---|---|---|
| The number of data | 150 | 178 | 683 | 208 |
| The number of inputs | 4 | 13 | 9 | 60 |
| The number of classes | 3 | 3 | 2 | 2 |

Table 2. The dataset for pattern classification.

Table 3 shows the results of the classification problems. In Table 3, the number of rules and the RMs for the learning and test data are shown, where RM means the rate of misclassification. As a result, (a'), (b'), (c'), and (d') are superior to (a), (b), (c), and (d) in the number of rules, as shown in Table 3. It seems that there is a difference in ability for pattern classification.

| | | Iris | Wine | BCW | Sonar |
|---|---|---|---|---|---|
| (a) | The number of rules | 3.4 | 7.8 | 14.4 | 11.0 |
| | RM for learning (%) | 3.0 | 1.4 | 1.6 | 5.3 |
| | RM for test (%) | 3.3 | 10.3 | 4.3 | 20.6 |
| (b) | The number of rules | 2.0 | 20.8 | 26.0 | 3.7 |
| | RM for learning (%) | 3.3 | 13.6 | 2.2 | 5.1 |
| | RM for test (%) | 3.3 | 16.6 | 3.5 | 18.2 |
| (c) | The number of rules | 2.0 | 3.2 | 4.8 | 4.0 |
| | RM for learning (%) | 3.3 | 1.5 | 1.6 | 5.1 |
| | RM for test (%) | 4.0 | 6.7 | 3.8 | 19.0 |
| (d) | The number of rules | 3.7 | 2.5 | 2.5 | 4.0 |
| | RM for learning (%) | 3.3 | 1.1 | 1.3 | 5.1 |
| | RM for test (%) | 3.8 | 6.5 | 2.1 | 18.3 |
| (a') | The number of rules | 2.3 | 2.2 | 3.5 | 4.6 |
| | RM for learning (%) | 2.9 | 1.4 | 1.6 | 5.0 |
| | RM for test (%) | 3.5 | 8.5 | 3.9 | 20.0 |
| (b') | The number of rules | 2.0 | 2.0 | 2.1 | 3.7 |
| | RM for learning (%) | 3.9 | 3.0 | 2.1 | 5.0 |
| | RM for test (%) | 4.9 | 9.2 | 3.9 | 19.0 |
| (c') | The number of rules | 2.3 | 3.0 | 3.6 | 4.0 |
| | RM for learning (%) | 3.3 | 2.6 | 2.2 | 5.3 |
| | RM for test (%) | 4.0 | 7.2 | 3.5 | 19.4 |
| (d') | The number of rules | 2.3 | 2.0 | 2.4 | 3.3 |
| | RM for learning (%) | 3.0 | 1.8 | 2.2 | 5.0 |
| | RM for test (%) | 3.5 | 7.6 | 3.7 | 19.1 |

Table 3. The result for pattern classification.

| | | Eq. (25) | Eq. (26) | Eq. (27) | Eq. (28) |
|---|---|---|---|---|---|
| (a) | The number of rules | 8.3 | 22.5 | 52.4 | 6.1 |
| | MSE for learning (×10<sup>−4</sup>) | 0.47 | 0.35 | 0.65 | 0.41 |
| | MSE for test (×10<sup>−4</sup>) | 2.29 | 21.12 | 2.83 | 7.37 |
| (b) | The number of rules | 4.7 | 6.8 | 9.6 | 4.0 |
| | MSE for learning (×10<sup>−4</sup>) | 0.44 | 0.38 | 0.84 | 0.35 |
| | MSE for test (×10<sup>−4</sup>) | 0.70 | 2.96 | 2.34 | 0.48 |
| (c) | The number of rules | 5.4 | 7.4 | 11.1 | 3.5 |
| | MSE for learning (×10<sup>−4</sup>) | 0.24 | 0.54 | 0.65 | 0.33 |
| | MSE for test (×10<sup>−4</sup>) | 0.65 | 1.36 | 4.48 | 0.44 |
| (d) | The number of rules | 4.3 | 6.1 | 9.7 | 3.5 |
| | MSE for learning (×10<sup>−4</sup>) | 0.28 | 0.39 | 0.69 | 0.29 |
| | MSE for test (×10<sup>−4</sup>) | 0.57 | 1.93 | 1.78 | 0.36 |
| (a') | The number of rules | 5.0 | 8.9 | 11.8 | 4.7 |
| | MSE for learning (×10<sup>−4</sup>) | 0.37 | 0.41 | 0.52 | 0.45 |
| | MSE for test (×10<sup>−4</sup>) | 1.55 | 9.56 | 2.8 | 1.06 |
| (b') | The number of rules | 5.0 | 8.9 | 13.0 | 4.3 |
| | MSE for learning (×10<sup>−4</sup>) | 0.42 | 0.38 | 0.65 | 0.39 |
| | MSE for test (×10<sup>−4</sup>) | 1.41 | 9.66 | 4.12 | 2.38 |
| (c') | The number of rules | 5.7 | 8.0 | 13.1 | 4.1 |
| | MSE for learning (×10<sup>−4</sup>) | 0.40 | 0.23 | 0.57 | 0.35 |
| | MSE for test (×10<sup>−4</sup>) | 1.70 | 1.28 | 3.90 | 1.10 |
| (d') | The number of rules | 4.6 | 6.9 | 10.0 | 3.6 |
| | MSE for learning (×10<sup>−4</sup>) | 0.39 | 0.49 | 0.62 | 0.35 |
| | MSE for test (×10<sup>−4</sup>) | 1.43 | 2.58 | 1.89 | 0.42 |

Table 1. The results of simulations for function approximation.

Let us consider the reason why good results are obtained by using the probability pM(x). In the conventional learning methods, parameters are updated using data selected randomly from the set of learning data. In the proposed methods, parameters are updated using data selected according to the probability pM(x). The probability pM(x) is determined based on the output change for the learning data, so many fuzzy rules are likely to be generated at or near the places where the output change is large for the set of learning data.

For example, if the number of learning iterations is 100 and pM(x<sup>0</sup>) = 0.5, then the learning datum x<sup>0</sup> is selected 50 times on average during learning. As a result, membership functions are likely to be generated at or near the places where the output change is large for the set of learning data. The probability pM(x) is thus used as a method to improve the local search of SDM.
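The selection mechanism described in the last two paragraphs can be sketched as weighted sampling over the learning data. The helper name, the three-datum toy example, and the use of a NumPy generator are illustrative assumptions; how pM(x) itself is computed from the output change is not reproduced here.

```python
import numpy as np

def select_learning_data(p_M, n_iterations, seed=0):
    """Pick one learning datum per iteration with probability proportional to p_M.

    p_M holds the selection probability assigned to each learning datum (the
    probability pM(x) in the text); uniform values reduce this to the
    conventional random selection.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p_M, dtype=float)
    p = p / p.sum()                      # normalize so the probabilities sum to 1
    return rng.choice(len(p), size=n_iterations, p=p)

# Toy check of the example in the text: with 100 iterations and pM(x0) = 0.5,
# datum 0 is selected about 50 times on average.
p_M = [0.5, 0.25, 0.25]                  # hypothetical probabilities for three data
picks = select_learning_data(p_M, n_iterations=100)
print(np.bincount(picks, minlength=len(p_M)))   # roughly [50, 25, 25]
```

With equal probabilities this reduces to the conventional random selection, which makes the contrast with the proposed pM(x)-based selection explicit.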
