**3. The self-constructing fuzzy CMAC**

This section provides a brief review and discussion, from architecture to learning algorithm, of the self-constructing FCMAC (SC-FCMAC; Lee et al., 2007a).

#### **3.1 Architecture of the SC-FCMAC model**

As illustrated in Fig. 2, the SC-FCMAC model (Lee et al., 2007a) consists of the input space partition, association memory selection, and defuzzification. Similar to the traditional CMAC model, the SC-FCMAC model approximates a nonlinear function $y = f(x)$ by applying the following two primary mappings:

$$S: X \Rightarrow A \tag{1}$$

$$P \colon A \Rightarrow D \tag{2}$$

where *X* is an *s*-dimensional input space, *A* is an $N_A$-dimensional association space, and *D* is a one-dimensional (1-D) output space. These two mappings are realized by using fuzzy operations. The function $S(x)$ maps each point *x* in the input space onto an association vector $\alpha = S(x) \in A$ that has $N_L$ nonzero elements ($N_L < N_A$). Here, $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_{N_A})$, where $0 \le \alpha_j \le 1$ for all components in $\alpha$, is derived from the composition of the receptive field functions and sensory inputs. Different from the traditional CMAC model, several hypercubes are addressed by the input state *x*. The hypercube values are calculated by a product operation through the strengths of the receptive field functions for each input state.

In the SC-FCMAC model, we use the Gaussian basis function as the receptive field function and the fuzzy weight function for learning. Some learned information is stored in the fuzzy weight vector. The 1-D Gaussian basis function can be given as follows:

$$\mu(x) = e^{-\left((x - m)/\sigma\right)^2} \tag{3}$$

where *x* represents the specific input state, *m* represents the corresponding center, and $\sigma$ represents the corresponding variance.

Let us consider an $N_D$-dimensional problem. A Gaussian basis function with $N_D$ dimensions is given as follows:

$$\alpha_j = \prod_{i=1}^{N_D} e^{-\left((x_i - m_{ij})/\sigma_{ij}\right)^2} \tag{4}$$

where $\prod$ represents the product operation, $\alpha_j$ represents the *j*-th element of the association memory selection vector, $x_i$ represents the input value of the *i*-th dimension for a specific input state *x*, $m_{ij}$ represents the center of the receptive field functions, $\sigma_{ij}$ represents the variance of the receptive field functions, and $N_D$ represents the number of the receptive field functions for each input state. The function $P(\alpha)$ computes a scalar output *y* by projecting the association memory selection vector onto a vector of adjustable fuzzy weights. Each fuzzy weight is inferred to produce a partial fuzzy output using the value of its corresponding association memory selection vector as the input matching degree. The fuzzy weight is considered here so that the partial fuzzy output is defuzzified into a scalar output using standard volume-based centroid defuzzification (Kosko, 1997; Paul & Kumar, 2002). The term volume is used in a general sense to include multi-dimensional functions; for 2-D functions, the volume reduces to the area. If $v_j$ is the volume of the consequent set and $\zeta_j$ is the weight of the scale, then the general expression for defuzzification is

$$y = \frac{\sum_{j=1}^{N_L} \alpha_j w_j^m v_j \zeta_j}{\sum_{j=1}^{N_L} \alpha_j v_j \zeta_j} \tag{5}$$

where $w_j^m$ is the mean value of the fuzzy weights and $N_L$ is the number of hypercube cells. The volume $v_j$ in this case is simply the area of the consequent weights, which are represented by Gaussian fuzzy sets. Therefore, $v_j = w_j^\sigma$, where $w_j^\sigma$ represents the variance of the fuzzy weights. If the weight $\zeta_j$ is considered to be one, as in this work, then the actual output *y* is derived as

$$y = \frac{\sum_{j=1}^{N_L} \alpha_j w_j^m w_j^\sigma}{\sum_{j=1}^{N_L} \alpha_j w_j^\sigma} \tag{6}$$

Fig. 2. Structure of the SC-FCMAC model (Lee et al., 2007a)
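To make the two mappings concrete, the following sketch evaluates Eqs. (3)-(6) for a single input vector. It is a minimal NumPy reading of the equations above, not code from Lee et al. (2007a); the function name, array shapes, and toy dimensions are our own choices.

```python
import numpy as np

def sc_fcmac_forward(x, m, sigma, w_m, w_s):
    """Forward pass of the SC-FCMAC as described by Eqs. (3)-(6).

    x     : input vector, shape (N_D,)
    m     : receptive-field centers, shape (N_L, N_D)
    sigma : receptive-field widths,  shape (N_L, N_D)
    w_m   : means of the fuzzy weights,     shape (N_L,)
    w_s   : variances of the fuzzy weights, shape (N_L,)
    """
    # Eq. (4): firing strength of each hypercube cell as a product of
    # the 1-D Gaussian receptive fields of Eq. (3).
    alpha = np.exp(-(((x - m) / sigma) ** 2)).prod(axis=1)
    # Eq. (6): volume-based centroid defuzzification with v_j = w_j^sigma
    # and zeta_j = 1.
    return (alpha * w_m * w_s).sum() / (alpha * w_s).sum()

# Toy usage: two inputs, three hypercube cells.
rng = np.random.default_rng(0)
m = rng.uniform(-1, 1, size=(3, 2))
sigma = np.full((3, 2), 0.5)
w_m, w_s = rng.normal(size=3), np.abs(rng.normal(size=3)) + 0.1
print(sc_fcmac_forward(np.array([0.2, -0.4]), m, sigma, w_m, w_s))
```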

#### **3.2 Learning Algorithm of the SC-FCMAC**

In this section, the self-constructing learning algorithm that completes the SC-FCMAC model (Lee et al., 2007a) is reviewed; it consists of an input space partition scheme (i.e., scheme 1) and a parameter-learning scheme (i.e., scheme 2). First, the input space partition scheme is used to determine proper input space partitioning and to find the mean and the width of each receptive field function. This scheme is based on the self-clustering method (SCM) to appropriately determine the various distributions of the input training data. Second, the parameter-learning scheme is based on the gradient descent learning algorithm. To minimize a given cost function, the receptive field functions and the fuzzy weights are adjusted using the back-propagation algorithm. According to the requirements of the system, these parameters are given proper values to represent the memory information. For the initial system, the values of the tuning parameters $w_j^m$ and $w_j^\sigma$ of the fuzzy weights are generated randomly, and the *m* and $\sigma$ of the receptive field functions are generated by the SCM clustering method.

#### **3.2.1 The input space partition scheme**

The receptive field functions map input patterns into new features. Hence, the discriminative ability of these new features is determined by the centers of the receptive field functions. To achieve good classification, centers are best selected based on their ability to provide large class separation.

An input space partition scheme, called the SCM, is used to implement scatter partitioning of the input space. Without any optimization, the proposed SCM is a fast, one-pass algorithm for a dynamic estimation of the number of hypercube cells in a set of data, and for finding the current centers of hypercube cells in the input data space. It is a distance-based connectionist-clustering algorithm. In any hypercube cell, the maximum distance between an example point and the hypercube cell center is less than a threshold value, which has been set as a clustering parameter and which would affect the number of hypercube cells to be estimated.

In the clustering process, the data examples come from a data stream, and the process starts with an empty set of hypercube cells. When a new hypercube cell is created, the hypercube cell center, *C*, is defined, and its hypercube cell distance and hypercube cell width, *Dc* and *Wd*, respectively, are initially set to zero. When more samples are presented one after another, some created hypercube cells will be updated by changing the positions of their centers and increasing the hypercube cell distances and hypercube cell width. Which hypercube cell will be updated and how much it will be changed depends on the position of the current example in the input space. A hypercube cell will not be updated any more when its hypercube cell distance, *Dc*, reaches the value that is equal to the threshold value *Dthr*.

Figure 3 shows a brief clustering process using the SCM in a two-input space. The detailed clustering process can be found in (Lee et al., 2007a).

Fig. 3. A brief clustering process using the SCM with samples *P1* to *P9* in a 2-D space. (Notations: *Pi* for pattern, *Cj* for hypercube cell center, *Dcj* is hypercube cell distance, *Wdj\_x* represents *x-*dimensions hypercube cell width, and *Wdj\_y* stands for *y-*dimensions hypercube cell width) (a) The example *P1* causes the SCM to create a new hypercube cell center *C1*. (b) *P2*: update hypercube cell center *C1*, *P3*: create a new hypercube cell center *C2*, *P4*: do nothing. (c) *P5*: update hypercube cell *C1*, *P6*: do nothing, *P7*: update hypercube cell center *C2*, *P8*: create a new hypercube cell *C3*. (d) *P9*: update hypercube cell *C1*.

In this way, the maximum distance from any hypercube cell center to the examples that belong to this hypercube cell is not greater than the threshold value *Dthr*, though the algorithm does not keep any information on passed examples. The center and the jump positions of the receptive field functions are then defined by the following equation:

$$m_j = C_j, \quad j = 1, 2, \ldots, n, \tag{7}$$

$$\theta_j^r = \frac{1}{(n_s + 1)/2} \cdot r \cdot D_j, \tag{8}$$

where $j = 1, 2, \ldots, n$, and $r = 1, 2, \ldots, n_s$.
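The chapter defers the detailed SCM update rules to Lee et al. (2007a), so the sketch below reproduces only the behaviour described here and in Fig. 3: an example either creates a new hypercube cell, updates the nearest cell whose cell distance has not yet reached $D_{thr}$, or does nothing. The specific center and distance update formulas are illustrative assumptions, not the published algorithm.

```python
import numpy as np

def scm(data, d_thr):
    """One-pass, distance-based clustering in the spirit of the SCM.

    Returns the hypercube cell centers C_j and cell distances Dc_j.
    The exact update rules below are our own simplifications.
    """
    centers, dc = [], []          # cell centers C_j and cell distances Dc_j
    for x in data:
        if not centers:
            centers.append(x.copy()); dc.append(0.0)      # first cell
            continue
        d = [np.linalg.norm(x - c) for c in centers]
        j = int(np.argmin(d))
        if d[j] <= dc[j]:
            continue                                      # inside a cell: do nothing
        if d[j] > d_thr:
            centers.append(x.copy()); dc.append(0.0)      # create a new cell
        elif dc[j] < d_thr:
            # update: move the center toward the example and grow Dc,
            # capping Dc at d_thr (after which the cell is frozen)
            centers[j] = (centers[j] + x) / 2.0
            dc[j] = min(d[j], d_thr)
    return np.array(centers), np.array(dc)

pts = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0], [1.9, 2.2]])
print(scm(pts, d_thr=1.0))   # two cells for this toy data
```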


The threshold parameter *Dthr* is an important parameter in the input space partition scheme. A low threshold value leads to the learning of fine clusters (such that many hypercube cells are generated), whereas a high threshold value leads to the learning of coarse clusters (such that fewer hypercube cells are generated). Therefore, the selection of the threshold value *Dthr* critically affects the simulation results, and the threshold value is determined by practical experimentation or trial-and-error tests.

#### **3.2.2 The parameter-learning scheme**

In the parameter-learning scheme, there are four adjustable parameters ($m_{ij}$, $\sigma_{ij}$, $w_j^m$, and $w_j^\sigma$) that need to be tuned. The parameter-learning algorithm of the SC-FCMAC model uses the supervised gradient descent method to modify these parameters. When we consider the single output case for clarity, our goal is to minimize the cost function *E*, defined as follows:

$$E = \frac{1}{2} \left( y^d(t) - y(t) \right)^2,\tag{9}$$

where $y^d(t)$ denotes the desired output at time *t* and $y(t)$ denotes the actual output at time *t*. The parameter-learning algorithm, based on back-propagation, is defined as follows.

The fuzzy weight cells are updated according to the following equations:

$$w_j^m(t+1) = w_j^m(t) + \Delta w_j^m \tag{10}$$

$$w\_j^{\sigma}(t+1) = w\_j^{\sigma}(t) + \Delta w\_j^{\sigma} \tag{11}$$

where *j* denotes the *j*-th fuzzy weight cell for $j = 1, 2, \ldots, N_L$, $w_j^m$ the mean of the fuzzy weights, and $w_j^\sigma$ the variance of the fuzzy weights. The elements of the fuzzy weights are updated by the amount

$$
\Delta w\_j^m = \eta \cdot e \cdot \frac{\partial y}{\partial w\_j^m} = \eta \cdot e \cdot \frac{\alpha\_j w\_j^\sigma}{\sum\_{j=1}^{N\_L} \alpha\_j w\_j^\sigma} \tag{12}
$$


$$\Delta w_j^\sigma = \eta \cdot e \cdot \frac{\partial y}{\partial w_j^\sigma} = \eta \cdot e \cdot \frac{\alpha_j w_j^m \sum_{j=1}^{N_L} \alpha_j w_j^\sigma - \alpha_j \sum_{j=1}^{N_L} \alpha_j w_j^m w_j^\sigma}{\left(\sum_{j=1}^{N_L} \alpha_j w_j^\sigma\right)^2} \tag{13}$$

where *η* is the learning rate of the mean and the variance for the fuzzy weight functions, between 0 and 1, and *e* is the error between the desired output and the actual output, $e = y^d - y$.

The receptive field functions are updated according to the following equations:

$$m_{ij}(t+1) = m_{ij}(t) + \Delta m_{ij} \tag{14}$$

$$
\sigma\_{ij}(t+1) = \sigma\_{ij}(t) + \Delta\sigma\_{ij} \tag{15}
$$

where *i* denotes the *i*-th input dimension for $i = 1, 2, \ldots, n$, $m_{ij}$ denotes the mean of the receptive field functions, and $\sigma_{ij}$ denotes the variance of the receptive field functions. The parameters of the receptive field functions are updated by the amount

$$\Delta m_{ij} = \eta \cdot e \cdot \frac{\partial y}{\partial \alpha_j} \cdot \frac{\partial \alpha_j}{\partial m_{ij}} = \eta \cdot e \cdot \frac{w_j^m w_j^\sigma \sum_{j=1}^{N_L} \alpha_j w_j^\sigma - w_j^\sigma \sum_{j=1}^{N_L} \alpha_j w_j^m w_j^\sigma}{\left(\sum_{j=1}^{N_L} \alpha_j w_j^\sigma\right)^2} \cdot \alpha_j \cdot \frac{2(x_i - m_{ij})}{\sigma_{ij}^2} \tag{16}$$

$$\Delta \sigma_{ij} = \eta \cdot e \cdot \frac{\partial y}{\partial \alpha_j} \cdot \frac{\partial \alpha_j}{\partial \sigma_{ij}} = \eta \cdot e \cdot \frac{w_j^m w_j^\sigma \sum_{j=1}^{N_L} \alpha_j w_j^\sigma - w_j^\sigma \sum_{j=1}^{N_L} \alpha_j w_j^m w_j^\sigma}{\left(\sum_{j=1}^{N_L} \alpha_j w_j^\sigma\right)^2} \cdot \alpha_j \cdot \frac{2(x_i - m_{ij})^2}{\sigma_{ij}^3} \tag{17}$$

where *η* is the learning rate of the mean and the variance for the receptive field functions.
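Putting Eqs. (9)-(17) together, one training step can be sketched as follows. This is our own minimal transcription (using the corrected form of Eq. (13)), with the same array shapes as in the forward-pass sketch of Section 3.1; it is not the authors' implementation.

```python
import numpy as np

def sc_fcmac_step(x, y_d, m, sigma, w_m, w_s, eta=0.1):
    """One back-propagation step for the SC-FCMAC, following Eqs. (9)-(17).

    m, sigma are (N_L, N_D); w_m, w_s are (N_L,).
    Updates are applied in place and the current output y is returned.
    """
    alpha = np.exp(-(((x - m) / sigma) ** 2)).prod(axis=1)   # Eq. (4)
    num, den = (alpha * w_m * w_s).sum(), (alpha * w_s).sum()
    y = num / den                                            # Eq. (6)
    e = y_d - y                                              # error in Eq. (9)

    # Eqs. (12)-(13): gradients for the fuzzy weights.
    dy_dwm = alpha * w_s / den
    dy_dws = (alpha * w_m * den - alpha * num) / den ** 2
    # Eqs. (16)-(17) share dy/dalpha_j:
    dy_da = (w_m * w_s * den - w_s * num) / den ** 2
    dm = eta * e * dy_da[:, None] * alpha[:, None] * 2 * (x - m) / sigma ** 2
    ds = eta * e * dy_da[:, None] * alpha[:, None] * 2 * (x - m) ** 2 / sigma ** 3

    w_m += eta * e * dy_dwm                                  # Eqs. (10), (12)
    w_s += eta * e * dy_dws                                  # Eqs. (11), (13)
    m += dm; sigma += ds                                     # Eqs. (14)-(17)
    return y
```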

#### **3.3 An example: Learning chaotic behaviors**

A nonlinear system *y*(*t*) with chaotic behaviors (Wang, 1994) is defined by the following equations, i.e.,

$$\dot{x}_1(t) = -x_1(t)\,x_2^2(t) + 0.999 + 0.42\cos(1.75t) \tag{18}$$

$$\dot{x}_2(t) = x_1(t)\,x_2^2(t) - x_2(t) \tag{19}$$

$$y(t) = \sin\left(\mathbf{x}\_1(t) + \mathbf{x}\_2(t)\right) \tag{20}$$

 


We solved the differential Eqs. (18) and (19) with *t* from *t*=0 to *t*=20 and with *x*1(0)=1.0 and *x*2(0)=1.0. We obtained 107 values of *x*1(*t*) and *x*2(*t*) (the chaotic glycolytic oscillator, Wang, 1994) and 107 values of *y*(*t*). Figure 4 shows *y*(*t*), which is the desired function to be learned by the SC-FCMAC model.

The input data were $x_1(t_p)$ and $x_2(t_p)$, and the output data were $y(t_p)$, for $p = 1, 2, \ldots, 107$. For this chaotic problem, the initial parameters *η*=0.1 and *Dthr*=1.3 were chosen. First, using the SCM clustering method, we obtained three hypercube cells. The learning scheme then entered parameter learning using the back-propagation algorithm. The parameter training process continued for 200 epochs, and the final trained rms (root mean square) error was 0.000474. The number of training epochs is determined by practical experimentation or trial-and-error tests.
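A sketch of the data generation follows. The chapter does not state the integrator or the step size, so the fixed-step fourth-order Runge-Kutta scheme and the uniform grid of 107 samples over $t \in [0, 20]$ are our assumptions.

```python
import numpy as np

def rhs(t, s):
    """Right-hand side of Eqs. (18)-(19)."""
    x1, x2 = s
    return np.array([-x1 * x2**2 + 0.999 + 0.42 * np.cos(1.75 * t),
                     x1 * x2**2 - x2])

def rk4(f, s0, t0, t1, n):
    """Fixed-step RK4 integration (the integrator choice is ours)."""
    t, h = np.linspace(t0, t1, n), (t1 - t0) / (n - 1)
    s = np.empty((n, len(s0))); s[0] = s0
    for k in range(n - 1):
        k1 = f(t[k], s[k])
        k2 = f(t[k] + h / 2, s[k] + h / 2 * k1)
        k3 = f(t[k] + h / 2, s[k] + h / 2 * k2)
        k4 = f(t[k] + h, s[k] + h * k3)
        s[k + 1] = s[k] + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return t, s

t, s = rk4(rhs, np.array([1.0, 1.0]), 0.0, 20.0, 107)
y = np.sin(s[:, 0] + s[:, 1])    # Eq. (20): the target to be learned
```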

We compared the SC-FCMAC model with other models (Lin et al., 2004; Lin et al., 2001). Figure 5(a) shows the learning curves of the SC-FCMAC model, the FCMAC model (Lin et al., 2004), and the SCFNN model (Lin et al., 2001). As shown in this figure, the learning curve that resulted from our method has a lower rms error. Trajectories of the desired output *y*(*t*) and the SC-FCMAC model's output are shown in Figures 5(b)-5(d). A comparison analysis of the SC-FCMAC model, the FCMAC model (Lin et al., 2004), and the SCFNN model (Lin et al., 2001) is presented in Table 1. It can be concluded that the proposed model obtains better results than some of the other existing models (Lin et al., 2004; Lin et al., 2001).

Fig. 4. The nonlinear system $y(t) = \sin(x_1(t) + x_2(t))$, defined by Eqs. (18)-(20).


Fig. 5. Simulation results for learning chaotic behaviors. (a) Learning curves of the SC-FCMAC model, the FCMAC model (Lin et al., 2004), and the SCFNN model (Lin et al., 2001). (b) The desired output *y*(*t*) and the SC-FCMAC model's output for time *t* dimension. (c) The desired output *y*(*t*) and the SC-FCMAC model's output for *x*1(*t*) dimension. (d) The desired output *y*(*t*) and the SC-FCMAC model's output for *x*2(*t*) dimension.

| Items | SC-FCMAC | FCMAC (Lin et al., 2004) | SCFNN (Lin et al., 2001) |
|-------|----------|--------------------------|---------------------------|
| Training Steps | 200 | 200 | 200 |
| Parameters | 18 | 18 | 20 |
| Hypercube Cells | 3 | 4 | N/A |
| RMS errors | 0.000474 | 0.000885 | 0.000908 |

Table 1. Comparisons of the SC-FCMAC model with some existing models for dynamic system identification

## **4. The parametric fuzzy CMAC (P-FCMAC)**

In this section, the architecture and learning algorithm of the parametric FCMAC (P-FCMAC; Lin & Lee, 2009) are reviewed. The model is mainly derived from the traditional CMAC and the Takagi-Sugeno-Kang (TSK) parametric fuzzy inference system (Sugeno & Kang, 1988; Takagi & Sugeno, 1985). Since the SCM is inherent in the input-space partition scheme of the P-FCMAC model, the performance of the P-FCMAC is expected to be better than that of the SC-FCMAC. Therefore, another system-identification problem is taken in order to explore the benefit of the P-FCMAC fully and fairly.

#### **4.1 Architecture of the P-FCMAC model**

As illustrated in Fig. 6, the P-FCMAC model consists of the input space partition, association memory selection, and defuzzification. Like the conventional CMAC network, the P-FCMAC network approximates a nonlinear function *y*=*f*(*x*) by using two primary mappings, $S(x)$ and $P(\alpha)$. These two mappings are realized by fuzzy operations. The function $S(x)$ also maps each point *x* in the input space onto an association vector $\alpha = S(x) \in A$ that has $N_L$ nonzero elements ($N_L < N_A$). Different from the conventional CMAC network, the association vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_{N_A})$, where $0 \le \alpha_j \le 1$ for all components in $\alpha$, is derived from the composition of the receptive field functions and sensory inputs. In addition, several hypercubes are addressed by the input state *x*; the hypercube values are calculated by a product operation through the strengths of the receptive field functions for each input state. In the P-FCMAC network, we use the Gaussian basis function as the receptive field function and a linear parametric equation of the network input variables as the TSK-type output for learning. Some learned information is stored in the receptive field functions and TSK-type output vectors. A one-dimensional Gaussian basis function is given as defined in Eq. (3). Similar to Section 3.1, if an $N_D$-dimensional problem is considered, a Gaussian basis function with $N_D$ dimensions is expressed as defined in Eq. (4).

Each element of the receptive field functions is inferred to produce a partial fuzzy output by applying the value of its corresponding association vector as the input matching degree. The partial fuzzy output is defuzzified into a scalar output *y* by the centroid of area (COA) approach. The actual output *y* is then derived as

$$y = \frac{\sum_{j=1}^{N_L} \alpha_j \left(a_{0j} + \sum_{i=1}^{N_D} a_{ij} x_i\right)}{\sum_{j=1}^{N_L} \alpha_j} \tag{21}$$

The *j*-th element of the TSK-type output vectors is described as


$$a\_{0j} + \sum\_{i=1}^{N\_D} a\_{ij} x\_i \tag{22}$$

where $a_{0j}$ and $a_{ij}$ denote the scalar values, $N_D$ the number of input dimensions, $N_L$ the number of hypercube cells, and $x_i$ the *i*-th input dimension. Based on the above structure, a learning algorithm will be proposed to determine the proper network structure and its adjustable parameters.

Fig. 6. Structure of the P-FCMAC model (Lin & Lee, 2009)
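A minimal sketch of the P-FCMAC forward pass, Eqs. (21)-(22), reusing the Gaussian receptive fields of Eq. (4); the function name and array shapes are our own choices, not code from Lin & Lee (2009).

```python
import numpy as np

def p_fcmac_forward(x, m, sigma, a0, a):
    """P-FCMAC output, Eqs. (21)-(22).

    x        : input, shape (N_D,)
    m, sigma : receptive-field centers/widths, shape (N_L, N_D)
    a0       : TSK constant terms, shape (N_L,)
    a        : TSK coefficients,   shape (N_L, N_D)
    """
    alpha = np.exp(-(((x - m) / sigma) ** 2)).prod(axis=1)  # Eq. (4)
    tsk = a0 + a @ x                                        # Eq. (22)
    return (alpha * tsk).sum() / alpha.sum()                # Eq. (21)
```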

#### **4.2 Learning algorithm of the P-FCMAC model**

Similar to the SC-FCMAC model, the P-FCMAC's learning algorithm consists of an input space partition scheme and a parameter-learning scheme. Since the same SCM method is applied for the input space partition, the following paragraphs focus mainly on the parameter-learning scheme of the P-FCMAC, which is shown in Figure 7.

First, the input space partition scheme (i.e., scheme 1) is used to determine proper input space partitioning and to find the mean and the width of each receptive field function. The input space partition is based on the SCM to appropriately determine the various distributions of the input training data. After the SCM, the number of hypercube cells is determined.

That is, we can obtain the initial *m* and *σ* of the receptive field functions by using the SCM. Second, the parameter-learning scheme (i.e., scheme 2) is based on supervised learning algorithms. The gradient descent learning algorithm is used to adjust the free parameters. To minimize a given cost function, the *m* and *σ* of the receptive field functions and the parameters $a_{0j}$ and $a_{ij}$ of the TSK-type output vector are adjusted using the back-propagation algorithm. According to the requirements of the system, these parameters are given proper values to represent the memory information. For the initial system, the values of the tuning parameters $a_{0j}$ and $a_{ij}$ of the elements of the TSK-type output vector are generated randomly, and the *m* and *σ* of the receptive field functions are generated by the SCM clustering method.

Fig. 7. Flowchart of the P-FCMAC model's learning scheme.

In the parameter-learning scheme, there are four parameters that need to be tuned, i.e., $m_{ij}$, $\sigma_{ij}$, $a_{0j}$, and $a_{ij}$. The total number of tuning parameters for the multi-input single-output P-FCMAC network is $2 N_D N_L + 4 N_L$, where $N_D$ and $N_L$ denote the numbers of inputs and hypercube cells, respectively. The parameter-learning algorithm of the P-FCMAC network uses the supervised gradient descent method to modify these parameters. When we consider the single output case for clarity, our goal is to minimize the cost function *E*, defined as in Eq. (9).
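As a quick check, the stated count can be transcribed directly; with the $N_D = 1$, $N_L = 11$ network trained in Section 4.3 it gives the 66 adjustable parameters reported there.

```python
def n_params(n_d, n_l):
    """Tuning-parameter count as stated in the text: 2*N_D*N_L + 4*N_L."""
    return 2 * n_d * n_l + 4 * n_l

print(n_params(1, 11))  # -> 66, matching the example in Section 4.3
```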

The parameter-learning algorithm, based on back-propagation, is described in detail as follows. The TSK-type outputs are updated according to the following equations:

$$a\_{0j}(t+1) = a\_{0j}(t) + \Delta a\_{0j} \tag{23}$$

$$a_{ij}(t+1) = a_{ij}(t) + \Delta a_{ij} \tag{24}$$

where $a_{0j}$ denotes the proper scalar, $a_{ij}$ the proper scalar coefficient of the *i*-th input dimension, and *j* the *j*-th element of the TSK-type output vector for $j = 1, 2, \ldots, N_L$. The elements of the TSK-type output vectors are updated by the amounts

$$\Delta a_{0j} = \eta \cdot e \cdot \frac{\partial y}{\partial a_{0j}} = \eta \cdot e \cdot \frac{\alpha_j}{\sum_{j=1}^{N_L} \alpha_j} \tag{25}$$

and


$$\Delta a_{ij} = \eta \cdot e \cdot \frac{\partial y}{\partial a_{ij}} = \eta \cdot e \cdot \frac{x_i \alpha_j}{\sum_{j=1}^{N_L} \alpha_j} \tag{26}$$

where *η* is the learning rate, between 0 and 1, and *e* is the error between the desired output and the actual output, $e = y^d - y$.

The receptive field functions are updated according to the following equations:

$$m\_{ij}(t+1) = m\_{ij}(t) + \Delta m\_{ij} \tag{27}$$

$$\sigma_{ij}(t+1) = \sigma_{ij}(t) + \Delta \sigma_{ij} \tag{28}$$

where *i* denotes the *i*-th input dimension for *i*=1,2,…,*n*, *mij* the mean of the receptive field functions, and *σij* the variance of the receptive field functions. The parameters of the receptive field functions are updated by the amounts

$$\Delta m_{ij} = \eta \cdot e \cdot \frac{\partial y}{\partial \alpha_j} \cdot \frac{\partial \alpha_j}{\partial m_{ij}} = \eta \cdot e \cdot \frac{\left(a_{0j} + \sum_{i=1}^{N_D} a_{ij} x_i\right) \sum_{j=1}^{N_L} \alpha_j - \sum_{j=1}^{N_L} \alpha_j \left(a_{0j} + \sum_{i=1}^{N_D} a_{ij} x_i\right)}{\left(\sum_{j=1}^{N_L} \alpha_j\right)^2} \cdot \alpha_j \cdot \frac{2(x_i - m_{ij})}{\sigma_{ij}^2} \tag{29}$$

and

$$\Delta \sigma_{ij} = \eta \cdot e \cdot \frac{\partial y}{\partial \alpha_j} \cdot \frac{\partial \alpha_j}{\partial \sigma_{ij}} = \eta \cdot e \cdot \frac{\left(a_{0j} + \sum_{i=1}^{N_D} a_{ij} x_i\right) \sum_{j=1}^{N_L} \alpha_j - \sum_{j=1}^{N_L} \alpha_j \left(a_{0j} + \sum_{i=1}^{N_D} a_{ij} x_i\right)}{\left(\sum_{j=1}^{N_L} \alpha_j\right)^2} \cdot \alpha_j \cdot \frac{2(x_i - m_{ij})^2}{\sigma_{ij}^3} \tag{30}$$

where *η* is the learning rate of the mean and the variance for the receptive field functions.
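Analogously to the SC-FCMAC step, one P-FCMAC training step following Eqs. (23)-(30) can be sketched as below; this is our own reading of the update equations, not the authors' code.

```python
import numpy as np

def p_fcmac_step(x, y_d, m, sigma, a0, a, eta=0.01):
    """One back-propagation step for the P-FCMAC, following Eqs. (23)-(30).

    m, sigma : (N_L, N_D) receptive-field centers and widths
    a0, a    : (N_L,) and (N_L, N_D) TSK-type output parameters
    Updates are applied in place; the current output y is returned.
    """
    alpha = np.exp(-(((x - m) / sigma) ** 2)).prod(axis=1)   # Eq. (4)
    tsk = a0 + a @ x                                         # Eq. (22)
    den = alpha.sum()
    y = (alpha * tsk).sum() / den                            # Eq. (21)
    e = y_d - y                                              # error of Eq. (9)
    diff = x - m                                             # pre-update distances

    # Eqs. (29)-(30) share dy/dalpha_j:
    dy_da = (tsk * den - (alpha * tsk).sum()) / den ** 2
    grad = eta * e * dy_da[:, None] * alpha[:, None]

    a0 += eta * e * alpha / den                              # Eqs. (23), (25)
    a += eta * e * np.outer(alpha, x) / den                  # Eqs. (24), (26)
    m += grad * 2 * diff / sigma ** 2                        # Eqs. (27), (29)
    sigma += grad * 2 * diff ** 2 / sigma ** 3               # Eqs. (28), (30)
    return y
```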

#### **4.3 An example: Identification of a nonlinear system**

In this example, the plant is a nonlinear system containing an unknown nonlinear function, which is approximated by the P-FCMAC network as shown in Figure 8(b). First, some training data from the unknown function are collected for an off-line initial learning process of the P-FCMAC network. After off-line learning, the trained P-FCMAC network is applied to the nonlinear system to replace the unknown nonlinear function for an on-line test.

Consider a nonlinear system in (Wang, 1994) governed by the difference equation:

$$y(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + g[u(k)] \tag{31}$$

We assume that the unknown nonlinear function has the form:

$$g(u) = 0.6\sin(\pi u) + 0.3\sin(3\pi u) + 0.1\sin(5\pi u) \tag{32}$$
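For illustration, the twenty-one off-line training pairs can be generated as follows; the uniform grid on $[-1, 1]$ is our assumption, since the text only says the pairs are listed in Table 2.

```python
import numpy as np

def g(u):
    """The unknown nonlinear function, Eq. (32)."""
    return (0.6 * np.sin(np.pi * u) + 0.3 * np.sin(3 * np.pi * u)
            + 0.1 * np.sin(5 * np.pi * u))

# Twenty-one (u, g(u)) training pairs on an assumed uniform grid.
u_train = np.linspace(-1.0, 1.0, 21)
pairs = np.column_stack([u_train, g(u_train)])
```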

For off-line learning, twenty-one training data pairs generated with Eq. (32) are provided in Table 2. The off-line learning configuration of the twenty-one training data points is shown in Figure 8(a), and the on-line test configuration of the 1000 data points is shown in Figure 8(b), where the difference equation is defined as:

$$\hat{y}(k+1) = 0.3\,y(k) + 0.6\,y(k-1) + \hat{f}[u(k)] \tag{33}$$

where $\hat{f}[u(k)]$ is the approximation of $g[u(k)]$ obtained by the P-FCMAC network, and the coefficients 0.3 and 0.6 are as in Eq. (31). The error is defined as in Eq. (9).

In this example, the initial threshold value in the SCM is 0.15, and the learning rate is *η*=0.01. After the SCM clustering process, eleven hypercube cells are generated. Using the first and second parameter-learning schemes, the final trained errors of the output are approximately 0.00057 and 0.00024 after 300 epochs. The number of adjustable parameters of the trained P-FCMAC network is 66.

For on-line testing, we assume that the series-parallel model shown in Figure 8(b) is driven by $u(k) = \sin(2\pi k/250)$. The test results of the P-FCMAC network are shown in Figs. 9(a) and (c) for the scheme-1 and scheme-2 methods. The errors between the desired output and the P-FCMAC network output are shown in Figs. 9(b) and (d) for the scheme-1 and scheme-2 methods. The learning curves of the scheme-1 and scheme-2 methods are shown in Fig. 10. Figure 9 shows that the P-FCMAC network successfully approximates the unknown nonlinear function.
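The on-line test loop of Eqs. (31) and (33) can be sketched as below; `f_hat` is a placeholder for the trained P-FCMAC approximation, and `g` is repeated from the previous sketch so the block runs standalone. Passing a perfect `f_hat = g` makes the two trajectories coincide, which gives a quick sanity check of the loop.

```python
import numpy as np

def g(u):  # Eq. (32), repeated here so the sketch is self-contained
    return (0.6 * np.sin(np.pi * u) + 0.3 * np.sin(3 * np.pi * u)
            + 0.1 * np.sin(5 * np.pi * u))

def online_test(f_hat, n=1000):
    """Series-parallel test: plant Eq. (31) vs. model Eq. (33),
    both driven by u(k) = sin(2*pi*k/250)."""
    y, y_hat = np.zeros(n + 2), np.zeros(n + 2)
    for k in range(1, n + 1):
        u = np.sin(2 * np.pi * k / 250)
        y[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + g(u)          # Eq. (31)
        # the model reuses the plant's true past outputs (series-parallel)
        y_hat[k + 1] = 0.3 * y[k] + 0.6 * y[k - 1] + f_hat(u)  # Eq. (33)
    return y, y_hat

y, y_hat = online_test(g)              # perfect f_hat = g
print(np.abs(y - y_hat).max())         # -> 0.0, zero test error
```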


Table 2. Training data and approximated data obtained using the P-FCMAC network for 300 epochs


Table 3 compares the learning results among various models. The previous results were taken from (Wan & Li, 2003; Wang et al., 1995; Farag et al., 1998; Juang et al., 2000). The performance of the very compact fuzzy system obtained by the P-FCMAC network is better than that of all previous works.


| Method | Error |
|--------|-------|
| Gradient Descent (Wang et al., 1995) | 0.2841 |
| Genetic Algorithm (Karr, 1991) | 0.67243 |
| Symbiotic Evolution (Juang et al., 2000) | 0.1997 |
| SGA-SSCP (Farag et al., 1998) | 0.5221 |
| MRDGA (Wan & Li, 2003) | 0.00028 |
| P-FCMAC Learning (Scheme 1) | 0.00057 |
| P-FCMAC Learning (Scheme 2) | 0.00024 |

Table 3. Comparison results of the twenty-one training data for off-line learning.

Fig. 8. The series-parallel identification model. (a) Off-line learning by the twenty-one training data in Table 2; (b) on-line testing for real $u(k) = \sin(2\pi k/250)$.

Fig. 9. Comparison of simulation results. (a) Outputs of the nonlinear system (solid line) and the identification model using the proposed network (dotted line) for the scheme-1 method. (b) Identification error of the approximated model for the scheme-1 method. (c) Outputs of the nonlinear system (solid line) and the identification model using the proposed network (dotted line) for the scheme-2 method. (d) Identification error of the approximated model for the scheme-2 method.

Fig. 10. Learning curves for the scheme-1 and scheme-2 parameter learning methods.

**5. Conclusions**

In this paper, starting from a discussion of the traditional CMAC approach, two novel and recently developed fuzzy CMACs were reviewed. By summarizing the drawbacks of the CMAC model, the related improvements made in the literature were addressed and presented. Through the exhibited self-constructing FCMAC (SC-FCMAC) and parametric FCMAC (P-FCMAC), not only is the inference ability of the FCMAC demonstrated, but the state of the art in the field of fuzzy inference systems is also presented.

**6. References**

Albus, J.S. (1975a). A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC). *Transactions of the ASME: Journal of Dynamic Systems, Measurement and Control*, Vol. 97, (September 1975), pp. 220-227, ISSN 0022-0434

Albus, J.S. (1975b). Data Storage in the Cerebellar Model Articulation Controller (CMAC). *Transactions of the ASME: Journal of Dynamic Systems, Measurement and Control*, Vol. 97, (September 1975), pp. 228-233, ISSN 0022-0434

Chang, P.-L.; Yang, Y.-K.; Shieh, H.-L.; Hsieh, F.-H. & Jeng, M.-D. (2010). Grey Relational Analysis Based Approach for CMAC Learning. *International Journal of Innovative Computing, Information and Control*, Vol. 6, No. 9, (September 2010), pp. 4001-4018, ISSN 1349-4198

Chen, J.Y. (2001). A VSS-type FCMAC Controller, *Proceedings of IEEE International Conference on Fuzzy Systems*, Vol. 1, pp. 872-875, ISBN 0-7803-7293-X, December 2-5, 2001

Chow, M.Y. & Menozzi, A. (1994). A Self-organized CMAC Controller, *Proceedings of the IEEE International Conference on Industrial Technology*, pp. 68-72, ISBN 0-7803-1978-8, Guangzhou, China, December 5-9, 1994

Commuri, S. & Lewis, F.L. (1997). CMAC Neural Networks for Control of Nonlinear Dynamical Systems: Structure, Stability, and Passivity. *Automatica*, Vol. 33, No. 4, (April 1997), pp. 635-641, ISSN 0005-1098

Dai, Y.; Liu, L. & Zhao, X. (2010). Modeling of Nonlinear Parameters on Ship with Fuzzy CMAC Neural Networks, *Proceedings of IEEE International Conference on Information and Automation*, pp. 2070-2075, ISBN 978-1-4244-5701-4, Harbin, China, June 20-23, 2010

Farag, W.A.; Quintana, V.H. & Lambert-Torres, G. (1998). A Genetic-Based Neuro-fuzzy Approach for Modeling and Control of Dynamical Systems. *IEEE Transactions on Neural Networks*, Vol. 9, No. 5, (September 1998), pp. 756-767, ISSN 1045-9227

Glanz, F.H.; Miller, W.T. & Graft, L.G. (1991). An Overview of the CMAC Neural Network, *Proceedings of IEEE Conference on Neural Networks for Ocean Engineering*, pp. 301-308, ISBN 0-7803-0205-2, Washington, DC, USA, August 15-17, 1991

Guo, C.; Ye, Z.; Sun, Z.; Sarkar, P. & Jamshidi, M. (2002). A Hybrid Fuzzy Cerebellar Model Articulation Controller Based Autonomous Controller. *Computers & Electrical Engineering: An International Journal*, Vol. 28, (January 2002), pp. 1-16, ISSN 0045-7906

Harmon, F.G.; Frank, A.A. & Joshi, S.S. (2005). The Control of a Parallel Hybrid-electric Propulsion System for a Small Unmanned Aerial Vehicle Using a CMAC Neural Network. *Neural Networks*, Vol. 18, No. 5-6, (June-July 2005), pp. 772-780, ISSN 0893-6080