**3. Case studies**

In the presented work, the performance of the NG, GNG, and RGNG algorithms on synthetic data is described. The case studies are carried out to compare the performance of the three approaches. The experimental results on a public synthetic dataset are presented in the next section. The need to compare different neural networks using such statistical performance measures has recently been highlighted by a number of researchers.

Four parameters are used in this work to evaluate the performance of the proposed clustering techniques: the classification rate (CR), the average partition quality (PQ), the minimum cluster number (MCN), and the mean square error (MSE). A robust clustering technique should be less sensitive to parameter configurations and should give better performance under the same parameter settings across all experiments.

In the following experiments, the parameters are fixed for each technique at typical values suggested in the literature. The RGNG technique was set with the typical values provided by Qin and Suganthan [29]: *ε<sub>bi</sub>* = 0.1, *ε<sub>bf</sub>* = 0.01, *ε<sub>ni</sub>* = 0.005, *ε<sub>nf</sub>* = 0.0005, *α*<sub>max</sub> = 100, *k* = 1.3, *η* = 1 × 10<sup>−4</sup>. The GNG and NG techniques were set with the typical values provided by Fritzke [24]: *ε<sub>b</sub>* = 0.05, *ε<sub>n</sub>* = 0.006, *α*<sub>max</sub> = 100, *β* = 0.0005, *λ* = 300 for GNG; and *ε<sub>i</sub>* = 0.5, *ε<sub>f</sub>* = 0.005, *λ<sub>i</sub>* = 10, *λ<sub>f</sub>* = 0.01, *t*<sub>max</sub> = 40,000 for the NG network.
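For reference, these fixed settings can be collected into configuration objects. The sketch below is illustrative only — the class and field names are our own, not part of any published implementation:

```python
# Typical parameter settings used in the experiments (values from the text).
# Class and field names are illustrative, not from a specific library.
from dataclasses import dataclass

@dataclass
class RGNGParams:
    eps_bi: float = 0.1      # initial winner learning rate
    eps_bf: float = 0.01     # final winner learning rate
    eps_ni: float = 0.005    # initial neighbor learning rate
    eps_nf: float = 0.0005   # final neighbor learning rate
    alpha_max: int = 100     # maximum edge age
    k: float = 1.3
    eta: float = 1e-4

@dataclass
class GNGParams:
    eps_b: float = 0.05      # winner learning rate
    eps_n: float = 0.006     # neighbor learning rate
    alpha_max: int = 100     # maximum edge age
    beta: float = 0.0005     # error-decay factor
    lam: int = 300           # node-insertion interval (λ)

@dataclass
class NGParams:
    eps_i: float = 0.5       # initial learning rate
    eps_f: float = 0.005     # final learning rate
    lam_i: float = 10.0      # initial neighborhood range (λ_i)
    lam_f: float = 0.01      # final neighborhood range (λ_f)
    t_max: int = 40000       # number of adaptation steps
```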

Each index of the performance measures is explained in the following sections.

#### **3.1. Classification rate**

This index refers to the classification rate (CR) for the whole dataset, where each data point is classified according to its nearest prototype. CR is based on a majority voting classifier [32]: all prototypes are labeled using a simple voting mechanism. Because the proposed technique uses a small number of prototypes, the resulting CR will not be high.
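A minimal sketch of this evaluation, assuming the data, labels, and learned prototypes are available as NumPy arrays (the function name is illustrative, not from the authors' code):

```python
import numpy as np

def classification_rate(X, y, prototypes):
    """CR via majority voting: each prototype is labeled by the majority
    class of the points it wins; every point is then classified by its
    nearest prototype and compared against its true label."""
    # Assign every point to its nearest prototype (squared Euclidean distance).
    d = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)

    # Label each prototype by majority vote over the points it wins.
    proto_label = np.zeros(len(prototypes), dtype=y.dtype)
    for k in range(len(prototypes)):
        won = y[nearest == k]
        if won.size:
            proto_label[k] = np.bincount(won).argmax()

    # CR = fraction of points whose prototype's label matches their true label.
    return (proto_label[nearest] == y).mean()
```

With well-separated clusters and one prototype per cluster, this returns 1.0; with fewer prototypes than natural clusters, several classes share a prototype and the CR drops, as noted above.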

#### **3.2. Partition quality**

This index refers to the average partition quality (PQ), averaged over all independent runs in the experiments. PQ was defined by Hamerly and Elkan [33] as:

$$\text{PQ} = \frac{\sum_{i=1}^{n_{cs}} \sum_{j=1}^{m_{ct}} p(i,j)^2}{\sum_{i=1}^{n_{cs}} p(i)^2} \tag{4}$$


Performance Assessment of Unsupervised Clustering Algorithms Combined MDL Index

http://dx.doi.org/10.5772/intechopen.74506



where:

*ncs*: true number of classes

*mct*: minimum number of clusters found by the technique

*p*(*i*, *j*): probability of a point vector in cluster *j* belonging to the class *i*

*p*(*i*): class probability

The number of classes *ncs* should equal the actual number of clusters if each natural cluster is assumed to stand for an individual class. The minimum cluster number *mct* can be obtained by running the techniques.

The *p*(*i*, *j*) term represents the frequency, based on probability, with which a data point belongs to class *i* and is assigned to cluster *j*. The *p*(*i*, *j*) quantity is normalized by the sum of true class probabilities and then squared. This statistic is related to the Rand statistic for comparing partitions [34]. The PQ index is maximized when the number of clusters *mct* is correctly detected and induces the same partition as the *ncs* classes, i.e., *mct* = *ncs*, so that all points in each cluster are the same as those in one of the natural clusters.

#### **3.3. Minimum cluster number**

The minimum cluster number (MCN) is the average number of clusters detected by the techniques. The MCN indexes the ability of the techniques to find the underlying natural clusters. During the training of the techniques, and according to the MCN value, only the proposed RGNG approach can find the actual number of clusters successfully.

During the growing process, this value is defined as the number of natural clusters in which the algorithm places at least one prototype when the number of prototypes in the network reaches the actual number of clusters. The cluster numbers detected by NG and GNG during the growing process deviate from the actual value when the number of prototypes is the same as the actual number of clusters.

#### **3.4. Mean square error**

Mean square error (MSE) is another criterion used for evaluating the performance of the proposed clustering technique. The MSE value represents the mean distance between the nearest prototype positions resulting from the application of the techniques and the actual cluster centers.

The average MSE value in this experiment is higher for the NG and GNG techniques than for the RGNG technique. This indicates that the RGNG approach achieves the best accuracy with the strongest stability among the three approaches.

**4. Experimental results with synthetic data**

Six different types of 2D synthetic datasets [29, 35] are used in this work: the snail, screw, ring, set3, set5, and set25 datasets. **Figures 4**–**6** show the plots of NG, GNG, and RGNG clustering on three of these 2D synthetic datasets (screw, set5, and snail) as an example. The number of neurons is selected randomly: *N* = 7, 10, and 12.

These figures cannot clearly differentiate between the methods. Hence, the four parameters introduced in the previous section (CR, PQ, MCN, and MSE) are used to evaluate the performance of the clustering techniques. For the best comparison with RGNG, the MDL criterion is added to the NG and GNG techniques. The training results of these techniques
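As a rough, self-contained illustration (not the authors' code) of how the PQ of Eq. (4) and an MSE against known cluster centers might be computed for one run, assuming the true class labels, the cluster assignments, and the synthetic centers are available:

```python
import numpy as np

def partition_quality(y_true, y_pred):
    """PQ of Eq. (4): sum over classes i and detected clusters j of
    p(i, j)^2, normalized by the sum over classes of p(i)^2.
    p(i, j) is taken here as the joint frequency of (class i, cluster j)."""
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    num = sum(np.mean((y_true == i) & (y_pred == j)) ** 2
              for i in classes for j in clusters)
    den = sum(np.mean(y_true == i) ** 2 for i in classes)
    return num / den

def mse_to_centers(prototypes, centers):
    """Mean squared distance between each true cluster center and its
    nearest prototype -- one reading of the MSE criterion of Section 3.4."""
    d = ((centers[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()
```

A perfect partition (each detected cluster coinciding with one natural class) gives PQ = 1, and prototypes that land exactly on the synthetic centers give an MSE of 0; deviations from either reduce the scores, matching the comparisons reported above.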
