… do not belong to the ith cluster, f_in is maximum, since the denominator is minimum while the numerator is maximum. This implies that f_in causes D_in to increase when the pixels in the immediate neighborhood of the nth pixel do not belong to the ith cluster. This increase of D_in contributes to decreasing the membership u_in, thereby achieving and preserving the minimum of the SFCM objective function in (11).

The membership u_in and the cluster-center v_i associated with the SFCM method are given by [13]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \left( \frac{D_{in}}{D_{jn}} \right)^{\frac{1}{m-1}}} \tag{14}$$

$$v_i = \frac{\sum_{n=1}^{N} u_{in}^{m} x_n}{\sum_{n=1}^{N} u_{in}^{m}} \tag{15}$$

It is obvious from (14) that, similar to standard FCM, the membership u_in is inversely proportional to the weighted distance D_in, which again means that increasing D_in, when the pixels immediately neighboring the nth pixel do not belong to the ith cluster, decreases the membership u_in. From (15), however, it is clear that the SFCM algorithm computes the cluster-center v_i in the same way as the standard FCM method. Hence, additive noise can still reduce the accuracy of the cluster-center v_i obtained by the SFCM algorithm.
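For illustration, a minimal NumPy sketch of one SFCM update sweep per (14) and (15) follows. It assumes the weighted distances D_in of (11) are precomputed; the array layout and the fuzzifier default are our assumptions, not prescriptions from [13].

```python
import numpy as np

def sfcm_updates(X, D, m=2.0):
    """One SFCM update sweep, per Eqs. (14)-(15).

    X : (N, p) data matrix.
    D : (C, N) weighted distances D_in, assumed precomputed from Eq. (11).
    m : fuzzifier (m > 1); the default is an arbitrary choice.
    """
    eps = 1e-12  # guards the ratio when some D_jn is zero
    # Eq. (14): u_in = 1 / sum_j (D_in / D_jn)^(1/(m-1))
    ratio = (D[:, None, :] + eps) / (D[None, :, :] + eps)   # (C, C, N)
    U = 1.0 / (ratio ** (1.0 / (m - 1.0))).sum(axis=1)      # (C, N)
    # Eq. (15): FCM-style center, v_i = sum_n u_in^m x_n / sum_n u_in^m
    W = U ** m
    V = (W @ X) / W.sum(axis=1, keepdims=True)
    return U, V
```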

4.4. HCM incorporating membership entropy

The membership entropy has been incorporated into the HCM objective for fuzzification. The resulting membership entropy-based FCM (MEFCM) algorithm has the following objective function [17]

$$J_{MEFCM} = J_{HCM} + \beta \sum_{i=1}^{C} \sum_{n=1}^{N} \left[ u_{in} \log(u_{in}) + (1 - u_{in}) \log(1 - u_{in}) \right] \tag{16}$$

where β > 0 is an experimentally selected weight that controls the fuzziness induced by the entropy term. We still need U to be constrained to satisfy (5). It can be shown that the membership and the cluster-center that minimize (16) are given, respectively, by [17]

$$u_{in} = \frac{1}{\sum_{j=1}^{C} \frac{\exp(d_{in}/\beta) + 1}{\exp(d_{jn}/\beta) + 1}} \tag{17}$$

$$v_i = \frac{\sum_{n=1}^{N} u_{in} x_n}{\sum_{n=1}^{N} u_{in}} \tag{18}$$
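A corresponding sketch of one MEFCM sweep per (17) and (18) is given below; names and shapes are assumptions, and the exponential is rearranged so that large distances cannot overflow.

```python
import numpy as np

def mefcm_updates(X, V, beta):
    """One MEFCM update sweep, per Eqs. (17)-(18) (a sketch; names assumed).

    X : (N, p) data, V : (C, p) current centers, beta : entropy weight > 0.
    """
    # Squared Euclidean distances d_in = ||x_n - v_i||^2
    d = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # (C, N)
    # Eq. (17): u_in = 1/(exp(d_in/beta)+1), normalized over the C clusters.
    # Using w = exp(-d/beta) <= 1 keeps the computation overflow-free.
    w = np.exp(-d / beta)
    s = w / (1.0 + w)                                       # = 1/(exp(d/beta)+1)
    U = s / s.sum(axis=0, keepdims=True)
    # Eq. (18): centers without a fuzzifier exponent
    V_new = (U @ X) / U.sum(axis=1, keepdims=True)
    return U, V_new
```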

It is obvious so far that the membership function of the nth entity provided by the FCM, HCM and MEFCM algorithms depends upon the inverse of the squared Euclidean distance

$$d_{in} = \| x_n - v_i \|^2$$

which is a function of only x_n; no data or membership information of the clustering entity's neighbors is involved. Hence, the FCM, HCM and MEFCM algorithms miss important spatial local data and membership information. Thus, additive noise can degrade x_n, v_i and d_in, thereby biasing the membership of a degraded entity toward a false cluster.

5. HCM incorporating local membership KL divergence


In [18], an approach to incorporating local spatial membership information into the HCM algorithm has been presented. By adding the Kullback-Leibler (KL) divergence between the membership function of an entity and the locally-smoothed membership in its immediate spatial neighborhood, the modified objective function, called the local membership KL divergence-based FCM (LMKLFCM), is given by [18–22]

$$J_{LMKLFCM} = J_{HCM} + \gamma \left( \sum_{i=1}^{C} \sum_{n=1}^{N} u_{in} \log \left( \frac{u_{in}}{\pi_{in}} \right) + \sum_{i=1}^{C} \sum_{n=1}^{N} \overline{u}_{in} \log \left( \frac{\overline{u}_{in}}{\overline{\pi}_{in}} \right) \right) \tag{19}$$

where γ is a weighting parameter, experimentally selected to control the fuzziness induced by the second term in (19), $\overline{u}_{in} = 1 - u_{in}$ is the complement of the membership function u_in, and $\pi_{in}$ and $\overline{\pi}_{in}$ are the spatial local (moving) averages of the membership u_in and of the complement membership $\overline{u}_{in}$, respectively. These local membership and membership-complement averages are computed by [18–22]

$$\pi_{in} = \frac{1}{N_K} \sum_{k \in N_n,\, k \neq n} u_{ik} \tag{20}$$

$$\overline{\pi}_{in} = \frac{1}{N_K} \sum_{k \in N_n,\, k \neq n} (1 - u_{ik}) = 1 - \pi_{in} \tag{21}$$

where N_n is the set of entities/pixels falling in a square window around the nth pixel and N_K is its cardinality. Obviously, all entities in the window are weighted equally by w_pq = 1/N_K. Other windows, such as a Gaussian one, can be used, provided that the weight of the window center is 0 and the remaining weights sum to unity. The first term in (19) provides hard cluster labeling: it pushes the membership function toward 0 or 1. The KL membership and membership-complement divergences, in addition to providing a fuzzification approach to HCM clustering, measure the proximity between the membership of a pixel in a certain cluster and the local average of the membership over the immediately neighboring pixels in this cluster. So, they push the membership function toward the locally smoothed membership function π_in. Therefore, this can smooth out additive noise and bias the solution toward piecewise-homogeneous labels, which leads to a segmented image with piecewise-homogeneous regions.
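On an image grid, the local averages (20) and (21) amount to a box filter that excludes the center pixel. A minimal sketch, assuming a 3×3 window with N_K taken as the number of neighbors:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_membership_averages(U_img, k=3):
    """Local averages pi_in and their complements, per Eqs. (20)-(21).

    U_img : (C, H, W) membership maps on the image grid.
    k     : odd window side; the 3x3 default is an assumption, and N_K is
            taken as the number of neighbors, k*k - 1 (center excluded).
    """
    n_k = k * k - 1
    # uniform_filter averages over the full window including the center,
    # so recover the window sum and subtract the center pixel back out.
    win_sum = uniform_filter(U_img, size=(1, k, k)) * (k * k)
    pi = (win_sum - U_img) / n_k        # Eq. (20)
    pi_bar = 1.0 - pi                   # Eq. (21)
    return pi, pi_bar
```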

The minimization of the objective function J_LMKLFCM in (19) yields u_in and v_i given, respectively, by [18]

$$u_{in} = \frac{\dfrac{\pi_{in}}{(1-\pi_{in})\exp(d_{in}/\gamma) + \pi_{in}}}{\displaystyle\sum_{j=1}^{C} \dfrac{\pi_{jn}}{(1-\pi_{jn})\exp(d_{jn}/\gamma) + \pi_{jn}}} = \delta_{in}\, \pi_{in} \tag{22}$$

$$v_i = \frac{\sum_{n=1}^{N} u_{in} x_n}{\sum_{n=1}^{N} u_{in}} \tag{23}$$

It is obvious from (22) that u_in is proportional to π_in, and that the proportionality factor δ_in is inversely related to the entity's distance d_in, with the maximum δ_kn occurring when d_kn = 0.
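A minimal sketch of one LMKLFCM sweep per (22) and (23) follows; array names are assumptions, and (22) is rearranged around exp(−d_in/γ) ≤ 1 so that it cannot overflow.

```python
import numpy as np

def lmklfcm_sweep(X, V, pi, gamma):
    """One LMKLFCM update sweep, per Eqs. (22)-(23) (a sketch; names assumed).

    X : (N, p) data, V : (C, p) centers, pi : (C, N) local averages of
    Eq. (20), flattened so that column n corresponds to pixel n.
    """
    d = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # d_in, (C, N)
    # Eq. (22), rearranged with w = exp(-d/gamma) <= 1 to avoid overflow:
    # pi/((1-pi)exp(d/gamma)+pi) == pi*w/((1-pi)+pi*w), then normalize over i.
    w = np.exp(-d / gamma)
    score = pi * w / ((1.0 - pi) + pi * w)
    U = score / score.sum(axis=0, keepdims=True)            # = delta_in * pi_in
    # Eq. (23): cluster centers, still independent of the local spatial data
    V_new = (U @ X) / U.sum(axis=1, keepdims=True)
    return U, V_new
```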

It is clear that if γ → ∞, then $u_{in} = \pi_{in} / \sum_{j=1}^{C} \pi_{jn}$. Therefore, the resulting membership is independent of the data to be clustered but dependent on the initial value of the membership matrix U^0 and on the smoothing fashion. If the initial membership $u_{in}^0$ is generated from a positive random process, then $u_{in}^t$, as the number of iterations t grows, converges, because of the recursive averaging and normalizing, to a normally distributed variable with mean $\frac{1}{C} = E\{u_{in}^t\} = E\{\pi_{in}\} / \sum_{j=1}^{C} E\{\pi_{jn}\}$, which in this case means an extremely fuzzy membership function. This has been verified experimentally using a synthetic image of 4 clusters and $\gamma = 10^{10}$. Finally, as shown by (23), the computation of the cluster-center v_i is still independent of the local original data.
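This limiting behavior is easy to reproduce numerically. The toy sketch below (image size, window and seed are arbitrary choices) iterates the γ → ∞ update u_in = π_in / Σ_j π_jn and shows the memberships flattening toward 1/C:

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Toy check of the gamma -> infinity behavior: iterating "local average,
# then normalize over clusters" drives the memberships toward 1/C
# regardless of the data being clustered.
rng = np.random.default_rng(0)
C, H, W = 4, 64, 64
U = rng.random((C, H, W))
U /= U.sum(axis=0, keepdims=True)           # random positive memberships
for t in range(200):
    box = uniform_filter(U, size=(1, 3, 3)) * 9.0
    pi = (box - U) / 8.0                    # Eq. (20): center pixel excluded
    U = pi / pi.sum(axis=0, keepdims=True)  # u_in = pi_in / sum_j pi_jn
print(U.mean(axis=(1, 2)), U.std())         # means ~ 1/C; spread decays with t
```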

… the local data and membership KL divergence-based FCM (LDMKLFCM) algorithms. It is to be noticed that all of the algorithms can be implemented much like the pseudo-code in Table 1, by replacing steps 3 and 4 with the corresponding computation of the membership function and cluster-centers of each algorithm.

7.1. Clustering validity

To measure the performance of the fuzzy clustering algorithms, several quantitative measures or indices have been adopted in [23, 25] and references therein. A few of these measures are the partition coefficient V_PC and partition entropy V_PE indices of Bezdek, given respectively by

$$V_{PC} = \frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{C} u_{in}^{2} \tag{27}$$

$$V_{PE} = -\frac{1}{N} \sum_{n=1}^{N} \sum_{i=1}^{C} u_{in} \log(u_{in}) \tag{28}$$

together with the Xie-Beni (XB) index V_XB [25]. The closer V_PC is to 1, the better the performance, since the minimization is constrained by $\sum_{i=1}^{C} u_{in} = 1$. The closer V_PE is to 0, the better the performance, since this means less fuzziness of the membership and thus well-separated clusters.
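A compact sketch computing these indices; V_XB is evaluated in the standard Xie-Beni form (membership-weighted compactness over minimum center separation), which is our assumption here:

```python
import numpy as np

def validity_indices(U, X, V, m=2.0):
    """V_PC (27), V_PE (28) and the Xie-Beni index in its standard form.

    U : (C, N) memberships, X : (N, p) data, V : (C, p) centers.
    """
    C, N = U.shape
    v_pc = (U ** 2).sum() / N                                # Eq. (27)
    v_pe = -(U * np.log(U + 1e-12)).sum() / N                # Eq. (28)
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # ||x_n - v_i||^2
    sep = min(((V[i] - V[j]) ** 2).sum()
              for i in range(C) for j in range(C) if i != j)
    v_xb = ((U ** m) * d2).sum() / (N * sep)                 # smaller is better
    return v_pc, v_pe, v_xb
```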

In synthetic images, in addition to the above clustering validity measures, several clustering validity and performance measures have also been used, such as the accuracy, sensitivity and specificity, given respectively by

$$\mathrm{Acc.} = (TP + TN)/(TP + TN + FP + FN) \tag{29}$$

$$\mathrm{Sen.} = TP/(TP + FN) \tag{30}$$

$$\mathrm{Spe.} = TN/(TN + FP) \tag{31}$$

where T, F, P and N mean true, false, positive and negative, respectively. The TP, FP, TN and FN are computed as follows. While generating the synthetic image, the ground-truth labels are formulated as the logical matrix given by [23]

$$L_{in} = \begin{cases} 1, & \text{if } x_n \in \text{cluster } i \\ 0, & \text{otherwise} \end{cases} \qquad i = 1, 2, \ldots, C, \; n = 1, 2, \ldots, N \tag{32}$$

where x_n is the noise-free pixel in the synthetic image and 1 and 0 represent True and False, respectively. After the segmentation is done, the estimated labels are also formulated as logical matrices generated by [20]

$$\hat{L}_{kn} = \begin{cases} 1, & \text{if } k = \arg\max_i(u_{in}) \\ 0, & \text{otherwise} \end{cases} \qquad k = 1, 2, \ldots, C, \; n = 1, 2, \ldots, N \tag{33}$$

Finally, the TP, TN, FP and FN are given by [20].
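Since the counting formulas of [20] are not reproduced above, the sketch below counts TP, TN, FP and FN element-wise over the two logical matrices, which is the usual convention:

```python
import numpy as np

def segmentation_scores(L, U):
    """Accuracy, sensitivity and specificity, per Eqs. (29)-(33).

    L : (C, N) ground-truth logical matrix of Eq. (32).
    U : (C, N) final memberships, hardened by arg-max as in Eq. (33).
    """
    C, N = L.shape
    L_hat = np.zeros((C, N), dtype=int)
    L_hat[U.argmax(axis=0), np.arange(N)] = 1          # Eq. (33)
    tp = int(((L == 1) & (L_hat == 1)).sum())
    tn = int(((L == 0) & (L_hat == 0)).sum())
    fp = int(((L == 0) & (L_hat == 1)).sum())
    fn = int(((L == 1) & (L_hat == 0)).sum())
    acc = (tp + tn) / (tp + tn + fp + fn)              # Eq. (29)
    sen = tp / (tp + fn)                               # Eq. (30)
    spe = tn / (tn + fp)                               # Eq. (31)
    return acc, sen, spe
```

Note that, in practice, the cluster indices of U must first be aligned with the ground-truth labels, since clustering recovers the clusters only up to a permutation.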

