1. Introduction

Images are often among the most important categories of available digital data. Image data has grown rapidly in recent years and will continue to grow in the near future. Because large volumes of image data are difficult to handle manually, automated tools have become crucial for a wide range of image-related tasks. Image processing provides a broad set of techniques for working with images, and these techniques make such work considerably easier, both now and as the volume of image data continues to grow.

Image segmentation is an essential image processing technique that analyzes an image by partitioning it into non-overlapping regions, each region being a set of pixels. The pixels in a region are similar with respect to some characteristic such as color, intensity, or texture [1], and differ significantly from the pixels in other regions with respect to the same characteristic [2–4]. Image segmentation plays an important role in a variety of applications such as robot vision, object recognition, and medical imaging [5–7]. Image segmentation approaches can be divided into four categories: thresholding, edge detection, region extraction, and clustering. Clustering techniques are well suited to segmenting image data, as they are designed to partition large datasets into groups according to the homogeneity of the data points.
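As a concrete illustration of clustering-based segmentation, the sketch below clusters the grayscale intensities of a toy image into k groups with a simple K-Means loop. The function name and toy image are illustrative, not from the chapter, and the centers are initialized deterministically over the intensity range for reproducibility:

```python
import numpy as np

def segment_by_intensity(image, k=2, iters=10):
    """Cluster grayscale pixel intensities into k groups and return a
    label image; a minimal illustration of clustering-based segmentation."""
    pixels = image.reshape(-1, 1).astype(float)
    # Spread the initial centers evenly over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k).reshape(-1, 1)
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute each
        # center as the mean intensity of its pixels.
        labels = np.argmin(np.abs(pixels - centers.T), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels.reshape(image.shape)

# A toy 4x4 "image" with a dark region (left) and a bright region (right).
img = np.array([[10, 12, 200, 210],
                [11, 13, 205, 215],
                [ 9, 14, 198, 202],
                [12, 10, 207, 209]])
labels = segment_by_intensity(img, k=2)
```

Real segmentation systems would use richer features (color, texture) per pixel, but the grouping principle is the same.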

In clustering, a given population of data is partitioned into groups such that objects within the same group are similar to one another and dissimilar to objects in other groups [8, 9]. Clustering techniques fall into several categories: partitional (hierarchical and non-hierarchical), such as K-Means, PAM, CLARA, and CLARANS [10, 11]; model-based, such as Expectation Maximization, SOM, and mixture-model clustering [12, 13]; and fuzzy-based, such as Fuzzy C-Means [14, 15].

Partitional clustering techniques attempt to break a population of data into a predefined number of clusters such that the partition optimizes a given criterion.

Formally, clusters can be seen as subsets of the given dataset. So, clustering methods can be classified according to whether the subsets are fuzzy or crisp (hard). In hard clustering, an object either does or does not belong to a cluster. These methods partition the data into a specified number of mutually exclusive subsets. However, in fuzzy-based clustering, the objects may belong to several clusters with different degrees of membership [16].
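The distinction can be shown numerically. Below, a hard assignment places a point entirely in its nearest cluster, while the standard FCM membership formula, u_j = 1 / Σ_k (d_j/d_k)^(2/(m−1)) with fuzzifier m = 2, spreads membership across both clusters; the numbers are illustrative:

```python
import numpy as np

centers = np.array([0.0, 10.0])   # two cluster centers
x = 2.0                           # a data point closer to the first center
d = np.abs(x - centers)           # distances to each center: [2, 8]

# Hard (crisp) membership: the point belongs entirely to the nearest cluster.
hard = np.zeros(2)
hard[np.argmin(d)] = 1.0          # -> [1.0, 0.0]

# Fuzzy membership (standard FCM formula, fuzzifier m = 2):
# u_j = 1 / sum_k (d_j / d_k)^(2/(m-1))
m = 2
u = 1.0 / np.sum((d[:, None] / d[None, :]) ** (2 / (m - 1)), axis=1)
# u -> [16/17, 1/17]: mostly in cluster 0, but partly in cluster 1.
```

Note that the fuzzy memberships of each point always sum to 1 across clusters, whereas the hard membership vector has exactly one nonzero entry.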

Many researchers have experimented with the Fuzzy C-Means (FCM) algorithm in a wide variety of ways to achieve better image segmentation results [1, 17]. In [18], a penalized FCM (PFCM) algorithm is presented that handles noise in image segmentation by adjusting a penalty coefficient; the penalty term takes the spatial dependence of the objects into consideration and modifies the FCM criterion accordingly. In [19], a fuzzy rule-based technique is proposed that employs a rule-based neighborhood enhancement system to impose spatial continuity by post-processing the clustering results obtained with the FCM algorithm. In [20], a Geometrically Guided FCM (GG-FCM) algorithm is proposed, based on a semi-supervised FCM technique for multivariate image segmentation. In [21], a regularization term is introduced into the standard FCM to impose a neighborhood effect. In [22], this regularization term is incorporated into a kernel-based fuzzy clustering algorithm, and in [23] it is incorporated into the adaptive FCM (AFCM) algorithm [24] to overcome the noise sensitivity of AFCM.

However, the literature shows that relatively little attention has been paid to hybridizing clustering techniques for partitioning datasets.

The present research aims at developing hybrid clustering algorithms that combine K-Means and Fuzzy C-Means (FCM) to achieve better clustering results. Two hybrid algorithms are developed: KMFCM and KMandFCM. KMFCM first runs K-Means on the dataset and then runs FCM initialized with the K-Means results. KMandFCM performs K-Means and FCM in alternating iterations.
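The chapter gives no pseudocode for the hybrids at this point, so the following is only a plausible NumPy sketch of the KMFCM idea: run K-Means to convergence, then use its final centers to seed FCM. All names and the toy data are illustrative, and the initial means are fixed for reproducibility although K-Means normally initializes randomly:

```python
import numpy as np

def kmeans(X, init, iters=20):
    """Plain K-Means from the given initial centers; returns final centers."""
    centers = init.astype(float).copy()
    for _ in range(iters):
        # Assign each object to its nearest center, then recompute centers.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(len(centers)):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def fcm(X, centers, m=2.0, iters=20):
    """FCM iterations seeded with the given centers (the KMFCM idea)."""
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # Membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)).
        u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        um = u ** m
        # Center update: weighted mean of all objects, weights u_ij^m.
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
    return centers, u

# Two well-separated blobs; KMFCM = K-Means first, then FCM refinement.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.2, (20, 2)), rng.normal(5.0, 0.2, (20, 2))])
centers, u = fcm(X, kmeans(X, init=X[[0, -1]]))
```

The appeal of this ordering is that the cheap K-Means pass gives FCM a good starting point, so the more expensive fuzzy iterations have less work to do.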

All experiments are carried out on datasets derived from four images. For performance evaluation, CPU time, clustering fitness, and the sum of squared error (SSE) are taken into consideration.
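SSE is the standard within-cluster measure: the total squared distance from each object to the center of its assigned cluster. Clustering fitness is not defined in this excerpt, so only SSE is sketched here; names and data are illustrative:

```python
import numpy as np

def sse(X, centers, labels):
    """Sum of squared errors: total squared distance from each object
    to the center of its assigned cluster (lower is tighter clustering)."""
    return float(sum(np.sum((X[labels == j] - c) ** 2)
                     for j, c in enumerate(centers)))

# Four points, two clusters; every point lies at distance 1 from its center,
# so the SSE is 4.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
centers = np.array([[0.0, 1.0], [10.0, 1.0]])
labels = np.array([0, 0, 1, 1])
```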

The following sections provide a detailed discussion of K-Means (KM), Fuzzy C-Means (FCM), KMFCM and KMandFCM algorithms.

### 2. The K-Means (KM) algorithm

Partitional clustering methods are appropriate for the efficient representation of large datasets [11]. These methods determine k clusters such that the data objects in a cluster are more similar to each other than to the objects in other clusters.

#### Segmenting Images Using Hybridization of K-Means and Fuzzy C-Means Algorithms DOI: http://dx.doi.org/10.5772/intechopen.86374

The K-Means is a partitional clustering method that partitions a given dataset into a pre-specified number, k, of clusters [25]. It is a simple iterative method. The algorithm is initialized by randomly choosing k points from the dataset as the initial cluster centers, i.e., cluster means. It then iterates between two steps until convergence: assigning each data object to the cluster with the closest mean, and updating each cluster mean from the objects assigned to it.


The algorithm for K-Means is as follows [26]. Here, k is the number of clusters, d is the number of dimensions or attributes, X<sub>i</sub> is the ith data sample, μ<sub>j</sub> (j = 1, 2, …, k) is the mean vector of cluster C<sub>j</sub>, and t is the iteration number. As a termination condition, the algorithm computes the percentage change, Eq. (2), and terminates when Percentage change < α. Here, α is set to 3, i.e., a change below 3% is treated as negligible.

#### KM algorithm


1. Randomly select k data objects from the dataset as the initial cluster means μ<sub>j</sub> (j = 1, …, k) and set t = 1.

2. Assign each data object X<sub>i</sub> to the cluster whose mean μ<sub>j</sub> is closest, using the Euclidean distance

$$d\left(\mathbf{X}_i, \mu_j\right) = \sqrt{\sum_{l=1}^d \left(x_{il} - \mu_{jl}\right)^2} \tag{1}$$

3. Update the mean vectors μ<sub>j</sub> (j = 1, …, k) as the centroids of the objects assigned to each cluster.

4. Compute the percentage change as follows:

$$\text{Percentage change} = \frac{|\Psi_t - \Psi_{t+1}|}{\Psi_t} \times 100\tag{2}$$

where Ψ<sub>t</sub> is the number of vectors assigned to new clusters in the tth iteration and Ψ<sub>t+1</sub> is the number of vectors assigned to new clusters in the (t + 1)th iteration.

5. Stop the process if Percentage change < α; otherwise set t = t + 1 and repeat steps 2–4 with the updated parameters.
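Steps 1–5 can be sketched as follows. This minimal NumPy version implements the percentage-change stopping rule of Eq. (2) with α = 3; the function name and toy data are illustrative, and the initial means are passed in explicitly for reproducibility although step 1 chooses them randomly:

```python
import numpy as np

def kmeans_pct(X, k, init, alpha=3.0, max_iter=100):
    """K-Means with the percentage-change stopping rule of Eq. (2):
    stop when the count of objects reassigned to new clusters changes
    by less than alpha percent between consecutive iterations."""
    centers = init.astype(float).copy()
    labels = np.full(len(X), -1)
    prev_moved = 0
    for t in range(max_iter):
        # Step 2: assign each object to its closest mean, Eq. (1).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = d.argmin(axis=1)
        moved = int(np.sum(new_labels != labels))  # objects moved to new clusters
        labels = new_labels
        # Step 3: update each mean vector from its assigned objects.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
        # Steps 4-5: percentage change, Eq. (2), then the termination test.
        if t > 0 and prev_moved > 0:
            pct = abs(prev_moved - moved) / prev_moved * 100
            if pct < alpha:
                break
        if moved == 0:      # also stop once assignments are fully stable
            break
        prev_moved = moved
    return centers, labels

# Two tight, well-separated groups; one initial mean taken from each group.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
centers, labels = kmeans_pct(X, 2, init=X[[0, -1]])
```

On this toy data the assignments stabilize after one update, so the `moved == 0` branch fires; on noisier data the percentage-change test of Eq. (2) terminates the loop earlier than exact stability would.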

The K-Means uses the Euclidean distance as a proximity measure to determine the closest cluster to which a data object is assigned [13]. The algorithm stops when the assignment of data points to clusters no longer changes or some other criterion is satisfied. K-Means is widely used for clustering and requires little CPU time. However, it has difficulty detecting natural clusters that have non-spherical shapes or widely different sizes or densities [25].
