**6. Conclusion**

The central issue in clustering categorical data is how to define distance or similarity between observations. The most widely used approaches are limited to numeric values, so dedicated models are being developed for categorical data. The algorithm introduced in this paper can serve as an alternative to existing ones. It builds on the main characteristics of categorical data and presents a new framework for clustering such data that is based on neither a distance nor a similarity measure. The main concept of the algorithm is to assess the similarity matrix, update it according to the importance criteria of each feature or subset of features, and group objects only if their categories match entirely. This allows categorical data to be clustered without conversion. Another advantage is the description of the features: the algorithm makes it possible to identify the feature that causes the partitioning of the data, which can be very important in interpreting clustering results. The main advantages of the algorithm are its high accuracy and the small number of initial parameters it requires.
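As a rough illustration of the exact-match grouping idea only, the following Python sketch places objects in the same group when their categories agree on every selected feature. It is not the algorithm presented in this paper: the similarity matrix and the feature importance criteria are omitted, and the function name `group_by_exact_match`, the record layout, and the sample data are hypothetical.

```python
from collections import defaultdict


def group_by_exact_match(records, features):
    """Group record indices whose categories agree on every selected feature.

    Hypothetical helper for illustration; no distance or numeric encoding is used.
    """
    groups = defaultdict(list)
    for idx, record in enumerate(records):
        # The tuple of raw category values acts as the group key.
        key = tuple(record[f] for f in features)
        groups[key].append(idx)
    return dict(groups)


# Made-up sample data: three objects described by two categorical features.
records = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "round"},
    {"color": "blue", "shape": "round"},
]

clusters = group_by_exact_match(records, ["color", "shape"])
by_shape = group_by_exact_match(records, ["shape"])
```

Grouping on both features separates the third object, while grouping on `shape` alone keeps all three together, which hints at how a single feature can be identified as the cause of a partition.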

In future work, we plan to develop and implement a modification of the algorithm for clustering mixed data. We also intend to overcome its limitations and adapt it to clustering big datasets. Such an algorithm is required in a number of data mining applications, such as partitioning very large sets of objects into smaller, more manageable subsets that can be modeled and analyzed more easily.
