5. Conclusions

This chapter presents three aspects of the K-means algorithm: (a) the works that originated the family of K-means algorithms, (b) a systematic review on the algorithm improvements up to 2016, and (c) some of the most recent publications that describe the prospective uses of the algorithm.

Regarding the origin of K-means, it is worth mentioning that it is not only an algorithm but a family of algorithms with the same purpose, which were developed independently in the decades of the 1950s and 1960s.

The systematic review process involved accessing four large databases, from which 1125 documents were retrieved. After applying inclusion and exclusion criteria, 79 documents remained.

Next, we will mention the most important observations organized by subjects:


Regarding the most recent publications on K-means, improvements have been proposed for solving large instances, as well as parallel and distributed implementations and applications of K-means to new fields such as natural language and text processing, among others.

Finally, because the nature of data clustering is exploratory and applicable to data from many disciplines, it is foreseeable that K-means will continue to be widely used, mainly because of the easiness for interpreting its results.
