4. The algorithm of collective k-means

Suppose that N clustering algorithms have each partitioned a sample of m objects into l clusters. The resulting solutions can be written in the form of a binary matrix $\|\alpha^{\nu}_{ij}\|$, $\nu = 1, 2, \ldots, N$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, l$, whose entry $\alpha^{\nu}_{ij}$ indicates whether algorithm $\nu$ assigns object $x_i$ to cluster $j$. We assume that the cluster numbers in each algorithm are fixed. Then the horizontal layer number $i$ of this three-dimensional matrix describes the results of clustering object $x_i$. As an ensemble clustering of the sample $X$, we can take the result of clustering the "new" descriptions, that is, the layers of the original matrix $\|\alpha^{\nu}_{ij}\|$, $\nu = 1, 2, \ldots, N$. As the method of clustering, we take minimization of the dispersion criterion. Let $\|\alpha^{\nu}_{i_1 j}\|, \|\alpha^{\nu}_{i_2 j}\|, \ldots, \|\alpha^{\nu}_{i_N j}\|$ be a set of N clusterings obtained with heuristic clustering algorithms. We calculate their sample mean $\|\alpha^{*\nu}_{j}\|$ as the solution of the problem $\sum_{\mu=1}^{N} \left(\alpha^{*\nu}_{j} - \alpha^{\nu}_{i_\mu j}\right)^2 \to \min_{\alpha^{*\nu}_{j}}$, from which we obtain $\alpha^{*\nu}_{j} = \frac{1}{N} \sum_{\mu=1}^{N} \alpha^{\nu}_{i_\mu j}$. Note that this method makes it possible to calculate ensemble clusterings $\mathrm{K} = \{K^{*}_{1}, K^{*}_{2}, \ldots, K^{*}_{l}\}$ such that the heuristic clusterings of the objects belonging to any one cluster of the collective solution are close to each other in the Euclidean metric. The committee synthesis of collective decisions provides more interpretable solutions.
Indeed, if $\mathrm{K}^{\nu} = \{K^{\nu}_{1}, K^{\nu}_{2}, \ldots, K^{\nu}_{l}\}$, $\nu = 1, 2, \ldots, N$, are the separate solutions of the heuristic clustering algorithms, then a cluster of the collective solution will be the "intersection" of some of the original clusters $K^{1}_{i_1}, K^{2}_{i_2}, \ldots, K^{N}_{i_N}$.
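
The construction in this section can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' implementation. The function names `one_hot_layers` and `collective_kmeans`, the array layout `alpha[i, nu, j]`, the farthest-first initialization, and the toy labelings are all hypothetical choices made for this sketch. Each object's layer is flattened into a vector, a plain Lloyd-style k-means is run on these vectors, and each cluster center is the sample mean of its members' layers, as in the formula above.

```python
import numpy as np

def one_hot_layers(labelings, n_clusters):
    """Build the binary array alpha[i, nu, j] from N label vectors of length m.

    alpha[i, nu, j] = 1 iff algorithm nu puts object i into its cluster j.
    Cluster numbers are assumed to be fixed (already aligned) across algorithms.
    """
    labelings = np.asarray(labelings)            # shape (N, m)
    n_algs, n_objects = labelings.shape
    alpha = np.zeros((n_objects, n_algs, n_clusters))
    rows = np.arange(n_objects)[:, None]         # object index i
    cols = np.arange(n_algs)[None, :]            # algorithm index nu
    alpha[rows, cols, labelings.T] = 1.0
    return alpha

def collective_kmeans(alpha, n_clusters, n_iter=100):
    """Lloyd-style k-means on the layers alpha[i] (squared Euclidean distance)."""
    m = alpha.shape[0]
    X = alpha.reshape(m, -1)                     # one flattened layer per object

    # Deterministic farthest-first initialization (an assumption of this sketch).
    centers = [X[0]]
    for _ in range(1, n_clusters):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every layer to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center as the sample mean of its members' layers.
        new_centers = centers.copy()
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members) > 0:                 # keep the old center if empty
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return labels, centers.reshape(n_clusters, *alpha.shape[1:])

# Toy data: N = 3 algorithms, m = 6 objects, l = 2 clusters.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],   # the third algorithm disagrees on object 2
]
alpha = one_hot_layers(labelings, n_clusters=2)
labels, centers = collective_kmeans(alpha, n_clusters=2)
print(labels)             # [0 0 0 1 1 1]
print(centers[0])         # mean layer of the first collective cluster
```

In this sketch, entry `centers[c][nu, j]` is the fraction of objects in collective cluster `c` that algorithm `nu` placed in its cluster `j`. Entries close to 0 or 1 therefore mark the algorithms whose original clusters the collective cluster mostly lies within.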
