**2. Cluster problem formulation**

The clustering problem in our treatment is formulated by reference to a graph G = (N, E), where N = {1, …, n} is a set of nodes (cluster elements) and E is a set of edges (pairwise connections between elements) given by E ⊂ N × N = {(p,q): p,q ∈ N}. The notation (p,q) denotes an unordered pair, so that (p,q) = (q,p); equivalently, the pair may be written in set notation as {p,q}. Each edge e = (p,q) ∈ E has an associated cost (or length) denoted by c(e) (= c(p,q)). It is not necessary to assume that G is complete or connected. We also do not require that the costs c(e) be nonnegative.
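As an illustration of this formulation, the following minimal sketch stores each edge as an unordered pair, so that (p,q) and (q,p) refer to the same cost. The function name `build_graph` and the sample data are hypothetical and not part of our method. Rejecting pairs with p = q is an assumption made for the sketch; the formulation above does not address such pairs.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[int]  # an unordered pair {p, q}


def build_graph(
    n: int, weighted_pairs: Iterable[Tuple[int, int, float]]
) -> Tuple[Set[int], Dict[Edge, float]]:
    """Return (N, c) with N = {1, ..., n} and c mapping each edge {p, q} to c(p, q)."""
    nodes = set(range(1, n + 1))
    cost: Dict[Edge, float] = {}
    for p, q, c in weighted_pairs:
        if p not in nodes or q not in nodes:
            raise ValueError(f"edge ({p}, {q}) uses a node outside N")
        if p == q:
            # Assumption of this sketch: self-loops are excluded.
            raise ValueError(f"self-loop at node {p}")
        # frozenset makes (p, q) and (q, p) the same dictionary key.
        cost[frozenset((p, q))] = c
    return nodes, cost


# The graph need not be complete or connected, and costs may be negative.
nodes, cost = build_graph(5, [(1, 2, 3.0), (3, 2, -1.5), (4, 5, 2.0)])
print(cost[frozenset((2, 3))])  # same entry as the pair (3, 2)
```

A dictionary keyed by `frozenset` is only one convenient choice; an adjacency list would serve equally well when neighbors must be enumerated quickly.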

The goal is to partition N into sets (clusters) N<sup>k</sup>, k ∈ K = {1, …, ko}, where the value ko is automatically determined by the clustering process. We also identify an associated set of edges E<sup>k</sup> ⊂ {(p,q): p,q ∈ N<sup>k</sup>}, where the subgraph (N<sup>k</sup>, E<sup>k</sup>) of G constitutes a min cost spanning tree over the nodes of N<sup>k</sup>. In contrast to those tree-based clustering approaches that begin with a min cost spanning tree over all of G and selectively delete particular edges, our algorithm produces subgraphs (N<sup>k</sup>, E<sup>k</sup>), k ∈ K, that may not be possible to obtain by deleting edges from such a tree.
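The conditions just stated can be checked mechanically. The sketch below is an illustration only and is not the clustering algorithm itself. It tests whether a proposed collection of clusters partitions N, whether each edge set forms a spanning tree over its cluster, and what the minimum spanning tree cost is for comparison. All function names and the sample data are hypothetical. We assume here that the min cost spanning tree is taken over the edges of G with both endpoints in the cluster, and we use Kruskal's method, which remains valid when costs are negative.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Set

Edge = FrozenSet[int]


def _find(parent: Dict[int, int], v: int) -> int:
    """Union-find root lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def is_partition(nodes: Set[int], clusters: List[Set[int]]) -> bool:
    """True if the clusters are nonempty, pairwise disjoint, and cover N."""
    seen: Set[int] = set()
    for cluster in clusters:
        if not cluster or seen & cluster:
            return False
        seen |= cluster
    return seen == nodes


def is_spanning_tree(cluster: Set[int], tree_edges: Iterable[Edge]) -> bool:
    """True if the edges are acyclic, lie inside the cluster, and number |cluster| - 1."""
    edges = list(tree_edges)
    if len(edges) != len(cluster) - 1:
        return False
    parent = {v: v for v in cluster}
    for e in edges:
        if len(e) != 2 or not e <= cluster:
            return False
        p, q = tuple(e)
        rp, rq = _find(parent, p), _find(parent, q)
        if rp == rq:  # this edge would close a cycle
            return False
        parent[rp] = rq
    return True


def mst_cost(cluster: Set[int], cost: Dict[Edge, float]) -> Optional[float]:
    """Kruskal over edges inside the cluster; None if they do not connect it."""
    parent = {v: v for v in cluster}
    total, used = 0.0, 0
    inside = [(c, e) for e, c in cost.items() if e <= cluster]
    for c, e in sorted(inside, key=lambda ce: ce[0]):
        p, q = tuple(e)
        rp, rq = _find(parent, p), _find(parent, q)
        if rp != rq:
            parent[rp] = rq
            total += c
            used += 1
    return total if used == len(cluster) - 1 else None


# Hypothetical data: two clusters, each with a candidate tree E^k.
cost = {
    frozenset((1, 2)): 3.0,
    frozenset((2, 3)): -1.5,
    frozenset((1, 3)): 4.0,
    frozenset((4, 5)): 2.0,
}
clusters = [{1, 2, 3}, {4, 5}]
trees = [[frozenset((1, 2)), frozenset((2, 3))], [frozenset((4, 5))]]

ok_partition = is_partition({1, 2, 3, 4, 5}, clusters)
ok_trees = all(is_spanning_tree(n_k, e_k) for n_k, e_k in zip(clusters, trees))
ok_min = all(
    sum(cost[e] for e in e_k) == mst_cost(n_k, cost)
    for n_k, e_k in zip(clusters, trees)
)
print(ok_partition, ok_trees, ok_min)
```

The minimality check compares total costs rather than edge sets, because a cluster may admit several distinct spanning trees of equal minimum cost.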

The class of clustering methods we describe is based on specifying the value of a parameter W, whose value uniquely determines the outcome of each clustering method within the class. W is expressed as an additive threshold for selecting edges, and hence nodes, to be added to a current construction (a collection of subgraphs). We observe that W can equally be expressed as a multiplicative threshold in the case where the costs are nonnegative; the two approaches are equivalent in this instance.

We start with any selected value W = Wo ≥ 0 and, after obtaining a collection of clusters C(W) for a given W, systematically modify W so that over successive iterations all possible cluster collections C(W) for W ≥ Wo are generated without duplication. The complete range of cluster collections is obtained by choosing Wo = 0 (or Wo = 1 in the multiplicative version).
