**2. Related works**

In a distributed environment in which the processors are completely connected, task clustering (A. Gerasoulis, 1992; T. Yang, 1994; J. C. Liou, 1996) is a well-known task scheduling method. In task clustering, two or more tasks are merged into one cluster so that communication among them is localized, and each cluster becomes one unit of assignment to a processor. As a result, the number of clusters equals the number of required processors. In a heterogeneous distributed system, on the other hand, the objective also includes finding an optimal processor assignment, i.e., deciding which processor should be assigned to each cluster generated by task clustering. Furthermore, since the processing time and the data communication time depend on the performance of each assigned processor, each cluster should be generated with that dependence taken into account. Known task clustering methods for heterogeneous distributed systems include CHP (C. Boeres, 2004), Triplet (B. Cirou, 2001), and FCS (S. Chingchit, 1999).
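The localization effect of task clustering described above can be illustrated with a minimal sketch. The task names, the edge weights, and the helper `inter_cluster_traffic` are hypothetical and chosen only for illustration; they are not taken from any of the cited algorithms.

```python
def inter_cluster_traffic(edges, cluster_of):
    """Sum the data sizes of edges whose endpoints lie in different clusters.

    edges:      dict mapping (source task, destination task) -> data size
    cluster_of: dict mapping task -> cluster identifier
    """
    return sum(
        size
        for (src, dst), size in edges.items()
        if cluster_of[src] != cluster_of[dst]
    )


# Hypothetical task graph: four tasks and four data dependencies.
edges = {("t1", "t2"): 10, ("t1", "t3"): 2, ("t2", "t4"): 3, ("t3", "t4"): 1}

# Before clustering: every task is its own cluster.
singletons = {"t1": "c1", "t2": "c2", "t3": "c3", "t4": "c4"}

# After merging t1 and t2: the edge between them becomes local.
merged = {"t1": "c1", "t2": "c1", "t3": "c3", "t4": "c4"}

traffic_before = inter_cluster_traffic(edges, singletons)
traffic_after = inter_cluster_traffic(edges, merged)
processors_needed = len(set(merged.values()))
```

In this example, merging `t1` and `t2` removes the heaviest edge from the inter-cluster traffic and reduces the number of clusters, and hence of required processors, from four to three.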

CHP (C. Boeres, 2004) first assumes "virtual identical processors" whose processing speed is the minimum among the given set of processors. CHP then performs task clustering to generate a set of clusters. In the processor assignment phase, the cluster that can be scheduled earliest is selected, together with the processor that can give that cluster the earliest completion time among all processors, and the cluster is assigned to the selected processor. This procedure is iterated until every cluster is assigned to a processor. In the CHP algorithm, an unassigned processor tends to be selected as the next assignment target because it has no waiting time. Thus, each cluster is assigned to a different processor, so that many processors are required for execution, and CHP therefore does not lead to efficient processor utilization.
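The tendency of an earliest-completion-time policy to spread clusters over idle processors can be seen in the following simplified sketch. It is not the CHP algorithm itself: precedence constraints and communication costs are ignored, and the function `greedy_assign`, the work amounts, and the processor speeds are assumptions made for illustration.

```python
def greedy_assign(cluster_work, proc_speed):
    """Repeatedly assign the (cluster, processor) pair that finishes earliest.

    cluster_work: dict mapping cluster -> amount of work (hypothetical units)
    proc_speed:   dict mapping processor -> speed (work units per time unit)
    """
    ready_at = {p: 0.0 for p in proc_speed}  # time at which each processor is free
    unassigned = dict(cluster_work)
    assignment = {}
    while unassigned:
        # Earliest completion time over all remaining clusters and all processors.
        finish, cluster, proc = min(
            (ready_at[p] + work / proc_speed[p], c, p)
            for c, work in unassigned.items()
            for p in proc_speed
        )
        assignment[cluster] = proc
        ready_at[proc] = finish  # the processor is busy until this cluster finishes
        del unassigned[cluster]
    return assignment


# Hypothetical input: three equal clusters and three processors of different speed.
clusters = {"A": 4.0, "B": 4.0, "C": 4.0}
speeds = {"p1": 2.0, "p2": 1.5, "p3": 1.2}

assignment = greedy_assign(clusters, speeds)
processors_used = len(set(assignment.values()))
```

With these values, an idle processor always offers an earlier completion time than waiting for a busy one, so the three clusters are placed on three different processors.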

In the Triplet algorithm (B. Cirou, 2001), task groups of three tasks each, called "triplets", are formed according to the size of the data transferred among tasks and the out-degree of each task. A cluster is then generated by merging two triplets according to their execution time and data transfer time on the fastest processor and the slowest processor. Meanwhile, the processors are grouped according to their processing speed and communication bandwidth, so that several processor groups are generated. In the final stage, each cluster is assigned to a processor group according to the load of that group. The processor assignment policy in Triplet is thus that one cluster is assigned to a processor group composed of two or more processors, and such a policy does not match the concept of processor utilization.

The FCS algorithm (S. Chingchit, 1999) defines two parameters: *β*, the ratio of total task size to total data size for each cluster (where task size is expressed in the time units required to execute one instruction), and *τ*, the ratio of processing speed to communication bandwidth for each processor. While the task merging steps are performed, if *β* of a cluster exceeds *τ* of a processor, the cluster is assigned to that processor. As a result, the number of clusters depends on the speed and communication bandwidth of each processor. Thus, a "very small cluster" may be generated, in which case FCS does not match the concept of processor utilization.
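The comparison between *β* and *τ* can be sketched as follows. This is only an illustration of the two ratios as defined above, not the FCS implementation: the function names, the numeric values, and the rule of returning the first processor whose *τ* is exceeded are assumptions.

```python
def beta(task_sizes, data_sizes):
    """Ratio of total task size to total data size for one cluster."""
    return sum(task_sizes) / sum(data_sizes)


def tau(speed, bandwidth):
    """Ratio of processing speed to communication bandwidth for one processor."""
    return speed / bandwidth


def first_matching_processor(cluster_beta, processors):
    """Return the first processor whose tau is exceeded by the cluster's beta.

    processors: dict mapping processor -> (speed, bandwidth).
    Scanning in dictionary order is an assumption of this sketch.
    """
    for name, (speed, bandwidth) in processors.items():
        if cluster_beta > tau(speed, bandwidth):
            return name
    return None  # no processor qualifies yet; merging would continue


# Hypothetical processors: (processing speed, communication bandwidth).
processors = {"fast": (8.0, 2.0), "slow": (3.0, 2.0)}

# Hypothetical cluster of two tasks with two outgoing data transfers.
cluster_beta = beta(task_sizes=[6.0, 4.0], data_sizes=[2.0, 2.0])
chosen = first_matching_processor(cluster_beta, processors)
```

Here the cluster already exceeds *τ* of the slower processor after only two tasks have been merged, which illustrates how a processor with a small *τ* can attract a small cluster.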
