#### **7.1 Considerations in choosing the right algorithm**

Data mining algorithms have to be adapted to work on very large databases. Because such data are too large to fit in main memory, they reside on hard disks, so algorithms have to make as few passes over the data as possible: every fetch from secondary memory adds to the computation time and degrades run-time performance. Quadratic algorithms are too expensive at this scale. Since the execution time of many operations in clustering algorithms is quadratic, running time becomes an important constraint in choosing an algorithm for the problem at hand. The aim in the thesis is to reduce the interconnections between the circuits with a minimum amount of error, hence prototype-based clustering is used. The attributes in the data set were less important, so a proximity matrix was created. Since both PAM (Partitioning Around Medoids) and NNA (the Nearest Neighbor Algorithm) are partitional, prototype-based clustering algorithms, and the intention was to obtain the partition with the minimum interconnections, these two algorithms were used.

#### **7.2 Implementation**

The implementation consists of three stages: data extraction, partitioning, and results, using VHDL (VHSIC Hardware Description Language, where VHSIC stands for Very High Speed Integrated Circuit) as a tool. In data extraction, a VLSI circuit represented as a bipartite graph is considered. The bipartite graph considered for the approach is shown in Fig. 17.

Fig. 17. Bipartition Circuit

Algorithms for CAD Tools VLSI Design 153



| Cell | No. of edges | Bipartition |
|------|--------------|-------------|
| A    | 2            | 1           |
| B    | 2            | 1           |
| C    | 3            | 1           |
| D    | 3            | 0           |
| E    | 3            | 0           |
| F    | 4            | 0           |

Sub-circuit 1: A, B, C; total edges = 7. Sub-circuit 2: D, E, F; total edges = 10.

Table 3. Bipartition Matrix

The block diagram shows how sub-circuits with minimum interconnections are recognized using two techniques (Nearest Neighbor and PAM). A new clustering algorithm is explored.

#### **7.3 Applying clustering techniques to VLSI circuit partitioning**

In adapting the two cluster partitioning algorithms to the area of VLSI circuit partitioning, the following considerations are of utmost importance.

The two algorithms take as input an adjacency matrix, which expresses the similarity between the data items to be clustered in the form of distances. This approach uses that matrix to partition circuits, so the circuit to be partitioned is the data to be clustered, and the basic units on which the algorithms act are the nodes of the circuit.

Similarity between nodes in a circuit

Here, the input is the adjacency matrix, which defines the similarity between different nodes in the circuit. The similarity between nodes is quantified from several characteristics of the logic gates, such as:

1. Interconnections between nodes
2. Common signals as input
3. Functionality
4. Physical distance
5. Presence of the node on the maximum delay path

For example, if two nodes are interconnected, the similarity between them is increased and the distance between them is reduced, compared to two nodes that are not connected together.

Also, if some nodes receive a common signal, such as a set of flip-flops sharing a common clock signal, it is desirable to have them partitioned into the same sub-circuit so as to reduce problems due to signal delay of synchronous control inputs. So, the distances between such nodes are also low.

The distance of a node to itself is taken as 0. A low distance value means high similarity. A high distance value means maximum dissimilarity, and therefore the least similarity, so such nodes can be placed in different sub-circuits.
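As an illustration of how such a distance matrix might be assembled, the following minimal Python sketch starts every pair of distinct nodes at a default distance and lowers it for each interconnection and each shared input signal. The function name `build_distance_matrix`, the default distance `base`, and the reduction `step` are assumptions made for this example only; the chapter does not state the actual weights used.

```python
from itertools import combinations


def build_distance_matrix(nodes, edges, shared_signals, base=5, step=2):
    """Return a symmetric distance matrix as a dict of dicts.

    base and step are hypothetical weights, not values from the chapter.
    """
    # Distance of a node to itself is 0; every other pair starts at `base`.
    dist = {a: {b: (0 if a == b else base) for b in nodes} for a in nodes}

    def lower(a, b):
        # More similarity means a smaller distance, floored at 1.
        d = max(1, dist[a][b] - step)
        dist[a][b] = dist[b][a] = d

    for a, b in edges:  # direct interconnections
        lower(a, b)
    for group in shared_signals:  # e.g. flip-flops on a common clock
        for a, b in combinations(sorted(group), 2):
            lower(a, b)
    return dist


nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("C", "D")]
shared_signals = [{"A", "B"}]
dist = build_distance_matrix(nodes, edges, shared_signals)
```

In this toy input, A and B are both interconnected and share a signal, so they end up closer to each other than C and D, which are only interconnected; unrelated pairs keep the default distance.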

This adjacency or distance matrix is acted upon by the two algorithms to divide the circuit into sub-circuits, with the objective of keeping the number of interconnections between them to a minimum. Adapting and applying data mining tools to VLSI circuit partitioning is a new approach. Improvements and optimizations to the two algorithms are necessary to make them workable and viable as CAD tools.
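The objective of minimum interconnection can be made concrete by counting the edges that cross sub-circuit boundaries. The sketch below is one straightforward way to do this; the edge list and the name `count_interconnections` are hypothetical and do not come from the circuits in this chapter.

```python
def count_interconnections(edges, assignment):
    """Count edges whose two endpoints lie in different sub-circuits."""
    return sum(1 for a, b in edges if assignment[a] != assignment[b])


# Hypothetical six-node netlist, used only to exercise the function.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("A", "C")]

# Candidate partition: A, B, C in sub-circuit 1 and D, E, F in sub-circuit 0.
assignment = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0, "F": 0}
cut = count_interconnections(edges, assignment)

# Moving C to the other side cuts more edges, so it is a worse partition.
worse = dict(assignment, C=0)
worse_cut = count_interconnections(edges, worse)
```

A partitioning algorithm can use a count of this kind to compare candidate partitions and prefer the one with fewer crossing edges.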

Circuit chosen for implementation and testing

The circuit on which the two data mining algorithms (NNA and PAM) are implemented is a Binary Coded Decimal (BCD) to seven-segment code converter (Fig. 18). It has 4 inputs and 7 outputs. In this figure, each rectangular block is considered a node. A node performs a defined function (Fig. 19); it may be a simple AND gate or it may contain many interconnected flip-flops. So, a node contains one or more components and performs a logical function, and the level of abstraction of a node can be changed to suit the basic unit understandable by a CAD tool.

Fig. 18. Circuit of BCD code to Seven Segment code converter


Fig. 19. A node (node 5) enlarged.

This shows that a node which is part of the main circuit consists of gates, such as NAND and OR gates, or is a block that performs a logical function.

#### **7.4 How to choose k and threshold value**

#### **7.4.1 PAM Algorithm – Choosing initial medoids**

PAM starts from an initial set of representative objects, called medoids, one per cluster, and iteratively replaces one of the medoids with one of the non-medoids if the swap improves the total distance of the resulting clustering. The algorithm is based on the search for k medoids that are representative of the observations (here, the circuit nodes) according to the distance matrix. These k medoids should reflect the structure of the data. Once the set of k medoids is defined, it is used to construct the k clusters by assigning each observation to its nearest medoid. The target is to identify the medoids that minimize the sum of the dissimilarities between the observations and their medoids. The choice of the initial medoids is therefore very important. A medoid is the most centrally located point in a cluster and serves as the representative point of that cluster. The initial medoids determine the quality of the clusters formed and the computational speed. If the initial medoids are close to the final optimal medoids, the final clusters are reached at reduced computational cost. Otherwise, the number of iterations needed to find the final medoids increases, which in turn increases the time taken to obtain results and the computational cost.
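The swap procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the textbook PAM swap phase on a small hand-made distance matrix, not the implementation used for the results in this chapter; the function names and the example matrix are assumptions.

```python
def total_cost(dist, medoids):
    """Sum over all nodes of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def assign(dist, medoids):
    """Label each node with the index of its nearest medoid."""
    return [min(medoids, key=lambda m: dist[i][m]) for i in range(len(dist))]


def pam(dist, initial_medoids, max_iter=100):
    medoids = list(initial_medoids)
    cost = total_cost(dist, medoids)
    for _ in range(max_iter):
        best_cost, best_medoids = cost, None
        # Try replacing each medoid with each non-medoid node.
        for pos in range(len(medoids)):
            for h in range(len(dist)):
                if h in medoids:
                    continue
                candidate = medoids[:pos] + [h] + medoids[pos + 1:]
                c = total_cost(dist, candidate)
                if c < best_cost:
                    best_cost, best_medoids = c, candidate
        if best_medoids is None:  # no swap improves the cost: converged
            break
        cost, medoids = best_cost, best_medoids
    return medoids, assign(dist, medoids), cost


# Hypothetical matrix: nodes 0-2 are close together, as are nodes 3-5.
dist = [
    [0, 1, 2, 8, 9, 9],
    [1, 0, 1, 8, 8, 9],
    [2, 1, 0, 7, 8, 8],
    [8, 8, 7, 0, 1, 2],
    [9, 8, 8, 1, 0, 1],
    [9, 9, 8, 2, 1, 0],
]
medoids, labels, cost = pam(dist, initial_medoids=[0, 1])
```

Here the initial medoids are deliberately poor, both lying in the same group, and one swap moves a medoid into the second group. Starting closer to the final medoids would need fewer swap rounds, which is the cost argument made above.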


The representation by k-medoids has two advantages. First, it places no limitations on attribute types. Second, the choice of medoids is dictated by the location of a predominant fraction of the points inside a cluster, so it is less sensitive to the presence of outliers. PAM is thus an iterative optimization that combines relocation of points between prospective clusters with re-nomination of points as potential medoids. An earlier experiment to find the optimum value of the threshold "t", which decides the cluster density and quality, showed that threshold values from 2 to 5 give the best minimization of interconnections between sub-circuits. Therefore, threshold values of 2 and 3 were chosen for NNA and PAM, respectively, based on this experiment.
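To show how a threshold of this kind can control the number and density of clusters, the sketch below uses the common textbook form of nearest neighbor clustering: each node joins the cluster of its nearest already-clustered node if that distance is at most t, and otherwise starts a new cluster. Whether the chapter's NNA uses exactly this rule, or a strict rather than inclusive comparison, is an assumption; the distance matrix is hypothetical.

```python
def nearest_neighbor_clusters(dist, t):
    """Threshold-based nearest neighbor clustering; returns one label per node."""
    n = len(dist)
    if n == 0:
        return []
    labels = [0] * n  # node 0 seeds the first cluster
    next_label = 1
    for i in range(1, n):
        # Nearest node among those already assigned to a cluster.
        nearest = min(range(i), key=lambda j: dist[i][j])
        if dist[i][nearest] <= t:
            labels[i] = labels[nearest]
        else:
            labels[i] = next_label
            next_label += 1
    return labels


# Hypothetical matrix: nodes 0-2 are close together, as are nodes 3-5.
dist = [
    [0, 1, 2, 8, 9, 9],
    [1, 0, 1, 8, 8, 9],
    [2, 1, 0, 7, 8, 8],
    [8, 8, 7, 0, 1, 2],
    [9, 8, 8, 1, 0, 1],
    [9, 9, 8, 2, 1, 0],
]
labels_t2 = nearest_neighbor_clusters(dist, t=2)
labels_t0 = nearest_neighbor_clusters(dist, t=0)
```

With t = 2 the six nodes fall into two clusters, while t = 0 leaves every node in its own cluster, which illustrates why the threshold has to be tuned before comparing interconnection counts.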

#### **7.4.2 Details of the partitioned circuits: results on a circuit with 8 nodes**

Fig. 20 shows Testing Circuit 1, with 8 nodes, before partitioning. The circuits obtained after partitioning with the NN algorithm and with the PAM algorithm are shown in Fig. 21 and Fig. 22, respectively.

Fig. 20. Circuit before applying partitioning techniques

The circuit shown in Fig. 7.10 is a BCD to seven-segment code converter with 8 nodes, shown before the partitioning algorithms are applied. This circuit was tested in hardware and its functionality was confirmed to be correct.

Partitioned circuit obtained after applying the Nearest Neighbor algorithm

Fig. 21. NNA partitioned circuit showing 4 sub-circuits

Partitioned circuit obtained after applying the Partitioning Around Medoids algorithm

Fig. 22. PAM partitioned circuit showing 2 sub-circuits

#### **Results on a Circuit with 15 Nodes**

Example Testing Circuit 2 with 15 nodes:

Fig. 23. Circuit before applying partitioning techniques (Rubin, Willy Publications)

Partitioned circuit obtained after applying the Nearest Neighbor algorithm

Fig. 24. NNA partitioned circuit showing 5 sub-circuits

Partitioned circuit obtained using the Partitioning Around Medoids algorithm

Fig. 25. PAM partitioned circuit showing 3 sub-circuits

#### **8. Conclusion**

This section provides observations about the various techniques explained in this chapter, with a detailed, results-based explanation of the Nearest Neighbor and Partitioning Around Medoids clustering algorithms.

#### **8.1 Memetic approach to circuit partitioning**

Memetic algorithms (MAs) are population-based heuristic search approaches for combinatorial optimization problems, based on cultural evolution. They are designed to search in the space of locally optimal solutions instead of the space of all candidate solutions. This is achieved by applying local search after each of the genetic operators. Crossover and mutation operators are applied to randomly chosen individuals a predefined number of times. To maintain local optimality, the local search procedure is applied to the newly created individuals.
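The memetic scheme of Section 8.1 (genetic operators followed by local search on every new individual) can be sketched for a balanced bipartition as follows. This is a minimal illustration under several assumptions that the chapter does not specify: uniform crossover, single-node mutation, a pair-swap local search, a random repair step to restore balance, and a toy graph of two four-node cliques joined by one edge. All function names are hypothetical.

```python
import random


def cut_size(edges, part):
    """Number of edges whose endpoints are on different sides."""
    return sum(1 for a, b in edges if part[a] != part[b])


def repair(part, rng):
    """Flip random nodes until the two sides differ by at most one node."""
    part = list(part)
    n = len(part)
    while abs(2 * sum(part) - n) > 1:
        majority = 1 if 2 * sum(part) > n else 0
        v = rng.choice([i for i, p in enumerate(part) if p == majority])
        part[v] ^= 1
    return part


def local_search(edges, part):
    """Swap pairs across the cut while a swap lowers the cut size."""
    part = list(part)
    improved = True
    while improved:
        improved = False
        best = cut_size(edges, part)
        for u in range(len(part)):
            for v in range(u + 1, len(part)):
                if part[u] == part[v]:
                    continue
                part[u], part[v] = part[v], part[u]
                c = cut_size(edges, part)
                if c < best:
                    best, improved = c, True
                else:  # undo a swap that does not help
                    part[u], part[v] = part[v], part[u]
    return part


def memetic_bipartition(n, edges, pop_size=8, generations=20, seed=0):
    rng = random.Random(seed)

    def random_individual():
        part = [1] * (n // 2) + [0] * (n - n // 2)
        rng.shuffle(part)
        return local_search(edges, part)

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        # Uniform crossover of two randomly chosen parents.
        p1, p2 = rng.sample(population, 2)
        child = [rng.choice(pair) for pair in zip(p1, p2)]
        child = local_search(edges, repair(child, rng))
        # Mutation: flip one node of a randomly chosen individual.
        mutant = list(rng.choice(population))
        mutant[rng.randrange(n)] ^= 1
        mutant = local_search(edges, repair(mutant, rng))
        # Keep the pop_size individuals with the smallest cut.
        population = sorted(population + [child, mutant],
                            key=lambda p: cut_size(edges, p))[:pop_size]
    return population[0]


clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
edges = clique_a + clique_b + [(3, 4)]
best = memetic_bipartition(8, edges)
best_cut = cut_size(edges, best)  # 1: only the bridge (3, 4) is cut
```

Because local search is applied to every new individual, the population only ever contains locally optimal partitions, which is the defining property described in Section 8.1. On a real netlist, the pair-swap search would normally be replaced by a faster move-based heuristic.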
