Advanced Analytics and Artificial Intelligence Applications

between all Web services in the collection is a process executed in pairs. Let W be the tuple that represents all Web services in the collection as follows:

$$W = \langle P, I, O \rangle \tag{1}$$

where P represents all operation names; I is the set of input parameters; O is the set of output parameters.

In particular, in this work the similarity measures were applied only on the operation names. Therefore, the similarity calculation takes as input a matrix of all operation names in the collection of Web services, that is, as follows:

$$\text{Let } P = \left(p_1, p_2, p_3, \ldots, p_n\right) \tag{2}$$

$$\text{Input Matrix} = \left\{ \left(p_i, q_i\right) \in P \times P;\ 1 \le i \le n \right\} \tag{3}$$

Eight measures that exploit WordNet database were used to calculate the semantic similarity between Web service operations. WordNet is a lexical database available online; it is organized into five categories: nouns, verbs, adjectives, adverbs, and function words [9]. The utilization of WordNet for semantic similarity measurements is a good approach in contrast with the traditional syntactic similarity approaches, specifically in the case of service operations, as they normally include a verb indicating the main functionality of the operation method.

Additionally, an application programming interface (API) that implements a large collection of semantic similarity measures (140 methods) is available (WNetSSAPI<sup>2</sup>). A deeper analysis and comparison of similarity measures is out of the scope of this work. Table 2 shows a summary of the semantic similarity measures used.

With these measures, all service operations are compared, and a set of eight matrixes are created with the distances between them. Figure 4 shows an example of the calculation of the eight similarities with operation names.

Figure 4. Example of the calculation of semantic similarities.

<sup>2</sup> http://wnetss-api.smr-team.org/

## 5. Artificial Bee Colony algorithm

The Artificial Bee Colony (ABC) algorithm is a nature-inspired, swarm-intelligence optimization technique that simulates the foraging behavior of honey bees and has been applied to a variety of practical problems with good results. It was proposed in 2005 by Karaboga [18, 19]. The collective intelligence model of the bee colony consists of:


The ABC algorithm has different modes of behavior:


Figure 5. ABC algorithm general workflow.

An important behavior of employed and onlooker bees is their capacity to share information (memory) in order to choose and adjust the food source value. This value depends on the proximity to the nest and on the richness or concentration of honey energy [18]. The exchange of information occurs during the waggle dance at the hive. Onlooker foragers watch numerous dances in the dancing area and decide to employ themselves at the most profitable food source. When a recruited onlooker forager starts searching and locates the food source, it uses its own capability to memorize the location and starts exploiting it; at that point the onlooker forager becomes an employed forager. In the ABC algorithm, the set of possible solutions represents the food sources, and the food source value represents the quality of a solution. A general representation of the ABC workflow is presented in Figure 5.
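To make the roles of employed, onlooker, and scout bees concrete, the following is a minimal sketch of a basic ABC loop for continuous minimization. It follows the commonly described form of the algorithm rather than the hybrid variant presented in this chapter, and the function name `abc_minimize`, its parameters (`n_sources`, `limit`, `max_iter`, `seed`), and the quality transform `1 / (1 + cost)` are illustrative assumptions.

```python
import random


def abc_minimize(objective, dim, bounds, n_sources=10, limit=20,
                 max_iter=200, seed=0):
    """Minimize `objective` over a box with a basic ABC loop (illustrative)."""
    rng = random.Random(seed)
    low, high = bounds

    def random_source():
        return [rng.uniform(low, high) for _ in range(dim)]

    # Each food source is a candidate solution; its cost is the objective value.
    sources = [random_source() for _ in range(n_sources)]
    costs = [objective(s) for s in sources]
    trials = [0] * n_sources  # consecutive failed improvements per source

    def explore(i):
        """Try a neighbor of source i and keep it only if it is better."""
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        phi = rng.uniform(-1.0, 1.0)
        candidate = list(sources[i])
        candidate[d] += phi * (sources[i][d] - sources[k][d])
        candidate[d] = min(high, max(low, candidate[d]))
        cost = objective(candidate)
        if cost < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    for _ in range(max_iter):
        # Employed bees: one neighborhood search per food source.
        for i in range(n_sources):
            explore(i)

        # Onlooker bees: pick sources with probability proportional to quality.
        quality = [1.0 / (1.0 + c) for c in costs]  # assumes non-negative costs
        for _ in range(n_sources):
            i = rng.choices(range(n_sources), weights=quality, k=1)[0]
            explore(i)

        # Remember the best source seen so far.
        if min(costs) < best_cost:
            best_cost = min(costs)
            best = list(sources[costs.index(best_cost)])

        # Scout bees: abandon exhausted sources and search at random.
        for i in range(n_sources):
            if trials[i] > limit:
                sources[i] = random_source()
                costs[i] = objective(sources[i])
                trials[i] = 0

    return best, best_cost
```

For example, `abc_minimize(lambda x: sum(v * v for v in x), dim=2, bounds=(-5.0, 5.0))` searches for the minimum of a simple sphere function. The `limit` counter plays the role of the abandonment rule: a source that fails to improve for too long is replaced by a scout.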

## 6. Hybrid algorithm description

A hybrid algorithm was proposed that makes ABC self-adjusting: during each iteration it decides the number of clusters by incorporating K-means and a Consensus method. In particular, K-means is used to select the elements inside each generated cluster in order to decide the centroids for similarity calculations. A solution of the algorithm is represented as a vector of size n (the number of Web services to cluster), where the value at each position is the group to which the corresponding Web service belongs.
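The solution vector described above can be illustrated with a short sketch. The helper names `groups_from_solution` and `uses_all_groups`, and the convention that groups are labeled 1 to k, are assumptions made only for this example.

```python
from collections import defaultdict


def groups_from_solution(solution):
    """Map each group label to the indices of the Web services assigned to it."""
    groups = defaultdict(list)
    for service_index, group_id in enumerate(solution):
        groups[group_id].append(service_index)
    return dict(groups)


def uses_all_groups(solution, n_groups):
    """Return True when every label 1..n_groups appears at least once."""
    return set(solution) == set(range(1, n_groups + 1))


# Five Web services assigned to three groups (labels are hypothetical).
solution = [1, 2, 1, 3, 2]
groups = groups_from_solution(solution)
```

Because each position holds exactly one label, a Web service can never belong to two groups at once in this encoding.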

#### 6.1 Objective function

The objective function of this hybrid algorithm is shown in Eq. (4):

$$\min \sum_{\substack{i=1 \\ x_i \in c_i \\ y_i \in c_i}}^{C} d\left(x_i, y_i\right) \tag{4}$$

where d, distance between the centroid yi of cluster i and a service xi; yi, centroid of cluster i; xi, one of the services included in cluster i. No group of services should be empty, and there should be no intersection between groups.

#### 6.2 Filtering similarities

The first stage of the hybrid algorithm consists of filtering the eight matrices that contain the information of similarities between Web services. The filtering discards values that exceed the limits allowed and established by the similarity measures; as a result, new matrices are generated with a degree of 95% certainty in the measurements. Eq. (5) shows the filtering calculation:

$$X - 1.96 \frac{\sigma}{\sqrt{N}} \le \mu \le X + 1.96 \frac{\sigma}{\sqrt{N}} \tag{5}$$

where X, average matrix; 1.96, table value (the standard normal critical value for a 95% confidence level); σ, standard deviation; N, element of the similarity matrix; μ, average similarity.

#### 6.3 Food source representation

After the filtering process, all obtained data is stored in an average matrix (food sources), discarding the positions that contain null or zero information; that is, the average matrix was calculated considering only those values that were filtered and accepted as feasible values that contribute information to the hybrid algorithm.

Bio-Inspired Hybrid Algorithm for Web Services Clustering

DOI: http://dx.doi.org/10.5772/intechopen.85200

#### 6.4 Bee's representation

Using the average matrix, a set of arrays is generated representing the bees and other important information such as the number of groups and centroids. Table 3 shows the structure of the solution generated.

| Groups generated | Centroids | β | α | Normalization | Assessment | Max group | Limit |
|---|---|---|---|---|---|---|---|
| 32,231 | 32,001 | 1.068 | 1.55 | 0.301 | 0.36 | 3 | 5 |

Table 3. Composition of the vector with information of generated groups.

a. Max group. This hybrid algorithm does not require the user to indicate how many groups it should generate; as it iterates, the algorithm determines how many groups to generate based on the results obtained in the previous iteration and on the Consensus method. Initially, the i-th bee generates a random number of groups, based on a discrete uniform distribution with limits 2 to N/2; in the subsequent iterations of the algorithm, the i-th bee produces a random number γ (based on a normal distribution of the weighted variance of the Max group determined by the colony in the previous iteration), and then a simple rounding is applied to γ.

b. Centroids. Next, the i-th bee must determine the centroids of the γ groups involved in the classification. The centroid sub-vector is formed by N integer elements, where xijk is zero if the j-th Web service is not considered a centroid of any group, whereas xijk = a implies that the j-th Web service is the centroid of the k-th group. Initially such values are assigned randomly; later, the values of the sub-vector are obtained by applying Eq. (6):

$$X'_{new} = \operatorname{round}\left(X'_{i} - \phi\left(X'_{i} - X'_{s}\right)\right) \tag{6}$$

where X′<sub>new</sub>, new vector; X′<sub>i</sub>, first vector generated by the algorithm; ϕ, random number between 0 and 1.

a. β represents the sum of the similarity between centroids of the groups, while α is the summation of the similarity between members of each group to the corresponding centroid. The objective is to minimize β and maximize α simultaneously.

b. For each bee the assessment function is calculated using f(x) = 1/β + α; the objective is to obtain the highest f(x) value. During each iteration, the f(x) value is stored in the solution vector.

c. Normalization is used to determine the quality of the food source found. Normalization is calculated with

$$n_i = \frac{f(x_i)}{\sum_{i=1}^{N} f(x_i)}$$

The bee will abandon the food source if there is a better food source in the near surrounding.
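The bee-level calculations of Section 6.4, namely the assessment f(x) = 1/β + α, the normalization of the assessment values across bees, and the rounding update of Eq. (6), can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, `centroids` is assumed to map each group label to the index of its centroid service, α is assumed to exclude the similarity of a centroid with itself, and the second vector in the update (`x_s`) is assumed to be another bee's vector because the text does not define it.

```python
from itertools import combinations


def assessment(sim, labels, centroids):
    """Compute f(x) = 1/beta + alpha for one bee (illustrative reading)."""
    # beta: summed similarity between every pair of group centroids.
    beta = sum(sim[a][b] for a, b in combinations(centroids.values(), 2))
    if beta == 0:
        raise ValueError("beta is zero; 1/beta is undefined")
    # alpha: summed similarity between each service and its group's centroid
    # (assumption: the centroid's similarity with itself is not counted).
    alpha = sum(sim[j][centroids[g]]
                for j, g in enumerate(labels) if j != centroids[g])
    return 1.0 / beta + alpha


def normalize(assessments):
    """Divide each bee's assessment by the colony total."""
    total = sum(assessments)
    return [value / total for value in assessments]


def update_vector(x_i, x_s, phi):
    """Rounded update in the style of Eq. (6); x_s is an assumed second vector."""
    # Python's round() rounds exact halves to the nearest even integer.
    return [round(a - phi * (a - b)) for a, b in zip(x_i, x_s)]


# Hypothetical 4-service average similarity matrix with two groups.
sim = [[1.0, 0.8, 0.2, 0.1],
       [0.8, 1.0, 0.3, 0.2],
       [0.2, 0.3, 1.0, 0.9],
       [0.1, 0.2, 0.9, 1.0]]
labels = [1, 1, 2, 2]
centroids = {1: 0, 2: 2}
score = assessment(sim, labels, centroids)
```

In this reading, a bee scores higher when its centroids are dissimilar to each other (small β) and its group members are similar to their centroid (large α), which matches the stated goal of obtaining the highest f(x).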
