**Applications & Exploitations**

### **Fitting Truncated Mode Regression Model by Simulated Annealing**

Maoxi Tian, Jian He and Keming Yu

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66070

#### **Abstract**

Like the mean, median, and standard deviation, the mode, as the value that appears most often in a set of data, is an important feature of a distribution. The numerical value of the mode is the same as that of the mean and median in a symmetric distribution but may be very different in a highly skewed distribution. Mode regression, which models the relationship between the mode of a dependent variable and some covariates, was first introduced by Lee for truncated dependent variables. Some modifications of truncated mode regression have been proposed recently. However, little progress has been made on the computation or algorithms for fitting a mode regression, because it involves an NP-hard optimization problem. In this paper we first introduce the popular simulated annealing (SA) algorithm to solve the truncated mode regression optimization. Simulation experiments show that SA compares favorably with alternative methods. A mode regression fitted with the proposed algorithm is then applied to explore the typical income structure of China. We also compare the income returns to gender, education, experience, job sector, and district between the majority of workers with typical income and workers with mean and median income, via a comparison of mode regression, mean regression, and median regression.

**Keywords:** income inequality, Lee's estimate, median regression, mode, mode regression fitting, simulated annealing algorithm, truncated variable

## **1. Introduction**

The mode, the most likely value of a distribution, has wide applications in biology, astronomy, economics, and finance. For example, in archeology, practical questions often focus on which era a biological species most likely lived in, according to its fossils. In such cases, mode regression provides a convenient summary of how the regressors

© 2017 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

affect the conditional mode. Income inequality is of growing concern to people in rich as well as poor countries. This inequality may result in problems with social stability in a region or a country. But average income as a statistical measure is strongly affected by a small number of the richest, who earn a high proportion of all income. The mode is the right statistic to measure the typical income over the whole population; that is, a mode-based measurement represents the income of the majority of workers. The causes of income inequality have attracted much attention in the literature and among the public. Education and experience are often cited as important factors for income. The gender pay gap has been an issue in many regions, particularly in some developing countries. The wage difference between public sector staff and private workers changes from country to country and region to region. Mode regression, or mode-based regression analysis, provides a direct and powerful tool to explore the typical income and the returns to education, experience, gender, sector, and so on. Mode-based clustering techniques have also been developed [1].

The mode regression models the relationship between the conditional mode of the dependent variable *y*\* and covariates *x* as

$$y^\* = \mathbf{x}^\prime \boldsymbol{\beta} + \varepsilon \tag{1}$$

where the vector *x* = (1, *x*1, …, *xp*)′, *β* = (*β*0, *β*1, …, *βp*)′ is the unknown parameter vector, and *ε* is the model error. Let mode(*y*\*|*x*) be the mode of *y*\* conditional on *x*; then mode(*y*\*|*x*) = *x*′*β* ⇔ mode(*ε*|*x*) = 0. Usually, one assumes that the support of the density of *ε* is wider than [−*w*, *w*] for a suitably chosen *w* > 0.

Lee [2] first considers truncated mode regression, where *y*\* is truncated from below at *c*, i.e., *y*\* is observed only when *y*\* > *c*; that is, *y* = max(*y*\*, *c*). Examples of truncated regression include the following: (1) students who wish to be nominated for a university prize must have a minimum examination mark of 70 out of 100 to qualify, so the sample is truncated at an examination mark of 70; (2) a researcher has data for a sample of British citizens whose income is above the poverty line, so the lower part of the income distribution is truncated. Truncated regression cannot be fitted by ordinary least squares (OLS), as OLS does not adjust the coefficient estimates to account for the effect of truncation, so the estimated coefficients may be severely biased. This can be conceptualized as a model specification error [3].
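The bias of OLS under this mechanism is easy to see numerically. The following Python sketch (illustrative; the true slope, sample size, and truncation point are our own choices, not from the chapter) generates data from model (1), applies *y* = max(*y*\*, *c*), and fits a naive OLS line:

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_slope, c = 2000, 1.0, 0.0

x = rng.normal(size=n)
y_star = true_slope * x + rng.normal(size=n)  # latent outcome from model (1)
y = np.maximum(y_star, c)                     # observed outcome, truncated below at c

# Naive OLS ignores the truncation and attenuates the slope
slope_ols = np.polyfit(x, y, 1)[0]
print(slope_ols)  # about 0.5, well below the true slope of 1
```

With 50% of the latent values falling below *c*, the fitted slope is roughly half the true slope, illustrating the severe bias the text describes.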

Under model (1), given observations on {(*x*1, *y*1), (*x*2, *y*2), …, (*xn*, *yn*)}, we aim at estimating *β*. This is the main task of fitting the model. Lee's [2] mode regression estimates *β* by

$$\hat{\boldsymbol{\beta}} = \arg\max\_{\boldsymbol{\beta}} \left( n^{-1} \sum\_{i=1}^{n} I\left[ \left| y\_i - \max(\mathbf{x}\_i^{\prime} \boldsymbol{\beta}, c + w) \right| \leq w \right] \right) \tag{2}$$

where *I*[ ] is the indicator function, which takes the value 1 if the condition inside [ ] is satisfied and 0 otherwise.
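Concretely, the objective (2) is just the fraction of observations lying within a band of half-width *w* around max(*x*′*β*, *c* + *w*). A minimal Python sketch (our own function and variable names; the simulated data are illustrative):

```python
import numpy as np

def lee_objective(beta, X, y, c, w):
    """Eq. (2): fraction of observations within a band of half-width w
    around the truncated regression plane max(x_i' beta, c + w)."""
    center = np.maximum(X @ beta, c + w)
    return np.mean(np.abs(y - center) <= w)

# Illustrative use on simulated truncated data
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = np.maximum(X @ np.array([0.0, 1.0]) + rng.normal(size=200), 0.0)
val = lee_objective(np.array([0.0, 1.0]), X, y, c=0.0, w=1.0)
```

Evaluated at the true coefficients, the band captures most of the sample; the estimator seeks the *β* that maximizes this captured fraction.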

However, little progress has been made on mode regression due to computational difficulty, although some modifications of the proposed mode regression have been developed (e.g., see [4–7]). The difficulty in computing these estimators arises because the objective function consists of the indicator function *I*[ ] and involves an absolute value and a maximum operator. Hence the objective function in Eq. (2) is neither convex nor differentiable and may have a large number of local maxima; see further details in Section 2. Therefore, the estimation in Eq. (2) is an NP-hard problem: standard optimization algorithms perform poorly because they tend to get trapped in local maxima, or may not be applicable at all because they require analytical gradients. In short, standard optimization tools, which require the objective function to be differentiable and/or convex, may fail to discover the true mode regression function.


84 Computational Optimization in Engineering - Paradigms and Applications


Currently, rather than dealing with the NP-hard problem directly, some attempts have been made to compute mode regression by replacing the rectangular kernel in Eq. (2) with a "smooth" version [4, 5]. However, these smoothed mode estimators, in spite of their improved asymptotic properties, may not estimate the mode but something else. This important issue is often ignored in the literature.

For example, the smoothing version of mode estimator of Kemp and Silva [5] is to maximize

$$Q\_n(\boldsymbol{\beta}) = n^{-1} \sum\_{i=1}^n K\_h \left\{ y\_i - \mathbf{x}\_i^{\prime} \boldsymbol{\beta} \right\}, \quad K\_h = \frac{1}{h} K\left(\frac{\cdot}{h}\right),$$

with the standard normal density as the kernel function *K*( ). The resulting estimator is then closer to the mean than to the mode, owing to the quadratic exponent of the normal density. Moreover, a careful selection of the bandwidth *h*, which must tend to zero, is not available.
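A sketch of this smoothed objective (illustrative Python; the function name, data, and bandwidth value are our own constructions):

```python
import numpy as np

def smoothed_objective(beta, X, y, h):
    """Q_n(beta) = n^-1 * sum K_h(y_i - x_i' beta) with a standard
    normal kernel K and bandwidth h."""
    u = (y - X @ beta) / h
    return np.mean(np.exp(-0.5 * u ** 2) / (h * np.sqrt(2.0 * np.pi)))

# Zero residuals attain the maximal kernel height 1 / (h * sqrt(2*pi))
X = np.column_stack([np.ones(50), np.linspace(-1.0, 1.0, 50)])
y = X @ np.array([0.0, 1.0])
q_true = smoothed_objective(np.array([0.0, 1.0]), X, y, h=0.5)
q_far = smoothed_objective(np.array([5.0, 1.0]), X, y, h=0.5)
```

Because the normal kernel penalizes residuals quadratically in the exponent, maximizing *Q_n* pulls the fit toward the conditional mean unless *h* is very small, which is the concern raised above.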

Lee [4] employed a quadratic kernel (QME) to smooth the rectangular kernel and estimated *β* by

$$\begin{split} \hat{\boldsymbol{\beta}} &= \arg\max\_{\boldsymbol{\beta}} \left( n^{-1} \sum\_{i=1}^{n} \left[ w^{2} - \left\{ y\_i - \max(\mathbf{x}\_i^{\prime} \boldsymbol{\beta}, c + w) \right\}^{2} \right] I\left[ \left| y\_i - \max(\mathbf{x}\_i^{\prime} \boldsymbol{\beta}, c + w) \right| \leq w \right] \right) \\ &\Longleftrightarrow \arg\max\_{\boldsymbol{\beta}} \left( n^{-1} \sum\_{i=1}^{n} \max \left[ w^{2} - \left\{ y\_i - \max(\mathbf{x}\_i^{\prime} \boldsymbol{\beta}, c + w) \right\}^{2}, 0 \right] \right) \end{split} \tag{3}$$

This QME is quite similar to the symmetrically trimmed least squares (STLSs) estimation in Powell [8], which, instead of maximization, minimizes

$$\hat{\boldsymbol{\beta}} = \arg\min\_{\boldsymbol{\beta}} n^{-1} \sum\_{i=1}^{n} \left[ y\_i - \max(0.5 y\_i + 0.5 c, \mathbf{x}\_i^{\prime} \boldsymbol{\beta}) \right]^2 \tag{4}$$

However, QME is quite sensitive to the choice of *w*, whose optimal value is difficult to derive in practice. And STLS strongly depends on a symmetry requirement on *y* conditional on *x*: it needs symmetry up to ± *x*′*β*, that is, global symmetry if *x* is unbounded.
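For comparison, both criteria are straightforward to code (an illustrative Python sketch; the function names and test data are ours):

```python
import numpy as np

def qme_objective(beta, X, y, c, w):
    """Lee's quadratic-kernel criterion, the second line of Eq. (3)
    (to be maximized)."""
    center = np.maximum(X @ beta, c + w)
    return np.mean(np.maximum(w ** 2 - (y - center) ** 2, 0.0))

def stls_objective(beta, X, y, c):
    """Powell's symmetrically trimmed least-squares criterion, Eq. (4)
    (to be minimized)."""
    center = np.maximum(0.5 * y + 0.5 * c, X @ beta)
    return np.mean((y - center) ** 2)

# Illustrative evaluation on noiseless truncated data
X = np.column_stack([np.ones(100), np.linspace(-2.0, 2.0, 100)])
y = np.maximum(X @ np.array([0.0, 1.0]), 0.0)
q = qme_objective(np.array([0.0, 1.0]), X, y, c=0.0, w=1.0)
s = stls_objective(np.array([0.0, 1.0]), X, y, c=0.0)
```

Note that the QME criterion is bounded between 0 and *w*², which makes its dependence on the chosen *w* explicit.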

The other way to develop efficient algorithms for the truncated mode regression objective function (2) could be the emulation algorithms (EA) [9, 10], which compute the truncated mode regression estimator by checking every critical point and solving maximum score estimation as a nonlinear programming problem. However, EA is highly complex to implement: it may converge only to local optima, and obtaining a global optimum requires a heavy computational load, which renders its use in real problems impractical.

This paper aims to introduce meta-heuristic methods for computing the elegant rectangular mode regression estimator, so that one can not only fit the mode regression but also improve computational efficiency. The recommended heuristic for a complicated objective function with many local optima is the popular simulated annealing (SA) algorithm [11], because SA avoids getting stuck in local optima by occasionally accepting worse neighbors in the hope of finding a global optimum. However, SA has not previously been used in mode regression fitting.

The paper is organized as follows. In Section 2, we outline the issue of many local optima of Lee's estimator and SA computation. Section 3 compares different algorithms or estimation methods such as QME, STLS, EA, and SA via (Monte Carlo) simulation study. In Section 4, we fit a multiple mode regression model for the analysis of a real income data via SA algorithm. The final section concludes with brief remarks.

## **2. Lee's estimator with many local optima and SA algorithm**

As mentioned earlier, the truncated mode regression fitting problem may be seen as a global optimization problem (GOP) with many local optima. SA is applied to optimize Eq. (2), a nonlinear objective function. The estimation problem can be formulated as follows:

$$f(\boldsymbol{\beta}) = n^{-1} \sum\_{i=1}^{n} I\left[ \left| y\_i - \max(\mathbf{x}\_i^{\prime} \boldsymbol{\beta}, c + w) \right| \le w \right] \tag{5}$$

Note that *f*(*β*), as a function of the regression coefficient *β*, is an unconstrained, nonlinear, and nonsmooth function, so it is difficult to compute its optimizer. Moreover, *f*(*β*) is not convex in *β*. Therefore, an effective tool for solving global optimization problems is required. Let us demonstrate this issue via a simple example and a graphical representation of the objective function. Draw an independent sample of size 10 from a standard normal distribution, i.e., *xi* ~ *N*(0, 1), *i* = 1, 2, …, 10, and let the i.i.d. error term *ε* also have a standard normal distribution. The latent variable *yi*\* is then generated from the mode regression *yi*\* = *βxi* + *εi*, and we obtain observations *yi* from *yi*\* via the truncation point *y*0 = 0. This gives heavy truncation of 52% on average. The objective function *f*(*β*) against values of *β* for this model is plotted in **Figure 1**. Clearly, the objective function under *w* = 1 has many local maxima, which lie in the range between −0.5 and 0.25. When *w* = 1.5, the objective function still has many local maxima, now in the range between −0.75 and 0.25. However, **Figure 1** shows that under *w* = 0.5 these local maxima fall in a much narrower range than under *w* = 1 and *w* = 1.5.
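An example of this kind can be reproduced with a few lines of Python (our own seed and a true slope of 1, so the exact count and locations of local maxima will differ from Figure 1):

```python
import numpy as np

rng = np.random.default_rng(2)
n, c, w = 10, 0.0, 1.0
x = rng.normal(size=n)
y = np.maximum(x + rng.normal(size=n), c)     # y* = beta*x + eps with beta = 1

grid = np.linspace(-3.0, 3.0, 601)
f = np.array([np.mean(np.abs(y - np.maximum(x * b, c + w)) <= w) for b in grid])

# f is piecewise constant; collapse plateaus, then count interior peaks
levels = f[np.r_[True, np.diff(f) != 0]]
peaks = np.sum((levels[1:-1] > levels[:-2]) & (levels[1:-1] > levels[2:]))
print(peaks)
```

Even with only 10 observations, the step-function shape of *f*(*β*) produces several distinct local maxima, which is exactly what defeats gradient-based optimizers.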


**Figure 1.** The objective function of a simple example with *w* = 1, *w* = 1.5, and *w* = 0.5.

The SA algorithm proposed by Kirkpatrick, Gelatt, and Vecchi [11], a local search metaheuristic, is characterized by an acceptance criterion for neighboring solutions that adapts itself at run time. In this chapter, we process the data with the R software. One way to implement SA in R is the *optim* function in the *stats* package with the *method = "SANN"* option [12]. An alternative is the *GenSA* function in the dedicated R-package *GenSA*. The SA flow diagram is presented in **Figure 2**.
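To make the flow diagram concrete, here is a minimal SA maximizer in Python (an illustrative sketch, not the R *optim*/*GenSA* implementation; the cooling schedule, proposal scale, and iteration count are our own choices):

```python
import math
import numpy as np

def sa_maximize(obj, beta0, n_iter=5000, t0=1.0, cooling=0.999,
                step=0.5, seed=0):
    """Simulated annealing for a scalar parameter: propose a Gaussian
    move, always accept improvements, accept deteriorations with
    probability exp(delta / T), and cool T geometrically."""
    rng = np.random.default_rng(seed)
    beta, f_beta = beta0, obj(beta0)
    best, f_best = beta0, f_beta
    t = t0
    for _ in range(n_iter):
        cand = beta + step * rng.normal()
        f_cand = obj(cand)
        delta = f_cand - f_beta
        if delta >= 0 or rng.random() < math.exp(delta / t):
            beta, f_beta = cand, f_cand
            if f_beta > f_best:
                best, f_best = beta, f_beta
        t *= cooling
    return best, f_best

# Apply to Lee's objective (2) on simulated truncated data (c=0, w=1)
rng = np.random.default_rng(3)
x = rng.normal(size=200)
y = np.maximum(x + rng.normal(size=200), 0.0)       # true slope 1
obj = lambda b: np.mean(np.abs(y - np.maximum(x * b, 1.0)) <= 1.0)
beta_hat, f_hat = sa_maximize(obj, beta0=-2.0)
```

The occasional acceptance of worse moves at high temperature is what lets the chain escape the local plateaus of the indicator-based objective before the temperature cools.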

**Figure 2.** The flow diagram of SA.

#### **3. Numerical comparison**

Following Lee [2, 4], consider the mode regression model

$$y\_i = 0 + x\_i + \varepsilon\_i, i = 1, 2, \cdots, n \tag{6}$$

where the sample size is *n* = 30, *xi* follows a standard normal distribution, and the model error *εi* is generated from a standard normal distribution.

The four methods, QME (the quadratic kernel smoothing method of Ref. [4]), STLS (the symmetrically trimmed least squares method of Ref. [8]), the emulation algorithm (EA; Refs. [9, 10]), and SA [11], are implemented for a numerical comparison of the optimization of *f*(*β*) in terms of *β*, where *β* simply stands for the slope coefficient whose true value is 1.

The data are generated from the model under four different distributions (normal, Cauchy, logistic, and gamma) for the model error *ε* and two truncation rates: 25 and 50%. For each distribution of *ε* and each truncation scheme, we implement all four methods to estimate the slope coefficient over 200 replications via simulation.

The performance criteria for each method consist of the bias of the slope estimator, standard deviation (STD), root mean square error (RMSE), lower quartile (LQ), median (MED), and upper quartile (UQ) of the estimates. In each replication, the bias is the difference between the estimate and the true value; the bias we report is the simple average over the 200 replications. The STD, RMSE, LQ, MED, and UQ are likewise computed over the 200 replications.
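These criteria are simple to compute from the vector of replicated estimates; a small Python sketch (the function name is ours):

```python
import numpy as np

def replication_summary(estimates, true_value):
    """Summarize simulation replications by the criteria used in Table 1:
    bias, standard deviation, RMSE, and the quartiles of the estimates."""
    est = np.asarray(estimates, dtype=float)
    err = est - true_value
    return {
        "BIAS": float(err.mean()),
        "STD": float(est.std(ddof=1)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "LQ": float(np.percentile(est, 25)),
        "MED": float(np.median(est)),
        "UQ": float(np.percentile(est, 75)),
    }
```

For example, `replication_summary([0.0, 2.0], 1.0)` gives zero bias but an RMSE of 1, which is why bias alone is not a sufficient criterion.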

**Table 1** reports the results by different designs which are the combination of truncation rate and distribution of *ε*.




**Table 1.** *yi* = 0 + *xi* + *εi* , *i* = 1, 2, ⋯, *n*, *xi* ~ *N*(0, 1), *n* = 30, and 200 replications; only the slope is reported.

In Design 1, with 50% truncation and a standard normal model error, we first check the effect of the choice of *w* on the performance of the SA algorithm. The range *w* = 0.5–2 appears reasonable, so we run the SA algorithm with *w* = 0.5, 1, 1.5, and 2, respectively. The algorithm gives quite stable results for *w* = 0.5, 1, and 1.5 but shows large variation across replicates for *w* = 2. This not only shows that the selection of *w* need not be unique but also indicates that a larger *w* does not necessarily give a better outcome. So in the real data analysis of Section 4, we suppose that *w* follows a uniform distribution over the interval [*w*1, *w*2], where *w*1 = 0.5 and *w*2 = 1.5.

Because of the results from Design 1, we use *w* = 1 (a value between 0.5 and 1.5) in Designs 2 and 3 for the SA algorithm but use *w* = 0.1, 0.5, 0.9, and 1.3 to construct a weighted version of QME (WQME), following Lee's [2] suggestion that an average of the estimates over several *w*s within a reasonable range may work. Indeed, WQME sometimes performs better than QME with any single *w*. STLS and EA seem to do well in both Designs 2 and 3. This, however, changes for distributions with thicker tails. Furthermore, STLS and EA show much more estimation variation than SA.

In Designs 4–7, we check the sensitivity to the underlying distribution, particularly to the tail behavior of the distribution. We consider the standard Cauchy, standard logistic, and gamma distributions with 50% truncation.

In Designs 4 and 5, all methods except STLS perform similarly; this is also true in Design 7. STLS, despite its small average bias, shows large estimation variation in most cases except under a normal distribution. For example, STLS gives the largest variance and RMSE for the fat-tailed distribution (Cauchy) and the skewed distributions (logistic and gamma). So STLS is not very reliable for practical application.

In Designs 4 and 5, EA and SA perform slightly better than the others in some cases.

A final comment goes to algorithm speed, measured by CPU time. We found that CPU times for all algorithms used in **Table 1** are comparable, typically in the range between 5 and 10 s.

| | **BIAS** | **SE** | **RMSE** | **LQ** | **MED** | **UQ** |
|---|---|---|---|---|---|---|
| STLS | 0.396 | 6.390 | 6.949 | 0.823 | 1.109 | 1.605 |
| EA | 0.401 | 1.978 | 2.231 | 0.901 | 1.123 | 1.589 |
| SA | 0.393 | 1.759 | 2.217 | 0.801 | 1.012 | 1.598 |
| **Design 5: 50% truncation, standard normal** | | | | | | |
| QME | 0.767 | 2.338 | 2.479 | 0.858 | 1.488 | 2.620 |
| STLS | 0.494 | 3.765 | 3.798 | 0.776 | 1.189 | 1.966 |
| EA | 0.506 | 2.634 | 2.634 | 0.884 | 1.052 | 1.890 |
| SA | 0.509 | 2.031 | 3.012 | 0.765 | 1.175 | 1.935 |
| **Design 6: 50% truncation, gamma (2, 1) mode** | | | | | | |
| QME | 0.615 | 3.147 | 3.209 | 0.879 | 1.402 | 2.710 |
| STLS | 0.556 | 3.403 | 3.450 | 1.048 | 1.474 | 2.168 |
| EA | 0.598 | 3.479 | 3.581 | 0.881 | 1.057 | 1.111 |
| SA | 0.501 | 2.549 | 3.000 | 0.893 | 1.231 | 1.786 |
| **Design 7: 50% truncation, gamma (3, 1) mode** | | | | | | |
| QME | 0.688 | 2.361 | 2.460 | 0.771 | 1.613 | 2.761 |
| STLS | 0.705 | 3.712 | 3.778 | 0.731 | 1.335 | 2.486 |
| EA | 0.676 | 2.476 | 3.247 | 0.790 | 0.895 | 1.431 |
| SA | 0.617 | 3.001 | 3.223 | 0.801 | 1.112 | 1.524 |

## **4. Returns of human capital in China**

China has experienced rapid economic growth in the past 30 years, but the increase in wage inequality is gradually becoming a serious problem and deserves more attention. We analyze the important factors that affect the incomes of the majority of Chinese workers. This study analyzes the annual income of 1967 Chinese citizens aged between 18 and 55 in 2008. The data are provided by the Chinese Social Survey Open Database (CSSOD) (http://www.cssod.org/index.php, in Chinese). We aim to check how education, experience, gender, district, and job sector affect the typical income. One may use mean regression to carry out the analysis, but average income is strongly affected by a small number of high-income earners, so the resulting analysis does not represent the majority of people or the typical case. For example, most workers see their income as far lower than the reported average income. Therefore, mode regression is an ideal model for this analysis.

We use a standard log-linear Mincer formulation:

$$\log Y = \beta\_1 + \beta\_2 \text{Edu} + \beta\_3 \text{Exp} + \beta\_4 \text{Exp}^2 + \beta\_5 \text{Gender} + \beta\_6 \text{Dis} + \beta\_7 \text{Sector} + \varepsilon \tag{7}$$

where log *Y* is the logarithm of annual income (in 1000 Yuan), truncated below at 2, as the majority of annual incomes exceed 8000 Yuan. *Edu* is the number of years of schooling, *Exp* is potential experience (approximated by age minus years of schooling minus 7), *Gender* equals 1 for male and 0 otherwise, *Dis* equals 1 for workers from eastern China and 0 otherwise, *Sector* equals 1 for private sector workers and 0 otherwise, and *ε* is the model error. An estimated coefficient *βi* means that a one-unit increase in the predictor variable is associated with a (100(exp(*βi*) − 1))% increase in income.
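This percentage interpretation can be checked directly; a small Python sketch, using as examples the education and gender coefficients from the mean regression discussed below (rescaled from the percent units of Table 2):

```python
import math

def percent_effect(beta):
    """Exact percent change in income for a one-unit increase in a
    regressor under the log-linear model (7): 100 * (exp(beta) - 1)."""
    return 100.0 * (math.exp(beta) - 1.0)

print(round(percent_effect(0.09381), 3))   # education, mean regression -> 9.835
print(round(percent_effect(0.27919), 3))   # gender, mean regression -> 32.206
```

For small coefficients the exact effect is close to 100*β*%, which is the approximation mentioned later in the text.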

We now carry out the mean and median regression analysis. That is, we fit model (7) by mean regression and median regression, respectively. By implementing the *lm* function and the *rq* function (in the R-package *quantreg*) in R, we obtain the fitted mean and median results shown in **Table 2**.


Note: Asymptotic standard errors are in parentheses for the mean and median regression.

\* *p* < 0.1, \*\* *p* < 0.05, and \*\*\* *p* < 0.01 (two tailed)

**Table 2.** The results of three different regressions (%).

**Table 2** shows that each additional year of education increases the conditional-mean income by a factor of exp(0.09381) = 1.09835, which indicates a 9.835% increase. The fitted median regression model gives a coefficient of 0.08501, which indicates that one more year of education increases the conditional-median income by exp(0.08501) = 1.08873, or an 8.873% increase. That is, for small values of the estimated coefficient *βi*, the effect is approximately 100*βi*%.

Experience enters the model in quadratic form following the most popular version of the Mincer equation [13], which includes a quadratic function in years of potential experience to capture the fact that on-the-job training investments decline over time in a standard life cycle human capital model. The estimated coefficients of years of experience and its square are both significant in the mean and the median regression. The Chinese worker's income thus shows an inverted U-shaped relationship with years of experience, reaching its maximum at 21.459 and 20.878 years, respectively. This means that experience within about the first 20 years makes a difference to income, but experience beyond 20 years adds nothing more to the conditional-mean or conditional-median income.
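The turning point of the quadratic profile is −*β*Exp / (2*β*Exp²); a quick Python check using the mean- and median-regression coefficients from Table 2 (divided by 100 to undo the percent scaling):

```python
def peak_experience(b_exp, b_exp2):
    """Experience level maximizing a quadratic income profile
    b_exp * Exp + b_exp2 * Exp**2 (requires b_exp2 < 0)."""
    return -b_exp / (2.0 * b_exp2)

print(round(peak_experience(0.02103, -0.00049), 3))  # mean regression -> 21.459
print(round(peak_experience(0.01879, -0.00045), 3))  # median regression -> 20.878
```

This reproduces the peak years of experience quoted in the text.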

Compared with being female, being male increases the conditional-mean income by 100[exp(0.27919)−1]% = 32.206% according to the mean regression results but by 100[exp(0.25912)−1]% = 29.579% according to the median regression results. In other words, the conditional-mean income for male is 32.206% higher than it is for female, and male's conditional-median income is 29.579% higher than female's, with other covariates held constant.

| | **Mean regression** | **Median regression** | **Mode regression** |
|---|---|---|---|
| Intercept | 113.884\*\*\* (9.755) | 134.376\*\*\* (11.486) | 132.179\*\*\* (228.825) |
| Education | 9.381\*\*\* (14.418) | 8.501\*\*\* (12.614) | 3.519\*\*\* (63.185) |
| Experience | 2.103\*\*\* (3.630) | 1.879\*\*\* (3.314) | 3.887\*\*\* (56.280) |
| Experience² | −0.049\*\*\* (−3.569) | −0.045\*\*\* (−3.173) | −0.192\*\*\* (−47.839) |
| Gender | 27.919\*\*\* (8.347) | 25.912\*\*\* (7.585) | 20.921\*\*\* (232.146) |
| District | 44.463\*\*\* (13.249) | 43.874\*\*\* (12.714) | 42.571\*\*\* (447.565) |
| Sector | 7.024\* (1.919) | −2.418 (0.520) | −3.291\*\*\* (−23.626) |

Similarly, the conditional-mean and conditional-median income difference between workers from the west and the east of China both is about 100[exp(0.44)−1]% = 55.271%.

For the dummy variable sector, the results of the mean regression indicate that the conditionalmean income of workers from private sectors is greater than from public sectors by a factor of exp(0.07024) = 1.072766, that is, a 7.277% increase in conditional-mean income. However, according to the result of the median regression model, the coefficient of sector is not significant, which means that there is no difference between the public and private sectors for the conditional-median income.
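The percentage interpretations above follow mechanically from the log-income specification. The short sketch below (Python; the helper name `pct_effect` is ours, and the coefficients are taken from **Table 2**) reproduces both the exact transformation 100[exp(β)−1]% and the small-coefficient approximation 100β%.

```python
import math

# Percentage effect of a coefficient b in a log-income regression:
# exact effect is 100*(exp(b) - 1)%, approximately 100*b% for small b.
def pct_effect(b):
    return 100.0 * (math.exp(b) - 1.0)

# Dummy variables need the exact transformation (gender coefficients):
print(round(pct_effect(0.27919), 3))  # mean regression   -> 32.206
print(round(pct_effect(0.25912), 3))  # median regression -> 29.579

# For a small coefficient the two versions agree closely (education, mean):
print(round(pct_effect(0.09381), 3))  # exact         -> 9.835
print(round(100 * 0.09381, 3))        # approximation -> 9.381
```

The gap between 9.835% and 9.381% illustrates why the approximation is acceptable for small coefficients but the exact form is needed for dummies such as gender and district.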

As the selection of *w* may not be unique for the mode regression, we suppose that *w* ~ U[0.5, 1.5] and draw 300 different random *w*s from this uniform distribution. Based on these 300 different *w*s, we implement the *GenSA* function in the R package *GenSA* to fit the multivariate mode regression and obtain 300 different vectors of estimated coefficients. By the law of large numbers, the mean of these estimates should approach the true *βi*. We can then use a two-tailed T test to check the hypotheses *βi* = 0. The means and the two-tailed T test results for the seven coefficients *βi*, based on the 300 different mode regressions, are shown in **Table 2**.
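The fitting scheme described here can be mimicked outside R. The sketch below is a simplified Python analog on simulated data: SciPy's `dual_annealing` stands in for the *GenSA* function, the objective is a plain rectangular-kernel mode objective rather than Lee's truncated estimator, and only five draws of *w* are averaged instead of 300 — so this illustrates the workflow, not the chapter's exact estimator.

```python
import numpy as np
from scipy.optimize import dual_annealing

# Simulated data standing in for the income sample (true beta = (1, 2)).
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.standard_normal(n)

def neg_coverage(beta, w):
    # Rectangular-kernel mode objective: (negated) number of observations
    # falling inside the band |y - x'beta| <= w; SA maximizes the coverage.
    resid = y - (beta[0] + beta[1] * x)
    return -float(np.sum(np.abs(resid) <= w))

# Mirror the paper's scheme on a small scale: draw w ~ U[0.5, 1.5],
# fit by simulated annealing for each draw, and average the estimates.
fits = []
for w in rng.uniform(0.5, 1.5, 5):
    res = dual_annealing(neg_coverage, bounds=[(-5.0, 5.0), (-5.0, 5.0)],
                         args=(w,), maxiter=200, seed=1)
    fits.append(res.x)
beta_hat = np.mean(fits, axis=0)
print(beta_hat)  # close to the true (1, 2)
```

Because the coverage objective is piecewise constant, a derivative-free global method such as simulated annealing is appropriate, which is exactly the computational point made in this chapter.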

According to the results of the mode regression based on the SA algorithm, the estimated return to an additional year of education is about 3.519%, so education has a much smaller effect on the conditional-mode income. This may be one reason why the typical income is lower than the mean and median income.

The coefficients of years of experience and its square are both significant. That is, the relationship between the conditional-mode income and years of experience is also inverted U-shaped, but the income for most workers reaches its maximum at 10.122 years of experience. The return to experience, i.e., the derivative of the typical value of log income with respect to experience, is given by the combination of coefficients (3.887 − 2 × 0.192 × experience). The derivative must be evaluated at a specified level of experience. Two points were chosen: 5 years of experience, representing fairly new entrants, and 15 years, representing experienced workers. These give the values 1.967 and −1.873: the return to experience is 1.967% at 5 years and −1.873% at 15 years. In contrast, in the mean and median regression models the return at 15 years of experience is still positive. The relationship between income and experience under these three regression models is represented in **Figure 3**.
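The arithmetic in this paragraph can be checked directly. The snippet below (Python; variable names are ours) evaluates the derivative of typical log income with respect to experience at the peak and at the two chosen points, using the mode regression coefficients from **Table 2**.

```python
# Mode regression coefficients from Table 2 (in %):
b_exp, b_exp2 = 3.887, -0.192

def return_to_experience(years):
    # Derivative of typical log income w.r.t. experience: b1 + 2*b2*years
    return b_exp + 2 * b_exp2 * years

peak = -b_exp / (2 * b_exp2)               # vertex of the inverted U
print(round(peak, 3))                      # -> 10.122
print(round(return_to_experience(5), 3))   # -> 1.967
print(round(return_to_experience(15), 3))  # -> -1.873
```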

**Figure 3.** The relationship of income and experience.

As shown in **Figure 3**, we conclude that experience makes a great contribution to the conditional-mode income within the first 10 years, but experience beyond 10 years adds nothing more. That is, the conditional-mode income increases rapidly with experience and also decreases rapidly after reaching its peak, whereas the returns to experience remain positive until about 20 years for the conditional-mean and conditional-median income. This may be another important reason why the typical income is lower than the mean and median income.

The conditional-mode income of a man is about 20.921% more than that of a woman, which means that the gender pay gap is slightly less serious for the majority of workers than the mean and median regressions suggest. Similarly, the conditional-mode income of workers in eastern China is about 42.571% more than that of workers in western China.

The private sector does not have a positive return to the mode income, in sharp contrast with the results of the mean regression. Concretely, the conditional-mode income of a public-sector worker is 3.291% higher than that of a private-sector worker, the opposite of what the mean regression model shows.

The results of the mean and the mode regression models seem inconsistent with each other, but both can be explained by economic intuition. In China, the private sector includes private enterprises operated by local Chinese; Sino-foreign joint ventures; Hong Kong-, Macao-, and Taiwan-funded enterprises; and foreign-funded enterprises. Workers in private enterprises, which account for the vast majority of the private sector, do not have the "compilation" nor sign labor contracts with these enterprises; thus, their rights and interests cannot be guaranteed and their wages are low. Although workers in Sino-foreign joint ventures and Hong Kong-, Macao-, and Taiwan-funded enterprises also lack the "compilation," they earn attractive incomes because of these enterprises' relatively good efficiency. Workers in foreign enterprises have the highest wages, especially the CEOs. So the average wage level in the private sector is relatively high. Most workers in the public sector, protected by the "compilation," sign labor contracts with state-owned enterprises and tend to earn higher wages than workers in private enterprises, and the wage gap among them is relatively small. Thus, the majority of workers earn a lower wage in the private sector than in the public sector, but on average the opposite is the case.

## **5. Conclusion**


While mode regression has been found very useful in practical regression analysis, the main goal of this article is to introduce the SA algorithm for fitting mode regression models. The most popular mode regression model was introduced by Lee [2]. There is no doubt about the elegance of Lee's rectangular mode regression estimator [2], but its main problem lies in computation. Because the difficulty of obtaining reliable mode regression coefficients limits the application of mode regression, we propose the SA algorithm for mode regression estimation and compare it with other existing methods. To sum up, the SA algorithm for fitting truncated mode regression does not require any linearization or modification of Lee's estimator, and as a rule it solves the corresponding nonlinear optimization problem more efficiently and robustly.

As an application of the SA algorithm to a real-data mode regression, and to illustrate the sensible interpretation of the fitted mode regression coefficients obtained by the algorithm, we apply the mode regression model to an income inequality analysis of China; some meaningful conclusions are obtained that differ from those of the mean regression and the quantile regression.

## **Acknowledgements**

The research is supported in part by the National Natural Science Foundation of China (11261048) and the National Bureau of Statistics of China (2013LY022).

## **Author details**

Maoxi Tian1,2, Jian He1 and Keming Yu3\*

\*Address all correspondence to: Keming.Yu@brunel.ac.uk

1 Department of Statistics, Shihezi University, Xinjiang, China

2 Jinhe Center for Economic Research, Xi'an Jiaotong University, Shaanxi, China

3 Department of Mathematics, Brunel University, London, UK

## **References**

[1] Li J., Ray S., Lindsay B. G. A nonparametric statistical approach to clustering via mode identification. Journal of Machine Learning Research. 2007; 8(4):1687–1723.

[2] Lee M. J. Mode regression. Journal of Econometrics. 1989; 42(3):337–349. DOI: 10.1016/0304-4076(89)90057-2

[3] Heckman J. J. Sample selection bias as a specification error. Econometrica. 1979; 47(1):153–161. DOI: 10.2307/1912352

[4] Lee M. J. Quadratic mode regression. Journal of Econometrics. 1993; 57:1–19. DOI: 10.1016/0304-4076(93)90056-B

[5] Kemp G. C. R., Silva J. M. C. S. Regression towards the mode. Journal of Econometrics. 2012; 170(1):92–101. DOI: 10.1016/j.jeconom.2012.03.002

[6] Yu K., Aristodemou K. Bayesian mode regression. Technical report. 2012. arXiv:1208.0579v1

[7] Yao W., Li L. A new regression model: modal linear regression. Scandinavian Journal of Statistics. 2013; 41(3):656–671. DOI: 10.1111/sjos.12054

[8] Powell J. L. Symmetrically trimmed least squares estimation for Tobit models. Econometrica. 1986; 54(6):1435–1460. DOI: 10.2307/1914308

[9] Pinkse C. A. P. On the computation of semiparametric estimates in limited dependent variable models. Journal of Econometrics. 1993; 58(1–2):185–205.

[10] Fitzenberger B. A guide to censored quantile regressions. Handbook of Statistics. 1997; 15(97):405–437. DOI: 10.1016/S0169-7161(97)15017-9

[11] Kirkpatrick S., Gelatt C. D., Vecchi M. P. Optimization by simulated annealing. Science. 1983; 220(4598):671–680.

[12] Theussl S., Borchers H. CRAN Task View: Optimization and Mathematical Programming [Internet]. 2015. Available from: https://cran.r-project.org/web/views/Optimization.html

[13] Mincer J. Schooling, Experience, and Earnings. New York: Columbia University Press; 1974.

## **Facility Layout Problem for Cellular Manufacturing Systems**

Maral Zafar Allahyari and Ahmed Azab

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/67313

#### Abstract

A good layout plan improves machine utilization, part quality, setup time, work-in-process inventory, and material handling cost. Cellular manufacturing (CM) is an application of group technology (GT); a cellular manufacturing system (CMS) combines features of the job shop and/or the flow shop. The facility layout problem (FLP) for a CMS includes both inter-cell and intra-cell layout. A bi-level mixed-integer non-linear programming continuous model has been formulated to fully define the problem and the relationship between intra-cell and inter-cell layout design. Facilities are assumed to be of unequal size; operation sequences, part demands, overlap elimination, and aisles are considered. The problem is NP-hard; hence, a simulated annealing meta-heuristic employing a novel constructive radial-based heuristic for initialization has been designed and implemented. For the first time, a novel heuristic algorithm has been designed to allocate and displace facilities in the radial direction. To improve the search efficiency of the developed SA algorithm, the cell size used in the initialization heuristic is assumed to be twice the original size of the cells. A real case study from the metal cutting inserts industry has been used. Results demonstrate the superiority of the developed SA algorithm over rival comparable meta-heuristics and algorithms from the literature.

Keywords: facility layout problem, cellular manufacturing, mathematical modelling, simulated annealing, aisle constraint

© 2017 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## 1. Introduction

The facility layout problem (FLP) is the arrangement of a given number of unequal-sized facilities within a given space. A good layout plan improves machine utilization, part quality, setup time, work-in-process inventory, and material handling cost. Generally speaking, an efficient layout design provides two main advantages: (1) a reduction of between 30% and 70% in the total material handling cost (MHC), and (2) since a layout is a long-term plan, any change to it imposes expenditures such as shutting down the production or service line and losing process time. Thus, designing a proper facility layout plan prevents many costs [1].

Several algorithms have been developed for the FLP. The traditional approach to the FLP, called discrete representation, is often addressed as a quadratic assignment problem (QAP) with the objective of minimizing a given cost function. There are two main assumptions in the QAP: firstly, all facilities are of equal size and shape; secondly, the locations of the facilities are known a priori. However, these assumptions are not applicable in real-world case studies. This approach is not suited to representing the exact locations of facilities and cannot formulate the FLP when facilities are of unequal size and shape or when there are different clearances between them. For such cases, continuous representation is more suitable than discrete. There are two ways to solve the continuous problem. Chronologically, the first attempt was to divide each facility into smaller unit blocks whose total area is approximately equal to the area of the facility. There are two drawbacks to this method: firstly, the problem size grows as the total number of blocks increases, and secondly, the exact shapes of facilities are ignored. The second approach assumes the exact shape and dimensions of the facilities (Table 1).


| Approach | Plant site | Distance | Facilities | Mathematical formulation |
|---|---|---|---|---|
| Discrete | Divided in rectangular blocks with the same size and shape; i.e., predetermined locations | — | Equal-sized | QAP |
| Continuous | No predetermined location, i.e., no blocks | Variable | Unequal-sized | MIP |

Parameters after Meller et al. [2].

Table 1. FLP discrete approach versus FLP continuous approach.
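The discrete formulation summarized in Table 1 can be stated as: choose an assignment p of facilities to sites minimizing the total flow-weighted distance Σᵢⱼ flow(i, j) · dist(p(i), p(j)). The toy sketch below (Python, with made-up 3×3 flow and distance data, not from any cited paper) solves a tiny QAP instance by brute force over all permutations; real instances require heuristics precisely because this enumeration explodes factorially.

```python
from itertools import permutations

# Toy QAP: flow between 3 facilities and distances between 3 fixed sites.
flow = [[0, 3, 1],
        [3, 0, 2],
        [1, 2, 0]]
dist = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

def cost(assign):
    # assign[i] = site of facility i; total material handling cost,
    # counting each ordered pair (i, j).
    return sum(flow[i][j] * dist[assign[i]][assign[j]]
               for i in range(3) for j in range(3))

best = min(permutations(range(3)), key=cost)
print(best, cost(best))
```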

The design of a cellular manufacturing system (CMS) includes: (1) cell formation (CF), (2) group layout, (3) group scheduling and (4) resource allocation. The FLP for a CMS focuses on the second step, which is itself twofold: inter-cell and intra-cell layout. The main objective of group layout is to minimize material handling cost (MHC) by arranging facilities in their corresponding cells and cells on the floor. In this chapter, both demand and operation sequencing have been considered in optimizing the layout at both the inter- and intra-cellular levels. This is not generally the case in the literature; the few papers that did address these factors took a discrete approach. Moreover, in this chapter, a continuous approach has been adopted.

Here, a bi-level mixed-integer non-linear programming continuous model has been developed for both intra-cell and inter-cell layout design sequentially. The problem is to arrange facilities, which are machine tools in the leader problem and cells in the follower problem, on a continuous planar site. The objective functions of the leader and follower problems minimize the material handling cost at the intra- and inter-cellular levels, respectively. The developed mathematical model has several main novelties. Firstly, a continuous approach has been adopted; i.e., facilities are of unequal size and their locations are not predetermined. Secondly, operation sequences and part demands are taken into consideration. Thirdly, the model can accommodate certain restrictions or preferences for cells and floors, such as aisles. Finally, a CMS design with disjoint cells is considered; hence, an overlap-elimination constraint is included. Since the model is NP-hard, a novel heuristic has been developed to solve the problem at the two levels (intra- and inter-cellular) in a fashion similar to that used in developing the mathematical model. The developed heuristic differs from its counterparts in the literature in that it places the facilities radially while dividing the production floor area into four quadrants. A real case study from the metal cutting inserts industry has been used, where multiple families of inserts have been formed, each with its distinguished master plan.

## 2. Literature review


The block facility layout problem, originally formulated by Armour and Buffa [3], is concerned with finding the most efficient arrangement of m indivisible departments with unequal area requirements within a facility [4]. As defined in the literature, the objective of the block layout design problem is to minimize the material handling costs subject to department and floor area requirements: departments cannot overlap, must be placed within the facility, and some must be fixed to a location or cannot be placed in specific regions; see Refs. [1, 3, 5, 6].

Cellular layout is considered one of the special cases of the general FLP. There is increasing interest in solving the block layout problem with a continuous approach [6]. Alfa et al. [9] developed a model to solve group formation and intra-cell layout simultaneously. The objective function is the summation of inter-cell and intra-cell flow times based on distance. They developed a hybrid SA/heuristic algorithm to solve their model: SA is used to find the initial solution, and a heuristic approach based on a penalty model then improves it. The main limitation of this model is that the cell locations are predetermined.

Bazargan-Lari and Kaebernick published several papers on the design of cellular manufacturing [10–13]. Bazargan-Lari and Kaebernick [11] present a continuous-plane approach in which constraints such as cell boundaries, non-overlapping, closeness relationships, location restrictions/preferences, orientation constraints and travelling distances are considered. They develop a hybrid method combining non-linear goal programming (NLGP) and simulated annealing for the machine layout problem, expressing all constraints as goals using goal programming (GP) formulations. Generally speaking, GP divides constraints into two main categories: absolute (hard) and goal (soft) constraints. Hard constraints have to be satisfied absolutely; violating any of them yields infeasibility. Soft constraints, in contrast, can be compromised and offset from the desired goals. These constraints are treated as three separate sets of objectives. The first priority level includes all absolute (hard) objectives, which must be satisfied, such as the non-overlap and cell boundary constraints. The second and third priority levels are preferences: the second is devoted to minimizing the area of the cells/shop floor and satisfying closeness relationships and orientation, and the third is to minimize the total travelling cost. Overall, the approach of Bazargan-Lari and Kaebernick combines NLGP and SA. They use pattern search to solve the NLGP according to these three priorities; since pattern search finds a local minimum, SA is used to escape the trap of local minima. The core of their model is that alternative layout designs are generated by changing the order of priority levels 2 and 3 in each outer loop of the SA algorithm. In other words, the starting point of each new outer loop of SA is generated by the pattern search algorithm. By changing the goal priority levels, a huge pool of efficient solutions is generated. To handle this, they use what they call a filtering process to choose the sets of solutions that differ most from one another. The logic behind this is to give decision-makers the chance to see how changing the preference priorities would impact the solutions.

The other important pieces of research were written by Imam and Mir [14, 15]. Imam and Mir [14] introduce a heuristic algorithm to place unequal-sized rectangular facilities in a continuous plane via the new concept of 'controlled coverage' using 'envelope blocks'. In the initial solution, facilities are randomly placed in the plane inside envelope blocks whose size, calculated by multiplying a magnification factor by the facilities' actual dimensions, is much larger than the actual facility size. During the heuristic iterations, the envelope blocks are gradually shrunk by decreasing the magnification factor until the dimensions of the envelopes equal those of their corresponding facilities. In this way the coverage of the facilities is controlled. The improvement iteration is based on the univariate search method, in which only one of the 2n design variables (where n is the number of facilities) changes at a time; each change moves a facility horizontally or vertically along the x-axis or y-axis, respectively. There are four drawbacks to their method. Firstly, each iteration cycle is repeated 2n times: n times to move facilities horizontally and another n times to move them vertically. Secondly, facilities are only allowed to move horizontally or vertically; there is no diagonal movement. Thirdly, the assumed continuous plane has no borders, whereas in the real world there is no plane without borders. The last drawback concerns the magnification factor: the authors do not specify how large it should initially be or by what fraction it should be reduced in each iteration cycle.
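The controlled-coverage idea reduces to a simple geometric schedule. The sketch below (Python) illustrates it for one facility; note that the initial magnification factor (2.0) and the decay fraction (0.9) are our assumptions, since, as noted above, the original paper leaves both unspecified.

```python
# Sketch of the 'controlled coverage' schedule of Imam and Mir [14]:
# each facility is wrapped in an envelope block mag times its true size,
# and mag is reduced every improvement cycle until the envelope equals
# the facility. Initial factor 2.0 and decay 0.9 are assumed here.
w, h = 4.0, 2.0                           # true facility dimensions
mag, cycles, envelopes = 2.0, 0, []
while mag > 1.0:
    envelopes.append((w * mag, h * mag))  # envelope used this cycle
    mag = max(1.0, mag * 0.9)             # shrink toward the true size
    cycles += 1
print(cycles, envelopes[0], envelopes[-1])
```

With these assumed values the envelopes converge to the true 4×2 footprint in seven cycles; the chosen decay directly controls the computational cost, which is exactly the criticism raised against the method.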

Mir and Imam [15] addressed the second drawback above and tried to improve their original procedure. They develop a hybrid model that uses SA to obtain a sub-optimal initial feasible solution and then improves it using a steepest descent approach. As they also noted, the number of optimization iterations depends on the magnification factor by which the size of the envelope blocks is reduced; the algorithm stops when the magnification factor equals one. The computational cost and time are therefore quite dependent on the magnification factor.

On the other hand, various papers have taken the alternative, discrete approach. The QAP is an NP-complete problem, which means that as the size of the problem increases it cannot be solved by an exact algorithm [16]. Hence, many efforts have been made to develop and apply heuristic and meta-heuristic algorithms to this kind of problem. Wilhelm and Ward [16] applied simulated annealing (SA) to solve the QAP. Their results were compared with the computerized relative allocation of facilities technique (CRAFT), biased sampling, and the revised Hillier procedure, and showed better-quality solutions.

Baykasoğlu and Gindy [17] applied SA to the dynamic layout problem using a discrete approach and claim that their proposed algorithm finds better solutions. They compared it with three earlier works [18–20]. In the first comparison, their SA approach found the optimum solution and outperformed the dynamic programming algorithm of Rosenblatt [18]. The second comparison, against the genetic algorithm of Conway and Venkataramanan [19], consists of two experiments: the first was carried out with no shifting costs, and the SA algorithm found the optimum solution and outperformed the genetic algorithm; in the second, relocation costs were included and, although the optimum solution was not found, the results of SA showed a slight improvement over those of Rosenblatt [18]. Finally, in the third comparison the data set was obtained from Balakrishnan and Cheng [20], who developed a non-linear genetic algorithm (NLGA). The comparison between the SA-based approach and NLGA reveals the superiority of the SA algorithm when the size of the problem is large. Since they take a discrete approach to FLP, the only operator used in their neighbourhood generation algorithm is the swap operator.

Tavakkoli-Moghaddam et al. [21] developed a non-linear mathematical model to solve the cell formation problem in a dynamic environment in which demand varies in each time horizon. The strength of their model is that it is multi-objective, i.e. it considers more than one objective, such as machine cost, operating cost, inter-cell material handling cost and machine relocation cost. Three meta-heuristics, namely genetic algorithm (GA), simulated annealing (SA) and tabu search (TS), were used to solve the problem. The results show that SA outperforms the other two meta-heuristics.

Safaei et al. [22] developed a mixed-integer program that minimizes machine constant and variable costs, inter- and intra-cell material handling costs and reconfiguration costs. They present a hybrid model called mean field annealing and simulated annealing (MFA-SA) to solve the problem, where MFA is used to find a feasible initial solution for SA. Most of the heuristics developed in the literature take a discrete approach to FLP rather than a continuous one. Developing heuristics for the discrete problem is easier, because locations are predetermined a priori; hence, the only operator that is usually needed is the swap operator, which shuffles the facilities among the locations. Moreover, in the discrete approach no overlap can occur between facilities. On the other hand, it is harder to design heuristics for the continuous formulation of FLP, since overlap can take place; it is usually the case that repeated repairs and validity checks of the generated solutions have to be performed.

## 3. Mathematical modelling

The problem is to arrange facilities, which are cells in the leader problem and machine tools in the follower problem, in their respective spaces. The site has a rectangular shape with specified length (L) and width (W). Moreover, there is a horizontal aisle in the site with the same length as the site but bounded by two different vertical dimensions, YVALU and YVALL. The aisle divides the site into two sections, upper and lower, and no facilities are allocated to the aisle. The objective is to minimize the total travel-flow cost while considering the shape, size and geometric characteristics of the different facilities. Facilities have a rectangular shape. The position of each facility is determined by the coordinates of its centroid together with its predetermined length and width. Facilities are not allowed to overlap each other and have to be assigned within their related boundary areas, which is the overall site's boundary for the follower problem and that of the cell for the leader problem. The traditional Cartesian coordinate system, shown in Figure 1, represents the scheme used in this chapter. The following model was presented by Allahyari and Azab [7, 8]. The problem is formulated under the following assumptions [6]:

Figure 1. Scheme of shop.

1. CF is known in advance.

2. Machines are not all of the same size.

3. Machines must be located within a given area.

4. Machines are not allowed to overlap each other.

5. Cells' dimensions and orientations are predetermined.

The mathematical formulation is presented below.

Sets:

P = {1, 2, 3, …, P} Index set of part types

M = {1, 2, 3, …, M} Index set of machine types

C = {1, 2, 3, …, C} Index set of cell types

Op = {1, 2, 3, …, Op} Index set of operation indices for part p

Parameters:

L Horizontal dimension of shop floor

W Vertical dimension of shop floor

YVALU Vertical dimension of upper side of aisle

YVALL Vertical dimension of lower side of aisle

XHALLF Horizontal dimension of left side of aisle

XHALRT Horizontal dimension of right side of aisle

li Length of machine i

wi Width of machine i

lc Length of cell c

wc Width of cell c

CAj Intra-cellular transfer unit cost for part j

CEj Inter-cellular transfer unit cost for part j

Dj Demand quantity for part j

Ujoi 1, if operation o of part j is done by machine i, otherwise 0

U′joc 1, if operation o of part j is done by a machine which is located in cell c, otherwise 0

Qic 1, if machine i is assigned to cell c, otherwise 0

Decision variables:

xi Horizontal distance between centre of machine i and vertical reference line

yi Vertical distance between centre of machine i and horizontal reference line

x′c Horizontal distance between centre of cell c and vertical reference line

y′c Vertical distance between centre of cell c and horizontal reference line

Ziu 1, if machine u is arranged in the same horizontal level as machine i, and 0 otherwise

Wcc′ 1, if cell c is arranged in the same horizontal level as cell c′, and 0 otherwise

Zc 1, if cell c is arranged outside of the aisle's horizontal boundaries, and 0 otherwise

Wc 1, if cell c is arranged outside of the aisle's vertical boundaries, and 0 otherwise

The continuous bi-level programming problem is defined as follows. The intra-cell layout formulation, which lays out the different machines (here the machines are the facilities) of every cell c, one cell at a time, is:

$$\text{Min} \sum_{j=1}^{P} \sum_{o=1}^{O_j-1} \sum_{\substack{i,u=1 \\ i \neq u}}^{M} U_{joi}\, U_{j,o+1,u} \left( |x_i - x_u| + |y_i - y_u| \right) \mathrm{CA}_j D_j \tag{1}$$

s.t.

$$x_i + \frac{l_i}{2} \le l_c \qquad i = 1, \ldots, M \tag{2}$$

$$x_i - \frac{l_i}{2} \ge 0 \qquad i = 1, \ldots, M \tag{3}$$

$$y_i + \frac{w_i}{2} \le w_c \qquad i = 1, \ldots, M \tag{4}$$

$$y_i - \frac{w_i}{2} \ge 0 \qquad i = 1, \ldots, M \tag{5}$$

$$|x_i - x_u| \ge Z_{iu}\,(l_i + l_u)/2 \qquad i, u = 1, \ldots, M \tag{6}$$

$$|y_i - y_u| \ge (1 - Z_{iu})\,(w_i + w_u)/2 \qquad i, u = 1, \ldots, M \tag{7}$$

$$x_i,\ y_i \ge 0,\ Z_{iu} \text{ binary} \qquad i, u = 1, \ldots, M \tag{8}$$

Equation (1) declares the objective function of the leader problem, which minimizes the total intra-cell transportation cost of parts. Equations (2)–(5) are within-site constraints that ensure each machine tool is assigned within the boundaries of its corresponding cell. Equations (6) and (7) enforce overlap elimination for the machine tools. Equation (8) represents the nature of the decision variables, which are binary and non-negative.
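Once the machine centroids are fixed, the model above can be evaluated directly. The sketch below is only an illustration of objective (1) and constraints (2)–(7), not the authors' implementation; the encoding of `U` as a per-part machine route, and all variable names, are assumptions.

```python
def intra_cell_cost(x, y, U, CA, D):
    """Objective (1): total intra-cell transport cost.

    x, y : machine centroid coordinates, indexed by machine
    U[j] : sequence of machines visited by part j, one per operation
           (hypothetical encoding of the binary parameter Ujoi)
    CA, D: intra-cellular unit transfer cost and demand per part j
    """
    total = 0.0
    for j, route in U.items():
        for o in range(len(route) - 1):
            i, u = route[o], route[o + 1]
            if i != u:  # rectilinear distance between consecutive machines
                total += (abs(x[i] - x[u]) + abs(y[i] - y[u])) * CA[j] * D[j]
    return total


def feasible(x, y, l, w, lc, wc, Z):
    """Constraints (2)-(7): machines inside the cell, no overlap."""
    M = len(x)
    for i in range(M):
        if not (l[i] / 2 <= x[i] <= lc - l[i] / 2):    # (2)-(3)
            return False
        if not (w[i] / 2 <= y[i] <= wc - w[i] / 2):    # (4)-(5)
            return False
    for i in range(M):
        for u in range(i + 1, M):
            ok_x = abs(x[i] - x[u]) >= Z[i][u] * (l[i] + l[u]) / 2        # (6)
            ok_y = abs(y[i] - y[u]) >= (1 - Z[i][u]) * (w[i] + w[u]) / 2  # (7)
            if not (ok_x and ok_y):
                return False
    return True
```

Note that the binary variable Z decides along which axis two machines must be separated, which is how the absolute-value constraints (6)–(7) express the non-overlap disjunction.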

Finally, the inter-cell layout problem, which lays out the different cells (here the cells are the facilities) over the entire shop floor, is as follows:

$$\text{Min} \sum_{j=1}^{P} \sum_{o=1}^{O_j-1} \sum_{\substack{c,c'=1 \\ c \neq c'}}^{C} U'_{joc}\, U'_{j,o+1,c'} \left( |x'_c - x'_{c'}| + |y'_c - y'_{c'}| \right) \mathrm{CE}_j D_j \tag{9}$$

s.t.

$$x'_c + \frac{l'_c}{2} \le L \qquad c = 1, \ldots, C \tag{10}$$

$$x'_c - \frac{l'_c}{2} \ge 0 \qquad c = 1, \ldots, C \tag{11}$$

$$y'_c + \frac{w'_c}{2} \le W \qquad c = 1, \ldots, C \tag{12}$$

$$y'_c - \frac{w'_c}{2} \ge 0 \qquad c = 1, \ldots, C \tag{13}$$

$$|x'_c - x'_{c'}| \ge W_{cc'}\,(l'_c + l'_{c'})/2 \qquad c, c' = 1, \ldots, C \tag{14}$$

$$|y'_c - y'_{c'}| \ge (1 - W_{cc'})\,(w'_c + w'_{c'})/2 \qquad c, c' = 1, \ldots, C \tag{15}$$

Aisle constraints:

Horizontal aisle:

$$\left( y'_c + w'_c/2 \right) - Y_{\mathrm{VALL}} \le M\, Z_c \qquad c = 1, \ldots, C \tag{16}$$

$$Y_{\mathrm{VALU}} - \left( y'_c - w'_c/2 \right) \le M\,(1 - Z_c) \qquad c = 1, \ldots, C \tag{17}$$

Vertical aisle:

$$\left( x'_c - l'_c/2 \right) - X_{\mathrm{HALRT}} \le M\, W_c \qquad c = 1, \ldots, C \tag{18}$$

$$X_{\mathrm{HALLF}} - \left( x'_c + l'_c/2 \right) \le M\,(1 - W_c) \qquad c = 1, \ldots, C \tag{19}$$

$$x'_c,\ y'_c \ge 0;\ W_{cc'},\ Z_c,\ W_c \text{ binary} \qquad c = 1, \ldots, C \tag{20}$$

Equation (9) represents the objective function of the follower problem, which minimizes the inter-cell transportation cost of parts. The within-site constraints are enforced by constraints (10)–(13); i.e. these constraints ensure that cells are assigned within the boundaries of the shop floor. Moreover, the overlap elimination constraints (14) and (15) enforce overlap elimination among cells. Constraints (16)–(19) ensure that no cell is assigned within the aisle boundaries. Finally, Eq. (20) specifies that the decision variables are binary and non-negative.
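The big-M pairs (16)–(17) encode a disjunction: with Zc = 0 the cell must lie entirely below the horizontal aisle, and with Zc = 1 entirely above it. A minimal sketch of that logic, as a plain feasibility check rather than the linearized constraints themselves (the function name and arguments are assumptions):

```python
def respects_horizontal_aisle(yc, wc, Y_VALL, Y_VALU):
    """Check that a cell lies entirely below or entirely above the
    horizontal aisle -- the disjunction that constraints (16)-(17)
    encode with the big-M term and the binary variable Zc.

    yc : vertical coordinate of the cell centroid
    wc : vertical dimension (width) of the cell
    """
    below = yc + wc / 2 <= Y_VALL   # Zc = 0 branch of (16)
    above = yc - wc / 2 >= Y_VALU   # Zc = 1 branch of (17)
    return below or above
```

The vertical-aisle pair (18)–(19) follows the same pattern with Wc and the horizontal coordinates.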

## 4. Simulated annealing

Simulated annealing is a stochastic neighbourhood search technique, initially developed by Metropolis et al. and first applied to combinatorial problems by Kirkpatrick et al. [25].

To begin with, the basis of SA lies in statistical mechanics and comes from the similarity between the process of annealing solids and the solving of combinatorial problems. If each feasible solution of the combinatorial optimization problem is viewed as a configuration of atoms, and the objective function value of the corresponding feasible solution as the energy of the system, then the optimal solution of the combinatorial optimization problem corresponds to the lowest-energy state of the physical system [23]. The core of heuristic algorithms for solving combinatorial problems is continual improvement, moving from one solution to another in order to decrease the objective function from one iteration to the next. The same procedure takes place when cooling a physical system from a high to a low temperature in order to reach the required quality.

## 4.1. The elements of an SA algorithm

The core of the SA algorithm is the Metropolis algorithm, which occasionally allows uphill moves. The Metropolis algorithm has four main elements [24, 25]. Figure 2 represents the simulated annealing steps.

1. Initial solution and description of system configuration

This is the starting point of the SA algorithm. There are two main approaches for generating the initial solution. One is to generate the initial solution randomly; with this approach, the feasibility of the initial solution has to be checked. The second approach is to obtain a feasible initial solution by adapting a greedy algorithm or another heuristic algorithm. It should be noted that the initial solution should not be too good, because escaping from its local optimum is then hard.

2. Configuration changes

By moving from one configuration to another, a new neighbourhood solution is generated. These changes are made by operators that are responsible for modifying the current solution.

3. Objective function that represents the quantitative measurement of the goodness of a system

After finding a neighbour, the difference ΔE between the objective value of the new solution (En+1) and that of the current solution (En) is calculated. If ΔE < 0, the objective value of the neighbourhood solution is an improvement over that of the current solution, and the new solution is accepted as the current one. On the other hand, if ΔE ≥ 0, the new solution is accepted with a certain probability. Using this approach, SA tries to escape from the local optimum region in which it is trapped. The probability is based on the so-called Boltzmann probability distribution

$$\mathrm{Prob}(\Delta E) = \exp(-\Delta E / k_b T) \tag{21}$$

where T is the temperature parameter and kb is Boltzmann's constant, which is not required when the Metropolis algorithm is applied to combinatorial problems [16]. The acceptance probability of the new solution depends on two factors. The first is how large the difference ΔE is: the bigger the difference, the smaller the chance of accepting the new solution. The second is the control parameter (temperature). It should be noted that if the initial temperature is not large enough, or if it decreases too quickly, the chance that the algorithm gets trapped in a local optimum is high.
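The acceptance rule of Eq. (21) can be sketched in a few lines; this is a generic Metropolis criterion under the stated convention (kb dropped), with the function name and signature chosen for illustration:

```python
import math
import random


def accept(delta_e, temperature, rng=random):
    """Metropolis criterion: improving moves (delta_e < 0) are always
    accepted; worsening moves are accepted with probability
    exp(-delta_e / T), so both a large deterioration and a low
    temperature reduce the chance of acceptance."""
    if delta_e < 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)
```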

4. Annealing schedule/cooling schedule


Figure 2. Flowchart of simulated annealing.

The annealing schedule determines four rules:

1. Initial temperature: Since the annealing of solids is the basis of the SA approach, the initial temperature is the 'melting point' of the SA algorithm, and it should be chosen so that the acceptance probability of generated solutions is approximately close to one. Kirkpatrick et al. [25] noted that the initial temperature has to be large enough that 80% of generated solutions are accepted. Kia et al. [26] and Baykasoğlu and Gindy [17] set the initial temperature high enough that 95% of generated candidates can be accepted, using the following equation:

$$T_0 = \frac{\mathrm{Objv}_j - \mathrm{Objv}_i}{\ln(0.95)} \tag{22}$$

Objvj and Objvi are the objective values of two random solutions i and j, respectively. It should be noted that the initial temperature T0 is computed once, at the beginning of the SA algorithm.
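Equation (22) yields a positive temperature when Objvj < Objvi, since ln(0.95) is negative. A small sketch of the same rule, written with an absolute value so the sign convention does not matter (the function name and `p_accept` parameter are assumptions):

```python
import math


def initial_temperature(obj_i, obj_j, p_accept=0.95):
    """Eq. (22) rearranged: choose T0 so that a deterioration equal to
    |Objvj - Objvi| (the gap between two random solutions) is accepted
    with probability p_accept.  With Objvj < Objvi this is exactly
    (Objvj - Objvi) / ln(0.95)."""
    return abs(obj_j - obj_i) / -math.log(p_accept)
```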


2. Temperature length: the number of iterations performed at each temperature. Common choices are a specific number of iterations, or continuing until no improvement is observed for a number of iterations.

3. Temperature reduction: the cooling rate by which the temperature is decreased from one temperature step to the next.

4. Termination: there are different approaches for the stopping criterion, such as an exact final temperature or a predetermined total number of temperature steps.

Based on the literature review, there are different approaches for choosing the SA parameters, as explained briefly in Table 2.

| Author | Initial temperature (T0) | Cooling rate (α) | Temperature reduction | Inner loop | Outer loop |
|---|---|---|---|---|---|
| Wilhelm and Ward [16] | 10 | 0.9 | $t_i = 10(0.9)^{i-1}$ | N0 · n | K |
| Heragu and Alfa [27] | 999 | 0.90 | $T = rT$ | Epoch concept | K |
| Baykasoğlu and Gindy [17] | $T_{in} = \frac{f_{min} - f_{max}}{\ln P_c}$, $\ln P_c = \ln(0.95)$ | $\alpha = \left(\frac{\ln P_c}{\ln P_f}\right)^{1/(eL_{max}-1)}$ | $T_{el+1} = \alpha T_{el}$ | IL > LMC | $eL_{max}$ calculated |
| Bazargan-Lari and Kaebernick [11] | 10 | 0.9 | $t_i = 10(0.9)^{i-1}$ | Epoch concept | N0 · n |

Epoch: predetermined specific number of successful pairwise interchanges at each temperature.

N0: predetermined integer.

n: total number of facilities.

K: predetermined integer, the total number of temperature steps.

Table 2. SA parameters.

## 4.2. Developed simulated annealing for FLP

#### 4.2.1. Initialization

A unique heuristic is used to generate a feasible initial solution for the SA algorithm [7, 8]. The developed heuristic is explained in Section 4.2.1.1.

## 4.2.1.1. Initialization heuristic

The mechanics of the developed algorithm are very different from those of the available heuristics in the literature. The idea behind the algorithm is to place facilities radially along vectors $\vec{r}_f$ that originate from the centroid of the space in which all facilities are to be placed, as shown in Figure 3. The radial vectors along which the facilities are placed are separated by an angle θ = 360°/M.

Figure 3. The mechanics of developed heuristics.

At the start of the heuristic, the given area is first divided into four equal-size quadrants, i.e. Q1, Q2, Q3 and Q4. Afterwards, all facilities are placed on top of each other in the middle of the given area. The developed heuristic algorithm consists of two nested loops.

#### 4.2.1.1.1. Outer loop

For each iteration of the outer loop, one random facility (called the target facility) $f_G$ is chosen and located radially along the radius $\vec{r}_f$, which makes an angle θ′ with the abscissa, as shown in Figure 3.

$$\theta' = i \times \theta \qquad i = 1, 2, \ldots, M \tag{23}$$

Facilities must be placed within the boundaries of the given area. To satisfy this constraint, a vector $\vec{a}$ of random magnitude along the direction of $\vec{r}_f$ is taken, and facility $f_G$ is placed at the end of this vector. The length of $\vec{a}$ is a random number in $[0, |\vec{r}_f| - r]$, where $r$ is the length of the diagonal of facility $f_G$. The next step is checking for overlap between all facilities. If any overlap occurs between the target facility $f_G$ and the given area's boundaries, or between $f_G$ and the previously placed facilities, the inner loop is triggered. It should be noted that the coordinates of each facility are calculated relative to an origin located at the bottom-left corner of the site, as shown in Figure 3.
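The radial placement of the outer loop can be sketched as below. This is a simplified illustration under stated assumptions (facility centroids, a single radial length `r_len`, degrees converted to radians); it omits the overlap check that triggers the inner loop:

```python
import math
import random


def radial_position(i, M, cx, cy, r_len, diag, rng=random):
    """Outer-loop placement sketch (eq. (23)): facility i goes on the
    radial vector at angle theta' = i * 360/M degrees from the centroid
    (cx, cy), at a random distance a drawn from [0, |r_f| - r], where
    r (here `diag`) is the facility's diagonal length."""
    theta = math.radians(i * 360.0 / M)
    a = rng.uniform(0.0, max(r_len - diag, 0.0))
    return cx + a * math.cos(theta), cy + a * math.sin(theta)
```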

#### 4.2.1.1.2. Inner loop

Different repair functions, based on the type of overlap, have been developed to eliminate overlap. The repair functions guarantee the elimination of overlap between facilities and the allocation of each facility within the boundaries of its corresponding quadrant. However, if the corresponding quadrant is too congested, the overlapped facility can be placed partially in a different quadrant; nevertheless, no facility is allowed to violate the given area's boundaries. The inner loop has two main steps. In the first step, the overlap between facility $f_G$ and the overlapped facility $f_j$ is repaired. Afterwards, overlap checking is performed for all facilities, from the last placed facility to the first, to see whether the repair done in the previous step has caused further overlaps. If no overlap takes place, the inner loop ends and the algorithm goes back to the outer loop to place another facility, given that a facility is still left to be placed. However, if overlap is detected when checking all the facilities, the second step of the inner loop is enacted.

The second step of the inner loop consists of a few iterations. In each iteration, one facility $f_i$ is selected as the target facility, and then the possibility of overlap between the target facility and the rest of the previously placed facilities is checked. If there is overlap between the target facility $f_i$ and a facility $f_j$, the overlap elimination algorithms are enacted. The overlap has two main projections: one in the x-direction, Δx, and another in the y-direction, Δy. Δx represents the horizontal overlap between the two facilities $f_i$ and $f_j$; in a similar fashion, Δy is the vertical overlap between the two overlapped facilities, as demonstrated in Figure 4. If Δx ≤ Δy, the overlap is fixed by removing the overlap in the x-direction; otherwise, it is removed in the y-direction. The repair mechanism starts by moving the target facility $f_i$ by the overlap distance Δ in the appropriate direction.

Figure 4. Scheme of overlap between two facilities $f_i$ and $f_j$.

Since no facility is allowed to violate the given area's boundaries, it is necessary to know how much distance is left between facility $f_i$ and the cell/floor (or quadrant) boundaries. If the distance left is less than the overlap Δ, then overlap elimination is carried out on facility $f_j$ instead. Moreover, if the distance left between facility $f_j$ and the site (or quadrant) boundaries is not less than the overlap Δ, the overlap distance Δ should be applied to both facilities $f_i$ and $f_j$. At the end of each iteration, overlap is checked once again to handle any newly occurring overlap. This loop is repeated until all overlaps and intersections between facilities are repaired. A summary of the developed initialization heuristic is given in Figure 5.
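The Δx/Δy repair rule above can be sketched as follows. This is a minimal illustration under assumed data structures (each facility as a `(x, y, length, width)` centroid tuple); the boundary handling described in the text is deliberately omitted for brevity:

```python
def overlap_projections(fi, fj):
    """Overlap projections (dx, dy) between two axis-aligned facilities,
    each given as (x, y, length, width) with centroid coordinates; a
    non-positive value means no overlap along that axis."""
    xi, yi, li, wi = fi
    xj, yj, lj, wj = fj
    dx = (li + lj) / 2 - abs(xi - xj)
    dy = (wi + wj) / 2 - abs(yi - yj)
    return dx, dy


def repair(fi, fj):
    """Inner-loop repair sketch: if dx <= dy the overlap is removed in
    the x-direction, otherwise in the y-direction, by moving the target
    facility fi away from fj by the overlap distance."""
    dx, dy = overlap_projections(fi, fj)
    if dx <= 0 or dy <= 0:
        return fi  # facilities overlap only if both projections are positive
    xi, yi, li, wi = fi
    xj, yj = fj[0], fj[1]
    if dx <= dy:
        xi += dx if xi >= xj else -dx
    else:
        yi += dy if yi >= yj else -dy
    return (xi, yi, li, wi)
```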


Figure 5. Summary of developed initialization heuristic algorithm.
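The overlap test that drives both loops can be sketched as follows. This is a minimal illustration, assuming axis-aligned rectangular facilities described by their centroids and dimensions; the class and function names are illustrative and not taken from the chapter.

```python
from dataclasses import dataclass

@dataclass
class Facility:
    # Centroid coordinates and dimensions of an axis-aligned rectangle.
    x: float       # centroid x
    y: float       # centroid y
    length: float  # extent along x
    width: float   # extent along y

def overlap_projections(a: Facility, b: Facility):
    """Overlap of the two rectangles projected on the x- and y-axes
    (the chapter's delta-x and delta-y); both are positive only if the
    facilities actually overlap."""
    dx = (a.length + b.length) / 2 - abs(a.x - b.x)
    dy = (a.width + b.width) / 2 - abs(a.y - b.y)
    return dx, dy

def overlaps(a: Facility, b: Facility) -> bool:
    dx, dy = overlap_projections(a, b)
    return dx > 0 and dy > 0

def inside_site(f: Facility, site_length: float, site_width: float) -> bool:
    """True if f lies fully within the site, whose origin is the bottom-left
    corner as in Figure 3."""
    return (f.x - f.length / 2 >= 0 and f.x + f.length / 2 <= site_length
            and f.y - f.width / 2 >= 0 and f.y + f.width / 2 <= site_width)
```

With these predicates, the inner loop amounts to re-checking `overlaps` and `inside_site` for every placed facility after each repair.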

### 4.2.2. Neighbourhood solution scheme

In order to generate a new neighbourhood solution, two main operators, namely the move operator and the swap operator, have been developed. The move operator tries to bring facilities closer to each other, while the swap operator switches the locations of two facilities. The details of these two operators are explained below.

#### 4.2.2.1. Move operator

The developed move operator tries to reduce the distances between facilities. The logic behind this algorithm is to decrease the distance between one randomly chosen facility, called the in-context facility, and the facility closest to it. By moving the in-context facility towards its closest facility, the possibility of overlap between the in-context facility and the rest of the facilities is decreased. The main point here is how large the maximum\_movable\_distance is: it is the maximum length by which the in-context facility can be moved towards its closest facility without any overlap occurring between them. The steps of the move operator algorithm are explained below:


1. Randomly choose one facility, called the in-context facility f <sup>G</sup>.

2. The Euclidean distance between the centroid of the in-context facility f <sup>G</sup> and each of the remaining facilities is calculated:

$$\text{Dis}_{Gi} = \sqrt{\left(X_G - X_i\right)^{2} + \left(Y_G - Y_i\right)^{2}} \qquad \forall\, i = 1, 2, \ldots, M \text{ and } i \neq G \tag{24}$$

3. Facilities are sorted by the distances found in step 2 in ascending order. The first one in this set is the closest facility f <sup>C</sup> to the in-context facility f <sup>G</sup>.

4. Divide the in-context facility f <sup>G</sup> into four equal-sized quadrants by the origin of its centroid.

5. Find in which quadrant of the in-context facility f <sup>G</sup> the closest facility f <sup>C</sup> is located.

6. At this point the maximum\_movable\_distance $|\overrightarrow{CC'}|$ is calculated. To find this distance, two points C and C′ have to be found: C is the conjunction of vector $\overrightarrow{r'}$ and the boundary of the in-context facility f <sup>G</sup> closest to the closest facility f <sup>C</sup>; C′ is the conjunction of vector $\overrightarrow{r''}$ and the boundary of the closest facility closest to the in-context facility. To do this, these concepts are defined:

$\overrightarrow{O'O''}$: vector between the centroids of the in-context facility f <sup>G</sup> and the closest facility f <sup>C</sup>.

θ1: the angle between vector $\overrightarrow{O'O''}$ and the horizontal line.

θ2: the angle between vector $\overrightarrow{O'O''}$ and the vertical line.

$|\overrightarrow{CC'}|$: maximum\_movable\_distance.

$$\theta_1 = \tan^{-1} \frac{|\text{Opposite side}|}{|\text{Adjacent side}|} = \tan^{-1} \frac{|Y_G - Y_C|}{|X_G - X_C|} \tag{25}$$

$$\theta_2 = \tan^{-1} \frac{|\text{Opposite side}|}{|\text{Adjacent side}|} = \tan^{-1} \frac{|X_G - X_C|}{|Y_G - Y_C|} \tag{26}$$

Also: θ2 = 90° − θ1.

Here X<sub>G</sub> and Y<sub>G</sub> are the horizontal and vertical coordinates of the centroid of the in-context facility f <sup>G</sup>, respectively; similarly, X<sub>C</sub> and Y<sub>C</sub> are the horizontal and vertical coordinates of the centroid of the closest facility f <sup>C</sup>.

It has to be noted that the lengths of both vectors $\overrightarrow{r'}$ and $\overrightarrow{r''}$ depend on their corresponding angles θ1 and θ2. Figures 6 and 7 illustrate this.

$$|\overrightarrow{r'}| = \begin{cases} \dfrac{L_G/2}{\cos\theta_1} & \text{if } 0 \le \theta_1 \le 45^{\circ} \\[2mm] \dfrac{W_G/2}{\sin\theta_1} & \text{if } 45^{\circ} \le \theta_1 \le 90^{\circ} \end{cases} \tag{27}$$

$$|\overrightarrow{r''}| = \begin{cases} \dfrac{W_C/2}{\cos\theta_2} & \text{if } 0 \le \theta_2 \le 45^{\circ} \\[2mm] \dfrac{L_C/2}{\sin\theta_2} & \text{if } 45^{\circ} \le \theta_2 \le 90^{\circ} \end{cases} \tag{28}$$
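Equations (25)–(28) can be sketched in code as follows; this is a minimal illustration assuming angles in degrees, with function names of our own choosing.

```python
import math

def angles(xg, yg, xc, yc):
    """theta1: angle of the centroid-to-centroid vector with the horizontal
    (Eq. (25)); theta2 = 90 - theta1, its angle with the vertical (Eq. (26)).
    Both returned in degrees."""
    theta1 = math.degrees(math.atan2(abs(yg - yc), abs(xg - xc)))
    return theta1, 90.0 - theta1

def r_prime_length(l_g, w_g, theta1):
    """|r'|: distance from the centroid of f_G to its own boundary along the
    centroid-to-centroid direction (Eq. (27))."""
    t = math.radians(theta1)
    return (l_g / 2) / math.cos(t) if theta1 <= 45 else (w_g / 2) / math.sin(t)

def r_double_prime_length(l_c, w_c, theta2):
    """|r''|: the same quantity for the closest facility f_C; theta2 is
    measured from the vertical, hence the width appears first (Eq. (28))."""
    t = math.radians(theta2)
    return (w_c / 2) / math.cos(t) if theta2 <= 45 else (l_c / 2) / math.sin(t)
```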

where LG and WG are the length and width of the in-context facility f <sup>G</sup>, respectively. Similarly, LC and WC are the length and width of the closest facility f <sup>C</sup>, respectively.

Based on the quadrant in which the closest facility is located, the C and C′ coordinates are calculated by the equations shown in Table 3.

Hence, the length of vector $\overrightarrow{CC'}$ is

$$|\overrightarrow{CC'}| = \sqrt{\left(X_C - X_{C'}\right)^2 + \left(Y_C - Y_{C'}\right)^2} \tag{29}$$

7. At this point the length of the movement, called ml, is a random number in the interval $\left(0, |\overrightarrow{CC'}|\right]$. Furthermore, the direction of the movement is along the vector $\overrightarrow{CC'}$.
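Step 7 can be sketched as below. It assumes the movement direction coincides with the centroid-to-centroid direction and that the maximum\_movable\_distance $|\overrightarrow{CC'}|$ has already been computed; the names and the optional random generator are illustrative.

```python
import math
import random

def move_in_context(xg, yg, xc, yc, cc_length, rng=None):
    """Move the in-context facility f_G towards the closest facility f_C by a
    random length ml drawn from (0, |CC'|] (step 7), where cc_length is the
    maximum_movable_distance |CC'| of Eq. (29)."""
    rng = rng or random.Random()
    dx, dy = xc - xg, yc - yg
    dist = math.hypot(dx, dy)
    ml = rng.uniform(0.0, cc_length)  # movement length
    # move along the unit vector from f_G's centroid towards f_C's centroid
    return xg + ml * dx / dist, yg + ml * dy / dist
```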

Figure 6. Angle calculation in move operator (I).

Figure 7. Concept of angle in move operator (II).



Table 3. C and C<sup>0</sup> coordinates.


Table 4. New coordinate of f <sup>G</sup> after move.

#### 4.2.2.2. Swap operator


The second operator of the developed SA is the swap operator, which switches the positions of two facilities. The point here is how to swap two facilities with the minimum possibility of overlap. To do that, a new concept called the free zone is defined. To apply this concept, a random facility f <sup>G</sup> is chosen, and the available free space around this facility, called FZG, is determined by applying the maximum\_movable\_distance concept introduced in the move operator. It has to be noted that the centroid of the free zone FZG is the same as the centroid of the facility f <sup>G</sup>. If there is any facility whose area is greater than the area of the facility f <sup>G</sup> and less than the area of the free zone FZG, then that facility is qualified for swapping. By swapping such a facility with facility f <sup>G</sup>, the possibility of overlap is minimized. Moreover, if more than one facility is qualified to swap with the facility f <sup>G</sup>, one facility is chosen randomly. Figure 8 shows the scheme of the free zone concept. The algorithm below explains the swap operator's steps in detail:

1. One facility is chosen randomly, called facility f <sup>G</sup>.


Figure 8. Free zone concept.

2. The closest facility to f <sup>G</sup> is determined (details mentioned in the move operator).

3. Maximum\_movable\_distance is calculated.

4. The free zone FZG of facility f <sup>G</sup> is determined.

5. The areas of facility f <sup>G</sup> and FZG are calculated.

6. Among the rest of the facilities, those whose areas are greater than the area of facility f <sup>G</sup> and less than the area of the free zone FZG are found.

7. One facility among those found in step 6 is chosen randomly; call it f <sup>i</sup>.

8. Swap facility f <sup>G</sup> with facility f <sup>i</sup>.

9. Calculate the new coordinates of both f <sup>G</sup> and f <sup>i</sup>.

10. End.

Assume:

LG: length of f <sup>G</sup>

WG: width of f <sup>G</sup>

ml: maximum movable distance

LFZ: length of the FZ

WFZ: width of the FZ

AFZ: area of FZ

$$AC = \min\left(X_G - \frac{L_G}{2},\; ml \times \cos\theta_1\right) \tag{30}$$

$$AC' = \min\left(Y_G - \frac{W_G}{2},\; ml \times \sin\theta_1\right) \tag{31}$$

$$L_{FZ} = L_G + 2\,AC \tag{32}$$

$$W_{FZ} = W_G + 2\,AC' \tag{33}$$

$$A_{FZ} = L_{FZ} \times W_{FZ} \tag{34}$$
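Equations (30)–(34) and the swap-qualification test can be sketched as follows, assuming θ1 in degrees and the symbol meanings defined above; the helper names are our own.

```python
import math

def free_zone(xg, yg, l_g, w_g, ml, theta1):
    """Free-zone dimensions around facility f_G (Eqs. (30)-(34)); ml is the
    maximum movable distance and theta1 is in degrees."""
    t = math.radians(theta1)
    ac = min(xg - l_g / 2, ml * math.cos(t))    # Eq. (30)
    acp = min(yg - w_g / 2, ml * math.sin(t))   # Eq. (31)
    l_fz = l_g + 2 * ac                         # Eq. (32)
    w_fz = w_g + 2 * acp                        # Eq. (33)
    return l_fz, w_fz, l_fz * w_fz              # Eq. (34)

def qualifies_for_swap(area_candidate, area_g, area_fz):
    """A facility qualifies for swapping with f_G when its area lies between
    the area of f_G and the area of the free zone FZ_G."""
    return area_g < area_candidate < area_fz
```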

#### 4.2.3. Aisle constraints

In the case of an aisle, the move and swap operators vary. The details are presented in the sections below.

#### 4.2.3.1. Move operator

The move operator has the same procedure as the move operator developed for the case with no aisle. Hence, in the case of an aisle, one facility f <sup>G</sup> is chosen randomly and moved towards its closest facility f <sup>C</sup>. Afterwards, the possibility of overlap between the aisle and the new position of facility f <sup>G</sup>, called f′<sup>G</sup>, is considered. If any overlap happens, it has to be fixed. To do that, two repair functions have been developed.

#### 4.2.3.2. Before-aisle repair function

The idea behind this function is that if any overlap between f′<sup>G</sup> and the aisle happens, the facility f′<sup>G</sup> is moved back to just before the aisle; that is, f′<sup>G</sup> is moved back behind the boundary of the aisle that it passed over. Figures 9 and 10 represent the overlap conditions for the cases of a vertical and a horizontal aisle.

The steps of the move operator with aisle constraints are explained as follows:

Step 1. Move facility f <sup>G</sup> towards its closest facility. Calculate the new coordinates of facility f <sup>G</sup> and call it facility f′<sup>G</sup>.

Step 2. Check the possibility of overlap between f′<sup>G</sup> and the aisle.

Step 3. If there is any overlap, apply the appropriate repair function.

Step 4. Find the coordinates of f′<sup>G</sup> (details are shown in Tables 5 and 6).

Step 5. End

#### Repair function-horizontal aisle

• Facility f <sup>G</sup> is on the lower side of the aisle:

$$\text{Rep} = \left(\left(y'_G + \frac{W_G}{2}\right) - \left(Y_A - \frac{W_A}{2}\right)\right) \Big/ \sin\theta \tag{35}$$

• Facility f <sup>G</sup> is on the upper side of the aisle:

$$\text{Rep} = \left(\left(Y_A + \frac{W_A}{2}\right) - \left(y'_G - \frac{W_G}{2}\right)\right) \Big/ \sin\theta \tag{36}$$

#### Repair function-vertical aisle

• Facility f <sup>G</sup> is on the left side of the aisle:

$$\text{Rep} = \left(\left(x'_G + \frac{L_G}{2}\right) - \left(X_A - \frac{L_A}{2}\right)\right) \Big/ \cos\theta \tag{37}$$

• Facility f <sup>G</sup> is on the right side of the aisle:

$$\text{Rep} = \left(\left(X_A + \frac{L_A}{2}\right) - \left(x'_G - \frac{L_G}{2}\right)\right) \Big/ \cos\theta \tag{38}$$
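For the horizontal-aisle case, Eqs. (35)–(36) can be sketched as below; θ is assumed to be the movement angle measured from the horizontal, and the parameter names are illustrative.

```python
import math

def repair_horizontal_aisle(y_new, w_g, y_a, w_a, theta_deg, lower_side=True):
    """Backup length Rep for a horizontal aisle (Eqs. (35)-(36)).
    y_new: y-centroid of the moved facility f'_G; w_g: its width;
    y_a, w_a: centroid and width of the aisle; theta_deg: movement angle
    measured from the horizontal, in degrees."""
    s = math.sin(math.radians(theta_deg))
    if lower_side:  # facility approaches the aisle from below, Eq. (35)
        return ((y_new + w_g / 2) - (y_a - w_a / 2)) / s
    # facility approaches from above, Eq. (36)
    return ((y_a + w_a / 2) - (y_new - w_g / 2)) / s
```

The vertical-aisle case (Eqs. (37)–(38)) is analogous, with x-coordinates, facility and aisle lengths, and cos θ in place of sin θ.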

Figure 9. Before-aisle move operator for horizontal aisle.

Figure 10. Before-aisle move operator for vertical aisle.



Table 5. Revised coordinate based on before-aisle repair function-horizontal aisle.


Table 6. Revised coordinate based on before-aisle repair function-vertical aisle.

#### 4.2.4. Developed SA algorithm

In this chapter, the parameters adopted by Bazargan-Lari and Kaebernick [10] have been used in the developed SA algorithm:

1. Initial temperature: 10

2. Cooling rate: 0.9

3. Temperature reduction: $t_i = 10\,(0.9)^{i-1}$

4. Outer loop: 25

5. Inner loop: 100 · M, where M is the total number of facilities
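With these parameters, the overall SA loop can be sketched as follows. The `cost` and `neighbour` callables stand in for the chapter's objective function and move/swap operators; this is a generic skeleton, not the authors' exact implementation.

```python
import math
import random

def simulated_annealing(initial, cost, neighbour, m_facilities,
                        t0=10.0, cooling=0.9, outer=25, rng=None):
    """SA skeleton using the parameters listed above: initial temperature 10,
    cooling rate 0.9 (t_i = 10 * 0.9**(i-1)), 25 outer iterations and 100*M
    inner iterations."""
    rng = rng or random.Random(42)
    current, c_cur = initial, cost(initial)
    best, c_best = current, c_cur
    t = t0
    for _ in range(outer):
        for _ in range(100 * m_facilities):
            cand = neighbour(current, rng)
            c_cand = cost(cand)
            # accept improvements always; worse candidates with Boltzmann probability
            if c_cand < c_cur or rng.random() < math.exp((c_cur - c_cand) / t):
                current, c_cur = cand, c_cand
                if c_cur < c_best:
                    best, c_best = current, c_cur
        t *= cooling  # temperature reduction between outer iterations
    return best, c_best
```

For example, minimizing a simple one-dimensional cost with a random-step neighbour converges close to the optimum within the 25 × 100·M iteration budget.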


#### 5. Case study

Carbide Tool Inc. manufactures and distributes metalworking tools. The company is dedicated to developing specialized carbide, polycrystalline diamond (PCD) and cubic boron nitride (CBN) inserts, as well as multi-task tooling for the aerospace, automotive and mould-die industries. The company currently has a process layout configuration. Five different kinds of family cutting insert tools are produced. Each part has specific monthly demand. There are different kinds of unequal sized grinding machine tools. Some of the machine tools have identical copies on the shop floor to increase productivity. Therefore, the demand is being shared among the different copies of those machine tools. Different operations are performed on inserts with different sequences. The list of operations of each insert and the machine tools used for those operations are shown in Table 7.

The company's shop floor has a rectangular shape. There is no special material handling device for moving unfinished products among machine tools. As demonstrated in Table 7, the number of operations performed on each part is large; hence, the amount of travel taking place every day on the production floor is quite significant. Additionally, as per the original layout, all the raw materials are transported from the back side of the shop to the front to start operation. This causes extra unnecessary travel, and hence extra material handling cost. The inspection and shipping stations, which are two of the last steps in the sequence of operations, are not properly positioned in the current layout, because they are located at the front of the floor. Since the current layout is a process layout, similar machine tools are grouped together and located on one side of the floor. The original layout causes too much traffic, since it does not take into account the sequence of processing of the parts. For example, the primary operations of all insert tools are performed by a combination of three machine tools: Double Disk, Blanchard and Wendt. All Wendt machines are located on the upper side of the hall, while the Blanchard and Double Disk machines are arranged on the lower side. Therefore, it can be concluded that too much back-and-forth travel takes place between the two sides of the floor just to perform the first couple of operations.


Table 7. Machine tool characterizations and parts' operations sequence.


After having several meetings with the plant manager and executive board of the company, cellular layout was chosen as the best layout plan. Group formation was performed by the plant manager. Four cells with specific types of machine tools were designed as given in Table 8. The problem was solved using both the developed mathematical model and heuristic [7].


| Cell name | Machine tools/work station |
|---|---|
| Primary | M1 (1), M2 (2), M4 (1), M3 (3) |
| Grinding | M6 (2), M8 (2), M9 (1) |
| Diamond | M10 (2), M7 (1), M5 (1), M12 (1), M11 (1) |
| Final | M13 (1), ST1 (1), ST2 (1), ST3 (1) |

Table 8. GF results.

#### 5.1. Mathematical model

#### 5.1.1. Intra-cellular layout

For the leader problem, the layout of the different machine tools and work stations in their respective cells is solved. The intra-cellular travel costs per unit distance per unit of each part are ¢10, ¢10, ¢15, ¢12 and ¢20 for Dog Bone, S Shape, Triangular, Top Notch and all types of Diamond, respectively. The results for the intra-cellular layout are summarized in Table 9.


| Cell | Length | Width | X | Y | MHC (material handling cost) |
|---|---|---|---|---|---|
| Primary | 35 | 25 | 42.5 | 13.5 | \$1191.550 |
| Grinding | 26 | 20 | 74 | 50 | \$520.588 |
| Diamond | 30 | 20 | 45 | 59.22 | \$764.580 |
| Final | 30 | 20 | 75 | 8 | \$1056.350 |
| Aisle | 90 | 60 | 45 | 32.5 | |

Table 9. Intra-cell material handling costs and inter-cell dimensions of cells.

#### 5.1.2. Inter-cellular layout

In the follower problem, the four cells with exact dimensions are located on the 90" · 60" shop floor. The inter-cellular travel cost per unit distance for each unit of Dog Bone, S shape, Triangular, Top Notch, and Diamond are ¢12, ¢12, ¢18, ¢15, and ¢20, respectively. Material handling cost for the inter-cellular layout is \$7520.42. Table 11 shows the coordinates of cells based on inter-cellular layout plan. The final sketch of inter-cellular and intra-cellular layout is shown in Figure 11.
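The inter-cellular material handling cost is, in essence, a sum of unit cost × flow quantity × travel distance over all part flows between cells. The sketch below illustrates this; the rectilinear centroid-to-centroid distance and the dictionary layout are assumptions for illustration, not the chapter's exact model.

```python
def material_handling_cost(centroids, flows, unit_costs):
    """Inter-cellular MHC sketch: sum over parts and cell pairs of
    unit cost x flow quantity x centroid-to-centroid distance."""
    total = 0.0
    for part, routes in flows.items():
        for (src, dst), qty in routes.items():
            (xa, ya), (xb, yb) = centroids[src], centroids[dst]
            dist = abs(xa - xb) + abs(ya - yb)  # rectilinear distance
            total += unit_costs[part] * qty * dist
    return total
```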

Figure 11. Inter-cell and intra-cell layout plan.

#### 5.2. Heuristic


The heuristic is applied to solve the intra-cellular layout problems for each respective cell. The results obtained are provided in Table 10 and plotted in Figure 12. The material handling cost for the inter-cellular layout is \$6134.50 [6].

#### 5.3. Simulated annealing (SA)

To validate the efficiency of the developed SA algorithm, it was applied for 10 runs for each cell [8].

#### 5.4. Discussion

The comparison between the solutions provided by the non-linear model, the linear model and simulated annealing is presented in Table 11. The linear model gives the exact optimum solution, whereas simulated annealing provides a near-optimum solution, and the results confirm this. In both the leader and the follower problem, i.e. intra- and inter-cell, respectively, the total material handling cost obtained by the linearized model is less than the costs provided by non-linear mixed integer programming and simulated annealing.


| Cell | Machine | X | Y |
|---|---|---|---|
| Primary | Blanchard | 14.5 | 18.30 |
| Primary | Blanchard | 21.27 | 17.16 |
| Primary | Polish | 14.5 | 5.98 |
| Primary | Wendt | 6.58 | 7.64 |
| Primary | Wendt | 23.34 | 4.45 |
| Primary | Double Disc | 23.83 | 10 |
| Primary | Wendt | 7.25 | 17.72 |
| Grinding | Surface grinding | 16.79 | 4 |
| Grinding | Surface grinding | 21.05 | 16 |
| Grinding | Swing fixture | 18.97 | 10 |
| Grinding | Swing fixture | 5.72 | 15.26 |
| Grinding | V-bottom | 8.55 | 6.76 |
| Diamond | Wire cutting | 10.39 | 3.8 |
| Diamond | Wire cutting | 5.50 | 10 |
| Diamond | Surface grinding | 18 | 5.64 |
| Diamond | Brazing | 26 | 10 |
| Diamond | Ewag | 11.53 | 16.31 |
| Diamond | Laser M/c | 18.8 | 14.87 |
| Final | ETCH | 26.19 | 9 |
| Final | Wash | 15 | 12.42 |
| Final | Inspection | 4.89 | 9 |
| Final | Packing | 15 | 5 |

Table 10. Machine coordinates based on heuristic (cell MHC values: \$734.581, \$669.480, \$808.640 and \$2410.760).


The follower problem solved by simulated annealing assumed only the aisle.

Generally speaking, the linearized model has yielded exact optimal results, which proved to be better than those obtained by both simulated annealing and the original non-linear model.

Figure 12. Heuristic results showing the layout at the intra-cell level for different cells (note: quadrants have been plotted to demonstrate how facilities were spread around the different quadrants as per the working of the algorithm).


Table 11. Comparisons between mathematical modelling and simulated annealing.


This was quite expected. In most cases simulated annealing resulted in better solutions than the non-linear model; the exceptions were the grinding cell and the diamond cell, where the non-linear model slightly outperformed simulated annealing.

Table 12 summarizes the results for both the leader and the follower problems. Both the mean and the standard deviation (SDV) of the 10 runs performed are provided. The standard deviation is good except for the inter-cell layout problem; for the inter-cell case, we believe the algorithm can still be improved, as its variance, shown in Table 12, is relatively high.


| Cell | Average | SDV |
|---|---|---|
| Primary | \$633.86 | \$11.19 |
| Grinding | \$492.44 | \$15.63 |
| Diamond | \$759.790 | \$22.315 |
| Final | \$902.62 | \$32.23 |
| Inter-cell | \$5474.61 | \$423.97 |

Table 12. Mean and standard deviation of SA solutions.
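The mean and SDV reported in Table 12 are the usual sample statistics over the 10 runs. For example, on hypothetical per-run costs (the chapter reports only the resulting mean and SDV, not the raw run values):

```python
from statistics import mean, stdev

# Hypothetical material-handling costs from 10 SA runs of one cell;
# the actual per-run values are not listed in the chapter.
runs = [621.4, 645.2, 630.1, 628.8, 640.0, 622.5, 637.9, 631.3, 648.6, 633.0]
avg, sdv = mean(runs), stdev(runs)  # sample mean and sample standard deviation
```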

## 6. Conclusion

Cellular manufacturing system (CMS) layout has recently begun to receive heightened attention worldwide. The design of a CMS includes: (1) cell formation (CF), (2) group layout, (3) group scheduling and (4) resource allocation. An effective CMS implementation helps any company improve machine utilization and quality; it also reduces setup time, work-in-process inventory, material handling cost, part makespan and expediting costs.

There are two main approaches to the FLP: the discrete and the continuous approach. The discrete approach holds two main assumptions: one is that all facilities are of equal size and shape; the other is that the locations of facilities are predetermined. However, these assumptions are not realistic. The discrete approach is not suited to representing the exact locations of facilities; moreover, it is not applicable to FLPs with facilities of unequal size and shape. The appropriate approach to this kind of FLP is the continuous representation.

Generally speaking, the design of a layout cannot be efficient if manufacturing attributes are not considered in it. In particular, operation sequence and parts' demand are two factors that have a significant impact on the flow rate, which drives the main objective of the FLP. The majority of studies in the literature have not considered these factors in the design of the layout plan. Besides these manufacturing attributes, the available area of the shop that can be used for locating facilities is another factor that has to be considered.

The facility layout problem for a cellular manufacturing system at both the inter- and intra-cellular levels is considered in this chapter. The problem is to arrange facilities, which are cells in the leader problem and machine tools in the follower problem, on a continuous planar site. Operation sequence and parts' demand are the two main manufacturing attributes considered in the developed model. An MIP has been presented for both the leader and the follower problems, and novel aisle constraints have been included in the mathematical formulation. Since the model is non-linear, a linearized model has been developed. Additionally, a novel mathematical model has been developed for considering block constraints such as fixed departments and facilities. Since the FLP is an NP-hard problem, novel heuristics are presented in this chapter.

A novel heuristic was developed for finding a feasible initial solution for the designed metaheuristic algorithm, simulated annealing. The initial solution is based on radial movement; in other words, the algorithm places facilities along a specific radius at a certain angle within the site. The algorithm starts by dividing the site into four equal-sized quadrants and places facilities from the first quadrant to the fourth. After placing any new facility, the possibility of overlap between facilities, and between each facility and the site boundaries, is checked. Different repair functions have been designed for the different cases.

The SA algorithm was developed for both the inter- and intra-cellular problems. The results of the heuristic have been used to initialize the developed SA algorithm; however, in order to have a more efficient SA, the cell size used in the heuristic algorithm is assumed to be twice the original size of the cells. The two main operators used are the move and swap operators. The move operator decreases the distance between facilities by moving the target facility towards the facility closest to it. Furthermore, the swap operator was developed by defining the concept of the free zone.

## Author details


Maral Zafar Allahyari<sup>1</sup> and Ahmed Azab<sup>2</sup> \*

\*Address all correspondence to: azab@uwindsor.ca

1 Industrial Engineering, University of Windsor, Windsor, Ontario, Canada

2 Production & Operations Management Research Lab, Faculty of Engineering, University of Windsor, Ontario, Canada

## References

[8] Zafar AM. Bi-Level Mathematical Programming and Heuristics for the Cellular Manufacturing Facility Layout Problem. MASc Thesis, University of Windsor, 2014.

[9] Alfa Sule A, Mingyuan C, Sunderesh SH. Integrating the grouping and layout problems in cellular manufacturing systems. Computers & Industrial Engineering. 1992;23(1):55–58.

[10] Kaebernick H, Bazargan-Lari M, Arndt G. An integrated approach to the design of cellular manufacturing. CIRP Annals—Manufacturing Technology. 1996;45(1):421–425.

[11] Bazargan-Lari M, Kaebernick H. An approach to the machine layout problem in a cellular manufacturing environment. Production Planning & Control. 1997;8(1):41–55.

[12] Bazargan-Lari M. Layout designs in cellular manufacturing. European Journal of Operational Research. 1999;112(2):258–272.

[13] Bazargan-Lari M, Kaebernick H, Harraf A. Cell formation and layout designs in a cellular manufacturing environment: a case study. International Journal of Production Research. 2000;38(7):1689–1709.

[14] Imam MH, Mir M. Automated layout of facilities of unequal areas. Computers & Industrial Engineering. 1993;24:355–366.

[15] Mir M, Imam MH. A hybrid optimization approach for layout design of unequal-area facilities. Computers & Industrial Engineering. 2001;39(1–2):49–63.

[16] Wilhelm MR, Ward TL. Solving quadratic assignment problems by 'simulated annealing'. IIE Transactions. 1987;19(1):107–119.

[17] Baykasoğlu A, Gindy NNZ. A simulated annealing algorithm for dynamic layout problem. Computers & Operations Research. 2001;28(14):1403–1426.

[18] Rosenblatt MJ. The dynamics of plant layout. Management Science. 1986;32(1):76–86.

[19] Conway DG, Venkataramanan MA. Genetic search and the dynamic facility layout problem. Computers & Operations Research. 1994;21(8):955–960.

[20] Balakrishnan J, Cheng CH. Genetic search and the dynamic layout problem. Computers & Operations Research. 2000;27(6):587–593.

[21] Tavakkoli-Moghaddam R, Aryanezhad MB, Safaei N, Azaron A. Solving a dynamic cell formation problem using metaheuristics. Applied Mathematics and Computation. 2005;170(2):761–780.

[22] Safaei N, Saidi-Mehrabad M, Jabal-Ameli MS. A hybrid simulated annealing for solving an extended model of dynamic cellular manufacturing system. European Journal of Operational Research. 2008;185(2):563–592.

[23] Golden BL, Skiscim CH. Using simulated annealing to solve routing and location problems. Naval Research Logistics Quarterly. 1986;33(2):261–279.

[24] Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press; 2007. pp. 549–555.

[25] Kirkpatrick S, Gelatt CD, Vecchi MP. Optimization by simulated annealing. Science. 1983;220:671–680.

130 Computational Optimization in Engineering - Paradigms and Applications


**Provisional chapter**

## **Application of Simulated Annealing and Adaptive Simulated Annealing in Search for Efficient Optimal Solutions of a Groundwater Contamination-related Problem**

Mahsa Amirabdollahian and Bithin Datta

Additional information is available at the end of the chapter

http://dx.doi.org/10.5772/66998

#### **Abstract**

Characterization of groundwater contamination sources is a complex inverse problem. This inverse problem becomes complicated due to the nonlinear nature of the groundwater flow and transport processes and the associated natural uncertainties. The mathematical challenges arise from the nonunique characteristics of this problem, resulting from the nonunique response of the aquifer system to a set of stresses and the possibility of instead locating only local optimal solutions. The linked simulation-optimization model is an efficient approach to identifying groundwater contamination source characteristics. Efficiency and accuracy of the search for optimum solutions of a linked simulation-optimization model depend on the utilized optimization algorithm. This limited study focuses on the application and efficiency of simulated annealing (SA) as the optimization algorithm for solving the source characterization problem. The advantages of using adaptive simulated annealing (ASA) as an alternative are then evaluated. The possibility of identifying a local optimal solution rather than a global optimal solution when using SA implies failure to solve the source characterization inverse problem. The cost of such inaccurate characterization may be enormous when a remediation strategy is based on the model inferences. ASA is shown to provide a reliable and acceptable alternative for solving this challenging aquifer contamination problem.

**Keywords:** groundwater contamination, adaptive simulated annealing, source characterization, simulation, optimization

© 2017 The Author(s). Licensee InTech. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


## **1. Introduction**

One of the efficient methodologies for identifying groundwater contamination source characteristics is the linked simulation-optimization model [1]. Efficiency and accuracy of the optimal solutions for this type of inverse models, which are often complex, nonlinear, and large scale, depend on the efficiency and accuracy of the optimization algorithm. Simulated annealing (SA) and adaptive simulated annealing (ASA) are two efficient evolutionary optimization algorithms which have recently been applied to solving such large-scale, nonlinear, and complex linked simulation-optimization models for optimal characterization of unknown contaminant sources in groundwater systems. This chapter discusses the application of these two optimization algorithms and their relative performances in identifying the characteristics of contamination sources in groundwater systems. Adaptive simulated annealing is a modified version of simulated annealing in which the optimization parameters are tuned automatically [2]. The advantages of using the ASA algorithm are demonstrated for an illustrative groundwater contamination-related problem, and the relative efficiency and accuracy of these two optimization algorithms, SA and ASA, are compared. The adaptive algorithm, ASA, is shown to be computationally more efficient and more suitable for searching for a globally optimal solution for a complex nonlinear optimization model representing complex flow and contaminant transport processes in a contaminated groundwater aquifer.

## **2. Background**

Effective groundwater pollution management and remediation depend on identifying the unknown pollution sources and reconstructing their release history [3, 4]. The optimal and accurate identification of the contaminant sources plays an important role in modeling of subsurface transport processes and in reducing the long-term contamination remediation cost. The source identification problem deals with the spatial and temporal variations of the location, activity duration, and the injection rate of the pollutant sources and is mostly inferred from the available sparse and sometimes erroneous concentration measurements at the site. Mainly, source identification includes a simulation problem, such as groundwater flow and pollutant transport models, used to estimate past phenomena or predict future scenarios.

A linked simulation-optimization-based methodology is often the viable and efficient approach for source identification in a regional-scale aquifer. The unknown contamination source identification in a contaminated aquifer is generally a very complex, ill-posed, and nonunique problem [5]. The nonuniqueness can be caused by sparsity of field measurement data or by the inefficiency of the optimization algorithm in reaching a global optimal solution. Designed monitoring networks [6–9] can reduce the nonuniqueness related to data availability. However, the nonuniqueness related to the search for a single global optimal solution to the inverse problem depends on the efficiency of the optimization algorithm. Another approach to source identification consists of solving the differential equations backwards in time (inverse problem). The random walk particle method [10, 11], the quasi-reversibility technique [12], the minimum relative entropy method [13], the Bayesian theory and geostatistical techniques [14], and genetic algorithm [15, 16] are some examples of this approach.

A simulation-optimization methodology couples the forward-time contaminant simulation model with optimization techniques. If an optimization problem is solvable and every minimization sequence converges to a unique answer, it is called stable [17]. This methodology avoids the stability problem associated with formally solving the inverse problem, but the iterative nature of simulation models usually requires increased computational effort. Many techniques have been proposed utilizing coupled simulation-optimization; a few representative ones are: response matrix [18, 19], embedded optimization [3], and linked simulation and optimization [3, 15, 20].


The two limitations of the response matrix approach are as follows: it is based on the premise that the superposition principle is approximately valid in terms of flow and contaminant transport in the aquifer, and the aquifer parameters must be known and the simulation model must be used to generate the response matrix prior to running the source identification model [3]. Mahar and Datta [3] showed that the embedding methods need large computer storage and computational time for large aquifers. Gorelick and Evans [18] concluded that numerical difficulties are likely to arise for large-scale problems using the embedding technique.

To conduct unknown pollutant source characterization in large-scale aquifers and real areas, the linked simulation-optimization methodology was proposed. In this methodology, the numerical models for simulation of the flow and transport processes are internally linked to the optimization algorithm. Chadalavada and Datta [1] and Amirabdollahian and Datta [21] presented an overview of the pollution source identification simulation-optimization approaches.

The linked simulation-optimization model is an efficient and effective technique to characterize the contaminant sources by internal linkage between flow and contaminant transport simulation models and the selected optimization technique. This methodology can solve contaminant source problems in fairly large study areas.

Evolutionary optimization algorithms have made it possible to solve complex linked simulation-optimization models, which are difficult to solve, or for which it is difficult to even obtain feasible solutions, when utilizing classical optimization tools. Moreover, there is less limitation in the mathematical definition of the objective function and constraints compared to earlier optimization algorithms such as linear programming [22]. Finally, evolutionary algorithms can optimally identify a relatively large number of decision variables [23], and utilization of the evolutionary optimization algorithms simplifies the linking process. Examples of the evolutionary optimization algorithms include: genetic algorithm (GA) [24], tabu search (TS) [25], simulated annealing (SA), adaptive simulated annealing (ASA) [26], and differential evolution algorithm [27]. Yeh [28] and Datta and Kourakos [22] presented an overview of various optimization methods coupled with simulation techniques utilized for groundwater quantity management and quality management, respectively.

In a linked simulation-optimization approach, the optimization algorithm is used as an efficient search tool, and the accuracy and efficiency of the methodology depend on the selected optimization algorithm. In groundwater contaminant source characterization problems, even when there are no errors or uncertainties associated with the inputs and parameter estimates for the physical process simulation model, there may not be a unique solution to the inverse problem due to the nonunique physical response of the system. The ill-posed nature of the inverse problem is a different issue and is predominantly due to sparsity of measurement data, which can be addressed by designing and implementing a suitable concentration monitoring network [29]. The ill-posed nature of the inverse problem and the plausibility of nonunique solutions can be interrelated as well. More efficient monitoring networks can reduce the plausibility of nonunique solutions, and therefore, optimal monitoring network design is a related issue [9, 22]. Also, only if the global optimum solution is found may it represent the accurate solution to the source identification problem. Nonuniqueness in the system response may introduce alternate optimal solutions, each of them globally optimal [5, 20, 21]. This is due to the fact that different sets of stresses (i.e., contaminant sources) can result in identical responses (resulting spatial and temporal concentrations). Therefore, nonuniqueness is inherent, independent of errors in parameter estimates or measurements. In addition, if the optimal solution found is not the global solution, it itself introduces nonuniqueness, due to local optimal solutions being wrongly identified as the optimal solution of the inverse problem. In the optimal contaminant source identification process, the global optimum generally represents the actual contaminant source characteristics. Failure to identify the global optimum among the plausible local optimal solutions introduces nonuniqueness in the solution space. Therefore, the efficiency of the optimization algorithm in reaching a global optimum solution, or a near-optimal solution, is crucial to accurate source identification. Algorithms like ASA can be extremely useful, especially when compared to SA, which needs tuning of optimization parameters, hence rendering the search for a global optimum somewhat subjective or more uncertain.

In this chapter, two evolutionary optimization algorithms are described: SA and ASA. Simulated annealing (SA) approaches the optimization model like a bouncing ball that bounces over mountains from valley to valley. The SA controlling parameter is the temperature (T), which mimics the effect of fast-moving particles in a hot object such as molten metal; as T decreases and reaches relatively colder states, the height of the ball's bounce decreases and it settles gradually in the deepest valley. To reach the optimal solution, there are many parameters that need to be tuned, such as the probability density function, the acceptance probability density function, and the re-annealing temperature schedule. ASA is a variant of SA in which the automated re-annealing temperature schedule and random step selection make the algorithm less sensitive to the user-defined parameters. One of the issues in selecting a suitable and efficient optimization algorithm for the solution of an optimization model is the likelihood of reaching a global optimum solution. It has been shown that SA is relatively more efficient in reaching a better optimal solution compared to GA [2]. The added advantage of using ASA is the elimination of the requirement to choose all the relevant optimization parameters appropriately, a process very much dependent on the structure and nature of the optimization model to be solved. ASA also eliminates the need for several trial executions of the model to adjust the parameters [23]. Therefore, the possibility of reaching a global optimal solution faster is also enhanced by utilizing ASA. Very fast simulated re-annealing (VFSR), developed in 1987, is the first version of ASA [30]. Ingber and Rosen [31] showed that VFSR is about one order of magnitude faster than GA in convergence speed and is more likely to find the global optimum.
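The bouncing-ball behaviour described above can be sketched with a minimal SA implementation for a one-dimensional minimization. The step size, initial temperature, and geometric cooling constant below are illustrative choices, not the chapter's tuned parameters; the closing comment indicates how ASA's schedule differs.

```python
import math
import random

def simulated_annealing(f, x0, step=0.5, t0=10.0, cooling=0.95,
                        n_iter=2000, seed=0):
    """Minimize f with Metropolis acceptance and geometric cooling."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(n_iter):
        cand = x + rng.uniform(-step, step)  # random perturbation
        fcand = f(cand)
        delta = fcand - fx
        # Downhill moves are always accepted; uphill moves are
        # accepted with probability exp(-delta / T), which shrinks
        # as the temperature drops (the ball bounces lower).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = cand, fcand
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling  # geometric cooling schedule
    return best, fbest

# ASA instead anneals each of the D decision variables with its own
# schedule of the form T_i(k) = T_i(0) * exp(-c_i * k**(1 / D)) and
# re-anneals as parameter sensitivities are re-estimated.

xbest, fbest = simulated_annealing(lambda x: (x - 3.0) ** 2, x0=-5.0)
```

With ASA, the user-chosen `step`, `t0`, and `cooling` above disappear in favour of the adaptive schedule, which is the practical advantage the chapter emphasises.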

This study investigates the applicability of ASA to the unknown groundwater contaminant source release history reconstruction problem and compares its performance to SA-based solutions. The performance evaluation of the linked simulation-optimization approach is based on a realistic scenario where contaminant concentration measurements are available only a few years after the sources have ceased to exist. Apart from convergence speeds, the two algorithms are compared for their performance in recovering accurate source release histories in terms of source location, magnitude, and duration of activity.

## **3. Methodology**


The pollutant source characteristics which are required to be addressed in an identification procedure are: (1) source release history (time); (2) source locations; and (3) source flux magnitudes. The linked simulation-optimization model consists of an optimization algorithm which specifies the candidate source characteristics. A simulation model which is linked to the optimization algorithm uses the candidate characteristics to simulate the contaminant concentration at monitoring locations at various time intervals. The optimization algorithm is used to minimize the objective function representing the differences between measured concentrations and simulated ones. This process evolves until the algorithm reaches the optimal solution or a specified stopping criterion. This methodology for identification of unknown groundwater contamination sources has two major components: numerical groundwater flow and transport simulation models and the linked simulation-optimization model.
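The loop just described can be sketched in miniature. Everything below is illustrative: the linear `mock_transport_simulator` stands in for the numerical flow and transport models (MODFLOW/MT3DMS in this chapter), a single source flux is the only decision variable, and plain random search stands in for SA/ASA.

```python
import random

def mock_transport_simulator(flux):
    """Stand-in for the linked flow/transport simulation models.
    Maps one candidate source flux to concentrations at two
    monitoring wells over three observation periods; the linear
    response coefficients are invented purely for illustration."""
    response = [[0.8, 0.5], [1.2, 0.7], [0.9, 0.6]]  # [period][well]
    return [[flux * r for r in row] for row in response]

def objective(cest, cobs, alpha=1.0):
    # Normalized squared misfit, in the spirit of Eq. (3).
    return sum((e - o) ** 2 / (o + alpha) ** 2
               for row_e, row_o in zip(cest, cobs)
               for e, o in zip(row_e, row_o))

# "Observed" concentrations generated from a known true flux of 40.
cobs = mock_transport_simulator(40.0)

# The linked loop: the optimizer proposes candidate source
# characteristics, the simulator predicts concentrations, and the
# misfit drives the search until a stopping criterion is met.
rng = random.Random(1)
best_flux, best_f = None, float("inf")
for _ in range(500):
    q = rng.uniform(0.0, 100.0)   # flux bounds as in Eq. (5)
    f = objective(mock_transport_simulator(q), cobs)
    if f < best_f:
        best_flux, best_f = q, f
```

The recovered `best_flux` approaches the true flux because, in this toy setting, the misfit is a smooth function of the single decision variable; the chapter's real problem has many fluxes and a far rougher response surface, which is why SA/ASA is used instead of random search.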

#### **3.1. Groundwater flow and transport simulation models**

The groundwater flow simulation model used in this study is MODFLOW-2000 [32]. MODFLOW is a computer program that numerically solves the three-dimensional groundwater flow equation for a porous medium using a finite-difference method. The partial differential equation for transient groundwater flow utilized in MODFLOW is given by the following equation [33]:

$$\frac{\partial}{\partial x}\left(K_{xx}\frac{\partial h}{\partial x}\right) + \frac{\partial}{\partial y}\left(K_{yy}\frac{\partial h}{\partial y}\right) + \frac{\partial}{\partial z}\left(K_{zz}\frac{\partial h}{\partial z}\right) + W = S_s\frac{\partial h}{\partial t} \tag{1}$$

where K<sub>xx</sub>, K<sub>yy</sub>, and K<sub>zz</sub> are the hydraulic conductivities (L/T) along the x, y, and z coordinate axes, which are assumed to be parallel to the principal axes of hydraulic conductivity. h, S<sub>s</sub>, and t are the potentiometric head (L), the specific storage of the porous material (L<sup>−1</sup>), and time (T), respectively. W is the volumetric flux per unit volume representing sources and/or sinks of water (T<sup>−1</sup>), where W < 0 for flow moving out of the groundwater system and W > 0 for flow moving in. When combined with boundary and initial conditions, Eq. (1) describes transient three-dimensional groundwater flow in a heterogeneous and anisotropic medium, assuming that the principal axes of hydraulic conductivity are aligned with the coordinate directions.
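As a toy analogue of what the flow simulator does, a one-dimensional, homogeneous special case of Eq. (1) (constant K, W = 0) can be stepped with an explicit finite difference. MODFLOW itself solves the full three-dimensional equation with an implicit scheme, so this is only a sketch; all parameter values are invented.

```python
# Explicit finite-difference sketch of a 1-D special case of Eq. (1):
# Ss * dh/dt = K * d2h/dx2, with fixed heads at both ends.
K = 1e-4    # hydraulic conductivity (illustrative)
Ss = 1e-3   # specific storage (illustrative)
dx, dt = 1.0, 1.0
n = 11
h = [5.0] * n            # initial head everywhere
h[0], h[-1] = 10.0, 0.0  # fixed-head (Dirichlet) boundaries

r = K * dt / (Ss * dx * dx)  # stability requires r <= 0.5 (here 0.1)
for _ in range(5000):
    # Jacobi-style update: the comprehension reads only the old h.
    h = [h[0]] + [h[i] + r * (h[i + 1] - 2 * h[i] + h[i - 1])
                  for i in range(1, n - 1)] + [h[-1]]
# At steady state the head profile is linear between the boundaries.
```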

The contaminant transport simulation model used in this study is MT3DMS [34]. The partial differential equation describing three-dimensional transport of contaminants in groundwater can be written as follows [35]:

$$\frac{\partial C}{\partial t} = \frac{\partial}{\partial x_j}\left(D_{j,k}\frac{\partial C}{\partial x_k}\right) - \frac{\partial}{\partial x_j}\left(u_j C\right) + \sum_{i=1}^{N}\frac{q_i}{\theta} + \sum_{p=1}^{NR} R_p \tag{2}$$

where C is the solute concentration in groundwater (ML<sup>−3</sup>); t is the time (T); x<sub>j</sub> and x<sub>k</sub> are the Cartesian coordinates along the axes; u<sub>j</sub> is the groundwater velocity (LT<sup>−1</sup>); D<sub>j,k</sub> is the dispersion coefficient tensor (L<sup>2</sup>T<sup>−1</sup>); q<sub>i</sub> is the flux of contaminant concentration for source i (MT<sup>−1</sup>); and R<sub>p</sub>, p = 1, …, NR, are chemical reaction terms (ML<sup>−3</sup>T<sup>−1</sup>).

The MT3DMS transport model uses a mixed Eulerian-Lagrangian approach to the solution of the three-dimensional advective-dispersive-reactive equation [34]. The groundwater velocity values (u<sub>j</sub>), estimated by the flow model, are used by the transport model to estimate concentration values. The estimated concentrations (C) are transferred to the optimization model (in the linked simulation-optimization approach) to evaluate the objective function value.
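In the same spirit, a one-dimensional explicit sketch of the advective-dispersive part of Eq. (2) (single species, no sources or reactions, first-order upwind advection) shows a contaminant slug both moving downstream and spreading. MT3DMS uses far more sophisticated mixed Eulerian-Lagrangian schemes for the full 3-D equation; the numbers below are illustrative only.

```python
# 1-D explicit sketch of dC/dt = D * d2C/dx2 - u * dC/dx.
D = 0.1   # dispersion coefficient (illustrative)
u = 0.5   # groundwater velocity, positive x direction (illustrative)
dx, dt = 1.0, 0.5
n = 60
c = [0.0] * n
c[5] = 100.0  # instantaneous contaminant slug at cell 5

for _ in range(40):  # simulate to t = 20
    new = c[:]
    for i in range(1, n - 1):
        disp = D * (c[i + 1] - 2 * c[i] + c[i - 1]) / dx ** 2
        adv = -u * (c[i] - c[i - 1]) / dx  # first-order upwind (u > 0)
        new[i] = c[i] + dt * (disp + adv)
    c = new
# The plume centre has advected roughly u * t = 10 cells downstream,
# while dispersion (physical plus numerical) widens the pulse.
```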

#### **3.2. Linked simulation-optimization method**

For completeness, a brief description of the formulation for the linked simulation-optimization source identification framework is presented. More details can be found in Mahar and Datta [3]. Based on the available background information about the site, the set of potential contaminant source locations is assumed to be known. The optimization model estimates the optimal contaminant fluxes associated with each potential source location at each stress period. The objective function minimizes the weighted sum of normalized differences between temporally and spatially observed and simulated concentrations, subject to upper and lower bounds on source fluxes. The optimal source identification model is defined by objective function (3), subject to constraints (4) and (5).

$$\text{Min } F = \sum_{k=1}^{nk} \sum_{iob=1}^{nob} \left( Cest_{iob}^{k} - Cobs_{iob}^{k} \right)^{2} / \left( Cobs_{iob}^{k} + \alpha \right)^{2} \tag{3}$$

Subject to:

$$Cest = f(D, HC, \theta, x_i, y_i, z_i, q_i) \quad i = 1, \ldots, N \tag{4}$$

$$0 \le q_i \le q_{\max} \quad i = 1, \ldots, N \tag{5}$$

where nk, nob, and N are the total number of concentration observation time periods, available monitoring locations, and candidate source locations, respectively. *Cest*<sub>iob</sub><sup>k</sup> and *Cobs*<sub>iob</sub><sup>k</sup> are the concentration estimated by the simulation model and the observed concentration at observation location iob at the end of time period k, respectively. D, HC, and θ are the dispersion coefficient, hydraulic conductivity, and porosity, respectively. x<sub>i</sub>, y<sub>i</sub>, z<sub>i</sub>, and q<sub>i</sub> are the Cartesian coordinates of candidate contaminant source i and the contaminant release flux for candidate location i, respectively. *q*<sub>max</sub> is the upper bound for contaminant release fluxes.

Constraint set (4) represents the flow and contaminant transport simulation models, and it couples the simulation model and the optimization algorithm. Eq. (5) limits the candidate contaminant flux values at each potential location to an upper bound. In Eq. (3), α is a constant that is sufficiently large to prevent any individual term in Eq. (3) from becoming indeterminate when the observed concentration becomes very small. Adding this parameter also prevents the obtained solution from being dominated by deviations between measured and simulated concentrations corresponding to low concentration measurement values [3]. **Figure 1** shows a schematic representation of the linked simulation-optimization source identification process using evolutionary optimization algorithms.
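Eqs. (3) and (5) translate directly into code. The function names and the tiny numerical example below are illustrative only; the list dimensions and concentration values do not come from the chapter's case study.

```python
def objective_F(cest, cobs, alpha=1.0):
    """Eq. (3): sum of normalized squared differences between
    estimated and observed concentrations over all time periods
    (outer lists) and monitoring locations (inner lists). alpha
    keeps each term well defined when an observed concentration
    is near zero."""
    return sum((e - o) ** 2 / (o + alpha) ** 2
               for est_k, obs_k in zip(cest, cobs)
               for e, o in zip(est_k, obs_k))

def clip_fluxes(q, q_max):
    """Eq. (5): keep every candidate source flux within [0, q_max]."""
    return [min(max(qi, 0.0), q_max) for qi in q]

# One time period, two monitoring locations, alpha = 1:
# F = (12-10)^2/(10+1)^2 + (1-2)^2/(2+1)^2 = 4/121 + 1/9.
F = objective_F([[12.0, 1.0]], [[10.0, 2.0]], alpha=1.0)
fluxes = clip_fluxes([-5.0, 30.0, 120.0], q_max=100.0)
```

In the linked framework, `objective_F` is what the optimizer (SA or ASA) minimizes, with `cest` produced by the simulation models of constraint set (4) for each candidate flux vector.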


(*Ces t iob k*

able monitoring locations, and candidate source locations, respectively. *Ces t*

location i, respectively. *q*max is the upper bound for contaminant release fluxes.

 − *Cob siob k*  )<sup>2</sup>

> , *yi* , *zi* , *qi*

where nk, nob, and N are the total number of concentration observation time periods, avail‐

concentration estimated by the simulation model and observed concentration at observation location iob and at the end of time period k, respectively. D, HC, and θ are the dispersion

coordinates of candidate contaminant source i and the contaminant release flux for candidate

The constraint set 4 represents the flow and contaminant transport simulation models, and it couples the simulation model and optimization algorithm. Eq. (5) limits the candidate con‐ taminant flux values, at each potential location, to an upper bound. In Eq. (3), α is a constant,

   / (*Cob siob k*

*Min F* = ∑

∂ *t* = \_\_\_<sup>∂</sup> ∂ *xj* (*Dj*,*<sup>k</sup>* \_∂ *C* ∂ *xk* ) − \_\_\_∂ ∂ *xj*

138 Computational Optimization in Engineering - Paradigms and Applications

\_\_\_ <sup>∂</sup>*<sup>C</sup>*

ficient tensor (L<sup>2</sup>

values (u<sup>j</sup>

Subject to:

Cartesian coordinates along axes; u<sup>j</sup>

chemical reaction terms (ML‐3T‐1).

T‐1); q<sup>i</sup>

**Figure 1.** Schematic diagram of linked simulation‐optimization contaminant source identification methodology.
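The objective of Eq. (3) and the bound constraints of Eq. (5) can be sketched as follows. This is a minimal illustration only: the simulation model *f* of Eq. (4) is not included, since in practice each evaluation of Eq. (3) requires one run of the flow and transport models (here, any forward model would be substituted).

```python
import numpy as np

def objective_f(c_est, c_obs, alpha=1.0):
    """Weighted sum of normalized squared deviations, Eq. (3).

    c_est, c_obs: arrays of shape (nk, nob) holding simulated and observed
    concentrations per time period k and monitoring location iob.
    alpha: stabilizing constant that keeps terms finite when the observed
    concentration approaches zero.
    """
    c_est = np.asarray(c_est, dtype=float)
    c_obs = np.asarray(c_obs, dtype=float)
    return float(np.sum((c_est - c_obs) ** 2 / (c_obs + alpha) ** 2))

def clip_fluxes(q, q_max):
    """Enforce the bound constraints of Eq. (5): 0 <= q_i <= q_max."""
    return np.clip(q, 0.0, q_max)
```

With a single observation, `objective_f([[2.0]], [[1.0]], alpha=1.0)` evaluates to `(2 − 1)² / (1 + 1)² = 0.25`, and a perfect match gives zero, which is the behavior the source identification search exploits.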

## **4. Performance evaluation**

A three-dimensional transient illustrative groundwater contamination problem is utilized to compare the efficiency and accuracy of the SA and ASA optimization algorithms in the context of this research. First, the illustrative groundwater system is defined. Arbitrary monitoring locations are selected, and flow and transport simulation models are used to estimate contaminant concentrations at the monitoring locations and times. Specifically for the performance evaluation, these simulated concentration values are used as observed concentrations in the linked simulation-optimization source identification model. Using this performance evaluation procedure with synthetic data for an illustrative example facilitates the comparison between the SA and ASA optimization algorithms, without the need to consider the reliability of model properties, measurement accuracies, and parameter estimation errors.

#### **4.1. Illustrative groundwater contamination problem**

The performance of the proposed methodology is evaluated for a three-dimensional illustrative groundwater aquifer study area. **Figure 2** shows the plan view of the three-dimensional study area, measuring 1500 m × 1000 m × 36 m and consisting of two unconfined layers. The top, bottom, and left side boundaries have a specified head, and the right-hand side boundary has variable head boundary conditions. The triangular signs show the location of active extraction wells (sinks). The candidate contaminant source locations are shown by square signs. Two of them are actual sources and one is a dummy. The dummy (not actual) source is introduced to evaluate the accuracy of the proposed methodology in correctly identifying actual sources. Twelve concentration monitoring locations are specified. It is assumed that the contaminant source fluxes are the same in every stress period. The study period is divided into five stress periods. **Table 1** shows the lengths of the stress periods and the properties of the extraction wells and contaminant sources. The field hydrogeological parameters are given in **Table 2**.

**Figure 2.** Plan view of the study area with location of contaminant sources.


**Table 1.** Characteristics of the contamination sources and extraction wells.



**Table 2.** Hydrogeologic parameters for the study area.

#### **4.2. Contaminant source identification process**

For the purpose of identifying contaminant source characteristics, all sources are considered to be active during all five stress periods, and the pollutant flux from each of the sources is assumed to be constant over a specified stress period. In order to evaluate the model perfor‐ mance, one dummy (not actual) source is also introduced as a potential source. Therefore, the source identification decision variables are the contaminant fluxes at three potential source locations for each stress period, and in total, there are 15 decision variables to be identified.

Since the performance is evaluated using an illustrative problem, the flow and transport simulation models are utilized to find the contaminant concentrations at the monitoring locations using the actual source fluxes. These values are used in the linked simulation-optimization process as observed concentrations. The initial source fluxes are set to 0 for all sources and stress periods. In real-life contaminant source identification problems, observed concentrations collected in the field would be used to find the optimal source characteristics.
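The synthetic setup described above (three potential sources, five stress periods, 15 flux decision variables initialized to zero) can be sketched as follows. The flux magnitudes used here are hypothetical placeholders for illustration, not the values from Table 1:

```python
import numpy as np

n_sources, n_periods = 3, 5     # three potential sources, five stress periods

# "Actual" release fluxes used to generate synthetic observations.
# The magnitudes (40 and 25 g/s) are hypothetical, not Table 1 values;
# row 1 is the dummy source and releases nothing in every stress period.
q_actual = np.zeros((n_sources, n_periods))
q_actual[0, :] = 40.0           # source 1 (actual)
q_actual[2, :] = 25.0           # source 3 (actual)

# Decision vector handed to the optimizer: one flux per source per
# stress period, flattened, initialized to zero as in the text.
x0 = np.zeros(n_sources * n_periods)
print(x0.size)                  # 15 decision variables
```

A zero or near-zero recovered flux for the dummy row is what "correct identification of the dummy source" means in the evaluation that follows.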

### **5. Results and discussion**

The applicability of these two algorithms is compared in terms of efficiency and accuracy. The run time and number of generations are utilized to compare the efficiency of the algorithms. Moreover, the estimated source characteristics are compared with the actual properties in order to compare algorithms in terms of accuracy. To examine the capability of both models in terms of accuracy in estimating source fluxes, the normalized absolute error of estimation (NAEE%) is estimated using Eq. (6) [2]:

$$\text{NAEE}\,(\%) = \frac{\sum_{i=1}^{N} \left| q_{est}^{i} - q_{act}^{i} \right|}{\sum_{i=1}^{N} q_{act}^{i}} \times 100 \tag{6}$$

where N is the number of stress periods, and *q<sub>est</sub><sup>i</sup>* and *q<sub>act</sub><sup>i</sup>* are the estimated and actual source fluxes for stress period i, respectively.
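Eq. (6) translates directly into code:

```python
import numpy as np

def naee_percent(q_est, q_act):
    """Normalized absolute error of estimation, Eq. (6), in percent.

    q_est, q_act: estimated and actual source fluxes per stress period.
    """
    q_est = np.asarray(q_est, dtype=float)
    q_act = np.asarray(q_act, dtype=float)
    return float(np.sum(np.abs(q_est - q_act)) / np.sum(q_act) * 100.0)
```

For example, estimates of [1, 1] against actual fluxes of [2, 2] give NAEE = (1 + 1) / (2 + 2) × 100 = 50%.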

**Table 3** presents the SA and ASA optimization parameters. Every iteration of the SA- and ASA-based methods uses one run of the groundwater transport simulation model (MT3DMS). Irrespective of the optimization algorithm, the execution time for one transport simulation run depends on the computation platform. In order to compare the methods independently of the utilized computation platform, both methods are compared based on the number of simulation runs, which is directly proportional to the computational time.

| Parameter | Unit | Value |
|---|---|---|
| Error tolerance for termination | – | 0.01 |
| Objective function multiplier | – | 100 |
| Lower bound for source fluxes | g/s | 0 |
| Upper bound for source fluxes | g/s | 100 |
| Initial source fluxes | g/s | 0 |

**Table 3.** Optimization model parameters.
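Comparing methods by number of simulation runs rather than wall time can be implemented with a thin counting wrapper around the forward model. This is a generic sketch; `model` here is any callable standing in for one transport simulation run, not the actual MT3DMS interface:

```python
class CountingSimulator:
    """Wraps a forward model and counts calls, so optimization methods can
    be compared by number of simulation runs instead of execution time."""

    def __init__(self, model):
        self.model = model
        self.runs = 0

    def __call__(self, *args, **kwargs):
        self.runs += 1                    # one call = one simulation run
        return self.model(*args, **kwargs)
```

Both optimizers would then be handed the same wrapped model, and `runs` gives a platform-independent cost measure.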

In real-life groundwater contaminant source identification problems, the source characteristics are unknown. Therefore, there is no information available to measure the accuracy of linked simulation-optimization methods. The accuracy and efficiency of SA depend to a large extent on the selected SA optimization parameters; however, with unknown source characteristics, sensitivity analysis and tuning of the SA parameters are very difficult or impossible. To compare the SA- and ASA-based methods, SA with an initial temperature (T) of 1.0E8 and a temperature reduction factor (RT) of 0.5 is selected. **Figures 3** and **4** compare the estimated and actual fluxes for sources 1 and 3, respectively. The NAEE% of the estimated fluxes using the ASA and SA (T = 1.0E8, RT = 0.5) models is 22.5 and 75%, respectively. Both methods correctly identified the dummy source (not an actual source, but introduced as a potential source for evaluation purposes). As shown in **Figures 3** and **4**, this particular set of SA parameters represents one particular SA solution, highlighting the comparative inefficiency of SA.

**Figure 3.** Source 1 release fluxes.

**Figure 4.** Source 3 release fluxes.

**Figure 5** shows the convergence profiles for both the SA- (T = 1.0E8, RT = 0.5) and ASA-based models. The minimum objective function value achieved is plotted against the number of transport simulation model runs. **Figure 5** shows that ASA converges faster to smaller objective function values (in the minimization problem) than the SA model. Although the SA-based model converges to very small objective function values, the corresponding NAEE% (75%) is large. This shows that the SA-based solutions seem to get trapped in a local optimum and did not find, or get close to, the global optimum. This may also be due to the nonunique nature of the local optimal solutions; that is, the obtained solution matches the observed and simulated concentrations for a different set of sources that does not represent the actual sources. The objective function value is very small even though the accuracy of the source estimates is very poor. **Figure 5** shows that the objective function improvement rate decreases after 10,000 simulation runs. Next, a sensitivity analysis is conducted to find suitable SA parameters, with 10,000 simulation runs selected as the maximum number of simulation runs.

**Figure 5.** SA- (T = 1.0E8, RT = 0.5) and ASA-based model convergence profiles.

For performance evaluation purposes, a sensitivity analysis is performed to find suitable SA parameters. This so-called artificial sensitivity analysis is not possible in real-life contamination problems. In this chapter, an illustrative example study area is selected with synthetic aquifer data; therefore, the SA parameters are tuned by comparing the estimated and actual release fluxes. This step is not possible in real-life scenarios, where the source release fluxes are unknown. These evaluation results only show that SA can deliver solutions comparable to the ASA solutions if the SA parameters could be optimally tuned based on sequential matching of desired and obtained solutions, an impractical scenario.

Two parameters, the initial temperature (T) and the temperature reduction factor (RT), are varied in order to find the sensitivity of the SA models to the optimization parameters. Since the objective of this chapter is to compare the performance of the SA- and ASA-based models, an initial execution of ASA is used to find a desirable number of simulation runs. Based on this initial model execution, 10,000 simulation runs are selected. **Figure 6** shows the resulting NAEE% using different SA parameters with the same maximum number of simulation runs (10,000). The lowest NAEE% is associated with an initial temperature (T) of 1000 and a temperature reduction factor (RT) of 0.3.
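The roles of T and RT can be illustrated with a minimal, generic SA minimizer with geometric cooling. This is a sketch, not the chapter's SA implementation; all function and parameter names are illustrative:

```python
import math
import random

def simulated_annealing(f, x0, step, t0, rt, n_temp=50, moves_per_temp=20, seed=0):
    """Minimal SA minimizer with geometric cooling T <- RT * T.

    f: objective to minimize; x0: initial decision variables (list);
    step: max perturbation per move; t0: initial temperature (T);
    rt: temperature reduction factor (RT), 0 < rt < 1.
    """
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    best_x, best_f = list(x), fx
    t = t0
    for _ in range(n_temp):
        for _ in range(moves_per_temp):
            cand = [xi + rng.uniform(-step, step) for xi in x]
            fc = f(cand)
            # Metropolis criterion: always accept improvements; accept
            # worse moves with probability exp(-(fc - fx) / t).
            if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = list(x), fx
        t *= rt  # geometric cooling schedule controlled by RT
    return best_x, best_f
```

A larger `t0` and an `rt` closer to 1 give a slower cooling schedule and more exploration; a small `t0` or small `rt` cools quickly and risks trapping in local optima, which is consistent with the parameter sensitivity discussed here.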

In order to compare the accuracy and convergence of the tuned SA- and ASA-based models, both models are executed with unconstrained run time. The error tolerance of estimation is set to 0.01 for both methods. The NAEE% of the estimated fluxes using the ASA and tuned SA (T = 1.0E3, RT = 0.3) models is 22.5 and 16.5%, respectively.

It can be inferred from the results that both methods are able to correctly identify the dummy source; a zero or near-zero estimate for the dummy source implies correct identification. Solution results obtained using the tuned SA parameters recovered the source 1 release fluxes more accurately than the other SA-based solutions, while solutions obtained using the ASA algorithm recovered the source 3 release fluxes more accurately. Overall, both methods achieved relatively similar accuracy in recovering the source release fluxes in terms of location and magnitude. The tuned SA solution results are slightly superior to the ASA results in terms of accuracy, although the errors are similar in magnitude. As shown in **Figure 6**, all other sets of SA parameters (the ones tested in the sensitivity analysis) resulted in a higher NAEE% than the ASA method (22.5%). This shows that the performance of the SA-based method is highly reliant on the selection of its parameters, which limits its applicability in real-life scenarios.

**Figure 6.** SA parameters sensitivity analysis.

It can be argued that the poor performance of the various SA-based models reported in **Figure 6** may have resulted from the limited number of simulation runs. The purpose of this chapter is to compare the ASA- and SA-based models based on both accuracy and convergence speed; therefore, improving accuracy at the cost of relatively larger execution times is beyond its scope. However, to give the reader an insight, an SA model with T = 1.0E10 and RT = 0.9 is tested. The NAEE% associated with the estimated fluxes is 0.65%, which was achieved after 103,876 simulation runs. **Figure 7** shows the T values and the corresponding minimum objective function values.

Therefore, longer execution times can improve the results of a non-tuned SA-based model. **Figure 7** shows that faster convergence happens when T reaches cooler states. This suggests that very high T values do not have a substantial positive effect on finding the optimal values. More rigorous studies are required to reach a definitive conclusion about SA parameter optimization without any time (computational cost) constraints.

**Figure 7.** SA‐based model convergence profile, T = 1.0E10, RT = 0.9.

## **6. Conclusion**


Characterization of unknown sources of groundwater pollution, especially when field concentration measurement data are sparse and arbitrary, remains a challenging problem. A linked simulation-optimization method is utilized to solve the inverse problem of unknown groundwater contamination source characterization. The performance of two evolutionary optimization algorithms, SA and ASA, is evaluated in terms of accuracy and convergence speed, applied to an illustrative contaminated aquifer study area. The evaluation results show that suitable SA parameters need to be selected based on the nature of the problem to be optimized. The performance evaluation shows that the ASA-based method estimates the source release fluxes more accurately, and converges to a smaller objective function value (in a minimization problem) faster, than the non-tuned SA-based method.

In this limited study, an illustrative study area was selected, with synthetic hydrogeology and contamination data specified for the evaluation of the solution results. Therefore, solely for comparison and performance evaluation purposes, the SA parameters were tuned by comparing the estimated and actual release fluxes. This practice is not possible in real-life scenarios, where the source release fluxes are unknown: without synthetic data and simulation results to represent field concentration measurements, such tuning with prior knowledge is impossible. Therefore, in real-life scenarios, SA performance cannot be controlled by tuned SA parameters, which limits the application of SA-based models. The non-tuned SA model converged to poor results even with unconstrained computational time. The results demonstrated that non-tuned SA might not converge to near-optimum results, even with a large number of iterations, and may be trapped in the vicinity of local optimal solutions. This is due to the nonunique nature of groundwater contamination source characterization problems. Since tuning SA parameters in real-life scenarios is hard or impossible, utilizing SA may lead to wrong source flux estimates. Wrong estimates of the source fluxes are not verifiable in real life and may lead to wrong decisions about the management and remediation of the contaminated area.

The solution results obtained by the SA-based model with tuned parameters and those obtained by the ASA-based model show relatively similar performance concerning both accuracy and convergence speed. However, the need to tune the SA parameters substantially limits its application in groundwater source identification problems, and the trial-and-error tuning process increases the total computational cost of the linked simulation-optimization process. Therefore, the application of ASA in linked simulation-optimization-based groundwater source identification models results in substantial savings in computational time and potentially in more accurate results. This inference is critical for designing effective and efficient contaminated aquifer remediation strategies, which are often very costly; the cost of failed remediation strategies caused by inaccurate characterization of unknown contamination sources can be enormous. ASA is shown to provide a reliable and acceptable alternative for solving this challenging aquifer contamination problem.

## **Author details**

Mahsa Amirabdollahian<sup>1</sup>\* and Bithin Datta<sup>2</sup>

\*Address all correspondence to: mahsa.amirabdollahian@my.jcu.edu.au

1 James Cook University, Townsville, QLD, Australia

2 James Cook University and CRC-CARE, Callaghan, NSW, Australia

## **References**

[1] Chadalavada S, Datta B, Naidu R. Optimisation approach for pollution source identification in groundwater: an overview. International Journal of Environment and Waste Management, 2011; **8**(1–2): 40–61. doi:10.1504/IJEWM.2011.040964

[2] Jha M, Datta B. Three-dimensional groundwater contamination source identification using adaptive simulated annealing. Journal of Hydrologic Engineering, 2013; **18**(3): 307–317. doi:10.1061/(ASCE)HE.1943-5584.0000624

[3] Mahar P S, Datta B. Optimal identification of ground-water pollution sources and parameter estimation. Water Resources Planning and Management-ASCE, 2001; **127**(1): 20–29.

[4] Datta B, et al. Efficient identification of unknown groundwater pollution sources using linked simulation-optimization incorporating monitoring location impact factor and frequency factor. Water Resources Management, 2013; **27**(14): 4959–4976. doi:10.1007/s11269-013-0451-8

[5] Datta B. Discussion of "Identification of Contaminant Source Location and Release History in Aquifers" by Mustafa M. Aral, Jiabao Guan, and Morris L. Maslia. Journal of Hydrologic Engineering, 2002; **7**(5): 399–400. doi:10.1061/(ASCE)1084-0699(2002)7:5(399)

[6] Dhar A, Datta B. Multiobjective design of dynamic monitoring networks for detection of groundwater pollution. Water Resources Planning and Management, 2007; **133**(4): 329–338. doi:10.1061/(ASCE)0733-9496(2007)

[7] Jha M, Datta B. Application of dedicated monitoring-network design for unknown pollutant-source identification based on dynamic time warping. Journal of Water Resources Planning and Management, 2015; **141**(11): 4015022. doi:10.1061/(ASCE)WR.1943-5452.0000513

[8] Jha M, Datta B. Application of unknown groundwater pollution source release history estimation methodology to distributed sources incorporating surface-groundwater interactions. Environmental Forensics, 2015; **16**(2): 143. doi:10.1080/15275922.2015.1023385

[9] Prakash O, Datta B. Optimal characterization of pollutant sources in contaminated aquifers by integrating sequential-monitoring-network design and source identification: methodology and an application in Australia. Hydrogeology Journal, 2015; **23**: 1089–1107. doi:10.1007/s10040-015-1292-8

[10] Bagtzoglou A C, Tompson A F B, Dougherty D E. Probabilistic simulation for reliable solute source identification in heterogeneous porous media. Water Resources Engineering Risk Assessment, 1991; **29**(NATO ASI Series): 189–201. doi:10.1007/978-3-642-76971-9_12

[23] Ingber L. Adaptive simulated annealing (ASA): lessons learned. Control and Cybernetics, 1996; **25**(1): 33–54.

[24] Singh R M, Datta B. Identification of groundwater pollution sources using GA-based linked simulation optimization model. Journal of Hydrologic Engineering, 2006; **11**(2): 101–109. doi:10.1061/(ASCE)1084-0699(2006)11:2(101)

[25] Yeh H-D, Chang T-H, Lin Y-C. Groundwater contaminant source identification by a hybrid heuristic approach. Water Resources Research, 2007; **43**(9): 1–16. doi:10.1029/2005WR004731

[26] Amirabdollahian M, Datta B. Identification of pollutant source characteristics under uncertainty in contaminated water resources systems using adaptive simulated annealing and fuzzy logic. International Journal of Geomate, 2014; **6**(1): 757–762.

[27] Gurarslan G, Karahan H. Solving inverse problems of groundwater-pollution-source identification using a differential evolution algorithm. Hydrogeology Journal, 2015; **23**(6): 1109–1119. doi:10.1007/s10040-015-1256-z

[28] Yeh W W G. Review: Optimization methods for groundwater modeling and management. Hydrogeology Journal, 2015; **23**(6): 1051–1065. doi:10.1007/s10040-015-1260-3

[29] Mahar P S, Datta B. Optimal monitoring network and ground-water-pollution source identification. Water Resources Planning and Management, 1997; **123**(4): 199–207. doi:10.1061/(ASCE)0733-9496(1997)123:4(199)

[30] Ingber L. Very fast simulated re-annealing. Mathematical and Computer Modelling, 1989; **12**(8): 967–973. doi:10.1016/0895-7177(89)90202-1

[31] Ingber L, Rosen B. Genetic algorithms and very fast simulated reannealing: a comparison. Mathematical and Computer Modelling, 1992; **16**(11): 87–100. doi:10.1016/0895-7177(92)90108-W

[32] Zheng C, Hill M C, Hsieh P A. MODFLOW-2000, The U.S. Geological Survey Modular Ground-Water Model-User Guide to the LMT6 Package, The Linkage with MT3DMS for Multi-Species Mass Transport Modeling. Open File Report 01-82, U.S. Geological Survey, Editor. 2001. Denver, Colorado.

[33] Zheng C, Wang P P. A field demonstration of the simulation optimization approach for remediation system design. Ground Water, 2002; **40**(3): 258–266. doi:10.1111/j.1745-6584.2002.tb02653.x

[34] Zheng C, Wang P P. A modular three-dimensional multispecies transport model for simulation of advection, dispersion, and chemical reactions of contaminants in groundwater systems. U.S. Army Corps of Engineers, Editor. 1999. Washington, DC.

[35] Zheng C, Bennett G D. Applied Contaminant Transport Modeling. 2002. New York: Wiley-Interscience.

## *Edited by Hossein Peyvandi*

The purpose of optimization is to maximize the quality of life, productivity in time, and returns. Optimization is therefore an ongoing challenge of selecting the best possible design among many inferior ones. Because optimization has been essential to human life, numerous techniques have been developed and utilized over the past hundred years. This development has been one of the long-lasting challenges in engineering and science, and it is now clear that the optimization goals in many real-life problems are unlikely to be achieved without recourse to computational techniques. The development of modern optimization techniques began in the early 1950s and is still in progress. The efforts of many distinguished scientists, mathematicians, and engineers behind this development have brought us today's quality of life.

This book concerns computational optimization in engineering and techniques to resolve the underlying problems in real life. It contains studies from scientists and researchers around the world, from North America to Europe and from Asia to Australia.

*Computational Optimization in Engineering - Paradigms and Applications*

Photo by e_zebolov / iStock