**2. Theoretical framework**

In order to introduce some nomenclature and to lay the foundation for surrogate modeling and adaptive sampling, consider a concrete example of an engineering design task.

Before diving into the details of the setup, it is worth briefly discussing how the data is collected. In this design task, we obtain data either from real-world experiments or from computer experiments, the latter as defined in Ref. [10]. A computer experiment consists of running an expensive, complex computer code for a set of different inputs. One of the main motivations for using computer codes is to approximate, and thereby speed up, costly real-world experiments in order to reduce the engineering design cycle time.

Continuing the example, imagine designing a wing blade described by a set of *M* geometric design variables (also called features or input dimensions). The blade is part of an engine and, for a given blade design, i.e., for given values of the *M* features, the engine produces two outputs (objectives): the efficiency and the mass flow rate. The *design space* is an *M*-dimensional space containing the set of all possible geometries we can consider. In this example, we want to maximize the efficiency and to minimize the mass flow rate but, in general, we may have *N* objectives. When *N* > 1, we are performing multi-objective optimization, and later in this chapter we explore a set of sampling strategies suitable for various values of *N*. The set of all points in objective space forms a *response surface*, which is assumed to be extremely costly to explore exhaustively, putting the search for global optima at risk. To reduce the cost of the overall design process, regardless of whether the data is obtained by real-world experiments or complex computer codes, a key ingredient is meta-modeling (surrogate modeling), where an approximation to the response surface is obtained by generalizing its behavior under certain assumptions (such as smoothness) based on a few observed instances of said surface. This is typically done by querying a set of initial points on which the surrogate model is built. Once built, the meta-model is extremely cheap to query in comparison to the surface itself. These meta-models typically require only a handful of data points to construct an accurate representation of the response surface. An important question, however, is which points to pick, i.e., how to form the design [10, 12–14]. Given a budget on the total number of data points, how are new points added to this design sequentially? This is where adaptive sampling comes into play, and GEBHM is used in conjunction with a powerful technology called intelligent design and analysis of computer experiments (GE-IDACE/IDACE) [15, 16].

*Industrial Applications of Intelligent Adaptive Sampling Methods for Multi-Objective… DOI: http://dx.doi.org/10.5772/intechopen.88213*

In this section, we provide an overview of the mathematical framework of the GEBHM and GE-IDACE technologies introduced in Section 1. Further theoretical details and application coverage can be found in Refs. [7, 11, 15–26].

#### **2.1 Bayesian hybrid modeling (GEBHM/BHM)**

In industry applications, it is not uncommon for the data to be multidimensional, noisy, highly non-linear, and expensive to collect. On a day-to-day basis, we address the challenge of enabling a robust and uncertainty-certified design using both limited, expensive simulation data and noisy field measurements. GE Research has an in-house software framework for advanced Bayesian modeling and machine learning named GEBHM (sometimes referred to simply as BHM), which enables the combination of multiple numerical simulations and experimental sources of data in one unified workflow. As shown in **Figure 1**, the GEBHM capabilities are uncertainty propagation and quantification, sensitivity analysis, full Bayesian model calibration, meta-modeling, multi-fidelity analysis, and adaptive design of numerical experiments. The theoretical framework of GEBHM is based on Kennedy and O'Hagan's approach to modeling and fusing simulation and experimental data with associated uncertainties using GPs [6]. The noisy high-fidelity model is represented as a Gaussian process formed as a linear combination of a low-fidelity model and a model discrepancy function $\delta(\cdot)$ as

$$y(\mathbf{x}\_i) = \eta(\mathbf{x}\_i, \boldsymbol{\theta}) + \delta(\mathbf{x}\_i) + \epsilon, \quad \text{for } i = 1, \dots, n,\tag{1}$$

where $y(\cdot)$ is the response (from a high-fidelity numerical model or an experimental measurement). The low-fidelity model $\eta(\cdot, \cdot)$ and the discrepancy term $\delta(\cdot)$ are modeled by separate GPs. The design variables are denoted by $\mathbf{x}\_i$, while the calibration parameters are denoted by $\boldsymbol{\theta}$. GEBHM allows for calibration of a set of model parameters in the low-fidelity model. For example, these could be parameters in a CFD simulation that we want to tune (calibrate) in order to match real-world experimental runs as closely as possible. The measurement error is denoted by $\epsilon$.

*Design and Manufacturing*

#### **Figure 1.**

*A diagram of GE's Bayesian hybrid modeling (GEBHM/BHM). From left to right, data of varying fidelity levels (e.g., simulation vs. experimental) can be input to GEBHM. Over the years, GE Research has added a large range of capabilities, listed to the right, which include parameter tuning, building discrepancy models between low- and high-fidelity data, informing the designer about which inputs most impact the outputs via a global sensitivity analysis, performing probabilistic predictions including tail probabilities, and building high-accuracy predictive surrogate models.*

The GP hyperparameters are learned using a Markov chain Monte Carlo (MCMC) technique based on a Metropolis-Hastings-within-Gibbs algorithm [27, 28] with univariate proposal distributions for the posterior distribution updates. MCMC generally converges toward the most probable values of the parameters, i.e., those which best explain the data [29], from which representative samples can be obtained. To avoid overfitting the high-fidelity data, the initial values of the hyperparameters of the covariance matrices are updated with the current realizations at every MCMC step, and realizations from the posterior distributions of the model parameters are produced.
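As a rough illustration of the sampling scheme described above, the following Python sketch runs a Metropolis-Hastings-within-Gibbs sampler with univariate Gaussian random-walk proposals on a toy two-parameter log-posterior. The target `log_post`, the step sizes, the burn-in length, and all function names are assumptions made for illustration only; this is not the GEBHM implementation.

```python
import math
import random


def log_post(theta):
    # Hypothetical stand-in for a GP hyperparameter log-posterior:
    # independent Gaussians with means (1.0, -2.0) and stds (0.5, 1.5).
    return (-0.5 * ((theta[0] - 1.0) / 0.5) ** 2
            - 0.5 * ((theta[1] + 2.0) / 1.5) ** 2)


def metropolis_within_gibbs(log_post, theta0, steps, n_iter, seed=0):
    """Update one coordinate at a time with a univariate random-walk proposal."""
    rng = random.Random(seed)
    theta = list(theta0)
    current = log_post(theta)
    samples = []
    for _ in range(n_iter):
        for j in range(len(theta)):
            proposal = list(theta)
            proposal[j] += rng.gauss(0.0, steps[j])
            candidate = log_post(proposal)
            # Symmetric proposal, so the acceptance probability is
            # min(1, posterior ratio).
            if rng.random() < math.exp(min(0.0, candidate - current)):
                theta, current = proposal, candidate
        samples.append(list(theta))
    return samples


samples = metropolis_within_gibbs(log_post, [0.0, 0.0], [0.5, 1.5], 20000)
kept = samples[5000:]  # discard an (assumed) burn-in period
mean_0 = sum(s[0] for s in kept) / len(kept)
mean_1 = sum(s[1] for s in kept) / len(kept)
print(mean_0, mean_1)
```

Updating one coordinate at a time keeps each proposal one-dimensional, which is what makes univariate proposal distributions sufficient even when the number of hyperparameters is large.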


#### **2.2 Intelligent design and analysis of computer experiments (GE-IDACE/IDACE)**

While meta-models offer very low query times for response surface values, they are still only approximate, especially when built on small datasets. Thus, having the meta-model does not generally mean that we can use it entirely in place of the response surface. However, we can use it as a guide to seek out new locations in the design space that are promising with respect to our goal, which could be, e.g., optimization or producing the most accurate surrogate model possible. The focus in this chapter is on the former goal, global optimization.

GE-IDACE, sometimes referred to simply as IDACE, uses the expected improvement (EI) [30] method to explore and exploit the design space in order to find the global optimum with the fewest possible resources; see **Figure 2**. Without loss of generality, we consider minimization, and EI defines the *improvement* of a new design point $x$ as $I(f(x)) = \max(f\_{\min} - f(x), 0)$, where $f\_{\min}$ is the current best value (also called the incumbent), and we will suppress $x$ going forward. The surrogate model predicts a distribution for $f$, denoted $p\_f$. This makes $I(f)$ a random variable. In the face of this randomness, we are interested in how large the improvement is expected to be:

$$\text{EI} = \int\_{-\infty}^{f\_{\min}} (f\_{\min} - f)\, p\_f \,\mathrm{d}f. \tag{2}$$

#### **Figure 2.**

*Diagram of intelligent design and analysis of computer experiments as used by GE Research (GE-IDACE/IDACE). Starting with an initial design, e.g., from Latin hypercube sampling (LHS), a stochastic model, such as BHM, is built on it. Via the stochastic predictive distribution, the desirability and uncertainty are quantified and combined into an acquisition strategy (such as: pick points to optimize an objective, i.e., expected improvement (EI)). Then, we check for convergence, rank a new set of points to be run, and re-build the stochastic model, completing one iteration of GE-IDACE. We iterate until convergence, defined as budget exhaustion, the EI reaching a specific value, or some other criterion.*

In each iteration of GE-IDACE, the point with maximum EI is added to the design.
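To make the selection rule concrete, the sketch below evaluates Eq. (2) for a Gaussian predictive distribution, for which the integral has the standard closed form $(f\_{\min} - \mu)\Phi(z) + \sigma\phi(z)$ with $z = (f\_{\min} - \mu)/\sigma$, and then picks the candidate with the largest EI. The Gaussian assumption, the candidate list, and the function names are ours for illustration; the source does not state how GE-IDACE evaluates the integral.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_min):
    """Closed-form Eq. (2) for a Gaussian predictive N(mu, sigma^2), minimization."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_min - mu, 0.0)
    z = (f_min - mu) / sigma
    return (f_min - mu) * norm_cdf(z) + sigma * norm_pdf(z)


# Hypothetical candidates: (label, predicted mean, predicted std).
candidates = [("a", 1.2, 0.1), ("b", 1.0, 0.5), ("c", 0.9, 0.05)]
f_min = 1.0  # incumbent (current best observed value)

best = max(candidates, key=lambda c: expected_improvement(c[1], c[2], f_min))
print(best[0])
```

In this toy example candidate `b` is selected even though `c` has the lower predicted mean, because its larger predictive uncertainty leaves more room for improvement; this is the exploration-exploitation trade-off that EI encodes.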

In what follows, we review some GE-IDACE multi-objective EI methods that we have found to work well in practice, but we emphasize that more research is needed on locating the global optimum faster in multiple dimensions under increasingly strict budgets [15].

#### *2.2.1 Multiple objectives: centroid method for two dimensions*

Many approaches exist for running IDACE with multiple objectives [31–39]. Here, consider the two-dimensional case and the so-called centroid method, which shares a similar intuition with its one-dimensional counterpart, Eq. (2). In this methodology, each candidate point from the design space is imagined to create a centroid point; the equations for computing this point are given below. This centroid point, which is located in output space, is then compared to its closest Pareto point on the current frontier [40, 41]. For simplicity, consider two different candidate (design) points, for each of which we compute the associated centroid point. For the $i$th candidate point $\mathbf{x}\_i$, the centroid point in two-dimensional output space is $\overline{f}(\mathbf{x}\_i) = \left(\overline{f}\_1(\mathbf{x}\_i), \overline{f}\_2(\mathbf{x}\_i)\right)$, the Pareto point $P\_i$ on the current frontier closest to $\overline{f}(\mathbf{x}\_i)$ is computed, and the distance between $\overline{f}(\mathbf{x}\_i)$ and $P\_i$ quantifies the value of adding candidate point $\mathbf{x}\_i$. Note that $P\_i \neq P\_j$ in general. The candidate point inducing the biggest expected change in the Pareto frontier, as measured by the distance between its centroid point and the corresponding Pareto point, is picked in a given iteration.

The probability that a new design point at $\mathbf{x}$ improves the $i$th member of the current Pareto front, denoted $\left(f\_1^{\*,(i)}, f\_2^{\*,(i)}\right)$, is:

$$P\left[f\_1(\mathbf{x}) < f\_1^{\*, \left(i\right)} \cup f\_2(\mathbf{x}) < f\_2^{\*, \left(i\right)}\right] \equiv P[I]\_{\text{2D}} = \tag{3}$$

$$\Phi\left(\frac{f\_1^{\*,(i)} - \mu\_1(\mathbf{x})}{\sigma\_1(\mathbf{x})}\right) + \Phi\left(\frac{f\_2^{\*,(i)} - \mu\_2(\mathbf{x})}{\sigma\_2(\mathbf{x})}\right) \tag{4}$$

$$-\Phi\left(\frac{f\_1^{\*, \left(i\right)} - \mu\_1(\mathbf{x})}{\sigma\_1(\mathbf{x})}\right)\Phi\left(\frac{f\_2^{\*, \left(i\right)} - \mu\_2(\mathbf{x})}{\sigma\_2(\mathbf{x})}\right).\tag{5}$$
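Eqs. (3)–(5) can be read as an inclusion-exclusion formula, $P[A \cup B] = P[A] + P[B] - P[A]P[B]$, which holds when the two objective predictions are treated as independent Gaussians with means $\mu\_1, \mu\_2$ and standard deviations $\sigma\_1, \sigma\_2$. A minimal Python sketch under that reading follows; the function names and the numerical inputs are hypothetical.

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_improve_2d(mu, sigma, pareto_point):
    """P[I]_2D from Eqs. (3)-(5): probability of improving on one Pareto point
    in at least one of the two objectives (minimization)."""
    p1 = norm_cdf((pareto_point[0] - mu[0]) / sigma[0])
    p2 = norm_cdf((pareto_point[1] - mu[1]) / sigma[1])
    return p1 + p2 - p1 * p2


# Hypothetical prediction at a candidate point and one Pareto-front member.
p = prob_improve_2d(mu=(1.8, 2.5), sigma=(0.4, 0.5), pareto_point=(2.0, 2.0))
print(p)
```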


The two-dimensional EI then becomes:

$$E[I(\mathbf{x})] \equiv \text{EI}\_{\text{2D}} = P[I]\_{\text{2D}} \sqrt{\left[\overline{f}\_1(\mathbf{x}) - f\_{1,\text{c}}^{\*}(\mathbf{x})\right]^2 + \left[\overline{f}\_2(\mathbf{x}) - f\_{2,\text{c}}^{\*}(\mathbf{x})\right]^2},\tag{6}$$

where $\left(f\_{1,\text{c}}^{\*}(\mathbf{x}), f\_{2,\text{c}}^{\*}(\mathbf{x})\right)$ is the Pareto point, among all Pareto points of the current Pareto frontier, which is closest to the centroid point $\left(\overline{f}\_1(\mathbf{x}), \overline{f}\_2(\mathbf{x})\right)$. The details are available in Ref. [15], which also discusses another method that assumes a convex hull Pareto shape and, in certain scenarios, outperforms the centroid method.
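Given a centroid point and a value of $P[I]\_{\text{2D}}$, Eq. (6) reduces to a nearest-neighbor search on the current Pareto front followed by a Euclidean distance. The sketch below illustrates only that step: the centroid computation itself is described in Ref. [15] and is not reproduced here, so the centroid, the probability, and the Pareto front are passed in as hypothetical inputs.

```python
import math


def closest_pareto_point(centroid, pareto_front):
    """Pareto point with the smallest Euclidean distance to the centroid."""
    return min(pareto_front, key=lambda p: math.dist(p, centroid))


def ei_2d(p_improve, centroid, pareto_front):
    """Eq. (6): probability of improvement times the centroid-to-front distance."""
    f_c = closest_pareto_point(centroid, pareto_front)
    return p_improve * math.dist(centroid, f_c)


# Hypothetical current Pareto front (both objectives minimized).
front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
centroid = (1.5, 1.5)   # assumed to be supplied by the centroid equations
p_improve = 0.75        # assumed value of P[I]_2D for this candidate

print(closest_pareto_point(centroid, front), ei_2d(p_improve, centroid, front))
```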

#### *2.2.2 Multiple objectives: hypervolume method for any dimensions*

The hypervolume EI method is designed to handle high-dimensional objective spaces, i.e., those with more than two objectives [15, 42]. Drawing an analogy with the one-dimensional case, in multi-dimensional objective space the hypervolume is considered a measure of the current known minimum (the Pareto front). The difference in hypervolume between the current Pareto front and the new Pareto front resulting from adding a candidate point is used to define the EI. Accordingly, the EI in multi-dimensional objective space, $\text{EI}(f)$, gained by adding to the design a new point $x$ with objective values $f(x) \equiv f$ is defined as

$$\text{EI}(f) = \int \left[\text{HV}(f^\* \cup f) - \text{HV}(f^\*)\right] p\_f(f) \,\mathrm{d}f,\tag{7}$$

where $\text{HV}(f^\*)$ denotes the hypervolume contained by the current Pareto front. In general, the expectation integral in the hypervolume EI method is simplified by decomposing it into sub-integrals over hyperrectangles; see Ref. [43]. Further simplification of this integral can be achieved by assuming that the predicted output components are independent [44]. To reduce the cost of computing the hyperrectangle integrals, a Monte Carlo approximation can be used [45], as can a suitable merging approach to decompose the integral, as presented in Ref. [31]. It is worth mentioning that, depending on how the proposed point dominates the current Pareto front, different levels of improvement can be gained [15].
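The Monte Carlo route mentioned above can be sketched for two objectives as follows. We assume minimization, independent Gaussian predictions for the two objectives, and a user-chosen reference point `ref` that bounds the hypervolume; the reference point, the function names, and the numbers are our assumptions for illustration and are not taken from the source.

```python
import random


def pareto_filter(points):
    """Keep the non-dominated points (minimization in both objectives)."""
    front = []
    for p in points:
        dominated = any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
        if not dominated:
            front.append(p)
    return sorted(set(front))


def hypervolume_2d(points, ref):
    """Area dominated by the points and bounded by the reference point."""
    front = [p for p in pareto_filter(points) if p[0] < ref[0] and p[1] < ref[1]]
    area, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:  # sorted by f1 ascending, so f2 is descending
        area += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return area


def hypervolume_ei_mc(mu, sigma, front, ref, n_samples=5000, seed=0):
    """Monte Carlo estimate of Eq. (7) with independent Gaussian outputs."""
    rng = random.Random(seed)
    base = hypervolume_2d(front, ref)
    total = 0.0
    for _ in range(n_samples):
        f = (rng.gauss(mu[0], sigma[0]), rng.gauss(mu[1], sigma[1]))
        total += hypervolume_2d(front + [f], ref) - base
    return total / n_samples


front = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]  # hypothetical current front
ref = (5.0, 5.0)                              # assumed reference point
ei = hypervolume_ei_mc(mu=(1.5, 1.5), sigma=(0.3, 0.3), front=front, ref=ref)
print(hypervolume_2d(front, ref), ei)
```

A sampled point that is dominated by the current front leaves the hypervolume unchanged and therefore contributes zero, so the estimate is non-negative by construction.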

#### **Figure 3.**

*An example of how the objective space, here one-dimensional, is split into regions (here (−10, −4), (−4, 4), and (4, 10)) and separate, independent desirability functions with user-defined parameters (not directly shown) are defined. In this case, candidate points with predicted objective values from the surrogate model in the range (−4, 4) are primarily favored, since $D(y)$, plotted on the y-axis, takes its largest values there. Note that $D(y)$ need not integrate to any specific value (such as unity). Desirability provides a lot of flexibility to GE-IDACE. As a simple example, it enables us to target a specific objective value instead of maximizing or minimizing it.*

#### *2.2.3 GE-IDACE with desirability*

The physical programming technique can help the engineer guide the design algorithm toward the desirable regions of the Pareto front [46]. For example, let the $i$th objective in a multi-objective design problem be denoted by $Y\_i \in (-10, 10)$. Then, by dividing the range of possible values into four segments (e.g., $(-10, -5)$, $(-5, 0)$, $(0, 5)$, and $(5, 10)$), the design engineer can assign a desirability to each sub-range: highly desirable, acceptable, undesirable, or unacceptable. An aggregate desirability function formed from these individual ranges is used to rank the Pareto points. Within the GE-IDACE framework, we help the designer define desirabilities from a set of functions by decomposing the objectives into ranges, as shown in **Figure 3**. The desirability function in this case is one-dimensional and decomposed into three regions, to which the designer assigns different desirability functions. The figure shows that the most favored candidate points have an objective value predicted to be in the range $(-4, 4)$.

Next, we extend the hyperrectangle approach to account for desirability as follows. Specifically, the designer chooses, for each objective, ranges of the objective which are considered highly desirable, acceptable, undesirable, and unacceptable. Closely following the ideas in Ref. [47], consider the following quantity called expected desirability of improvement (EDI):

$$\text{EDI} = \int\_{I(f) > 0} D(f)\, p\_f \,\mathrm{d}f. \tag{8}$$

Given the predictive distribution $p\_f$ for some new point, EDI is the mean desirability of the predictions that improve the current Pareto front.
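For intuition, Eq. (8) can be evaluated exactly in a simplified one-dimensional setting. The sketch below assumes a single minimized objective (so that $I(f) > 0$ means $f < f\_{\min}$), a Gaussian predictive distribution, and a piecewise-constant desirability on the four example segments; the numerical desirability levels and all names are hypothetical choices, not values from the source.

```python
import math


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def edi_1d(mu, sigma, f_min, segments):
    """Eq. (8) in one dimension for piecewise-constant desirability.

    segments: list of (low, high, desirability) tuples.
    Only the part of each segment below the incumbent f_min counts,
    since that is where the improvement I(f) is positive.
    """
    total = 0.0
    for low, high, d in segments:
        upper = min(high, f_min)
        if upper > low:
            mass = norm_cdf((upper - mu) / sigma) - norm_cdf((low - mu) / sigma)
            total += d * mass
    return total


# Hypothetical levels: highly desirable, acceptable, undesirable, unacceptable.
segments = [(-10.0, -5.0, 1.0), (-5.0, 0.0, 0.7), (0.0, 5.0, 0.2), (5.0, 10.0, 0.0)]
edi = edi_1d(mu=-3.0, sigma=2.0, f_min=1.0, segments=segments)
print(edi)
```

With a desirability of one everywhere, this quantity reduces to the probability of improvement, which gives a simple sanity check on the implementation.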

#### **3. Industrial applications of GEBHM/GE-IDACE**

Having covered the theoretical framework for both GEBHM and GE-IDACE, this section turns to demonstrating their application to real-world engineering problems. We first consider additive manufacturing (AM), which is a vital process for many engineering design applications and is bound to further transform manufacturing. In essence, AM can be defined as the process of overlaying layers to create three-dimensional objects [42, 48]. We will show how GE-IDACE reduces the design cycle time from 6 months to a few weeks.

As a second application, we consider combustion testing, where the goal is to maximize load while keeping exhaust and temperature within specific limits. We demonstrate that GE-IDACE can help guide the test into regions with 20% more points from critical areas compared to the status quo.

Finally, we demonstrate how well GE-IDACE performs on expensive, complex computer codes such as CFD models. We show that GEBHM/GE-IDACE helps reduce the number of test points by a factor of three compared with neural network modeling and optimization.

#### **3.1 Additive manufacturing**

As a main example of AM applications utilizing GE-IDACE, we consider Direct Metal Laser Melting (DMLM). GE-IDACE is also used for feature-based qualification methods for Directed Energy Deposition, where it has had a large positive impact.


DMLM is a key modality of additive manufacturing that focuses on 3D printing of metallic materials. Printing metals is in itself a complicated task due to the microstructural instabilities from melting of the metallic powder. It is especially complicated for superalloys, since as-built parts from DMLM are highly prone to microcracking and other microstructural deficiencies. It is therefore of primary importance to identify suitable processing parameters for the hard-to-process nickel-based superalloys, and that process has proven to be non-trivial. The lead times for processing parameter development for these types of alloys are typically on the order of several weeks to months, which means increased cost and the inability to introduce new materials in the additive marketplace. To reduce the cycle time when developing the processing parameters for DMLM for hard-to-process alloys, we have extensively utilized GEBHM and GE-IDACE to collect data in an intelligent manner [49]. Typically, the key parameters that dictate the processing of additive parts are the laser power, the laser speed, etc. GE-IDACE automated the process of obtaining design points, i.e., processing parameter combinations, which were most informative to the model and which would guide us in the direction of the optimal solution(s). We used quantified characteristics of the microstructural deficiencies as our outputs/objectives. Parts were built in the additive machine and then characterized by sectioning the parts, imaging the sections, and performing automated image analysis. This enabled us to analyze a large number of images, extracting specific defect information such as porosity, keyholes due to unmelted powder, etc. We are interested in porosity because it adversely affects mechanical properties such as yield strength and fatigue life. The GE-IDACE process then constructed a model of the output microstructural defects as a function of the input process parameters. We utilized both variance minimization (uncertainty sampling) and EI-based optimization to explore and exploit the design space and identify the optimal solutions faster. Using GE-IDACE with GEBHM, we were able to reduce the cycle time for identifying the optimal process parameter window for a superalloy from 6 months to a few weeks.
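The two acquisition strategies named above can be sketched as follows. The snippet assumes a Gaussian predictive distribution for a single defect metric that is to be minimized and uses the standard closed-form expression for expected improvement (EI). The candidate parameter combinations, predicted means, and standard deviations are made-up values for illustration, not data from the study, and the code is not the GE-IDACE implementation.

```python
import math


def expected_improvement(mu, sigma, best):
    """Closed-form EI for minimization under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(best - mu, 0.0)
    z = (best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mu) * cdf + sigma * pdf


# Hypothetical candidates: (laser power, laser speed) -> predicted defect (mean, std).
candidates = {
    (200, 800): (0.30, 0.02),
    (250, 900): (0.25, 0.10),
    (300, 1000): (0.60, 0.30),
}
best_observed = 0.28  # lowest defect level measured so far (made-up value)

# Uncertainty sampling: pick the candidate with the largest predictive spread.
most_uncertain = max(candidates, key=lambda x: candidates[x][1])

# EI-based selection: pick the candidate with the largest expected improvement.
best_ei = max(candidates, key=lambda x: expected_improvement(*candidates[x], best_observed))
```

In this toy example the two criteria select different points: uncertainty sampling favors the poorly understood corner of the design space, while EI favors the point most likely to beat the best defect level seen so far.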

**Figure 4** shows the progression of the data collection (only two dimensions of a multi-dimensional problem are shown). We can clearly see that the GE-IDACE methodology helps us explore the design space initially and then exploit the optimal solutions in later iterations to quickly converge on an optimal processing window. By the third iteration, almost 65% of the points suggested by GE-IDACE satisfied our defect objectives, while the overall model uncertainty was also reduced. Currently, we are working on expanding this methodology to more complicated structures and additional quantities of interest (QoIs) such as mechanical properties, durability, and surface finish.

#### **Figure 4.**

*An example of GE Bayesian hybrid model (BHM/GEBHM)/GE intelligent design and analysis of computer experiments (IDACE/GE-IDACE)-based process parameter optimization for a hard-to-process superalloy. The plot on the left shows an initial design from a space-filling and uninformed design of experiment (DoE/DOE). Defects in an as-built part were measured after each DoE. The red triangles in the middle plot are suggested by GE-IDACE based on a GEBHM model built on the blue-circle dataset. As noted in the text box below the plot, DoE 2 reduced the model uncertainty by 5% but did not suggest any datapoints that meet the target defect criteria (not specified here). In DoE 3 on the far right, the green points are suggested by GE-IDACE based on a GEBHM model built on the blue circles and red triangles. By adding informative data, we have added more information to GE-IDACE through the underlying GEBHM models. As a result, at DoE 3 we saw a further reduction in model uncertainty (close to 25%) and also identified a parameter space window where more than 65% of datapoints satisfied the defect criteria. Overall, the figure aims to demonstrate the power of the GE-IDACE methodology for performing experimental design.*

#### **3.2 Combustion testing**

During manufacturing of turbine or jet engines, combustion testing is required at different stages of development, manufacturing, installation, and deployment to ensure that the engine is working as designed and within desired tolerances. Multiple such experiments are required for different operating points, making this process very time consuming and expensive. Traditionally, test plans are created prior to the actual experiments and consist of heuristic test blocks or groups of tests. These test blocks are generally created from expert judgment based on prior operational experience. However, these test plans may not be optimal, as they are heuristically designed without rigorous statistical analysis of legacy and existing data. The obvious result is that traditional heuristic test plans may lead to inefficient and redundant allocation of resources.

GE-IDACE can therefore be employed to improve the test schedule. It does so by adaptively learning the system characteristics and performance as a function of the input conditions with the underlying surrogate model GEBHM, using real-time data or even leveraging historical experiments, while also incorporating expert judgment. The test plan is thereby dynamically learned, compiled, ranked, and updated.

Ideally, historical data is available. The first step is to build GEBHM on this dataset. The input variables in this application are gas splits, loads, speed, firing temperature, etc., and the outputs are NOx emission, combustor instability, system dynamics, etc. The process followed with GE-IDACE is shown in **Figure 2**. The steps are repeated as more data is added until the necessary goals, for example certification approval, optimum operating conditions, and/or time and budget constraints, are all met.


The user typically possesses desirability ratings for experimental outcomes. This desirability may include a factor, threshold, constraint, goal, or objective that is important to the user for testing, such as emissions thresholds, maximum and minimum loads, and efficiency ratings. For example, the user would like to know the operating conditions at which the load is maximum while NOx exhaust and temperature are within some limits. The desirability is provided by the user as target values, target ranges, or a custom function over the quantity of interest; see Section 2.2.3 for a review.

With this introduction, we now demonstrate the impact of using GE-IDACE on combustion testing. A design space of four operating conditions *x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, and *x*<sub>4</sub> is explored such that two performance parameters *y*<sub>1</sub> and *y*<sub>2</sub> stay within thresholds defined as *y*<sub>1</sub> ∈ [*y*<sub>1</sub><sup>low</sup>, *y*<sub>1</sub><sup>high</sup>] and *y*<sub>2</sub> ∈ [*y*<sub>2</sub><sup>low</sup>, *y*<sub>2</sub><sup>high</sup>]. The goal is to design a test plan that maximizes the number of experiments within said thresholds.

First, for later comparison, the results of the traditional approach with one-factor-at-a-time designs are shown in **Figure 5A**. Grey points indicate experiments that are out of bounds from a threshold perspective. Blue points met the conditions, i.e., they are within the blue delineated region of objective space. Out of a total of 69 experiments performed, 10 (14.5%) satisfied the desirability of *y*<sub>1</sub>, 27 (39.1%) satisfied the desirability of *y*<sub>2</sub>, and 35 (50.7%) satisfied the desirability of both *y*<sub>1</sub> and *y*<sub>2</sub>.

Then, to improve this process, GE-IDACE was used to carry out a dynamic test plan. After each experiment, GEBHM was updated on the new data and the next point was picked based on the desirability with regard to the output responses. The corresponding output performance of these experiments and the desirable regions are shown in **Figure 5B**, to be compared with **Figure 5A**. Out of a total of 69 experiments performed, 25 (36.2%) satisfied the desirability of *y*<sub>1</sub>, 40 (57.9%) satisfied the desirability of *y*<sub>2</sub>, and 47 (68.2%) satisfied the desirability of both *y*<sub>1</sub> and *y*<sub>2</sub>. The impact is that GE-IDACE increases the number of points in the desirable region by 20% with the same number of tests. Given the high cost of running these experiments, this easily translates to hundreds of thousands of dollars saved annually.
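The comparison above is simple arithmetic on the reported counts. The short Python sketch below recomputes the shares of experiments satisfying both desirabilities so that two ways of expressing the gain (percentage points versus relative increase) can be told apart. The counts are those quoted in the text; the variable names are illustrative, the text does not state how its 20% figure is defined, and the recomputed percentages may differ from the quoted ones in the last digit because of rounding.

```python
def share(count, total):
    """Percentage of experiments that fall in the desirable region."""
    return 100.0 * count / total


total = 69             # experiments in each test plan (from the text)
both_traditional = 35  # satisfied both y1 and y2, one-factor-at-a-time plan
both_adaptive = 47     # satisfied both y1 and y2, GE-IDACE plan

p_trad = share(both_traditional, total)
p_adapt = share(both_adaptive, total)
relative_gain = 100.0 * (both_adaptive / both_traditional - 1.0)

print(f"traditional: {p_trad:.1f}%  adaptive: {p_adapt:.1f}%")
print(f"difference: {p_adapt - p_trad:.1f} percentage points")
print(f"relative increase in desirable points: {relative_gain:.1f}%")
```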

#### **Figure 5.**

*(A) Results from the testing approach which does not utilize GE-IDACE. The plot shows the two-dimensional output space and the desirable region is delineated with a blue line and identifying text in the top right corner. Blue dots indicate experiments that met the desirability. Grey points did not meet desirabilities. (B) Results from the testing approach which utilizes GE-IDACE. The plot shows the two-dimensional output space and the desirable region is delineated with a blue line and identifying text in the top right corner. Blue dots indicate experiments that met the desirability. Grey points did not meet desirabilities. Compared to (A), a higher fraction of points are blue and thus located in the desirable region.*

In summary, the impact of GE-IDACE on combustion testing is clear. In fact, the application has now extended beyond combustion testing: we are also performing full-scale engine testing with this new strategy, and it is being rolled out to multiple GE businesses, achieving substantial cost reductions and better engineering designs.

#### **3.3 Computational fluid dynamics for turbine design**

Next, we consider the application of GE-IDACE to turbine design with CFD. So far in this chapter, we have covered real-world engineering applications. In the following, we demonstrate how GE-IDACE is positively impacting expensive computer simulations as well.

Aerodynamic optimization of a turbine involves dozens of variables, impacting everything from system level features through detailed airfoil properties. Two primary top-level considerations for the aerodynamic design of a turbine include vortexing and airfoil stack. Vortexing involves custom tailoring of the vane and rotor exit angle distributions. This establishes the radial distribution of work within the turbine stage. Vortexing affects local acceleration and mass flow distributions, and thus is a strong driver of secondary loss generation (endwall vortices). Airfoil stacking aerodynamically imposes body forces on the flow, further affecting the radial mass flow and work distributions. Stacking also strongly influences the generation of secondary loss. The general objective of a vortexing and stack optimization is to maximize turbine performance, usually through management of secondary loss growth, while also adhering to numerous constraints that ensure proper downstream performance and acceptable component life.

Before covering how GE-IDACE improved the optimization, consider the traditional approach as shown in **Figure 6**. The first stage vane is optimized using a component-specific space-filling DOE on which CFD is evaluated. These results are then used to build a surrogate model that characterizes a row-specific loss metric (relative total pressure loss or secondary kinetic energy, for example). A genetic algorithm (GA) is used to optimize a set of X's (defining a design point) describing the geometry for minimal loss (maximum efficiency) based on the surrogate model. This process is repeated for each subsequent row, with downstream components reacting to the results of the upstream row's optimized exit flow conditions.

**Figure 6.** *Traditional approach to turbine blade optimization.*


Compared to the traditional approach, GE-IDACE can help automate the geometry generation process to ensure efficient throughput. To accelerate build time from an X-specification to a CFD-ready geometry, a mesh morphing approach is implemented. Leveraging the block-structured hexahedral mesh from the baseline geometry's CFD analysis, new cases re-stretch the inlet, exit, and passage blocks to produce a topologically identical mesh that conforms to the new 3D airfoil surface. The O block surrounding the airfoil remains largely unchanged and translates with the new geometry. The baseline mesh is similar in fidelity to a typical "production" CFD analysis for turbine design. Surface *y*<sup>+</sup> is 1 for all airfoil metal surfaces, and in total, the high pressure turbine (HPT) domain consists of 9 million nodes. **Figure 7** shows a representative example of the baseline grid and how it is morphed to an updated geometry.

All processes required to translate X's to CFD geometries are batch enabled, and each new CFD case requires 15 min of wall clock time to generate. The CFD analysis is performed using GE's in-house CFD solver, TACOMA. TACOMA is a 2nd order accurate (in time and space), finite-volume, block-structured, compressible flow solver, implemented in Fortran 90. Stability is achieved via the JST scheme, and convergence is accelerated using pseudo-time marching and multi-grid techniques. The Reynolds Averaged Navier-Stokes (RANS) equations are closed via the k-*ω* turbulence model of Wilcox. Multi-row analysis is enabled through the use of mixing plane interfaces. Using 64 total CPU cores for the four-airfoil HPT domain, convergence is achieved in roughly 6 h.
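The per-case figures quoted above make it straightforward to estimate the cost of a CFD campaign. The sketch below is a back-of-the-envelope calculation that assumes every case takes the quoted 6 h on 64 cores, that geometry generation adds 15 min of wall-clock time without being counted in core-hours, and that cases run one after another. The campaign sizes of 50 and 320 points are the DOE sizes mentioned in this section; the function name and the serial-execution assumption are illustrative.

```python
CORES = 64         # CPU cores per CFD case (from the text)
SOLVE_HOURS = 6.0  # approximate wall-clock hours to convergence (from the text)
MESH_HOURS = 0.25  # 15 min of wall-clock time to generate each case (from the text)


def campaign_cost(n_cases):
    """Return (core_hours, serial_wall_clock_hours) for n_cases CFD evaluations."""
    core_hours = n_cases * CORES * SOLVE_HOURS
    wall_clock = n_cases * (SOLVE_HOURS + MESH_HOURS)  # assumes cases run serially
    return core_hours, wall_clock


for n in (1, 50, 320):
    core_hours, wall_clock = campaign_cost(n)
    print(f"{n:4d} cases: {core_hours:9.0f} core-hours, {wall_clock:7.1f} h if run serially")
```

Estimates of this kind show why the number of CFD evaluations, rather than the cost of fitting the surrogate model, dominates the budget.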

As a benchmark for the new process, a 320-point space-filling OLH DOE was first created to cover all 32 HPT variables. Consistent with current practice, a radial basis function (RBF) surrogate model was fit to this data set, and GA optimization was performed on the RBF model. Modest gains over baseline were achieved from this approach.

For GE-IDACE, a more sparsely populated OLH DOE of 50 points was generated to seed the optimization. Leveraging GEBHM capabilities, additional DOE points were added only in areas of high error where the GEBHM model predicted a high likelihood of progressing toward the objective, namely maximum delta group efficiency over the baseline. Through several rounds of intelligent incremental point addition, where each round included a refinement of the GEBHM fit, a new final optimum was established that exceeded the previous delta by roughly three times. Additionally, as shown in **Figure 8**, this much more favorable outcome was achieved with roughly a third of the computational resources.

#### **4. Summary and future work**

It has been demonstrated how advanced engineering tools centered around adaptive sampling in multi-objective space help achieve better engineering designs at highly reduced cost. The underlying technologies are GEBHM and GE-IDACE, which were covered first from a theoretical perspective. Then, applications in the areas of additive manufacturing, combustion testing, and computational fluid dynamics were considered. The impact of using GEBHM/GE-IDACE was clear and far surpassed the status quo. At GE we consistently find a 30–90% resource cost reduction.

Before discussing future work, we first cover some of the main limitations of the GE-IDACE tool. Fundamentally, GE-IDACE treats the computer experiment as a black box function, i.e., it only sees inputs to the code and the corresponding outputs. In some cases, this information is all we are able to leverage, but in other situations we may have additional insights which, if taken advantage of, could speed up the optimization. For example, gradient information could be available from the experiments too. Furthermore, the GE-IDACE approach is "greedy," i.e., it selects the next input point from the design space which is predicted to give the best

The objective of the blade design task will be group efficiency. This metric is evaluated for each candidate point as a delta from a known baseline, which for this case is a modern two-stage Aviation HPT that already leverages results from prior optimization using the traditional techniques described earlier. All four HPT airfoils are considered in this optimization. To establish an entitlement performance, no constraints are imposed at this time to account for mechanical requirements or downstream component performance. Traditional space-filling DOEs for high dimensional problems require a large number of data points, and for a CFD-based study, an out-of-budget amount of computational resources. To manage these requirements, and to maintain design-cycle-relevant optimization times, advanced machine learning techniques are employed to intelligently guide the optimization process.

**Figure 7.** *Baseline and representative morphed mesh for an HPT vane.*


#### **Figure 8.**


*Results of using GE-IDACE for turbine fan blade optimization. The traditional best shown with a green dashed line identifies the optimal design previously obtained using a mix of strategic designs and expert insights. GE-IDACE is the red full line and automates the design process and is clearly seen to outperform status quo. The initial design from which GEBHM is built from is shown as blue circles.*

As a benchmark for the new process, a 320-point space-filling optimal Latin hypercube (OLH) DOE was first created to cover all 32 HPT variables. Consistent with current practice, a radial basis function (RBF) surrogate model was fit to this data set, and genetic algorithm (GA) optimization was performed on the RBF model. This approach achieved modest gains over the baseline.
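To make the space-filling idea concrete, the following is a minimal sketch of a plain (randomized) Latin hypercube in the unit cube, sized like the benchmark DOE (320 points, 32 variables). It is an illustration only: the function name `latin_hypercube` and the seed are assumptions, and the code does not perform the optimization step of an optimal Latin hypercube such as the algorithm of Ref. [14].

```python
import random

def latin_hypercube(n_points, n_dims, seed=0):
    """Return n_points samples in [0, 1)^n_dims with one sample per stratum
    in every dimension (plain randomized LHS, not an optimized OLH)."""
    rng = random.Random(seed)
    samples = [[0.0] * n_dims for _ in range(n_points)]
    for d in range(n_dims):
        # Shuffle the stratum indices so each of the n_points bins is used once.
        strata = list(range(n_points))
        rng.shuffle(strata)
        for i, s in enumerate(strata):
            # Place the point uniformly at random inside bin [s/n, (s+1)/n).
            samples[i][d] = (s + rng.random()) / n_points
    return samples

# Sizes taken from the benchmark described above; values are in the unit cube
# and would still need scaling to the physical range of each design variable.
doe = latin_hypercube(n_points=320, n_dims=32)
```

Each column of `doe` covers its range evenly, which is the property that makes such designs attractive for seeding a surrogate model when every point costs a full CFD run.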

For GE-IDACE, a more sparsely populated OLH DOE of 50 points was generated to seed the optimization. Leveraging GEBHM capabilities, additional DOE points were added only in areas of high error where the GEBHM model predicted a high likelihood of progressing toward the objective, namely maximum delta group efficiency over the baseline. Through several rounds of intelligent incremental point addition, where each round included a refinement of the GEBHM fit, a new final optimum was established that exceeded the previous delta by a factor of roughly three. Additionally, as shown in **Figure 8**, this much more favorable outcome was achieved with roughly a third of the computational resources.
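The trade-off between predicted gain and model uncertainty that drives this point selection can be illustrated with the standard single-objective expected improvement (EI) for a Gaussian prediction. This is a hedged sketch rather than the GE-IDACE implementation: the function names, the candidate labels, and all numerical values are hypothetical, and the acquisition function actually used in GE-IDACE may differ in detail.

```python
import math

def expected_improvement(mean, std, best):
    """EI for maximization, given a Gaussian prediction N(mean, std^2)
    and the best objective value observed so far."""
    if std <= 0.0:
        # No uncertainty: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def pick_next(candidates, best):
    """candidates: list of (label, predicted mean, predicted std).
    Returns the label of the candidate with the highest EI."""
    return max(candidates, key=lambda c: expected_improvement(c[1], c[2], best))[0]

# Hypothetical surrogate predictions of delta efficiency at three candidates.
candidates = [
    ("A", 0.90, 0.05),  # predicted worse than the current best, low uncertainty
    ("B", 1.00, 0.30),  # predicted equal to the current best, high uncertainty
    ("C", 1.05, 0.01),  # predicted slightly better, almost no uncertainty
]
next_point = pick_next(candidates, best=1.0)
```

With these illustrative numbers the uncertain candidate "B" is selected over the nearly certain small gain of "C", which shows how an EI-type criterion directs new points toward regions that combine high model error with a realistic chance of improvement.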

#### **4. Summary and future work**

It has been demonstrated how advanced engineering tools centered around adaptive sampling in multi-objective space help achieve better engineering designs at greatly reduced cost. The underlying technologies are GEBHM and GE-IDACE, which were first covered from a theoretical perspective. Then, applications in the areas of additive manufacturing, combustion testing, and computational fluid dynamics were considered. The impact of using GEBHM/GE-IDACE was clear and far surpassed the status quo. At GE, we consistently find a 30–90% resource cost reduction.

Before discussing future work, we first cover some of the main limitations of the GE-IDACE tool. Fundamentally, GE-IDACE treats the computer experiment as a black-box function, i.e., it only sees inputs to the code and the corresponding outputs. In some cases, this information is all we are able to leverage, but in other situations we may have additional insights which, if taken advantage of, could speed up the optimization. For example, gradient information could also be available from the experiments. Furthermore, the GE-IDACE approach is "greedy," i.e., it selects the next input point from the design space which is predicted to give the best immediate outcome given the current state of knowledge. This might not be the optimal strategy in the long term. Worded differently, under a fixed budget, it may be possible to reach a better overall solution with fewer experiments by not always selecting the highest expected improvement (EI) point at each intermediate step. Finally, it is difficult to thoroughly parallelize experiments with GE-IDACE because it is a sequential process that requires data acquisition and model updating as experimental results become available, although some approximate schemes exist [50].

In terms of future work and further improvements, we demonstrate in Ref. [51] that particle swarm optimization performs very well for EI computation. Many exciting opportunities for GEBHM and GE-IDACE to further improve the engineering design process remain to be discovered. In recent work, we demonstrate how to use GE-IDACE with multi-fidelity data sources (e.g., simulation vs. experiments) [52] and how to leverage legacy data from other designs in the GEBHM modeling process [11] to reduce the cost of running tests for new engine designs. GEBHM can also be extended to operate fluently across any type of data in terms of dimensionality and number of points, so that all the benefits of GEBHM and GE-IDACE can be leveraged at any scale. Toward this, an initial exploration of a parallelizable way to fit the GEBHM is found in Ref. [53]; it extends the size of datasets which GEBHM can fit by a factor of 5–10.

**References**

[1] Rasmussen CE. Gaussian Processes in Machine Learning. 2004. pp. 63-71

[2] Bishop CM. Pattern Recognition and Machine Learning. Springer; 2006

[3] Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB. Bayesian Data Analysis. Chapman and Hall/CRC; 2013

[4] Murphy KP. Machine Learning: A Probabilistic Perspective. MIT Press; 2012

[5] Seeger M. Gaussian processes for machine learning. International Journal of Neural Systems. 2004;**14**(02):69-106

[6] Kennedy MC, O'Hagan A. Bayesian calibration of computer models. Journal of the Royal Statistical Society, Series B: Statistical Methodology. 2001;**63**(3):425-464

[7] Kristensen J, Asher I, Ling Y, Ryan K, Subramaniyan A, Wang L. Predictive analytics with an advanced Bayesian modeling framework. MODSIM World. 2017

[8] Marzouk YM, Najm HN, Rahn LA. Stochastic spectral methods for efficient Bayesian solution of inverse problems. Journal of Computational Physics. 2007;**224**(2):560-586

[9] Subber W, Salvadori A, Lee S, Matous K. Uncertainty Quantification of the Reverse Taylor Impact Test and Localized Asynchronous Space-Time Algorithm. Bulletin of the American Physical Society; 2017. p. 62

[10] Sacks J, Welch WJ, Mitchell TJ, Wynn HP. Design and analysis of computer experiments. Statistical Science. 1989;**4**(4):409-423

[11] Ghosh S, Asher I, Kristensen J, Ling Y, Ryan K, Wang L. Bayesian multi-source modeling with legacy data. In: 2018 AIAA Non-Deterministic Approaches Conference. 2018. p. 1663

[12] Bilionis I, Zabaras N, Konomi BA, Lin G. Multi-output separable Gaussian process: Towards an efficient, fully Bayesian paradigm for uncertainty quantification. Journal of Computational Physics. 2013;**241**:212-239

[13] Simpson T, Toropov V, Balabanov V, Viana F. Design and analysis of computer experiments in multidisciplinary design optimization: A review of how far we have come-or not. In: 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference. 2008. p. 5802

[14] Viana FAC, Venter G, Balabanov V. An algorithm for fast optimal Latin hypercube design of experiments. International Journal for Numerical Methods in Engineering. 2010;**82**(2):135-156

[15] Kristensen J, Ling Y, Asher I, Wang L. Expected-improvement-based methods for adaptive sampling in multiobjective optimization problems. In: ASME. International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Volume 2B: 42nd Design Automation Conference; Charlotte, North Carolina, USA; August 21–24, 2016

[16] Chennimalai Kumar N, Subramaniyan AK, Wang L. Improving high-dimensional physics models through Bayesian calibration with uncertain data. In: ASME Turbo Expo 2012: Turbine Technical Conference and Exposition. American Society of Mechanical Engineers; 2012. pp. 407-416
