the effect of pin shape, contact friction, material and temperature flow. Their models are highly sophisticated, but are not able to predict residual stresses and defects. They noted that the calculation time is many weeks with their approach.

Bohjwani [8] used the SPH method to study the FSW process with the Johnson-Cook constitutive model in LS-DYNA. At the time, it was not possible to perform a coupled thermomechanical SPH simulation; as such, thermal softening was not taken into consideration. Timesli et al. [9] used the SPH method in two dimensions (2D) to simulate the FSW process. They used a fluid formulation that calculates the deviatoric stress directly from the strain rate and a temperature-dependent non-Newtonian viscosity. They showed that their model correlates well with an equivalent CFD model; however, they did not validate the model experimentally. Recently, Pan et al. [10] used the SPH method to solve the fully coupled thermomechanical problem for the FSW process in three dimensions (3D). Their approach gives detailed grain size, hardness, and microstructure evolution. However, they use a fluid-based formulation that does not allow the determination of elastic strains and stresses. Fraser et al. [11–13] have used the SPH method to simulate various FSW processes using a fully coupled thermomechanical SPH-FEM model. The tool is modeled with rigid finite elements and the workpieces with SPH. The model is able to predict temperatures, stresses, and defects all within a Lagrangian framework. This approach permits following the material point history throughout the entire welding process. Since the tool is modeled with finite elements, friction contact can be included.

In this chapter, we describe our approach toward simulating the entire FSW process using SPH on the GPU. In Section 2, we explain what SPH is and how the method can be used to solve large plastic deformation problems with an elastic-plastic formulation, including a description of our parallelization strategy on the GPU. Section 3 introduces the simulation model of a complex aluminum alloy joint. The simulation model will be used to show the power of the SPH method. A validation case is presented to show that the model is able to predict tool torque, force, and the temperature distribution, as well as the size and shape of the flash. Finally, Section 4 wraps up the chapter with concluding remarks and an outlook toward the future of FSW simulation.

32 Joining Technologies

**2. Simulation theory**

#### **2.1. Smoothed particle hydrodynamics**

Smoothed particle hydrodynamics (SPH) is an advanced Lagrangian mesh-free simulation method. The numerical technique has applications in a wide variety of dynamic problems such as astrophysics, magnetohydrodynamics (MHD), computational fluid dynamics, and computational solid mechanics (CSM). The method was originally proposed by two independent research groups within the same year. Gingold and Monaghan [14] showed that the method could be used to simulate nonspherical stars, and Lucy [15] used the method to test the theory of fission for rotating protostars. One of the first groups to apply the SPH method to solid mechanics was Libersky and Petschek [16] in 1991.

What makes the SPH method mesh-free is that the set of field equations (the conservation equations for a solid body in this case) is solved by interpolation using a kernel, *W*(*r*, *h*), over the set of neighbor particles *j* that lie within the influence domain of a particle of interest, *i*. **Figure 3** gives a graphical representation of this concept. In this method, the continuous field equations are "weakened" into a set of discrete ordinary differential equations. A continuous function is approximated by an interpolant through the use of a convolution integral:

$$f\left(\overline{\mathbf{x}}\right) = \int f\left(\overline{\mathbf{x}}'\right) W\left(\overline{\mathbf{x}} - \overline{\mathbf{x}}', h\right) d\overline{\mathbf{x}}' \tag{1}$$

**Figure 3.** Smoothed particle interpolation.

In Eq. (1), *W* is called the kernel, also commonly referred to as the smoothing function. It is a function of the spatial distance between the point at which the function is to be calculated (the calculation point, $\overline{\mathbf{x}}$) and the interpolation location ($\overline{\mathbf{x}}'$), and of the smoothing length, *h*. The kernel is the key to the SPH method. The continuous SPH interpolation equation can then be written for a set of discrete material points:

$$f\left(\mathbf{x}\_i^{\alpha}\right) = \sum\_{j=1}^{N\_i} \frac{m\_j}{\rho\_j} f\left(\mathbf{x}\_j^{\alpha}\right) W\left(r, h\right) \tag{2}$$

$\mathbf{x}\_i^{\alpha}$ is the spatial location vector of particle *i* and $\mathbf{x}\_j^{\alpha}$ that of the *j*th particle. $m\_j$ and $\rho\_j$ are the mass and density of the *j*th particle, and $r = |\mathbf{x}\_i^{\alpha} - \mathbf{x}\_j^{\alpha}|$. The interpolation kernel, *W*(*r*, *h*), will be written as $W\_{ij}$ throughout the rest of the chapter. The sum is taken over the total number ($N\_i$) of *j* particles within the influence domain of *i*; these are termed the neighbors of the *i*th particle. As a general rule, we will use tensor notation to describe variables in continuum equations and indicial notation for the discrete SPH equations. Subscripts are reserved to indicate the *i*th or *j*th particle, whereas superscripts follow the general rules of the Einstein notation. For example, the Cauchy stress tensor, $\overline{\overline{\sigma}}$, in this notation would be $\sigma\_i^{\alpha\beta}$ for the *i*th particle.
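As a rough illustration of Eq. (2), the sketch below evaluates the SPH sum at one particle of a regular cubic lattice. It assumes the three-dimensional hyperbolic spline kernel of Eq. (5); the function names, the unit lattice spacing, and the choice *h* = 1.5 Δx are illustrative assumptions, not values taken from this chapter.

```python
import math

def kernel(r, h):
    """Hyperbolic spline kernel of Eq. (5), three-dimensional form."""
    R = r / h
    alpha = 15.0 / (62.0 * math.pi * h ** 3)
    if R < 1.0:
        return alpha * (R ** 3 - 6.0 * R + 6.0)
    if R < 2.0:
        return alpha * (2.0 - R) ** 3
    return 0.0

def sph_interpolate(xi, points, masses, densities, values, h):
    """Eq. (2): sum of (m_j / rho_j) * f(x_j) * W(r, h) over the particles."""
    total = 0.0
    for xj, mj, rhoj, fj in zip(points, masses, densities, values):
        r = math.dist(xi, xj)
        total += (mj / rhoj) * fj * kernel(r, h)  # contributes zero for r >= 2h
    return total

# Regular lattice with spacing dx; every particle carries the volume dx^3.
# dx, rho0 and the ratio h/dx are assumed values chosen for illustration.
dx, rho0 = 1.0, 1.0
h = 1.5 * dx
points = [(i * dx, j * dx, k * dx)
          for i in range(-3, 4) for j in range(-3, 4) for k in range(-3, 4)]
masses = [rho0 * dx ** 3] * len(points)
densities = [rho0] * len(points)
ones = [1.0] * len(points)

# Interpolating the constant field f = 1 at the central particle should give
# a value close to 1; the remaining offset is the lattice discretization error.
f_center = sph_interpolate((0.0, 0.0, 0.0), points, masses, densities, ones, h)
```

The loop here runs over all particles for simplicity; a production code would restrict it to the neighbor list of particle *i*.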

Determining the neighbors list is a major part of the computational time in the SPH method. We have developed an efficient adaptive neighbor-searching algorithm (complete details in Fraser [17]). The adaptive search typically cuts the search time in half or better.
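The adaptive algorithm of [17] is not described in this section, so it is not reproduced here. As a stand-in, the sketch below shows a standard uniform-grid (cell-list) search, which is one common way to avoid comparing every particle with every other particle. All names are hypothetical, and the brute-force routine is included only as a reference for checking the result.

```python
import math
from collections import defaultdict
from itertools import product

def cell_index(p, cell_size):
    """Integer cell coordinates of point p on a uniform background grid."""
    return tuple(math.floor(c / cell_size) for c in p)

def find_neighbors(points, radius):
    """Return, for each particle i, the indices j with |x_i - x_j| < radius."""
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[cell_index(p, radius)].append(idx)

    neighbors = [[] for _ in points]
    for i, p in enumerate(points):
        ci = cell_index(p, radius)
        # With cell size equal to the search radius, only the 27 surrounding
        # cells can hold particles within one radius of particle i.
        for offset in product((-1, 0, 1), repeat=3):
            key = tuple(a + b for a, b in zip(ci, offset))
            for j in cells.get(key, ()):
                if j != i and math.dist(p, points[j]) < radius:
                    neighbors[i].append(j)
    return neighbors

def brute_force(points, radius):
    """Reference O(N^2) search used to check the grid-based result."""
    return [[j for j, q in enumerate(points)
             if j != i and math.dist(p, q) < radius]
            for i, p in enumerate(points)]
```

For the kernel of Eq. (5) the search radius would be 2*h*, since the kernel vanishes for *R* ≥ 2.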

In the SPH method, the gradient of a vector function can be shown to be simply the function multiplied by the gradient of the smoothing function:

$$\nabla f\left(\mathbf{x}\_i^{\alpha}\right) = \sum\_{j=1}^{N\_i} \frac{m\_j}{\rho\_j} f\left(\mathbf{x}\_j^{\alpha}\right) \nabla\_i W\_{ij} \tag{3}$$


A Mesh-Free Solid-Mechanics Approach for Simulating the Friction Stir-Welding Process


http://dx.doi.org/10.5772/64159


The evaluation of first derivatives is straightforward in the SPH method through the use of Eq. (3). The gradient of the smoothing function is given by

$$\nabla\_i W\_{ij} = \frac{\partial W\_{ij}}{\partial \overline{\mathbf{x}}\_i} = \frac{dW\left(r, h\right)}{dR} \left(\frac{1}{h}\right) \left(\frac{1}{r}\right) \left(\overline{\mathbf{x}}\_i - \overline{\mathbf{x}}\_j\right) \tag{4}$$

The smoothing function is typically normalized using the ratio $R = r / h = |\overline{\mathbf{x}}\_i - \overline{\mathbf{x}}\_j| / h$. The available choices of smoothing functions are vast, as this is an ongoing research topic. We have tested a number of different options such as the cubic spline by Monaghan [18], the quadratic function by Johnson and Beissel [19], the quintic function of Wendland [20], and the hyperbolic spline by Yang et al. [21], among others. Of those tested, we have found that the hyperbolic spline is well adapted for simulating friction stir welding with SPH. The function for simulations in three dimensions is defined as

$$W\_{ij} = \frac{15}{62 \pi h^3} \begin{cases} R^3 - 6R + 6, & 0 \le R < 1 \\\\ \left( 2 - R \right)^3, & 1 \le R < 2 \\\\ 0, & R \ge 2 \end{cases} \tag{5}$$
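Eqs. (4) and (5) can be checked numerically with a few lines of code. The sketch below implements the kernel, its derivative with respect to *R*, and the gradient, and then estimates the integral of the kernel over its spherical support, which should be close to one if the normalization factor 15/(62π*h*³) is right. The function names and the midpoint-rule integration are assumptions made for this illustration.

```python
import math

def alpha(h):
    """Normalization factor of Eq. (5) in three dimensions."""
    return 15.0 / (62.0 * math.pi * h ** 3)

def kernel(r, h):
    """Hyperbolic spline kernel, Eq. (5)."""
    R = r / h
    if R < 1.0:
        return alpha(h) * (R ** 3 - 6.0 * R + 6.0)
    if R < 2.0:
        return alpha(h) * (2.0 - R) ** 3
    return 0.0

def dkernel_dR(r, h):
    """Derivative of Eq. (5) with respect to R = r / h."""
    R = r / h
    if R < 1.0:
        return alpha(h) * (3.0 * R * R - 6.0)
    if R < 2.0:
        return -3.0 * alpha(h) * (2.0 - R) ** 2
    return 0.0

def grad_kernel(xi, xj, h):
    """Eq. (4): dW/dR * (1/h) * (1/r) * (x_i - x_j)."""
    d = [a - b for a, b in zip(xi, xj)]
    r = math.sqrt(sum(c * c for c in d))
    if r == 0.0:
        return [0.0, 0.0, 0.0]  # a particle exerts no gradient on itself
    factor = dkernel_dR(r, h) / (h * r)
    return [factor * c for c in d]

def kernel_volume_integral(h, n=20000):
    """Midpoint-rule estimate of the integral of W over the sphere r < 2h."""
    dr = 2.0 * h / n
    total = 0.0
    for k in range(n):
        r = (k + 0.5) * dr
        total += 4.0 * math.pi * r * r * kernel(r, h) * dr
    return total
```

Note that the derivative of this kernel does not vanish as *R* approaches zero, which is why the gradient routine treats the coincident case *r* = 0 separately.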

#### **2.2. Coupled thermal mechanics SPH formulation for FSW**

In this section, the solid-mechanics formulation of smoothed particle hydrodynamics that is used in this work is outlined. The formulation bears close resemblance to that of a fluid approach; however, the main difference is the ability to account for elastic and plastic deformation. Liu and Liu [22] as well as Violeau [23] provided in-depth development of the SPH conservation equations.

#### *2.2.1. Conservation equations*


In order to simulate the FSW process, we must discretize conservation of mass, momentum, and energy using the SPH method previously outlined. We use the weakly compressible approach that is common for large deformation problems (e.g., see [24–27]). Fundamentally, for a system described by particles, mass is inherently conserved at the particle level. It follows then that mass would be conserved for a set of rigid particles (incompressible) that make up a system. On the other hand, for a system made up of non-rigid (compressible) particles, we must take into account the spatial and temporal change of mass, *m*, within an infinitesimal volume. A convenient measure of this change is the local density, *ρ*=*m*/*V*, of an element within the infinitesimal volume. The conservation of mass for a temporally changing compressible system is

$$\frac{d\rho}{dt} + \nabla \cdot \rho \overline{v} = 0 \tag{6}$$

where *t* is time and *v*¯ is velocity. Using the definitions outlined in Eqs. (1)–(5), we can now write the discrete equation for Eq. (6) as

$$\frac{d\rho\_i}{dt} = -\rho\_i \sum\_{j=1}^{N\_i} \frac{m\_j}{\rho\_j} v\_{ji}^{\beta} \frac{\partial W\_{ij}}{\partial x\_i^{\beta}} \tag{7}$$

where $N\_i$ is the number of neighbors of the *i*th particle and $v\_{ji}^{\beta} = v\_j^{\beta} - v\_i^{\beta}$. There are other forms of conservation of mass in the SPH method; this form is found to be robust and has the added benefit that it provides improved results for a system with significant spatial variation of density, such as in multi-phase problems. The continuum mechanics description of conservation of momentum for a solid body is

$$\frac{d\overline{v}}{dt} = \frac{1}{\rho} \nabla \cdot \overline{\overline{\sigma}} + \frac{1}{m} \overline{F}\_{\text{ext}} + \overline{b} \tag{8}$$

Equation (8) describes the change in velocity (acceleration) of a material point in a solid body subject to internal forces due to stress, $\overline{\overline{\sigma}}$, external forces, $\overline{F}\_{\text{ext}}$ (acting on the surface of the body, such as contact forces), and body forces, $\overline{b}$ (such as thermal expansion). Gravity is not considered in the formulation as its effects are not significant during the welding process. Now, we are ready to translate the momentum equation for a continuum to the discrete SPH form:

$$\frac{dv\_i^{\alpha}}{dt} = \sum\_{j=1}^{N\_i} m\_j \left( \frac{\sigma\_i^{\alpha\beta}}{\rho\_i^2} + \frac{\sigma\_j^{\alpha\beta}}{\rho\_j^2} \right) \frac{\partial W\_{ij}}{\partial x\_i^{\beta}} + \frac{1}{m\_i} \left( F\_{\text{ext}} \right)\_i^{\alpha} + b\_i^{\alpha} \tag{9}$$


This version of the momentum equation is commonly called the symmetric form since the pairwise particle interactions are balanced. Moreover, this form exactly conserves linear momentum.
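The discrete mass and momentum equations, Eqs. (7) and (9), can be sketched together as follows. The example uses four particles with made-up positions, masses, densities, and stress tensors (all assumed, non-dimensional values) and checks the property stated above: the pairwise internal forces cancel, so the net internal force is zero to round-off. External and body forces are left out, and the loops run over all particles rather than over a neighbor list.

```python
import math

def grad_kernel(xi, xj, h):
    """Gradient of the hyperbolic spline kernel, Eqs. (4) and (5), in 3D."""
    d = [a - b for a, b in zip(xi, xj)]
    r = math.sqrt(sum(c * c for c in d))
    R = r / h
    if r == 0.0 or R >= 2.0:
        return [0.0, 0.0, 0.0]
    alpha = 15.0 / (62.0 * math.pi * h ** 3)
    dW_dR = alpha * (3.0 * R * R - 6.0) if R < 1.0 else -3.0 * alpha * (2.0 - R) ** 2
    return [dW_dR / (h * r) * c for c in d]

def density_rate(i, x, v, m, rho, h):
    """Eq. (7): d(rho_i)/dt = -rho_i * sum_j (m_j/rho_j) v_ji . grad_i W_ij."""
    total = 0.0
    for j in range(len(x)):
        if j == i:
            continue
        gw = grad_kernel(x[i], x[j], h)
        v_ji = [vj - vi for vj, vi in zip(v[j], v[i])]
        total += (m[j] / rho[j]) * sum(a * b for a, b in zip(v_ji, gw))
    return -rho[i] * total

def stress_acceleration(i, x, m, rho, sigma, h):
    """Internal-force part of Eq. (9) for particle i (no F_ext, no body force)."""
    acc = [0.0, 0.0, 0.0]
    for j in range(len(x)):
        if j == i:
            continue
        gw = grad_kernel(x[i], x[j], h)
        for a in range(3):
            for b in range(3):
                pair = sigma[i][a][b] / rho[i] ** 2 + sigma[j][a][b] / rho[j] ** 2
                acc[a] += m[j] * pair * gw[b]
    return acc

# Assumed test data: four particles in a uniformly expanding velocity field.
x = [(0.0, 0.0, 0.0), (0.9, 0.1, 0.0), (0.2, 1.0, 0.3), (0.8, 0.7, 0.9)]
v = [(0.1 * px, 0.1 * py, 0.1 * pz) for px, py, pz in x]
m = [1.0, 1.2, 0.8, 1.1]
rho = [1.0, 1.1, 0.9, 1.05]
sigma = [[[-p, s, 0.0], [s, -p, 0.0], [0.0, 0.0, -p]]
         for p, s in [(1.0, 0.2), (1.5, -0.1), (0.7, 0.3), (1.2, 0.0)]]
h = 1.0

rates = [density_rate(i, x, v, m, rho, h) for i in range(len(x))]
acc = [stress_acceleration(i, x, m, rho, sigma, h) for i in range(len(x))]
# Net internal force: the pairwise contributions are equal and opposite.
net_force = [sum(m[i] * acc[i][a] for i in range(len(x))) for a in range(3)]
```

Because the material is expanding in this example, every entry of `rates` is negative, as expected from Eq. (7).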

In order to simulate the FSW process, we must take into account the change in energy in the system due to conversion of internal energy (plastic deformation) and frictional heating. The standard energy equation for a weakly compressible body takes on the same form as the heat diffusion equation:

$$\frac{\partial T}{\partial t} = \frac{1}{\rho C\_p} \left( k \nabla^2 T + \dot{q} \right) \tag{10}$$

Equation (10) provides the temporal change in temperature, *T*, in a solid body due to the diffusion of thermal energy. $C\_p$ is the heat capacity, *k* is the thermal conductivity, and $\dot{q}$ takes into account heat generation and dissipation due to plastic deformation, frictional heating, convection, and radiation. The discrete SPH approximation of Eq. (10) is (see Cleary and Monaghan [28])

$$\frac{dT\_i}{dt} = \frac{1}{\rho\_i C\_{pi}} \sum\_{j=1}^{N\_i} \frac{m\_j}{\rho\_j} \frac{4 k\_i k\_j}{k\_i + k\_j} \frac{T\_i - T\_j}{\left|\mathbf{x}\_{ij}\right|^2} x\_{ij}^{\beta} \frac{\partial W\_{ij}}{\partial x\_i^{\beta}} + \dot{q}\_i \tag{11}$$
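The conduction sum of Eq. (11) can be tested against a case with a known answer. In the sketch below, particles sit on a regular lattice and carry the temperature field *T* = *x*², for which the exact value of *k*∇²*T* is 2 when *k* = 1. Unit material properties, the lattice, and *h* = 1.5 Δx are assumptions made for this check, and the source term $\dot{q}\_i$ is omitted.

```python
import math

def grad_kernel(xi, xj, h):
    """Gradient of the hyperbolic spline kernel, Eqs. (4) and (5), in 3D."""
    d = [a - b for a, b in zip(xi, xj)]
    r = math.sqrt(sum(c * c for c in d))
    R = r / h
    if r == 0.0 or R >= 2.0:
        return [0.0, 0.0, 0.0]
    alpha = 15.0 / (62.0 * math.pi * h ** 3)
    dW_dR = alpha * (3.0 * R * R - 6.0) if R < 1.0 else -3.0 * alpha * (2.0 - R) ** 2
    return [dW_dR / (h * r) * c for c in d]

def conduction_rate(i, x, T, m, rho, k, cp, h):
    """Conduction part of Eq. (11) for particle i (heat source term omitted)."""
    total = 0.0
    for j in range(len(x)):
        if j == i:
            continue
        x_ij = [a - b for a, b in zip(x[i], x[j])]
        r2 = sum(c * c for c in x_ij)
        gw = grad_kernel(x[i], x[j], h)
        k_pair = 4.0 * k[i] * k[j] / (k[i] + k[j])
        x_dot_gw = sum(a * b for a, b in zip(x_ij, gw))
        total += (m[j] / rho[j]) * k_pair * (T[i] - T[j]) / r2 * x_dot_gw
    return total / (rho[i] * cp[i])

# Assumed setup: unit spacing, unit density, conductivity and heat capacity.
dx = 1.0
h = 1.5 * dx
x = [(a * dx, b * dx, c * dx)
     for a in range(-3, 4) for b in range(-3, 4) for c in range(-3, 4)]
n = len(x)
m, rho, k, cp = [dx ** 3] * n, [1.0] * n, [1.0] * n, [1.0] * n
T = [p[0] ** 2 for p in x]  # T = x^2, so the exact rate is k * 2 / (rho * cp) = 2
center = x.index((0.0, 0.0, 0.0))
rate = conduction_rate(center, x, T, m, rho, k, cp, h)
```

With equal conductivities the pair factor 4*k*<sub>i</sub>*k*<sub>j</sub>/(*k*<sub>i</sub> + *k*<sub>j</sub>) reduces to 2*k*; its harmonic-mean form only matters where particles of different materials interact.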

Although frictional heating, convection, and radiation are surface integrals, we have found that these terms can be approximated as volume integrals without any loss of precision for the FSW simulations. In this sense, the heat generation and dissipation take on the following form:

$$\begin{split} \dot{q}\_{i} = {} & \frac{1}{\rho\_{i} C\_{pi}} \left( \sigma\_{i}^{\alpha\beta} \dot{\varepsilon}\_{pi}^{\alpha\beta} + \frac{\lambda\_{i}}{V\_{i}} F\_{Ti}^{\alpha} v\_{Tij}^{\alpha} \right) \\\\ & + \frac{A\_{i}}{m\_{i} C\_{pi}} \left( h\_{\text{conv}i} \left( T\_{\infty} - T\_{i} \right) + \varepsilon\_{i} \sigma\_{\text{SB}} \left( T\_{\text{surr}i}^{4} - T\_{i}^{4} \right) \right) \end{split} \tag{12}$$

where $\dot{\varepsilon}\_{pi}^{\alpha\beta}$ is the plastic strain rate tensor, $V\_i = m\_i / \rho\_i$ is the particle volume, $F\_{Ti}^{\alpha}$ is the tangential force from sliding contact (we have used a constant coefficient of friction with the standard Coulomb friction law), $v\_{Tij}^{\alpha} = v\_{Ti}^{\alpha} - v\_{Tj}^{\alpha}$ is the relative tangential velocity at the contact surface, $h\_{\text{conv}i}$ is the coefficient of convection, $A\_i$ is the equivalent surface area of a particle, taken to be $h^2$, $\varepsilon\_i$ is the emissivity of the workpieces, $\sigma\_{\text{SB}}$ is the Stefan-Boltzmann constant, $T\_{\infty}$ is the temperature of the ambient air, and $T\_{\text{surr}}$ is the (average) temperature of the surroundings. The friction heat is distributed into the workpieces (*i*th particle) and the tool (*j*th finite element) by the $\lambda\_i$ parameter:

$$\lambda\_{i} = \frac{\sqrt{k\_{i} C\_{pi} \rho\_{i}}}{\sqrt{k\_{i} C\_{pi} \rho\_{i}} + \sqrt{k\_{j} C\_{pj} \rho\_{j}}} \tag{13}$$

Certainly, the heat loss and gain at the surface (convection, radiation, and friction heating) can be evaluated accordingly as surface integrals. However, we have found that the added complexity does not lead to improved precision for the FSW models that we have considered. In our experimental work, the surfaces of the workpieces are painted black to improve the quality of the image taken with an infrared camera. Note that for unpainted aluminum, the emissivity is very low (often less than 0.1); however, for a painted plate, the emissivity is ~0.95. Because of this, radiation effects are significant and should not be disregarded in the energy balance.
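The effect of the emissivity on the surface terms of Eq. (12), and the heat partition of Eq. (13), can be illustrated with a short calculation. In the sketch below, the material properties, particle size, convection coefficient, and temperatures are round illustrative numbers chosen for this example; they are assumptions and are not taken from this chapter. Only the Stefan-Boltzmann constant is a fixed physical value.

```python
import math

SIGMA_SB = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def heat_partition(k_i, cp_i, rho_i, k_j, cp_j, rho_j):
    """Eq. (13): share of the friction heat assigned to side i."""
    e_i = math.sqrt(k_i * cp_i * rho_i)
    e_j = math.sqrt(k_j * cp_j * rho_j)
    return e_i / (e_i + e_j)

def surface_heat_rate(T, area, mass, cp, h_conv, T_inf, emissivity, T_surr):
    """Convection and radiation part of Eq. (12), as a temperature rate in K/s."""
    convection = h_conv * (T_inf - T)
    radiation = emissivity * SIGMA_SB * (T_surr ** 4 - T ** 4)
    return area / (mass * cp) * (convection + radiation)

# Assumed, order-of-magnitude properties (SI units) for an aluminum workpiece
# and a steel tool; replace with measured values for a real model.
workpiece = dict(k=167.0, cp=896.0, rho=2700.0)
tool = dict(k=25.0, cp=460.0, rho=7800.0)

lam_workpiece = heat_partition(workpiece["k"], workpiece["cp"], workpiece["rho"],
                               tool["k"], tool["cp"], tool["rho"])
lam_tool = heat_partition(tool["k"], tool["cp"], tool["rho"],
                          workpiece["k"], workpiece["cp"], workpiece["rho"])

# Surface particle with smoothing length h, so A_i = h^2 (as in the text) and
# m_i is taken as rho * h^3 purely for illustration.
h = 1.0e-3
area, mass = h ** 2, workpiece["rho"] * h ** 3
common = dict(T=700.0, area=area, mass=mass, cp=workpiece["cp"],
              h_conv=20.0, T_inf=293.0, T_surr=293.0)
rate_painted = surface_heat_rate(emissivity=0.95, **common)
rate_bare = surface_heat_rate(emissivity=0.10, **common)
```

Both rates are negative (the hot surface loses heat), and the painted surface cools faster because its radiation term is 9.5 times larger than that of the bare surface at the same temperature.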

#### *2.2.2. Stress and strain in SPH*


The stress state in the material can be updated using a frame-indifferent (objective) stress rate equation. Many different stress rates can be used, such as the Truesdell, Green-Naghdi, or Jaumann rates (others exist). The Jaumann rate has a relatively simple formulation, which makes it straightforward to implement in a CSM code. The rate equation is

$$\dot{\overline{\overline{S}}} = 2G \left( \dot{\overline{\overline{\varepsilon}}} - \frac{1}{3} \text{tr} \left( \dot{\overline{\overline{\varepsilon}}} \right) \overline{\overline{\delta}} \right) + \overline{\overline{S}} \, \overline{\overline{\Omega}}^{\top} + \overline{\overline{\Omega}} \, \overline{\overline{S}} \tag{14}$$

$\dot{\overline{\overline{S}}}$ is the time rate of change of the deviatoric stress, *G* is the shear modulus of the material, $\dot{\overline{\overline{\varepsilon}}}$ and $\overline{\overline{\Omega}}$ are the strain rate and spin tensors, respectively, and $\overline{\overline{\delta}}$ is the Kronecker delta. The rate equation can be transformed into the discrete SPH formulation by using the discrete form of the strain rate and spin tensors:

$$\dot{S}^{\alpha\beta} = 2G \left( \dot{\varepsilon}^{\alpha\beta} - \frac{1}{3} \delta^{\alpha\beta} \dot{\varepsilon}^{\gamma\gamma} \right) + S^{\alpha\gamma} \Omega^{\beta\gamma} + \Omega^{\alpha\gamma} S^{\gamma\beta} \tag{15}$$

The strain rate is found from

$$\dot{\varepsilon}\_{i}^{\alpha\beta} = \frac{1}{2} \sum\_{j=1}^{N\_{i}} \left( \frac{m\_{j}}{\rho\_{j}} v\_{ji}^{\alpha} \frac{\partial W\_{ij}}{\partial x\_{i}^{\beta}} + \frac{m\_{j}}{\rho\_{j}} v\_{ji}^{\beta} \frac{\partial W\_{ij}}{\partial x\_{i}^{\alpha}} \right) - \beta\_{\text{expand}i} \dot{T}\_{i} \delta^{\alpha\beta} \tag{16}$$


The $\beta\_{\text{expand}i} \dot{T}\_i \delta^{\alpha\beta}$ term takes into account the thermal strain rate and allows us to include thermal expansion. $\beta\_{\text{expand}}$ is the coefficient of volumetric expansion of the material. The SPH form of the spin rate is

$$\Omega\_{i}^{\alpha\beta} = \frac{1}{2} \sum\_{j=1}^{N\_{i}} \left( \frac{m\_{j}}{\rho\_{j}} v\_{ji}^{\alpha} \frac{\partial W\_{ij}}{\partial x\_{i}^{\beta}} - \frac{m\_{j}}{\rho\_{j}} v\_{ji}^{\beta} \frac{\partial W\_{ij}}{\partial x\_{i}^{\alpha}} \right) \tag{17}$$
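Eqs. (15)–(17) share the same building block, the sum of $(m\_j/\rho\_j) v\_{ji}^{\alpha} \partial W\_{ij}/\partial x\_i^{\beta}$ over the neighbors. The sketch below computes it once and derives the strain rate, the spin, and the Jaumann stress rate from it. As a check it uses a rigid rotation about the *z* axis, for which the strain rate should vanish and the spin should recover the angular velocity. The lattice, *h* = 1.5 Δx, the shear modulus of 1, and the stress values are assumptions chosen for this illustration.

```python
import math

def grad_kernel(xi, xj, h):
    """Gradient of the hyperbolic spline kernel, Eqs. (4) and (5), in 3D."""
    d = [a - b for a, b in zip(xi, xj)]
    r = math.sqrt(sum(c * c for c in d))
    R = r / h
    if r == 0.0 or R >= 2.0:
        return [0.0, 0.0, 0.0]
    alpha = 15.0 / (62.0 * math.pi * h ** 3)
    dW_dR = alpha * (3.0 * R * R - 6.0) if R < 1.0 else -3.0 * alpha * (2.0 - R) ** 2
    return [dW_dR / (h * r) * c for c in d]

def velocity_gradient(i, x, v, m, rho, h):
    """L[a][b] = sum_j (m_j/rho_j) v_ji^a dW_ij/dx_i^b, used by Eqs. (16)-(17)."""
    L = [[0.0] * 3 for _ in range(3)]
    for j in range(len(x)):
        if j == i:
            continue
        gw = grad_kernel(x[i], x[j], h)
        vol = m[j] / rho[j]
        for a in range(3):
            v_ji = v[j][a] - v[i][a]
            for b in range(3):
                L[a][b] += vol * v_ji * gw[b]
    return L

def strain_rate(L, beta_expand=0.0, T_dot=0.0):
    """Eq. (16): symmetric part of L minus the thermal strain rate."""
    return [[0.5 * (L[a][b] + L[b][a]) - (beta_expand * T_dot if a == b else 0.0)
             for b in range(3)] for a in range(3)]

def spin(L):
    """Eq. (17): antisymmetric part of L."""
    return [[0.5 * (L[a][b] - L[b][a]) for b in range(3)] for a in range(3)]

def jaumann_rate(S, eps, omega, G):
    """Eq. (15): rate of change of the deviatoric stress."""
    trace = eps[0][0] + eps[1][1] + eps[2][2]
    rate = [[0.0] * 3 for _ in range(3)]
    for a in range(3):
        for b in range(3):
            dev = eps[a][b] - (trace / 3.0 if a == b else 0.0)
            rot = sum(S[a][g] * omega[b][g] + omega[a][g] * S[g][b] for g in range(3))
            rate[a][b] = 2.0 * G * dev + rot
    return rate

# Assumed check case: rigid rotation with angular rate w about the z axis.
dx, w = 1.0, 0.5
h = 1.5 * dx
x = [(a * dx, b * dx, c * dx)
     for a in range(-3, 4) for b in range(-3, 4) for c in range(-3, 4)]
v = [(-w * p[1], w * p[0], 0.0) for p in x]
m, rho = [dx ** 3] * len(x), [1.0] * len(x)
center = x.index((0.0, 0.0, 0.0))

L = velocity_gradient(center, x, v, m, rho, h)
eps, omega = strain_rate(L), spin(L)
S = [[100.0, 20.0, 0.0], [20.0, -50.0, 0.0], [0.0, 0.0, -50.0]]
S_dot = jaumann_rate(S, eps, omega, G=1.0)
```

In this rotation-only case the stress rate comes entirely from the two spin terms of Eq. (15), which rotate the stress tensor with the material without changing its invariants.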

As the SPH method used is that of a weakly compressible approach, an equation of state is required to link the pressure, *p*, to the density, *ρ*, and the speed of sound, *c*:

$$p\_i = c^2 \left(\rho\_i - \rho\_{i0}\right) \tag{18}$$
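Eq. (18) is simple enough to state directly in code. The sketch below assumes that $\rho\_{i0}$ is the reference (initial) density of the particle; the density and sound speed are round illustrative values, not material data from this chapter.

```python
def pressure(rho, rho_ref, c):
    """Eq. (18): linear equation of state for a weakly compressible material."""
    return c * c * (rho - rho_ref)

# Assumed illustrative values in SI units.
rho_ref = 2700.0   # reference density, kg/m^3
c = 5000.0         # speed of sound, m/s

p_rest = pressure(rho_ref, rho_ref, c)               # undeformed material
p_compressed = pressure(1.01 * rho_ref, rho_ref, c)  # 1% increase in density
p_expanded = pressure(0.99 * rho_ref, rho_ref, c)    # 1% decrease in density
```

Compression gives a positive pressure and expansion a negative one, with a magnitude that grows with the square of the sound speed.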

Plasticity is included in the simulation by using an elastic-perfectly-plastic, thermal-softening flow stress model of the form

$$
\sigma\_y \left( T \right) = \sigma\_{y0} \left( 1 - \left( \frac{T - T\_{R}}{T\_{\text{melt}} - T\_{R}} \right)^m \right) \tag{19}
$$

Here, $\sigma\_{y0}$ is the room temperature yield strength, $T\_{R}$ and $T\_{\text{melt}}$ are the room and melt temperatures, respectively, and *m* is the thermal-softening exponent. Plasticity is accounted for using the radial return algorithm (see [29–31] for further details).

#### **2.3. Parallelization strategy on the GPU**

Many types of engineering simulations require a large amount of computational time due to the complexity of the numerical model and/or the sheer size of the computational domain. In the case of friction stir welding, capturing all the aspects of the process requires a multi-physics approach that is very computationally burdensome. A typical FSW simulation can take many days or even weeks running on a single processing unit (sequential approach). For this reason, it is critical to find an efficient means to run the simulation code in parallel. The idea is to split the domain into subregions and assign them to individual processing units.

There are a number of different parallelization strategies that can be used. A popular method for small- to medium-sized models is to use a shared memory parallel (SMP) approach wherein each processor has its own set of tasks, but the processors share memory. In this sense, all the simulation data are stored in a common memory location. OpenMP is a widely used directive-based programming interface that can be used for SMP codes running on central processing units (CPUs).

For larger models, a different tactic is often employed with a large number of CPUs, whereby the model and the data in memory are split up and assigned to individual compute "nodes." This approach is called distributed-memory parallel and requires the individual compute "nodes" to be linked by a network. A message-passing interface (MPI) can be used to provide the communication.

Another parallelization strategy that has become very popular is to use the graphics processing unit (GPU). Today's GPUs have hundreds and in most cases thousands of "cores". **Figure 4** shows a schematic of the architecture of a typical GPU. We can see that each multiprocessor is composed of a large number of "thread processors". The GPU has its own memory, called global memory, that is accessed by all the multiprocessors. For this reason, as much of the code as possible should be programmed on the GPU to limit the amount of data transfer between the CPU and GPU.

**Figure 4.** GPU architecture (adapted from NVIDIA [32] and Ruetsch and Fatica [33]).

In the case of simulating the FSW process with SPH, the GPU is ideally suited for parallelization. The large number of streaming multiprocessors on a GPU is well matched to the computationally heavy nature of SPH. SPH codes written to take advantage of the GPU can typically achieve speedup factors of 20–100× over an equivalent serial CPU code (e.g., see Dalrymple et al. [34]). In some cases, speedup factors of over 150× are possible, although these are typically problems that are set up to fully exploit the architecture of a specific GPU.

Our parallelization strategy for the SPH code on the GPU is to assign each particle to a thread processor. In this sense, a thread will then carry out a set of calculations for a single particle. The number of threads that can run in parallel is hardware and code specific, but is typically in the multi-thousand range. Certainly, there are different parallelization strategies for SPH on the GPU; however, we have found this approach to be straightforward and efficient.
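The one-particle-per-thread mapping used for the GPU code can be mimicked on the CPU to show its data-access pattern. The sketch below is not GPU code and is not the authors' implementation: it uses Python threads, which give no real speedup for this kind of work, and a summation density (the sum of $m\_j W\_{ij}$) as a stand-in workload that does not appear in this chapter. What it does show is the key property of the mapping: each task reads shared particle data but produces only the result for its own particle, so no two tasks write to the same location. In a CUDA kernel, the particle index would typically be computed from the block and thread indices instead of being passed in.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def kernel(r, h):
    """Hyperbolic spline kernel of Eq. (5), three-dimensional form."""
    R = r / h
    alpha = 15.0 / (62.0 * math.pi * h ** 3)
    if R < 1.0:
        return alpha * (R ** 3 - 6.0 * R + 6.0)
    if R < 2.0:
        return alpha * (2.0 - R) ** 3
    return 0.0

def particle_task(i, x, m, h):
    """Work done by the 'thread' that owns particle i.

    The task only reads the shared arrays x and m and returns its own result,
    so tasks for different particles never write to the same memory location.
    """
    return sum(m[j] * kernel(math.dist(x[i], x[j]), h) for j in range(len(x)))

def run_serial(x, m, h):
    """Sequential reference: one particle after the other."""
    return [particle_task(i, x, m, h) for i in range(len(x))]

def run_parallel(x, m, h, workers=4):
    """Same calculation with one independent task per particle."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: particle_task(i, x, m, h), range(len(x))))

# Assumed test data: a small lattice of unit-mass particles.
x = [(float(a), float(b), float(c))
     for a in range(4) for b in range(4) for c in range(4)]
m = [1.0] * len(x)
h = 1.5

density_serial = run_serial(x, m, h)
density_parallel = run_parallel(x, m, h)
```

Because every task performs the same arithmetic in the same order regardless of scheduling, the parallel result is identical to the serial one.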

Another parallelization strategy that has become very popular is to use the graphics processing unit (GPU). Today's GPUs have hundreds and in most cases thousands of "cores". **Figure 4** shows a schematic of the architecture of a typical GPU. Each multiprocessor is composed of a large number of "thread processors". The GPU has its own memory, called global memory, that is accessed by all the multiprocessors. Because this memory is separate from the CPU's memory, as much of the code as possible should run on the GPU to limit the amount of data transferred between the CPU and GPU.

**Figure 4.** GPU architecture (adapted from NVIDIA [32] and Ruetsch and Fatica [33]).

In the case of simulating the FSW process with SPH, the GPU is ideally suited for parallelization. The large number of streaming multiprocessors on a GPU is well matched to the computationally heavy nature of SPH. SPH codes written to take advantage of the GPU can typically achieve speedup factors of 20–100× over an equivalent serial CPU code (e.g., see Dalrymple et al. [34]). In some cases, speedup factors of over 150× are possible, although these are typically problems set up to fully exploit the architecture of a specific GPU.

Our parallelization strategy for the SPH code on the GPU is to assign each particle to a thread processor, so that a single thread carries out the set of calculations for a single particle. The number of threads that can run in parallel is hardware- and code-specific, but is typically in the multi-thousand range. There are certainly other parallelization strategies for SPH on the GPU; however, we have found this approach to be straightforward and efficient.
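The one-particle-per-thread idea can be sketched on the CPU as follows. This is a serial Python stand-in, not GPU code: `launch` simply calls the per-particle function once for every index, which is the work a GPU would distribute across its threads. The density summation, the 1D cubic spline kernel, and the brute-force neighbor loop are assumptions chosen to keep the example short; they are not necessarily the quantities or the search method used in the authors' implementation.

```python
def cubic_spline_1d(r, h):
    """Standard 1D cubic spline smoothing kernel W(r, h)."""
    q = abs(r) / h
    sigma = 2.0 / (3.0 * h)  # 1D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q ** 2 + 0.75 * q ** 3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q) ** 3
    return 0.0


def density_kernel(i, x, mass, h, rho):
    """Work done by the thread that owns particle i."""
    total = 0.0
    for j in range(len(x)):  # brute-force neighbor loop, for brevity only
        total += mass[j] * cubic_spline_1d(x[i] - x[j], h)
    rho[i] = total  # each thread writes only to its own particle


def launch(kernel, n_particles, *args):
    """Serial stand-in for a GPU kernel launch: one call per thread index."""
    for i in range(n_particles):
        kernel(i, *args)


# Eleven equally spaced particles with spacing 0.1 and mass 100 each.
x = [0.1 * k for k in range(11)]
mass = [100.0] * 11
rho = [0.0] * 11
launch(density_kernel, 11, x, mass, 0.1, rho)
```

Because every call reads shared particle data but writes only `rho[i]`, the calls are independent of one another and can run in any order, which is what makes this mapping straightforward to execute in parallel. In this example the interior particles recover a density of approximately 1000, while particles near the two ends see fewer neighbors and therefore a lower value.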
