**4. Conclusions**

In this work, we have presented our approach to simulating the entire FSW process using a solid-mechanics formulation. With a mesh-free numerical method such as SPH, the large plastic deformation encountered during FSW can be computed directly, whereas mesh-based methods struggle to capture the full physics of the process because discretization errors grow as the mesh distorts. The fully coupled elastic-plastic-thermal code predicts temperature, stress, and deformation histories. Because of the mesh-free Lagrangian nature of SPH, the model can predict defects (free-surface changes) in a way that other numerical methods cannot. Defect prediction is an invaluable feature for an engineer designing the joint geometry to be welded: process parameters can then be chosen that lead to no weld defects. In this manner, the design engineer can find the fastest rate of advance, increasing the overall profit margin during a high-volume production run.
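To illustrate why SPH tolerates arbitrarily large deformation, the sketch below shows the core of the method: every field quantity is a kernel-weighted sum over neighboring particles, so there is no mesh connectivity to distort. This is a generic one-dimensional illustration with a standard cubic-spline kernel, not the paper's actual discretization; the function names and parameters are illustrative only.

```python
# Minimal SPH illustration: fields are kernel-weighted sums over
# neighbor particles, so no mesh exists that large plastic
# deformation could distort. Kernel and parameters are illustrative.

def cubic_spline_kernel(r, h):
    """Standard 1-D cubic-spline smoothing kernel W(r, h)."""
    q = r / h
    sigma = 2.0 / (3.0 * h)  # 1-D normalization constant
    if q < 1.0:
        return sigma * (1.0 - 1.5 * q**2 + 0.75 * q**3)
    if q < 2.0:
        return sigma * 0.25 * (2.0 - q)**3
    return 0.0

def density(xi, particles, h):
    """SPH density estimate: rho(x_i) = sum_j m_j W(|x_i - x_j|, h).

    `particles` is a list of (position, mass) tuples; only particles
    within the kernel support 2h contribute.
    """
    return sum(m * cubic_spline_kernel(abs(xi - xj), h)
               for xj, m in particles)
```

For uniformly spaced particles of spacing dx and mass rho0*dx, the interior density estimate recovers rho0 to within about 1%, which is the consistency property the summation relies on.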

One of the major advantages of a solid-mechanics approach over a fluid approach is that the simulation models capture the elastic stresses and strains. **Figure 13** shows the effective stress in the joint at the end of the plunge phase, the point at which the forge force reaches its maximum value. This is of great interest to a joint designer who wants to know whether the joint will withstand the forge force during welding. If the vertical members under the weld seam are too thin, they will likely undergo significant plastification and could collapse, which would be disastrous for the finished product. Other benefits of including the elastic stresses and strains are the ability to predict defect size and shape more precisely, as well as residual stresses and deformation following a cooldown phase.

**Figure 13.** Stress state at the end of the plunge phase.

Looking toward the future of numerical simulation of FSW, we can see that as the performance of GPUs continues to improve, larger and more complex simulation models will become possible. We are currently working on a multi-GPU parallelization strategy that will allow tens or even hundreds of millions of SPH elements to be simulated. This approach requires a highly optimized communication strategy between the GPUs (e.g., using MPI), and we are pursuing various further developments in the code.
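The communication pattern such a multi-GPU strategy needs can be sketched as a halo (ghost-particle) exchange: each rank owns a slab of the domain and imports copies of neighbor particles lying within one smoothing length of its slab boundary. The sketch below is a hypothetical pure-Python illustration under an assumed 1-D decomposition along the weld line; a real implementation would pack these lists into MPI send buffers or CUDA-aware transfers, and the function names are ours, not the paper's.

```python
# Hypothetical sketch of the halo exchange a multi-GPU SPH domain
# decomposition requires. Each "rank" owns a slab of particles and
# must receive ghost copies of neighbor particles within one
# smoothing length h of its slab boundary.

def decompose(particles, n_ranks, x_min, x_max):
    """Assign each particle (a dict with an 'x' coordinate) to a slab."""
    width = (x_max - x_min) / n_ranks
    slabs = [[] for _ in range(n_ranks)]
    for p in particles:
        r = min(int((p["x"] - x_min) / width), n_ranks - 1)
        slabs[r].append(p)
    return slabs, width

def halo_exchange(slabs, width, h, x_min):
    """Return, per rank, the ghost particles imported from neighbors.
    In a real MPI code these lists become send/receive buffers."""
    ghosts = [[] for _ in slabs]
    for r, slab in enumerate(slabs):
        lo = x_min + r * width
        hi = lo + width
        for p in slab:
            if r > 0 and p["x"] < lo + h:               # export to left neighbor
                ghosts[r - 1].append(p)
            if r < len(slabs) - 1 and p["x"] > hi - h:  # export to right neighbor
                ghosts[r + 1].append(p)
    return ghosts
```

Each rank then evaluates its SPH kernel sums over owned plus ghost particles, and the exchange repeats every time step as particles migrate between slabs.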


Since the simulation code is developed using a highly optimized parallel-processing strategy, complex 3D joint geometries can be simulated within a reasonable time. In this work, the three simulation models were run simultaneously on a personal workstation with three individual GPUs; the cost of such a computer is less than five thousand dollars in today's market. Because of the parallel strategy, a cluster with many GPUs can be used with 100% efficiency (as long as an individual GPU has enough memory for each simulation model). From an optimization standpoint, a company with access to a GPU compute cluster (say, eight or more GPUs) could run parametric models (e.g., varying the rpm and advance speed) simultaneously. The resulting data sets would provide the information required to construct a response surface and find the optimal advance speed and rpm.
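The parametric-study workflow described above can be sketched as follows: each (rpm, advance-speed) pair corresponds to one simulation run (dispatched to its own GPU on a cluster), and the collected results are searched for the fastest advance speed that still yields a defect-free weld. The functions, parameter values, and the toy defect criterion below are synthetic placeholders for illustration only, not results from the paper.

```python
# Hypothetical sketch of a parametric sweep over (rpm, advance speed).
# Each parameter pair would be one simulation model on its own GPU;
# here the runs are sequential and the "simulator" is a placeholder.
from itertools import product

def run_matrix(rpms, speeds, simulate):
    """Run one model per parameter pair and collect the outputs."""
    return {(rpm, v): simulate(rpm, v) for rpm, v in product(rpms, speeds)}

def best_defect_free(results):
    """Among defect-free runs, return the (rpm, speed) pair with the
    highest advance speed, the quantity the engineer wants to maximize."""
    ok = [(params, out) for params, out in results.items()
          if out["defect_free"]]
    if not ok:
        return None
    return max(ok, key=lambda item: item[0][1])[0]

def toy_simulate(rpm, speed):
    """Synthetic placeholder: welds stay defect-free only below a
    speed limit that grows with rpm. Not a real FSW model."""
    return {"defect_free": speed <= rpm / 100.0}
```

A full response surface would fit a smooth model (e.g., a quadratic) through these sampled points instead of picking the best sample directly, but the data-collection step is the same.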
