**5. Computational performance of the problem**

According to Fahmy [35], the computational performance of a simulation can be assessed in terms of the computation and communication processes, the underlying hardware elements, and the functional computation. The main objective of the proposed technique is to use efficient preconditioners that improve the overall CPU utilization of the cluster, accelerate the iterative method, and reduce the input/output and interprocessor communication costs. Fahmy [35] also compared communication-avoiding Krylov methods, which are based on s-step Krylov methods, with their corresponding standard Krylov methods: communication-avoiding generalized minimal residual (CA-GMRES) with the GMRES method of Saad and Schultz [92], communication-avoiding Arnoldi (CA-Arnoldi) with the method of Arnoldi [93], and communication-avoiding Lanczos (CA-Lanczos) with the method of Lanczos [94]. CA-Arnoldi, also called the Arnoldi(*s*, *t*) algorithm, differs from the standard Arnoldi(*s*) algorithm, which corresponds to (*s*, *t* = 1), where *s* is the number of inner iteration steps and *t* is the number of outer iteration steps. According to [35], CA-Arnoldi owes its numerical stability, convergence, and performance to the implementation of the algorithm shown in **Figure 2**, which is based on the QR factorization update and either the block classical Gram-Schmidt (block CGS) approach or the block modified Gram-Schmidt (block MGS) approach, where

$$V\_k = [v\_{sk+1}, v\_{sk+2}, \dots, v\_{sk+s}] \tag{79}$$

**Figure 2.** *CA-Arnoldi iteration algorithm.* *A Novel MDD-Based BEM Model for Transient 3T Nonlinear Thermal Stresses in FGA Smart… DOI: http://dx.doi.org/10.5772/intechopen.92829*

and

$$\mathbf{Q}\_k = [\mathbf{Q}\_0, \mathbf{Q}\_1, \dots, \mathbf{Q}\_{k-1}] \tag{80}$$

*Advanced Functional Materials*

An explicit staggered algorithm based on communication-avoiding Arnoldi, as described in Hoemmen [91], is very suitable for efficient implementation in Matlab (R2019a), with the aim of specifically improving its performance for the solution of the resulting linear algebraic systems.
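The two-phase structure behind an s-step basis such as the one in Eq. (79) can be illustrated with a minimal sketch. It is not the implementation of [35] or [91]: it assumes a monomial basis (repeated matrix-vector products), a small dense matrix stored as nested lists, and plain modified Gram-Schmidt in place of the QR update with block CGS or block MGS. The names `matvec`, `dot`, and `s_step_basis` are hypothetical helpers introduced only for this illustration.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(x, y):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(x, y))


def s_step_basis(A, v, s):
    """Return s + 1 orthonormal vectors spanning span{v, Av, ..., A^s v}.

    Assumes the Krylov vectors are linearly independent (no breakdown).
    """
    # Phase 1: generate the whole block of basis vectors first.
    # No inner products (global reductions) are needed in this phase.
    W = [list(v)]
    for _ in range(s):
        W.append(matvec(A, W[-1]))

    # Phase 2: orthogonalize the block afterwards.
    # Modified Gram-Schmidt is used here as a simple stand-in.
    Q = []
    for w in W:
        for q in Q:
            h = dot(q, w)
            w = [wi - h * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        Q.append([wi / norm for wi in w])
    return Q


if __name__ == "__main__":
    A = [[2.0, 1.0, 0.0, 0.0],
         [0.0, 3.0, 1.0, 0.0],
         [1.0, 0.0, 4.0, 1.0],
         [0.0, 1.0, 0.0, 5.0]]
    for q in s_step_basis(A, [1.0, 0.0, 0.0, 0.0], s=2):
        print(q)
```

Separating basis generation from orthogonalization is what allows the inner products to be batched once per block rather than once per vector. The monomial basis used here is known to become ill-conditioned as *s* grows, which is one reason practical codes use other bases and a more robust block orthogonalization.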

**Figure 3.** *CA-GMRES iteration algorithm.*

**Figure 4.** *CA-Lanczos iteration algorithm.*

The generalized minimal residual (GMRES) method of Saad and Schultz [92] is a Krylov subspace method for solving nonsymmetric linear systems. The CA-GMRES algorithm is based on Arnoldi(*s*, *t*) and is equivalent to standard GMRES in exact arithmetic, so GMRES and CA-GMRES converge at the same rate. Hoemmen [91] showed that the CA-GMRES algorithm shown in **Figure 3** converges for the s-step basis lengths and restart lengths used to obtain maximum performance. The Lanczos method can be regarded as a special case of the Arnoldi method when the matrix A is real symmetric or complex Hermitian. Symmetric Lanczos, usually called simply Lanczos, is distinct from nonsymmetric Lanczos. We implemented a communication-avoiding version of symmetric Lanczos (CA-Lanczos) for solving symmetric positive definite (SPD) eigenvalue problems. The CA-Lanczos iteration algorithm shown in **Figure 4** is also called Lanczos(*s*, *t*), where *s* is the s-step basis length and *t* is the number of outer iterations before restart. This algorithm is based on the rank-revealing tall skinny QR with block Gram-Schmidt (RR-TSQR-BGS) orthogonalization method, which combines TSQR with block Gram-Schmidt and uses the right-shifted basis matrix at outer iteration *k*, defined as follows:

$$V'\_k = [v\_{sk+2}, \dots, v\_{sk+s}] \tag{81}$$

and

$$\underline{V}'\_k = [V'\_k, v\_{sk+s+1}] \tag{82}$$

For more details about the considered preconditioners and algorithms, we refer the interested readers to [91].

| Methods | Preconditioning techniques | Iterations | Residual | Time of each iterative step (s) | Time of solution |
|---|---|---|---|---|---|
| Direct methods | NO | – | – | – | 9 min 50 s |
| Arnoldi | NO | 174 | 7.21E-07 | 3.85 | 11 min 25 s |
| | JOBI | 26 | 5.22E-07 | 3.86 | 2 min 38 s |
| | BJOB | 22 | 1.34E-06 | 3.86 | 2 min 23 s |
| | ILU3 | 47 | 1.66E-06 | 3.84 | 4 min 2 s |
| | ILU5 | 48 | 1.38E-06 | 3.89 | 4 min 6 s |
| | DILU | 48 | 1.53E-06 | 5.45 | 4 min 18 s |
| CA-Arnoldi | NO | 360 | 6.96E-07 | 1.95 | 11 min 53 s |
| | JOBI | 20 | 4.42E-07 | 1.96 | 1 min 30 s |
| | BJOB | 20 | 2.30E-08 | 1.96 | 1 min 30 s |
| | ILU3 | 40 | 7.87E-07 | 1.96 | 2 min 11 s |
| | ILU5 | 60 | 1.28E-08 | 1.96 | 2 min 48 s |
| | DILU | 60 | 1.59E-07 | 3.07 | 4 min 1 s |
| GMRES | NO | 280 | 2.36E-08 | 1.90 | 6 min 20 s |
| | JOBI | 40 | 5.01E-13 | 1.91 | 2 min 10 s |
| | BJOB | 40 | 2.05E-11 | 1.91 | 2 min 10 s |
| | ILU3 | 40 | 4.70E-08 | 1.91 | 2 min 10 s |
| | ILU5 | 40 | 3.13E-08 | 2.60 | 2 min 10 s |
| | DILU | 40 | 6.19E-08 | 3.07 | 2 min 48 s |
| CA-GMRES | NO | 120 | 6.89E-07 | 3.78 | 7 min 57 s |
| | JOBI | 12 | 1.00E-05 | 3.76 | 1 min 41 s |
| | BJOB | 12 | 2.22E-06 | 3.76 | 1 min 42 s |
| | ILU3 | 26 | 3.63E-06 | 3.75 | 2 min 34 s |
| | ILU5 | 22 | 4.05E-06 | 3.75 | 2 min 20 s |
| | DILU | 25 | 5.19E-06 | 5.93 | 3 min 18 s |
| Lanczos | NO | 135 | 7.24E-07 | 3.80 | 8 min 41 s |
| | JOBI | 22 | 4.87E-07 | 3.75 | 2 min 33 s |
| | BJOB | 18 | 9.27E-07 | 5.18 | 3 min 2 s |
| | ILU3 | 42 | 2.41E-07 | 3.81 | 3 min 48 s |
| | ILU5 | 36 | 6.41E-07 | 3.78 | 3 min 18 s |
| | DILU | 38 | 2.04E-07 | 5.00 | 3 min 32 s |
| CA-Lanczos | NO | 129 | 1.30E-04 | 3.75 | 9 min 22 s |
| | JOBI | 16 | 8.64E-07 | 3.76 | 2 min 3 s |
| | BJOB | 14 | 1.69E-07 | 3.77 | 2 min 0 s |
| | ILU3 | 24 | 9.29E-07 | 3.87 | 2 min 31 s |
| | ILU5 | 31 | 1.91E-07 | 3.90 | 3 min 1 s |
| | DILU | 27 | 8.11E-07 | 5.95 | 3 min 31 s |

**Table 1.** *Performances of preconditioned Krylov subspace iterative methods for DOF 3964.*
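The statement that Lanczos is the special case of Arnoldi for a symmetric matrix can be made concrete with a small sketch of the standard (not communication-avoiding) symmetric Lanczos recurrence. It assumes a dense symmetric matrix stored as nested lists and uses no reorthogonalization; the function name `lanczos` and the test matrix are illustrative choices, not taken from [91] or from the implementation described here.

```python
import math


def lanczos(A, v0, m):
    """Run up to m steps of symmetric Lanczos on the symmetric matrix A.

    Returns (alphas, betas, Q): the diagonal and off-diagonal entries of the
    tridiagonal projection of A, and the list of Lanczos vectors.
    """
    n = len(v0)
    norm0 = math.sqrt(sum(x * x for x in v0))
    q = [x / norm0 for x in v0]
    q_prev = [0.0] * n
    beta = 0.0
    alphas, betas, Q = [], [], [q]
    for j in range(m):
        # w = A q, then subtract the components along the two latest vectors.
        # The three-term recurrence is what symmetry buys over Arnoldi.
        w = [sum(a * x for a, x in zip(row, q)) for row in A]
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        w = [wi - alpha * qi - beta * pj for wi, qi, pj in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(x * x for x in w))
        if j == m - 1 or beta < 1e-14:
            break  # finished, or an invariant subspace was found
        betas.append(beta)
        q_prev, q = q, [x / beta for x in w]
        Q.append(q)
    return alphas, betas, Q


if __name__ == "__main__":
    # Small SPD tridiagonal test matrix (an illustrative assumption).
    A = [[2.0, 1.0, 0.0],
         [1.0, 3.0, 1.0],
         [0.0, 1.0, 4.0]]
    alphas, betas, _ = lanczos(A, [1.0, 0.0, 0.0], 3)
    print(alphas)  # [2.0, 3.0, 4.0]
    print(betas)   # [1.0, 1.0]
```

Because the test matrix is already tridiagonal and the start vector is the first unit vector, the recurrence reproduces the diagonal and off-diagonal of the matrix itself. A Lanczos(*s*, *t*) variant would instead generate *s* basis vectors at a time and orthogonalize them as a block.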

The main objective of this section is to implement an accurate and robust preconditioning technique for solving the dense nonsymmetric algebraic system of linear equations arising from the BEM. To this end, the communication-avoiding variant of the Arnoldi method [93] has been implemented for solving the resulting linear systems in order to reduce the number of iterations and the CPU time. The BEM discretization employs 1280 quadrilateral elements with 3964 degrees of freedom (DOF). **Table 1** compares the performance of the preconditioned Krylov subspace solvers (CA-Arnoldi, CA-GMRES, and CA-Lanczos) for 3964 DOF, where "–" denotes a divergent process. According to the results in **Table 1**, CA-Arnoldi, CA-GMRES, and CA-Lanczos are more cost-effective than the corresponding standard Krylov subspace methods Arnoldi, GMRES, and Lanczos, respectively. CA-Arnoldi, CA-GMRES, and CA-Lanczos are also compared with each other in **Table 2**, which shows that the performance of CA-Arnoldi is superior to that of the other iterative methods.
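The comparison drawn from Table 1 can be checked with a short calculation. The sketch below copies the solution times from Table 1, assuming that each group of five preconditioned rows belongs to the solver whose unpreconditioned row has the matching time per iterative step. It then converts the times to seconds and reports the fastest preconditioned run and its speedup over the unpreconditioned run. The names `TABLE_1`, `seconds`, and `summarize` are hypothetical.

```python
PRECONDITIONERS = ["JOBI", "BJOB", "ILU3", "ILU5", "DILU"]

# Solution times from Table 1 as (minutes, seconds). For each solver the first
# entry is the unpreconditioned run ("NO"); the remaining five follow the
# order of PRECONDITIONERS. The row-to-solver grouping is an assumption.
TABLE_1 = {
    "Arnoldi":    [(11, 25), (2, 38), (2, 23), (4, 2), (4, 6), (4, 18)],
    "CA-Arnoldi": [(11, 53), (1, 30), (1, 30), (2, 11), (2, 48), (4, 1)],
    "GMRES":      [(6, 20), (2, 10), (2, 10), (2, 10), (2, 10), (2, 48)],
    "CA-GMRES":   [(7, 57), (1, 41), (1, 42), (2, 34), (2, 20), (3, 18)],
    "Lanczos":    [(8, 41), (2, 33), (3, 2), (3, 48), (3, 18), (3, 32)],
    "CA-Lanczos": [(9, 22), (2, 3), (2, 0), (2, 31), (3, 1), (3, 31)],
}


def seconds(minutes, secs):
    """Convert a (minutes, seconds) pair to seconds."""
    return 60 * minutes + secs


def summarize(table):
    """Return {solver: (baseline_s, best_s, best_preconditioner)}."""
    summary = {}
    for solver, times in table.items():
        baseline = seconds(*times[0])
        preconditioned = [seconds(*t) for t in times[1:]]
        best = min(preconditioned)
        # On a tie, index() reports the first preconditioner in the list.
        name = PRECONDITIONERS[preconditioned.index(best)]
        summary[solver] = (baseline, best, name)
    return summary


if __name__ == "__main__":
    for solver, (baseline, best, name) in summarize(TABLE_1).items():
        print(f"{solver:11s} no preconditioner: {baseline:4d} s   "
              f"best: {best:4d} s ({name})   speedup: {baseline / best:.1f}x")
```

Under this reading of the table, each communication-avoiding solver reaches a shorter best preconditioned solution time than its standard counterpart, and CA-Arnoldi has the shortest of all at 90 s.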




The proposed technique implemented in the current study is applicable to a wide variety of FGA smart structure problems involving three temperatures. All the physical parameters satisfy the initial and boundary conditions. The efficiency of our BEM modeling technique has been improved by using an explicit staggered algorithm based on the communication-avoiding Arnoldi procedure to decrease the computation time.

**Figure 5** shows the variation of the three temperatures Te, Ti and Tp with the time τ in the presence of MDD. **Figure 6** shows the variation of the three temperatures Te, Ti and Tp with the time τ in the absence of MDD. It can be seen from **Figures 5** and **6** that the MDD has a significant effect on the temperature distributions.

**Figure 5.** *Variation of the three-temperature (with memory) with time* τ*.*

**Figure 6.** *Variation of the three-temperature (without memory) with time* τ*.*

#### **Table 2.**
