### **4.1 The equations**

The basic goal of SEM is to generalize the CFA to assess relationships between latent variables [7]. A classic form of SEM representation is the LISREL model, which comprises a measurement model and a structural model. The measurement model defines the relationship between the latent variables and their indicators (observed variables), and the structural model defines the relationships among the latent variables. In this section, we address the linear SEM model in its basic and general forms.

The measurement equations are:

$$\mathbf{x} = \Lambda\_{\mathbf{x}} \xi + \delta \tag{1}$$

$$\mathbf{y} = \Lambda\_{\mathbf{y}} \boldsymbol{\eta} + \boldsymbol{\varepsilon} \tag{2}$$

In Eq. (1), *x* is the vector of observed exogenous variables, *ξ* is the vector of exogenous latent variables, *δ* is the vector of errors, and Λ<sub>x</sub> is the matrix of coefficients that relates *x* to *ξ*. In Eq. (2), *y* is the vector of observed endogenous variables, *η* is the vector of endogenous latent variables, *ε* is the vector of errors for the endogenous variables, and Λ<sub>y</sub> is the matrix of coefficients relating *y* to *η*. Connected with these two equations we have the covariance matrices Θ<sub>δ</sub> and Θ<sub>ε</sub>, the matrices of covariances among the errors *δ* and *ε*, respectively.

In summary, the object of the measurement model is to analyze the relation of the latent variables in *ξ* and *η* with the observed variables in *x* and *y*, respectively. One problem in formulating these equations is specifying the factor loading matrices Λ<sub>x</sub> and Λ<sub>y</sub>, based on a priori information about the observed and latent variables considered in the study.
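As a concrete sketch of the measurement side, the following NumPy snippet simulates Eq. (1) for a single factor with three indicators and checks that the sample covariance of *x* approaches the model-implied Λ<sub>x</sub>ΦΛ<sub>x</sub><sup>T</sup> + Θ<sub>δ</sub>, where Φ = Cov(*ξ*). All loading and variance values here are invented for illustration, not taken from the chapter.

```python
import numpy as np

# Illustrative sketch of the measurement model x = Lambda_x * xi + delta (Eq. 1).
# All parameter values below are made up for demonstration only.
rng = np.random.default_rng(0)

Lambda_x = np.array([[1.0],              # loading of x1 on the single factor xi
                     [0.8],
                     [0.6]])
Phi      = np.array([[1.0]])             # variance of xi
Theta_d  = np.diag([0.3, 0.4, 0.5])      # error (co)variances of delta

# Implied covariance of x: Lambda_x Phi Lambda_x' + Theta_delta
Sigma_xx = Lambda_x @ Phi @ Lambda_x.T + Theta_d

# Simulate a large sample and compare its covariance with the implied one
n = 100_000
xi    = rng.multivariate_normal(np.zeros(1), Phi, size=n)
delta = rng.multivariate_normal(np.zeros(3), Theta_d, size=n)
x = xi @ Lambda_x.T + delta

S = np.cov(x, rowvar=False)              # sample covariance approaches Sigma_xx
```

With a large enough sample, `S` and `Sigma_xx` agree to within sampling error, which is the relationship the estimation methods of Section 4.3 exploit.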

The structural equation for linear SEMs is:

$$
\eta = \Gamma\xi + \zeta \tag{3}
$$

The structural equation in its general form, which also allows effects among the endogenous latent variables through the matrix *B*, is:

$$
\eta = B\eta + \Gamma\xi + \zeta \tag{4}
$$

where *η* is the vector of endogenous latent variables, *ξ* is the vector of exogenous latent variables, *ζ* is the vector of errors of the endogenous latent variables, *B* is the matrix of coefficients that describes the relations among the endogenous latent variables, and Γ is the matrix of coefficients that describes the linear effects of the exogenous variables on the endogenous ones. Related to Eq. (4) we have the matrices Φ and Ψ: the covariance matrix of the latent exogenous variables and the matrix of covariances among the errors of the endogenous variables, respectively.
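Solving Eq. (4) for *η* gives the reduced form η = (I − B)<sup>−1</sup>(Γξ + ζ), which requires (I − B) to be nonsingular; from it the implied covariance of *η* follows directly. A minimal NumPy sketch with invented parameter values:

```python
import numpy as np

# Sketch of the structural model eta = B eta + Gamma xi + zeta (Eq. 4).
# Reduced form: eta = (I - B)^{-1} (Gamma xi + zeta). Values are illustrative.
B     = np.array([[0.0, 0.0],
                  [0.5, 0.0]])           # eta1 affects eta2
Gamma = np.array([[0.7],
                  [0.2]])                # effects of xi on each eta
Phi   = np.array([[1.0]])                # Cov(xi)
Psi   = np.diag([0.4, 0.3])              # Cov(zeta)

C = np.linalg.inv(np.eye(2) - B)         # C = (I - B)^{-1}

# Implied covariance of eta: C (Gamma Phi Gamma' + Psi) C'
Cov_eta = C @ (Gamma @ Phi @ Gamma.T + Psi) @ C.T
```

This reduced-form covariance is exactly the inner block that reappears in the implied covariance matrix of Eq. (6) below.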

### **4.2 Assumptions and limitations**

**Normality:** The most important assumption in SEM is multivariate normality (MVN), particularly when the maximum likelihood (ML) method is used to estimate the model parameters. When discrete variables are used, the assumption of normality is violated. Violating or omitting the MVN assumption for the observed variables leads to a high value of *χ*<sup>2</sup><sub>M</sub>/df<sub>M</sub> and affects the significance of the test. In this scenario, it is suggested to apply other methods such as generalized least squares (GLS).

When the complexity of the SEM increases, the sample size must also increase, and when the data depart from the normal distribution it is essential to increase the number of observations [1]. Non-normality can be detected by univariate tests, multivariate tests, and skewness and kurtosis statistics. Skewness and kurtosis can be measured separately or together in the same variable. In the context of SEM, kurtosis is more problematic than skewness in terms of its effects on inference. If the absolute value of the skewness exceeds 2 and the kurtosis exceeds 4, then the distribution is considered non-normal [8].
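A simple univariate screen along these lines can be run with SciPy; the variables and the flagging rule below (flag when either guideline value is exceeded) are illustrative, not a prescribed procedure.

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Univariate normality screen using the guideline thresholds from the text:
# |skewness| > 2, (excess) kurtosis > 4. Data and names are illustrative.
rng = np.random.default_rng(1)
data = {
    "normal_var": rng.normal(size=5000),
    "skewed_var": rng.exponential(size=5000) ** 2,   # heavily right-skewed
}

for name, v in data.items():
    s = skew(v)
    k = kurtosis(v)            # Fisher definition: 0 for a normal distribution
    flag = abs(s) > 2 or k > 4
    print(f"{name}: skew={s:.2f}, kurtosis={k:.2f}, flagged={flag}")
```

The normal variable passes both guidelines, while the squared-exponential variable is flagged on both.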

**No correlation between errors:** The errors are assumed to be independent, that is, there is no correlation between the errors *δ*, *ε* and *ζ*.

**Multicollinearity:** It is assumed that there is no strong relationship among the independent variables.

**Linearity:** It is assumed that linear relations exist among the variables.

**Outliers:** The presence of outliers in the data affects the significance results of the model.

**Sample size:** Generally, the number of observations in the sample affects the results of the fit indices in SEM. [9] suggest a minimum sample size of 150; [10] suggest at least 10 times the number of parameters in the model; [11] recommends at least 200; and Hair et al., cited by Thakkar [12], provide an interesting list. However, if the number of observations is small, it is reasonable and recommendable to use the Bayesian approach to SEM.

**Limitations:** Since SEM is a confirmatory statistical method, prior to the analysis the researcher must establish a hypothetical model and analyze it based on the sample and on the latent and observed variables. Additionally, one must know how many parameters need to be estimated, adding up variances, covariances, and path coefficients. Of course, one must know all the relationships one intends to specify in the model.

### **4.3 Estimation**

Let Σ = Σ(*θ*) be the covariance matrix of the model, where Σ is the population covariance matrix of the observed variables, *θ* is a vector of (unknown) parameters, and Σ(*θ*) is a matrix function of *θ*, estimated by minimizing the discrepancy between a sample covariance matrix *S* and Σ(*θ*). The estimation methods minimize different discrepancy functions *F* between *S* and Σ(*θ*), so that

$$\min\_{\theta} F\left(S, \Sigma(\theta)\right) \tag{5}$$

where the matrix Σð Þ*θ* is given by

$$\begin{aligned} \Sigma(\theta) &= \begin{bmatrix} E(\mathbf{y}\mathbf{y}^{T}) & E(\mathbf{y}\mathbf{x}^{T}) \\ E(\mathbf{x}\mathbf{y}^{T}) & E(\mathbf{x}\mathbf{x}^{T}) \end{bmatrix} = \begin{bmatrix} \Sigma\_{\mathbf{yy}}(\theta) & \Sigma\_{\mathbf{yx}}(\theta) \\ \Sigma\_{\mathbf{xy}}(\theta) & \Sigma\_{\mathbf{xx}}(\theta) \end{bmatrix} \\ &= \begin{bmatrix} \Lambda\_{\mathbf{y}}\mathbf{C}(\Gamma\Phi\Gamma^{T} + \Psi)\mathbf{C}^{T}\Lambda\_{\mathbf{y}}^{T} + \Theta\_{\varepsilon} & \Lambda\_{\mathbf{y}}\mathbf{C}\Gamma\Phi\Lambda\_{\mathbf{x}}^{T} \\ \Lambda\_{\mathbf{x}}\Phi\Gamma^{T}\mathbf{C}^{T}\Lambda\_{\mathbf{y}}^{T} & \Lambda\_{\mathbf{x}}\Phi\Lambda\_{\mathbf{x}}^{T} + \Theta\_{\delta} \end{bmatrix} \end{aligned} \tag{6}$$

Note that this matrix does not depend on the observed or latent variables but on the matrices of unknown parameters Θ<sub>δ</sub>, Θ<sub>ε</sub>, Φ, Ψ, Λ<sub>x</sub>, Λ<sub>y</sub>, Γ and *B*, where **C** = (*I* − *B*)<sup>−1</sup>.
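A minimal sketch of assembling Σ(θ) block by block as in Eq. (6), assuming small illustrative parameter matrices (one indicator per latent variable on the *x* side, two on the *y* side; all numbers invented):

```python
import numpy as np

# Assemble the model-implied covariance matrix of Eq. (6).
# All parameter matrices below are illustrative, not estimated values.
Lambda_y  = np.eye(2)                    # each eta measured by one y
Lambda_x  = np.array([[1.0]])            # xi measured by one x
B         = np.array([[0.0, 0.0],
                      [0.5, 0.0]])
Gamma     = np.array([[0.7],
                      [0.2]])
Phi       = np.array([[1.0]])
Psi       = np.diag([0.4, 0.3])
Theta_eps = np.diag([0.2, 0.2])
Theta_del = np.diag([0.1])

C = np.linalg.inv(np.eye(B.shape[0]) - B)        # C = (I - B)^{-1}

Sigma_yy = Lambda_y @ C @ (Gamma @ Phi @ Gamma.T + Psi) @ C.T @ Lambda_y.T + Theta_eps
Sigma_yx = Lambda_y @ C @ Gamma @ Phi @ Lambda_x.T
Sigma_xx = Lambda_x @ Phi @ Lambda_x.T + Theta_del

Sigma = np.block([[Sigma_yy,   Sigma_yx],
                  [Sigma_yx.T, Sigma_xx]])       # symmetric, positive definite
```

The resulting matrix is symmetric and positive definite, as a covariance matrix must be, and depends only on the parameter matrices listed above.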

**ML estimation:** In this method, function (7) is derived from the logarithm of the likelihood, the log-likelihood. Maximization is accomplished by differentiating the log-likelihood with respect to the parameters, equating each derivative to zero, and solving the resulting system of equations. This procedure requires that the endogenous variables have an MVN distribution, that *S* follows a Wishart distribution, that the observations are independently and identically distributed, and that the matrices Σ and *S* are positive definite.

$$\begin{split} F\_{\text{ML}} &= \log \left| \Sigma(\theta) \right| + tr \Big( \mathbf{S} \Sigma(\theta)^{-1} \Big) \\ &\quad - \log \left| \mathbf{S} \right| - tr \Big( \mathbf{S} \mathbf{S}^{-1} \Big) \end{split} \tag{7}$$

where log(·) is the natural logarithm function, |·| is the determinant, and *tr*(·) is the trace function.
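The ML discrepancy of Eq. (7) is straightforward to compute for given *S* and Σ(θ); note that tr(*SS*<sup>−1</sup>) simply equals *p*, the number of observed variables. The matrices below are illustrative.

```python
import numpy as np

# ML discrepancy of Eq. (7):
#   F_ML = log|Sigma(theta)| + tr(S Sigma(theta)^{-1}) - log|S| - tr(S S^{-1}),
# where the last term equals p, the number of observed variables.
def f_ml(S, Sigma):
    p = S.shape[0]
    return (np.log(np.linalg.det(Sigma))
            + np.trace(S @ np.linalg.inv(Sigma))
            - np.log(np.linalg.det(S)) - p)

# Illustrative positive-definite matrices
S     = np.array([[1.0, 0.4],
                  [0.4, 1.2]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.2]])

print(f_ml(S, Sigma))   # positive: the model does not reproduce S exactly
print(f_ml(S, S))       # 0 when Sigma(theta) reproduces S perfectly
```

F<sub>ML</sub> is nonnegative and reaches 0 exactly when Σ(θ) = *S*, which is what makes it a discrepancy function.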

The ML estimator has, among others, the following advantages: it is asymptotically consistent, unbiased, and efficient, and the model fit statistic *T*<sub>ML</sub> is asymptotically distributed as *χ*<sup>2</sup> with df = *p*(*p* + 1)/2 − *t*, where *p* is the number of observed variables and *t* is the number of model parameters estimated.

Two other estimation methods that consider endogenous variables with MVN distributions are generalized least squares (GLS) and unweighted least squares (ULS), which are described below.

**GLS estimator:** This method is a member of a family known as fully weighted least squares (WLS) estimation; the general WLS form is suggested when the data are considered severely non-normal. The GLS estimator has the property of being asymptotically MVN distributed. The function to minimize is given by

$$F\_{\rm GLS} = \frac{1}{2}\text{tr}\left\{ \left[ I - \Sigma(\theta)S^{-1} \right]^2 \right\} \tag{8}$$

**ULS estimator:** This method consists of minimizing the sum of squares of the differences between the sample covariance matrix and the predicted covariance matrix. It can generate unbiased estimates but is not as good as the ML method [13]. The function to minimize is

$$F\_{\rm ULS} = \frac{1}{2}\text{tr}\left\{\left[S - \Sigma(\theta)\right]^2\right\} \tag{9}$$
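Both least-squares discrepancies, Eqs. (8) and (9), can be transcribed directly; like F<sub>ML</sub>, both vanish when Σ(θ) reproduces *S* exactly. The matrices below are illustrative.

```python
import numpy as np

# GLS and ULS discrepancy functions of Eqs. (8) and (9):
#   F_GLS = 1/2 tr{ [I - Sigma(theta) S^{-1}]^2 }
#   F_ULS = 1/2 tr{ [S - Sigma(theta)]^2 }
def f_gls(S, Sigma):
    R = np.eye(S.shape[0]) - Sigma @ np.linalg.inv(S)
    return 0.5 * np.trace(R @ R)

def f_uls(S, Sigma):
    D = S - Sigma
    return 0.5 * np.trace(D @ D)

# Illustrative matrices
S     = np.array([[1.0, 0.4],
                  [0.4, 1.2]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.2]])

print(f_gls(S, Sigma), f_uls(S, Sigma))   # both positive for imperfect fit
```

Note the difference in weighting: GLS measures residuals relative to *S*<sup>−1</sup>, whereas ULS works on the raw covariance residuals.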

In general, the ML estimator is preferred over both GLS and ULS, especially when the number of observations is large.

The GLS estimator requires well-specified models but allows small sample sizes to do an acceptable job in terms of theoretical and empirical fit. The WLS estimator also requires well-specified models but, in contrast to GLS and ML, it additionally requires large sample sizes to perform well [14].

### **4.4 Model assessment**

SEM tests a hypothetical theoretical model about the relations among latent and observed variables; the goal of model evaluation is to test the causal relationships of the model. There are several criteria for evaluating the fit of an SEM, so it is difficult to adopt a single specific model fit criterion. The researcher generally uses three criteria to assess the statistical significance and the substantive significance of a hypothesized model [15]:

1. The non-significance of the chi-square test indicates that the proposed model fits the data.


Kline [1], Schumacker and Lomax [15], Thakkar [12], and Douglas [2] provide indices and criteria for evaluating the fit of the model. This chapter presents only a selection: a statistical test and four basic fit indices.

a. Chi-square *χ*<sup>2</sup><sub>M</sub> with its degrees of freedom df<sub>M</sub> and p value.

This statistic is based on the fitting function *F*<sub>ML</sub> (7) and is given by

$$
\chi\_M^2 = (n-1)F\_{\rm ML} \tag{10}
$$

where *n* is the sample size and *χ*<sup>2</sup><sub>M</sub> has a central chi-square distribution with degrees of freedom df<sub>M</sub> = *p*<sup>∗</sup> − *t*, where *p*<sup>∗</sup> = *p*(*p* + 1)/2 is the total number of variance and covariance terms, *p* is the number of observed variables, and *t* is the total number of free parameters. Among the problems this statistic presents is that its value can be affected by sample size, non-normality, correlation, and unique variance. To decrease the sensitivity of *χ*<sup>2</sup><sub>M</sub> to sample size, it is common to divide the statistic by its expected value, that is, *χ*<sup>2</sup><sub>M</sub>/df<sub>M</sub>, a change that reduces the value of this ratio for df<sub>M</sub> > 1 compared with *χ*<sup>2</sup><sub>M</sub>. This statistic is used to test absolute model fit. The null hypothesis of equal fit states that there is no difference between the proposed model and the data. A large value of *χ*<sup>2</sup><sub>M</sub> with a correspondingly small p value implies that the model does not fit the data well.
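As a quick numeric sketch of Eq. (10): the F<sub>ML</sub> value, sample size, and parameter counts below are invented for illustration.

```python
from scipy.stats import chi2

# Illustrative numbers only: F_ML, n, p and t are made up for this sketch.
F_ML = 0.12                      # minimized ML discrepancy, Eq. (7)
n, p, t = 200, 6, 13

chi2_M = (n - 1) * F_ML          # Eq. (10)
p_star = p * (p + 1) // 2        # total number of variance/covariance terms
df_M = p_star - t
p_value = chi2.sf(chi2_M, df_M)  # upper-tail chi-square probability

print(f"chi2_M = {chi2_M:.2f}, df_M = {df_M}, p = {p_value:.4f}")
```

With these numbers the p value falls below 0.05, so the hypothesis of exact fit would be rejected.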

b. Root Mean Square Error of Approximation (RMSEA) and its 90% confidence interval.

The RMSEA is a function of the *χ*<sup>2</sup><sub>M</sub> statistic, defined by

$$\text{RMSEA} = \sqrt{\frac{\hat{\delta}\_{M}}{\text{df}\_{M}\,(n-1)}} \tag{11}$$

where δ̂<sub>M</sub> = max(0, *χ*<sup>2</sup><sub>M</sub> − df<sub>M</sub>) is the estimated noncentrality parameter and *χ*<sup>2</sup><sub>M</sub> is defined in (10).

c. The Comparative Fit Index (CFI).

Let *I* denote the null (independence) model and *χ*<sup>2</sup><sub>I</sub> its statistic, which is approximately central chi-square distributed with degrees of freedom df<sub>I</sub>. The CFI can be obtained using the ML estimator. This index is given by:

$$\text{CFI} = 1 - \frac{\chi\_M^2 - \text{df}\_M}{\chi\_I^2 - \text{df}\_I} \tag{12}$$
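The RMSEA of Eq. (11) and the CFI of Eq. (12) can be computed together from the model and null-model chi-squares; all numbers below are invented for illustration.

```python
import math

# Illustrative chi-square values for a fitted model and the null model.
n = 200
chi2_M, df_M = 23.88, 8          # fitted model, Eq. (10)
chi2_I, df_I = 480.0, 15         # null (independence) model

delta_M = max(0.0, chi2_M - df_M)              # estimated noncentrality
rmsea = math.sqrt(delta_M / (df_M * (n - 1)))  # Eq. (11)

cfi = 1 - (chi2_M - df_M) / (chi2_I - df_I)    # Eq. (12)

print(f"RMSEA = {rmsea:.3f}, CFI = {cfi:.3f}")
```

Here the noncentrality estimate δ̂<sub>M</sub> drives both indices: RMSEA scales it by df<sub>M</sub> and the sample size, while CFI compares it against the null model's noncentrality.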

d. Goodness of Fit Index (GFI)

*The Basics of Structural Equations in Medicine and Health Sciences DOI: http://dx.doi.org/10.5772/intechopen.104957*


#### **Table 1.**

*Guidelines in SEM for select model fit statistics and indices [2].*

GFI is the amount of variances and covariances jointly accounted for by the model. It is given by:

$$\text{GFI} = 1 - \frac{\text{tr}\left\{\left[\Sigma(\theta)^{-1}S - I\right]^2\right\}}{\text{tr}\left\{\left[\Sigma(\theta)^{-1}S\right]^2\right\}} \tag{13}$$

GFI varies from 0 to 1.0.
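A direct NumPy transcription of Eq. (13), with illustrative *S* and Σ(θ) matrices, shows the index reaching its upper bound of 1 when the model reproduces *S* exactly:

```python
import numpy as np

# GFI of Eq. (13): share of the variances and covariances in S that the
# model-implied Sigma(theta) accounts for. Matrices are illustrative.
def gfi(S, Sigma):
    A = np.linalg.inv(Sigma) @ S
    I = np.eye(S.shape[0])
    return 1 - np.trace((A - I) @ (A - I)) / np.trace(A @ A)

S     = np.array([[1.0, 0.4],
                  [0.4, 1.2]])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.2]])

print(gfi(S, S))        # 1.0 for a perfect reproduction of S
print(gfi(S, Sigma))    # below 1.0 for imperfect fit
```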

e. Standardized Root Mean Square Residual (SRMR).

SRMR is an absolute, badness-of-fit index obtained by standardizing the Root Mean Square Residual (RMR); it is a measure of the mean absolute covariance residual. An SRMR of 0 indicates an ideal model fit, and increasingly higher values indicate worse fit [1].

**Table 1** summarizes the interpretation of the most important goodness-of-fit indices.
