**5.2 Using the proposed alternative statistical interpretation for the informative graphical diagnostic**

If the modified winsorized means for each variable in **Table 1** are denoted as the independent variable *Y* (*Y* = *Y*<sub>1</sub>, *Y*<sub>2</sub>, *Y*<sub>3</sub>, *Y*<sub>4</sub>), and the winsorized percent as the dependent variable *X*, then the six pairs of values for *X* (0, 4, 8, 12, 16, 20) and *Y*<sub>1</sub> (428.70, 421.07, 415.50, 412.28, 408.90, 406.20) are in good accordance with a regression model.

#### **Table 4.**

*Regression coefficients,* b*, of the regression models fitted to the Table 1 data.*

At *step 1* of the proposed alternative statistical interpretation for the informative graphical diagnostic, the six values of the modified winsorized mean for each of the four variables in groups 1 and 2 are fitted, together with the six values of the winsorized percent, to a linear regression model. The resulting regression coefficients *b* of the four fitted models in each of groups 1 and 2 are summarized in **Table 4**.
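As an illustration of the kind of fit performed at step 1, the following minimal sketch applies ordinary least squares to the one series quoted in the text, the six winsorized percents and the corresponding modified winsorized means of *Y*<sub>1</sub>. It assumes the winsorized mean is the response and the winsorized percent the predictor; that orientation, the function name `fit_line`, and the variable names are choices made for this sketch rather than details confirmed by the text, and the arguments can be swapped to regress in the other direction.

```python
# Winsorized percents and modified winsorized means of Y1, as quoted in the text.
winsorized_percent = [0, 4, 8, 12, 16, 20]
y1_means = [428.70, 421.07, 415.50, 412.28, 408.90, 406.20]


def fit_line(x, y):
    """Ordinary least-squares fit of y = a + b*x.

    Returns (a, b, r_squared). The name and signature are hypothetical,
    introduced only for this illustration.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered sums of squares and cross-products.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                       # slope (regression coefficient b)
    a = y_bar - b * x_bar               # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return a, b, r_squared


# Assumption: the mean is regressed on the percent (swap arguments otherwise).
intercept, slope, r_squared = fit_line(winsorized_percent, y1_means)
print(f"intercept = {intercept:.3f}, slope b = {slope:.3f}, R^2 = {r_squared:.3f}")
```

A negative slope here reflects the steady decline of the modified winsorized mean as the winsorized percent increases, and the squared correlation gives a rough check on how well a straight line describes the six points.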

At *step 2* of the proposed alternative approach, the absolute difference between the regression coefficients (i.e., the slopes) obtained for each variable in groups 1 and 2 is computed. These absolute differences are presented in the last row of **Table 4**.

The two-step approach of the proposed alternative method was repeated using the data of **Table 3**. The regression coefficients *b* of the three fitted models in each of groups 1 and 2 are summarized in **Table 5**, with the corresponding absolute differences in its last row.

A cursory look at **Table 4** shows that the regression coefficient (−2.316) of *X*<sub>2</sub> in group 1 is notably lower than both the corresponding regression coefficient (−0.569) of *X*<sub>2</sub> in group 2 and the regression coefficients obtained for the variables *X*<sub>1</sub>, *X*<sub>3</sub>, and *X*<sub>7</sub> in either group. Also, the value of 1.8, which is the absolute difference between the regression coefficients of *X*<sub>2</sub> in group 1 and in group 2, is well above the decision boundary value of 0.75. This indicates that the variable *X*<sub>2</sub> does not have similar variances in the groups formed by the dependent variable, and it is therefore the variable identified as having legitimate contaminants.

#### **Table 5.**

*Regression coefficients,* b*, of the regression models fitted to the Table 3 data.*

In **Table 5**, the regression coefficients of the three variables are approximately equal in groups 1 and 2. Also, the absolute differences between the regression coefficients of *X*<sub>5</sub>, *X*<sub>10</sub>, and *X*<sub>11</sub> in group 1 and in group 2 are all equal and well below the decision boundary value of 0.75. This indicates that there are no legitimate contaminants in the variables *X*<sub>5</sub>, *X*<sub>10</sub>, and *X*<sub>11</sub>. It therefore implies that the fit between the validation sample, *Dt <sup>N</sup>* and the basic assumptions of PDA is sufficient to construct a PDF whose hit rate can be said to be statistically optimal.
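The comparison against the decision boundary can be sketched as follows. The example uses only the two *X*<sub>2</sub> coefficients and the boundary value of 0.75 quoted in the text; the remaining table entries are not reproduced. The function name `flag_contaminated`, the dictionary layout, and the choice to treat a difference exactly equal to the boundary as not flagged are assumptions of this sketch.

```python
DECISION_BOUNDARY = 0.75  # decision boundary value quoted in the text


def flag_contaminated(slopes_group1, slopes_group2, boundary=DECISION_BOUNDARY):
    """Compare per-variable slopes between two groups.

    Returns {variable: (absolute_difference, flagged)}, where flagged is True
    when the absolute difference exceeds the boundary. The strict inequality
    is an assumption; the text does not say how ties are handled.
    """
    result = {}
    for name, b1 in slopes_group1.items():
        diff = abs(b1 - slopes_group2[name])
        result[name] = (diff, diff > boundary)
    return result


# Only the X2 coefficients are quoted in the text.
group1 = {"X2": -2.316}
group2 = {"X2": -0.569}

result = flag_contaminated(group1, group2)
for name, (diff, flagged) in result.items():
    status = ("legitimate contaminants suspected" if flagged
              else "no legitimate contaminants indicated")
    print(f"{name}: |b1 - b2| = {diff:.3f} -> {status}")
```

With slopes for further variables added to the two dictionaries, the same call would apply the rule to each variable in turn.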
