**3. Statistical optimality of a PDF's classification accuracy**

A PDF's classification accuracy is statistically optimal only if each group sample is drawn from a normal distribution, the group means differ, and the variance of each predictor is similar across the groups [5, 7]. In addition to these requirements, it is recommended that there be at least four to five times as many cases as predictors in order to produce more accurate estimates. Note that in PDA, failure of the training sample to meet the assumption of normality can reduce efficiency and accuracy (see Lachenbruch [42], as cited in Klecka [43]). However, a minor violation of this assumption will not reduce classification accuracy. As long as the distributions of the predictors are reasonably comparable, the estimation of most multivariate parameters does not require multivariate normality [44]. Moreover, by the central limit theorem, the normality assumption is of little concern when each group sample contains a very large number of observations. As a general rule, a PDF remains robust to non-normality as long as the smallest group has more than 20 cases and there are fewer than six predictors [45]. Because of these robustness properties of PDA, researchers are rarely concerned about the assumption of normality.
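The rules of thumb above can be collected into a simple screening check. The following is an illustrative sketch only, not a procedure from the cited sources: the case-to-predictor ratio, minimum group size, and predictor count are the heuristics quoted above, while the variance-ratio cutoff of 3 is an assumed, commonly used proxy for "similar" predictor variances between groups.

```python
from statistics import variance

def screen_training_sample(groups, max_var_ratio=3.0):
    """Check the PDA rules of thumb quoted in the text.

    groups : dict mapping group label -> list of cases,
             each case a list of predictor values.
    max_var_ratio : assumed cutoff for "similar" predictor
                    variances between groups (an illustrative
                    choice, not from the cited sources).
    """
    n_cases = sum(len(cases) for cases in groups.values())
    n_predictors = len(next(iter(groups.values()))[0])
    smallest = min(len(cases) for cases in groups.values())

    report = {
        # at least 4-5 times as many cases as predictors
        "cases_per_predictor_ok": n_cases >= 5 * n_predictors,
        # robustness heuristic: smallest group has more than 20 cases ...
        "smallest_group_ok": smallest > 20,
        # ... and there are fewer than six predictors
        "few_predictors_ok": n_predictors < 6,
    }

    # crude homogeneity proxy: the largest-to-smallest ratio of
    # group variances, per predictor, should stay close to 1
    ratios = []
    for j in range(n_predictors):
        vars_j = [variance([case[j] for case in cases])
                  for cases in groups.values()]
        ratios.append(max(vars_j) / min(vars_j))
    report["variances_similar"] = max(ratios) <= max_var_ratio
    return report
```

A sample that fails any of these checks is not necessarily unusable, but it signals that the robustness properties discussed above should not be taken for granted.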

However, when non-normality is due to outliers and/or hidden influential observations rather than to skewness, violating this assumption has serious consequences, because PDA is very sensitive to outliers [45]. Likewise, when the assumption of equal variance-covariance matrices is untenable, more cases tend to be classified into the group with the greater dispersion [45]. In addition, the estimated probabilities of group membership may be distorted, and the PDF may fail to separate the groups as much as possible [43]. If the predictor variances are not similar across the groups, the accuracy of the estimated discriminant weights may be reduced; the estimates may be unbiased but not precise [46]. A significant homogeneity-of-variance test indicates that the training sample is contaminated with outliers and/or hidden influential observations, and that the significance tests are unreliable [3, 45].

*On the Use of Modified Winsorization with Graphical Diagnostic for Obtaining… DOI: http://dx.doi.org/10.5772/intechopen.104539*

It is apparent from the foregoing that, if the assumption of homogeneity of variances is not satisfied, the assumption of multivariate normality is probably not satisfied either. This suggests that both the multivariate normality and the homogeneity of variance assumptions can be addressed by completely removing outliers and hidden influential observations from the training sample. Nevertheless, relying on the robustness properties of PDA without checking for outliers and hidden influential observations, which may prevent maximal separation between the groups, appears to be the norm among researchers. This practice is further encouraged by the general acceptance of a hit rate 25% above that of chance: a researcher who obtains, say, a 95% hit rate is unlikely to examine the two basic assumptions of PDA.
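The "25% above chance" benchmark can be made concrete. The sketch below follows the proportional and maximum chance criteria common in the classification literature; treating a hit rate as acceptable when it exceeds 1.25 times the chance criterion is that conventional rule, assumed here rather than taken from this chapter.

```python
def chance_criteria(group_sizes):
    """Chance-based benchmarks for a classification hit rate.

    group_sizes : list of group sample sizes.
    Returns (c_pro, c_max, threshold), where threshold is the
    conventional "25% above chance" cutoff applied to c_max.
    """
    n = sum(group_sizes)
    proportions = [g / n for g in group_sizes]
    c_pro = sum(p * p for p in proportions)   # proportional chance criterion
    c_max = max(proportions)                  # maximum chance criterion
    threshold = 1.25 * c_max                  # hit rate should exceed this
    return c_pro, c_max, threshold
```

For two groups of 60 and 40 cases, the maximum chance criterion is 0.60, so a hit rate only needs to exceed 0.75. A 95% hit rate clears this comfortably, which illustrates why such a result tempts researchers to skip checking the assumptions.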

The reason for such good performance, however, might simply be that the data support linear or quadratic separation boundaries. The general belief that linear classifiers are robust to minor violations of their basic assumptions (in particular, the assumption of multivariate normality) is often untenable. Studies have shown that the reliability of a PDF solution depends on adherence to the underlying assumptions [5]. The primary objective of PDA is classification, and if the percentage of correct classifications is unsatisfactory, it is likely that the predictor variances are not similar across the groups; that is, the training sample is not statistically optimal. It is therefore necessary to adopt a screening method that effectively identifies and removes legitimate contaminants from training samples before they are used to build a PDF. Iduseri and Osemwenkhae [6] proposed the modified winsorization with graphical diagnostic (MW-GD) method for this purpose. Applied to a real dataset, the MW-GD method produced a statistically optimal training sample, and the resulting PDF yielded a statistically optimal hit rate, greatly reducing the uncertainty about the PDF's actual hit rate. However, the informative graphical diagnostic associated with the proposed method may be difficult to interpret when there are no significant differences between a variable's shape in the groups of the 2-D area plot.
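For intuition only, a plain two-sided winsorization of a predictor within a group can be sketched as follows. This is emphatically not the MW-GD procedure of Iduseri and Osemwenkhae [6], which couples a modified winsorization with a graphical diagnostic; it merely illustrates the general idea of pulling extreme values toward the bulk of a group's distribution before a PDF is fitted. The 5th/95th percentile limits are arbitrary choices for this sketch.

```python
def _quantile(sorted_vals, p):
    """Linear-interpolation quantile of an already sorted list."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])

def winsorize_group(values, lower_p=0.05, upper_p=0.95):
    """Clamp one group's predictor values to the group's
    [lower_p, upper_p] percentile range (generic winsorization,
    not the MW-GD method)."""
    s = sorted(values)
    lo, hi = _quantile(s, lower_p), _quantile(s, upper_p)
    return [min(max(v, lo), hi) for v in values]
```

Applied per predictor and per group, this keeps every case in the training sample while blunting the influence of extreme values; MW-GD goes further by combining the winsorization step with a graphical diagnostic to decide which observations are legitimate contaminants.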

This paper proposes an alternative statistical interpretation of the informative graphical diagnostic for use when differentiating between a variable's shape in the groups of the 2-D area plot proves difficult.
