**2. Methods**

Therefore, the first requirement of a fingerprinting method is complete *coverage* of all required fingerprints in the training data. If a required fingerprint or proteomic pattern is missing from the training data (Figure 4), either the quality of predictions for the testing data will be greatly reduced or a significant number of testing individuals will receive an "undetermined" classification.

If a fingerprinting classifier is found that performs extremely well on the training data but classifies the testing data poorly, one can conclude either that the classifier is insufficient and therefore not biologically relevant, or that the training and validation data were separated incorrectly, so that effective coverage of all important fingerprints was not present in the training data. Since the discriminating fingerprints are not known, proper coverage cannot be known, and therefore proper selection of the training data cannot be known. In addition, since the quality of classifying the testing set is the metric used to determine biological relevance, the testing set is used in the process of constructing the classifier and is therefore part of the training process.

With these points in mind, an effective way to construct classifiers based on fingerprints is to include all data in the search for fingerprinting classifiers and then to selectively remove samples for the testing set in a way that preserves the coverage of the fingerprint in the training data. This statement does not suggest, in any way, that this procedure is used by other research groups who present fingerprinting classifiers; it simply states that this method is an effective way to ensure complete coverage in the training data and to effectively test for uniqueness. If Figure 4 were used as the basic classifier, all other possible three-node decision trees would have to be constructed and compared against a sensitivity and specificity of 90%. If no other three-node decision tree is found to have this overall accuracy, then the uniqueness of this classifier is established. Otherwise, each decision tree would have to be presented as a possible solution: since the important fingerprints are not known, the selection of the training set cannot be determined, and two different decision trees that imply different separations of training and validation data are therefore equally valid.

Finally, the *significance* of a fingerprinting classifier needs to be established. Permutation testing is often used to test significance, but it can be applied in three different ways. In the Random Forest algorithm [Breiman, 2001], the intensities of a given feature are scrambled among all data in each testing set (i.e. the out-of-bag samples) to determine the importance of that feature. The phenotypes of the samples can also be scrambled a large number of times to determine the probability that the accuracy of a given classifier occurred by chance. In this application, the phenotypes will be scrambled amongst all data to determine whether a new classifier of the same form (e.g. a three-node decision tree) can be constructed with comparable accuracy. The probability that random phenotypes can be classified to a given accuracy determines the significance of a given model.

**1.8 Proposed study**

To test the classification ability of different algorithms, this study will attempt to build classifiers from sets of 300 possible features. In each case, the intensities of the features will be determined using a random number generator. In other words, each classifier will attempt to distinguish healthy samples from diseased ones using data that contains no information. Results using DT and MCA classifiers have been previously presented [Luke &
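The null-data setup and the phenotype-scrambling test described in this section can be illustrated with a minimal Python sketch. It is not the study's implementation. The sample count (40), the number of permutations (200), the uniform random intensities, and all function names are assumptions made for illustration. To keep the example short, the classifier is a single-threshold rule on one feature (a decision stump) rather than the three-node decision tree discussed in the text. The sketch searches all 300 random features for the best rule, then scrambles the phenotypes and repeats the search to estimate how often an equally accurate rule of the same form can be found by chance.

```python
import random

def make_null_data(n_samples, n_features, rng):
    """Random intensities with balanced labels: the features carry no information."""
    X = [[rng.random() for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # 0 = healthy, 1 = diseased (assumed coding)
    return X, y

def feature_orders(X):
    """For each feature, the sample indices sorted by increasing intensity."""
    n_samples, n_features = len(X), len(X[0])
    return [sorted(range(n_samples), key=lambda i, j=j: X[i][j])
            for j in range(n_features)]

def best_stump_accuracy(orders, y):
    """Best training accuracy of any single-feature threshold rule.

    Assumes intensities within a feature are distinct (true for random floats).
    """
    n = len(y)
    n_pos = sum(y)
    best = max(n_pos, n - n_pos)  # trivial rule: always predict the majority class
    for order in orders:
        pos_left = 0
        # Sweep the threshold between consecutive sorted samples.
        for k, i in enumerate(order[:-1], start=1):
            pos_left += y[i]
            # Low side predicted healthy (0), high side predicted diseased (1).
            correct = (k - pos_left) + (n_pos - pos_left)
            # n - correct is the score of the rule with the two sides swapped.
            best = max(best, correct, n - correct)
    return best / n

def permutation_p_value(orders, y, n_permutations, rng):
    """Scramble phenotypes and rebuild the best rule of the same form each time."""
    observed = best_stump_accuracy(orders, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if best_stump_accuracy(orders, shuffled) >= observed:
            hits += 1
    # The +1 correction keeps the estimated probability above zero.
    return observed, (hits + 1) / (n_permutations + 1)

rng = random.Random(0)
X, y = make_null_data(n_samples=40, n_features=300, rng=rng)
orders = feature_orders(X)
observed, p_value = permutation_p_value(orders, y, n_permutations=200, rng=rng)
print(f"best training accuracy on null data: {observed:.2f}")
print(f"permutation p-value: {p_value:.3f}")
```

Because the sort order of each feature does not depend on the phenotypes, it is computed once and reused for every permutation. On data of this kind, the best rule found among 300 features would be expected to score well above 50% on the training data even though no feature is informative, which is the behavior the permutation test is meant to expose.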
