**2.3.3 TOF-MS Based Software**

1. Spectrogram Pretreatment

Because many factors influence an MS experiment, the original spectrogram produced by MS must be pretreated to remove interference. Pretreatment of MS data includes baseline elimination, filtering and noise elimination, standardization (normalization), peak detection and peak quantification. Commonly used pretreatment methods and tools are compared in Table 3 (Cruz-Marcelo et al., 2008).


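Table 3. Commonly-used MS Pretreatment Algorithms and Tools

| Algorithm and Tool | Main Functions | Relevant Information |
| --- | --- | --- |
| ProteinChip Software 3.1 and Biomarker Wizard | Commercial software of Ciphergen Biosystems, designed for analyzing SELDI-TOF MS data | http://www.vermillion.com/ |
| PROcess | An R-language based package of BioConductor, designed for pretreatment of SELDI-TOF MS data | http://www.bioconductor.org/packages/bioc/1.8/html/PROcess.html |
| Cromwell | MatLab script to realize MS data pretreatment | http://bioinformatics.mdanderson.org/cromwell.html |
| SpecAlign | MS data pretreatment and peak alignment | http://physchem.ox.ac.uk/~jwong/specalign/index.htm |
| MassSpecWavelet | An R-language based package of BioConductor, using continuous wavelet transform for peak detection | http://www.bioconductor.org/packages/2.0/bioc/html/MassSpecWavelet.html |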
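The pretreatment steps named in this subsection can be illustrated with a minimal sketch. It is not the algorithm of any tool in Table 3: the moving-minimum baseline, the moving-average filter, total-ion-current normalization, the median-based noise estimate, and every function name and parameter value (`half_width`, `snr`) are assumptions chosen only for illustration. Peak height is used here as the simplest form of peak quantification.

```python
from statistics import median


def _window(values, i, half_width):
    """Slice of `values` centred on index i, clipped at the spectrum edges."""
    lo, hi = max(0, i - half_width), min(len(values), i + half_width + 1)
    return values[lo:hi]


def subtract_baseline(intensities, half_width=5):
    """Baseline elimination: subtract a moving-minimum estimate (assumed method)."""
    return [x - min(_window(intensities, i, half_width))
            for i, x in enumerate(intensities)]


def smooth(intensities, half_width=1):
    """Filtering and noise elimination: simple moving average (assumed method)."""
    out = []
    for i in range(len(intensities)):
        w = _window(intensities, i, half_width)
        out.append(sum(w) / len(w))
    return out


def normalize(intensities):
    """Standardization: scale by total ion current so spectra are comparable."""
    total = sum(intensities)
    return [x / total for x in intensities] if total > 0 else list(intensities)


def detect_peaks(mz, intensities, snr=3.0):
    """Peak detection and quantification: local maxima above a noise threshold.

    Noise is estimated as the median absolute deviation; `snr` is a
    hypothetical signal-to-noise cut-off. Returns (m/z, height) pairs.
    """
    center = median(intensities)
    noise = median(abs(x - center) for x in intensities) or 1e-12
    threshold = center + snr * noise
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if intensities[i - 1] < y >= intensities[i + 1] and y > threshold:
            peaks.append((mz[i], y))
    return peaks


def pretreat(mz, raw):
    """Chain the steps in the order given in the text."""
    return detect_peaks(mz, normalize(smooth(subtract_baseline(raw))))
```

Calling `pretreat(mz, raw)` on one spectrum returns its peak list; real pipelines would normally rely on a numerical library and on more robust estimators than the ones assumed here.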

2. Peak Alignment

After pretreatment, the spectrograms undergo peak alignment. The aligned spectrograms are combined into a matrix file similar to a gene expression profile, namely the serum peptidome profile. In this profile, each row represents a peak at a specific mass-to-charge ratio (m/z), that is, the relative content of a specific protein or polypeptide, and each column represents a sample. This matrix is the foundation for follow-up bioinformatics analysis, and its data quality directly influences the analysis results.
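As a rough illustration of how per-sample peak lists could be merged into such a matrix, the sketch below pools all peaks, sorts them by m/z, and groups neighbours that lie within a relative tolerance. The grouping rule, the tolerance value `tol`, the use of zero for missing peaks, and the function name `align_peaks` are assumptions made for this example, not the method of any specific alignment tool.

```python
def align_peaks(samples, tol=0.002):
    """Build a peaks-by-samples matrix from per-sample peak lists.

    samples: dict mapping sample name -> list of (mz, intensity) pairs.
    tol: hypothetical relative m/z tolerance for treating two peaks as one.
    Returns (row_mz, sample_names, matrix), where each row is an aligned
    peak and each column is a sample, as in the profile described above.
    """
    names = sorted(samples)
    pooled = sorted((mz, inten, name)
                    for name in names
                    for mz, inten in samples[name])

    # Group consecutive peaks whose m/z is close to the previous member.
    clusters = []
    for mz, inten, name in pooled:
        if clusters and mz - clusters[-1][-1][0] <= tol * mz:
            clusters[-1].append((mz, inten, name))
        else:
            clusters.append([(mz, inten, name)])

    row_mz, matrix = [], []
    for cluster in clusters:
        # Representative m/z of the row: mean of the grouped peaks.
        row_mz.append(sum(p[0] for p in cluster) / len(cluster))
        row = {name: 0.0 for name in names}  # 0.0 marks a missing peak
        for _, inten, name in cluster:
            row[name] = max(row[name], inten)
        matrix.append([row[name] for name in names])
    return row_mz, names, matrix
```

Because each peak is compared only with the previous member of its group, long chains of closely spaced peaks can merge; a production tool would need a stricter criterion.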

3. Bioinformatics Analysis

Usually, the first step is cluster analysis; the main methods are the shortest distance (single linkage) method, the longest distance (complete linkage) method, the median method, the centroid method, average linkage and Ward's minimum-variance method. The data are then classified. Commonly used classifiers include the support vector machine (SVM), decision trees, neural networks and k-nearest neighbor (kNN).
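The six clustering methods listed above differ only in how the distance from a newly merged cluster to the remaining clusters is updated, which the Lance-Williams recurrence expresses with four coefficients. The following is a small, unoptimized sketch of agglomerative clustering built on that recurrence. The function names are hypothetical, and the choice of squared Euclidean distance for the centroid, median and Ward methods (plain Euclidean for the others) is the usual convention assumed here, not something stated in the text.

```python
from itertools import combinations


def lance_williams(method, ni, nj, nk):
    """Coefficients (alpha_i, alpha_j, beta, gamma) for merging clusters i and j,
    as seen from another cluster k; ni, nj, nk are the cluster sizes."""
    if method == "single":      # shortest distance method
        return 0.5, 0.5, 0.0, -0.5
    if method == "complete":    # longest distance method
        return 0.5, 0.5, 0.0, 0.5
    if method == "average":     # average linkage
        return ni / (ni + nj), nj / (ni + nj), 0.0, 0.0
    if method == "centroid":
        return ni / (ni + nj), nj / (ni + nj), -ni * nj / (ni + nj) ** 2, 0.0
    if method == "median":
        return 0.5, 0.5, -0.25, 0.0
    if method == "ward":        # Ward's minimum-variance method
        n = ni + nj + nk
        return (ni + nk) / n, (nj + nk) / n, -nk / n, 0.0
    raise ValueError(f"unknown method: {method}")


def cluster(points, n_clusters, method="average"):
    """Merge the closest pair of clusters until n_clusters remain.

    Returns the clusters as sorted lists of point indices.
    """
    squared = method in ("centroid", "median", "ward")

    def dist(a, b):
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return d2 if squared else d2 ** 0.5

    members = {i: [i] for i in range(len(points))}
    d = {frozenset((i, j)): dist(points[i], points[j])
         for i, j in combinations(members, 2)}
    next_id = len(points)

    while len(members) > n_clusters:
        i, j = min(combinations(members, 2), key=lambda p: d[frozenset(p)])
        dij = d[frozenset((i, j))]
        for k in members:
            if k in (i, j):
                continue
            ai, aj, b, g = lance_williams(
                method, len(members[i]), len(members[j]), len(members[k]))
            dki, dkj = d[frozenset((k, i))], d[frozenset((k, j))]
            d[frozenset((k, next_id))] = (
                ai * dki + aj * dkj + b * dij + g * abs(dki - dkj))
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    return sorted(sorted(m) for m in members.values())
```

For a peptidome profile, `points` would hold one intensity vector per sample, so that samples with similar peak patterns end up in the same cluster.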

The research process for serum peptidomics classification first divides the pretreated mass spectrometric data into a modeling group and a validation group. The modeling data are further split into a training set and a testing set. The training set is then analyzed with the t-test, Pearson correlation analysis and a genetic algorithm to find peaks of higher specificity, from which a classifier is built. The classifier is evaluated on the testing set, and this step is repeated until the model is optimized. Finally, the validation data are used to verify the model and obtain a stable one.
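The workflow above can be sketched in a few functions. This is an illustrative outline under several assumptions: only the t-test is used for peak selection (Pearson correlation and the genetic algorithm are omitted), Welch's form of the t statistic is chosen, kNN stands in for the classifier, and all function names, the 0.75 split fraction and the value of `k` are hypothetical. Note that `X` holds one row per sample, which is the transpose of the peaks-by-samples profile.

```python
import random
from statistics import mean, variance


def split(X, y, fraction, rng):
    """Randomly split samples into two groups, e.g. modeling/validation
    or training/testing. Returns ((X1, y1), (X2, y2))."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    cut = int(len(idx) * fraction)

    def pick(ids):
        return [X[i] for i in ids], [y[i] for i in ids]

    return pick(idx[:cut]), pick(idx[cut:])


def welch_t(a, b):
    """Welch's t statistic for two groups of peak intensities."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_peaks(X, y, top=2):
    """Rank peaks (columns of X) by |t| between class 0 and class 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(a, b)), j))
    return [j for _, j in sorted(scores, reverse=True)[:top]]


def knn_predict(train_X, train_y, x, peaks, k=3):
    """Majority vote among the k nearest training samples on the chosen peaks."""
    dists = sorted((sum((row[j] - x[j]) ** 2 for j in peaks), label)
                   for row, label in zip(train_X, train_y))
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def accuracy(train_X, train_y, X, y, peaks, k=3):
    """Fraction of samples in (X, y) classified correctly."""
    hits = sum(knn_predict(train_X, train_y, x, peaks, k) == label
               for x, label in zip(X, y))
    return hits / len(y)


def run_workflow(X, y, seed=0):
    """Modeling/validation split, then training/testing split, then evaluation."""
    rng = random.Random(seed)
    (model_X, model_y), (valid_X, valid_y) = split(X, y, 0.75, rng)
    (train_X, train_y), (test_X, test_y) = split(model_X, model_y, 0.75, rng)
    peaks = select_peaks(train_X, train_y)
    return (peaks,
            accuracy(train_X, train_y, test_X, test_y, peaks),
            accuracy(train_X, train_y, valid_X, valid_y, peaks))
```

In practice the testing-set accuracy would guide repeated adjustment of the selected peaks and classifier settings, while the validation group is touched only once at the end, so that the final estimate is not biased by the optimization.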

4. TOF-MS System Based Software

Software based on the TOF-MS system mainly includes: (1) for the SELDI system, ProteinChip Software 3.1 and Biomarker Wizard, with the decision tree as the core; (2) for the ClinProt system, the ClinProTools software, with cluster analysis as the core; (3) for the ClinTOF system, the BioExplorer™ software, with support vector machine (SVM), decision tree, neural networks and k-nearest neighbor (kNN) as the core.

Other bioinformatics tools for MS-related analysis include MapQuant, MASPECTRAS, SpecArray, msInspect and MZMine. These tools have not been seamlessly integrated with serum peptidomics data, so they cannot fully handle data management and analysis based on MS serum peptidome profiling.
