#### **2.2.2 Amino acid composition analysis**

Amino acid composition analysis is frequently used for protein identification owing to its low cost. Unlike methods based on peptide mass or sequence tags, it identifies proteins by exploiting the characteristic amino acid composition of each protein. The method can be used to identify proteins separated by 2-DE. The composition is determined either with radiolabelled amino acids, or by transferring the proteins to a PVDF membrane and, after hydrolysis and automated derivatization of the amino acids, separating them chromatographically to obtain the data. The database is then queried and its proteins are ranked by the difference between the measured composition and the stored one; the top-ranking proteins are the most reliable candidates. The method nevertheless has drawbacks: it is slow and requires large amounts of protein or peptide; it is of limited use for ultramicro analysis; and the measured composition may deviate because of incomplete acid hydrolysis or partial degradation.
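The database ranking step described above can be illustrated with a minimal Python sketch. It assumes a sum-of-squared-differences score, which is only one plausible choice; the text does not specify the scoring function actually used. The sequences, the names `toy_database` and `protein_A` to `protein_C`, and all function names are hypothetical. The sketch also ignores the fact that acid hydrolysis converts Asn and Gln to Asp and Glu and destroys Trp.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(sequence):
    """Return the percentage of each standard residue in a sequence."""
    counts = Counter(r for r in sequence.upper() if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def difference_score(measured, candidate):
    """Sum of squared differences between two profiles (assumed score; lower is closer)."""
    return sum((measured[aa] - candidate[aa]) ** 2 for aa in AMINO_ACIDS)


def rank_candidates(measured, database):
    """Sort database entries by increasing difference from the measured profile."""
    scored = [
        (difference_score(measured, composition(seq)), name)
        for name, seq in database.items()
    ]
    return sorted(scored)


# Hypothetical toy database; real searches run against a sequence database.
toy_database = {
    "protein_A": "MKTAYIAKQRQISFVKSHFSRQ",
    "protein_B": "GGGGSGGGGSGGGGSPPPPAAA",
    "protein_C": "MKTAYLAKQRQISFVKSHFSRE",
}

# Pretend the measured composition came from protein_A.
measured = composition("MKTAYIAKQRQISFVKSHFSRQ")
ranking = rank_candidates(measured, toy_database)
for score, name in ranking:
    print(f"{name}: {score:.1f}")
```

In this toy case the entry with the identical composition scores zero and ranks first, the near-identical sequence ranks second, and the unrelated glycine-rich sequence ranks last. With real measurements no candidate scores exactly zero, so the gap between the first and second candidates is a more useful indication of reliability than the rank alone.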

#### **2.2.3 Mass Spectrometry**

Mass Spectrometry (MS): MS is an analytical technique that measures the mass-to-charge ratio (or mass) of charged particles, molecules or molecular fragments. MS provides information on the molecular weight, molecular formula, isotopic composition and molecular structure of the samples analyzed. It has been widely applied in protein and polypeptide analysis. Owing to its high sensitivity and speed, it is particularly suitable for the online analysis and identification of polypeptides after separation and purification. Common MS techniques include electrospray MS (in which the polypeptide sample is ionized continuously during the spray process), fast atom bombardment MS (FAB MS) and isotope MS. Among them, continuous-flow fast atom bombardment (cf-FAB) and electrospray ionization (ESI) are comparatively recent developments.

Continuous-Flow Fast Atom Bombardment (cf-FAB): a soft ionization technique capable of ionizing peptides or low-molecular-weight proteins into the form [M+H]+ or [M-H]-. It is mostly applied to the separation and detection of peptides and has moderate resolution, with an accuracy better than ±0.2 amu and a flow rate of 0.5-1.5 μl·min⁻¹. For the determination, 0.5%-10% of a matrix such as glycerol, together with organic solvents, is added to the mobile phase so that the sample can be sensitized at the probe. cf-FAB is usually coupled with HPLC or capillary zone electrophoresis (CZE) to combine separation with analysis. cf-FAB methods have been established for many polypeptides and applied successfully.

Electrospray Ionization (ESI): ESI generates multiply charged ions of proteins or polypeptides, which allows the analysis of proteins with molecular weights of up to 100 kDa; its resolution is 1500-2000 amu and its accuracy about 0.01%. ESI is better suited to the online analysis of proteins with large molecular weight and requires gasification or organic solvents to sensitize the sample. ESI has been combined successfully with HPLC for the separation and analysis of growth hormone (GH) and hemoglobin. ESI can also be coupled with CZE.

MALDI-TOF MS: in this method, polypeptide samples are ionized when the matrix absorbs the laser energy. It is a tool for the accurate determination of molecular mass in current protein identification and is particularly suitable for determining the molecular weights of mixtures of proteins and polypeptides, offering high sensitivity and resolution. At present it is an indispensable tool for proteomics research. Coupled with liquid chromatography, it can identify polypeptides efficiently. In particular, when MS technologies based on different principles are coupled, they can determine not only the molecular weight of polypeptides but also their sequence. This technology is expected to play a decisive role in future proteomics studies.

#### **2.2.4 Nuclear Magnetic Resonance**

Nuclear Magnetic Resonance (NMR): NMR profiling yields purely digital signals that overlap extensively (owing to the large molecular weight) and are weak, so it was formerly seldom used for the analysis of proteins and polypeptides. With the application of 2D, 3D and 4D NMR and the progress of molecular biology and computer processing technology, NMR has gradually become a main approach to the analysis of proteins and polypeptides. NMR can be used to determine the amino acid sequence and the content of components in mixtures. Some problems remain to be solved before the method can be applied routinely to proteins, for instance how to make proteins of large molecular weight adopt a defined conformation that facilitates quantitative and qualitative analysis, and how to reduce the data processing time; both are being studied by many researchers. Despite its infrequent use in protein analysis, NMR is extremely useful for analyzing small peptides of fewer than 30 amino acids, in which case it overcomes the foregoing defects and allows rapid and accurate analysis.

#### **2.2.5 Others**

Besides the foregoing methods, amino acid composition analysis, amino acid sequence analysis, field desorption mass spectrometry (FDMS), IR and UV spectroscopy, circular dichroism spectroscopy, bioassay techniques, tagging methods and immunological methods have also been used for the identification, analysis and detection of polypeptides.

### **2.3 Analysis method**

In serum peptidomics research, the analysis of the collected data requires four kinds of technical platform: a high-efficiency analysis platform, since computers and networks have become necessary tools of biological research; a high-throughput platform, mainly concerned with how to use information technology to analyze very large data sets; a data mining platform, which must be able to extract knowledge from the massive data stored in databases or other information repositories; and a data visualization platform, which, in describing systematic relations, must consider the functions of nucleic acids, proteins, cells, organs and tissues, that is to say, a systematic method is needed to understand vital activities. Currently, databases used for proteomics research include SWISS-PROT, BLOCKS, SMART, PROSITE, WORLD-2D-PAGE, EMBL, GenBank, DDBJ, ProClass, PRINTS, MASCOT, PROTO-MAP, DOMO, PDB and NCBI. Among them, SWISS-PROT is a real protein sequence database and also the largest and most diversified proteome database in the world. EMBL collects protein sequences that have been translated automatically from nucleic acid sequences and have not yet been entered into SWISS-PROT. NCBI contains protein sequences translated from DNA in GenBank as well as sequences from the PDB, SWISS-PROT and PIR databases. At present, many tools and methods have been developed for processing and analyzing MS data in serum peptidomics, mainly including:

#### **2.3.1 Bioconductor**

Bioconductor is an open source and open development software project with the broad goals of providing widespread access to a broad range of powerful statistical and graphical methods for the analysis of genomic data, facilitating the inclusion of biological metadata, and driving the comprehensive analysis and application of data. It provides users with integrated packages, including many packages for MS data processing and analysis. Because Bioconductor is based on the R language, users must be familiar with the R working environment and have some knowledge of programming; that is to say, it will be difficult for general clinical and laboratory workers.

#### **2.3.2 MATLAB**

MATLAB is a commercial software package that integrates statistical analysis and engineering computation. With MATLAB as the development platform, the pretreatment, display and statistical analysis of MS data can be carried out. In research on serum peptidome profiling of prostate cancer, breast cancer and bladder cancer, this tool, together with GeneSpring from Agilent, has achieved good results in data analysis (Villanueva et al., 2005).
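The pretreatment of MS data mentioned under Section 2.3.2 can be illustrated with a minimal sketch. It is written in Python with NumPy purely for illustration, not in MATLAB or R, and it is not the pipeline used in the cited studies. The three steps chosen here (rolling-minimum baseline subtraction, total ion current normalization, and picking local maxima above a relative height) are assumptions about what a simple pretreatment might contain. The spectrum is synthetic and noise-free, and all function names and parameter values are hypothetical.

```python
import numpy as np


def subtract_baseline(intensity, window=51):
    """Subtract a rolling-minimum baseline estimate (window is assumed odd)."""
    half = window // 2
    padded = np.pad(intensity, half, mode="edge")
    baseline = np.array(
        [padded[i:i + window].min() for i in range(len(intensity))]
    )
    return intensity - baseline


def normalize_tic(intensity):
    """Scale intensities so that they sum to 1 (total ion current)."""
    total = intensity.sum()
    return intensity / total if total > 0 else intensity.copy()


def pick_peaks(mz, intensity, min_rel_height=0.05):
    """Return local maxima at least min_rel_height times the tallest point."""
    inner = intensity[1:-1]
    is_local_max = (inner > intensity[:-2]) & (inner >= intensity[2:])
    tall_enough = inner >= min_rel_height * intensity.max()
    idx = np.where(is_local_max & tall_enough)[0] + 1
    return mz[idx], intensity[idx]


# Synthetic spectrum: a slowly decaying baseline plus two Gaussian peaks.
mz = np.linspace(1000.0, 3000.0, 4001)  # 0.5 m/z spacing
drift = 50.0 * np.exp(-(mz - 1000.0) / 1000.0)
signal = (
    200.0 * np.exp(-((mz - 1500.0) ** 2) / 8.0)
    + 120.0 * np.exp(-((mz - 2200.0) ** 2) / 8.0)
)
raw = drift + signal

corrected = subtract_baseline(raw)
normalized = normalize_tic(corrected)
peak_mz, peak_intensity = pick_peaks(mz, normalized)
print(peak_mz)
```

Measured spectra contain noise, so a smoothing step and a noise-based threshold would normally precede peak picking; both are omitted here to keep the sketch short.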
