**2.3.1 Single-locus sequence typing**

Sequencing of a single genetic locus has been used in epidemiological studies of many bacterial species. For this approach it is essential to select a highly variable gene sequence. Valuable typing results have been obtained for *S. pyogenes* by sequencing 150 nucleotides coding for the N-terminal end of the M protein (*emm* typing) (Beall et al. 2000). Another example is *spa* typing of *S. aureus*, which consists of sequencing the X region of the protein A gene. This technique is widely used for subtyping methicillin-resistant *S. aureus* (MRSA) strains (Shopsin, 1999, 2000; Shopsin & Kreiswirth 2001; Harmsen et al. 2003) (Figure 5).

Fig. 5. Sequence of steps involved in *spa* typing.


**2.4 Analysis of results obtained by molecular epidemiology**

Raw data generated by molecular typing methods, such as gel electrophoresis band patterns, sequence alignments, or hybridization matrix patterns, can be compared and interpreted visually when there are few strains. However, when the analysis includes many strains, the comparison becomes very difficult. Computer programs have therefore become indispensable in molecular epidemiological investigations.

Computer programs that compare sequences or band patterns employ clustering algorithms that generate dendrograms or trees illustrating the arrangement of the resulting clusters. For pattern recognition, as with electrophoretic banding patterns or hybridization matrices, additional programs are needed to capture, digitize, and normalize the patterns. Several platforms for databasing and gel analysis have been developed for computer-assisted analysis, such as BioNumerics and GelCompar (Applied Maths, Sint-Martens-Latem, Belgium), Diversity Database Fingerprinting Software (Bio-Rad Laboratories, Hercules, CA), and Treecon (Van de Peer and De Wachter 1994).

**3. Epidemiologic applications of bacterial typing techniques**

A more comprehensive knowledge of the evolution and epidemiology of bacterial pathogens has been obtained by combining genetic, phenotypic, spatial and temporal data.

**2.3.2 Multi-locus sequence typing (MLST)**

MLST is a genotyping method based on the measurement of DNA sequence variation in a set of housekeeping genes (usually seven genes) whose sequences are constrained because of the essential function of the proteins they encode. This method was proposed in 1998 as a general approach to provide accurate, portable data that were appropriate for the epidemiological investigation of bacterial pathogens and which also reflected their evolutionary and population biology (Maiden et al. 1998).

MLST schemes have been developed for several species, and databases containing the allelic profiles of a great number of strain types, together with the corresponding clinical information, can be readily consulted over the Internet (http://www.mlst.net/ and http://pubmlst.org/) (Aanensen & Spratt 2005). Additional information such as date, place of isolation and antibiotype is included when a strain is deposited, so these databases expand continuously as new sequence types (STs) are identified and additional nucleotide sequence data are deposited.

Internal fragments of the seven housekeeping genes are amplified by PCR from chromosomal DNA using the primer pairs described on the web site. The amplified fragments are directly sequenced in each direction. The sequence at each of the seven *loci* is then compared with all the known alleles at that *locus*, and the number of the matching, previously described allele (or a new number, if the sequence is novel) is assigned to the *locus*. For a given isolate, the alleles present at the seven *loci* are combined into an allelic profile, which is assigned a sequence type (ST) designation (Maiden et al. 1998). Relationships among isolates are assessed by comparing allelic profiles: closely related isolates have identical STs, or STs that differ at a few *loci*, whereas unrelated isolates have unrelated STs (Figure 6).
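The assignment of alleles and STs described above can be illustrated with a minimal sketch. It is not the implementation used by any MLST database: the locus names, the four-nucleotide "sequences", and the contents of `ALLELE_DB` and `ST_DB` are hypothetical placeholders chosen only to show the lookup logic.

```python
from typing import Dict, Optional, Tuple

# Hypothetical seven-locus scheme; the locus names are placeholders.
LOCI = ("locA", "locB", "locC", "locD", "locE", "locF", "locG")

# Toy allele database (assumption): for each locus, known sequence -> allele number.
ALLELE_DB: Dict[str, Dict[str, int]] = {
    locus: {"ACGT": 1, "ACGA": 2} for locus in LOCI
}

# Toy profile database (assumption): allelic profile -> sequence type (ST).
ST_DB: Dict[Tuple[int, ...], int] = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 1, 2, 1, 1, 1, 1): 2,
}


def assign_allele(locus: str, sequence: str) -> int:
    """Return the allele number of a sequence; an unseen sequence gets a new number."""
    known = ALLELE_DB[locus]
    if sequence not in known:
        known[sequence] = max(known.values(), default=0) + 1
    return known[sequence]


def allelic_profile(isolate: Dict[str, str]) -> Tuple[int, ...]:
    """Combine the allele numbers at all loci, in scheme order, into a profile."""
    return tuple(assign_allele(locus, isolate[locus]) for locus in LOCI)


def sequence_type(profile: Tuple[int, ...]) -> Optional[int]:
    """Look up the ST of a profile; None means the profile is not yet in the database."""
    return ST_DB.get(profile)


# An isolate whose sequences all match known alleles.
isolate = {locus: "ACGT" for locus in LOCI}
isolate["locC"] = "ACGA"
profile = allelic_profile(isolate)
print(profile, sequence_type(profile))  # (1, 1, 2, 1, 1, 1, 1) 2

# An isolate carrying a novel sequence at one locus: new allele, no ST yet.
novel = dict(isolate, locA="ACGG")
novel_profile = allelic_profile(novel)
print(novel_profile, sequence_type(novel_profile))  # (3, 1, 2, 1, 1, 1, 1) None
```

In practice, new alleles and STs are assigned by the curators of the scheme rather than locally; the automatic numbering here only stands in for that step.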

A number of clustering algorithms have been employed to analyze the data in the MLST scheme, including UPGMA (unweighted pair group method with arithmetic mean) and eBURST analyses (Feil et al. 2004).
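As a rough illustration of how allelic profiles can be grouped, the sketch below links STs that differ at exactly one locus (single-locus variants) and collects the linked STs into groups. This is the linking idea usually associated with eBURST-style analyses, but it is a simplification and not eBURST itself: it does not predict founders or compute any support values. The four profiles are invented for the example.

```python
from typing import Dict, List, Tuple

Profile = Tuple[int, ...]


def allelic_distance(a: Profile, b: Profile) -> int:
    """Number of loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def group_by_slv(profiles: Dict[int, Profile]) -> List[List[int]]:
    """Group STs connected, directly or through a chain, by single-locus differences."""
    parent = {st: st for st in profiles}

    def find(st: int) -> int:
        # Union-find lookup with path halving.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    sts = sorted(profiles)
    for i, a in enumerate(sts):
        for b in sts[i + 1:]:
            if allelic_distance(profiles[a], profiles[b]) == 1:
                parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for st in sts:
        groups.setdefault(find(st), []).append(st)
    return sorted(groups.values())


# Invented profiles: ST2 differs from ST1 at one locus, ST3 from ST2 at one locus,
# and ST4 shares no allele with the others.
example = {
    1: (1, 1, 1, 1, 1, 1, 1),
    2: (1, 1, 2, 1, 1, 1, 1),
    3: (1, 1, 2, 1, 3, 1, 1),
    4: (5, 6, 7, 8, 9, 10, 11),
}
print(group_by_slv(example))  # [[1, 2, 3], [4]]
```

The same pairwise `allelic_distance` values could instead be passed to a hierarchical clustering routine to obtain a UPGMA dendrogram.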

The original conception of MLST used the allele number as the primary unit of analysis (Enright & Spratt 1998; Maiden et al. 1998), which was appropriate for organisms in which horizontal genetic exchange is common. However, MLST data can also be interpreted by tree-building approaches that use nucleotide substitutions rather than allelic changes as the unit of analysis; this is more pertinent to bacteria in which mutational change predominates over genetic exchange in the evolution of variants.
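The difference between the two units of analysis can be made concrete with a small sketch. It assumes aligned sequences of equal length at each locus and uses three invented loci with invented eight-nucleotide sequences; the function names are illustrative only. A point mutation and a replaced segment both count as one allelic change, whereas the nucleotide count separates them.

```python
from typing import Dict

Isolate = Dict[str, str]  # locus name -> aligned sequence (hypothetical data)


def allele_differences(a: Isolate, b: Isolate) -> int:
    """Count loci whose sequences differ; any change at a locus counts once."""
    return sum(a[locus] != b[locus] for locus in a)


def nucleotide_differences(a: Isolate, b: Isolate) -> int:
    """Count differing nucleotide positions summed over all loci."""
    return sum(
        x != y
        for locus in a
        for x, y in zip(a[locus], b[locus])
    )


reference = {"locA": "ACGTACGT", "locB": "ACGTACGT", "locC": "ACGTACGT"}
# One point mutation at locA.
point_mutant = dict(reference, locA="ACGTACGA")
# locA replaced by a divergent segment, as might follow a recombination event.
recombinant = dict(reference, locA="ATGAATGA")

print(allele_differences(reference, point_mutant),
      nucleotide_differences(reference, point_mutant))  # 1 1
print(allele_differences(reference, recombinant),
      nucleotide_differences(reference, recombinant))   # 1 4
```

Counting alleles treats the imported segment as a single event, which is why that unit suits organisms with frequent genetic exchange, while counting substitutions is more informative when variants arise mainly by mutation.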

An important advantage of MLST is that its results are unambiguous and easily exchangeable, much more so than images of agarose gel electrophoresis patterns. Its drawbacks are practical: it is a relatively expensive technique, available mainly in reference or research laboratories. Nevertheless, MLST is increasingly applied as an informative typing tool that enables international comparison of isolates. It has been applied to problems as diverse as the emergence of antibiotic-resistant variants (Crisostomo et al. 2001; Enright et al. 2002), the association of particular genotypes with virulence (Brueggemann et al. 2003) or antigenic characteristics (Meats et al. 2003), and the global spread of disease caused by novel variants (Albarracin Orio et al. 2008). In addition to these medically motivated epidemiological analyses, MLST data have been exploited in evolutionary and population analyses (Jolley et al. 2000) that estimate recombination and mutation rates (Feil et al. 2001) and in the investigation of the evolutionary relationships among bacteria that are classified as belonging to the same genus (Godoy et al. 2003).

Fig. 6. Sequence of steps involved in MLST scheme. Adapted from Vazquez et al. 2004.
