**1. Introduction**

58 Understanding Tuberculosis – Deciphering the Secret Life of the Bacilli


#### **1.1 Insertion sequences in** *Mycobacterium tuberculosis* **complex**

An insertion sequence (IS) is a short mobile DNA element that encodes the proteins required for its own transposition, which allows it to spread within the genome. ISs are widely distributed in prokaryotes and are grouped into families, established by Mahillon & Chandler (1998), on the basis of structural characteristics and transposase similarities.

More than 46 ISs have been identified in different species of the genus *Mycobacterium*, mostly on the basis of sequence similarities (Brosch et al., 2000). The genomes of the members of the *Mycobacterium tuberculosis* complex (MTBC) contain dispersed IS elements that can be assigned, according to their characteristics, to the following families: IS*3*, IS*5*, IS*21*, IS*30*, IS*110*, IS*256*, IS*1535*, IS*L3* and other IS-*like* elements (Gordon et al., 1999; Table 1).

ISs can induce duplications, deletions and rearrangements in the bacterial genome, all of which are essential changes for the genome plasticity of the members of the MTBC (Mahillon & Chandler, 1998). Not all of the ISs described in *M. tuberculosis* are active and able to transpose from one site to another in the genome; some of the elements are defective copies. Furthermore, some of them have a limited host range (Brosch et al., 2000).

Table 1 shows the ISs described in *M. tuberculosis*, which are briefly presented below.

#### **1.2 ISs families**

The IS*3* family represents an extensive set of insertion elements in bacteria. Members of this family are characterized by a length of between 1200 and 1600 bp, inverted repeats (IRs) 20 to 40 bp long, and two overlapping open reading frames (ORFs: *orf*A and *orf*B) (Mahillon & Chandler, 1998; McAdam et al., 2000). Insertion produces a duplication of 3 or 4 bp at the insertion point (Mendiola et al., 1992).
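The short duplication at the insertion point can be pictured with a toy model. The sketch below is only an illustration under simplifying assumptions: the function name `insert_with_tsd`, the sequences, and the choice to model the genome as a plain string are all hypothetical, and the biochemistry of transposition (staggered cuts and gap repair) is reduced to copying the target site onto both sides of the inserted element.

```python
def insert_with_tsd(genome: str, element: str, pos: int, tsd_len: int = 3) -> str:
    """Insert `element` into `genome` at `pos`, duplicating the target site.

    Toy model (an assumption, not a mechanistic simulation): the `tsd_len`
    bases starting at `pos` are treated as the target site and end up as a
    direct repeat flanking the inserted element.
    """
    if tsd_len <= 0:
        raise ValueError("tsd_len must be positive")
    if pos < 0 or pos + tsd_len > len(genome):
        raise ValueError("target site falls outside the genome")

    target = genome[pos:pos + tsd_len]
    # Upstream DNA + target copy + element + target copy + downstream DNA.
    return genome[:pos] + target + element + target + genome[pos + tsd_len:]


if __name__ == "__main__":
    toy_genome = "AAGCTTGG"        # hypothetical host sequence
    toy_element = "TGAACCGGTTCA"   # hypothetical stand-in for an IS
    print(insert_with_tsd(toy_genome, toy_element, pos=2, tsd_len=3))
```

In this model the genome grows by the length of the element plus the length of the duplicated target site, which is why flanking direct repeats are a common signature used to recognise an insertion event.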

IS*6110* the Double-Edged Passenger 61

In the MTBC, IS*1540*, IS*1604*, IS*1556*/*990* and IS*6110* belong to this family. The most representative member is IS*6110*, one of the most abundant and best characterized insertion elements in the MTBC. Copies of this IS are found at 16 positions in the genome of *M. tuberculosis* H37Rv, which makes it an important epidemiological tool (Small & van Embden, 1994).

The other elements of this family, namely IS*1540*, IS*1604* and IS*1556*/*990*, lack the IRs and direct repeats (DRs) or contain mutations in *orf*B, which presumably render them inactive and non-functional (Dziadek et al., 1998; McAdam et al., 2000).

| **Family** | **ISs (number of copies / length)** | **Source** |
|---|---|---|
| IS*3* | IS*6110* (16 / 1361 bp); IS*1540* (1 / 1164 bp); IS*1604* (1 / 1410 bp); IS*1556/990* (1 / 1346 bp) | Thierry et al., 1990a; Cole et al., 1998; Gordon et al., 1999; Dziadek et al., 1998 |
| IS*5* | IS*1560* (1+1' / 1567 bp); IS-like (2 / 968 bp) | Cole et al., 1998; Mariani et al., 1993 |
| IS*21* | IS*1532* (1 / 2609 bp); IS*1533* (1 / 2212 bp); IS*1534* (1 / 2129 bp) | Dziadek et al., 2000; Cole et al., 1998; Gordon et al., 1999 |
| IS*30* | IS*1603* (1 / 1327 bp) | Cole et al., 1998 |
| IS*110* | IS*1558* (1+1' / 1212 bp); IS*1607* (1 / 1227 bp); IS*1608'* (2' / 1031 bp); IS*1547* (2 / 1351 bp) | Fang et al., 1999a |
| IS*256* | IS*1081* (6 / 1324 bp); IS*1552'* (1' / 844 bp); IS*1553* (1 / 1398 bp); IS*1554* (1 / 1435 bp) | Collins et al., 1991; Cole et al., 1998 |
| IS*605* | IS*1535* (1 / 2322 bp); IS*1536* (1 / 1391 bp); IS*1537* (1 / 1889 bp); IS*1538* (1 / 2055 bp); IS*1539* (1 / 2057 bp); IS*1602* (1 / 2052 bp); IS*1605'* (1' / 287 bp) | Gordon et al., 1999; Cole et al., 1998 |
| IS*L3* | IS*1555'* (1' / 398 bp); IS*1557* (2+1' / 1451 bp); IS*1561'* (1' / 1319 bp); IS*1606'* (1' / 330 bp) | Cole et al., 1998; Gordon et al., 1999 |
| Unknown | IS*1556* (1 / 1468 bp) | Cole et al., 1998 |

Table 1. ISs present in *M. tuberculosis* H37Rv. ' Defective copy of IS, putatively inactivated.

Members of the IS*21* family are among the largest bacterial IS elements, with lengths between 2 and 2.5 kb. Their IRs are variable. These elements encode two proteins required for transposition (IstA and IstB). Transposition produces a duplication of 4 or 5 bp at the insertion point. The transposases encoded by IS*1532*, IS*1533* and IS*1534* show homology to those of elements of the IS*21* family. These elements possess terminal IRs of 48, 54 and 49 bp, respectively, and internal DRs (Mahillon & Chandler, 1998). All of them are absent from 40% of *M. tuberculosis* clinical isolates, as well as from *M. bovis* and *M. bovis* BCG Pasteur (Gordon et al., 1999).

The IS*110* family exhibits features that are unusual for bacterial ISs: its members have no IRs or DRs (McAdam et al., 2000). They can be divided into two groups.

The first group in *M. tuberculosis* includes IS*1608'* and IS*1547*. Unlike other elements, this group has a target sequence: CATN(6-9)(T,C)CCTT. IS*1547* was detected only in members of the MTBC and seems to be a preferential site for IS*6110* insertion (Fang et al., 1999a). The second group includes IS*1558* and IS*1607*, which have imperfect IRs or lack them (McAdam et al., 2000). Some copies of these IS elements in the *M. tuberculosis* genome are defective, as is the case for IS*1558*' and IS*1608*'.

The IS*256* family is probably the largest family of ISs in mycobacteria, accounting for more than 25% of the known ISs. Its members have been divided into two groups according to their structural organization (Guilhot et al., 1999). One of the groups comprises the MTBC members IS*1081*, IS*1552'*, IS*1553* and IS*1554*.

IS*1081*, the main member of the IS*256* family in the MTBC, was found in the genome of *M. bovis*. It is 1324 bp long, with 15 bp terminal IRs, and contains a large ORF. There are six copies of IS*1081* in the genome of *M. tuberculosis* H37Rv (Collins et al., 1991). The other elements of the IS*256* family are IS*1552'*, IS*1553* and IS*1554*. Each has a single ORF, coding for proteins of 281, 409 and 439 amino acids, respectively. It is speculated that IS*1552'* was transferred from *Rhodococcus* into *M. tuberculosis*. In *M. tuberculosis* H37Rv, IS*1552*' is defective.

The IS*5* family is a very heterogeneous group of ISs, with lengths ranging from 850 to 1640 bp. Two different ISs of this family have been described in the MTBC: IS*1560* and IS-like (Cole et al., 1998; Mariani et al., 1993). One of the two copies of IS*1560* appears to be defective and is probably non-functional in *M. tuberculosis* H37Rv.

The members of the IS*30* family have a single open reading frame, IRs of 20-30 bp, and DRs of 2-3 bp created upon insertion (Mahillon & Chandler, 1998). IS*1603*, an insertion sequence 1327 bp long and present in a single copy in the *M. tuberculosis* genome, belongs to this family (Table 1). It has IRs 63 bp long, and no DRs have been detected (McAdam et al., 2000).

*M. tuberculosis* contains seven members of the IS*605* family (IS*1535*, IS*1536*, IS*1537*, IS*1538*, IS*1539*, IS*1602* and IS*1605'*) in its genome (Gordon & Supply, 2005). These ISs have two overlapping ORFs (Gordon et al., 1999; McAdam et al., 2000).

Several ISs present in *M. tuberculosis* belong to the IS*L3* family: the defective copies IS*1555*', IS*1561*' and IS*1606*', and IS*1557*. The IS*1561*' element is absent from some clinical strains of *M. tuberculosis* (Gordon et al., 1999) and from *M. microti* (Gordon & Supply, 2005).

**2. Structural organization and function of the IS***6110*

IS*6110* was initially named IS*986*. It is a genomic insertion element 1361 bp long, with 28 bp imperfect IRs and duplications of 3 or 4 bp flanking the insertion site. It has two overlapping ORFs (*orf*A and *orf*B) coding for a transposase and shows similarities with elements of the IS*3* family of prokaryotes (Accession No.: X17348, M29899; Fig. 1).

IS*6110* was found to be specific to mycobacteria belonging to the MTBC (Thierry et al., 1990a) and, because of the high degree of polymorphism observed among MTBC strains, it became the main target of the first reference genotyping tool (see part 3.2; Otal et al., 1991). It is also an important factor in the evolution of the *M. tuberculosis* genome. The sequences of IS*6110* and IS*986/*IS*987* identified in the MTBC were practically identical and are considered the same IS (Thierry et al., 1990b; McAdam et al., 1990).
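The target sequence quoted in section 1.2 for the first IS*110* group, CATN(6-9)(T,C)CCTT, can be read as a simple pattern: CAT, then 6 to 9 bases of any kind, then T or C, then CCTT. The following sketch shows one way such a pattern might be searched for in a sequence. It is a minimal illustration, not a published method: the function name `find_target_sites` and the example sequence are hypothetical, only the given strand is scanned (a real search would presumably also scan the reverse complement), and ambiguous bases other than A, C, G and T are not handled.

```python
import re

# CATN(6-9)(T,C)CCTT written as a regular expression. The lookahead (?=...)
# lets matches that start at different positions overlap each other.
TARGET_SITE = re.compile(r"(?=(CAT[ACGT]{6,9}[TC]CCTT))")


def find_target_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each candidate target site.

    At most one match is reported per start position; because the spacer
    quantifier is greedy, it is the longest spacer that fits.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in TARGET_SITE.finditer(seq)]


if __name__ == "__main__":
    toy_sequence = "GGCATACGTACTCCTTAA"  # hypothetical sequence
    for start, site in find_target_sites(toy_sequence):
        print(start, site)
```

Using a lookahead rather than a plain pattern is a design choice that avoids missing a site that begins inside another candidate match.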
