**9. GA-map technology platform in gut microbiota diagnostics**

The GA-map platform has given rise to a pipeline of assays for analysis of disease based on microbiota composition. The platform comprises a DNA purification module, a probe design module, a patent-protected approach for the actual gut microbiota screening, and a diagnostic database. The GA-map assay is intended to make the information in the gut microbiota usable for diagnostic purposes.

An outline of the GA-map platform for gut microbiota diagnostics is illustrated in Figure 2. The platform can be used to assess health conditions of individuals based on the composition of the gut microbiota. It can also serve both the pharmaceutical industry and governmental health authorities in epidemiological population screenings and clinical trials. In addition, the technology can be used for early detection of undesirable conditions in the gut, which can then be corrected before illness manifests. The core aspects of the technology were developed and patented at the University of Oslo in 1998 (US patent # 6 617 138 Nucleic Acids Detection Methods). The technology has since been refined at Nofima Mat (Matforsk). Currently the technology is patented worldwide, and Genetic Analysis AS has been set up to commercialize it.

A high throughput analysis platform based on this technology will make it possible to gain greater understanding of the relationship between the composition of the gut microbiota and health, and could serve as a diagnostic and prognostic tool in the future.

#### **9.1 GA-map array**

Probe labeling is based on the minisequencing principle, where a DNA polymerase extends the probe with a single labeled dideoxynucleotide (Syvannen et al., 1990). In the GA-map assay several probes are labeled simultaneously and then detected by reverse hybridization to complementary strands spotted onto a solid phase (Vebo et al., 2011). The probes are constructed so that they hybridize adjacent to discriminative gene positions. If the target bacterium is present, the polymerase incorporates a labeled dideoxynucleotide (Figure 3A). The solid phase (i.e. a microarray or beads) is then used to separate the probes by hybridization to their respective complementary sequences attached to it (Figure 3B, exemplified by an array).
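The single-base extension step can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the GA-map implementation: hybridization is modeled as an exact match, only the labeled dideoxynucleotide yields a signal, and the function names and toy sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def extend_probe(probe, template):
    """Return the dideoxynucleotide added to the probe's 3' end.

    Both sequences are written 5'->3'. Returns None when the probe has no
    exact binding site, or when no template base is left to copy.
    """
    site = reverse_complement(probe)   # template region the probe pairs with
    pos = template.find(site)
    if pos <= 0:
        return None
    # The probe's 3' end pairs with template[pos]; the polymerase copies
    # the adjacent (discriminative) base template[pos - 1].
    return COMPLEMENT[template[pos - 1]]


def is_labeled(probe, template, labeled_ddntp="C"):
    """Assumption: only one dideoxynucleotide type carries the label."""
    return extend_probe(probe, template) == labeled_ddntp


# Toy sequences (hypothetical): the target has a discriminative G next to
# the probe site, the non-target has an A at that position.
probe = "GCCTACGG"
target = "ATGCCGTAGGCTT"
non_target = "ATACCGTAGGCTT"
print(is_labeled(probe, target), is_labeled(probe, non_target))
```

With a discriminative guanine in the target, the probe is extended with a labeled cytosine, matching the situation drawn in Figure 3A; the non-target template leads to incorporation of an unlabeled base and therefore no signal.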

#### **9.2 Probe design**

Several steps in the GA-map array process can produce wrong patterns if the probes are not properly designed. A probe may bind to the wrong target and be labeled that way. A probe may also be labeled by using itself, or another probe, as the template. Finally, a probe may bind to the wrong spot on the array. Successful application of the GA-map assay therefore requires probe design software that accounts for these unwanted reactions, each of which leads to false results.

The probe design builds on a novel way of classifying bacteria from 16S rRNA gene sequences. Rather than classifying bacteria by traditional phylogenetic tree-based approaches, the bacteria are placed in a coordinate system (Rudi et al., 2006). This approach has two benefits: very large numbers of bacteria can be analyzed, and phylogroups are easily identified for probe design. This is illustrated in Figure 4.
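To make the idea of coordinate-based classification concrete, the sketch below assigns each sequence a coordinate vector and compares sequences by distance. The choice of dinucleotide frequencies as coordinates is an assumption made purely for illustration; the text above does not specify the transformation used by Rudi et al. (2006), and the sequences are toy examples rather than real 16S rRNA genes.

```python
from itertools import product
from math import sqrt


def kmer_coordinates(seq, k=2):
    """Map a sequence to a vector of k-mer frequencies (its 'coordinates')."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:          # skip k-mers with ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


def distance(a, b):
    """Euclidean distance between two coordinate vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy sequences: two close relatives and one unrelated sequence.
seq_a = "ACGTACGTACGT"
seq_b = "ACGTACGTACGA"
seq_c = "GGGGCCCCGGGG"
coords = {name: kmer_coordinates(s)
          for name, s in [("a", seq_a), ("b", seq_b), ("c", seq_c)]}
print(distance(coords["a"], coords["b"]) < distance(coords["a"], coords["c"]))
```

Because every sequence is reduced to a fixed-length vector, no multiple alignment or tree is needed, which is one plausible reason such a representation scales to very large numbers of bacteria. Groups of nearby points can then be selected as targets, as in Figure 4.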

Gut Microbiota in Disease Diagnostics 109

Fig. 2. GA pipeline. This figure illustrates the GA core technology and pipeline. Based on signatures in the 16S rRNA gene sequences, GA probes are designed and evaluated *in silico* using the GA probe design tool. The GA analysis involves automated sample preparation combined with array hybridization. The whole information flow is stored in the GA database, including the information about sample preparation and storage. The results to the end user are in the form of a direct description of the microbiota with respect to consequences for health and disease. Using this proprietary GA technology, a GA array is obtained, giving a specific "fingerprint" of each person's gut microbiota.

Fig. 3. Schematic outline of the GA-map assay. (A) Illustration of the SNuPE principle. An unlabelled probe hybridizes adjacent to a discriminative guanine on the complementary DNA strand. A DNA polymerase single-base extends the probe with a labeled cytosine dideoxynucleotide. (B) Illustration of array hybridization of SNuPE labeled probes. Three probes are illustrated by green, red and blue bars. The green and the red probes are labeled. The probes are hybridized to their complementary sequences on an array, as illustrated with the squares. The green and red probes will give a signal on the array due to the label, while the blue probe will not give a signal since it is not labeled.


Fig. 4. Illustration of coordinate-based classification of bacteria related to IBD. Each point in the plot represents a 16S rRNA gene sequence from a single bacterium. The distances between the points reflect the relatedness between the bacteria. For illustration, the points labeled green are target organisms for probe design, while those labeled red are non-target organisms.

After a set of probes has been constructed, the probes are evaluated with respect to whether they will self-label (Figure 5A), whether they can cross-label (Figure 5B), and whether they will bind to a wrong spot on the array (Figure 5C). This bioinformatics evaluation is crucial for the successful construction of a functional probe set based on the GA-map technology (Vebo et al., 2011).
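The three checks can be sketched as simple sequence comparisons. The scoring below is an illustrative assumption, not the published probe design software: labeling risk is approximated by the longest exactly complementary stretch at a probe's 3' end, and wrong-spot risk by the longest stretch two probes share. The probe names, sequences and the `min_len` threshold are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def three_prime_pairing(probe, template, min_len=4):
    """Length of the longest 3' suffix of `probe` (>= min_len) that can
    anneal to `template` with a template base left to copy; 0 if none."""
    best = 0
    for n in range(min_len, len(probe) + 1):
        site = reverse_complement(probe[-n:])
        # Search from index 1 so one template base lies 5' of the site.
        if template.find(site, 1) == -1:
            break
        best = n
    return best


def longest_shared_run(a, b):
    """Length of the longest substring common to both sequences."""
    best = 0
    prev = [0] * (len(b) + 1)
    for base_a in a:
        cur = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def evaluate(probes, min_len=4):
    """Return self-label, cross-label and wrong-spot scores for a probe set."""
    names = list(probes)
    pairs = [(p, q) for p in names for q in names if p != q]
    self_label = {p: three_prime_pairing(probes[p], probes[p], min_len)
                  for p in names}
    cross_label = {(p, q): three_prime_pairing(probes[p], probes[q], min_len)
                   for p, q in pairs}
    # A probe binds another probe's spot through sequence the two share,
    # since each spot carries the complement of its own probe.
    wrong_spot = {(p, q): longest_shared_run(probes[p], probes[q])
                  for p, q in pairs}
    return self_label, cross_label, wrong_spot


probes = {"prone": "GATTACAGGCTTAAGC", "clean": "ACCACAACACCAACAC"}
self_label, cross_label, wrong_spot = evaluate(probes)
print(self_label, cross_label, wrong_spot)
```

High scores flag probes to discard or redesign. A real evaluation would more likely use hybridization thermodynamics than exact-match lengths, but the structure of the checks (one score per probe, and one per ordered probe pair) mirrors the panels of Figure 5.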

Validation of probes constructed with the probe design software has shown a high success rate (Vebo et al., 2011). Prior to the development of the software, probes were identified manually from multiple sequence alignments. The conclusion from the manual constructions, however, was that these probes did not perform satisfactorily, and that probe construction involves too many considerations to be done manually.

Fig. 5. Illustration of steps in probe evaluation. (A) The first step is to evaluate whether the probe will self-label. A high value here indicates a high risk of self-labeling. (B) The next step is to evaluate the potential for cross-labeling. A color code is used to illustrate the risk of cross-labeling. High values indicate high risk. (C) The final evaluation is whether the probe will bind to the right spot on the array. The red diagonal line indicates correct hybridization, while red squares outside the diagonal would indicate wrong hybridization.


Fig. 6. The prevalence of the GA-map bacteria was determined within age groups. The color code indicates the prevalence from absent (white) to present in all samples (black).

Fig. 7. Bacteria which are important for age prediction. For each bacterium the age contribution is a function of probe signal. Adding all the age contributions gives the predicted age (see function above). The final panel shows a regression between the observed and predicted ages for all the samples in our data.

#### **9.3 Application of the GA-map array to describe the temporal development of the gut microbiota in infants**

We have evaluated the recently developed GA-map infant microarray as a high throughput assay for screening of the gut microbiota. We analyzed 216 faecal samples collected from a cohort of 47 infants from 1 day until 2 years of age. To test the predictive ability of the assay, we asked whether we could predict the age of the infants based on the microarray data.
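An additive prediction of this kind, where each bacterium contributes to the predicted age as a function of its probe signal and the contributions are summed, can be sketched as follows. The contribution functions, probe names, baseline and signal values are all hypothetical placeholders; the fitted functions from the study are not reproduced here.

```python
def predict_age(signals, contributions, baseline=0.0):
    """Sum per-bacterium age contributions, each a function of probe signal.

    signals:       probe name -> measured signal
    contributions: probe name -> function mapping signal to an age contribution
    baseline:      constant offset (an assumption of this sketch)
    """
    return baseline + sum(f(signals[name]) for name, f in contributions.items())


# Hypothetical contribution functions: one probe whose signal rises with
# age and one whose signal falls with age.
contributions = {
    "probe_A": lambda signal: 10.0 * signal,
    "probe_B": lambda signal: -4.0 * signal,
}
signals = {"probe_A": 0.5, "probe_B": 0.25}
print(predict_age(signals, contributions, baseline=2.0))
```

Keeping one function per probe makes it easy to inspect which bacteria drive the prediction, and predicted ages can then be compared with observed ages by regression.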

The Prevention of Allergy Among Children in Trondheim (PACT) study is a large population-based intervention study in Norway focused on childhood allergy (Oyen et al., 2006). The samples included here are a subset from the PACT study. Mechanical lysis was used for cell disruption, and an automated magnetic bead-based method was used for DNA purification. The approach has previously been described by Skånseng et al. (2006).

We found that the primer pairs commonly used for amplification of the full-length 16S rRNA gene showed poor amplification of bifidobacteria. To circumvent this problem we developed a novel primer pair to obtain a near full-length universal 16S rRNA amplicon. The amplicon was evaluated both theoretically, based on sequences in the RDP II database, and experimentally, for bacterial species expected in the infant gut. We found that all the currently known infant gut bacteria were amplified with this new, optimized primer pair. A primer pair that representatively amplifies the 16S rRNA gene from all the bacteria present in the sample is critical for proper analysis of the sample.
