Metagenomics for Gut Microbes

for their propagation. Extracellular viral particles are noninfectious. Viruses can infect a wide range of hosts, including plants, bacteria, fungi, algae, protozoa, and vertebrate and invertebrate animals. An estimated 1 × 10³¹ virus particles are present in nature, a number that itself suggests the diversity of viruses. They play important ecological roles, such as increasing host diversity through horizontal gene transfer and recycling nutrients [1]. Hooda et al. reported that viruses are around 1000 times more abundant in nature than cell culture-dependent techniques reveal [1, 2]. This suggests that a large pool of viruses is still unknown; only around 219 pathogenic viruses have been identified so far [2, 3].

**2. Role in pathogenesis**

**Human viruses**: More than 200 viruses are known to infect humans, and the number is increasing with time, but the diversity of viruses suggests that a huge number remain unknown. Yellow fever virus, discovered in 1901, was the first pathogenic human virus identified. The 1900s were the era of human virus discovery, and most of the common pathogenic viruses were studied during this time. Currently, two out of three infection-causing organisms are viruses [4]. They cause a variety of diseases, ranging from common acute infections such as the common cold, flu, and gastroenteritis to deadly diseases such as hantavirus pulmonary syndrome (hantavirus), AIDS (HIV), and Ebola virus disease (ebolavirus). Recent outbreaks show the emergence of previously known viruses with modified virulence properties.

**3. Human gut and viral infection**

Pathogenic viruses associated with the human gut have been known for decades to cause gastrointestinal diseases such as gastroenteritis. The main groups identified are rotavirus, adenovirus (serotypes 40 and 41), astrovirus, calicivirus, norovirus, torovirus, herpesviruses, coxsackieviruses, human papillomaviruses [3], Norwalk-like viruses, coronaviruses, picornaviruses, and Sapporo-like viruses [4, 5]. They infect the epithelial and mucosal linings of the stomach and small intestine, or a specific portion of the intestinal epithelium. Different samples are used to detect the infectious agent, depending on the type of infection observed. Feces are generally used for microbiological examination during gut-associated infection [6, 7]. Apart from feces, gastric biopsy, gastric juice, saliva [8, 9], duodenal fluid, and cotton swabs [5] are collected. These samples are essential for diagnosis because they directly contain the pathogen.

**4. Methods for diagnosis and virus identification**

#### **4.1. Traditional methods**

Since viruses are inert outside the cell, they must be propagated in a susceptible host or in host cells. Initially, viruses were propagated in embryonated eggs or laboratory animals. The discovery of the tissue culture technique in the 1900s provided an indispensable tool for in vitro virus culture, and tissue culture has since been recognized as a "gold standard" for virus discovery. Its major advantages for virus identification are amplification of the virus, characterization of the virus, functional studies, drug targeting, and genome extraction. Because of their reliable results and sensitivity, tissue culture-based techniques are still used for virus discovery as well as for studying immune responses, altered gene expression, and virus characteristics. Successful use of tissue culture in virus identification depends on crucial steps such as collecting the sample from a high-titer area of the body, transporting it immediately, processing it, and selecting an appropriate cell line [10]. The major drawbacks of the traditional method are the difficulty of identifying a susceptible cell line and its time-consuming and laborious nature [10]. Culture-based identification was later supplemented by new techniques and by modifications of existing ones. Shell vials with centrifugation, pre-CPE (cytopathic effect) staining, immune-based techniques (e.g., ELISA, agglutination, precipitation, and flocculation), and microscopy-based techniques reduced the time needed for virus identification, but at the cost of sensitivity.

#### **4.2. Molecular methods**

Gradually, the field of virology shifted its practice toward molecular biology methods. Traditional culture-based methods and molecular biology techniques are now used hand in hand for studying virus-associated samples [11]. Broadly, molecular biology methods are of two types: sequence dependent and sequence independent. Both have proven useful, and many viruses have been identified with them.


Another sequence-independent approach is sequence-independent single-primer amplification (SISPA), which detects unknown viral sequences by ligating a linker oligonucleotide to them [23]. It can also be used for molecular cloning of the viral genome for subsequent characterization. This method has been used successfully to discover hepatitis E virus [10, 24], parvoviruses 2 and 3 [24], and Norwalk virus [11]. Because viruses lack consensus sequences, culture-based traditional methods and both sequence-dependent and sequence-independent molecular techniques are generally useful only for studying limited samples, with limited output. For this reason, most viruses remain unidentified.

| Year of study | Sample type | Method of sequencing | Virus detected | Reference |
|---|---|---|---|---|
| | Sea water | Sanger | | [12] |
| | Feces | Sanger | | [16] |
| | Marine sediments | Sanger | | [17] |
| | Blood | Sanger | Novel anellovirus | [25] |
| | Plasma | SISPA | Novel parvoviruses | [18] |
| | Nasopharyngeal aspirates | Sanger | Novel bocavirus | [26] |
| | Seawater | Sanger | Novel RNA viruses | [27] |
| | Feces | Sanger | Plant RNA viruses | [28] |
| | Honey bees | 454 NGS | Israeli acute paralysis virus | [29] |
| 2007 | Feces, urine, blood | Rolling circle amplification (RCA) technique | Novel polyomavirus | [30] |
| 2007 | Soil | Sanger | Soil metagenomics overview | [31] |
| 2007 | Virioplankton | Sanger | Virioplankton metagenome | [32] |
| 2008 | Feces | Sanger | Study of diversity of viruses in growing infants | [33] |
| 2008 | Feces | Sanger | Novel picobirnavirus, picornavirus, norovirus and anellovirus | [34] |
| 2008 | Turkey feces | 454 NGS | Novel bornavirus | [35] |
| 2008 | Hotspring water | Sanger | Novel viruses in hot springs | [36] |
| 2008 | Bush kuru rat | 454 NGS | Novel arenavirus | [37] |
| 2008 | Insect pool, skunk brain, human feces, sewer effluent | 454 NGS | Orthoreovirus and orbivirus | [38] |
| | | SISPA | Novel paralysis virus | [39] |
| | Plasma, liver biopsy | 454 NGS | Novel LUJO virus | [40] |
| | Grapevine | 454 NGS | Novel marafivirus | [41] |
| | Plant | 454 NGS | Novel cucumovirus | [41] |
| | Potable, reclaimed water | 454 NGS | Several animal and plant viruses | [42] |
| 2009 | Sea lion lungs | Sanger | Novel California sea lion anellovirus | [43] |
| 2009 | Sea turtle swabs/tissues | 454 NGS | Novel sea turtle fibropapilloma virus | [43] |
| 2009 | Plant | Sanger | Sweet potato badnavirus and mastrevirus | [46] |
| 2009 | Ant | Sanger | *Solenopsis invicta* virus | [44] |
| 2009 | Feces | 454 NGS | Klassevirus | [45] |
| 2010 | Feces | 454 NGS | Novel chimpanzee-associated circular virus | [47] |
| 2010 | Brain | 454 NGS | Astrovirus | [45] |
| 2010 | Mosquitoes | 454 NGS | Novel mycovirus | [48] |

Potential Applications and Challenges of Metagenomics in Human Viral Infections

http://dx.doi.org/10.5772/intechopen.75023

Compared with the techniques above, metagenomics is a less biased approach. It can quickly detect any type of virus, whether its genome is RNA or DNA and whether it is cultivable, uncultivable, or novel. In the word metagenomics, "meta" denotes "transcendent" and "ome" denotes "all" or "every"; collectively, the term means all genomic content. Metagenomics is the study of genetic material taken directly from an environmental sample, using advanced genomic research techniques and computational tools. The approach bypasses the classical biochemical laboratory techniques of microbial analysis and makes it possible to investigate all types of genomic content from a variety of organisms. It has provided an indispensable tool for identifying nonculturable microbial species, and it is also used to investigate known and culturable organisms with great accuracy. Because individual species need not be isolated and cultured manually, it reduces the time required for a study while providing more information. Initial metagenomic analysis of raw environmental samples provides a necessary foundation for further lab-based analysis (**Table 1**). Metagenomics has been used for a variety of purposes in diverse areas since 2002, when the approach was first applied in virology [12, 52].

#### **4.3. Process of metagenomics**

Metagenomics is a successful tool for surveillance of different environments such as freshwater, soil, marine water, and the gut of different organisms (**Table 1**). Recent advances in sequencing technology have improved the speed of novel virus discovery and environmental surveillance [13, 53]. The growth in the 2000s of literature on metagenomics in virome studies, and of the number of virus databases, shows how accessible the process has become. Recently, government organizations have taken an active part in conducting surveillance programs [14, 15, 54, 55].

Basically, there are three main steps involved in metagenomics analysis of a sample, as follows:

**1.** Sample preparation

**2.** Sequencing

**3.** Bioinformatics analysis



**1. Sample preparation and processing:** In metagenomics, any type of sample can be analyzed after some pretreatment (or enrichment). For analysis of the gut-associated virome, different samples are collected from different parts of the human gastrointestinal region. Sample collection, proper handling, transportation, and storage are all crucial for accurate results. Many standard protocols are available for collecting different samples, bringing them to the laboratory, and storing them [37]. Different protocols are used for fluid samples and for tissue samples. A tissue sample is generally homogenized in autoclaved saline, and the collected supernatant is filtered through 0.8, 0.45, and 0.2 μm filters. This serial filtration separates larger particles and bacteria from viruses. See **Figure 1**.

Different sample processing methods have been used for extraction of viral genomic material [16, 56–58]. Building on studies by many groups [56, 58–60], Shah et al. designed a framework in 2014 that compared three widely used sample processing methods for the gut-associated RNA virome. The second processing method, used for separation of virus particles and DNA preparation, gave good results. In that method, the PEG treatment and ultracentrifugation steps are separated by a sonication step in PBS buffer to remove remnants of PEG. Analysis with bioinformatics tools such as riboPicker and BLAST searches of viral RNA sequences showed more viral domains in samples processed by the second method, whereas the other methods showed more cellular noise [19].

**2. Sequencing:** Metagenomic studies progressed slowly until around 2005, when Sanger sequencing was in use and other methods had yet to evolve. Many studies in this period nevertheless revealed abundant viral diversity, including in human clinical samples. With pyrosequencing, the speed of viral genome sequencing increased several-fold, and new viral communities of humans and animals were identified. Important discoveries include astrovirus [21], rhabdovirus [22], coronavirus [23], picornavirus [24], and gammapapillomavirus [61]. Pyrosequencing became popular in a short time because of its low cost and high number of reads. It has also been used to sequence clinical samples from tissue fluids and tissues [11].


| Year of study | Sample type | Method of sequencing | Virus detected | Reference |
|---|---|---|---|---|
| 2011 | Plasma | 454 NGS | Novel simian hemorrhagic fever virus | [49] |
| 2011 | Feces | 454 NGS | Many novel species in pig: astrovirus, bocavirus | [50] |
| 2011 | Liver, pancreas, intestine biopsy | 454 NGS | Novel turkey hepatitis virus | [51] |

**Table 1.** Viruses discovered with metagenomics approach.

[Figure 1, flowchart: Sample preparation (homogenize the sample; centrifuge to remove debris; filter sequentially through 0.8, 0.45, and 0.22 μm filters; PEG/ultracentrifugation treatment; extraction of nucleic acids from viral particles; amplification of nucleic acids if needed) → High-throughput sequencing → Data analysis]

**Figure 1.** Overview of general procedure of metagenomics.

**Ion Torrent:** This is a pH-based sequencing method, a few steps of which are similar to pyrosequencing. Ion Torrent gives very rapid runs, so it has been useful for targeted detection of viral sequences, such as HIV, HCV, polyomavirus, and influenza virus, in clinical samples. Because of its low output, it was not a good choice for identifying new viruses.

**Illumina:** This is a high-throughput platform with a low cost per virus identification; many viruses from clinical samples have been identified with it.

**Pacific Biosciences sequencing and nanopore sequencing**: These sequencing methods were not popular for metagenomic studies because of their high error rates [52].

**3. Bioinformatics analysis:** Bioinformatics analysis of the raw sequence data generated by a high-throughput sequencer is a critical step in novel virus discovery and even in diagnostics. Many ready-to-use pipelines are available for analyzing raw data. VIP, VirFinder, Vipie, METAVIR, PHACCS, VIROME, HP Viewer, FastViromeExplorer, EzMAP, Vanator, viruspy, and Viral\_genome\_annotator are a few commonly used pipelines for viral metagenomics. A typical workflow includes the following steps (see **Figure 2**). First, the next-generation sequencing (NGS) data are trimmed to remove low-quality sequences and adaptor sequences. Second, host-related (human or bacterial) sequences are removed from the trimmed data. Third, the remaining sequences are aligned to reference viral genomes for further characterization, such as novel virus identification, viral taxonomy, identification of viral proteins, and phylogenetic analysis.
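The three workflow steps described above (trimming, host subtraction, comparison with viral references) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the method of any pipeline named in the text: exact k-mer sharing stands in for a real aligner, and the function names, sequences, adapter, and thresholds are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Read = Tuple[str, List[int]]  # (bases, per-base Phred quality scores)


def trim_read(seq: str, quals: List[int], adapter: str, min_q: int = 20) -> str:
    """Step 1: cut at the adapter if present, then trim low-quality 3' bases."""
    cut = seq.find(adapter)
    if cut != -1:
        seq, quals = seq[:cut], quals[:cut]
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]


def kmers(seq: str, k: int) -> Set[str]:
    """All overlapping substrings of length k (empty set if seq is shorter)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_fraction(read: str, ref_kmers: Set[str], k: int) -> float:
    """Fraction of the read's k-mers found in a reference (toy alignment)."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0.0
    return len(read_kmers & ref_kmers) / len(read_kmers)


def classify_reads(reads: List[Read], host: str, viral_refs: Dict[str, str],
                   adapter: str, k: int = 8, min_len: int = 12,
                   threshold: float = 0.5) -> Dict[str, int]:
    """Trim each read, subtract host reads, then assign the rest to a virus."""
    host_kmers = kmers(host, k)
    viral_kmers = {name: kmers(seq, k) for name, seq in viral_refs.items()}
    counts = {"discarded": 0, "host": 0, "unassigned": 0}
    counts.update({name: 0 for name in viral_refs})
    for seq, quals in reads:
        trimmed = trim_read(seq, quals, adapter)
        if len(trimmed) < min_len:                       # too short after QC
            counts["discarded"] += 1
        elif shared_fraction(trimmed, host_kmers, k) >= threshold:
            counts["host"] += 1                          # Step 2: host subtraction
        else:                                            # Step 3: viral references
            best_name, best_frac = "unassigned", 0.0
            for name, ref in viral_kmers.items():
                frac = shared_fraction(trimmed, ref, k)
                if frac > best_frac:
                    best_name, best_frac = name, frac
            label = best_name if best_frac >= threshold else "unassigned"
            counts[label] += 1
    return counts


# Hypothetical toy data: made-up sequences, not real genomes.
ADAPTER = "AGATCGGAAG"
HOST = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
VIRAL_REFS = {"virus_A": "TTGACGCATCGATTACGGCTAAGCTTCCGGAATCTAGCA"}
READS: List[Read] = [
    (VIRAL_REFS["virus_A"][5:25] + ADAPTER, [30] * 30),   # viral read with adapter
    (HOST[3:23], [30] * 20),                              # host read
    (VIRAL_REFS["virus_A"][:20], [30] * 6 + [5] * 14),    # mostly low quality
    ("CCCCCCAAAAAATTTTTTGG", [30] * 20),                  # matches no reference
]

if __name__ == "__main__":
    print(classify_reads(READS, HOST, VIRAL_REFS, ADAPTER))
```

Reads that match neither the host nor any viral reference are kept as "unassigned" rather than dropped, since in a discovery setting such reads are the candidates for novel viruses. Real workflows would replace the k-mer comparison with a dedicated trimmer and aligner.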

**Challenges involved in metagenomics:** For analysis of sequencing data of viral genome through high throughput, sequencing machine needs standard computational tools, software with a high accuracy of data analysis. This needs high-cost involvement with technical expertise. Few high-quality tools available for sequence data analysis such as Diamond [53], UBLAST [52] and Kaiju [54] have increased the speed of metagenomics study. Still, there is a need for technical improvement for rapid and accurate data analysis. The second challenge involved in data analysis of metagenomics sequencing is an assembly of the genome from thousands of small fragments. Assemblers used for the assembly of single genome sets during early times of sequencing study are outdated or non-useful for

the help of surveillance pyramid. The surveillance pyramid explains during disease spread in the community only a few diagnosed cases are reported, the individuals carrying symptoms of the disease and the carriers of the disease are not reported. This phenomenon creates biasedness in sampling. Therefore metagenomics study has been proved a useful tool for constant surveillance of gastrointestinal tract pathogenic virome community. As well as some endemic viral diseases, which causes common gastrointestinal health concerns in community, e.g., astrovirus, calicivirus, norovirus, and torovirus [64], herpesviruses, hepatitis E virus, epstein bar

Potential Applications and Challenges of Metagenomics in Human Viral Infections

http://dx.doi.org/10.5772/intechopen.75023

27

virus, coxsackieviruses, and surveillance with the metagenomics study is useful.

analogy makes it in the suspect list of emerging viruses [49].

**3. Diagnostic Metagenomics** is a potent method that allows broad analysis of relative genetic variation among viruses and can be used for the study of host-pathogen interactions. This is also more popular because it can be used for uncultivable organisms as well. The recently rising approach is to use metagenomics during epidemics and outbreaks, with a given large number of samples in a lesser time. In *hepatitis C virus* (HCV) infection, identification of infection is a challenging task due to lack of apparent symptoms and lack of easy laboratory tests for differentiation of acute and chronic phase of the disease. Available

**2. Discovery of new viruses and classification**: Metagenomics is a powerful tool for identification of novel organism(s). Screening of different gut samples can be useful to study novel gut-associated viruses. Initially with the sequence-based studies of Markel cell carcinoma new *human papillomavirus* has been identified. Markel cell carcinoma is human skin tissue carcinoma, where virus DNA found to be integrated into tumor tissue [65]. Subsequent studies have revealed the diversity of gut-associated viruses in different animals which help in the study of past zoonotic occurred in history. Human-rodent's interaction is well known due to civilization in forest areas or due to the domestication of animals this is leading cause of zoonotic outbreaks. Knowledge of outbreaks in past and monitoring of the present status of the spread of known pathogenic viruses and closely associated pathogenic human viruses provides a base to predict future outbreaks. This approach is also useful to limit the epidemiology of recurrent outbreaks with the study of disease-prone viruses and characterization of unknown viruses. Phan et al. in 2011 extensively studied fecal sample from wild rodents in Virginia and they characterized viruses belonging to mammalian virus families, many new viral families, two new genera were identified. Two viruses closely related to *Aichivirus*, an associated with acute gastroenteritis worldwide, were characterized through the study [66]. Turkey meat is very popular in the USA and its production is an important part of US economy. One study conducted in California in March 2011on turkey which was suffering from turkey viral hepatitis. Pyrosequencing of RNA, extracted from liver revealed the presence of novel picornaviruses named as *turkey hepatitis virus* [51]. Another study on cattle's suffering from the unknown disease in Germany and Netherlands affected milk production. 
Metagenomics study discovered the new virus, *Schmallenberg virus*, from infected cow sample [67]. Identification and characterization of such viruses will help in facing problems which have a negative impact on countries economic status. Similar to domestic animals, wild-type animals can also act as a reservoir of novel pathogens. Two novel simian hemorrhagic fever viruses diverse from original simian hemorrhagic fever virus were identified from African green monkeys. *Simian hemorrhagic fever virus* has not yet found to infect human but clinical indices comparable with human *Ebola* and *Marburg viruses*. This

**Figure 2.** Workflow of metagenomics data analysis.

metagenomics; they create chimeric genomes which misinterpret the genome sequence. Now a days for such studies MetAMOS [55], Meta Velvet [62], MetaSPADes [57] assemblers are available. Still assembly process requires manual editing to sort out genomic chimera generation [15]. Another challenge of virologists for data analysis is reference database deposited which sometimes may cause confusion or problems. If reference database is misinterpreted it will give a wrong interpretation of results. If reference database is high, it decreases the speed as a large number of sequence alignments are required to test data. Sequence data interpretation is a last and very decisive step for metagenomics. Still, we lack clear knowledge about the link between the diversity of virus in the environment and during outbreaks, our surveillance is merely based on a biased collection of only clinical samples and their study. This limits our knowledge about disease spread [63]. Prediction of future outbreaks and limiting the spread of disease needs proper study, development of strong tools [15] Therefore further extensive studies should be encouraged for obtaining maximum and precise knowledge of environmental and gut-associated virome.

#### **4.4. Applications in gut-associated virome analysis**

1. **Epidemic and endemic surveillance**: Several reports of unknown pathogenic virus outbreaks in history suggest the need for comprehensive study of virus-host interaction during disease and disease-causing viruses is a big threat to the human population. Well, known examples of zoonotic virus transmission are Nipah virus from fruit bats [58] and Ebola virus from bushmeat [60]. This creates a need for continuous surveillance of diseases in the community. David et al. in 2017 [15] gave a comprehensive explanation about disease outbreak and its diagnosis with the help of surveillance pyramid. The surveillance pyramid explains during disease spread in the community only a few diagnosed cases are reported, the individuals carrying symptoms of the disease and the carriers of the disease are not reported. This phenomenon creates biasedness in sampling. Therefore metagenomics study has been proved a useful tool for constant surveillance of gastrointestinal tract pathogenic virome community. As well as some endemic viral diseases, which causes common gastrointestinal health concerns in community, e.g., astrovirus, calicivirus, norovirus, and torovirus [64], herpesviruses, hepatitis E virus, epstein bar virus, coxsackieviruses, and surveillance with the metagenomics study is useful.

**2. Discovery of new viruses and classification**: Metagenomics is a powerful tool for identification of novel organism(s). Screening of different gut samples can be useful to study novel gut-associated viruses. Initially with the sequence-based studies of Markel cell carcinoma new *human papillomavirus* has been identified. Markel cell carcinoma is human skin tissue carcinoma, where virus DNA found to be integrated into tumor tissue [65]. Subsequent studies have revealed the diversity of gut-associated viruses in different animals which help in the study of past zoonotic occurred in history. Human-rodent's interaction is well known due to civilization in forest areas or due to the domestication of animals this is leading cause of zoonotic outbreaks. Knowledge of outbreaks in past and monitoring of the present status of the spread of known pathogenic viruses and closely associated pathogenic human viruses provides a base to predict future outbreaks. This approach is also useful to limit the epidemiology of recurrent outbreaks with the study of disease-prone viruses and characterization of unknown viruses. Phan et al. in 2011 extensively studied fecal sample from wild rodents in Virginia and they characterized viruses belonging to mammalian virus families, many new viral families, two new genera were identified. Two viruses closely related to *Aichivirus*, an associated with acute gastroenteritis worldwide, were characterized through the study [66].

metagenomics; they create chimeric genomes, which lead to misinterpretation of the genome sequence. Nowadays assemblers such as MetAMOS [55], MetaVelvet [62], and metaSPAdes [57] are available for such studies. Still, the assembly process requires manual editing to sort out chimeric genomes [15]. Another challenge for virologists during data analysis is the reference database, whose deposited entries may sometimes cause confusion or problems. If entries in the reference database are misannotated or misinterpreted, the results will be interpreted wrongly. If the reference database is large, the analysis slows down because a large number of sequence alignments are required to test the data. Sequence data interpretation is the last and most decisive step in metagenomics. We still lack clear knowledge about the link between the diversity of viruses in the environment and during outbreaks; our surveillance is based merely on a biased collection of clinical samples and their study. This limits our knowledge about disease spread [63]. Prediction of future outbreaks and limiting the spread of disease need proper study and the development of strong tools [15]. Therefore, further extensive studies should be encouraged for obtaining maximum and precise knowledge of the environmental and gut-associated virome.

Figure 2 panel labels: Raw NGS data generated; Quality control (trimming); Subtraction of host-related genes from trimmed raw data; Alignment of nucleotides with reference, e.g., RefSeq virus (ViPR/IRD); Virus discovery; Phylogenetic analysis.

#### **4.4. Applications in gut-associated virome analysis**

**Figure 2.** Workflow of metagenomics data analysis, including a taxonomy identification step.
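The steps named in Figure 2 (quality control, subtraction of host-related reads, alignment against a viral reference, taxonomy identification) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sequences are invented, the function names are hypothetical, and exact k-mer sharing stands in for the real alignment step, for which production pipelines use dedicated trimmers and aligners.

```python
from collections import Counter

def trim_read(seq, quals, min_q=20):
    """Quality control: cut the read at the first base with Phred score below min_q."""
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i]
    return seq

def kmers(seq, k):
    """Return the set of all substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_fraction(read_kmers, ref_kmers):
    """Fraction of the read's k-mers that also occur in the reference."""
    return len(read_kmers & ref_kmers) / len(read_kmers)

def classify_reads(reads, host_ref, viral_refs, k=8, min_len=12, min_shared=0.5):
    """Trim reads, subtract host-like reads, assign the rest to a viral reference."""
    host_kmers = kmers(host_ref, k)
    viral_kmers = {name: kmers(seq, k) for name, seq in viral_refs.items()}
    counts = Counter()
    for seq, quals in reads:
        seq = trim_read(seq, quals)
        if len(seq) < max(min_len, k):        # too short after trimming
            counts["discarded"] += 1
            continue
        read_kmers = kmers(seq, k)
        if shared_fraction(read_kmers, host_kmers) >= min_shared:
            counts["host"] += 1               # host subtraction
            continue
        best_name, best_frac = "unclassified", 0.0
        for name, ref in viral_kmers.items():
            frac = shared_fraction(read_kmers, ref)
            if frac >= min_shared and frac > best_frac:
                best_name, best_frac = name, frac
        counts[best_name] += 1                # taxonomy identification
    return counts

# Invented toy sequences (not real genomes).
HOST = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
VIRAL = {
    "virus_A": "TTGACCGATCGATTACGGCTAAGCTTAGGCATCGATCCA",
    "virus_B": "CCGTAGCTAGGATCCTTAAGGCGCGCTATATAGCGCATT",
}
reads = [
    (VIRAL["virus_A"][:20], [30] * 20),            # clean viral read
    (VIRAL["virus_A"][:20], [30] * 15 + [5] * 5),  # low-quality tail is trimmed
    (VIRAL["virus_B"][9:29], [30] * 20),
    (HOST[:20], [30] * 20),                        # removed as host
    (VIRAL["virus_A"][:20], [30] * 5 + [5] * 15),  # too short after trimming
    ("ACACACACACACACACACAC", [30] * 20),           # matches no reference
]
counts = classify_reads(reads, HOST, VIRAL)
print(dict(counts))
```

Reads left as "unclassified" are the interesting fraction for virus discovery, since they match neither the host nor any known reference; in practice they would go on to assembly and phylogenetic analysis.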

Turkey meat is very popular in the USA, and its production is an important part of the US economy. One study, conducted in California in March 2011, examined turkeys suffering from turkey viral hepatitis. Pyrosequencing of RNA extracted from liver revealed the presence of a novel picornavirus, named *turkey hepatitis virus* [51]. In another case, an unknown disease in cattle in Germany and the Netherlands affected milk production. A metagenomics study discovered a new virus, *Schmallenberg virus*, in a sample from an infected cow [67]. Identification and characterization of such viruses will help in tackling problems that have a negative impact on a country's economy. Like domestic animals, wild animals can also act as reservoirs of novel pathogens. Two novel simian hemorrhagic fever viruses, divergent from the original simian hemorrhagic fever virus, were identified in African green monkeys. *Simian hemorrhagic fever virus* has not yet been found to infect humans, but its clinical signs are comparable with those of human *Ebola* and *Marburg viruses*. This similarity places it on the list of suspected emerging viruses [49].

**3. Diagnostic metagenomics**: Metagenomics is a potent method that allows broad analysis of relative genetic variation among viruses and can be used for the study of host-pathogen interactions. It is also popular because it can be applied to uncultivable organisms. A recently rising approach is to use metagenomics during epidemics and outbreaks, when a large number of samples must be analyzed in a short time. In *hepatitis C virus* (HCV) infection, diagnosis is challenging because of the lack of apparent symptoms and the lack of easy laboratory tests that differentiate the acute and chronic phases of the disease. Available molecular methods for virus diagnosis are tedious, time-consuming, and costly. A recent report from Escobar-Gutierrez et al. described the use of next-generation sequencing (NGS) in the diagnosis of HCV infection. NGS allows cost-effective, detailed analysis of a large number of samples. The study revealed low-frequency mutations and genetic variation [68]. Genetic shift and reassortment are a leading cause of the emergence of new virus strains, especially in RNA viruses. A well-known example is influenza virus, which has caused many pandemics and deaths in history. The 2009 H1N1 virus is a combination of swine, human, and avian genomic RNA segments [69]. A good example of the approach during the 2009 H1N1 pandemic is the use of metagenomics for characterization and detailed study of the virus, followed by the manufacture of a microarray-based Virochip for rapid detection and differential screening from seasonal virus [70].
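To make the idea of detecting low-frequency mutations from deep sequencing concrete, the following Python sketch counts bases per reference position and reports non-reference bases above a frequency threshold. It is only an illustration under stated assumptions: the reads are taken to be already aligned without gaps, the reference and read counts are invented, and the function name and thresholds (`min_freq`, `min_depth`) are hypothetical rather than taken from the cited study. Real variant callers also model sequencing error, which this sketch does not.

```python
from collections import Counter

def minor_variants(aligned_reads, reference, min_freq=0.01, min_depth=100):
    """Report non-reference bases whose frequency at a position is >= min_freq.

    aligned_reads: list of (start, sequence) pairs, gap-free alignments,
                   with start as a 0-based position on the reference.
    Returns a list of (position, reference_base, variant_base, frequency).
    """
    columns = [Counter() for _ in reference]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                columns[pos][base] += 1

    variants = []
    for pos, column in enumerate(columns):
        depth = sum(column.values())
        if depth < min_depth:                 # too little coverage to judge
            continue
        for base, n in sorted(column.items()):
            freq = n / depth
            if base != reference[pos] and freq >= min_freq:
                variants.append((pos, reference[pos], base, round(freq, 4)))
    return variants

# Invented example: 1000 reads covering a 10-base reference.
reference = "ACGTACGTAC"
reads = (
    [(0, "ACGTACGTAC")] * 968    # reads identical to the reference
    + [(0, "ACGTGCGTAC")] * 30   # A->G at position 4 in 3% of reads
    + [(0, "ACGTACGAAC")] * 2    # T->A at position 7 in 0.2% of reads
)
variants = minor_variants(reads, reference)
print(variants)
```

With the 1% threshold assumed here, the 3% variant is reported while the 0.2% change is treated as indistinguishable from noise; where to set that threshold depends on the sequencing platform and depth.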

**Author details**

Prudhvi Lal Bhukya<sup>1</sup>\* and Renuka Nawadkar<sup>2</sup>

\*Address all correspondence to: saiprudhvi21@gmail.com

1 Department of Biotechnology, Krishna University, Machilipatnam, Andhra Pradesh, India

2 Yashwantrao Chavan College of Science, Karad, India

Potential Applications and Challenges of Metagenomics in Human Viral Infections

http://dx.doi.org/10.5772/intechopen.75023

**References**

[1] Hobbie JE, Daley RJ, Jasper S. Use of nuclepore filters for counting bacteria by fluorescence microscopy. Applied and Environmental Microbiology. 1977;**33**:1225-1228

[2] Woolhouse M, Scott F, Hudson Z, Howey R, Chase-Topping M. Human viruses: Discovery and emergence. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2012;**367**:2864-2871

[3] Salim AF, Phillips AD, Farthing MJ. Pathogenesis of gut virus infection. Baillière's Clinical Gastroenterology. 1990;**4**:593-607

[4] Brogden KA, Guthmiller JM. Polymicrobial Diseases. Washington, D.C.: ASM Press; 2002

[5] Rene E, Verdon R. Upper gastrointestinal tract infections in AIDS. AIDS GIT group. Baillière's Clinical Gastroenterology. 1990;**4**:339-359

[6] Solonenko SA, Sullivan MB. Preparation of metagenomic libraries from naturally occurring marine viruses. Methods in Enzymology. 2013;**531**:143-165

[7] Draghici S, Khatri P, Eklund AC, Szallasi Z. Reliability and reproducibility issues in DNA microarray measurements. Trends in Genetics. 2006;**22**:101-109

[8] Balsalobre-Arenas L, Alarcon-Cavero T. Rapid diagnosis of gastrointestinal tract infections due to parasites, viruses, and bacteria. Enfermedades Infecciosas y Microbiología Clínica. 2017;**35**:367-376

[9] Chang Y, Cesarman E, Pessin MS, Lee F, Culpepper J, Knowles DM, et al. Identification of herpesvirus-like DNA sequences in AIDS-associated Kaposi's sarcoma. Science. 1994;**266**:1865-1869

[10] Reyes A, Haynes M, Hanson N, Angly FE, Heath AC, Rohwer F, et al. Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature. 2010;**466**:334-338

[11] Quan PL, Firth C, Conte JM, Williams SH, Zambrana-Torrelio CM, Anthony SJ, et al. Bats are a major natural reservoir for hepaciviruses and pegiviruses. Proceedings of the National Academy of Sciences of the United States of America. 2013;**110**:8194-8199

[12] Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, et al. Genomic analysis of uncultured marine viral communities. Proceedings of the National Academy of Sciences of the United States of America. 2002;**99**:14250-14255

**4. Evolution of host-virus interaction**: The evolution of RNA viruses is comparatively faster than that of DNA viruses. Studying this evolution is necessary to understand the source of new variants and their spread, and to keep a check on epidemic-initiating variants. For *norovirus*, an emerging RNA virus and a causative agent of gastroenteritis, inter-host and intra-host evolution and the transmission of new variants have been studied. Norovirus infection is usually a self-limiting acute disease, but in immunocompromised individuals and in newborns it may cause morbidity and mortality. No vaccine or drugs are available for treatment. Based on a metagenomics study, Bull et al. hypothesized that *norovirus* has multiple mechanisms of evolution. Chronic hosts are a major reservoir of new variants, while acute patients generally carry a single variant. The NGS approach assists in the comprehensive study of viral population dynamics [71]. The genus *Cardiovirus* was originally believed to comprise two species; a metagenomics study has revealed five new genotypes with full characterization. Cardioviruses are the causative agents of enteric disease in mice, with multiple symptoms. In humans, they cause an encephalitis-like condition and diarrhea in children [72]. Metagenomics-based studies help in designing future approaches to these new genotypes and their associated diseases.
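The contrast between a chronic host carrying many variants and an acute patient carrying a single variant can be put into numbers with a diversity index computed from per-variant read counts. The sketch below uses Shannon entropy for this purpose. This is a common choice, but it is an assumption here and not necessarily the measure used in the cited work; the read counts are invented and the function name is hypothetical.

```python
import math

def shannon_diversity(variant_counts):
    """Shannon entropy (natural log) of intra-host variant frequencies.

    variant_counts: dict mapping a variant label to its supporting read count.
    Returns 0.0 for a single variant; larger values mean a more mixed population.
    """
    total = sum(variant_counts.values())
    entropy = 0.0
    for n in variant_counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log(p)
    return entropy

# Hypothetical read counts per variant in two patients.
acute_patient = {"variant_1": 1000}
chronic_patient = {"variant_1": 250, "variant_2": 250,
                   "variant_3": 250, "variant_4": 250}

acute_h = shannon_diversity(acute_patient)
chronic_h = shannon_diversity(chronic_patient)
print(f"acute: {acute_h:.3f}, chronic: {chronic_h:.3f}")
```

Tracking such an index across samples taken over time from one patient is one simple way to follow viral population dynamics within a host.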

#### **5. Conclusion**

Metagenomics studies have huge potential to describe the diversity of the gut microbiome and, most importantly, of microbes directly in infectious samples. Among all pathogens, viruses are the ones that cause severe illness in humans. With rapid improvement in genomic sequencing techniques, the metagenomics approach is very valuable for the discovery of new viruses, novel genes, and new pathways, for the surveillance of pathogens, and for the study of host-virus interactions and functional studies. The leads obtained through this work may have a great impact on early diagnosis and treatment. Metagenomic studies also face limitations and challenges, which need to be overcome in the near future to obtain precise results. Unified genome extraction techniques and improved analysis modules may meet the needs of metagenomics in the future.

#### **Conflict of interest**

The authors declare no conflict of interest.
