**1. Introduction**

A number of viral pneumonia cases of uncertain origin arose in late December 2019 in Wuhan, Hubei Province, China. These cases gained public attention and were considered an emergency, and the World Health Organization (WHO) declared the outbreak a Public Health Emergency of International Concern [1]. Thereafter, multidisciplinary task forces organized by the National Health Commission of the People's Republic of China undertook collective efforts to identify the causative agent. Early on, a group of researchers from the Chinese Academy of Medical Sciences announced their study identifying the causative agent. They conducted a metagenomic study of respiratory tract specimens collected from five patients with pneumonia. After they had isolated the virus and sequenced its genome, the results revealed that it is a betacoronavirus [1, 2].

In more detail, their genomic analysis showed that the specimens have approximately 79 percent similarity to the Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) genome, approximately 52 percent similarity to the Middle East Respiratory Syndrome Coronavirus (MERS-CoV) genome, and approximately 87 percent similarity to two bat-derived SARS-like coronavirus genomes (Zhoushan, 2015) [2]. Consistent findings were published by a team from the Chinese Center for Disease Control and Prevention [3]. On this evidence, the isolated virus was proposed to be a novel coronavirus. It was later dubbed the novel coronavirus 2019 (nCoV 2019) and was quickly recognized by the WHO as the pathogen responsible for this transmissible illness [1].

#### **2. Sequencing**

The order of nucleic acids in polynucleotide chains carries the information for the hereditary and biochemical characteristics of life. Determining this order is therefore a crucial part of a wide range of research applications. A great many researchers have spent the last fifty years developing technologies to simplify the sequencing of DNA and RNA molecules. Over this period, significant changes have taken place: from sequencing short oligonucleotides to very long ones, and from struggling to deduce the coding sequence of a single gene to fast and widely available whole-genome sequencing [1]. In this section we go over the successive generations of nucleic acid sequencing, indicating the major discoveries, the contributions of individual researchers and the main characteristics of first-, second- and third-generation sequencing technology.

#### **2.1 First-generation nucleic acids sequencing**

James D. Watson and Francis H.C. Crick discovered the three-dimensional structure of DNA in 1953. Working from crystallographic data produced by Rosalind Franklin and Maurice Wilkins, they contributed to the conceptual framework for both DNA replication and the encoding of proteins in nucleic acids. However, the ability to read the sequence itself had not yet been achieved [1, 2]. The initial efforts concentrated on sequencing RNA molecules, which were less complicated to work with than DNA molecules.

Frederick Sanger said "knowledge of sequences could contribute much to our understanding of living matter." In collaboration with other scientists, Sanger created a new technique based on the detection of radiolabeled partial-digestion fragments after two-dimensional fractionation (1965) [3], which allowed scientists to keep adding to the growing pool of ribosomal and transfer RNA sequences [4–8]. In the same year, Robert Holley and colleagues generated, for the first time, the whole nucleic acid sequence of alanine transfer RNA isolated from *Saccharomyces cerevisiae* [9]. Using these techniques, Walter Fiers and colleagues produced, between 1972 and 1976, the first complete protein-coding gene sequence and then the whole genome of bacteriophage MS2 [10, 11]. In the mid-1970s, much greater resolution was achieved by replacing the two-step, two-dimensional fractionation with a single separation by electrophoresis through a polyacrylamide gel, a change considered the birth of the first generation [3, 12, 13]. Using this procedure, in 1977 Sanger and colleagues sequenced for the first time the genome of bacteriophage φX174 (PhiX), which went on to become a positive control in genome sequencing laboratories [14]. In the same year, the establishment of Sanger's "chain-termination" or dideoxy method brought a huge advance in DNA sequencing technology [15]. Subsequently, tremendous efforts were made to automate DNA sequencing, leading to the first commercially available machines, which were used to sequence the genomes of increasingly complex species [1, 16–22].

#### *Whole Genome Sequencing: A Powerful Tool for Understanding the Diversity of Genotypes… DOI: http://dx.doi.org/10.5772/intechopen.96260*

One of the major drawbacks of first-generation sequencing machines is that their reads are short, below one kilobase (kb). Scientists worked around this limitation with techniques such as shotgun sequencing, in which overlapping fragments of DNA are cloned and sequenced individually and then assembled into one long sequence [23, 24].
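The assembly step of shotgun sequencing can be illustrated with a minimal sketch. It assumes error-free reads and uses a simple greedy strategy that repeatedly merges the two reads with the longest suffix-prefix overlap. The function names, the `min_len` threshold and the toy reads are illustrative assumptions, not taken from the cited work, and real assemblers are considerably more sophisticated.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Return the length of the longest suffix of `a` that equals a prefix of `b`.

    Overlaps shorter than `min_len` (an assumed threshold) are ignored.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Greedily merge the pair of reads with the longest overlap until none remain.

    Returns the list of resulting contigs (a single contig if all reads join up).
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: the reads stay as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Three overlapping toy reads (hypothetical data) reassemble into one sequence.
contigs = greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"])
print(contigs)  # ['ATGGCGTACCTGA']
```

The greedy choice keeps the sketch short; with repeats or sequencing errors it can merge the wrong reads, which is one reason practical assemblers rely on graph-based methods and redundant coverage.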

Many further improvements were made to first-generation sequencing, culminating in new dideoxy sequencers such as the ABI PRISM, developed by Applied Biosystems from Leroy Hood's research. These sequencers allowed hundreds of samples to be sequenced concurrently and were used to generate the first draft of the Human Genome Project, which was completed years ahead of schedule [25–28].

#### **2.2 Second-generation nucleic acids sequencing**

The luminescent method for measuring pyrophosphate synthesis was the starting point for the second generation of DNA sequencers. It is a two-enzyme reaction in which ATP sulphurylase converts pyrophosphate into ATP, which then serves as the substrate for luciferase. The light produced is therefore proportional to the amount of pyrophosphate [29].
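Because the light signal is proportional to the pyrophosphate released, it can in principle be converted back into the number of nucleotides incorporated. The sketch below is a minimal illustration under stated assumptions: nucleotides are supplied one at a time in a fixed cyclic order, and the signals have already been normalized so that one incorporated base gives a value close to 1.0. The flow order, the signal values and the function name are hypothetical and do not describe any particular instrument's software.

```python
from typing import Sequence


def decode_flowgram(flow_order: str, signals: Sequence[float]) -> str:
    """Convert normalized light signals into a base sequence.

    Each signal is rounded to the nearest whole number, which is taken as the
    number of identical bases incorporated in that flow (0 means no incorporation).
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]  # nucleotides cycle in turn
        count = max(0, int(signal + 0.5))             # round half up, never negative
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical normalized signals for two cycles of the assumed flow order T, A, C, G.
signals = [1.02, 0.05, 2.10, 0.97, 0.03, 1.05, 0.00, 2.95]
print(decode_flowgram("TACG", signals))  # TCCGAGGG
```

Rounding becomes unreliable for long runs of the same base, where the difference between, say, seven and eight incorporations is small relative to the noise in the signal.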

Pyrosequencing was later licensed to 454 Life Sciences, a biotechnology company founded by Jonathan Rothberg, where it grew into the first major commercially successful next-generation sequencing (NGS) technology. The 454 sequencing machines (the company was later acquired by Roche) marked a paradigm shift because they allowed sequencing reactions to be run massively in parallel, considerably raising the amount of DNA sequenced in a single experiment [30].

Parallelization raised the yield of sequencing efforts by orders of magnitude. It enabled scientists to sequence the whole genome of a single human, James Watson, co-discoverer of the structure of DNA, far more cheaply and quickly than a similar effort by the team of DNA sequencing entrepreneur Craig Venter, which used the Sanger method [31, 32]. The original 454 machine, the GS 20, was later replaced by the 454 GS FLX, which provided not only better-quality data but also a greater number of reads, owing to the larger number of wells in its picotiter plate. It was the first high-throughput sequencing (HTS) machine broadly accessible to customers. Moreover, the concept of running massive numbers of parallel sequencing reactions on a micrometer scale, made possible by improvements in microfabrication and high-resolution imaging, is what came to define the second generation of DNA sequencing [26, 33].

Following the success of 454, several parallel sequencing methods emerged in quick succession. Arguably the most important is the Solexa sequencing method, later acquired by Illumina [33]. This process is based on what is called bridge amplification. Adapter-bracketed DNA molecules are passed over complementary oligonucleotides attached to a flow cell. A solid-phase PCR then generates neighboring clusters of clonal populations from each of the individual original DNA strands bound to the flow cell [34–36]. The HiSeq followed the standard Genome Analyzer version (GAIIx) and is characterized by greater read length and sequencing depth. The MiSeq was introduced later. Its drawback is lower throughput; on the other hand, it is a lower-cost instrument with a quicker turnaround and longer read lengths [37, 38].

In Ion Torrent sequencing, in a manner analogous to 454 sequencing, beads carrying clonal populations of DNA fragments produced by emulsion PCR (emPCR) are washed over a picowell plate, followed by each nucleotide in turn. However, nucleotide incorporation is detected not by the production of pyrophosphate but by the change in pH caused by the protons (H+ ions) released during polymerization, which allows very rapid sequencing during the actual detection phase [39, 40].

Alongside 454 and Solexa/Illumina, the SOLiD system from Applied Biosystems was the third major option in the early days of second-generation sequencing. Its sequencing concept, as the name indicates, is based on oligonucleotide ligation and detection. Applied Biosystems became Life Technologies following its merger with Invitrogen [41, 42]. Even though the SOLiD platform cannot match the read length and depth of Illumina systems, which makes assembly more difficult, it has remained a cost-competitive instrument [39, 43].

Another important sequencing method that uses ligation technology is the DNA nanoball method, in which sequences are likewise obtained by probe ligation. What is innovative is the way the clonal DNA population is generated. Instead of bead or bridge amplification, rolling circle amplification is used to produce long DNA chains consisting of repeating units of the template sequence flanked by adapters. These chains then self-assemble into nanoballs, which are attached to a slide to be sequenced [44].

Lastly, a notable second-generation system is the one that Jonathan Rothberg created after leaving 454: Ion Torrent (another Life Technologies product). It was the first so-called post-light sequencing technology, since it uses neither fluorescence nor luminescence [40].

The frequently mentioned "genomics revolution", driven in large part by these extraordinary improvements in nucleotide sequencing technology, has dramatically changed the cost and effort associated with DNA sequencing. The Illumina sequencing platform, however, has been the most successful in recent years and can therefore probably be considered to have made the greatest contribution to the second generation of DNA sequencers [45].

#### **2.3 Third-generation nucleic acids sequencing**

There has been considerable debate over how to define the different generations of DNA sequencing technology, particularly with regard to the division between the second and the third generation. Suggested distinguishing characteristics of the third generation include single molecule sequencing (SMS), real-time sequencing, and a simple divergence from the earlier technologies [46–49].

The first SMS technology was developed in Stephen Quake's laboratory and later marketed by Helicos BioSciences. It worked broadly as Illumina does, but without the bridge amplification step [50, 51]. The DNA templates are attached to a planar surface, and then proprietary fluorescent reversible terminator deoxyribonucleotide triphosphates (dNTPs), called virtual terminators, are washed over them and imaged one base at a time [52]. Although it was relatively slow and costly and generated short reads, it was the first technology that enabled non-amplified DNA to be sequenced, avoiding the biases and errors that amplification might introduce. Other companies picked up the third-generation baton when Helicos filed for bankruptcy early in 2012 [1, 46, 48]. The most widely used third-generation technology is probably the single molecule real time (SMRT) platform from Pacific Biosciences (the PacBio range) [53]. This method sequences a single molecule in a very short time. The PacBio range also offers useful features not commonly shared by other commercially available machines: it produces kinetic data, which enables the detection of modified bases, and it can generate extremely long reads of more than 10 kilobases (kb), which are well suited to de novo genome assemblies [46, 53, 54].

Nanopore sequencing, an offshoot of the larger field of using nanopores to detect and quantify all kinds of biological and chemical samples, is perhaps the most eagerly anticipated area of third-generation DNA sequencing [55]. Oxford Nanopore Technologies (ONT), the first company to deliver nanopore sequencers, generated considerable excitement with its GridION and MinION platforms [56, 57]. The MinION is a small USB device about the size of a mobile phone, which was first released to end users in an early access trial in 2014 [58]. Despite the admittedly poor quality of the data currently produced by the GridION and MinION platforms, it is hoped that such sequencers represent a genuinely disruptive DNA sequencing technology, delivering sequence data much more cheaply and quickly than previously possible, in the form of extremely long, non-amplified reads [55, 57, 59].

To sum up, the value of DNA sequencing for biological research is hard to overstate, since it is the means of determining one of the fundamental features by which life forms can be identified and distinguished from one another. Over the last half century, numerous investigators from all over the world have therefore invested a great deal of time and money in improving DNA sequencing technologies and in combining features from different generations of sequencers to give new instruments outstanding capabilities. The experience gained with all generations of sequencers will offer new perspectives for future ones, as the lessons learned from each generation guide the development of the next.
