## **1. Introduction**

At the early stages of the COVID-19 pandemic, sequencing the full genome of SARS-CoV-2 was key to investigating the newly emerging outbreak of pneumonia in Wuhan, China. It also provided evidence that the outbreak was caused by a novel virus belonging to the family *Coronaviridae* [1–3]. The viral sequence was made public very rapidly, allowing the scientific community to begin analyses and pandemic preparedness promptly [4]. At that time, the availability of the full viral genomic sequence made possible the development of rapid and affordable molecular diagnostic tools to identify infectious patients (symptomatic and asymptomatic) so that they could be isolated. This was the first weapon to control the spread of the disease, given the lack of approved therapeutics and vaccines at that time [5, 6]. Later on, genomic surveillance of the virus played a central role in the prevention and control of the disease throughout the course of the pandemic [7]. It made it possible to study many different aspects of the disease, such as the transmission patterns of the virus, when and from where the virus was introduced into a country, and local and superspreading events. Most notably, genomic surveillance was key to tracking virus evolution, revealing the emergence of genomic variants worldwide. Variants that are more transmissible or virulent, and/or that can decrease the effectiveness of treatments, vaccines, and public health measures, were defined by the WHO, in consultation with the Technical Advisory Group on Virus Evolution, as variants of concern (VOCs) [8]. To date, five VOCs (named Alpha, Beta, Gamma, Delta, and Omicron) have emerged at different times and places as a result of viral evolution, displaying different features compared to the first strain isolated in Wuhan. Among these, Alpha, Beta, Gamma, and Delta are now designated former VOCs, as they appear to no longer circulate in the population. Omicron and its descendant sublineages are, at the time of writing this chapter, the only circulating VOCs [8]. The five VOCs have proven able to act as 'game changers', reshaping infection dynamics and causing new waves of infections in many countries. Periodic genomic sequencing of viral samples kept the world informed during the pandemic and enabled appropriate public health measures to be taken. Implementing comprehensive real-time genomic surveillance programs is vital for monitoring, detecting, and characterizing new variants, helping health authorities to better manage the crisis.

## **2. SARS-CoV-2 then**

Whole-genome sequencing of specimens from an outbreak of pneumonia in Wuhan, China, in December 2019 led to the discovery of a previously uncharacterized virus capable of infecting humans [2]. The first annotation of the complete 29,903-nucleotide genome of SARS-CoV-2 revealed that it was a positive-sense, single-stranded RNA virus of the genus *Betacoronavirus* (β-CoVs). Comparative phylogenetic analysis shed light on the genomic organization of SARS-CoV-2, showing that it shared key structural similarities with other coronaviruses, including SARS-CoV, the causative agent of the severe acute respiratory syndrome outbreak in Asia in 2003. In addition, this novel virus, like many other members of the β-CoVs genus, had its evolutionary roots in viruses known to commonly infect bats [9, 10]. Yet none of the SARS-CoV-2-related coronaviruses available in public databases shows more than 99% similarity to SARS-CoV-2 across the genome as a whole, suggesting that none of these viruses is its direct ancestor. Efforts to find possible reservoirs and/or an intermediate host of the virus among wild animals, mostly bats and pangolins, have not yet clarified the exact event by which the virus emerged in the human population. The genome of SARS-CoV-2 has a rather mosaic pattern, to which different progenitors seem to have contributed [11].

The closest related bat-borne virus at the whole-genome level identified so far is RaTG13 (from *R. affinis*, China, 2013), which shares 96.2% identity with SARS-CoV-2 [3]. Despite this high overall similarity, the receptor-binding domain (RBD) sequence of the Spike (S) protein of SARS-CoV-2 diverges significantly from that of RaTG13. RaTG13 lacks the four-residue (PRRA) insertion at the furin cleavage site of the S protein, essential for viral binding to human cell receptors and infection [12, 13]. Furthermore, the authors of [13] demonstrated that the binding affinity between the RaTG13 RBD and human angiotensin-converting enzyme 2 (hACE2) is approximately 70-fold lower than that between the SARS-CoV-2 RBD and hACE2. Further phylogenetic analysis identified pangolin-derived coronaviruses that cluster with RaTG13 and SARS-CoV-2 and share a higher amino acid similarity with the RBD of SARS-CoV-2 (97.4%) [14]. This analysis raised the

#### *Perspective Chapter: Real-Time Genomic Surveillance for SARS-CoV-2 on Center Stage DOI: http://dx.doi.org/10.5772/intechopen.107842*

possibility that SARS-CoV-2 might have originated from a recombination event between a virus similar to pangolin-CoV and one similar to RaTG13 [15, 16]. Other groups have identified coronaviruses sampled from bats that share higher similarity to the RBD of SARS-CoV-2, such as STT182 and STT200 from Cambodia [17] and BANAL-52 and BANAL-103 from Laos [18]. Of note, comparative sequence analysis of the viruses sampled in Laos showed that their RBDs have only one or two amino acid mismatches at the 17 residues that interact with the hACE2 receptor. Many more groups have continued to collect genomic sequence data from coronaviruses and to sample animal reservoirs to better understand the exact spillover event and emergence process of SARS-CoV-2. These studies are highly important given the latent threat of emergence and re-emergence of infectious diseases of animal origin.
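The comparisons discussed above rest on two simple measurements: the percent identity between two aligned sequences, and the number of mismatches at a chosen set of positions (for example, the residues that contact hACE2). The following is a minimal sketch of both, assuming the sequences have already been aligned with an external tool. The function names, the toy sequences, and the example contact columns are hypothetical and are not taken from the cited studies; note also that identity values depend on how gaps are treated, and this sketch simply skips gapped columns.

```python
from typing import Iterable, List

GAP = "-"  # assumed gap character in the pre-computed alignment


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == GAP or b == GAP:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mismatches_at(aligned_a: str, aligned_b: str, columns: Iterable[int]) -> List[int]:
    """Return the 0-based alignment columns (from `columns`) where the residues differ."""
    return [c for c in columns if aligned_a[c].upper() != aligned_b[c].upper()]


# Toy data for illustration only; these are not real viral sequences.
reference = "ACGTACGTAC"
query = "ACGTTCGTAC"
contact_columns = [0, 4, 9]  # hypothetical positions of interest

print(f"identity: {percent_identity(reference, query):.1f}%")
print(f"mismatched contact columns: {mismatches_at(reference, query, contact_columns)}")
```

In practice, whole-genome and RBD-level values such as those quoted in this section come from full alignment pipelines, so a sketch like this is only useful for understanding what the reported percentages measure.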
