viral reads from viral metagenomic data and also to produce the assembled viral strains (i.e. haplotypes) from classified reads. It has two main components: (1) viral read classification using partial or remotely related reference genomes and (2) de novo assembly of viral haplotypes from recruited reads with PEHaplo [47, 48], which is a haplotype reconstruction tool. Because TAR-VIR has a modular structure, users can substitute other assembly tools after the read classification in step (1).

*Vector-Borne Diseases - Recent Developments in Epidemiology and Control*

**6. Genotyping tools**

While variant discovery and identification tools play a critical role in determining the pathogen responsible for an infection, they are unable to determine the subtype or quasispecies responsible for an outbreak. Arboviruses exist as a mixed population of genomic variants due to rapid replication and the error-prone nature of viral RNA-dependent RNA polymerase (RdRp) [47]. Monitoring virus genotype diversity is therefore crucial to understanding the emergence and spread of outbreaks. Genotyping tools provide an efficient workflow that enables researchers and public health practitioners to determine the strain responsible for an outbreak.

Most freely available bioinformatics programs for classifying viruses into subtypes, genotypes, subgroups or groups rely on similarity search tools to determine the genotype of a new sequence. These genotyping tools use a set of reference genome sequences, carefully selected to represent each individual genotype. Using several reference sequences per genotype increases the consistency and reproducibility of the results, speeds up the search and provides more complete information, while ensuring that the results are not limited by an inadequate set of reference sequences that do not represent the information needed to identify the virus.

Similarity-based methods are useful for identifying recombination patterns in viral sequences, but their results have no statistical support and need further confirmation by phylogenetic methods.

Recently [49], four viral genotyping tools for yellow fever virus (YFV) (https://www.genomedetective.com/app/typingtool/yellowfevervirus/), dengue virus (DENV) (https://www.genomedetective.com/app/typingtool/dengue/), chikungunya virus (CHIKV) (https://www.genomedetective.com/app/typingtool/chikungunya/) and Zika virus (ZIKV) (https://www.genomedetective.com/app/typingtool/zika/) were developed and linked to Genome Detective to enable phylogenetic classification below species level [50, 51].

**6.1 Castor**

The classification and annotation of virus genomes constitute important assets in the discovery of genomic variability, taxonomic characteristics and disease mechanisms. Existing classification methods are often designed for specific well-studied families of viruses [43]. Thus, viral comparative genomic studies could benefit from more generic, fast and accurate tools for classifying and typing newly sequenced strains of diverse virus families.

CASTOR is a virus classification platform based on machine learning methods, inspired by a well-known technique in molecular biology: restriction fragment length polymorphism [52]. It simulates, in silico, the restriction digestion of genomic material by different enzymes into fragments. It uses two metrics to construct feature vectors for machine learning algorithms in the classification step. The performance of CASTOR, its genericity and robustness could permit the conduct of

**7. Phylogenetic and phylodynamic tools**
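The in silico digestion idea behind CASTOR can be illustrated with a minimal Python sketch. It is not CASTOR's implementation: the three enzymes, the function names `digest` and `rflp_features`, and the two per-enzyme metrics used here (fragment count and mean fragment length) are assumptions chosen for illustration, since the two metrics CASTOR actually uses are not named in this section.

```python
import re
from statistics import mean

# Recognition sites and top-strand cut offsets for three common enzymes.
# The choice of enzymes is illustrative only.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
}


def digest(seq, site, offset):
    """Return fragment lengths after cutting a linear sequence at every site."""
    seq = seq.upper()
    # A zero-width lookahead reports every occurrence of the site.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def rflp_features(seq):
    """Build a feature vector: (fragment count, mean fragment length) per enzyme.

    These two metrics are hypothetical stand-ins, not CASTOR's own.
    """
    features = []
    for name in sorted(ENZYMES):
        site, offset = ENZYMES[name]
        lengths = digest(seq, site, offset)
        features += [len(lengths), mean(lengths)]
    return features


if __name__ == "__main__":
    # Toy sequence containing one EcoRI site and one BamHI site.
    print(rflp_features("AAGAATTCTTTTGGATCCAA"))
```

Feature vectors of this kind, computed for reference genomes with known labels, could then be passed to any standard classifier to type a newly sequenced strain.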

Phylogenetic tools are an important resource in virology: they are used to study viral evolution, trace the origin of epidemics, establish the mode of transmission, investigate the occurrence of drug resistance and determine the origin of the virus in different body compartments. Bioinformatics tools are therefore fundamental for monitoring the evolution of viral diversity. They support the genomic sequence analyses that are crucial for the surveillance of viral polymorphism, the development of new therapeutic strategies and vaccine products, and the appropriate choice of products. Toward the development of a global outbreak surveillance system, the advances below have been made.
