**•** Each site has three rate parameters, *α*, *β*\_I (instantaneous non-synonymous site rate for internal branches) and *β*\_L (instantaneous non-synonymous site rate for terminal branches).

**•** Selection is tested only for internal branches of the phylogenetic tree.

Here, the null model assumes that *α* = *β*\_I.

*4.5.3. Mixed Effects Model of Evolution (MEME)*

**Principle:** MEME is categorized under the 'branch-site random effects' phylogenetic methods [112]. Although this method is a generalization of the FEL method, it differs from FEL and IFEL by accounting for episodic positive selection that affects only a subset of lineages. MEME uniquely allows the distribution of *dN/dS* (*ω*) to vary from site to site (the fixed effect) and also from branch to branch at a site (the random effect).

**Algorithm steps:**

**i.** Steps 'i' and 'ii' are the same as in the SLAC method (Section 4.5.1), whereas step 'iii' varies as follows:

**ii.** The *ω* ratio is modelled across lineages at an individual site, i.e., each site is treated as a fixed-effect component of the model using a two-bin random distribution with *ω*− ≤ 1 (proportion *p*) and *ω*+ (unrestricted, proportion 1−*p*). Thus, a proportion (*p*) of branches at a site evolve neutrally (or under negative selection), while the remaining (1−*p*) may evolve under diversifying selection. To test for evidence of episodic selection, a likelihood ratio test is applied.

**4.6. Methods for reconstruction of molecular phylogeny**

Molecular phylogenetic analyses are the most commonly performed studies in virology, with major applications in viral taxonomy, systematics and genotyping. Methods for reconstruction of phylogenetic trees are broadly classified into three main categories, *viz.* distance-based, character-based and Bayesian-based, and have been reviewed earlier [113, 114]. Distance-based methods use a pairwise distance matrix as input for tree building. Neighbour-joining [115], minimum evolution [116] and least squares [117, 118] are widely used methods in this category. These methods are computationally efficient and suitable for the analysis of large datasets with low levels of sequence divergence. However, they do not perform equally well for highly divergent sequences with low levels of sequence similarity. Moreover, uncertainties can be introduced by the positioning of gaps in the MSA. Character-based methods assume that each site in the MSA evolves independently. The two classical methods in this category are maximum parsimony and maximum likelihood [119], which score a tree based on the minimum number of changes and the log-likelihood value, respectively. However, it should be noted that alignment-based phylogenetic methods have been observed to misclassify taxa with mixed ancestry and/or recombination [91, 92].

Alignment-free methods have been developed as an alternative and can be classified into four categories based on the underlying principles employed: *k*-mer/word composition, substring theory, information theory and graphical representation [120].

**4.7. Methods for typing of viruses**

Phylogenetic analysis, whether alignment-based or alignment-free, is routinely used for genotyping/serotyping of viruses. Such analysis is carried out using the regions identified as markers for classification by expert evolutionary virologists and the International Committee on Taxonomy of Viruses (ICTV) [122]. It has been observed that genotype information is available as part of the sequence record for less than 10% of viral genomes. As NGS technologies are producing a large number of genomic sequences for various strains, isolates and viral species, the genotype assignment gap is ever-increasing. Several genotyping tools have been developed using both alignment-based and alignment-free methods, and most of them are organism-specific. The NCBI Genotyping Tool uses sequence similarity to identify the genotype of recombinant and non-recombinant viral sequences [123]. A similar tool, FluGenome, exists for *Influenza virus* [124]. An alignment-free method for phylogeny and genotyping of viruses, based on the concept of Return Time Distribution, has been developed *in-house*, and its applicability to genotyping of viruses such as *Mumps virus*, *Dengue virus* and *West Nile virus* has been demonstrated [125–127].
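To make the idea of an alignment-free, Return Time Distribution (RTD) based comparison more concrete, the following Python sketch may help. It rests on several assumptions that are not stated in this section: the return time of a *k*-mer is taken as the difference between the start positions of its successive occurrences, each *k*-mer's RTD is summarized by its mean and standard deviation, and two sequences are compared by the Euclidean distance between the resulting vectors. The function names, the default *k* = 2 and the nearest-reference assignment in `assign_genotype` are hypothetical and chosen only for illustration; they are not the published procedure of [125–127].

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev


def kmer_return_times(seq, k):
    """Map each k-mer to the gaps between its successive start positions."""
    last_seen = {}
    gaps = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            gaps.setdefault(kmer, []).append(i - last_seen[kmer])
        last_seen[kmer] = i
    return gaps


def rtd_vector(seq, k=2, alphabet="ACGT"):
    """Summarize a sequence as (mean, SD) of return times for every k-mer.

    The vector has 2 * len(alphabet)**k entries. k-mers that never recur
    contribute (0.0, 0.0), which is an assumption of this sketch.
    """
    gaps = kmer_return_times(seq.upper(), k)
    vec = []
    for letters in product(alphabet, repeat=k):
        rts = gaps.get("".join(letters), [])
        vec.append(mean(rts) if rts else 0.0)
        vec.append(pstdev(rts) if rts else 0.0)
    return vec


def rtd_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the RTD summary vectors of two sequences."""
    va, vb = rtd_vector(seq_a, k), rtd_vector(seq_b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


def assign_genotype(query, references, k=2):
    """Return the label of the reference closest to the query.

    `references` maps a genotype label to a reference sequence. Nearest-
    reference assignment is a simplification used for illustration only.
    """
    return min(references, key=lambda label: rtd_distance(query, references[label], k))


if __name__ == "__main__":
    # Toy sequences, not real viral genomes.
    refs = {"type_1": "ACACACACACAC", "type_2": "ACGTACGTACGT"}
    print(assign_genotype("ACACACACAC", refs))
```

In practice, a pairwise matrix of such distances could also be passed to a distance-based tree-building method such as neighbour-joining, so that genotypes are read from the clustering of a query with reference strains rather than from a single nearest neighbour.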
