*2.1.1.2 Machine learning approaches*

Machine learning is an overarching term for diverse algorithms that use datasets to make intelligent predictions [41]. These algorithms are trained on large datasets to identify patterns and interactions. The trained algorithm can then be applied to novel data to identify or predict outcomes or interactions.

Computer-based drug repurposing techniques that use machine learning have gained considerable traction, owing to a large increase in the omics data available across a variety of databases and to the development of sophisticated algorithms that can exploit these data [42–44]. Such repurposing is carried out using computational biology, bioinformatics and database tools, which allows for economical and highly efficient drug discovery [45]. Machine learning techniques used for drug repurposing include k-nearest neighbor algorithms, decision trees, random forests, artificial neural networks, k-means clustering and principal component analysis [20, 46, 47].
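As a minimal sketch of the train-then-predict idea, the following implements a k-nearest neighbor classifier, one of the techniques listed above, using only the Python standard library. The two-dimensional feature vectors, the `active`/`inactive` labels and the function name `knn_predict` are illustrative assumptions, not data or methods from the cited studies.

```python
from collections import Counter
from math import dist

def knn_predict(training_data, query, k):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort labelled examples by Euclidean distance to the query and keep the closest k.
    nearest = sorted(training_data, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical compounds described by two made-up numeric descriptors.
training_data = [
    ((0.10, 0.20), "inactive"),
    ((0.20, 0.10), "inactive"),
    ((0.15, 0.30), "inactive"),
    ((0.90, 0.80), "active"),
    ((0.80, 0.90), "active"),
    ((0.85, 0.70), "active"),
]

# Apply the "trained" model (here, simply the stored examples) to novel data.
prediction_near_actives = knn_predict(training_data, (0.80, 0.80), k=3)
prediction_near_inactives = knn_predict(training_data, (0.20, 0.20), k=3)
```

In practice the descriptors would be derived from omics or chemical data, and the choice of `k` and of the distance measure would need to be validated on held-out data.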

In recent years, researchers have been unable to keep up with the amount of information generated by omics experiments, creating a need for different data analysis methods. Where previously they would manually comb through the data looking for patterns and connections, there has been a shift towards big data analysis using machine learning approaches, which have found several specific applications in drug repurposing [48].

Signature matching is an approach in which complex patterns and profiles (signatures) are generated for diseases and drugs by machine learning algorithms from large omics datasets. By looking for negative correlations between the differential signatures resulting from diseases and those resulting from drug treatments, drugs can be identified that can serve as treatments for those diseases outside of their original indication [5, 20]. Drug signatures can also be compared with the signatures of structurally dissimilar drugs, the idea being that drugs showing a similar signature may share a therapeutic application irrespective of chemical similarity. For both of these applications there is an alternative signature that can be compared: the clinical phenotype signature. Even though some diseases or drugs might show little to no similarity in their direct transcriptomic, metabolomic or proteomic patterns, they could still have similar clinical phenotypic outcomes, which can also allow for the identification of repurposing uses of drugs [49].
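The negative-correlation idea can be illustrated with a small sketch. It assumes that each signature is a vector of differential expression values over the same ordered set of genes and that Pearson correlation is used as the score; published methods may use other, rank-based scores. All values and drug names below are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rank_reversing_drugs(disease_signature, drug_signatures):
    """Return (drug, correlation) pairs, most negative correlation first."""
    scores = [
        (drug, pearson(disease_signature, signature))
        for drug, signature in drug_signatures.items()
    ]
    return sorted(scores, key=lambda pair: pair[1])

# Hypothetical log2 fold changes for the same five genes (illustrative only).
disease = [2.0, 1.5, -1.0, -2.0, 0.5]
drugs = {
    "drug_A": [-1.8, -1.2, 0.9, 2.1, -0.4],  # roughly reverses the disease profile
    "drug_B": [1.9, 1.4, -1.1, -1.8, 0.6],   # roughly mimics the disease profile
    "drug_C": [0.1, -0.2, 0.3, 0.0, -0.1],   # only weakly related
}

ranking = rank_reversing_drugs(disease, drugs)
```

The drug at the head of `ranking` is the one whose signature most strongly opposes the disease signature, which makes it the first candidate for follow-up under this simplified scoring.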

Another use of signature matching is to find similar chemical features among drugs and to map a network based on the shared features. This allows for the identification of drugs that may potentially be repurposed, since similarity in pharmacophores tends to correlate with similarity in biological activity.
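A feature-sharing network of this kind can be sketched as follows. The example assumes that each drug is represented by a set of named structural features and that the Tanimoto (Jaccard) coefficient is the similarity measure. The feature names, drug names and the 0.5 threshold are hypothetical choices made for illustration.

```python
from itertools import combinations

def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of structural features."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)

def similarity_network(drug_features, threshold):
    """Return edges (drug_1, drug_2, similarity) for pairs at or above the threshold."""
    edges = []
    for (name_a, feats_a), (name_b, feats_b) in combinations(sorted(drug_features.items()), 2):
        score = tanimoto(feats_a, feats_b)
        if score >= threshold:
            edges.append((name_a, name_b, score))
    return edges

# Hypothetical pharmacophore-like feature sets.
drug_features = {
    "drug_A": {"aromatic_ring", "h_bond_donor", "h_bond_acceptor", "carboxylate"},
    "drug_B": {"aromatic_ring", "h_bond_donor", "h_bond_acceptor", "sulfonamide"},
    "drug_C": {"cationic_amine", "hydrophobic_tail"},
}

edges = similarity_network(drug_features, threshold=0.5)
```

Here `drug_A` and `drug_B` share three of five distinct features and are therefore connected, while `drug_C` remains isolated. Real applications would typically use computed molecular fingerprints rather than hand-labelled features.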

Related to signature-based methods, the application of genome-wide association studies (GWAS) has also been shown to be valuable within the field of drug repurposing [50]. GWAS data can be analyzed using machine learning approaches to identify interaction and association patterns of genes linked to diseases [51]. Genes identified by GWAS as associated with a disease tend to be enriched with druggable targets. By cross-referencing the disease-enriched genes with databases containing drug-target information, drugs can be found that inhibit the products of specific genes that are involved in other indications but also seemingly play a role in the GWAS-investigated disease, so that the drug could potentially be reused. In addition, if a gene is shown to be associated with a disease, it could become a novel drug target, which can be screened against using approved drug libraries.
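The cross-referencing step amounts to intersecting a disease gene set with per-drug target sets. A minimal sketch is given below, assuming the GWAS hits and the drug-target database have already been reduced to simple Python sets and dictionaries. The gene, drug and disease identifiers are placeholders, and `repurposing_candidates` is a hypothetical helper rather than part of any existing tool.

```python
def repurposing_candidates(gwas_genes, drug_targets, drug_indications, disease):
    """Find drugs that hit GWAS-associated genes but are not yet indicated for the disease."""
    candidates = {}
    for drug, targets in drug_targets.items():
        shared = gwas_genes & targets
        already_indicated = disease in drug_indications.get(drug, set())
        if shared and not already_indicated:
            candidates[drug] = sorted(shared)
    return candidates

# Hypothetical GWAS hits for "disease_Y" and a toy drug-target table.
gwas_genes = {"GENE1", "GENE2", "GENE3"}
drug_targets = {
    "drug_A": {"GENE2", "GENE9"},  # approved for another disease, shares GENE2
    "drug_B": {"GENE7"},           # no overlap with the GWAS genes
    "drug_C": {"GENE1", "GENE3"},  # overlaps, but already used for disease_Y
}
drug_indications = {
    "drug_A": {"disease_X"},
    "drug_B": {"disease_X"},
    "drug_C": {"disease_Y"},
}

candidates = repurposing_candidates(gwas_genes, drug_targets, drug_indications, "disease_Y")
```

Only `drug_A` is returned, because it targets a GWAS-associated gene while being indicated for a different disease. A realistic pipeline would additionally consider the direction of effect, since a drug that inhibits a target is only useful if inhibition is the desired outcome.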

Even though GWAS-identified genes can be associated with a disease, that does not mean that the target is druggable. Pathway mapping could be a potential tool to leverage the information gained with GWAS and expand upon it [52]. By analyzing the pathways or protein interaction networks upstream and/or downstream of the GWAS-identified genes, other, previously elusive proteins can be identified that could play a role in disease progression. This can yield either new drug targets or repurposing opportunities for drugs that already inhibit the elucidated target. For example, pathway analysis was performed on datasets containing gene expression data from human hosts infected with many different respiratory viruses. This identified 67 conserved biological pathways that could play an important role in respiratory viral infections. Comparing these pathways to a drug-target database yielded drugs with a different indication, such as pranlukast and amrinone, that could potentially be used in treating viral infections [53].
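The expansion from GWAS-identified genes to neighbouring proteins can be sketched as a breadth-first search over an interaction network. The example assumes an undirected network stored as an adjacency dictionary and a separately supplied set of druggable proteins. All identifiers, and the cut-off of two interaction steps, are hypothetical.

```python
from collections import deque

def expand_targets(network, seed_genes, max_steps):
    """Breadth-first search mapping each reachable protein to its distance from the nearest seed."""
    distance = {gene: 0 for gene in seed_genes}
    queue = deque(seed_genes)
    while queue:
        current = queue.popleft()
        if distance[current] == max_steps:
            continue  # do not expand beyond the allowed number of interaction steps
        for neighbor in network.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance

# Toy interaction network: GENE1 - PROT_A - PROT_B - PROT_C.
network = {
    "GENE1": ["PROT_A"],
    "PROT_A": ["GENE1", "PROT_B"],
    "PROT_B": ["PROT_A", "PROT_C"],
    "PROT_C": ["PROT_B"],
}
druggable = {"PROT_B", "PROT_C"}  # assumed to come from a drug-target database

distance = expand_targets(network, ["GENE1"], max_steps=2)
# Druggable proteins within range of the seed, excluding the seed itself.
hits = sorted(p for p, d in distance.items() if d > 0 and p in druggable)
```

With a two-step limit the search reaches `PROT_B` but not `PROT_C`, so only `PROT_B` is reported as a druggable neighbour of the non-druggable seed gene.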

#### *2.1.2 Experiment-based approaches*

Empirical evidence is still the highest order of evidence and remains the gold standard for drug screening, including drug repurposing. Since experimental assays provide the most immediate evidence of drug activity [51], they are not only used to discover potential repurposing candidates from libraries but are also essential in validating hits from computational approaches.

Inhibition assays can serve to identify target-specific drug efficacy, including inhibition constants. Binding assays are very powerful as they can also provide binding constant information [54]. A binding drug identified in this way might not be highly specific or effective, but it can be put to immediate use as a temporary stopgap in emergency situations (like pandemics). Whilst the repurposed drug is being used as an interim measure, drug development can be undertaken in parallel, using the drug as the starting point. Rapid structure-activity relationship (SAR) approaches can then be utilized to improve the drug binding and efficacy [55]. The fact that the resulting drug would ideally be quite similar to the approved drug could lead to accelerated approval processes.
