**2.3 Inferring protein-protein interaction and biological networks for knowledge discovery**

In this chapter, we focus only on protein-protein interaction (PPI) networks. A PPI network is defined as a set of nodes (or vertices) representing proteins, connected by undirected edges (or links) representing the interactions or relationships between them (either *direct physical* or *functional* interactions). A physical interaction involves physical contact between proteins. A functional interaction is a broader notion: it does not necessarily involve direct physical contact, but refers to a mechanism through which a protein participates in cell functions [21]. Several learning algorithms have been used to infer human and human-pathogen PPIs [22], including artificial neural networks (ANNs) [23].
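The definition above maps directly onto an undirected graph data structure. The following is a minimal sketch, assuming an adjacency-map representation; the class name `PPINetwork`, its methods, and the protein identifiers `P1` to `P3` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class PPINetwork:
    """Undirected protein-protein interaction network stored as an adjacency map."""

    def __init__(self):
        # protein -> {neighbor: interaction kind}
        self._adj = defaultdict(dict)

    def add_interaction(self, a, b, kind):
        if kind not in ("physical", "functional"):
            raise ValueError(f"unknown interaction kind: {kind}")
        # Edges are undirected, so the interaction is stored in both directions.
        self._adj[a][b] = kind
        self._adj[b][a] = kind

    def neighbors(self, protein, kind=None):
        # .get avoids creating a node when an unknown protein is queried.
        edges = self._adj.get(protein, {})
        return sorted(n for n, k in edges.items() if kind is None or k == kind)

    def degree(self, protein):
        return len(self._adj.get(protein, {}))


# Hypothetical proteins and interactions, for illustration only.
net = PPINetwork()
net.add_interaction("P1", "P2", "physical")
net.add_interaction("P1", "P3", "functional")
net.add_interaction("P2", "P3", "functional")

print(net.neighbors("P1"))              # ['P2', 'P3']
print(net.neighbors("P1", "physical"))  # ['P2']
print(net.degree("P3"))                 # 2
```

Storing each edge in both directions keeps neighbor lookups cheap, at the cost of doubling the memory used per interaction.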

Several types of PPI networks exist, distinguished by the type of interaction they record. When these are integrated into a single unified network, the relationships between proteins are referred to as functional interactions. Here, we refer only to functional interactions, which include physical and genetic interactions as well as those inferred from co-expression, shared evolutionary history, or shared biological pathways. Other types of biological networks include signaling networks, gene regulatory or DNA-protein interaction networks [24, 25], disease-gene networks linking diseases to the genes that cause them, and drug interaction networks connecting drugs to their targets [26]. These biological networks have been used in several applications, and analyzing their individual, collective, and sub-network behaviors has enabled knowledge discovery at different levels of biology.

*Artificial Intelligence - Applications in Medicine and Biology*

**2.2 Analyzing gene expression profiles**

deep learning. They developed a software tool (DeepBind) based on deep convolutional neural networks that can discover new patterns in a sequence without knowing where in the sequence the pattern is located. DeepBind is also reported to: learn from very large amounts of sequence data through parallel implementation on a graphics processing unit (GPU); use both microarray and sequencing data; train predictive models automatically, without hand-tuning; tolerate mislabeled data and some noise; and generalize well across technologies despite technology-specific biases. Furthermore, DeepBind was used to identify the sequence specificities of RNA- and DNA-binding proteins, and showed resilience to outliers and array biases. This suggests that the problem of predicting sequence specificities has been addressed effectively using the deep learning approach.
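The position independence mentioned above comes from a general property of convolutional layers followed by max-pooling. The sketch below illustrates that idea only; it is not DeepBind's implementation. The filter is set by hand rather than learned, and the motif `TGCA` and the function names are hypothetical.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element indicator vectors."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]


def conv_scan(seq, kernel):
    """Slide a (width x 4) kernel along the sequence; return one score per window."""
    x = one_hot(seq)
    width = len(kernel)
    scores = []
    for start in range(len(x) - width + 1):
        s = 0.0
        for offset in range(width):
            for channel in range(4):
                s += kernel[offset][channel] * x[start + offset][channel]
        scores.append(s)
    return scores


def motif_score(seq, kernel):
    """Rectify the window scores, then max-pool over positions.

    Assumes len(seq) >= len(kernel), so at least one window exists.
    """
    return max(max(0.0, s) for s in conv_scan(seq, kernel))


# Hand-set filter (an assumption, not a learned one): +1 for the motif base
# at each position, -1 for any other base.
motif = "TGCA"
kernel = [[1.0 if b == m else -1.0 for b in BASES] for m in motif]

print(motif_score("AATGCAAA", kernel))  # 4.0 (motif in the middle)
print(motif_score("TGCAAAAA", kernel))  # 4.0 (motif at the start)
print(motif_score("AAAAAAAA", kernel))  # 0.0 (motif absent)
```

Because the maximum is taken over all window positions, the score is the same wherever the motif occurs. In a trained network the kernel values would be learned from data rather than written by hand.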

With the increased availability of genome-wide gene expression assays in public databases, there is increasing demand for more efficient computational models for data interpretation. Artificial neural networks are increasingly preferred over traditional analysis methods in biomedical research, as they have been reported to be better classifiers. Deep neural networks that take RNA-seq data as input are being used for predictive modeling. Classic models for applications such as predicting patient outcomes from gene expression data still do not reach the expected level of performance, creating a need for more efficient and robust algorithms. Recent studies that apply deep learning models to gene expression data have indicated better performance. Urda et al. [17] illustrated the use of a multi-layer feed-forward artificial neural network, shown in **Figure 2**, to analyze RNA-seq gene expression data.
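A forward pass through a multi-layer feed-forward network of this kind can be sketched as follows. This is a minimal illustration, not the model of Urda et al.: the layer sizes, sigmoid activations, random untrained weights, and expression values are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs, hidden_sizes, seed=0):
    """Return one (weights, biases) pair per layer, ending in a single output unit."""
    rng = random.Random(seed)
    sizes = [n_inputs] + list(hidden_sizes) + [1]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        # weights[j][i] connects input i of the layer to its unit j.
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Propagate an input vector through every layer; return the output probability."""
    activation = list(x)
    for weights, biases in layers:
        activation = [
            sigmoid(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation[0]


P = 5                                    # number of genes (input units)
net = init_network(P, hidden_sizes=[4, 3])  # k = 2 hidden layers
expression = [0.2, 1.5, 0.0, 3.1, 0.7]   # made-up expression levels
prob = forward(net, expression)
label = 1 if prob >= 0.5 else 0          # binary class decision
print(prob, label)
```

The weights here are random, so the output carries no biological meaning. In practice they would be fitted to labeled samples, typically by backpropagation.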

Dincer et al. [18] present a model that uses variational auto-encoders (VAEs) to extract latent variables from publicly available expression datasets and use them as features for predicting phenotypes. Their system, called DeepProfile, uses deep learning to learn a feature representation from large, unlabeled expression data samples that are not part of the prediction problem. The system was successfully used to predict response to cancer drugs from gene expression data. It also helped determine the effects of given drugs on specific patients.
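The encoder step of a generic VAE can be sketched as below. This illustrates the general technique only and is not DeepProfile's architecture, which is not described here. The single linear encoder layer, the random untrained weights, the dimensions, and the expression values are assumptions; the decoder and training loop are omitted.

```python
import math
import random


def linear(weights, x):
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def encode(x, w_mu, w_logvar):
    """Map an expression vector to the mean and log-variance of q(z | x)."""
    return linear(w_mu, x), linear(w_logvar, x)


def sample_latent(mu, logvar, rng):
    """Reparameterization: z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, sigma^2) to N(0, I), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


rng = random.Random(0)
n_genes, n_latent = 6, 2
# Untrained random encoder weights (assumption, for illustration only).
w_mu = [[rng.uniform(-0.3, 0.3) for _ in range(n_genes)] for _ in range(n_latent)]
w_logvar = [[rng.uniform(-0.3, 0.3) for _ in range(n_genes)] for _ in range(n_latent)]

expression = [0.2, 1.5, 0.0, 3.1, 0.7, 0.9]  # made-up expression levels
mu, logvar = encode(expression, w_mu, w_logvar)
z = sample_latent(mu, logvar, rng)           # stochastic code used during training
features = mu                                # deterministic features for a downstream predictor
print(features, kl_to_standard_normal(mu, logvar))
```

Using the posterior mean as the feature vector, rather than a random sample, is one common choice because it gives the same features for the same input. Whether DeepProfile does this is not stated in the text.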

**Figure 2.**

*Example neural network for binary classification. Input layer of P gene expression levels connected to k hidden layers through synaptic weights w.*
