**Abstract**

*Candida albicans* (*C. albicans*) infections are increasingly recognized as a threat to public health. Moreover, the prevalence of antifungal-resistant strains of *C. albicans* has emphasized the need to identify potent targets for rational drug design. Traditional methods of target identification and validation are expensive and time-consuming. Genome scale metabolic model construction offers a promising alternative that, in combination with subtractive genome analysis, allows novel targets to be identified. This chapter details current advances in model construction, target identification and validation. In brief, it describes the overall strategy of *C. albicans* metabolic draft preparation, gap filling, model curation, simulation, model validation, target identification and host-pathogen interaction analysis. Finally, several examples of successful metabolic model construction and their utility in rational drug design are discussed.

**Keywords:** Genome Scale Metabolic Model, Target Identification, Drug Designing, Host-Pathogen Interaction, *In-Silico* Gene Knockout

#### **1. Introduction**

*Candida albicans* (*C. albicans*) is an opportunistic fungal pathogen that lives in equilibrium with the normal microbial flora of healthy individuals [1]. As a commensal, it colonizes the mucosal surfaces of the oral cavity and of the respiratory, gastrointestinal and genitourinary tracts. On transformation into a pathogen, however, it breaches the protective barriers of immunocompromised patients and causes candidiasis [2, 3]. Over 90% of cancer and HIV patients suffer from oropharyngeal candidiasis, whereas vulvovaginal candidiasis affects 138 million women per year [4, 5]. Candidemia is the most frequent nosocomial infection, accounting for up to 15% of bloodstream infections, with a mortality rate of 40 to 70% [6]. Consequently, candidiasis has become the most common fungal infection responsible for increased mortality and morbidity worldwide.

Only a limited number of antifungals have been approved for the clinical treatment of candidiasis. These antifungals fall into four major classes: azoles, polyenes, echinocandins and pyrimidine analogs [7]. Azoles and echinocandins target the enzymes responsible for synthesis of the cell membrane and the cell wall, respectively, while polyenes bind directly to membrane ergosterol and disturb the osmotic balance of the cell. Pyrimidine analogs are the only antifungals that target the pathogen's genome rather than its proteome [8, 9]. Consequently, the antifungals disturb the integrity of the cell directly or indirectly, which ultimately leads to the death of the pathogen (**Figure 1**).

*C. albicans* nevertheless remains an emerging pathogen because of the side effects associated with these antifungals, such as red blood cell toxicity, nephrotoxicity, hepatotoxicity, arrhythmias, cardiotoxicity and gastrointestinal disturbances [10, 11]. Drug resistance has further increased the complexity of the disease. Resistance arises from the prolonged or indiscriminate use of antifungals, and its mechanisms include hyperactivity of efflux pumps, mutation of target genes and metabolic bypass [12, 13]. *C. albicans* thus shows different resistance patterns against different antifungals, which makes the infection harder to manage. In addition, the significant homology of drug targets with human genes and proteins, together with fitness traits and survival strategies such as secretion of hydrolytic enzymes, morphogenetic switching, adhesion to surfaces and biofilm formation, makes this pathogen hard to kill. This scenario emphasizes the need for novel drugs that are effective against resistant strains of *C. albicans*, easily accessible and associated with few or no side effects [14–18].

#### **Figure 1.**

*Antifungal drug discovery and resistance. A) Since the 1990s, polyenes, pyrimidine analogs, azoles, echinocandins, allylamines, morpholines and thiocarbamates have been approved for the treatment of* C. albicans *infection. Nystatin and 5-flucytosine bind to membrane ergosterol and thymidylate synthetase, respectively, which leads to the leakage of osmotic constituents. Azoles inhibit the synthesis of ergosterol by interrupting the activity of lanosterol 14α-demethylase. Echinocandins block the participation of (1,3)-β-D-glucan synthase in glucan synthesis. Allylamines and thiocarbamates block the oxidation of squalene. Each of these actions ultimately leads to cell death. B) Drug resistance has since emerged. Mechanisms of resistance involve (1) overexpression of the target product, (2) modification of the target enzyme, (3) hyperactivation of multi-drug pumps, (4) production of a cell wall barrier, (5) adaptation to stress or metabolic bypass and (6) inactivation of the drug.*

#### *Metabolic Network Modeling for Rational Drug Design against* Candida albicans *DOI: http://dx.doi.org/10.5772/intechopen.96749*

Effective drug design is possible only after (i) identification and (ii) validation of a potential target. Both *in silico* and *in vitro* approaches have been used to identify and validate novel targets of *C. albicans* prior to drug design. Traditional methods of target identification with experimental validation (growth assay, enzyme inhibition assay, gene knockout, yeast two-hybrid system, RNA interference) are expensive, time-consuming and focused on a few genes rather than the whole genome. To reduce time and cost, various *in silico* approaches such as subtractive genomics, comparative genomics, machine learning and inverse docking have been applied to novel target identification [19–21]. A reliable approach for validating the proposed drug targets is, however, still required.

The present chapter introduces advances in the reconstruction and analysis of the genome scale metabolic model (GSMM), a platform that mimics the biological environment of the pathogen *in silico* so that the essentiality of a target for the survival of the pathogen can be validated. The "Gene-Protein-Reaction" association in a GSMM makes it a hub for target validation, while the stoichiometric matrix of the model captures how metabolites are linked to each reaction. If deleting a gene, and thereby blocking the reactions that depend on it, has a negative effect on the growth of the pathogen, the gene is considered essential. The approach also reveals the effect of the inhibition on the whole metabolome of the pathogen. The GSMM can be used independently or in combination with other approaches (high-throughput transcriptome profiling and subtractive genomics) (**Figure 2**) [19, 20, 22, 23].

#### **Figure 2.**

*GSMM as a central approach for target identification and validation.* In silico *approaches such as (1) transcriptome data analysis, (2) genome scale metabolic reconstruction (GSMM), (3) subtractive and comparative genomics and (4) integration of genomic variant datasets of the pathogen are widely used for target identification and validation. Among these, GSMM serves as a central approach that can be used independently or in combination with the others. The gene-protein-reaction association and the* in silico *gene knockout feature of GSMM make it a reliable and standard approach for validating putative targets by monitoring the impact of gene deletion on the biomass of the system. Further, the proposed genes can be prioritized by host-pathogen interaction analysis and protein-protein interaction analysis.*
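The gene-protein-reaction logic behind an *in silico* knockout can be illustrated on a toy network. The sketch below is a minimal Python illustration and not the COBRA Toolbox workflow used in this chapter: all genes, reactions and metabolites are invented, and growth is approximated by a simple producibility test (can the biomass precursor still be reached from the medium?) instead of flux balance analysis.

```python
# Toy in silico gene knockout. All genes, reactions and metabolites are invented.
# Each reaction maps to (substrates, products, GPR rule).
# A GPR rule is a list of alternative enzyme complexes (isozymes, logical OR);
# each complex is a list of genes that are all required (logical AND).
REACTIONS = {
    "R1": ({"glc"}, {"g6p"}, [["geneA"]]),
    "R2": ({"g6p"}, {"pyr"}, [["geneB"], ["geneC"]]),   # two isozymes
    "R3": ({"pyr"}, {"biomass"}, [["geneD", "geneE"]]),  # two-subunit complex
}
MEDIUM = {"glc"}
TARGET = "biomass"
GENES = sorted({g for _, _, gpr in REACTIONS.values() for cpx in gpr for g in cpx})


def reaction_active(gpr, deleted):
    """A reaction stays active if at least one complex keeps all of its genes."""
    return any(all(gene not in deleted for gene in cpx) for cpx in gpr)


def can_grow(deleted=frozenset()):
    """Qualitative growth proxy: is TARGET producible from MEDIUM?"""
    available = set(MEDIUM)
    changed = True
    while changed:
        changed = False
        for substrates, products, gpr in REACTIONS.values():
            if (reaction_active(gpr, deleted)
                    and substrates <= available
                    and not products <= available):
                available |= products
                changed = True
    return TARGET in available


def essential_genes():
    """Genes whose single deletion abolishes production of TARGET."""
    return [gene for gene in GENES if not can_grow({gene})]


if __name__ == "__main__":
    print("Essential genes:", essential_genes())
```

In this toy case the isozyme pair `geneB`/`geneC` is individually dispensable, whereas `geneA` and both subunits of the `geneD`/`geneE` complex are flagged as essential. A real analysis would replace the producibility test with a quantitative growth prediction.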

#### **2. Genome scale metabolic model reconstruction**

A genome scale metabolic model (GSMM) is a computational framework that allows efficient and comprehensive annotation of the metabolic functions of an organism, integration with large-scale omics datasets and the study of microbe-host interactions [24, 25]. In brief, it describes the gene-protein-reaction associations of the organism and thereby mimics its biological condition *in silico*, which helps in understanding the genetic engineering, protein-protein interactions and evolutionary traits of the organism [26]. Consequently, it generates predictions ranging from the lethality of a pathogen's gene to the dynamics of the host's defense mechanisms against infection.

As genome scale metabolic model reconstruction becomes a standard approach, the need for automated *in silico* tools to design and analyze such networks becomes more apparent [27, 28]. The availability of whole-genome sequences of pathogens further encourages the construction of *in silico* models. Recent examples have shown the potential of these models in the search for novel drug targets in pathogenic organisms [29–33]. Kim et al. (2009) developed a model of multidrug-resistant *A. baumannii* and identified novel essential targets with therapeutic implications. Abdel-Haleem et al. (2018) described the reconstruction of genome scale metabolic models for five life cycle stages of *Plasmodium falciparum*, enabling the identification of potential drug targets that could serve both as anti-malarial drugs and as transmission-blocking agents [34]. Rienksma et al. (2019) developed a combined model of *M. tuberculosis* and human to understand the metabolic state of the pathogen during infection. Rienksma and team also assessed the effect of increasing dosages of drugs targeting metabolism on the metabolic state of the pathogen and predicted the resulting metabolic adaptations and flux rerouting through various pathways [35]. Similarly, Nouri et al. (2020) designed a comprehensive model of *Z. mobilis* to find targets for metabolic engineering applications [36]. A model of *C. albicans* would likewise be a strong platform for understanding its metabolic state under distinct adverse conditions, which helps to identify and validate targets for novel drug design, even against resistant strains.

#### **3. Experimental design for** *C. albicans* **model**

Construction of a model involves four major stages: 1) preparation of a draft; 2) manual curation; 3) generation of a mathematical model; 4) network evaluation and analysis [37]. In brief, draft preparation (50%) consists of annotating the pathogen's genome and mapping the annotation to data reported in the literature. Manual curation (20%) covers manual refinement and re-evaluation of the draft, which is needed because low-confidence annotations retrieved from organism-unspecific biochemical databases may affect the behavior of the pathogen model. Collection of data on growth conditions and biomass composition is also part of this stage. Generation of the mathematical model (10%) is fully automated and converts the refined draft into a mathematical model. The fourth stage comprises verification, evaluation and validation of the model, which leads to the identification and filling of network gaps by repeating stages 2 and 3 until gap filling is complete.
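Stages 3 and 4 can be illustrated with a small, hedged sketch: a curated reaction list is converted into a stoichiometric matrix (rows are metabolites, columns are reactions), and the matrix is then scanned for dead-end metabolites, which are one simple indicator of network gaps. The reaction list below is invented, every reaction is treated as irreversible, and the dead-end test is only a simplified stand-in for the gap analysis performed by dedicated tools.

```python
# Hypothetical curated draft: reaction id -> {metabolite: coefficient}.
# Negative coefficients are consumed, positive coefficients are produced.
# Every reaction is treated as irreversible in this sketch.
DRAFT = {
    "EX_glc": {"glc": 1},                  # uptake from the medium
    "R1": {"glc": -1, "g6p": 1},
    "R2": {"g6p": -1, "pyr": 1, "x": 1},   # "x" is never consumed: a gap
    "R3": {"pyr": -1, "biomass": 1},
    "EX_biomass": {"biomass": -1},         # biomass drain
}


def stoichiometric_matrix(draft):
    """Return (metabolites, reactions, S) with S[i][j] = coefficient of
    metabolite i in reaction j."""
    metabolites = sorted({m for coeffs in draft.values() for m in coeffs})
    reactions = list(draft)
    matrix = [[draft[r].get(m, 0) for r in reactions] for m in metabolites]
    return metabolites, reactions, matrix


def dead_end_metabolites(metabolites, matrix):
    """Metabolites that are only produced or only consumed in the network."""
    dead = []
    for metabolite, row in zip(metabolites, matrix):
        produced = any(c > 0 for c in row)
        consumed = any(c < 0 for c in row)
        if produced != consumed:
            dead.append(metabolite)
    return dead


if __name__ == "__main__":
    mets, rxns, S = stoichiometric_matrix(DRAFT)
    print("Reactions:", rxns)
    for met, row in zip(mets, S):
        print(f"{met:>8}", row)
    print("Dead-end metabolites:", dead_end_metabolites(mets, S))
```

Each dead-end metabolite found this way points to a reaction that may be missing or wrongly annotated, which is the kind of finding that sends the draft back to manual curation.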

The complete protocol of the genome scale metabolic model reconstruction of *C. albicans* is shown in **Figure 3**. The protocol consists of a set of methods that are introduced in sequence but can be combined in a multitude of ways.

#### **3.1 Hardware and software**

A 64-bit computer with 8 GB of RAM and a stable internet connection is recommended for all steps from drafting a model to its analysis. MATLAB R2014b (https://www.mathworks.com/products/matlab.html) or above, COBRA Toolbox v3.0 or above and Pathway Tools v22.5 are required to accomplish the reconstruction [38–40].

#### **Figure 3.**

*A flow diagram of Genome Scale Metabolic Model Reconstruction.*

#### **Steps:**

*# MATLAB installation (R2014b or above)*

> > Download → Extract → Click on Setup → Install with or without Internet → Next → Accept license agreement → Next → Provide installation key → Next → Choose installation type → Next → Specify installation folder → Next → Provide license file location → Next → Select installation options → Confirm the installation → Finish

*# COBRA Toolbox installation*

> > Download i) git, ii) curl (v7.0 or above) and iii) COBRA Toolbox (v3.0 or above)

> > First install git: Extract → Click on Setup → Choose default settings, except for "Adjusting your PATH environment" (select "Use Git and optional Unix tools from the Windows Command Prompt") and "Configuring the line ending conversions" (choose "Checkout as-is, commit Unix-style line endings").

> > Install curl: Select default settings → Just click Next → Finish

> > Install COBRA Toolbox: Open Git Bash → Run the command "git clone --depth=1 https://github.com/opencobra/cobratoolbox.git cobratoolbox" (it will install the setup in C:/user/username/cobratoolbox) → Open MATLAB → Click on Set Path → Select the toolbox folder

*# SBML installation*

> > Download → Extract → Open MATLAB → Navigate to the SBML toolbox folder → Run script "run (install.m)".

*# Pathway Tools installation*

> > Download → Click on Setup.exe → Select the location of installation (same as COBRA Toolbox) → Next → Choose location to store configuration and data files → Next → Verify location of installation → Next → Uninstall older version (if present) → Click Finish to continue → OK → Create desktop icon (optional) → Finish
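Before cloning the COBRA Toolbox it can help to confirm that the command-line prerequisites named above are actually reachable from the PATH. The following is a minimal sketch under that assumption; the helper name `missing_tools` is hypothetical, and the check says nothing about MATLAB, Pathway Tools or installed versions.

```python
import shutil

# Command-line prerequisites listed in the installation steps.
REQUIRED_TOOLS = ("git", "curl")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on the PATH.

    `which` is injectable so that the function can be tested without
    depending on the local machine.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("git and curl are available.")
```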

#### **3.2 Preparation of draft**

The draft reconstruction can be done manually or automatically. Manual reconstruction is tedious and time-consuming. Software such as metaSHARK and Pathway Tools therefore automates the draft by drawing on genome databases (CMR, GOLD, SEED, TIGR and NCBI Entrez Gene), biochemical databases (KEGG, BRENDA, TransportDB, TCDB and PubChem) and organism-specific databases (EcoCyc, BioCyc, MetaCyc and GeneCards) [38, 41]. This chapter first describes draft construction with Pathway Tools, followed by manual curation and biomass composition. The *in silico* activities and model analysis are then illustrated using the COBRA Toolbox in MATLAB [38–40].

#### *3.2.1 Input file format*

The PathoLogic plugin of Pathway Tools is dedicated to automated draft construction and accepts FASTA (.fasta), genetic-elements.dat (.dat), GenBank (.gbk) or PathoLogic (.pf) files as input. FASTA and GenBank files are readily available from the RefSeq and GenBank databases, whereas the genetic-elements.dat and PathoLogic files, which define the annotation of each genetic element of the organism, must be prepared by the user. Each annotated entry should carry at least the basic attributes such as unique ID, name, start base, end base, function, EC number and gene ontology.

#### **Steps:**

> > Retrieve the .fasta file and .gbk file of each chromosome of *C. albicans* from RefSeq (https://www.ncbi.nlm.nih.gov/genome/?term=candida+albicans).
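Once the files have been retrieved, a quick sanity check of the FASTA content (one record per chromosome, plausible sequence lengths) can catch truncated downloads before they reach PathoLogic. The sketch below is a minimal stdlib-only reader; the record names and sequences in `EXAMPLE` are invented placeholders, not real *C. albicans* data.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into {record_id: sequence}.

    The record id is the first whitespace-delimited token after '>'.
    """
    records = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(chunks) for name, chunks in records.items()}


def sequence_lengths(records):
    """Return {record_id: sequence length} for a quick overview."""
    return {name: len(seq) for name, seq in records.items()}


# Invented placeholder records, for illustration only.
EXAMPLE = """>chr1_demo hypothetical chromosome 1
ATGCGT
ACGT
>chr2_demo hypothetical chromosome 2
GGCC
""".splitlines()

if __name__ == "__main__":
    for name, length in sequence_lengths(parse_fasta(EXAMPLE)).items():
        print(name, length)
```

For a downloaded file, the same function can be applied to an open file handle, for example `parse_fasta(open(path))`, where `path` points to the retrieved .fasta file.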

#### *3.2.2 Creation of new database*

Database creation is the first step of draft model construction and requires information such as a unique identifier, the database name, the taxonomy of the organism and the database storage type. The provided data are saved into the organism.dat and organism-init.dat files, which mark the initialization of the new database. Once the database has been initialized, specify the replicons of the organism, *i.e.,* the input file of each chromosome, which can be in .fasta, .dat, .gbk or .pf format. Thereafter, specify the reference database of the closest organism, which supplies entities (reactions, enzymes and metabolites) that are absent from the databases linked to Pathway Tools. The Trial Parse operation parses the input file(s) so that errors in them can be corrected before the new database is built automatically. Once the errors have been removed, the model can be built. This is the central phase of the PathoLogic plugin, which parses the input again and generates a database of the chromosomes, genes, proteins, enzymes and metabolites of the organism. Finally, save the organism database, which takes several minutes to complete (**Figure 4, steps 1–6**).
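Errors of the kind reported by Trial Parse can often be spotted beforehand with a simple pre-check of the annotation records. The sketch below is not the Trial Parse algorithm. It assumes a PathoLogic-style layout in which each record is a block of tab-separated `KEY<TAB>value` lines terminated by `//`, with comment lines starting with `;`. The attribute names (`ID`, `NAME`, `STARTBASE`, `ENDBASE`, `FUNCTION`, `EC`) are assumptions that should be verified against the Pathway Tools documentation, and the example records are invented.

```python
import re

# Assumed attribute names; verify against the Pathway Tools documentation.
REQUIRED = ("ID", "NAME", "STARTBASE", "ENDBASE", "FUNCTION")
EC_PATTERN = re.compile(r"\d+(\.(\d+|-)){3}")  # e.g. 2.7.1.1 or 1.1.1.-


def parse_records(text):
    """Split the text into records: {attribute: [values]} per '//' block."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        if line == "//":
            if current:
                records.append(current)
            current = {}
            continue
        key, _, value = line.partition("\t")
        current.setdefault(key.strip(), []).append(value.strip())
    if current:
        records.append(current)
    return records


def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = [f"missing {key}" for key in REQUIRED if key not in record]
    if "STARTBASE" in record and "ENDBASE" in record:
        try:
            int(record["STARTBASE"][0])
            int(record["ENDBASE"][0])
        except ValueError:
            problems.append("STARTBASE/ENDBASE must be integers")
    for ec in record.get("EC", []):
        if not EC_PATTERN.fullmatch(ec):
            problems.append(f"malformed EC number {ec}")
    return problems


# Invented example: the second record is deliberately faulty.
EXAMPLE = (
    "ID\tdemo_0001\nNAME\tdemoA\nSTARTBASE\t100\nENDBASE\t1300\n"
    "FUNCTION\thypothetical kinase\nEC\t2.7.1.1\n//\n"
    "ID\tdemo_0002\nNAME\tdemoB\nSTARTBASE\tabc\nENDBASE\t2500\nEC\t2.7.1\n//\n"
)

if __name__ == "__main__":
    for record in parse_records(EXAMPLE):
        print(record["ID"][0], check_record(record) or "ok")
```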

#### **Steps:**



#### *3.2.3 Refinement*

Refinement of the database comprises the following inferences and manual operations:

1) **Probable Enzymes** makes additional enzyme-to-reaction assignments.
2) **Name Matcher** adds additional names.
3) **Rescore Pathways** adds new pathways and deletes unsupported ones.
4) **Create Protein Complex** lets the user specify protein complexes, which are automatically linked to the appropriate reactions.
5) **Assign Modified Proteins** assigns the modified protein substrates of reactions to the corresponding gene products within the database.
6) **Predict Operons** lets the user choose the genetic elements, up to the whole genome, for operon prediction.
7) **Transport Inference Parser** identifies proteins that catalyze transport reactions and constructs their protein complexes and enzymatic reactions.
8) **Pathway Hole Filler** uses candidate enzymes to fill the gaps that arise during construction.
9) **Update Cellular Overview** draws the cellular overview of the database.
10) **Consistency Checker** automatically corrects violations of data constraints.

Among these refinements, the hole filler plays the major role, and it can be run automatically or manually (**Figure 5A and B**). Now, resave the database and export it in .sbml format for further analysis using the command **File** → **Export** → **Selected reactions to SBML file.** For further processing, convert the SBML model into a "CanCyc.xls" file for manual curation (**Figure 4, steps 7–8**).

#### **Figure 4.**

*A detailed Protocol of Genome Scale Metabolic Reconstruction.*
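The conversion of the exported SBML file into a spreadsheet for manual curation can be sketched as follows. This is a minimal, stdlib-only illustration that writes CSV text (which spreadsheet software can open) and not the .xls file named above. It assumes the SBML Level 2 layout of `reaction`, `listOfReactants`, `listOfProducts` and `speciesReference` elements; the embedded model is a toy example, and the column names are arbitrary choices.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1]


def reactions_from_sbml(xml_text):
    """Return one row (id, name, equation) per SBML reaction."""
    rows = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != "reaction":
            continue
        sides = {"listOfReactants": [], "listOfProducts": []}
        for child in element:
            side = local_name(child.tag)
            if side in sides:
                for ref in child:
                    coefficient = ref.get("stoichiometry", "1")
                    sides[side].append(f"{coefficient} {ref.get('species')}")
        # In SBML Level 2 a reaction is reversible unless stated otherwise.
        arrow = "<=>" if element.get("reversible", "true") == "true" else "->"
        left = " + ".join(sides["listOfReactants"])
        right = " + ".join(sides["listOfProducts"])
        rows.append({"id": element.get("id"),
                     "name": element.get("name", ""),
                     "equation": f"{left} {arrow} {right}"})
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "name", "equation"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy model for illustration only.
EXAMPLE_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy">
    <listOfReactions>
      <reaction id="R1" name="hexokinase (toy)" reversible="false">
        <listOfReactants>
          <speciesReference species="glc"/>
          <speciesReference species="atp"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p"/>
          <speciesReference species="adp"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

if __name__ == "__main__":
    print(to_csv(reactions_from_sbml(EXAMPLE_SBML)), end="")
```

Each row can then be annotated by hand during curation, for example with a confidence score or a literature reference, before the model is rebuilt.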
