### **3.4 Biomass composition**

The biomass reaction is the engine of a metabolic model, as it directly affects model validation and strain improvement. Biomass consists of cellular components (proteins, RNA, DNA, lipids, lipopolysaccharides, peptidoglycan, glycogen, polyamines, *etc.*) together with their fractional constituents [46]. Estimating the total biomass composition may not be feasible; still, the relative fraction of each precursor can be determined experimentally for cells in log-phase growth. Among the biomass precursors, lipids are difficult to extract because they contain different fatty acids with diverse chain lengths and degrees of saturation. After the biomass components are quantified, the data are normalized to compute the biomass equation (**Figure 4, step 10**).
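The normalization step can be illustrated with a minimal sketch in plain Python. It assumes that the measured components are expressed as mass fractions (g per g dry weight) and that they are rescaled to sum to 1 before being converted into stoichiometric coefficients (mmol/gDW). The function names and all numerical values are hypothetical placeholders chosen for illustration; they are not measurements from this work.

```python
# Illustrative sketch; all numbers are hypothetical placeholders, not measured data.

def normalize_fractions(measured):
    """Rescale measured mass fractions (g per g dry weight) so that they sum to 1."""
    total = sum(measured.values())
    if total <= 0:
        raise ValueError("measured fractions must sum to a positive value")
    return {name: value / total for name, value in measured.items()}


def biomass_coefficient(mass_fraction, molar_mass):
    """Convert a mass fraction (g/gDW) into a coefficient in mmol/gDW.

    molar_mass is given in g/mol; the factor 1000 converts mol to mmol.
    """
    return 1000.0 * mass_fraction / molar_mass


# Hypothetical measurements that account for only 90% of the dry weight.
measured = {"protein": 0.50, "RNA": 0.18, "DNA": 0.03, "lipid": 0.08, "glycogen": 0.11}
normalized = normalize_fractions(measured)

for name, fraction in normalized.items():
    print(f"{name}: {fraction:.3f} g/gDW")
```

Rescaling in this way distributes the unrecovered mass proportionally over the measured components, which is one common assumption; other schemes (for example, assigning the remainder to a single pool) would need a different function.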

### **3.5 Curated model conversion to mathematical model**

In this stage, the curated draft is converted into a condition-specific mathematical model; this step is fully automated. MATLAB and the COBRA Toolbox are widely used for model conversion, evaluation and analysis. To convert the model, first initialize the COBRA Toolbox with the command **"initCobraToolbox"**, then set an optimization solver such as Gurobi or GLPK. Optimization solvers provide powerful algorithms for solving mathematical programming models, constraint models and constraint-based scheduling models. Gurobi is the default solver for LP, MILP and QP problems, while GLPK can be selected for LP and MILP problems. Read the model with the **"xls2model"** command, verify it, and set the objective and the simulation constraints. Finally, save the model as "CanCyc.xml".

A script to load and save the model in mathematical format is provided in **Supplementary Data 1**.
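To make the term "mathematical model" concrete, the following sketch shows, in plain Python and independently of the COBRA Toolbox, what such a conversion produces: a stoichiometric matrix S, lower and upper flux bounds, and an objective vector. The four-reaction network, the reaction identifiers and the helper names are hypothetical and serve only as an illustration.

```python
# Toy network (hypothetical): uptake of A, transport into the cell,
# conversion to B, and a biomass sink that consumes B.
reactions = {
    "EX_A":    {"A_e": -1.0},               # exchange: negative flux means uptake
    "T_A":     {"A_e": -1.0, "A_c": 1.0},   # transport A_e -> A_c
    "R1":      {"A_c": -1.0, "B_c": 1.0},   # A_c -> B_c
    "BIOMASS": {"B_c": -1.0},               # objective reaction
}
bounds = {
    "EX_A": (-10.0, 1000.0),
    "T_A": (0.0, 1000.0),
    "R1": (0.0, 1000.0),
    "BIOMASS": (0.0, 1000.0),
}


def build_stoichiometric_matrix(reactions):
    """Return (metabolites, reaction ids, S) with S[i][j] = coefficient of metabolite i in reaction j."""
    metabolites = sorted({met for stoich in reactions.values() for met in stoich})
    rxn_ids = list(reactions)
    S = [[reactions[rxn].get(met, 0.0) for rxn in rxn_ids] for met in metabolites]
    return metabolites, rxn_ids, S


def is_steady_state(S, fluxes, tol=1e-9):
    """Check the steady-state condition S * v = 0 for a flux vector v."""
    return all(abs(sum(c * v for c, v in zip(row, fluxes))) <= tol for row in S)


metabolites, rxn_ids, S = build_stoichiometric_matrix(reactions)
lower = [bounds[r][0] for r in rxn_ids]
upper = [bounds[r][1] for r in rxn_ids]
objective = [1.0 if r == "BIOMASS" else 0.0 for r in rxn_ids]

for met, row in zip(metabolites, S):
    print(met, row)
```

A flux vector such as `[-1, 1, 1, 1]` (in the order of `rxn_ids`) satisfies `is_steady_state(S, ...)` and respects the bounds, so it is a feasible solution of this toy model; an LP solver would search the same feasible space for the vector that maximizes the objective.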

### **3.6 Model evaluation and validation**

The metabolic model built in the third stage may contain some common errors: 1) wrong reaction constraints; 2) cofactors that cannot be produced or consumed; 3) compounds shuffled across compartments; 4) missing transport and exchange reactions. To rectify these issues, network verification, evaluation and validation are needed. Verification and evaluation usually lead to the addition of transport reactions, exchange reactions and metabolic functions, which is done by repeating stages 2 and 3. The procedure is therefore iterative: each round of debugging corrects errors that surface computationally. The main difficulty is deciding when to stop, a decision that depends on the purpose of the reconstruction.

The process starts with a test for unbalanced reactions, which lists the unbalanced reactions in the model so that they can be balanced manually. The next step is to identify dead-end metabolites, which are only consumed or only produced and therefore indicate gaps in the model. Resolving dead-end metabolites calls for gap filling, a manual process that draws on the published literature, genome and pathway annotation tools (KEGG) and organism-specific databases. During gap filling, all added reactions and metabolites must be connected to the rest of the network. This step also includes the addition of exchange and transport reactions. Thereafter, the upper and lower bounds are set to reflect the desired medium or environmental condition required for growth of the organism. Constraints must vary according to the objective of the study.
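The two checks described above can be sketched in plain Python, independently of the COBRA Toolbox. The sketch assumes that each metabolite has an elemental formula and that each reaction has lower and upper flux bounds; the function names, the toy network and the chemistry example are hypothetical and are meant only to show the logic of each test.

```python
def element_imbalance(stoich, formulas):
    """Return the net count of each element that does not cancel across a reaction.

    An empty dict means the reaction is elementally balanced.
    """
    balance = {}
    for met, coeff in stoich.items():
        for element, count in formulas[met].items():
            balance[element] = balance.get(element, 0) + coeff * count
    return {element: net for element, net in balance.items() if abs(net) > 1e-9}


def find_dead_ends(reactions, bounds):
    """List metabolites that can only be produced or only be consumed.

    A reaction running backwards (lower bound < 0) swaps its substrates and products.
    """
    all_mets, producible, consumable = set(), set(), set()
    for rxn, stoich in reactions.items():
        lb, ub = bounds[rxn]
        for met, coeff in stoich.items():
            all_mets.add(met)
            if (coeff > 0 and ub > 0) or (coeff < 0 and lb < 0):
                producible.add(met)
            if (coeff < 0 and ub > 0) or (coeff > 0 and lb < 0):
                consumable.add(met)
    return sorted(m for m in all_mets if not (m in producible and m in consumable))


# Balance test on a simple chemical example.
formulas = {"H2": {"H": 2}, "O2": {"O": 2}, "H2O": {"H": 2, "O": 1}}
balanced = {"H2": -2, "O2": -1, "H2O": 2}      # 2 H2 + O2 -> 2 H2O
unbalanced = {"H2": -1, "O2": -1, "H2O": 1}    # H2 + O2 -> H2O (one O left over)

# Dead-end test on a toy network in which C_c is produced but never consumed.
reactions = {
    "EX_A":    {"A_e": -1},
    "T_A":     {"A_e": -1, "A_c": 1},
    "R1":      {"A_c": -1, "B_c": 1},
    "R2":      {"B_c": -1, "C_c": 1},
    "BIOMASS": {"B_c": -1},
}
bounds = {"EX_A": (-10, 1000), "T_A": (0, 1000), "R1": (0, 1000),
          "R2": (0, 1000), "BIOMASS": (0, 1000)}

print(element_imbalance(unbalanced, formulas))
print(find_dead_ends(reactions, bounds))
```

In this toy network the gap at `C_c` could be closed by adding a reaction that consumes it or a transport and exchange pair that exports it, which mirrors the manual gap-filling step. The sketch checks elements only; a charge balance would need the metabolite charges as an additional input.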

To practise evaluation and validation, users can also download published GSMMs of other organisms from the BioModels Database (http://www.ebi.ac.uk/biomodels/).

```
Steps:
>> initCobraToolbox;
>> solverok = changeCobraSolver('glpk','LP');
>> model = xls2model('CanCyc.xlsx');