#### *2.4.2 Mapping of available 3D structures in PDB*

For the top-ranked proteins, the corresponding crystal structures available in the Protein Data Bank (PDB) were retrieved (**Table 1**). Selenomethionine residues in the PDB structures were converted back to methionine using Dock Prep in Chimera [37].


#### **Table 1.**

*PDB structure availability among top rankers.*

#### *2.4.3 B-cell epitope prediction*

Unlike viral pathogens, most bacterial pathogens, including *Salmonella*, are not obligate intracellular parasites. Thus, the humoral immune response, which involves B cells and antibodies, is the main focus of this study. BepiPred v2.0 and DiscoTope v2.0 were used to predict linear and discontinuous B-cell epitopes, respectively [38, 39]. For BepiPred, the default threshold score of 0.5 was applied for epitope recognition. For DiscoTope, the propensity score radius was 22 Å, the upper half-sphere radius was 14 Å, the window size was 1, and alpha was 0.115. An in-house script (DiscoTope2ChimeraAttr) was used to convert the DiscoTope results into Chimera attributes for 3D visualization, with the default DiscoTope score threshold of −3.7 [40]. These analyses were performed to pinpoint the specific immunogenic regions within the full-length proteins, so that immunogenically insignificant regions can be trimmed out, yielding shorter peptides that can confer higher specificity and are easier to synthesize.
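The conversion of DiscoTope scores into Chimera attributes described above can be illustrated with a minimal sketch. This is not the authors' DiscoTope2ChimeraAttr script: the function names, the attribute name `discotopeScore`, and the example scores are assumptions made for illustration, and the output follows the general layout of a Chimera attribute-assignment file (header lines, then tab-indented residue specifiers with values).

```python
from typing import Iterable, List, Tuple

# One residue: (chain ID, residue number, DiscoTope score).
ResidueScore = Tuple[str, int, float]


def to_chimera_attribute(
    scores: Iterable[ResidueScore],
    attr_name: str = "discotopeScore",  # hypothetical attribute name
) -> str:
    """Build the text of a Chimera attribute-assignment file for residues."""
    lines = [
        f"attribute: {attr_name}",
        "match mode: 1-to-1",
        "recipient: residues",
    ]
    for chain, resnum, score in scores:
        # Data lines start with a tab: <tab>residue-spec<tab>value
        lines.append(f"\t:{resnum}.{chain}\t{score:.3f}")
    return "\n".join(lines) + "\n"


def predicted_epitope_residues(
    scores: Iterable[ResidueScore],
    threshold: float = -3.7,  # default DiscoTope threshold quoted in the text
) -> List[Tuple[str, int]]:
    """Return residues scoring above the threshold (assumed strict cut-off)."""
    return [(chain, resnum) for chain, resnum, score in scores if score > threshold]


# Made-up scores, for illustration only.
example = [("A", 10, -7.512), ("A", 11, -3.2), ("A", 12, -1.05)]
attr_text = to_chimera_attribute(example)
epitope = predicted_epitope_residues(example)
```

The resulting text could be saved to a file and loaded into Chimera to colour residues by score; whether the cut-off is strict or inclusive is an assumption here.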

#### *2.4.4 Allergenicity prediction*

An immunogen that evokes allergic reactions can fail in clinical trials because of the severe adverse effects arising upon vaccination. We therefore used AllerCatPro, AlgPred 2.0, and AllergenFP v1.0 to predict the allergenicity of the query proteins, which were the top rankers in this case. AllerCatPro predicts allergenicity by comparing the structural and sequence information of a protein with that of known allergens [41]. For AlgPred 2.0, the hybrid algorithm, which combines random forest, BLAST, and MERCI to predict the allergenicity of the query proteins, was selected with the default threshold value of 0.3 [42]. AllergenFP v1.0 uses an alignment-independent, fingerprint-based approach [43].

#### *2.4.5 Druggable pocket prediction*

P2Rank was used to predict druggable pockets in the available 3D structures of the proteins [44]. P2Rank uses a template-independent machine learning algorithm to predict potential ligand-binding sites on the query proteins. The top-ranked predicted pockets were selected for further analyses. Thus, in addition to their use in vaccination, the potential druggability of the top rankers can be assessed.
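Selecting the top-ranked pocket can be sketched as follows. The sketch assumes a comma-separated predictions table with `name`, `rank`, `score`, and `residue_ids` columns, which resembles the table P2Rank writes but should be checked against the actual output; the function names and the sample rows are invented for illustration.

```python
import csv
import io
from typing import Dict, List


def top_pocket(predictions_csv: str) -> Dict[str, str]:
    """Return the row with the lowest rank value (rank 1 = best pocket)."""
    reader = csv.reader(io.StringIO(predictions_csv))
    # Strip padding around header names and values (assumed possible).
    header = [h.strip() for h in next(reader)]
    rows = [dict(zip(header, (v.strip() for v in row))) for row in reader if row]
    if not rows:
        raise ValueError("no pockets in predictions table")
    return min(rows, key=lambda r: int(r["rank"]))


def pocket_residues(row: Dict[str, str]) -> List[str]:
    """Split the space-separated residue identifiers of one pocket."""
    return row["residue_ids"].split()


# Invented sample table, for illustration only.
sample = """name, rank, score, residue_ids
pocket2, 2, 4.10, A_15 A_18
pocket1, 1, 9.75, A_101 A_102 A_130
"""
best = top_pocket(sample)
best_residues = pocket_residues(best)
```

Selecting by the `rank` column rather than by row order keeps the choice correct even if the table is not sorted.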

#### *2.4.6 Detecting human counterparts*

Peptide vaccines containing regions of high sequence similarity to human proteins can be ineffective because the immune system recognizes them as "self", which can result in low antigenicity or in adverse effects arising from self-reactivity. Thus, the top rankers were screened for human counterparts by sequence alignment using BLASTp against the non-redundant protein (nr) database, with *Homo sapiens* as the specified organism [45].
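One way to post-process such a screen is sketched below, assuming the BLASTp hits were exported in the standard 12-column tabular format (qseqid, sseqid, pident, ..., evalue, bitscore). The identity and E-value cut-offs are placeholders, not values used in this study, and the sequence identifiers are hypothetical.

```python
from typing import List, NamedTuple, Set


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float  # percent identity
    evalue: float


def parse_tabular(text: str) -> List[Hit]:
    """Parse 12-column tab-separated BLAST output, skipping comment lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        # Columns: 0 qseqid, 1 sseqid, 2 pident, 10 evalue
        hits.append(Hit(f[0], f[1], float(f[2]), float(f[10])))
    return hits


def queries_with_human_counterpart(
    hits: List[Hit],
    min_identity: float = 35.0,  # placeholder cut-off (assumption)
    max_evalue: float = 1e-5,    # placeholder cut-off (assumption)
) -> Set[str]:
    """Return query IDs that have at least one hit passing both cut-offs."""
    return {
        h.query for h in hits
        if h.pident >= min_identity and h.evalue <= max_evalue
    }


# Hypothetical hits, for illustration only.
rows = [
    ["protA", "humanX", "62.5", "200", "75", "0", "1", "200", "5", "204", "1e-40", "250"],
    ["protB", "humanY", "28.0", "50", "36", "0", "1", "50", "9", "58", "0.5", "30"],
]
sample = "\n".join("\t".join(r) for r in rows)
flagged = queries_with_human_counterpart(parse_tabular(sample))
```

Proteins in `flagged` would be candidates for exclusion or trimming; the appropriate cut-offs depend on the application.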

#### **3. Interactome analyses of three virulent PINs**

Three different interactomes of virulent proteins of *Salmonella* were built using the method described above. The first comprised proteins identified through a literature search using the keywords Virulence, Virulence Factor, Virulence Protein, Drug(s), Vaccine(s) and Key; it was named VVaDK. The other two PINs were built from the full and the experimentally verified datasets of virulent proteins from *Salmonella* listed in VFDB and were named VFDF and VFDX, respectively. The four centrality measures were applied to each of these PINs, and the twenty top rankers from each measure were initially segregated. The proteins appearing among the top rankers of all four measures numbered 12, 10 and 7 for VVaDK, VFDF and VFDX, respectively, and removing duplicates among them finally yielded 20 candidates for downstream analysis.

#### *Computational Identification of the Plausible Molecular Vaccine Candidates… DOI: http://dx.doi.org/10.5772/intechopen.95856*

Our approach to streamlining the candidates rests on the following reasoning. Under pathological conditions, the virulent proteins are expected to work in unison to produce the final disease phenotype, so their connectivity can be represented by the PINs. Some of these proteins may be master regulators that connect to many others and therefore have a higher order of connectivity, which gives them high degree centrality (DC). Alternatively, different regulators may carry out different subfunctions of the main disease phenotype and form bridges between other proteins, which gives them high betweenness centrality (BC). Moreover, within such a conglomerate, certain proteins can reach the others faster to carry out their functions sequentially and thus have higher closeness centrality (CC). Furthermore, certain proteins may be more important for the final disease phenotype because they are connected to other important proteins, which is captured by eigenvector centrality (EC). Finally, the proteins that appear among the top rankers of all these centrality measures are expected to play a major role in virulence and were segregated for further analysis. These are 20 unique virulent proteins from the three PIN analyses, mostly belonging to the *Salmonella* Pathogenicity Islands (SPI); they are shown in **Figure 1** and **Table 2** and are discussed in the next section.
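The consensus selection described above, taking the top rankers of each centrality measure and keeping only the proteins common to all of them, can be sketched in plain Python. For brevity only degree and closeness centrality are implemented here; betweenness and eigenvector scores would be supplied as further score tables in the same way. The toy network, node names, and function names are illustrative assumptions, and the chapter's analysis used the twenty top rankers per measure rather than the two used in this example.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, Set[str]]


def build_graph(edges: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree_centrality(graph: Graph) -> Dict[str, float]:
    """Number of neighbours divided by the maximum possible (n - 1)."""
    denom = max(len(graph) - 1, 1)
    return {node: len(nbrs) / denom for node, nbrs in graph.items()}


def closeness_centrality(graph: Graph) -> Dict[str, float]:
    """Reachable nodes divided by the sum of BFS shortest-path lengths."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def top_k(scores: Dict[str, float], k: int) -> Set[str]:
    """The k highest-scoring nodes; ties are broken by node name."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])


def consensus(score_tables: List[Dict[str, float]], k: int) -> Set[str]:
    """Nodes that appear in the top k of every score table."""
    sets = [top_k(s, k) for s in score_tables]
    return set.intersection(*sets) if sets else set()


# Toy network with generic node names, for illustration only.
g = build_graph([("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")])
shared = consensus([degree_centrality(g), closeness_centrality(g)], k=2)
```

On this toy network `hub` and `c` rank highest under both measures, so both survive the intersection. The closeness function assumes a connected network; for disconnected PINs a component-aware normalisation would be needed.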
