### **1. Introduction**

With a current worldwide prevalence of around twenty-seven million cases [1, 2] and hundreds of thousands of deaths every year [2, 3], salmonellosis remains the second most common food- and water-borne illness. The disease results from systemic infection of human and animal hosts by *Salmonella enterica*, a facultatively anaerobic, Gram-negative, rod-shaped bacterial species of the family *Enterobacteriaceae*. Clinically, several serologic variants (serovars) of *S. enterica* exist, which differ in the antigenic variation of their lipopolysaccharide and flagella [4, 5]. They include Typhi and Paratyphi A, besides non-typhoidal serovars such as Typhimurium and Enteritidis [4]. Among these, the enteric fever termed typhoid, caused by *S.* Typhi and Paratyphi, is typically a more severe illness than those caused by the non-typhoidal serovars [5].

Being contagious, typhoidal salmonellosis can spread through feces, water and the hands of those caring for the sick, whereas non-typhoidal serovars are transmitted mainly through human consumption of raw or undercooked contaminated food of animal origin such as meat, poultry, eggs and milk [1, 6, 7]. Salmonellosis begins with ingestion of a bacterial dose sufficient to breach the first-line host defenses and colonize the gastrointestinal tract. The onset of typhoid is usually accompanied by fever, headache, myalgia, anorexia and sometimes diarrhea or constipation [6, 7], progressing to remittent fever with a stepwise increase in the daily peak temperature, which reaches 40°C by the end of the first week [6]. Slow recovery after 3–4 weeks is the usual course; in untreated patients with complications, however, most fatalities result from intestinal hemorrhage or perforation [6, 7].

Many of the drugs available for treatment have become ineffective owing to the emergence of multidrug-resistant (MDR) *Salmonella* strains [8]. These strains are resistant to older generations of drugs, including ampicillin, chloramphenicol, ciprofloxacin, trimethoprim and co-trimoxazole and their derivatives, so newer classes of cephalosporins and quinolone derivatives need to be explored extensively to combat such MDR threats [1, 8]. Moreover, the whole-cell vaccines dating from as early as the 1890s, based on parenteral administration of killed suspensions of *S.* Typhi [9], have several problems: a) high reactivity, with 20–25% fever and 40–50% local reactions, b) moderate efficacy, with protection rates of 51–88% that are insufficient to halt disease transmission in endemic areas, and c) logistical and safety problems arising from the need for needles and two doses. More recent vaccines, such as the single-dose Typhim Vi® containing purified Vi capsular polysaccharide or the live attenuated *S.* Typhi Ty21a vaccine (Vivotif®), confer around 50% protection in adults and show very poor immunogenicity in young children; they are not licensed for children under two years old and are considered expensive for low- and middle-income areas [10, 11]. Thus, the urgent need for new and specific vaccines and/or drugs to combat the disease is evident, and proteins of the pathogen-specific biochemical and biosynthetic pathways involved in the virulence of *S.* Typhi have already begun to be targeted with a view to developing novel vaccines/drugs.

While the two aforementioned vaccines are for *S.* Typhi, those for other serovars including Paratyphi, Typhimurium and Enteritidis were largely unavailable until a few years ago [11]. Of late, efforts to confer protective immunity against serovar Typhimurium have been reported with the *lppA* and *lppB* Braun lipoprotein genes, with and without the *msbB* gene, which encodes an acetyltransferase enzyme required for modification of the lipid A of lipopolysaccharide [12]. Other candidate genes proposed for effective vaccines against different serovars include *rpoS*, *phoPQ*, *ssaV* and *htrA* [13], besides the proteins SseBI, OmpACDFL and SopB, which have been used as antigens in other vaccination studies [14]. Such recombinant attenuated *Salmonella* vaccines (RASV) are considered to be as effective as, or more effective than, whole wild-type strains [15]. RASV can persistently colonize internal lymphoid tissues and produce recombinant antigens that elicit mucosal and systemic antibody responses along with cell-mediated immune responses [15]. Thus, development of such recombinant vaccines is considered a cost-effective and highly promising strategy against the pressing threat of antibiotic resistance. In this regard, several strategies have been adopted for other drug-resistant bacteria, including reverse vaccinology through comparative genome analysis and *in vitro* proteomics [16, 17]. These become especially relevant in view of the new and emerging multidrug-resistant strains of *Salmonella*. Such strains might arise from immune selection leading to antigen sequence variability followed by down-regulation of the target antigens, thereby conferring poor "cross-protective efficacy", as reported for MDR *Acinetobacter baumannii* [18]. Therefore, identification of new and effective vaccine candidates is a pressing current need.

*Computational Identification of the Plausible Molecular Vaccine Candidates… DOI: http://dx.doi.org/10.5772/intechopen.95856*

**Figure 1.**

*Graphical summary of the methods adopted for vaccine candidate and druggability prediction. This comprises a network-based approach to identify the key players in the* Salmonella *virulent proteome, coupled with downstream predictions of vaccine candidates and druggable pockets among the top rankers.*

With many virulent proteins reported in experimentally verified and predictive databases, selecting the most plausible vaccine candidates can be confusing. To simplify this selection problem, graph-theoretical analysis of the interaction networks of the virulent proteins involved in the disease can be quite useful. Such virulent protein interaction networks (PIN) can be used to identify the most central proteins [19]. Typically, the centrality of a biological network is analyzed through global parameters such as betweenness, closeness, degree and eigenvector centralities, referred to as BC, CC, DC and EC, respectively [19–21]. Among them, BC was long regarded as better than CC and DC at capturing the central character of a network, until EC gained prominence; recent studies report that EC can be quite effective [22–25].
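For reference, the four measures are commonly defined as follows for an unweighted, connected, undirected network with $N$ nodes and adjacency matrix $A = (a_{uv})$. These are the standard textbook forms, stated here as an assumption; the normalization and edge weighting applied by a particular software tool may differ.

$$
\begin{aligned}
\mathrm{DC}(v) &= \sum_{u \neq v} a_{uv}, \\
\mathrm{CC}(v) &= \frac{N-1}{\sum_{u \neq v} d(u,v)}, \\
\mathrm{BC}(v) &= \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}, \\
\mathrm{EC}(v) &= \frac{1}{\lambda_{\max}} \sum_{u} a_{uv}\,\mathrm{EC}(u).
\end{aligned}
$$

Here $d(u,v)$ is the shortest-path distance between nodes $u$ and $v$, $\sigma_{st}$ is the number of shortest paths between $s$ and $t$, $\sigma_{st}(v)$ is the number of those paths passing through $v$, and $\lambda_{\max}$ is the largest eigenvalue of $A$.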

In this study, we propose vaccine candidates for *Salmonella* serovars (**Figure 1**), as explained in the next section. Essentially, we used the four centrality measures to analyze three different virulent PINs, denoted VVaDK, VFDF and VFDX. Among the top 20 rankers for each centrality measure, the candidates common to all four measures were collected for downstream analyses. These shortlisted virulent proteins were then analyzed with different bioinformatic tools to determine their antigenic and allergenic potential, and to reveal epitopes for efficient vaccines or molecular crevices for good drug targets.

### **2. Approach**

#### **2.1 Dataset collection**

We initiated our study with the proteins collected for *Salmonella enterica* serovar Typhimurium str. LT2 (NCBI txid: 99287) on the 19th of December 2020. They were retrieved from two sources, namely the National Center for Biotechnology Information (NCBI) and the Virulence Factor Database (VFDB) [26]. From NCBI, protein datasets were collected through literature searches using the keywords Virulence, Virulence Factor, Virulence Protein, Drug(s), Vaccine(s) and Key. Some of these keywords have essentially the same meaning and were used to obtain more hits and avoid missing possible candidates, thereby reducing false negatives. Finally, the candidate lists were merged and duplicates were removed to yield 120 proteins for further analysis. This set was termed VVaDK for easy reference, where V stands for Virulence, Va for Vaccine(s), D for Drug(s) and K for Key. In addition, two candidate lists were retrieved from VFDB. The first, the Full dataset, covers all the proteins (261) related to known and predicted VFs of *S.* Typhimurium and is referred to as VFDF. The second comprises 117 experimentally verified candidates for *S.* Typhimurium and is termed VFDX.

All the aforementioned proteins in the VVaDK, VFDF and VFDX categories were submitted as queries to the protein interaction meta-database STRING version 11.0 [27] to retrieve all possible interactions of each protein [date and time of access: Dec 22, 2020, from 17:00 IST onwards]. The detailed protein links file under accession number 90371 in STRING v11 was used to collect the interactions of the whole-genome proteins of *S.* Typhimurium. In each case, the database default medium-confidence threshold of 0.4 was applied to the combined score derived from the different interaction parameters. Accordingly, the total numbers of protein interactions obtained were 138, 3501 and 2464 for the VVaDK, VFDF and VFDX candidates, respectively.
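The filtering step can be illustrated with a minimal Python sketch. It assumes the space-separated layout of STRING's detailed links files, in which scores are stored as integers from 0 to 1000, so a confidence of 0.4 corresponds to 400. The protein identifiers, the scores, the function name `filter_links`, and the rule that an interaction is kept when either partner is a query protein are all illustrative assumptions, not details taken from the study.

```python
import io

# Toy excerpt in the layout of a STRING detailed links file.
# All identifiers and scores here are invented for illustration.
SAMPLE = """\
protein1 protein2 neighborhood fusion cooccurence coexpression experimental database textmining combined_score
0000.protA 0000.protB 0 0 0 120 300 0 450 612
0000.protB 0000.protA 0 0 0 120 300 0 450 612
0000.protA 0000.protC 0 0 0 0 0 0 210 210
0000.protC 0000.protD 0 0 0 0 410 0 0 410
"""


def filter_links(handle, query_ids, min_score=400):
    """Return (protein1, protein2, score) rows that involve a query protein
    and whose combined score is at least min_score."""
    header = handle.readline().split()
    score_col = header.index("combined_score")
    kept = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        a, b, score = fields[0], fields[1], int(fields[score_col])
        # Assumption: keep the row if either endpoint is a query protein.
        if score >= min_score and (a in query_ids or b in query_ids):
            kept.append((a, b, score))
    return kept


rows = filter_links(io.StringIO(SAMPLE), {"0000.protA", "0000.protC"})
for row in rows:
    print(row)
```

In this toy input the same pair appears in both directions, as it does in STRING's flat files, so the filtered rows still contain reversed duplicates that have to be collapsed before building an undirected network.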

#### **2.2 Interactome construction**

The medium-confidence protein interaction data for each of the VVaDK, VFDF and VFDX sets were imported into Cytoscape version 3.8.2 [28] to build the respective interactomes. Care was taken to remove duplicate and bidirectional interactions from each dataset. In essence, each interactome or protein interaction network (PIN) was constructed as an undirected graph, G = (V, E), consisting of a finite set V of vertices (or nodes) and a set E of edges, where each edge e = (u, v) connects two vertices u and v. Each vertex/node in our PIN represents a protein. The number of connections (interactions, associations or links) that a protein has with other proteins is its degree, d [29].
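As a rough illustration of this construction, the sketch below collapses duplicate and reversed pairs into a single undirected edge and then counts the degree of each node. It is not the Cytoscape procedure itself; the node names, the scores, the helper names `build_undirected` and `degrees`, and the choice to keep the highest score for a repeated pair are assumptions made for the example.

```python
from collections import defaultdict


def build_undirected(edges):
    """Collapse duplicate and reversed (u, v, weight) triples into one
    weighted undirected edge per unordered pair."""
    weights = {}
    for u, v, w in edges:
        if u == v:
            continue  # ignore self-loops
        key = frozenset((u, v))  # (u, v) and (v, u) map to the same key
        # Assumption: keep the highest score seen for a repeated pair.
        weights[key] = max(w, weights.get(key, 0))
    return weights


def degrees(weights):
    """Degree d of every node: the number of distinct edges touching it."""
    deg = defaultdict(int)
    for key in weights:
        for node in key:
            deg[node] += 1
    return dict(deg)


# Invented interactions containing one reversed and one repeated pair.
raw = [
    ("A", "B", 612),
    ("B", "A", 612),
    ("A", "C", 450),
    ("C", "D", 410),
    ("A", "C", 450),
]
pin = build_undirected(raw)
deg = degrees(pin)
print(len(pin), deg)
```

A quick sanity check on any such network is that the degrees sum to twice the number of edges, since every undirected edge contributes to exactly two nodes.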

#### **2.3 Network analysis**

All three constructed PINs were visualized in Cytoscape v3.8.2 as interactomes of the aforementioned interconnected proteins. They were subsequently analyzed with the integrated Java plugin CytoNCA version 2.1.6 [30] to compute BC, CC, DC and EC as the four global network centrality parameters. The combined scores from STRING were used as edge weights for computing the CytoNCA scores of the four centrality parameters. After sorting each measure from largest to smallest, the top 20 proteins for each centrality were used to create Venn diagrams with Venny 2.0 [31] to find the proteins common to all four measures. This yielded 12, 10 and 7 proteins from VVaDK, VFDF and VFDX, respectively. Among these 29 candidates, 9 duplicates were removed to yield a total of 20 proteins. Through a BLASTp alignment, these Typhimurium proteins were all found in the serovars Typhi and Paratyphi, and were thus considered for further analyses.
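The selection logic (top rankers per measure, intersection across measures, then a duplicate-free merge across networks) can be sketched in a few lines of Python. All protein names and scores are invented, `k=3` stands in for the top 20, and the helper names `top_k` and `consensus` as well as the alphabetical tie-breaking rule are assumptions made for the example rather than the behavior of CytoNCA or Venny.

```python
def top_k(scores, k):
    """Names of the k highest-scoring proteins (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return {name for name, _ in ranked[:k]}


def consensus(centralities, k):
    """Proteins present in the top-k list of every centrality measure."""
    tops = [top_k(scores, k) for scores in centralities.values()]
    return set.intersection(*tops)


# Invented scores for one toy network.
toy = {
    "BC": {"p1": 9.0, "p2": 7.5, "p3": 6.0, "p4": 1.0, "p5": 0.5},
    "CC": {"p1": 0.8, "p2": 0.7, "p4": 0.6, "p3": 0.5, "p5": 0.1},
    "DC": {"p2": 5, "p1": 4, "p3": 3, "p4": 2, "p5": 1},
    "EC": {"p1": 0.6, "p2": 0.5, "p5": 0.4, "p3": 0.3, "p4": 0.2},
}
shared = consensus(toy, k=3)

# Merging the consensus sets of several networks: a set union drops
# proteins that were shortlisted in more than one network.
per_network = {"net1": shared, "net2": {"p2", "p7"}}
merged = set().union(*per_network.values())
duplicates = sum(len(s) for s in per_network.values()) - len(merged)
print(sorted(shared), sorted(merged), duplicates)
```

The same bookkeeping underlies the counts in the text: 12 + 10 + 7 = 29 shortlisted entries, of which 9 are duplicates, leaving 20 unique proteins.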
