Biopredsi, an siRNA design algorithm from the Novartis lab, used the Stuttgart Neural Net Simulator for training on a data set of 2182 randomly selected siRNAs targeting 34 mRNA species [11]. It reliably predicted the activity of 249 siRNAs in an independent test set (Pearson coefficient *r* = 0.66) and of siRNAs targeting endogenous genes at the mRNA and protein levels.

*4.1.2. Support Vector machine based classification*

A Support Vector Machine (SVM) is a non-probabilistic binary linear classifier. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one of the two categories. An SVM is called a maximum margin classifier because it optimizes the margin between the example points belonging to the two classes so that the gap between them is maximized.

A newly developed siRNA design tool enables improved selection of potent siRNAs by applying Support Vector Machine based optimization to a set of eight siRNA selection parameters. The SVM is trained on the feature set of 200 highly efficient and 200 poorly efficient siRNA candidates collected from siRecords, a database of validated siRNAs [12]. Training uses a Gaussian kernel and the Sequential Minimal Optimization (SMO) algorithm [13]. The tool has been tested on a large number of experimentally validated data samples from four different sources and gave sufficiently good results.

**5. Experimentally validated siRNA datasets**

The effectiveness of siRNA design rules should be tested on biologically validated siRNA datasets. In the early days of RNAi research, such datasets were scarce. Now, with emerging high-throughput technologies, large amounts of validated siRNA data are being generated. Some databases are created by manual curation of the literature describing validation of siRNA-mediated silencing. siRecords [12] is one such database, in which siRNAs are marked with their respective silencing efficacy (low, medium, high or very high). The MIT siRNA database [14] consists of siRNAs designed by Qiagen with validated knockdown efficiency, each marked with its mRNA knockdown level.

**6. Improving specificity of siRNAs**

The specificity of siRNAs is a major issue in siRNA-mediated gene silencing experiments. Exogenous siRNAs are reported to have off-target effects arising either from silencing of unintended targets or from toxic effects caused by their recognition by the innate immune system [15].

The recognition of siRNAs by the innate immune system can result from an interferon response triggered by the double-stranded siRNA duplex or from sequence-dependent stimulation of Toll-like receptors. Avoiding certain sequence motifs and constraining the length of the siRNA duplex can effectively reduce immune response stimulation [16].

**6.1. Stimulation of innate immune response**

siRNAs can induce unwanted effects by activating the innate immune system. Exogenous siRNAs are prone to recognition by Toll-like receptors (TLRs), mainly TLR7, TLR8 and TLR9. TLR7 and TLR8 recognize synthetic siRNAs in a sequence-dependent manner [16]. There seems to be preferential recognition of GU-rich sequences, and AU-rich sequences can also be immunostimulatory. Selecting siRNA sequences that lack GU-rich regions can therefore yield siRNAs with low immunostimulatory activity. The presence of the motif "GUCCUUCAA" or the 4-base motif "UGGC" in the siRNA is also known to be immunostimulatory [17], so these motifs should be avoided when designing siRNAs. The length of the siRNA is another important factor in immune stimulation: the minimum length of siRNA recognized by the innate immune system is about 19 nucleotides.
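The screening rules above can be sketched as a small filter. This is only an illustrative sketch, not the method of any published tool: the function names are hypothetical, and the GU-richness measure (fraction of G and U bases over the whole strand) and its 0.6 threshold are assumptions chosen for demonstration, since the text does not define "GU-rich" quantitatively.

```python
# Motifs named in the text as immunostimulatory [17].
IMMUNOSTIMULATORY_MOTIFS = ("GUCCUUCAA", "UGGC")


def gu_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or U (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GU" for base in seq) / len(seq)


def immune_flags(sirna: str, gu_threshold: float = 0.6) -> list[str]:
    """Return the reasons a strand might be immunostimulatory.

    gu_threshold is an assumed, illustrative cut-off; the source text
    gives no numeric definition of a GU-rich sequence.
    """
    # Normalise to upper-case RNA so DNA-style input (with T) also works.
    seq = sirna.upper().replace("T", "U")
    flags = [f"motif:{m}" for m in IMMUNOSTIMULATORY_MOTIFS if m in seq]
    if gu_fraction(seq) >= gu_threshold:
        flags.append("GU-rich")
    return flags
```

A candidate with an empty flag list passes this particular screen; it says nothing about other sources of immune stimulation, such as duplex length.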

**6.2. Near-perfect complementarity with other mRNAs**

mRNAs other than the intended target that exhibit near-perfect sequence complementarity with the siRNA are likely to be degraded by it. Such off-target effects can be avoided by choosing target sites that do not share long runs of consecutive matching bases with any other mRNA. In fact, siRNAs can potentially silence transcripts with more than 11 bases of complementarity, including matches at the positions corresponding to the 9th to 11th nucleotides of the siRNA. However, because finding a unique 11-base target site is impossible, siRNA design algorithms try to find target sites that do not share 15 or more consecutive matching bases with other transcripts.
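The 15-base criterion can be illustrated with a minimal sketch. It assumes that the target site and the other transcripts are all given as sense-strand mRNA sequences, so that a shared run of identical bases stands for a run the siRNA guide strand would be complementary to. The function names and the brute-force dynamic-programming search are illustrative choices; real tools would typically use an indexed search over the whole transcriptome.

```python
def longest_shared_run(site: str, transcript: str) -> int:
    """Length of the longest run of consecutive bases shared by both sequences.

    Classic longest-common-substring dynamic programme, keeping only the
    previous row, so memory use is linear in len(transcript).
    """
    best = 0
    prev = [0] * (len(transcript) + 1)
    for a in site:
        cur = [0] * (len(transcript) + 1)
        for j, b in enumerate(transcript, start=1):
            if a == b:
                # Extend the run that ended at the previous base of both.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def is_specific(site: str, other_transcripts: list[str], max_run: int = 14) -> bool:
    """True if no other transcript shares more than max_run consecutive bases.

    The default max_run=14 encodes the rule from the text: reject sites
    with 15 or more consecutive matching bases in another transcript.
    """
    return all(longest_shared_run(site, t) <= max_run for t in other_transcripts)
```

For example, `is_specific(site, transcripts)` returns `False` as soon as one entry of `transcripts` contains any 15-base stretch of `site`.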
