264 Bioinformatics

**2.2. Structural and thermodynamic consideration**

siRNA potency depends largely on the structural constraints of the target region. Heavily structured sites are less likely to be bound because they are not accessible to siRNAs. The relative binding energy of the 5' and 3' ends of the siRNA with the target site plays a vital role in determining which strand is incorporated into the RISC complex, and is thus one of the most important parameters to consider during siRNA design.

*2.2.1. Presence of secondary structure*

It has been suggested that local secondary structures (stem loops) in the target site restrict its accessibility to RISC and hence reduce the efficiency of the siRNA. It is therefore necessary to filter out potentially inaccessible target sites with strong secondary structures. Local secondary structure can be predicted with RNA secondary structure prediction tools or packages such as Mfold [7] or the Vienna RNA package [8], which mainly predict the minimum free energy secondary structure of an RNA sequence.

*2.2.2. Thermodynamic property for efficient RISC loading*

In an siRNA duplex, an antisense strand with relatively low energy at its 5' end is favourable for loading into the RISC complex. There should therefore be a difference in binding energy between the 5' ends of the sense and antisense strands.

**2.3. Sequence characteristics**

Years of research into appropriate design parameters have identified sequence parameters that are enriched within efficient siRNAs. These sequence characteristics often contribute to efficient RISC loading, siRNA sequence specificity, or stability.

*2.3.1. Position specific nucleotide composition*

Sequence analysis of effective siRNAs has revealed many position-specific nucleotide preferences that enhance siRNA potency. Some of these preferences are listed below in Table 1.

*2.3.2. Sequence feature for efficient RISC entry*

siRNA guide strands with low energy at the 5' end are favoured for entering the RISC complex. The presence of at least three A/U bases in the seven nucleotides at the 3' end of the sense strand is therefore preferable.

*2.3.3. siRNA duplex stability*

Target sites with low GC content (generally less than 55%) have a greater potential for being functional siRNA sites, as too high a GC content can impede the loading of siRNAs into RISC.

**3. Choice of appropriate parameters**

Not all of the parameters discussed above are equally important for selecting efficient siRNAs. Many research groups have evaluated parameter sets for siRNA selection. Gong et al. studied 276 known siRNA selection parameters on a large set of 3277 experimentally validated siRNAs targeting 1518 genes to identify common parameters that effectively distinguish functional siRNAs from non-functional ones [9]. They identified 34 features associated with improved siRNA efficacy, of which 27 were associated with greater than 70% efficacy. They also examined combinations of siRNA features for cooperative effects on potent siRNA selection, and used a disjunctive rule merging (DRM) algorithm to generate a non-redundant rule set that predicts functional siRNAs efficiently and lowers the number of false positive predictions. Table 2 lists the set of 17 features associated with greater than 90% efficacy and used for optimal feature combination.
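The idea that selection parameters carry unequal importance can be made concrete with a weighted score. The sketch below follows the point scheme listed in Table 3 for a 19-nt sense strand. It is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the melting-temperature criterion is not computed but passed in as a boolean (`low_internal_stability`), and the A/U rule is read as "one point per A/U base at positions 15-19, awarded only when at least three are present".

```python
def reynolds_style_score(sense: str, low_internal_stability: bool) -> int:
    """Score a 19-nt sense strand with the point scheme of Table 3.

    `low_internal_stability` is a hypothetical stand-in for the
    melting-temperature criterion, assumed to be evaluated elsewhere.
    """
    seq = sense.upper().replace("T", "U")  # accept DNA or RNA alphabets
    if len(seq) != 19 or set(seq) - set("ACGU"):
        raise ValueError("expected a 19-nt sequence over A, C, G, U/T")

    def base(position: int) -> str:
        return seq[position - 1]  # Table 3 positions are 1-based

    score = 0

    # GC content between 30% and 52% earns 1 point.
    gc_percent = 100.0 * sum(b in "GC" for b in seq) / len(seq)
    if 30.0 <= gc_percent <= 52.0:
        score += 1

    # Assumed reading: each A/U at positions 15-19 earns 1 point,
    # but only when at least three A/U bases are present there.
    au_count = sum(b in "AU" for b in seq[14:19])
    if au_count >= 3:
        score += au_count

    if low_internal_stability:
        score += 1
    if base(19) == "A":
        score += 1
    if base(3) == "A":
        score += 1
    if base(10) == "U":
        score += 1

    # Penalties: G or C at position 19, G at position 13.
    if base(19) in "GC":
        score -= 1
    if base(13) == "G":
        score -= 1
    return score


def is_predicted_efficient(sense: str, low_internal_stability: bool) -> bool:
    """Apply the Table 3 threshold (score >= 6)."""
    return reynolds_style_score(sense, low_internal_stability) >= 6
```

Keeping the thermodynamic criterion as an input keeps the sketch free of any particular folding or melting-temperature library; in practice that value would come from a separate calculation.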


Computational Approaches for Designing Efficient and Specific siRNAs 267

**Table 3.** Parameters used in Reynolds' algorithm with their weights

| siRNA selection parameter | Parameter weight |
|---|---|
| GC content 30% to 52% | Satisfying this criterion earns 1 point |
| Occurrence of 3 or more A/U base pairs at positions 15-19 of the sense strand | Each A/U base pair in this region earns 1 point |
| Low internal stability at target site (melting temperature Tm > -20 °C) | Satisfying this criterion earns 1 point |
| Presence of A at position 19 of the sense strand | Satisfying this criterion earns 1 point |
| Presence of A at position 3 of the sense strand | Satisfying this criterion earns 1 point |
| Presence of U at position 10 of the sense strand | Satisfying this criterion earns 1 point |
| Absence of G or C at position 19 of the sense strand | Failure to satisfy this criterion deducts 1 point |
| Absence of G at position 13 of the sense strand | Failure to satisfy this criterion deducts 1 point |

Threshold for efficient siRNAs: score >= 6

Since then, many siRNA design algorithms have used different weight distribution schemes to improve the prediction of siRNA potency, and some have even used machine learning algorithms.

**4.1. Use of machine learning algorithms for classification of functional siRNAs**

After many years of research into guidelines for selecting effective siRNAs, we are a few steps ahead in improving the targeting success rate. For better targeting success, however, the siRNA selection parameters provided in the various guidelines need to be optimized, and there is still no reliable guideline for optimizing the weights of siRNA selection parameters. Machine learning algorithms such as support vector machines or artificial neural networks can serve this purpose well when trained on a sufficient volume of biologically validated siRNA data [11]. Some online siRNA design tools (such as BIOPREDsi and GenScript siRNA Target Finder) use machine learning algorithms to classify effective siRNAs from non-effective ones.

*4.1.1. Use of artificial neural network for siRNA classification*

Artificial neural networks (ANNs) aim to mimic the working of biological networks through a connectionist approach to computation, and provide a powerful method for identifying highly complex traits in data sets. ANNs are generally very efficient classifiers for complex patterns because they can adaptively change their weighting parameters during the learning process. ANNs have been broadly applied in the biological sciences. The prediction quality and generalization capability of an ANN of fixed size depend on a sufficiently large training set of directly comparable data points.
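To illustrate what "adaptively changing weighting parameters during learning" means, the following minimal sketch trains a single-layer perceptron, the simplest ANN, on binary feature vectors. It is not the method of any tool named above. The three features are imagined as presence/absence flags for sequence criteria, and the labels are synthetic, generated by a made-up majority rule purely to exercise the code; they are not experimental data. All names (`train_perceptron`, `predict`, `SAMPLES`, `LABELS`) are hypothetical.

```python
from itertools import product


def predict(weights, bias, features):
    """Return 1 (predicted effective) if the weighted sum is positive, else 0."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, epochs=200, learning_rate=1.0):
    """Learn one weight per feature with the classic perceptron update rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for features, label in zip(samples, labels):
            delta = label - predict(weights, bias, features)
            if delta != 0:
                # Move the weights towards the misclassified example.
                errors += 1
                bias += learning_rate * delta
                weights = [
                    w + learning_rate * delta * x
                    for w, x in zip(weights, features)
                ]
        if errors == 0:  # every training example is classified correctly
            break
    return weights, bias


# Synthetic toy data (assumption): all 3-bit feature vectors, labelled 1
# when at least two of the three hypothetical criteria are satisfied.
SAMPLES = list(product((0, 1), repeat=3))
LABELS = [1 if sum(features) >= 2 else 0 for features in SAMPLES]

weights, bias = train_perceptron(SAMPLES, LABELS)
```

Because the toy labels are linearly separable, the perceptron is guaranteed to converge on them. Real siRNA efficacy data are not, which is one reason multi-layer networks and large, directly comparable training sets are needed in practice.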

**Table 2.** Feature sets predicted to be associated with greater siRNA efficacy as described by Gong et al.
