**11. Conclusions and perspectives**

Although the Rd.HMM protocol is highly sensitive, and its alignments become inaccurate as the HMM score decreases, it can be used to guide the comparative modeling of proteins, as the examples given in section 8 show. Even if the alignment employed is flawed, the flaw becomes evident once the resulting model is analyzed with Rd.HMM; the model can then be discarded and additional modeling rounds attempted.
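The accept-or-discard cycle described above can be sketched as a simple loop. This is only an illustration under stated assumptions, not the protocol's actual implementation: `build_model`, `rd_hmm_score` and the `min_score` cutoff are hypothetical placeholders for a comparative-modeling step, an Rd.HMM evaluation of the resulting model, and a user-chosen threshold. The sketch also assumes that a higher score indicates a better model.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Alignment = TypeVar("Alignment")
Model = TypeVar("Model")


def screen_models(
    alignments: Iterable[Alignment],
    build_model: Callable[[Alignment], Model],
    rd_hmm_score: Callable[[Model], float],
    min_score: float,
) -> Optional[Tuple[Alignment, Model, float]]:
    """Return the first (alignment, model, score) whose score reaches min_score.

    Each candidate alignment corresponds to one modeling round. A model that
    scores below the cutoff is treated as flawed and discarded. None is
    returned when every round fails.
    """
    for alignment in alignments:
        model = build_model(alignment)  # hypothetical modeling step
        score = rd_hmm_score(model)     # hypothetical Rd.HMM evaluation
        if score >= min_score:
            return alignment, model, score
        # Score below the cutoff: discard this model and try the next round.
    return None
```

With toy stand-ins for the two callables, the function returns the first candidate that passes the cutoff and `None` when no candidate does; in practice the cutoff would have to be calibrated for each structure.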

An additional advantage of Rd.HMM alignments as a guide to comparative modeling comes from the fact that Rd.HMM models are independent of the functional constraints reflected in the conservation of active and binding sites. Because the Rd. step removes all conservation due to ligand binding and functional sites, other than that required to keep the structure stable, geometrical differences in the organization of two related but not identical active sites will not affect the modeling process. In contrast, in classic comparative modeling methods, residue conservation at active and other functional sites is usually an important reference for the sequence-to-structure alignment. Thus, when a model of good quality and appropriateness is produced with the guidance of Rd.HMM, any coincidences in the active site geometry do not arise from forcing the conserved residues of the target sequence to fall at the template's active site, but should be a consequence of meeting the structural requirements of the target.

From the above discussion, Rd.HMM is clearly a valuable tool, but it has some limitations. We speculate that some of these limitations derive from the inability of HMMs to incorporate long-range interactions, which can be detected as significant mutual information between distant positions in the sequence alignments. We are currently analyzing the mutual information in the Rosetta-designed sequence alignments using the statistical coupling analysis strategy (Socolich et al., 2005; Lockless et al., 1999). We hope this powerful statistical approach can extend Rd.HMM and provide a richer tool.

Markov model; HMM, hidden Markov model; CASP, critical assessment of the structure of proteins; PDB, international protein data bank; TIM, triose phosphate isomerase.

**12. References**

Bowie, J., Reidhaar-Olson, J., Lim, W., & Sauer, R. (1990). Deciphering the message in protein sequences: tolerance to amino acid substitutions. *Science,* Vol. 247, No. 4948, (Mar 1990) pp. 1306-1310, ISSN 0036-8075 (print), 1095-9203 (electronic)

Brindis, F., Rodríguez, R., Bye, R., González-Andrade, M., & Mata, R. (2011). (Z)-3 butylidenephthalide from *Ligusticum porteri*, an α-glucosidase inhibitor. *J Nat Prod,* Vol. 74, No. 3, (Sep 2011) pp. 314-20, ISSN 0163-3864 (print) 1520-6025 (electronic)

Butterfoss, G. L. & Kuhlman, B. (2006). Computer-based design of novel protein structures. *Annu Rev Biophys Biomol Struct,* Vol. 35, (Jun 2006) pp. 49-65, ISSN 1056-8700

Chavelas Adame, E. A., Hernández-Domínguez, E. E., Gaytán-Mondrangón, S., Rosales León, L., Valencia-Turcotte, L., & Rodríguez-Sotres, R. (2011). A Hitchhiker's Guide to the modeling of the three-Dimensional structure of proteins. *International Color Biotechnology Journal,* Vol. 1, No. 1, (Nov 2011) pp. 26-35, ISSN 2226-0404 (electronic)

Cheng, G., Qian, B., Samudrala, R., & Baker, D. (2005). Improvement in protein functional site prediction by distinguishing structural and functional constraints on protein family evolution using computational design. *Nucleic Acids Research,* Vol. 33, No. 18, (Sep 2005) pp. 5861-7, ISSN 1362-4962 (print)

Chivian, D. & Baker, D. (2006). Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection. *Nucleic Acids Research,* Vol. 34, No. 17, (Sep 2006) pp. e112, ISSN 1362-4962

Chodanowski, P., Grosdidier, A., Feytmans, E., & Michielin, O. (2008). Local Alignment Refinement Using Structural Assessment. *PLoS ONE,* Vol. 3, No. 7, (Jul 2008) pp. e2645, ISSN 1932-6203 (electronic)

Cowles, M. K. & Carlin, B. P. (1996). Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review. *Journal of the American Statistical Association,* Vol. 91, No. 434, (Jun 1996) pp. 883-904, ISSN 0162-1459 (print), 1537-274X (electronic)

Dill, K. A., Ozkan, S. B., Shell, M. S., & Weikl, T. R. (2008). The protein folding problem. *Annu Rev Biophys,* Vol. 37, (Jun 2008) pp. 289-316, ISSN 1936-122X (print) 1936-1238 (electronic) 1936-122X (linking)

Eddy, S. R., Mitchison, G., & Durbin, R. (1995). Maximum discrimination hidden Markov models of sequence consensus. *J Comput Biol,* Vol. 2, No. 1, (Jan 1995) pp. 9-23, ISSN 1066-5277 (print); 1557-8666 (electronic)

Eddy, S. R. (2004). What is a hidden Markov model? *Nat Biotech,* Vol. 22, No. 10, (Oct 2004) pp. 1315-1316, ISSN 1087-0156

Faver, J. C., Benson, M. L., He, X., Roberts, B. P., Wang, B., Marshall, M. S., Sherrill, C. D., & Merz, Jr., K. M. (2011). The Energy Computation Paradox and *ab initio* Protein Folding. *PLoS ONE,* Vol. 6, No. 4, (Apr 2011) pp. e18868, ISSN 1932-6203

Fiser, A. & Sali, A. (2003). Modeller: generation and refinement of homology-based protein structure models. *Methods Enzymol,* Vol. 374, (Dec 2003) pp. 461-91, ISSN 978-0-12-182777-9
