Ab initio modeling of small proteins by iterative TASSER simulations

doi:10.1186/1741-7007-5-17

BMC Biol. 2007 May 8;5:17.

Sitao Wu¹, Jeffrey Skolnick, Yang Zhang

PMID: 17488521
PMCID: PMC1878469
DOI: 10.1186/1741-7007-5-17

Sitao Wu et al. BMC Biol. 2007.

Affiliation

¹ Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, Lawrence, KS 66047, USA. stwu@ku.edu

Abstract

Background: Predicting 3-dimensional protein structures from amino-acid sequences is an important unsolved problem in computational structural biology. The problem becomes easier if the structures of close homologs have been solved, as high-resolution models can be built by aligning target sequences to the solved homologous structures. However, for sequences without similar folds in the Protein Data Bank (PDB) library, the models have to be predicted from scratch. Progress in ab initio structure modeling has been slow. The aim of this study was to extend the TASSER (threading/assembly/refinement) method to ab initio modeling and to systematically examine its ability to fold small single-domain proteins.

Results: We developed I-TASSER by iteratively implementing the TASSER method and used it in folding tests on three benchmarks of small proteins. First, 16 small proteins (< 90 residues) were used to generate I-TASSER models, which had an average Cα root mean square deviation (Cα-RMSD) of 3.8Å, with 6 of them having a Cα-RMSD < 2.5Å. The overall result was comparable with the all-atom ROSETTA simulation, but the central processing unit (CPU) time required by I-TASSER was much shorter (5 CPU hours for I-TASSER vs. 150 CPU days for ROSETTA). Second, 20 small proteins (< 120 residues) were used. I-TASSER folded four of them with a Cα-RMSD < 2.5Å. The average Cα-RMSD of the I-TASSER models was 3.9Å, whereas it was 5.9Å using the TOUCHSTONE-II software. Finally, 20 non-homologous small proteins (< 120 residues) were taken from the PDB library. An average Cα-RMSD of 3.9Å was obtained for this third benchmark, with seven cases having a Cα-RMSD < 2.5Å.
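The Cα-RMSD values quoted above measure the deviation between model and native Cα positions after optimal rigid-body superposition. The following is a minimal sketch of that calculation, not the authors' code: it assumes the two coordinate lists are already aligned residue by residue, and it uses Horn's quaternion method with a simple power iteration as one convenient way to find the optimal superposition. The function name `ca_rmsd` and its parameters are hypothetical.

```python
import math

def ca_rmsd(model, native, iterations=500):
    """C-alpha RMSD after optimal superposition (illustrative sketch).

    model, native: equal-length lists of (x, y, z) C-alpha coordinates,
    paired residue by residue (an assumption of this sketch).
    """
    n = len(model)
    if n == 0 or n != len(native):
        raise ValueError("need two non-empty coordinate lists of equal length")

    def centered(coords):
        # Remove the centroid so that only the rotation remains to be found.
        c = [sum(p[k] for p in coords) / n for k in range(3)]
        return [[p[k] - c[k] for k in range(3)] for p in coords]

    x, y = centered(model), centered(native)
    # Sum of squared distances to the centroids, for both structures.
    g = sum(v * v for p in x for v in p) + sum(v * v for p in y for v in p)
    if g == 0.0:
        return 0.0

    # Cross-covariance terms S_ab = sum_i x_i[a] * y_i[b].
    s = [[sum(p[a] * q[b] for p, q in zip(x, y)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s

    # Horn's symmetric 4x4 matrix; its largest eigenvalue lam gives the
    # minimal squared error as g - 2 * lam.
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Power iteration on (key + shift * I). The shift g / 2 bounds every
    # eigenvalue in magnitude, so the shifted matrix has no negative
    # eigenvalues and the iteration converges to the largest one.
    shift = g / 2.0
    v = [1.0, 0.7, 0.4, 0.1]  # arbitrary start vector (assumption)
    for _ in range(iterations):
        w = [sum(key[i][j] * v[j] for j in range(4)) + shift * v[i]
             for i in range(4)]
        norm = math.sqrt(sum(t * t for t in w))
        if norm == 0.0:
            break
        v = [t / norm for t in w]

    # Rayleigh quotient of the unshifted matrix gives the eigenvalue.
    lam = sum(v[i] * key[i][j] * v[j] for i in range(4) for j in range(4))
    return math.sqrt(max(g - 2.0 * lam, 0.0) / n)


# Usage: a copy rotated by 90 degrees about z and shifted gives a value near 0.
native = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 3), (0, 0, -3)]
moved = [(-py + 5.0, px - 2.0, pz + 1.0) for px, py, pz in native]
print(ca_rmsd(native, moved))
```

A production implementation would more likely use a linear-algebra library (for example an SVD-based Kabsch superposition); the pure-Python power iteration is chosen here only to keep the sketch dependency-free.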

Conclusion: Our simulation results show that I-TASSER can consistently predict the correct folds, and sometimes high-resolution models, for small single-domain proteins. Compared with other ab initio modeling methods such as ROSETTA and TOUCHSTONE-II, the average performance of I-TASSER is either much better or similar at a lower computational cost. These data, together with the significant performance of the automated I-TASSER server (the Zhang-Server) in the 'free modeling' section of the recent Critical Assessment of Structure Prediction (CASP)7 experiment, demonstrate new progress in automated ab initio model generation. The I-TASSER server is freely available for academic users at http://zhang.bioinformatics.ku.edu/I-TASSER.

Figures

**Figure 1**
Flowchart of I-TASSER method for protein structure prediction.

**Figure 2**
**Examples of I-TASSER models from three independent benchmark sets**. I-TASSER models are shown in green and the native structures in blue. (A–C) are from benchmark I (Bradley et al. [13]); (D–F) are from benchmark II (Zhang et al. [12]); and (G–I) are from benchmark III, selected directly from the PDB library. Column 1 contains the high-resolution models with a Cα-RMSD ≤ 1.5Å; column 2 contains the medium-resolution models with a Cα-RMSD of 1.5–5Å; column 3 contains the low-resolution models with a Cα-RMSD > 5Å. The Cα-RMSD values for the examples are: **(A)** 1ogwA_ (1.1Å), **(B)** 1di2A_ (2.3Å), **(C)** 1dcjA_ (10.0Å), **(D)** 1cy5A (1.5Å), **(E)** 1pgx (3.1Å), **(F)** 1gnuA (8.2Å), **(G)** 1cqkA (1.5Å), **(H)** 1gyvA (3.3Å), **(I)** 1no5A (10.5Å). The pictures were generated using PyMOL software [45].
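The three resolution columns in Figure 2 amount to a simple thresholding rule on Cα-RMSD. As a small illustration, the sketch below applies the caption's cutoffs (≤ 1.5Å, 1.5–5Å, > 5Å) to the nine listed examples; the function name `resolution_class` is hypothetical, and treating exactly 5Å as medium resolution is a reading of the caption rather than something it states explicitly.

```python
def resolution_class(ca_rmsd_angstrom):
    """Map a C-alpha RMSD (in angstroms) to the Figure 2 column label."""
    if ca_rmsd_angstrom <= 1.5:
        return "high"
    if ca_rmsd_angstrom <= 5.0:  # boundary handling is an assumption
        return "medium"
    return "low"


# RMSD values copied from the Figure 2 caption.
figure2_examples = {
    "1ogwA_": 1.1, "1di2A_": 2.3, "1dcjA_": 10.0,
    "1cy5A": 1.5, "1pgx": 3.1, "1gnuA": 8.2,
    "1cqkA": 1.5, "1gyvA": 3.3, "1no5A": 10.5,
}

for name, rmsd in figure2_examples.items():
    print(name, rmsd, resolution_class(rmsd))
```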

**Figure 3**
**Comparison of I-TASSER models with the PPA threading alignment results**. **(A)** Cα-RMSD to native of the I-TASSER models versus Cα-RMSD to native of the best threading alignment over the same aligned regions. **(B)** TM-score of the I-TASSER models versus TM-score of the best threading alignments.
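Panel (B) of Figure 3 reports TM-score, which, unlike RMSD, is normalized by target length and down-weights large local deviations. A minimal sketch of the standard TM-score formula for one fixed superposition is given below. It assumes the per-residue Cα distances have already been computed after superposition; the real TM-score additionally searches for the superposition that maximizes the sum, which this sketch omits. The names `tm_d0` and `tm_score`, and the 0.5Å floor on d0 for very short chains, are assumptions made for illustration.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8."""
    if target_length <= 15:
        # The formula is undefined here; fall back to a floor (assumption).
        return 0.5
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)  # floor for short chains (assumption)


def tm_score(distances, target_length):
    """TM-score for one given superposition.

    distances: C-alpha distances (angstroms) for the aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    d0 = tm_d0(target_length)
    # Each aligned pair contributes between 0 and 1; unaligned residues
    # contribute nothing because the sum is divided by the full length.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Usage: a 100-residue target with 80 aligned residues, each 2 angstroms off.
print(tm_score([2.0] * 80, 100))
```

Because the sum is divided by the full target length, a model covering only part of the target is penalized even when the covered region superimposes well, which is why panel (B) complements the RMSD comparison over aligned regions in panel (A).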

Cited by

Construction of an aerolysin-based multi-epitope vaccine against Aeromonas hydrophila: an in silico machine learning and artificial intelligence-supported approach.
Alawam AS, Alwethaynani MS. Front Immunol. 2024 Mar 1;15:1369890. doi: 10.3389/fimmu.2024.1369890. eCollection 2024. PMID: 38495891. Free PMC article.
Sulfated disaccharide protects membrane and DNA damages from arginine-rich dipeptide repeats in ALS.
Chang YJ, Lin KT, Shih O, Yang CH, Chuang CY, Fang MH, Lai WB, Lee YC, Kuo HC, Hung SC, Yao CK, Jeng US, Chen YR. Sci Adv. 2024 Feb 23;10(8):eadj0347. doi: 10.1126/sciadv.adj0347. Epub 2024 Feb 23. PMID: 38394210. Free PMC article.
Engineering lentivirus envelope VSV-G for liver targeted delivery of IDOL-shRNA to ameliorate hypercholesterolemia and atherosclerosis.
Wang W, Chen X, Chen J, Xu M, Liu Y, Yang S, Zhao W, Tan S. Mol Ther Nucleic Acids. 2024 Jan 11;35(1):102115. doi: 10.1016/j.omtn.2024.102115. eCollection 2024 Mar 12. PMID: 38314097. Free PMC article.
Tpgen: a language model for stable protein design with a specific topology structure.
Min X, Yang C, Xie J, Huang Y, Liu N, Jin X, Wang T, Kong Z, Lu X, Ge S, Zhang J, Xia N. BMC Bioinformatics. 2024 Jan 23;25(1):35. doi: 10.1186/s12859-024-05637-5. PMID: 38254030. Free PMC article.
Role of environmental specificity in CASP results.
Roterman I, Stapor K, Konieczny L. BMC Bioinformatics. 2023 Nov 11;24(1):425. doi: 10.1186/s12859-023-05559-8. PMID: 37950210. Free PMC article.

References

1. Baker D, Sali A. Protein structure prediction and structural genomics. Science. 2001;294:93–96. doi: 10.1126/science.1065659.
2. Skolnick J, Fetrow JS, Kolinski A. Structural genomics and its importance for gene function analysis. Nat Biotechnol. 2000;18:283–287. doi: 10.1038/73723.
3. Sali A, Blundell TL. Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993;234:779–815. doi: 10.1006/jmbi.1993.1626.
4. Fiser A, Do RK, Sali A. Modeling of loops in protein structures. Protein Sci. 2000;9:1753–1773.
5. Bowie JU, Luthy R, Eisenberg D. A method to identify protein sequences that fold into a known three-dimensional structure. Science. 1991;253:164–170. doi: 10.1126/science.1853201.
