Gapped BLAST and PSI-BLAST: a new generation of protein database search programs
- PMID: 9254694
- PMCID: PMC146917
- DOI: 10.1093/nar/25.17.3389
Abstract
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
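The idea of combining aligned sequences into a position-specific score matrix and then searching with it can be illustrated with a minimal sketch. This is not the PSI-BLAST implementation: the uniform background frequencies, the flat pseudocount, the gap-free alignment input, and the function names `build_pssm` and `best_window` are all assumptions made for illustration, and sequence weighting, gapped alignment, and significance statistics are left out.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column.

    Assumptions (illustrative only): gap-free sequences of equal length,
    a uniform background distribution, and a flat pseudocount per residue.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        # Log-odds of the smoothed column frequency against the background.
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_window(pssm, subject):
    """Slide the matrix along `subject`; return (best score, start offset)."""
    width = len(pssm)
    best = (float("-inf"), -1)
    for start in range(len(subject) - width + 1):
        score = sum(pssm[i][subject[start + i]] for i in range(width))
        best = max(best, (score, start))
    return best


if __name__ == "__main__":
    matrix = build_pssm(["ACDE", "ACDE", "ACDF"])
    print(best_window(matrix, "GGGACDEGGG"))
```

In an iterated search, the hits found with the matrix would be folded back into the alignment and the matrix rebuilt before the next pass; the sketch covers a single pass only.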
Similar articles
- Getting the most from PSI-BLAST. Trends Biochem Sci. 2002 Mar;27(3):161-4. doi: 10.1016/s0968-0004(01)02039-4. PMID: 11893514. Review.
- Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 2001 Jul 15;29(14):2994-3005. doi: 10.1093/nar/29.14.2994. PMID: 11452024. Free PMC article. Review.
- Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases. Bioinformatics. 2000 Nov;16(11):988-1002. doi: 10.1093/bioinformatics/16.11.988. PMID: 11159310.
- Large-scale comparison of protein sequence alignment algorithms with structure alignments. Proteins. 2000 Jul 1;40(1):6-22. doi: 10.1002/(sici)1097-0134(20000701)40:1<6::aid-prot30>3.0.co;2-7. PMID: 10813826.
- Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 1996;266:383-402. doi: 10.1016/s0076-6879(96)66024-8. PMID: 8743695.
Cited by
- Roseateles caseinilyticus sp. nov. and Roseateles cellulosilyticus sp. nov., isolated from rice paddy field soil. Antonie Van Leeuwenhoek. 2024 Jun 4;117(1):87. doi: 10.1007/s10482-024-01988-4. PMID: 38833203.
- Assessment of Plant Growth Promotion Potential of Endophytic Bacterium B. subtilis KU21 Isolated from Rosmarinus officinalis. Curr Microbiol. 2024 Jun 4;81(7):207. doi: 10.1007/s00284-024-03734-5. PMID: 38831110.
- SOFB is a comprehensive ensemble deep learning approach for elucidating and characterizing protein-nucleic-acid-binding residues. Commun Biol. 2024 Jun 3;7(1):679. doi: 10.1038/s42003-024-06332-0. PMID: 38830995. Free PMC article.
- Sororin is an evolutionary conserved antagonist of WAPL. Nat Commun. 2024 Jun 3;15(1):4729. doi: 10.1038/s41467-024-49178-0. PMID: 38830897. Free PMC article.
- Cracking the antigenic code of mycobacteria: CFP-10/ESAT-6 tuberculosis skin test and misleading results. J Clin Tuberc Other Mycobact Dis. 2024 Apr 2;36:100436. doi: 10.1016/j.jctube.2024.100436. eCollection 2024 Aug. PMID: 38828192. Free PMC article.