Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms

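Keywords: Smith-Waterman algorithm; Biology; Sequence alignment; Sequence database; Similarity (geometry); Protein sequencing; Sequence (biology); Multiple sequence alignment; Sensitivity (control systems); Bioinformatics; Computational biology; Algorithm; Computer science; Genetics; Peptide sequence; Artificial intelligence; Gene; Electronic engineering; Image (mathematics); Engineering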
Author
William H. Pearson
Source
Journal: Genomics [Elsevier]
Volume (issue): 11 (3): 635-650 Citations: 519
Identifier
DOI:10.1016/0888-7543(91)90071-l
Abstract

The sensitivity and selectivity of the FASTA and the Smith-Waterman protein sequence comparison algorithms were evaluated using the superfamily classification provided in the National Biomedical Research Foundation/Protein Identification Resource (PIR) protein sequence database. Sequences from each of the 34 superfamilies in the PIR database with 20 or more members were compared against the protein sequence database. The similarity scores of the related and unrelated sequences were determined using either the FASTA program or the Smith-Waterman local similarity algorithm. These two sets of similarity scores were used to evaluate the ability of the two comparison algorithms to identify distantly related protein sequences. The FASTA program using the ktup = 2 sensitivity setting performed as well as the Smith-Waterman algorithm for 19 of the 34 superfamilies. Increasing the sensitivity by setting ktup = 1 allowed FASTA to perform as well as Smith-Waterman on an additional 7 superfamilies. The rigorous Smith-Waterman method performed better than FASTA with ktup = 1 on 8 superfamilies, including the globins, immunoglobulin variable regions, calmodulins, and plastocyanins. Several strategies for improving the sensitivity of FASTA were examined. The greatest improvement in sensitivity was achieved by optimizing a band around the best initial region found for every library sequence. For every superfamily except the globins and immunoglobulin variable regions, this strategy was as sensitive as a full Smith-Waterman. For some sequences, additional sensitivity was achieved by including conserved but nonidentical residues in the lookup table used to identify the initial region.
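The two ideas contrasted in the abstract, a full Smith-Waterman search versus a ktup lookup followed by optimization in a band around the best initial region, can be illustrated with a minimal sketch. This is not the FASTA program or the code used in the paper. The function names, the identity-based match/mismatch scores, the linear gap penalty, and the hit-counting choice of diagonal are all assumptions made for illustration; a real comparison would use a substitution matrix and FASTA's own region scoring.

```python
from collections import Counter, defaultdict


def local_score(a, b, match=2, mismatch=-1, gap=-2, band=None, diagonal=0):
    """Best Smith-Waterman local alignment score of a against b.

    Scoring values are illustrative assumptions, not the paper's parameters.
    If band is given, only cells with |(j - i) - diagonal| <= band are filled;
    cells outside the band keep the local-alignment floor of 0.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if band is not None and abs((j - i) - diagonal) > band:
                continue  # outside the band: leave the cell at 0
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_diagonal(query, subject, ktup=2):
    """Diagonal (j - i) with the most shared ktup-length words, or None.

    A simplified stand-in for finding the best initial region: it only
    counts exact word hits per diagonal via a lookup table on the query.
    """
    table = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        table[query[i:i + ktup]].append(i)
    hits = Counter()
    for j in range(len(subject) - ktup + 1):
        for i in table.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    if not hits:
        return None
    # Break ties toward the diagonal closest to the main one.
    return max(hits, key=lambda d: (hits[d], -abs(d)))


def banded_search_score(query, subject, ktup=2, band=4):
    """Lookup step followed by banded optimization; 0 if no word is shared."""
    d = best_diagonal(query, subject, ktup)
    if d is None:
        return 0
    return local_score(query, subject, band=band, diagonal=d)
```

In this sketch the banded score can never exceed the full score, because the band only removes cells from consideration. A smaller `ktup` finds more word hits and so gives the lookup step more chances to locate a usable diagonal, at the cost of more work, which mirrors the ktup = 2 versus ktup = 1 trade-off described in the abstract.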
