Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms

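Keywords: Smith-Waterman algorithm; Biology; Sequence alignment; Sequence database; Similarity (geometry); Protein sequencing; Sequence (biology); Multiple sequence alignment; Sensitivity (control systems); Bioinformatics; Computational biology; Algorithm; Computer science; Genetics; Peptide sequence; Artificial intelligence; Gene; Electronic engineering; Image (mathematics); Engineering
Author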
William H. Pearson
Source
Journal: Genomics [Elsevier]
Volume (issue): 11 (3): 635-650 Cited by: 519
Identifier
DOI:10.1016/0888-7543(91)90071-l
Abstract

The sensitivity and selectivity of the FASTA and the Smith-Waterman protein sequence comparison algorithms were evaluated using the superfamily classification provided in the National Biomedical Research Foundation/Protein Identification Resource (PIR) protein sequence database. Sequences from each of the 34 superfamilies in the PIR database with 20 or more members were compared against the protein sequence database. The similarity scores of the related and unrelated sequences were determined using either the FASTA program or the Smith-Waterman local similarity algorithm. These two sets of similarity scores were used to evaluate the ability of the two comparison algorithms to identify distantly related protein sequences. The FASTA program using the ktup = 2 sensitivity setting performed as well as the Smith-Waterman algorithm for 19 of the 34 superfamilies. Increasing the sensitivity by setting ktup = 1 allowed FASTA to perform as well as Smith-Waterman on an additional 7 superfamilies. The rigorous Smith-Waterman method performed better than FASTA with ktup = 1 on 8 superfamilies, including the globins, immunoglobulin variable regions, calmodulins, and plastocyanins. Several strategies for improving the sensitivity of FASTA were examined. The greatest improvement in sensitivity was achieved by optimizing a band around the best initial region found for every library sequence. For every superfamily except the globins and immunoglobulin variable regions, this strategy was as sensitive as a full Smith-Waterman. For some sequences, additional sensitivity was achieved by including conserved but nonidentical residues in the lookup table used to identify the initial region.
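The two ideas the abstract contrasts, a rigorous Smith-Waterman local alignment and a lookup-table search followed by optimization in a band around the best initial region, can be illustrated with a minimal Python sketch. This is not the FASTA program or the code evaluated in the paper. The function names, the match/mismatch/gap values, and the linear gap penalty are assumptions chosen for brevity, where a real protein search would use a substitution matrix.

```python
from collections import defaultdict


def best_diagonal(query, target, ktup=2):
    """Return the diagonal (j - i) with the most shared ktup-words, or None.

    Simplified stand-in for a lookup-table first stage (hypothetical helper).
    """
    table = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        table[query[i:i + ktup]].append(i)
    hits = defaultdict(int)
    for j in range(len(target) - ktup + 1):
        for i in table.get(target[j:j + ktup], ()):
            hits[j - i] += 1
    if not hits:
        return None
    return max(hits, key=hits.get)


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2, band=None):
    """Best local alignment score of a and b (linear gap penalty, assumed scores).

    band: optional (center_diagonal, half_width). When given, only cells with
    |(j - i) - center_diagonal| <= half_width are computed; all other cells
    stay at 0, so the banded score never exceeds the full score.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if band is not None and abs((j - i) - band[0]) > band[1]:
                continue  # outside the band
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr[j] = score
            best = max(best, score)
        prev = curr
    return best


query, target = "ACGT", "TTACGT"
diag = best_diagonal(query, target, ktup=2)
full = smith_waterman_score(query, target)
banded = smith_waterman_score(query, target, band=(diag, 1))
```

In this toy case the banded score equals the full score because the whole alignment lies on one diagonal. When the best alignment wanders outside the band, the banded score is lower, which is one way a banded search can miss what the full algorithm finds.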
