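Cluster analysis
Computer science
Sequence (biology)
Sensitivity (control systems)
Data mining
Exploit
Pattern recognition (psychology)
Machine learning
Artificial intelligence
Biology
Genetics
Computer security
Electronic engineering
Engineering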
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2010-08-12
Volume (issue), pages: 26 (19): 2460-2461
Citations: 17074
Identifier
DOI:10.1093/bioinformatics/btq461
Abstract
Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.
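
The abstract says only that UCLUST uses USEARCH to assign sequences to clusters; it does not give the procedure. As a rough illustration of what assigning sequences to clusters by a search step can look like, here is a minimal sketch of generic greedy centroid-based clustering. It is not the UCLUST implementation: `toy_identity` and `greedy_cluster` are hypothetical names, the similarity measure is a stand-in from Python's standard library rather than the identity computed by USEARCH, and the linear scan over centroids stands in for the fast database search the paper is about.

```python
from difflib import SequenceMatcher


def toy_identity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]. Assumption: NOT the USEARCH identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold=0.8):
    """Greedy centroid clustering (illustrative sketch only).

    Each sequence joins the first existing centroid whose similarity meets
    the threshold; otherwise it becomes the centroid of a new cluster.
    """
    centroids = []  # one representative sequence per cluster
    clusters = []   # members of each cluster, parallel to `centroids`
    for s in seqs:
        for i, c in enumerate(centroids):
            if toy_identity(s, c) >= threshold:
                clusters[i].append(s)
                break
        else:
            # No centroid was similar enough: start a new cluster.
            centroids.append(s)
            clusters.append([s])
    return centroids, clusters


if __name__ == "__main__":
    example = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
    cents, groups = greedy_cluster(example, threshold=0.8)
    for c, g in zip(cents, groups):
        print(c, g)
```

In this sketch the cost is dominated by comparing every sequence against every centroid, which is the step a fast search method would replace.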