Clover: tree structure-based efficient DNA clustering for DNA-based data storage

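Keywords: Computer science; Cluster analysis; DNA; Tree (set theory); Time complexity; DNA computing; Storage protein; DNA sequencing; Computational complexity theory; Computational biology; Algorithm; Data mining; Biomathematics; Computational genetics; Gene; Artificial intelligence; Mathematical analysis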

Authors

Guanjin Qu, Zihui Yan, Huaming Wu

Source

Journal: Briefings in Bioinformatics [Oxford University Press]
Date: 2022-08-16; Volume (issue): 23 (5); Citations: 7

Identifier

DOI: 10.1093/bib/bbac336

Abstract

Deoxyribonucleic acid (DNA)-based data storage is a promising new storage technology which has the advantage of high storage capacity and long storage time compared with traditional storage media. However, the synthesis and sequencing process of DNA can randomly generate many types of errors, which makes it more difficult to cluster DNA sequences to recover DNA information. Currently, the available DNA clustering algorithms are targeted at DNA sequences in the biological domain, which not only cannot adapt to the characteristics of sequences in DNA storage, but also tend to be unacceptably time-consuming for billions of DNA sequences in DNA storage. In this paper, we propose an efficient DNA clustering method termed Clover for DNA storage with linear computational complexity and low memory. Clover avoids the computation of the Levenshtein distance by using a tree structure for interval-specific retrieval. We argue through theoretical proofs that Clover has standard linear computational complexity, low space complexity, etc. Experiments show that our method can cluster 10 million DNA sequences into 50 000 classes in 10 s and meet an accuracy rate of over 99%. Furthermore, we have successfully completed an unprecedented clustering of 10 billion DNA data on a single home computer and the time consumption still satisfies the linear relationship. Clover is freely available at https://github.com/Guanjinqu/Clover.
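
The abstract says that a tree structure with interval-specific retrieval takes the place of Levenshtein distance computation, but it does not give the details. The Python sketch below is a rough illustration of that general idea only and is not the authors' implementation; for that, see the repository linked above. It assumes each read is keyed by one fixed interval of bases, that a small number of substitutions inside that interval is tolerated, and that insertions and deletions are ignored. The names `TrieNode`, `cluster_reads`, `start`, `length`, and `max_mismatches` are hypothetical.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of a prefix tree over the bases A, C, G, T."""
    __slots__ = ("children", "cluster_id")

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.cluster_id: Optional[int] = None  # set only where a key ends


def _search(node: TrieNode, key: str, pos: int, budget: int) -> Optional[int]:
    """Return the cluster id of a stored key within `budget` substitutions."""
    if pos == len(key):
        return node.cluster_id
    # Follow the exact base first, so error-free reads take a single path.
    child = node.children.get(key[pos])
    if child is not None:
        found = _search(child, key, pos + 1, budget)
        if found is not None:
            return found
    # Otherwise spend one unit of the mismatch budget on a different base.
    if budget > 0:
        for base, other in node.children.items():
            if base != key[pos]:
                found = _search(other, key, pos + 1, budget - 1)
                if found is not None:
                    return found
    return None


def cluster_reads(reads: List[str], start: int = 0, length: int = 8,
                  max_mismatches: int = 1) -> List[List[str]]:
    """Group reads by looking up one fixed interval of each read in a trie."""
    root = TrieNode()
    clusters: List[List[str]] = []
    for read in reads:
        key = read[start:start + length]
        cid = _search(root, key, 0, max_mismatches)
        if cid is None:
            # No stored key is close enough: open a new cluster for this key.
            cid = len(clusters)
            clusters.append([])
            node = root
            for base in key:
                node = node.children.setdefault(base, TrieNode())
            node.cluster_id = cid
        clusters[cid].append(read)
    return clusters
```

Each lookup walks at most `length` levels for an error-free read, so under these assumptions the work per read does not grow with the number of clusters already stored. No pairwise edit distance between reads is ever computed.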
