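Computer science
Word error rate
Error detection and correction
Sensitivity (control systems)
Computational biology
Algorithm
Biology
Artificial intelligence
Engineering
Electronic engineering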
Authors
Huimin Chen,F. Richard Yu,Debin Lu,S.K. Huang,Shaoduo Liu,Boseng Zhang,Kunxian Shu,Dan Pu
Identifier
DOI:10.1002/elps.202400202
Abstract
The identification of low-frequency variants remains challenging due to the inevitable high error rates of next-generation sequencing (NGS). Numerous promising strategies employ unique molecular identifiers (UMIs) for error suppression. However, their efficiency depends highly on redundant sequencing and quality control, leading to tremendous read waste and cost inefficiency. Here, we describe a novel approach, enhanced error suppression strategy (EES), that addresses these challenges by (1) optimizing data utilization and reducing read waste by utilizing single-read correction that reserves abundant single reads that complement other single reads or single-strand consensus sequences (SSCSs), and (2) effectively enhancing the accuracy of NGS by employing Bayes' theorem. EES significantly improves variant detection accuracy, achieving a background error rate of less than 4.4 × 10⁻⁵ per base pair. Additionally, the data utilization rate is dramatically increased, with a 22.9-fold enhancement in duplex consensus sequence (DCS) recovery compared to traditional methodologies. Furthermore, EES demonstrates superior error suppression performance across various base substitutions. In conclusion, EES represents a significant advancement in detecting low-frequency variants by improving data utilization and reducing sequencing errors. It potentially enhances the sensitivity and accuracy of NGS applications, proving highly valuable in clinical and research contexts where precise variant detection is critical.
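The abstract does not spell out how EES applies Bayes' theorem, so the following is only a generic sketch of Bayesian base calling within a single UMI family, not the authors' method. It assumes a uniform prior over the four bases, independent read errors, Phred-scaled quality scores, and errors spread evenly over the three alternative bases. The names `phred_to_error` and `posterior_base` are hypothetical.

```python
import math

BASES = "ACGT"


def phred_to_error(q):
    """Convert a Phred quality score to an error probability.

    Capped at 0.75 (the value at which a call carries no information)
    so that log(1 - e) stays finite for very low qualities.
    """
    return min(10 ** (-q / 10), 0.75)


def posterior_base(observations, prior=None):
    """Posterior probability of each true base at one position.

    observations: list of (observed_base, phred_quality) pairs taken from
    reads that share a UMI. Assumption: read errors are independent and a
    miscall is equally likely to be any of the other three bases.
    """
    prior = prior or {b: 0.25 for b in BASES}
    log_post = {}
    for candidate in BASES:
        lp = math.log(prior[candidate])
        for base, q in observations:
            e = phred_to_error(q)
            # Likelihood of this observation if `candidate` is the true base.
            lp += math.log(1 - e) if base == candidate else math.log(e / 3)
        log_post[candidate] = lp
    # Normalize in log space to avoid underflow with deep families.
    m = max(log_post.values())
    unnorm = {b: math.exp(lp - m) for b, lp in log_post.items()}
    total = sum(unnorm.values())
    return {b: v / total for b, v in unnorm.items()}


if __name__ == "__main__":
    # Two high-quality A calls and one lower-quality C call.
    family = [("A", 30), ("A", 30), ("C", 20)]
    for base, p in posterior_base(family).items():
        print(base, f"{p:.6f}")
```

Under these assumptions a single read still yields a usable posterior (a lone Q20 call gives 0.99 for the observed base), which is one way to see why single reads need not be discarded outright. How EES actually combines single reads with SSCSs is not described in the abstract.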