Massively parallel
Computer science
Massively parallel sequencing
Computational biology
Biology
Parallel computing
Genetics
Gene
Genome
Authors
Ayaan Hossain, Daniel P. Cetnar, Travis L. LaFleur, Josie McLellan, Howard M. Salis
Identifier
DOI:10.1021/acssynbio.4c00661
Abstract
Oligopool synthesis and next-generation sequencing enable the construction and characterization of large libraries of designed genetic parts and systems. As library sizes grow, it becomes computationally challenging to optimally design large numbers of primer binding sites, barcode sequences, and overlap regions to obtain efficient assemblies and precise measurements. We present the Oligopool Calculator, an end-to-end suite of algorithms and data structures that rapidly designs many thousands of oligonucleotides within an oligopool and rapidly analyzes many billions of barcoded sequencing reads. We introduce several novel concepts that greatly increase the design and analysis throughput, including orthogonally symmetric barcode design, adaptive decision trees for primer design, a Scry barcode classifier, and efficient read packing. We demonstrate the Oligopool Calculator's capabilities across computational benchmarks and real-data projects, including the design of over four million highly unique and compact barcodes in 1.2 h, the design of universal primer binding sites for one million 200-mer oligos in 15 min, and the analysis of about 500 million deep sequencing reads per hour, all on an 8-core desktop computer. Overall, the Oligopool Calculator accelerates the creative use of massively parallel experiments by eliminating the computational complexity of their design and analysis.
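To make the barcode design problem concrete, here is a minimal sketch of choosing barcodes that differ from one another in at least a fixed number of positions (a minimum Hamming distance). This is an illustrative greedy random search, not the Oligopool Calculator's orthogonally symmetric barcode design, which the abstract does not describe in detail. The function names, parameters, and the Hamming-distance criterion are assumptions made for this example.

```python
import random

BASES = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def design_barcodes(count, length, min_distance, seed=0, max_tries=100_000):
    """Greedy random search (hypothetical helper, not the paper's method).

    Draws random candidates and keeps each one that is at least
    `min_distance` mismatches away from every barcode accepted so far.
    Returns fewer than `count` barcodes if `max_tries` is exhausted.
    """
    rng = random.Random(seed)
    chosen = []
    for _ in range(max_tries):
        if len(chosen) == count:
            break
        candidate = "".join(rng.choice(BASES) for _ in range(length))
        if all(hamming(candidate, other) >= min_distance for other in chosen):
            chosen.append(candidate)
    return chosen


barcodes = design_barcodes(count=20, length=10, min_distance=3)
```

Each candidate is compared against every accepted barcode, so the cost of this naive approach grows quadratically with library size. That scaling is the kind of bottleneck the abstract refers to when libraries reach millions of barcodes.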