Genome
Autoencoder
Human Microbiome Project
Microbiome
Cluster analysis
Encoding
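Computational biology
Computer science
Artificial intelligence
Pattern recognition (psychology)
Genomics
Deep learning
k-mer
Data mining
Biology
Bioinformatics
Gene
Genetics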
Authors
Jakob Nybo Nissen,Joachim Johansen,Rosa Lundbye Allesøe,Casper Kaae Sønderby,José Juan Almagro Armenteros,Christopher Heje Grønbech,Lars Juhl Jensen,Henrik Bjørn Nielsen,Thomas Nordahl Petersen,Ole Winther,Simon Rasmussen
Identifier
DOI:10.1038/s41587-020-00777-4
Abstract
Despite recent advances in metagenomic binning, reconstruction of microbial species from metagenomics data remains challenging. Here we develop variational autoencoders for metagenomic binning (VAMB), a program that uses deep variational autoencoders to encode sequence coabundance and k-mer distribution information before clustering. We show that a variational autoencoder is able to integrate these two distinct data types without any previous knowledge of the datasets. VAMB outperforms existing state-of-the-art binners, reconstructing 29–98% and 45% more near-complete (NC) genomes on simulated and real data, respectively. Furthermore, VAMB is able to separate closely related strains up to 99.5% average nucleotide identity (ANI), and reconstructed 255 and 91 NC Bacteroides vulgatus and Bacteroides dorei sample-specific genomes as two distinct clusters from a dataset of 1,000 human gut microbiome samples. We use 2,606 NC bins from this dataset to show that species of the human gut microbiome have different geographical distribution patterns. VAMB can be run on standard hardware and is freely available at https://github.com/RasmussenLab/vamb. Metagenomics data are resolved into their constituent genomes using a new deep learning method.
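The abstract names two inputs for each sequence: its k-mer distribution and its coabundance across samples. As a rough illustration of what such inputs might look like, the following Python sketch builds one feature vector per contig. It is not VAMB's code and does not include the variational autoencoder or the clustering step. The function names, the choice of k = 4, and the simple sum-to-one scaling of abundances are all assumptions made for this example.

```python
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Hypothetical helper for illustration; k=4 is an assumed default.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


def contig_features(sequence, abundances, k=4):
    """Concatenate the k-mer profile with abundances scaled to sum to 1.

    `abundances` holds one read-depth value per sample (assumed input).
    """
    depth = sum(abundances)
    scaled = [a / depth if depth > 0 else 0.0 for a in abundances]
    return kmer_frequencies(sequence, k) + scaled


# Toy contig with made-up depths in three samples.
features = contig_features("ACGTACGTAACCGGTT", abundances=[12.0, 0.5, 3.5])
print(len(features))  # 4**4 k-mer frequencies followed by 3 sample abundances
```

In a binning workflow, vectors like these would be the input to an encoder, and the resulting latent representations would then be clustered. For the actual preprocessing and model, see the repository linked in the abstract.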