Authors
Henrik Bjørn Nielsen, Mathieu Almeida, Agnieszka Sierakowska Juncker, Simon Rasmussen, Junhua Li, Shinichi Sunagawa, Damian R. Plichta, Laurent Gautier, Anders Gorm Pedersen, Emmanuelle Le Chatelier, Éric Pelletier, Ida Bonde, Trine Nielsen, Chaysavanh Manichanh, Manimozhiyan Arumugam, Jean-Michel Batto, Marcelo Bertalan Quintanilha dos Santos, Nikolaj Blom, Natalia Borruel, Kristoffer Sølvsten Burgdorf, Fouad Boumezbeur, Francesc Casellas, Joël Doré, Piotr Dworzyński, Francisco Guarner, Torben Hansen, Falk Hildebrand, Rolf Sommer Kaas, Sean P. Kennedy, Karsten Kristiansen, Jens Roat Kultima, Pierre Léonard, Florence Levenez, Ole Lund, Bouziane Moumen, Denis Le Paslier, Nicolas Pons, Oluf Pedersen, Edi Prifti, Junjie Qin, Jeroen Raes, Søren J. Sørensen, Julien Tap, Sebastian Tims, David W. Ussery, Takuji Yamada, Pierre Renault, Thomas Sicheritz‐Pontén, Peer Bork, Jun Wang, Søren Brunak, S. Dusko Ehrlich, Alexandre Jamet, Antonietta Cultrone, Christine Delorme, Emmanuelle Maguin, Éric Guédon, Gaetana Vandemeulebrouck, Ghalia Kaci, Hervé M. Blottière, Maarten van de Guchte, Nicolás Sánchez, Rozenn Dervyn, Séverine Layec, Yohanan Winogradsky
Abstract
Sequencing the microbial species present in complex metagenomic samples is made easier with a method that groups genes by co-abundance. Most current approaches for analyzing metagenomic data rely on comparisons to reference genomes, but the microbial diversity of many environments extends far beyond what is covered by reference databases. De novo segregation of complex metagenomic data into specific biological entities, such as particular bacterial strains or viruses, remains a largely unsolved problem. Here we present a method, based on binning co-abundant genes across a series of metagenomic samples, that enables comprehensive discovery of new microbial organisms, viruses and co-inherited genetic entities and aids assembly of microbial genomes without the need for reference sequences. We demonstrate the method on data from 396 human gut microbiome samples and identify 7,381 co-abundance gene groups (CAGs), including 741 metagenomic species (MGS). We use these to assemble 238 high-quality microbial genomes and identify affiliations between MGS and hundreds of viruses or genetic entities. Our method provides the means for comprehensive profiling of the diversity within complex metagenomic samples.
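To make the idea of binning co-abundant genes concrete, the following is a minimal sketch and not the authors' implementation. It assumes each gene has an abundance profile (one value per sample) and greedily groups genes whose profiles correlate strongly with a seed gene. The function names, the Pearson-correlation criterion, the 0.9 cutoff and the toy profiles are all illustrative assumptions; none of them is taken from the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length abundance profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-abundance signal
    return cov / sqrt(var_x * var_y)


def bin_coabundant_genes(profiles, min_corr=0.9):
    """Greedily group genes whose profiles correlate with a seed gene.

    profiles: dict mapping gene id -> list of abundances, one per sample.
    min_corr: hypothetical correlation cutoff (an assumption of this sketch).
    Returns a list of gene-id lists, one list per co-abundance group.
    """
    unassigned = list(profiles)  # keeps the dict's insertion order
    groups = []
    while unassigned:
        seed = unassigned.pop(0)  # first remaining gene seeds a new group
        group = [seed]
        remaining = []
        for gene in unassigned:
            if pearson(profiles[seed], profiles[gene]) >= min_corr:
                group.append(gene)
            else:
                remaining.append(gene)
        unassigned = remaining
        groups.append(group)
    return groups


# Toy data: five genes measured across four samples (made-up numbers).
profiles = {
    "gene_a": [10, 0, 5, 20],
    "gene_b": [20, 0, 10, 40],  # proportional to gene_a
    "gene_c": [0, 8, 4, 0],
    "gene_d": [0, 16, 8, 0],    # proportional to gene_c
    "gene_e": [3, 3, 3, 3],     # constant across samples
}

groups = bin_coabundant_genes(profiles)
for group in groups:
    print(group)
```

In this toy input, genes whose abundances rise and fall together across samples end up in the same group, which is the intuition behind treating a co-abundance group as a candidate biological entity. A real analysis would work on millions of genes and would need a scalable clustering strategy and normalized abundance estimates, which this sketch omits.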