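Computer science
Genome
Inference
Metadata
Univariate
False positive paradox
Data mining
Graphical model
Microbial ecology
Set (abstract data type)
Probabilistic logic
Scale (ratio)
Variety (cybernetics)
Machine learning
Artificial intelligence
Multivariate statistics
Biology
Geography
Operating system
Gene
Cartography
Programming language
Bacteria
Biochemistry
Genetics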
Authors
Janko Tackmann, João F. Matias Rodrigues, Christian von Mering
Abstract
The recent explosion of metagenomic sequencing data opens the door towards the modeling of microbial ecosystems in unprecedented detail. In particular, co-occurrence based prediction of ecological interactions could strongly benefit from this development. However, current methods fall short on several fronts: univariate tools do not distinguish between direct and indirect interactions, resulting in excessive false positives, while approaches with better resolution are so far computationally highly limited. Furthermore, confounding variables typical for cross-study data sets are rarely addressed. We present FlashWeave, a new approach based on a flexible Probabilistic Graphical Models framework to infer highly resolved direct microbial interactions from massive heterogeneous microbial abundance data sets with seamless integration of metadata. On a variety of benchmarks, FlashWeave outperforms state-of-the-art methods by several orders of magnitude in terms of speed while generally providing increased accuracy. We apply FlashWeave to a cross-study data set of 69,818 publicly available human gut samples, resulting in one of the largest and most diverse models of microbial interactions in the human gut to date.