You must choose, but choose wisely: Model-based approaches for microbial community analysis

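Computer science; Genome; Data science; Microbial ecology; Count data; Microbiome; Ecology; Machine learning; Biochemical engineering; Data mining; Mathematical biology; Bioinformatics; Engineering; Biochemistry; Statistics; Genetics; Bacteria; Gene; Poisson distribution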

Authors

Márcio Fernandes Alves Leite, Eiko E. Kuramae

Source

Journal: Soil Biology & Biochemistry [Elsevier BV]
Date: 2020-10-07; Volume/issue: 151: 108042-108042; Citations: 40

Identifier

DOI: 10.1016/j.soilbio.2020.108042

Abstract

Soil microbial community data produced by next-generation sequencing platforms has introduced a new era in microbial ecology studies but poses a challenge for data analysis: huge tables with highly sparse data combined with methodological limitations leading to biased analyses. Methodological studies have attempted to improve data interpretation via data transformation and/or rarefaction but usually neglect the assumptions required for an appropriate analysis. Advances in both mathematics and computation are now making model-based approaches feasible, especially latent variable modeling (LVM). LVM is a cornerstone of modern unsupervised learning that permits the evaluation of evolutionary, temporal, and count structure in a unified approach that directly incorporates the data distribution. Despite these advantages, LVM is rarely applied in data analyses of the soil microbiome. Here, we review available methods to handle the characteristics of soil microbial data obtained from next-generation sequencing and advocate for model-based approaches. We focus on the importance of assumption checking for guiding the selection of the most appropriate method of data analysis. We also provide future directions by advocating for the consideration of the dataset produced by sequencing as a representation of microbial detections instead of abundances and for the adoption of hierarchical models to convert these detections into estimated abundances prior to evaluating the microbial community. In summary, we show that model assessment is important for qualifying interpretations and can further guide refinements in subsequent analyses. We have only begun to understand the factors regulating soil microbial communities and the impacts of this microbiota on the environment/ecosystem. Understanding the assumptions of new methods is essential to fully harness their power to test hypotheses using high-throughput sequencing data.
