The Bayesian Paradigm in Molecular Phylogeny

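Bayesian probability, Bayesian inference, computer science, phylogenetic tree, probabilistic logic, Markov chain Monte Carlo, inference, artificial intelligence, biology, gene, biochemistry
Author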
Nicolas RODRIGUE
Identifier
DOI:10.1002/9781394284252.ch8
Abstract

Chapter 8: The Bayesian Paradigm in Molecular Phylogeny
Nicolas RODRIGUE, Carleton University, Ottawa, Canada
In: Models and Methods for Biological Evolution: Mathematical Models and Algorithms to Study Evolution. Book author(s): Gilles Didier, Stéphane Guindon
First published: 12 April 2024
https://doi.org/10.1002/9781394284252.ch8

Summary

The applications of probabilistic methods were initially developed within a maximum likelihood (ML) framework. Accommodating multiple substitutions along a branch of a phylogenetic tree is a major advantage of probabilistic methods. This chapter discusses the technical limitations of the ML framework in building rich molecular evolutionary models, and how the computational environment developed for Bayesian models overcomes them. It introduces the basic principles of Bayesian phylogenetic inference, namely the Monte Carlo-based numerical sampling methods commonly used for approximating the probabilities involved, and possible ways to summarize the posterior distribution of the model parameters. Using two examples, the chapter explains the principle of demarginalization, which often results in faster Monte Carlo sampling, as well as the implementation of substitution models with a non-analytic likelihood function. It also discusses possible areas for future research in Bayesian molecular phylogeny and the work needed to realize its full potential at the genomic scale.
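As an illustration of the Monte Carlo-based sampling mentioned in the summary, the following is a minimal sketch of a Metropolis-Hastings sampler, not the chapter's own implementation. It assumes a deliberately small problem: one branch of length t separating two aligned DNA sequences, the Jukes-Cantor (JC69) substitution model, and an exponential prior on t. The function names, the prior rate, the tuning value, and the data (1000 sites, 100 differences) are all hypothetical choices made for the example.

```python
import math
import random


def log_posterior(t, n_sites, n_diff, prior_rate):
    """Unnormalized log posterior of branch length t (JC69, exponential prior)."""
    if t <= 0.0:
        return -math.inf
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e  # P(same nucleotide at both ends of the branch)
    p_diff = 0.25 - 0.25 * e  # P(one specific different nucleotide)
    log_lik = (n_sites - n_diff) * math.log(p_same) + n_diff * math.log(p_diff)
    log_prior = math.log(prior_rate) - prior_rate * t
    return log_lik + log_prior


def metropolis_hastings(n_sites, n_diff, n_iter=20000, burn_in=2000,
                        prior_rate=10.0, tuning=1.0, seed=1):
    """Sample branch lengths from the posterior with a multiplier proposal."""
    rng = random.Random(seed)
    t = 0.1  # arbitrary starting value (assumption)
    current = log_posterior(t, n_sites, n_diff, prior_rate)
    samples = []
    for i in range(n_iter):
        # Multiplier move: scale t by a random factor around 1.
        factor = math.exp(tuning * (rng.random() - 0.5))
        t_new = t * factor
        proposed = log_posterior(t_new, n_sites, n_diff, prior_rate)
        # The Hastings ratio of a multiplier move equals the factor itself.
        log_ratio = proposed - current + math.log(factor)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            t, current = t_new, proposed
        if i >= burn_in:
            samples.append(t)
    return samples


if __name__ == "__main__":
    draws = sorted(metropolis_hastings(n_sites=1000, n_diff=100))
    mean = sum(draws) / len(draws)
    lower = draws[int(0.025 * len(draws))]
    upper = draws[int(0.975 * len(draws))]
    print(f"posterior mean {mean:.4f}, 95% credible interval [{lower:.4f}, {upper:.4f}]")
```

The posterior mean and the 95% credible interval printed at the end are one simple way to summarize the sampled posterior distribution of a parameter. A full phylogenetic analysis would also sample tree topologies and other model parameters, which this sketch leaves out.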