生物
七鳃鳗科
树(集合论)
稳健性(进化)
系统发育树
贝叶斯定理
分歧(语言学)
贝叶斯概率
推论
统计
分子钟
进化生物学
基因
计算生物学
计算机科学
遗传学
人工智能
数学
萤火虫协议
语言学
哲学
数学分析
作者
Sebastian Hoehna,Sarah E. Lower,Pablo Duchen,Ana Catalán
标识
DOI:10.1093/sysbio/syae065
摘要
Abstract Genomic data has become ubiquitous in phylogenomic studies, including divergence time estimation, but provide new challenges. These challenges include, amongst others, biological gene tree discordance, methodological gene tree estimation error, and computational limitations on performing full Bayesian inference under complex models. In this study, we use a recently published firefly (Coleoptera: Lampyridae) anchored hybrid enrichment dataset (AHE; 436 loci for 88 Lampyridae species and 10 outgroup species) as a case study to explore gene tree estimation error and the robustness of divergence time estimation. First, we explored the amount of model violation using posterior predictive simulations because model violations are likely to bias phylogenetic inferences and produce gene tree estimation error. We specifically focused on missing data (either uniformly distributed or systematically) and the distribution of highly variable and conserved sites (either uniformly distributed or clustered). Our assessment of model adequacy showed that standard phylogenetic substitution models are not adequate for any of the 436 AHE loci. We tested if the model violations and alignment errors resulted indeed in gene tree estimation error by comparing the observed gene tree discordance to simulated gene tree discordance under the multispecies coalescent model. Thus, we show that the inferred gene tree discordance is not only due to biological mechanism but primarily due to inference errors. Lastly, we explored if divergence time estimation is robust despite the observed gene tree estimation error. We selected four subsets of the full AHE dataset, concatenated each subset and performed a Bayesian relaxed clock divergence estimation in RevBayes. The estimated divergence times overlapped for all nodes that are shared between the topologies. Thus, divergence time estimation is robust using any well selected data subset as long as the topology inference is robust.
科研通智能强力驱动
Strongly Powered by AbleSci AI