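Computer science
Population
Machine learning
Linear discriminant analysis
Bayes' theorem
Artificial intelligence
Naive Bayes classifier
Support vector machine
Decision tree
Random forest
Genetic algorithm
Principal component analysis
Data mining
Bayesian probability
Sociology
Demography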
Authors
Kuan-Yu Chen, Elizabeth A. Marschall, Michael G. Sovic, Anthony C Fries, H. Lisle Gibbs, Stuart A. Ludsin
Identifier
DOI:10.1111/2041-210x.12897
Abstract
The use of biomarkers (e.g., genetic, microchemical and morphometric characteristics) to discriminate among and assign individuals to a population can benefit species conservation and management by facilitating our ability to understand population structure and demography. Tools that can evaluate the reliability of large genomic datasets for population discrimination and assignment, as well as allow their integration with non-genetic markers for the same purpose, are lacking. Our R package, assignPOP, provides both functions in a supervised machine-learning framework. assignPOP uses Monte-Carlo and K-fold cross-validation procedures, as well as principal component analysis, to estimate assignment accuracy and membership probabilities, using training (i.e., baseline source population) and test (i.e., validation) datasets that are independent. A user can then build a specified predictive model based on the relative sizes of these datasets and classification functions, including linear discriminant analysis, support vector machine, naïve Bayes, decision tree and random forest. assignPOP can benefit any researcher who seeks to use genetic or non-genetic data to infer population structure and membership of individuals. assignPOP is a freely available R package under the GPL license, and can be downloaded from CRAN or at https://github.com/alexkychen/assignPOP. A comprehensive tutorial can also be found at https://alexkychen.github.io/assignPOP/.
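The two resampling schemes named in the abstract can be illustrated with a small, generic sketch. It is written in Python with the standard library only and is not assignPOP code or its API: the function names, the toy two-population data and the nearest-centroid classifier (a stand-in for the classifiers listed in the abstract) are all assumptions made for illustration. Monte-Carlo cross-validation repeatedly draws a random training subset and scores assignments on the remaining individuals; K-fold cross-validation splits the individuals into K groups and uses each group once as the test set.

```python
import random
from statistics import mean


def make_toy_data(rng, per_pop=30):
    """Synthetic illustration data: two populations, two numeric markers each."""
    features, labels = [], []
    for pop, centre in (("A", 0.0), ("B", 3.0)):
        for _ in range(per_pop):
            features.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            labels.append(pop)
    return features, labels


def fit_centroids(features, labels):
    """Hypothetical stand-in classifier: one mean vector per source population."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def assign(centroids, x):
    """Assign an individual to the population with the closest centroid."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda pop: sq_dist(centroids[pop]))


def accuracy(train_idx, test_idx, features, labels):
    """Fit on the training individuals, score assignments on the test ones."""
    centroids = fit_centroids([features[i] for i in train_idx],
                              [labels[i] for i in train_idx])
    hits = sum(assign(centroids, features[i]) == labels[i] for i in test_idx)
    return hits / len(test_idx)


def monte_carlo_cv(features, labels, train_fraction, iterations, rng):
    """Repeated random train/test splits; returns one accuracy per iteration."""
    n = len(labels)
    n_train = int(round(train_fraction * n))
    if not 0 < n_train < n:
        raise ValueError("train_fraction must leave both sets non-empty")
    scores = []
    for _ in range(iterations):
        idx = list(range(n))
        rng.shuffle(idx)
        scores.append(accuracy(idx[:n_train], idx[n_train:], features, labels))
    return scores


def k_fold_cv(features, labels, k, rng):
    """Each of the k folds serves once as the test set; returns k accuracies."""
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for f, test_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        scores.append(accuracy(train_idx, test_idx, features, labels))
    return scores


if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_toy_data(rng)
    print("Monte-Carlo mean accuracy:", mean(monte_carlo_cv(X, y, 0.7, 20, rng)))
    print("5-fold mean accuracy:", mean(k_fold_cv(X, y, 5, rng)))
```

In this sketch the test individuals never contribute to the fitted centroids, which mirrors the abstract's point that training and test datasets are kept independent. The splits here are drawn over all individuals at once; whether a real analysis should instead sample within each source population is a design choice this example does not address.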