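Lasso (programming language)
Overfitting
Algorithm
Tree (set theory)
Set (abstract data type)
Mathematics
Function (biology)
Node (physics)
Data set
Linear regression
Extrapolation
Computer science
Statistics
Combinatorics
Artificial intelligence
Physics
World Wide Web
Programming language
Biology
Evolutionary biology
Quantum mechanics
Artificial neural network
Identifier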
DOI:10.1107/s1600576720016751
Abstract
A new linear function for modelling the background in whole-powder-pattern fitting has been derived by applying LASSO (least absolute shrinkage and selection operator) and the technique of tree search. The background function (BGF) consists of terms $b_n^{L}(2\theta/180)^{-n/2}$ and $b_n^{H}(1 - 2\theta/180)^{-n/2}$ for the low- and high-angle sides, respectively. Some variable parameters of the BGF should be fixed at zero while others should be varied in order to find the best fit for a given data set without inducing overfitting. The LASSO algorithm can automatically select the variables in linear regression analysis. However, it finds the best-fit BGF with a set of adjustable parameters for a given data set while it derives a different set of parameters for a different data set. Thus, LASSO derives multiple solutions depending on the data set used. By regarding the individual solutions from LASSO as nodes of trees, tree structures were constructed from these solutions. The root node has the maximum number of adjustable parameters, $P$. $P$ decreases with descending levels of the tree one by one, and leaf nodes have just one parameter. By evaluating individual solutions (nodes) by their $\chi^2$ index, the best-fit single path from a root node to a leaf node was found. The present BGF can be used simply by varying $P$ in the range 1–10. The BGF thus derived as a final single solution was incorporated into computer programs for Pawley-based whole-powder-pattern decomposition and Rietveld refinement, and the performance of the BGF was tested in comparison with the polynomials currently widely used as the BGF. The present BGF has been demonstrated to be stable and to give an excellent fit, comparable to polynomials but with a smaller number of adjustable parameters and without introducing undulation into the calculated background curve. Basic algorithms used in statistics and machine learning have been demonstrated to be useful in developing an analytical model in X-ray crystallography.
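Because the BGF is linear in its coefficients, fitting it for a fixed choice of active terms reduces to ordinary weighted least squares. The following is a minimal Python sketch of that step only, not the authors' implementation: it does not perform the LASSO selection or the tree search. The function names, the particular orders n passed in, the unit weights, and the synthetic data are all assumptions made for illustration; the abstract does not state which orders n belong to each value of P.

```python
import numpy as np


def bgf_design_matrix(two_theta, low_orders, high_orders):
    """Build one column per active background term.

    Low-angle terms:  (2*theta/180) ** (-n/2)
    High-angle terms: (1 - 2*theta/180) ** (-n/2)
    The lists of orders n are hypothetical inputs chosen by the caller.
    """
    x = np.asarray(two_theta, dtype=float) / 180.0
    low = [x ** (-n / 2.0) for n in low_orders]
    high = [(1.0 - x) ** (-n / 2.0) for n in high_orders]
    return np.column_stack(low + high)


def fit_background(two_theta, y, sigma, low_orders, high_orders):
    """Weighted linear least-squares fit of the active coefficients.

    Returns the coefficients and a reduced chi-square value
    (weighted sum of squared residuals over degrees of freedom).
    """
    A = bgf_design_matrix(two_theta, low_orders, high_orders)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(sigma, dtype=float)
    # Scale rows by 1/sigma so that lstsq minimises the weighted residuals.
    coeffs, *_ = np.linalg.lstsq(A * w[:, None], y * w, rcond=None)
    resid = (y - A @ coeffs) * w
    dof = len(y) - A.shape[1]
    return coeffs, float(resid @ resid) / dof


if __name__ == "__main__":
    # Synthetic, noise-free background built from three assumed terms.
    two_theta = np.linspace(5.0, 140.0, 200)
    true_coeffs = np.array([30.0, 5.0, 12.0])
    A_true = bgf_design_matrix(two_theta, low_orders=(1, 2), high_orders=(1,))
    y = A_true @ true_coeffs
    sigma = np.ones_like(y)  # unit weights, an assumption for the demo

    coeffs, chi2 = fit_background(two_theta, y, sigma, (1, 2), (1,))
    print("fitted coefficients:", coeffs)
    print("reduced chi-square:", chi2)
```

In this sketch the number of adjustable parameters P is simply the total number of orders passed in, so comparing fits with different P amounts to calling `fit_background` with different order lists and comparing the returned chi-square values.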