Metrics for Benchmarking and Uncertainty Quantification: Quality, Applicability, and Best Practices for Machine Learning in Chemistry

Keywords: benchmarking, computer science, machine learning, quality, data science, management science, engineering, chemistry, epistemology, philosophy, economics, management
Authors
Gaurav Vishwakarma, Aditya Sonpal, Johannes Hachmann
Source
Journal: Trends in Chemistry [Elsevier]
Volume/Issue: 3 (2): 146-156; Citations: 49
Identifier
DOI: 10.1016/j.trechm.2020.12.004
Abstract

As machine learning (ML) is gaining an increasingly prominent role in chemical research, so is the need to assess the quality and applicability of ML models, compare different ML models, and develop best-practice guidelines for their design and utilization. Statistical loss function metrics and uncertainty quantification techniques are key issues in this context. Different analyses highlight different facets of a model's performance, and a compilation of metrics, as opposed to a single metric, allows for a well-rounded understanding of what can be expected from a model. They also allow us to identify unexplored regions of chemical space and pursue their survey. Metrics can thus make an important contribution to further democratize ML in chemistry; promote best practices; provide context to predictions and methodological developments; lend trust, legitimacy, and transparency to results from ML studies; and ultimately advance chemical domain knowledge. This review aims to draw attention to two issues of concern when we set out to make machine learning work in the chemical and materials domain, that is, statistical loss function metrics for the validation and benchmarking of data-derived models, and the uncertainty quantification of predictions made by them. They are often overlooked or underappreciated topics, as chemists typically have only limited training in statistics. Aside from helping to assess the quality, reliability, and applicability of a given model, these metrics are also key to comparing the performance of different models and thus for developing guidelines and best practices for the successful application of machine learning in chemistry.
Glossary

Binary cross-entropy: In a binary classification problem, each sample belongs to either one class or the other (i.e., it has a known probability of 1.0 for one class and 0.0 for the other). A classifier model can estimate the probability of a sample belonging to each class. The binary cross-entropy is used as a metric to assess the difference between the two probability distributions and thus the uncertainty of a classifier's prediction. (Also see cross-entropy, categorical cross-entropy, and log loss.)

Categorical cross-entropy: For multiclass classification problems, that is, for problems involving more than two categories (classes) of data, the cross-entropy measures the difference between the probability distribution of a sample belonging to one class and the probability distribution of that sample not belonging to that class (i.e., belonging to any of the other classes). (Also see binary cross-entropy.)

Cross-entropy: A measure of the difference between two probability distributions for a given set of samples. (Also see binary cross-entropy, categorical cross-entropy, and log loss.)

Evolutionary algorithm: A heuristic-based approach inspired by natural selection in biological processes (i.e., survival of the fittest). It is typically employed to tackle (combinatorial) optimization problems in which gradients (needed for gradient descent methods) are ill-defined (e.g., in problems involving discrete or categorical variables) or otherwise inaccessible. Each possible solution behaves as an individual in a population of solutions, and a fitness function (itself a loss function metric) is used to determine its quality. Evolutionary optimization of the population takes place via reproduction, mutation, crossover, and selection iterations.

Fitness function: A loss function metric that assesses the quality of a solution with respect to an objective of an optimization. Its output can be maximized or minimized (e.g., as part of an evolutionary algorithm).

Harmonic mean: One of multiple types of mean value metrics. Given a set of sample values, the harmonic mean is the inverse of the arithmetic mean of the inverses of the sample values.

Hyperparameters: In ML, hyperparameters are the parameters that define the structure of a model and control the learning process, as opposed to the parameters that are derived ('learned') from the data in the course of training the model.

Log loss: The negative logarithm of the likelihood of a set of observations given a model's parameters. While log loss and cross-entropy are not the same by definition, they calculate the same quantity when used as fitness functions; in practice, the two terms are thus often used interchangeably.

Loss function metrics: Statistical error metrics used to assess the performance of ML models and the quality of their predictions.

Principal component analysis: A technique to transform the feature basis in which a set of data is described into a basis that is adapted to the nature of the given data. The principal components are the eigenvectors of the covariance matrix of the data set.

Tanimoto similarity: A metric used to assess the similarity between the finite feature (e.g., descriptor, fingerprint) vectors of two samples. The similarity ranges from 0 to 1, with 0 indicating no point of intersection between the two vectors and 1 revealing completely identical vectors.
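Several of the metrics defined above can be computed in a few lines. The following is a minimal Python/NumPy sketch, not taken from the reviewed paper; the function names and example values are illustrative assumptions, and each function implements the glossary definition directly (binary cross-entropy over true labels and predicted probabilities, the harmonic mean as the inverse of the mean of inverses, and the Tanimoto similarity as intersection over union of binary fingerprint vectors).

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted
    class-1 probabilities; probabilities are clipped to avoid log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1 - y_true) * np.log(1 - y_prob)))

def harmonic_mean(values):
    """Inverse of the arithmetic mean of the inverses (values must be
    nonzero)."""
    values = np.asarray(values, dtype=float)
    return float(len(values) / np.sum(1.0 / values))

def tanimoto_similarity(a, b):
    """Tanimoto similarity of two binary fingerprint vectors:
    |intersection| / |union|, ranging from 0 to 1."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

# Confident predictions on correct labels give a low cross-entropy:
print(binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8]))
# Identical fingerprints give similarity 1.0, disjoint ones 0.0:
print(tanimoto_similarity([1, 0, 1, 1], [1, 0, 1, 1]))
print(tanimoto_similarity([1, 0], [0, 1]))
```

As a sanity check on the cross-entropy definition, pushing the predicted probabilities toward the wrong class (e.g., predicting 0.1 for a sample whose true label is 1) makes the metric grow without bound, which is why clipping is applied before taking logarithms.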