Metrics for Benchmarking and Uncertainty Quantification: Quality, Applicability, and Best Practices for Machine Learning in Chemistry

Authors
Gaurav Vishwakarma, Aditya Sonpal, Johannes Hachmann
Source
Journal: Trends in Chemistry [Elsevier BV]
Volume/Issue: 3 (2): 146-156 · Citations: 49
Identifier
DOI: 10.1016/j.trechm.2020.12.004
Abstract

As machine learning (ML) is gaining an increasingly prominent role in chemical research, so is the need to assess the quality and applicability of ML models, compare different ML models, and develop best-practice guidelines for their design and utilization. Statistical loss function metrics and uncertainty quantification techniques are key issues in this context. Different analyses highlight different facets of a model's performance, and a compilation of metrics, as opposed to a single metric, allows for a well-rounded understanding of what can be expected from a model. They also allow us to identify unexplored regions of chemical space and pursue their survey. Metrics can thus make an important contribution to further democratizing ML in chemistry; promote best practices; provide context to predictions and methodological developments; lend trust, legitimacy, and transparency to results from ML studies; and ultimately advance chemical domain knowledge. This review aims to draw attention to two issues of concern when we set out to make machine learning work in the chemical and materials domain: statistical loss function metrics for the validation and benchmarking of data-derived models, and the uncertainty quantification of predictions made by them. These are often overlooked or underappreciated topics, as chemists typically have only limited training in statistics. Aside from helping to assess the quality, reliability, and applicability of a given model, these metrics are also key to comparing the performance of different models and thus to developing guidelines and best practices for the successful application of machine learning in chemistry.
Glossary

Binary cross-entropy: in a binary classification problem, each sample belongs to either one class or the other (i.e., it has a known probability of 1.0 for one class and 0.0 for the other). A classifier model can estimate the probability of a sample belonging to each class. The binary cross-entropy is used as a metric to assess the difference between the two probability distributions and thus the uncertainty of a classifier's prediction. (Also see cross-entropy, categorical cross-entropy, and log loss.)

Categorical cross-entropy: for multiclass classification problems, that is, for problems involving more than two categories (classes) of data, the cross-entropy measures the difference between the probability distribution of a sample belonging to one class and the probability distribution of that sample not belonging to that class (i.e., belonging to any of the other classes). (Also see binary cross-entropy.)

Cross-entropy: a measure of the difference between two probability distributions for a given set of samples. (Also see binary cross-entropy, categorical cross-entropy, and log loss.)

Evolutionary algorithm: a heuristic-based approach inspired by natural selection in biological processes (i.e., survival of the fittest). It is typically employed to tackle (combinatorial) optimization problems in which gradients (needed for gradient descent methods) are ill-defined (e.g., in problems involving discrete or categorical variables) or otherwise inaccessible. Each possible solution behaves as an individual in a population of solutions, and a fitness function (itself a loss function metric) is used to determine its quality. Evolutionary optimization of the population takes place via reproduction, mutation, crossover, and selection iterations.

Fitness function: a loss function metric that assesses the quality of a solution with respect to an objective of an optimization. Its output can be maximized or minimized (e.g., as part of an evolutionary algorithm).

Harmonic mean: one of multiple types of mean value metrics. Given a set of sample values, the harmonic mean is the inverse of the arithmetic mean of the inverses of the sample values.

Hyperparameters: in ML, hyperparameters are the parameters that define the structure of a model and control the learning process, as opposed to other parameters that are derived ('learned') from the data in the course of training the model.

Log loss: the negative logarithm of the likelihood of a set of observations given a model's parameters. While log loss and cross-entropy are not the same by definition, they calculate the same quantity when used as fitness functions; in practice, the two terms are thus often used interchangeably.

Loss function metrics: statistical error metrics used to assess the performance of ML models and the quality of their predictions.

Principal component analysis: a technique to transform the feature basis in which a set of data is described into a basis that is adapted to the nature of the given data. The principal components are the eigenvectors of the covariance matrix of the data set.

Tanimoto similarity: a metric used to assess the similarity between the finite feature (e.g., descriptor, fingerprint) vectors of two samples. The similarity ranges from 0 to 1, with 0 indicating no point of intersection between the two vectors and 1 revealing completely identical vectors.
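The cross-entropy and log loss definitions above can be made concrete with a minimal sketch in plain Python. The function names and example probabilities here are illustrative, not from the review:

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-likelihood for binary labels (log loss)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def categorical_cross_entropy(y_true_onehot, y_prob, eps=1e-12):
    """Mean cross-entropy for one-hot-encoded multiclass labels."""
    total = 0.0
    for onehot, probs in zip(y_true_onehot, y_prob):
        total += -sum(y * math.log(min(max(p, eps), 1.0 - eps))
                      for y, p in zip(onehot, probs))
    return total / len(y_true_onehot)

# Confident, correct predictions give a small loss...
good = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
# ...while confident but wrong predictions are penalized heavily.
bad = binary_cross_entropy([1, 0, 1], [0.1, 0.9, 0.2])
```

The asymmetry between `good` and `bad` is what makes cross-entropy a useful measure of a classifier's predictive uncertainty, not just its accuracy.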
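As a toy illustration of the evolutionary loop described above (population, fitness function, crossover, mutation, selection), the following sketch maximizes the number of 1-bits in a bitstring. The problem and all names are illustrative assumptions, not taken from the review:

```python
import random

random.seed(42)  # reproducible toy run

def fitness(bits):
    """Fitness function: number of 1-bits (to be maximized)."""
    return sum(bits)

def evolve(n_bits=16, pop_size=20, generations=40, p_mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in pop)
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]
        # Crossover: single-point recombination of two random parents.
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with small probability.
            child = [1 - g if random.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children  # elitism: parents survive unchanged
    return initial_best, max(fitness(ind) for ind in pop)

start_best, final_best = evolve()
```

Because the fittest individuals survive each generation unchanged (elitism), the best fitness in the population can never decrease, which is why such heuristics work even when gradients are unavailable.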
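The harmonic mean definition (inverse of the arithmetic mean of the inverses) maps directly onto Python's standard library, as this small check shows:

```python
from statistics import harmonic_mean

values = [1.0, 4.0, 4.0]
# By definition: n divided by the sum of the inverses.
by_hand = len(values) / sum(1.0 / v for v in values)
by_stdlib = harmonic_mean(values)
```

The harmonic mean underlies composite metrics such as the F1 score, where it penalizes imbalance between the combined quantities more strongly than the arithmetic mean would.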
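The statement that principal components are the eigenvectors of the covariance matrix can be verified by hand in two dimensions. This sketch uses assumed example data (points lying exactly on the line y = x, not from the review) and the closed-form eigenvalues of a symmetric 2x2 matrix:

```python
import math

# Toy data: perfectly correlated points on the line y = x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 2.0, 3.0, 4.0]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# Sample covariance matrix [[a, b], [b, c]].
a = sum((x - mx) ** 2 for x in xs) / (n - 1)
c = sum((y - my) ** 2 for y in ys) / (n - 1)
b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix:
# lambda = (a + c)/2 +/- sqrt(((a - c)/2)^2 + b^2)
half_trace = (a + c) / 2
radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
lam1, lam2 = half_trace + radius, half_trace - radius
```

Here all of the variance falls on the first principal component (the (1, 1) direction) and the second eigenvalue is zero, because the data occupy a one-dimensional subspace; this is the sense in which the transformed basis is "adapted to the nature of the data".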
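For binary fingerprints, the Tanimoto similarity described above reduces to intersection over union of the set bits. A minimal sketch, using hypothetical fingerprints rather than any from the review:

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two binary fingerprints (0/1 sequences)."""
    on_a = {i for i, bit in enumerate(fp_a) if bit}
    on_b = {i for i, bit in enumerate(fp_b) if bit}
    union = on_a | on_b
    if not union:  # two all-zero fingerprints: treat as identical
        return 1.0
    return len(on_a & on_b) / len(union)

fp1 = [1, 0, 1, 1, 0, 0]
fp2 = [1, 0, 1, 0, 1, 0]
# Shared set bits {0, 2}; union {0, 2, 3, 4}
```

Identical fingerprints give 1.0 and fingerprints with no shared set bits give 0.0, matching the 0-to-1 range described in the glossary entry.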