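Computer science
Quality (philosophy)
Profiling (computer programming)
Data collection
Analytics
Annotation
Training set
Machine learning
Data quality
Artificial intelligence
Data analysis
Data science
Data mining
Engineering
Operating system
Epistemology
Philosophy
Metric (unit)
Statistics
Mathematics
Operations management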
Authors
Nitin Gupta, Shashank Mujumdar, Hima Patel, Satoshi Masuda, Naveen Panwar, Sambaran Bandyopadhyay, Sameep Mehta, Shanmukha Guttula, Shazia Afzal, Ruhi Sharma Mittal, Vitobha Munigala
Source
Journal: Knowledge Discovery and Data Mining
Date: 2021-08-12
Volume/Issue: 4040-4041
Citations: 57
Identifier
DOI: 10.1145/3447548.3470817
Abstract
The quality of training data has a huge impact on the efficiency, accuracy, and complexity of machine learning tasks. Data remains susceptible to errors or irregularities that may be introduced during the collection, aggregation, or annotation stages. This necessitates profiling and assessing data to understand its suitability for machine learning tasks; failure to do so can result in inaccurate analytics and unreliable decisions. While researchers and practitioners have focused on improving the quality of models, there have been limited efforts towards improving data quality.