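Data science
Leverage (statistics)
Computer science
Scientific discovery
Data discovery
Domain (mathematical analysis)
Artificial intelligence
World Wide Web
Cognitive science
Metadata
Psychology
Mathematics
Mathematical analysis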
Authors
Anuj Karpatne, Gowtham Atluri, James H. Faghmous, Michael Steinbach, Arindam Banerjee, Auroop R. Ganguly, Shashi Shekhar, Nagiza Samatova, Vipin Kumar
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2017-06-29
Volume (issue): pages: 29 (10): 2318-2331
Citations: 473
Identifiers
DOI: 10.1109/tkde.2017.2720168
Abstract
Data science models, although successful in a number of commercial domains, have had limited applicability in scientific problems involving complex physical phenomena. Theory-guided data science (TGDS) is an emerging paradigm that aims to leverage the wealth of scientific knowledge for improving the effectiveness of data science models in enabling scientific discovery. The overarching vision of TGDS is to introduce scientific consistency as an essential component for learning generalizable models. Further, by producing scientifically interpretable models, TGDS aims to advance our scientific understanding by discovering novel domain insights. Indeed, the paradigm of TGDS has started to gain prominence in a number of scientific disciplines such as turbulence modeling, material discovery, quantum chemistry, bio-medical science, bio-marker discovery, climate science, and hydrology. In this paper, we formally conceptualize the paradigm of TGDS and present a taxonomy of research themes in TGDS. We describe several approaches for integrating domain knowledge in different research themes using illustrative examples from different disciplines. We also highlight some of the promising avenues of novel research for realizing the full potential of theory-guided data science.