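Kinship
Computer science
Knowledge extraction
Artificial intelligence
Benchmark (surveying)
Set (abstract data type)
Data science
Natural language processing
K-optimal pattern discovery
Domain (mathematics)
Data mining
Association rule learning
Task (project management)
Machine learning
Mathematics
Engineering
Geography
Geodesy
Systems engineering
Political science
Pure mathematics
Law
Programming language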
Authors
Yue Yangming, Chunxiao Li, Chen YeZeng, Zijie Dai, Yi Zhou
Identifier
DOI:10.1109/bdai59165.2023.10257043
Abstract
This study advances rule mining, a subfield of knowledge discovery, by introducing a synthetic dataset, the Kinship 10K Dataset. The dataset is purpose-built for natural language rule mining and derives from the relationship networks of 20 simulated families comprising 1,500 unique characters. It is produced with generative techniques and yields a rich array of kinship rules, each grounded in one of eight foundational meta kinship relations. The final dataset comprises 10,526 relationship instances, 234 distinct kinship relations, and 104 learnable rules. We also introduce two evaluation metrics, Rule Coverage (RC) and Directed Rule Mining Capability (DRMC), for examining rule mining algorithms in closed domains. RC quantifies the inclusiveness of rule mining datasets, while DRMC gives a finer-grained analysis of how well an algorithm discerns and extracts precise rules, taking accuracy and precision into account. We set a benchmark using the GPT-3.5 and GPT-4 models as baselines. GPT-4 attained scores of 0.78 on RC and 0.35 on DRMC, which underscores the difficulty of the task and the value of further research. By introducing a new dataset, formulating new evaluation metrics, and establishing baseline results, this work highlights the prospects for deeper insight and increased automation in knowledge discovery and sets the stage for further advances in rule mining research.
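To make the idea of a kinship rule concrete, the sketch below shows one plausible shape such a rule could take: a composition of two relations that implies a third. It is an illustration only, not the dataset's actual schema or generation procedure. The names, the triple layout, the two rules, and the `apply_rules` helper are all hypothetical.

```python
from itertools import product

# Hypothetical facts as (head, relation, tail), read as
# "head is the <relation> of tail". Not taken from the dataset.
facts = {
    ("Alice", "mother", "Bob"),
    ("Bob", "father", "Carol"),
    ("Dan", "father", "Alice"),
}

# Hypothetical composition rules: (r1, r2) -> r3 means
# "if X is r1 of Y and Y is r2 of Z, then X is r3 of Z".
rules = {
    ("mother", "father"): "paternal grandmother",
    ("father", "mother"): "maternal grandfather",
}

def apply_rules(facts, rules):
    """Derive new triples by chaining every pair of facts that share a middle person."""
    derived = set()
    for (x, r1, y1), (y2, r2, z) in product(facts, repeat=2):
        if y1 == y2 and (r1, r2) in rules:
            derived.add((x, rules[(r1, r2)], z))
    return derived

derived = apply_rules(facts, rules)
for triple in sorted(derived):
    print(triple)
```

A rule mining algorithm works in the opposite direction: given only triples like these, it tries to recover the composition table.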