Sentence
Computer science
Relation extraction
Feature (linguistics)
Artificial intelligence
Noise (video)
Relation (database)
Representation (politics)
Natural language processing
Bag-of-words model
Machine learning
Pattern recognition (psychology)
Information extraction
Data mining
Philosophy
Law
Image (mathematics)
Politics
Linguistics
Political science
Identification
DOI:10.1016/j.eswa.2023.119727
Abstract
Distant supervision employs external knowledge bases to automatically label corpora. The labeled sentences in a corpus are usually packaged into bags and trained for relation extraction under a multi-instance learning paradigm. Automated distant supervision inevitably introduces label noise. Previous studies that used sentence-level attention mechanisms to de-noise considered neither correlations among sentences in a bag nor correlations among bags. As a result, a large amount of effective supervision information is lost, which degrades the performance of the learned relation extraction models. Moreover, these methods ignore the lack of feature information in few-sentence bags (especially one-sentence bags). To address these issues, this paper proposes hierarchical attention-based networks that can de-noise at both the sentence and bag levels. In the calculation of the bag representation, we weight sentence representations using sentence-level attention that considers correlations among the sentences in each bag. Then, we employ bag-level attention to combine similar bags by considering their correlations, which enhances the features of target bags with poor feature information and provides more appropriate weights in the calculation of the bag-group representation. Both sentence-level and bag-level attention make full use of the supervision information to improve model performance. The proposed method was compared with nine state-of-the-art methods on the New York Times datasets and the Google IISc Distant Supervision dataset; the experimental results show its conspicuous advantages in relation extraction tasks.
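The two attention levels described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes dot-product scoring against a single relation query vector, whereas the paper additionally models correlations among sentences and among bags; all names here are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def bag_representation(sentences, query):
    """Sentence-level attention: weight the sentence vectors inside one bag.

    sentences: (n_sentences, dim) encoded sentences of a bag
    query:     (dim,) relation query vector (a learned parameter in practice)
    """
    scores = sentences @ query            # (n_sentences,)
    weights = softmax(scores)             # attention weights over sentences
    return weights @ sentences, weights   # (dim,) bag vector, (n_sentences,)

def group_representation(bags, query):
    """Bag-level attention: combine bags labeled with the same relation,
    so that few-sentence bags borrow features from similar bags."""
    bag_vecs = np.stack([bag_representation(b, query)[0] for b in bags])
    weights = softmax(bag_vecs @ query)   # attention weights over bags
    return weights @ bag_vecs, weights    # (dim,) group vector, (n_bags,)

rng = np.random.default_rng(0)
dim = 8
query = rng.normal(size=dim)
# Three bags of different sizes, including a one-sentence bag.
bags = [rng.normal(size=(n, dim)) for n in (1, 3, 5)]
group_vec, bag_weights = group_representation(bags, query)
print(group_vec.shape, bag_weights.round(3))
```

In a full model the sentence encoder, the query vectors, and both attention layers would be trained jointly, with one query per relation type.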