Computer science
Leverage (statistics)
Multi-label classification
Artificial intelligence
Language model
Set (abstract data type)
Natural language processing
Machine learning
Programming language
Authors
Rui Song, Zelong Liu, Xingbing Chen, Haining An, Zhiqi Zhang, Xiaoguang Wang, Hao Xu
Identifiers
DOI: 10.1007/s10489-022-03896-4
Abstract
Multi-label text classification has attracted wide attention from researchers because of its value in practical applications. One of its key challenges is how to extract and leverage the correlations among labels, which are difficult to model directly in a complex and unknown label space. In this paper, we propose a Label Prompt Multi-label Text Classification model (LP-MTC), inspired by the prompt-learning paradigm of pre-trained language models. Specifically, we design a set of templates for multi-label text classification, integrate the labels into the input of the pre-trained language model, and jointly optimize the model with Masked Language Modeling (MLM). In this way, self-attention can capture both the correlations among labels and the semantic information between labels and text, which effectively improves model performance. Extensive empirical experiments on multiple datasets demonstrate the effectiveness of our method: compared with BERT, LP-MTC improves micro-F1 by 3.4% on average over the four public datasets.
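As a concrete illustration of the prompt idea described in the abstract, the sketch below shows one plausible way to cast multi-label classification as cloze-style MLM prediction: each candidate label is appended to the input text as a short template ending in a [MASK] token, and the masked language model's preference for "yes" versus "no" at that position decides whether the label is assigned. This is a minimal sketch assuming the Hugging Face transformers library and bert-base-uncased; the template wording, the yes/no verbalizer, the `predict_labels` helper, and the threshold are illustrative assumptions rather than the authors' exact design, and LP-MTC additionally fine-tunes the model jointly with the MLM objective instead of using it zero-shot as done here.

```python
# Minimal sketch of prompt-based multi-label classification with an MLM:
# every candidate label becomes a cloze question whose [MASK] position is
# read off as a yes/no decision. Assumes `transformers` and bert-base-uncased.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def predict_labels(text, candidate_labels, threshold=0.5):
    """Return the subset of candidate_labels the MLM answers 'yes' for."""
    # One cloze question per label, all sharing the same input sequence so
    # that self-attention can relate each label to the text and to the
    # other labels.
    prompt = text + " " + " ".join(
        f"is the text about {label}? {tokenizer.mask_token}"
        for label in candidate_labels
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]  # (seq_len, vocab_size)

    # Positions of the [MASK] tokens, one per (non-truncated) label.
    mask_positions = (
        inputs["input_ids"][0] == tokenizer.mask_token_id
    ).nonzero(as_tuple=True)[0]
    yes_id = tokenizer.convert_tokens_to_ids("yes")
    no_id = tokenizer.convert_tokens_to_ids("no")

    predicted = []
    for label, pos in zip(candidate_labels, mask_positions):
        # Normalize over the two verbalizer tokens and threshold P(yes).
        probs = torch.softmax(logits[pos, [yes_id, no_id]], dim=-1)
        if probs[0] > threshold:
            predicted.append(label)
    return predicted

print(predict_labels(
    "The new GPU accelerates training of deep neural networks.",
    ["hardware", "machine learning", "cooking"],
))
```

Because all label templates share one input sequence, self-attention can attend across the labels as well as between labels and text, which is the mechanism the abstract credits for capturing label correlations.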