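Computer science
Class (philosophy)
Multi-label classification
Benchmark (surveying)
Artificial intelligence
Space (punctuation)
Set (abstract data type)
Feature (linguistics)
Function (biology)
Feature vector
Machine learning
Distribution (mathematics)
Data mining
Mathematics
Mathematical analysis
Linguistics
Philosophy
Geodesy
Evolutionary biology
Biology
Programming language
Geography
Operating system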
Authors
Xingyu Zhao, Yuexuan An, Ning Xu, Xin Geng
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-01
Volume/Issue: pages 1-15
Citations: 3
Identifier
DOI:10.1109/tkde.2023.3323401
Abstract
Multi-label text classification (MLTC) is the problem of tagging a given document with the most relevant subset of labels. One of the biggest challenges for MLTC is class imbalance. Most advanced MLTC models suffer from this issue, which limits their performance. In this paper, we propose a model-agnostic framework named variational continuous label distribution learning (VCLDL) to address this problem. VCLDL theoretically builds a correspondence between the feature space and the label space to mine the information hidden in the observable logical labels. Specifically, VCLDL regards the label distribution as a continuous density function in latent space and forms a flexible variational approach that approximates the density function of the labels with the collaboration of the feature space. Combined with VCLDL, MLTC models can pay more attention to the distribution of the whole label set rather than only to the specific labels with maximum response values, so the class imbalance problem can be effectively overcome. Experimental results on multiple benchmark datasets demonstrate that VCLDL brings significant performance improvements over existing MLTC models.
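To make the idea of a variational objective over a latent label representation more concrete, the following is a minimal sketch of a generic VAE-style loss for multi-label data. It is not the VCLDL algorithm from the paper, whose details are not given in the abstract. The function name `variational_label_loss`, the linear encoder and decoder weights (`w_mu`, `w_logvar`, `w_dec`), the Gaussian latent with a standard normal prior, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def variational_label_loss(features, logical_labels, w_mu, w_logvar, w_dec, rng):
    """Generic VAE-style loss (illustrative assumption, not the paper's method).

    features:       (n_docs, n_features) document representations
    logical_labels: (n_docs, n_labels) binary 0/1 label matrix
    """
    # Hypothetical linear encoder: features -> Gaussian latent parameters.
    mu = features @ w_mu
    logvar = features @ w_logvar

    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps

    # Hypothetical linear decoder: latent sample -> per-label probabilities.
    probs = np.clip(sigmoid(z @ w_dec), 1e-7, 1.0 - 1e-7)

    # Reconstruction term: binary cross-entropy against the logical labels.
    bce = -np.mean(
        np.sum(
            logical_labels * np.log(probs)
            + (1.0 - logical_labels) * np.log(1.0 - probs),
            axis=1,
        )
    )

    # KL divergence between N(mu, sigma^2) and the standard normal prior.
    kl = np.mean(0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1))

    return bce + kl, probs


rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8))  # toy document features

# Toy logical labels: label 0 is frequent, labels 1 and 2 are rare.
logical_labels = np.array(
    [[1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 0, 1]], dtype=float
)

w_mu = 0.1 * rng.standard_normal((8, 2))
w_logvar = 0.1 * rng.standard_normal((8, 2))
w_dec = 0.1 * rng.standard_normal((2, 3))

loss, probs = variational_label_loss(
    features, logical_labels, w_mu, w_logvar, w_dec, rng
)

# Per-label frequency in the toy data, showing the class imbalance.
label_frequency = logical_labels.mean(axis=0)
```

In this toy setup the decoder outputs a probability for every label, so the loss depends on the whole label set rather than only on the highest-scoring label. Whether and how VCLDL uses such terms is not stated in the abstract.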