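Cardinality (data modeling)
Multi-label classification
Computer science
Submodular set function
Artificial intelligence
Integer (computer science)
Machine learning
Symbol
Consistency (knowledge base)
Mathematics
Data mining
Mathematical optimization
Arithmetic
Programming language
Authors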
Ningzhao Sun,Tingjin Luo,Wenzhang Zhuge,Hong Tao,Chenping Hou,Dewen Hu
Source
Journal: IEEE Transactions on Knowledge and Data Engineering
[Institute of Electrical and Electronics Engineers]
Date: 2021-04-29
Volume (issue), pages: 35 (1): 877-890
Citations: 1
Identifier
DOI:10.1109/tkde.2021.3076457
Abstract
Label scarcity is a common and significant challenge in traditional supervised learning. Semi-supervised learning (SSL) leverages unlabeled samples to compensate for the missing label information. Like annotations, the label proportion is another type of prior information, and it plays a significant role in classification tasks. Label proportions are also easier to obtain than labels. For example, only a small number of patients in a hospital database have a confirmed cancer or non-cancer diagnosis, while the proportion of patients with cancer can generally be estimated from historical records. How to incorporate this prior information is crucial but rarely studied in the literature. Traditional SSL methods often ignore it, which inevitably degrades performance. To solve this problem, we propose a novel method, SSL with Label Proportion (SSLLP). Our approach encourages the preservation of label consistency and label proportion by imposing cardinality bound constraints. The resulting problem is equivalent to a mixed-integer constrained submodular minimization, which is difficult to solve directly. We therefore transform the original problem into a convex one via the Lovász extension and design an efficient algorithm to solve it. Extensive experiments show that our method outperforms several state-of-the-art methods.
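The abstract mentions convexifying a submodular problem through the Lovász extension. As a rough illustration of that general idea only (not the authors' SSLLP algorithm, whose objective and constraints are not given here), the sketch below evaluates the Lovász extension of a set function with the standard sort-and-accumulate formula. The graph-cut function, the weight matrix, and all function names are assumptions chosen for the example; graph cut is used because it is a well-known submodular function whose Lovász extension equals the weighted total variation.

```python
def cut_value(weights, subset):
    """Graph-cut set function F(S): total weight of edges leaving S.

    `weights` is a symmetric adjacency matrix (list of lists); this
    particular F is an assumption made for illustration.
    """
    inside = set(subset)
    n = len(weights)
    return sum(weights[i][j] for i in inside for j in range(n) if j not in inside)


def lovasz_extension(set_function, x):
    """Evaluate the Lovász extension of `set_function` at the real vector x.

    Sort coordinates in decreasing order, grow the set one element at a
    time, and weight each marginal gain by the corresponding coordinate.
    Assumes F(empty set) = 0.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])
    value, previous, chosen = 0.0, 0.0, []
    for i in order:
        chosen.append(i)
        current = set_function(chosen)
        value += x[i] * (current - previous)
        previous = current
    return value


def total_variation(weights, x):
    """Weighted total variation: sum over edges of w_ij * |x_i - x_j|."""
    n = len(x)
    return sum(weights[i][j] * abs(x[i] - x[j])
               for i in range(n) for j in range(i + 1, n))


# Hypothetical 3-node path graph with edge weights 1 and 2.
W = [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]]
x = [0.9, 0.2, 0.5]

cut = lambda subset: cut_value(W, subset)
# The two quantities agree up to floating-point error.
print(lovasz_extension(cut, x), total_variation(W, x))
```

On a 0/1 indicator vector the extension reproduces the set function itself, which is why minimizing the convex extension over the unit cube is a relaxation of the original combinatorial problem.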