Authors
Zhiyong Yang,Qianqian Xu,Shanhu Bao,Peisong Wen,Yuan He,Xiaochun Cao,Qingming Huang
Identifier
DOI:10.1109/tpami.2023.3303943
Abstract
The Area Under the ROC Curve (AUC) is a crucial metric for machine learning and is often a reasonable choice for applications such as disease prediction and fraud detection, where datasets typically exhibit a long-tail nature. However, most existing AUC-oriented learning methods assume that the training and test data are drawn from the same distribution; how to deal with domain shift remains widely open. This paper presents an early attempt at AUC-oriented Unsupervised Domain Adaptation (UDA), hereafter denoted AUCUDA. Specifically, we first construct a generalization bound that exploits a new distributional discrepancy for AUC. The critical challenge is that the AUC risk cannot be expressed as a sum of independent loss terms, making standard theoretical techniques unavailable. We propose a new result that not only addresses this interdependency but also yields a much sharper bound under weaker assumptions on the loss function. Turning theory into practice, the original discrepancy requires complete annotations on the target domain, which is incompatible with UDA. To fix this, we propose a pseudo-labeling strategy and present an end-to-end training framework. Finally, empirical studies on five real-world datasets demonstrate the efficacy of our framework.
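To illustrate the interdependency issue the abstract refers to, here is a minimal sketch (not the paper's method; function and variable names are illustrative) of the standard empirical AUC. It is a *pairwise* average over (positive, negative) score pairs, so each 0-1 loss term couples two samples and every sample appears in many terms, which is why AUC cannot be written as a sum of independent per-sample losses:

```python
def empirical_auc(pos_scores, neg_scores):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly,
    counting ties as half-correct."""
    n_pairs = len(pos_scores) * len(neg_scores)
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores   # each positive appears in len(neg_scores) terms,
        for n in neg_scores   # so the pairwise loss terms are not independent
    )
    return correct / n_pairs

# A perfectly ranked sample gives AUC = 1.0:
# empirical_auc([0.9, 0.8], [0.1, 0.2]) -> 1.0
```

A standard i.i.d. concentration argument averages independent per-sample losses; because these pairwise terms share samples, that argument does not apply directly, which motivates the paper's alternative bounding technique.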