Keywords
Covariate, Minimax, Mathematics, Nonparametric statistics, Marginal distribution, Transmission (computing), Sample size determination, Distribution (mathematics), Transfer learning, Classifier (UML), Joint probability distribution, Probability distribution, Statistics, Sample (material), Econometrics, Artificial intelligence, Computer science, Mathematical optimization, Random variable, Mathematical analysis, Chromatography, Parallel computing, Chemistry
Authors
Samory Kpotufe, Guillaume Martinet
Abstract
Transfer learning addresses common situations in machine learning where little or no labeled data is available for a target prediction problem (corresponding to a distribution Q), but much labeled data is available from some related but different data distribution P. This work is concerned with the fundamental limits of transfer, that is, the limits on target performance in terms of (1) sample sizes from P and Q, and (2) differences between the data distributions P and Q. In particular, we aim to address practical questions such as how much target data from Q is sufficient given a certain amount of related data from P, and how to optimally sample such target data for labeling. We present new minimax results for transfer in nonparametric classification (i.e., for situations where little is known about the target classifier), under the common assumption that the marginal distributions of covariates differ between P and Q (often termed covariate shift). Our results are the first to concisely capture the relative benefits of source and target labeled data in these settings through information-theoretic limits. Namely, we show that the benefits of target labels are tightly controlled by a transfer-exponent γ that encodes how singular Q is locally with respect to P, and, interestingly, they paint a more favorable picture of transfer than might be believed from insights in previous work. In fact, while previous work relies largely on refinements of traditional metrics and divergences between distributions, and often yields only a coarse view of when transfer is possible or not, our analysis in terms of γ reveals a continuum of new regimes ranging from easy to hard transfer. We then address the practical question of how to efficiently sample target data for labeling, by showing that a recently proposed semi-supervised procedure, based on k-NN classification, can be refined to adapt to unknown γ and therefore requests target labels only when beneficial, while achieving nearly minimax-optimal transfer rates without knowledge of distributional parameters. Of independent interest, we obtain new minimax-optimality results for vanilla k-NN classification in regimes with nonuniform marginals.
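To make the abstract's central quantity a little more concrete: a transfer-exponent condition of the following flavor compares the local mass that the source marginal P_X places around target points with the mass placed by the target marginal Q_X. This is a schematic paraphrase for illustration; the constant C_γ, the radius bound r_0, and the exact form are our assumptions, not the paper's verbatim definition.

```latex
% Schematic transfer-exponent condition (illustrative paraphrase only):
% gamma measures how thin the source marginal P_X is, relative to the
% target marginal Q_X, in small balls around points of the target support.
\[
  P_X\big(B(x, r)\big) \;\ge\; C_\gamma \, r^{\gamma} \, Q_X\big(B(x, r)\big),
  \qquad \forall\, x \in \operatorname{supp}(Q_X),\ \forall\, r \in (0, r_0].
\]
% Small gamma: the marginals are locally comparable and source labels
% transfer well; large gamma: Q is locally singular w.r.t. P, so target
% labels become essential.
```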
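As a concrete baseline for the k-NN setting the abstract discusses, the sketch below simulates covariate shift (same conditional of Y given X, different marginals over X) and compares vanilla k-NN trained on source data alone against k-NN trained on pooled source plus a few target labels. This is a minimal illustration of the setting only, not the paper's adaptive, label-requesting procedure; the distributions, noise level, and k are hypothetical choices.

```python
# Minimal covariate-shift sketch (illustration only, not the paper's method):
# vanilla k-NN on source-only vs. pooled source+target labels, tested on Q.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def sample(n, mean, sd):
    """Covariates X ~ N(mean, sd); labels Y = 1{x > 1}, flipped w.p. 0.1."""
    x = rng.normal(mean, sd, size=(n, 1))
    y = (x[:, 0] > 1.0).astype(int)
    flip = rng.random(n) < 0.1
    return x, np.where(flip, 1 - y, y)

# Same conditional Y|X for P and Q; only the X-marginals differ.
Xp, yp = sample(2000, mean=0.0, sd=1.0)    # many labeled source points (P)
Xq, yq = sample(50, mean=2.0, sd=0.5)      # few labeled target points (Q)
Xte, yte = sample(5000, mean=2.0, sd=0.5)  # target test set (Q)

for name, X, y in [("source only", Xp, yp),
                   ("pooled P+Q",
                    np.vstack([Xp, Xq]), np.concatenate([yp, yq]))]:
    clf = KNeighborsClassifier(n_neighbors=15).fit(X, y)
    print(f"{name:12s} target error: {1.0 - clf.score(Xte, yte):.3f}")
```

In this toy setup the target region x ≈ 2 is sparsely covered by P, so even 50 target labels typically reduce the target error noticeably, which is the kind of source/target trade-off the paper quantifies via γ.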