Keywords
Mixture model, Metric (data warehouse), Cluster analysis, Benchmark (surveying), Mathematics, Domain (mathematical analysis), Mixture distribution, Wasserstein metric, Probability distribution, Probability measure, Adaptation (eye), Computer science, Artificial intelligence, Mathematical optimization, Statistics, Applied mathematics, Probability density function, Data mining, Mathematical analysis, Physics, Optics, Geodesy, Geography
Authors
Yogesh Balaji, Rama Chellappa, Soheil Feizi
Source
Journal: Cornell University - arXiv
Date: 2019-01-01
Citations: 8
Identifier
DOI: 10.48550/arxiv.1902.00415
Abstract
Understanding proper distance measures between distributions is at the core of several learning tasks such as generative models, domain adaptation, clustering, etc. In this work, we focus on mixture distributions that arise naturally in several application domains where the data contains different sub-populations. For mixture distributions, established distance measures such as the Wasserstein distance do not take into account imbalanced mixture proportions. Thus, even if two mixture distributions have identical mixture components but different mixture proportions, the Wasserstein distance between them will be large. This often leads to undesired results in distance-based learning methods for mixture distributions. In this paper, we resolve this issue by introducing the Normalized Wasserstein measure. The key idea is to introduce mixture proportions as optimization variables, effectively normalizing mixture proportions in the Wasserstein formulation. Using the proposed normalized Wasserstein measure leads to significant performance gains for mixture distributions with imbalanced mixture proportions compared to the vanilla Wasserstein distance. We demonstrate the effectiveness of the proposed measure in GANs, domain adaptation, and adversarial clustering on several benchmark datasets.
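The effect described in the abstract can be illustrated with a small one-dimensional toy experiment: two mixtures share identical components but have very different mixture proportions, so the vanilla Wasserstein distance between them is large, while optimizing over a shared mixture proportion and re-weighting both sides accordingly drives the distance toward zero. The sketch below is a rough illustration of the normalization idea only, not the paper's actual formulation (which jointly learns generator components and mixture proportions); it uses SciPy's one-dimensional wasserstein_distance, and the helpers sample_mixture and normalized_wasserstein_toy are hypothetical names introduced here for illustration.

import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
MEANS = np.array([0.0, 5.0])  # shared mixture components N(0, 1) and N(5, 1)

def sample_mixture(proportions, n=20000):
    """Draw n samples from a two-component Gaussian mixture; also return component labels."""
    labels = rng.choice(2, size=n, p=proportions)
    return rng.normal(MEANS[labels], 1.0), labels

x, x_labels = sample_mixture([0.9, 0.1])   # P: mostly the first component
y, y_labels = sample_mixture([0.1, 0.9])   # Q: mostly the second component

# Vanilla 1-Wasserstein distance: large, driven almost entirely by the proportion imbalance.
print("Wasserstein distance:        ", wasserstein_distance(x, y))

def normalized_wasserstein_toy(x, x_labels, y, y_labels, grid=np.linspace(0.01, 0.99, 99)):
    """Grid-search a shared proportion pi, re-weight both sample sets so their
    components carry mass (pi, 1 - pi), and return the smallest re-weighted distance."""
    best = np.inf
    for pi in grid:
        target = np.array([pi, 1.0 - pi])
        wx = target[x_labels] / np.bincount(x_labels, minlength=2)[x_labels]
        wy = target[y_labels] / np.bincount(y_labels, minlength=2)[y_labels]
        best = min(best, wasserstein_distance(x, y, u_weights=wx, v_weights=wy))
    return best

print("Normalized Wasserstein (toy):", normalized_wasserstein_toy(x, x_labels, y, y_labels))

On this toy data the vanilla distance comes out large (on the order of the gap between the component means), whereas the proportion-optimized, re-weighted distance is close to zero, which is the behavior the abstract describes for mixtures with identical components but imbalanced proportions.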