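Computer science
Maximum a posteriori estimation
Artificial intelligence
Channel (broadcasting)
Context (archaeology)
Hyperparameter
NIST
Pattern recognition (psychology)
Bayesian probability
Prior probability
A priori and a posteriori
Set (abstract data type)
Probabilistic logic
Speech recognition
Maximum likelihood
Mathematics
Statistics
Philosophy
Paleontology
Computer network
Epistemology
Biology
Programming language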
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
[Institute of Electrical and Electronics Engineers]
Date: 2021-11-26
Volume/Issue: 30: 414-428
Citations: 3
Identifier
DOI:10.1109/taslp.2021.3130980
Abstract
This paper presents a Bayesian framework for estimating a Probabilistic Linear Discriminant Analysis (PLDA) model in the presence of noisy labels. True class labels are interpreted as latent random variables, which are transmitted through a noisy channel, and received as observed speaker labels. The labeling process is modeled as a Discrete Memoryless Channel (DMC). PLDA hyperparameters are interpreted as random variables, and their joint posterior distribution is derived using mean-field Variational Bayes, allowing maximum a posteriori (MAP) estimates of the PLDA model parameters to be determined. The proposed solution, referred to as VB-MAP, is presented as a general framework, but is studied in the context of speaker verification, and a variety of use cases are discussed. Specifically, VB-MAP can be used for PLDA estimation with unreliable labels, unsupervised PLDA estimation, and to infer the reliability of a PLDA training set. Experimental results show the proposed approach to provide significant performance improvements on a variety of NIST Speaker Recognition Evaluation (SRE) tasks, both for data sets with simulated mislabels, and for data sets with naturally occurring missing or unreliable labels.
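The noisy-label model described in the abstract can be illustrated with a minimal sketch. It is not the paper's VB-MAP procedure and involves no PLDA model: it only simulates true labels passing through a discrete memoryless channel and applies Bayes' rule to recover a posterior over the true label from an observed one. The symmetric channel, the flip probability of 0.2, the uniform prior, and all function names are assumptions made for illustration.

```python
import random


def symmetric_channel(num_classes, flip_prob):
    """Row i holds P(observed = j | true = i) for an assumed symmetric DMC."""
    off_diagonal = flip_prob / (num_classes - 1)
    return [
        [1.0 - flip_prob if i == j else off_diagonal for j in range(num_classes)]
        for i in range(num_classes)
    ]


def transmit(true_labels, channel, rng):
    """Corrupt each label independently of the others (the memoryless property)."""
    classes = list(range(len(channel)))
    return [rng.choices(classes, weights=channel[t])[0] for t in true_labels]


def label_posterior(observed, channel, prior):
    """P(true = i | observed) via Bayes' rule, using only the channel and a prior."""
    joint = [prior[i] * channel[i][observed] for i in range(len(prior))]
    total = sum(joint)
    return [p / total for p in joint]


# Hypothetical setup: 3 speakers, 20% chance that a label is wrong.
channel = symmetric_channel(num_classes=3, flip_prob=0.2)
rng = random.Random(0)
true_labels = [0, 1, 2, 0, 1, 2]
observed_labels = transmit(true_labels, channel, rng)

# Belief about the true speaker when the observed label is 0 and the prior is uniform.
posterior = label_posterior(observed=0, channel=channel, prior=[1 / 3, 1 / 3, 1 / 3])
```

In a full system, the prior in `label_posterior` would be replaced by evidence from the acoustic model, so that the observed label and the data jointly determine which true label is most plausible.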