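Computer science
Adversarial system
Normalization (sociology)
Speech recognition
Generalization
Consistency (knowledge base)
Artificial intelligence
Speaker recognition
Speaker verification
Set (abstract data type)
Field (mathematics)
Mathematical analysis
Mathematics
Sociology
Anthropology
Pure mathematics
Programming language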
Authors
Jie Lian,Pingyuan Lin,Yuxing Dai,Guilin Li
Source
Journal: Springer International Publishing eBooks
[Springer Nature]
Date: 2022-01-01
Volume/issue: 569-578
Identifier
DOI: 10.1007/978-3-031-13829-4_49
Abstract
Recently, non-parallel voice conversion (VC) has attracted the attention of many researchers in the speech field. However, such models are limited by how well they generalize when extracting speaker information from unseen speakers. In this paper, we propose a novel zero-shot VC approach that performs VC while relying only on the speakers in the training set. To achieve this, we disentangle the speaker and content representations with instance normalization, and then use adversarial learning to encourage the model to produce more similar converted results. The experimental results demonstrate that our approach can achieve arbitrary voice conversion without any supervision.
Keywords: Voice conversion; Zero-shot learning; Adversarial learning; Disentangle representations
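The abstract names instance normalization as the mechanism for separating speaker from content information. As a rough illustration only, and not the authors' implementation, the sketch below normalizes each feature channel of a single utterance over time. The function names, the `(channels, frames)` layout, and the reading of per-channel statistics as coarse speaker-like information are all assumptions made for this example.

```python
import numpy as np


def instance_norm(features, eps=1e-5):
    """Normalize each channel of one utterance over the time axis.

    features: array of shape (channels, frames), e.g. a mel-spectrogram
    (this layout is an assumption of the sketch).
    Returns the normalized features plus the per-channel mean and std
    that were removed.
    """
    mean = features.mean(axis=1, keepdims=True)
    std = np.sqrt(features.var(axis=1, keepdims=True) + eps)
    return (features - mean) / std, mean, std


def apply_stats(normalized, mean, std):
    """Re-apply per-channel statistics (hypothetical helper for this sketch)."""
    return normalized * std + mean


# Toy usage with random stand-ins for two utterances of different lengths.
rng = np.random.default_rng(0)
source = rng.normal(loc=1.0, scale=2.0, size=(80, 120))
target = rng.normal(loc=-0.5, scale=0.7, size=(80, 95))

content_like, _, _ = instance_norm(source)       # source with its statistics removed
_, target_mean, target_std = instance_norm(target)
mixed = apply_stats(content_like, target_mean, target_std)
print(mixed.shape)
```

In this toy form, `mixed` keeps the temporal pattern of `source` while carrying the channel statistics of `target`. A real system would apply the normalization inside learned encoder and decoder layers and add the adversarial training objective the abstract mentions.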