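Computer science
Speech recognition
Spurious relationship
Dimensionality reduction
Speaker recognition
Speaker diarisation
Noise (video)
Reduction (mathematics)
Speaker verification
Noise reduction
Curse of dimensionality
Range (aeronautics)
Pattern recognition (psychology)
Artificial intelligence
Machine learning
Mathematics
Engineering
Geometry
Image (mathematics)
Aerospace engineering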
Authors
You Jin Kim,Hee-Soo Heo,Jee-weon Jung,Youngki Kwon,Bong‐Jin Lee,Joon Son Chung
Identifier
DOI:10.1109/icassp49357.2023.10095530
Abstract
The objective of this work is to train noise-robust speaker embeddings adapted for speaker diarisation. Speaker embeddings play a crucial role in the performance of diarisation systems, but they often capture spurious information such as noise, which adversely affects performance. Our previous work proposed an auto-encoder-based dimensionality reduction module to help remove the redundant information. However, this module does not explicitly separate such information and has also been found to be sensitive to hyper-parameter values. To overcome these issues, we make two contributions: (i) a novel dimensionality reduction framework that can disentangle spurious information from the speaker embeddings; (ii) the use of a speech activity vector to prevent the speaker code from representing the background noise. Through a range of experiments conducted on four datasets, our approach consistently demonstrates state-of-the-art performance among models without system fusion.
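The idea of a dimensionality-reducing auto-encoder whose bottleneck is divided into a speaker part and a spurious part can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SplitBottleneckAutoEncoder`, the linear layers, the dimensions (8, 3, 2), and the random untrained weights are all assumptions made for illustration. The abstract does not say how the speech activity vector enters the model, so it is left out here.

```python
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained; for illustration only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SplitBottleneckAutoEncoder:
    """Hypothetical linear auto-encoder with a two-part bottleneck."""

    def __init__(self, embed_dim, speaker_dim, noise_dim, seed=0):
        rng = random.Random(seed)
        self.speaker_dim = speaker_dim
        code_dim = speaker_dim + noise_dim
        self.enc = make_matrix(code_dim, embed_dim, rng)
        self.dec = make_matrix(embed_dim, code_dim, rng)

    def encode(self, embedding):
        """Project the embedding and split it into (speaker_code, noise_code)."""
        code = matvec(self.enc, embedding)
        return code[: self.speaker_dim], code[self.speaker_dim:]

    def decode(self, speaker_code, noise_code):
        """Reconstruct the embedding from both codes concatenated."""
        return matvec(self.dec, speaker_code + noise_code)


def reconstruction_loss(original, reconstructed):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


# Assumed toy sizes: an 8-dim embedding reduced to a 3-dim speaker code
# plus a 2-dim code meant to absorb spurious information such as noise.
model = SplitBottleneckAutoEncoder(embed_dim=8, speaker_dim=3, noise_dim=2)
embedding = [0.5, -1.0, 0.25, 0.0, 1.5, -0.75, 0.1, 0.9]
speaker_code, noise_code = model.encode(embedding)
reconstructed = model.decode(speaker_code, noise_code)
loss = reconstruction_loss(embedding, reconstructed)
```

In a setup like this, only `speaker_code` would be handed to the downstream clustering stage, while the reconstruction loss is computed from both codes so that spurious information has somewhere else to go.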