Computer science
Similarity (geometry)
Artificial intelligence
Machine learning
Transformer
Supervised learning
Representation (politics)
Feature learning
Natural language processing
Correlation
Artificial neural network
Mathematics
Image (mathematics)
Law
Voltage
Physics
Geometry
Politics
Quantum mechanics
Political science
Authors
Yu-An Chung, Yonatan Belinkov, James Glass
Identifier
DOI:10.1109/icassp39728.2021.9414321
Abstract
Self-supervised speech representation learning has recently been a prosperous research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of speech tasks have also been investigated. However, there has been little research focusing on understanding the properties of existing approaches. In this work, we aim to provide a comparative study of some of the most representative self-supervised algorithms. Specifically, we quantify the similarities between different self-supervised representations using existing similarity measures. We also design probing tasks to study the correlation between the models’ pre-training loss and the amount of specific speech information contained in their learned representations. In addition to showing how various self-supervised models behave differently given the same input, our study also finds that the training objective has a higher impact on representation similarity than architectural choices such as building blocks (RNN/Transformer/CNN) and directionality (uni/bidirectional). Our results also suggest that there exists a strong correlation between pre-training loss and downstream performance for some self-supervised algorithms.
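The abstract mentions quantifying similarities between representations with "existing similarity measures" without naming them here. As a non-authoritative illustration of what such a measure computes, the sketch below implements linear CKA (centered kernel alignment), one widely used way to compare two sets of frame-level representations; whether CKA is among the measures used in this paper is an assumption, and the function names and toy data are ours, not the authors'.

```python
import numpy as np

def center(gram):
    # Center a Gram matrix: H K H with H = I - (1/n) * ones
    n = gram.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ gram @ h

def linear_cka(x, y):
    """Linear CKA between two representation matrices.

    x: (n_frames, d1) features from one model for n_frames inputs
    y: (n_frames, d2) features from another model for the same inputs
    Returns a scalar in [0, 1]; higher means more similar representations.
    """
    kx = center(x @ x.T)
    ky = center(y @ y.T)
    hsic = np.sum(kx * ky)                               # tr(Kx Ky), an HSIC estimate up to a constant
    norm = np.sqrt(np.sum(kx * kx) * np.sum(ky * ky))
    return hsic / norm

# Toy usage (illustrative data only): 200 frames of 256-dim "representations".
rng = np.random.default_rng(0)
a = rng.normal(size=(200, 256))
q, _ = np.linalg.qr(rng.normal(size=(256, 256)))        # a random orthogonal matrix
b = a @ q                                                # an orthogonal rotation of a
print(linear_cka(a, b))
```

On this toy input the score is 1.0 because linear CKA is invariant to orthogonal transformations of the feature space; comparing layers from two genuinely different pre-trained models would typically yield an intermediate value.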