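Computer science
Task (project management)
Artificial intelligence
Scarcity
Training set
Measure (data warehouse)
Machine learning
Natural language processing
Language model
Speech recognition
Data mining
Economics
Microeconomics
Management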
Authors
Yukai Wan, Yuqi Shi, Binghuai Lin, Yanlu Xie
Identifier
DOI:10.1109/icassp48485.2024.10447007
Abstract
Most current mispronunciation detection and diagnosis (MD&D) methods rely on manually annotated data for model training. However, annotating mispronunciations produced by second language (L2) learners is costly, so data scarcity is a significant challenge in MD&D tasks. In this paper, we employ model-agnostic meta-learning (MAML) to train a phoneme recognition model for MD&D. We conduct experiments with varied meta-learning task partitioning and training strategies to endow the model with the ability to adapt rapidly to unfamiliar speakers. Our best-performing method achieves an F-measure of 61.45%, surpassing two related approaches that also aim to address data scarcity in MD&D: fine-tuning the pre-trained wav2vec 2.0 model, and incorporating reference text during training. Notably, even with few-shot fine-tuning, our model still achieves remarkable F-measure results, which suggests that meta-learning is indeed effective in MD&D tasks.
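The abstract does not describe the implementation, so the following is only a toy sketch of the meta-learning idea it names. It assumes a first-order approximation of MAML (not full second-order MAML) on a one-parameter regression problem, where each "task" loosely stands in for one speaker; it is not the authors' phoneme recognition model, and every name in it (`loss_and_grad`, `adapt`, `fomaml_train`, `make_task`) is hypothetical.

```python
import random

def loss_and_grad(w, data):
    """Mean squared error of the model y = w * x, and its gradient w.r.t. w."""
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2 * (w * x - y) * x for x, y in data) / n
    return loss, grad

def adapt(w, support, inner_lr=0.1, steps=1):
    """Inner loop: a few gradient steps on one task's support set."""
    for _ in range(steps):
        _, g = loss_and_grad(w, support)
        w = w - inner_lr * g
    return w

def fomaml_train(tasks, w0=0.0, inner_lr=0.1, outer_lr=0.05, epochs=200):
    """Outer loop: move the shared initialisation along the query-set
    gradient taken at the adapted weights (first-order approximation)."""
    w = w0
    for _ in range(epochs):
        meta_grad = 0.0
        for support, query in tasks:
            w_task = adapt(w, support, inner_lr)
            _, g = loss_and_grad(w_task, query)
            meta_grad += g
        w = w - outer_lr * meta_grad / len(tasks)
    return w

def make_task(slope, rng, n=10):
    """Hypothetical task: noiseless points on y = slope * x, split into
    a support set (for adaptation) and a query set (for evaluation)."""
    pts = [(x, slope * x) for x in (rng.uniform(-1, 1) for _ in range(2 * n))]
    return pts[:n], pts[n:]

rng = random.Random(0)
tasks = [make_task(s, rng) for s in (1.0, 2.0, 3.0)]
w_meta = fomaml_train(tasks)
```

Under these assumptions, adapting `w_meta` to an unseen task with `adapt(w_meta, support, steps=5)` plays the role of the few-shot fine-tuning mentioned in the abstract: a handful of support examples from a new "speaker" are enough to lower the query loss from the shared starting point.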