Computer science
Language model
Artificial intelligence
Natural language processing
Feature (linguistics)
Resource (disambiguation)
Reuse
Speech recognition
Hidden Markov model
Linguistics
Engineering
Computer network
Philosophy
Waste management
Authors
Jiahao Yang, Jianguo Wei, Kuntharrgyal Khysru, Junhai Xu, Wenhuan Lu, Wenjun Ke, Xiaokang Yang
Identifier
DOI:10.1109/apsipaasc58517.2023.10317230
Abstract
Tibetan is a distinctive and culturally rich language spoken by millions of people across the Tibetan Plateau and surrounding regions. Exploring the application of speech recognition technology to Tibetan has special significance for preserving language diversity and fostering cultural integration. Moreover, Tibetan comprises a multitude of distinct dialects, which presents a hurdle for reusing speech recognition models. In low-resource dialect tasks, conventional approaches transfer well-trained models from linguistically akin languages to the target language. However, recent studies have shown that indiscriminate fine-tuning of all parameters may disrupt the feature extractor of the pre-trained model, leading to catastrophic forgetting. This paper introduces an innovative fine-tuning method grounded in model adaptation. Aimed at training automatic speech recognition (ASR) models under the constraints of limited training data and cross-dialect transfer, our approach refines only a select group of language-specific parameters, leading to robust performance. These parameters are identified by a sparse binary mask of the same shape as the model, so no additional parameters are introduced. Experiments conducted on two downstream low-resource Tibetan languages show that the proposed methodology outperforms both traditional fine-tuning and adapter-based fine-tuning.
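The core idea described in the abstract, updating only a sparse, mask-selected subset of parameters instead of the whole pre-trained model, can be illustrated with a short sketch. The following is a minimal example assuming a PyTorch model; the selection criterion used here (gradient magnitude on a small batch of target-dialect data), the `keep_ratio` value, and the placeholder names `model`, `batch`, `loss_fn`, and `optimizer` are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn


def build_sparse_mask(model: nn.Module, keep_ratio: float = 0.05) -> dict:
    """Build a binary mask with the same shape as each parameter tensor.

    Call this after one backward pass on target-dialect data: parameters
    are ranked by gradient magnitude and the top `keep_ratio` fraction is
    marked trainable (mask = True). No extra parameters are added; the
    mask only flags which existing entries may change.
    """
    masks = {}
    for name, param in model.named_parameters():
        if param.grad is None:
            masks[name] = torch.zeros_like(param, dtype=torch.bool)
            continue
        scores = param.grad.abs().flatten()
        k = max(1, int(keep_ratio * scores.numel()))
        threshold = torch.topk(scores, k).values.min()
        masks[name] = param.grad.abs() >= threshold
    return masks


def masked_fine_tune_step(model, masks, batch, loss_fn, optimizer):
    """One fine-tuning step that updates only the masked parameters.

    Gradients of all unmasked entries are zeroed before the optimizer
    step, so the bulk of the pre-trained feature extractor is left
    untouched, which is the mechanism intended to limit forgetting.
    """
    optimizer.zero_grad()
    loss = loss_fn(model(batch["inputs"]), batch["targets"])
    loss.backward()
    with torch.no_grad():
        for name, param in model.named_parameters():
            if param.grad is not None:
                param.grad.mul_(masks[name].to(param.grad.dtype))
    optimizer.step()
    return loss.item()
```

In this sketch the mask is computed once and then reused for every step on the low-resource dialect, which mirrors the abstract's point that the language-specific parameters are a fixed, sparse subset of the original model rather than an added adapter module.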