Granularity
Shot
Computer science
Artificial intelligence
Training
Natural language processing
One-shot
Computer vision
Engineering
Geography
Materials science
Programming language
Mechanical engineering
Meteorology
Metallurgy
Authors
Ruikun Luo, Guanhuan Huang, Xiaojun Quan
Identifier
DOI: 10.18653/v1/2021.findings-acl.151
Abstract
The major paradigm of applying a pre-trained language model to downstream tasks is to fine-tune it on labeled task data, which often suffers from instability and low performance when labeled examples are scarce. One way to alleviate this problem is to apply post-training on unlabeled task data before fine-tuning, adapting the pre-trained model to the target domain through contrastive learning that considers either token-level or sequence-level similarity. Inspired by the success of sequence masking, we argue that both token-level and sequence-level similarities can be captured with a pair of masked sequences. We therefore propose complementary random masking (CRM) to generate a pair of masked sequences from an input sequence for sequence-level contrastive learning, and then develop contrastive masked language modeling (CMLM) for post-training to integrate both token-level and sequence-level contrastive learning. Empirical results show that CMLM surpasses several recent post-training methods in few-shot settings without the need for data augmentation.
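As a rough illustration of the pairing idea in the abstract, the Python sketch below generates two masked views of a sequence whose masked positions are disjoint, so each selected position is masked in exactly one view. This is an illustrative reading of CRM, not the paper's released code: the function name, mask token, masking ratio, and the exact masking policy are assumptions. In CMLM, such a pair would act as positive views for a sequence-level contrastive objective while each view still supports token-level masked language modeling.

```python
import random

MASK_TOKEN = "[MASK]"  # placeholder; the real mask token comes from the tokenizer

def complementary_random_masking(tokens, mask_ratio=0.5, seed=None):
    """Return two masked copies of `tokens` with complementary masked positions:
    each position is masked in exactly one of the two views.
    (Illustrative sketch of CRM; consult the paper for the exact procedure.)"""
    rng = random.Random(seed)
    view_a, view_b = list(tokens), list(tokens)
    for i in range(len(tokens)):
        # Assumption: every position is eligible for masking; a real
        # implementation would skip special tokens such as [CLS] and [SEP].
        if rng.random() < mask_ratio:
            view_a[i] = MASK_TOKEN  # masked in the first view only
        else:
            view_b[i] = MASK_TOKEN  # masked in the second view only
    return view_a, view_b

# Example: no position is masked in both views.
a, b = complementary_random_masking(
    ["the", "model", "adapts", "to", "the", "target", "domain"], seed=0
)
print(a)
print(b)
```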