Transformer
Computer science
Transfer learning
Artificial intelligence
Downstream (manufacturing)
Natural language processing
Language model
Question answering
Machine learning
Engineering
Operations management
Electrical engineering
Voltage
Authors
Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, S. Sangeetha
Source
Journal: Cornell University - arXiv
Date: 2021-01-01
Citations: 112
Identifier
DOI: 10.48550/arxiv.2108.05542
Abstract
Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on top of transformers, self-supervised learning and transfer learning. Transformer-based PTLMs learn universal language representations from large volumes of text data using self-supervised learning and transfer this knowledge to downstream tasks. These models provide good background knowledge to downstream tasks, which avoids training downstream models from scratch. In this comprehensive survey paper, we initially give a brief overview of self-supervised learning. Next, we explain various core concepts such as pretraining, pretraining methods, pretraining tasks, embeddings and downstream adaptation methods. Next, we present a new taxonomy of T-PTLMs and then give a brief overview of various benchmarks, both intrinsic and extrinsic. We present a summary of various useful libraries for working with T-PTLMs. Finally, we highlight some of the future research directions that will further improve these models. We strongly believe that this comprehensive survey paper will serve as a good reference to learn the core concepts as well as to stay updated with recent developments in T-PTLMs.
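
As a concrete illustration of the pretrain-then-adapt workflow the abstract describes, the following minimal sketch loads a pretrained checkpoint and fine-tunes it on a small downstream classification batch. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint purely as examples; the survey reviews several such libraries, and any T-PTLM could be substituted.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load a T-PTLM whose weights already encode universal language
# representations learned via self-supervised pretraining on large corpora.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new head for a 2-class downstream task
)

# Downstream adaptation: instead of training a model from scratch, fine-tune
# the pretrained weights on task-specific labelled examples.
batch = tokenizer(
    ["the movie was great", "the plot made no sense"],  # hypothetical examples
    padding=True, truncation=True, return_tensors="pt",
)
labels = torch.tensor([1, 0])  # hypothetical sentiment labels

outputs = model(**batch, labels=labels)
outputs.loss.backward()  # one gradient step of fine-tuning (optimizer omitted)
print(outputs.logits.shape)  # torch.Size([2, 2])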