Keywords: Transformer, Reinforcement learning, Computer science, Artificial intelligence, Deep learning, Machine learning
Authors
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
Source
Venue: arXiv (Cornell University)
Date: 2020-09
Citations: 226
Identifier
DOI: 10.48550/arxiv.2009.06732
Abstract
Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and reinforcement learning. In the field of natural language processing for example, Transformers have become an indispensable staple in the modern deep learning stack. Recently, a dizzying number of "X-former" models have been proposed - Reformer, Linformer, Performer, Longformer, to name a few - which improve upon the original Transformer architecture, many of which make improvements around computational and memory efficiency. With the aim of helping the avid researcher navigate this flurry, this paper characterizes a large and thoughtful selection of recent efficiency-flavored "X-former" models, providing an organized and comprehensive overview of existing work and models across multiple domains.
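The efficiency improvements the abstract alludes to mostly target the attention mechanism's quadratic cost in sequence length. As a minimal sketch (not the survey's own code), the NumPy snippet below contrasts standard O(n^2) scaled dot-product attention with a Linformer-style low-rank variant; the projection matrix E is a random stand-in for Linformer's learned projection, and all names and sizes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(Q, K, V):
    """Standard scaled dot-product attention: the n x n score matrix
    costs O(n^2) time and memory in the sequence length n."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)   # (n, n)
    return softmax(scores) @ V      # (n, d)

def linformer_style_attention(Q, K, V, E):
    """Linformer-style attention: keys and values are projected along the
    sequence axis down to k << n rows, so scores are only n x k (O(n*k)).
    E is a random stand-in here; Linformer learns this projection."""
    d = Q.shape[-1]
    K_proj, V_proj = E @ K, E @ V    # (k, d) each
    scores = Q @ K_proj.T / np.sqrt(d)  # (n, k)
    return softmax(scores) @ V_proj     # (n, d)

rng = np.random.default_rng(0)
n, d, k = 512, 64, 32  # sequence length, head dim, projected length (illustrative)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
E = rng.standard_normal((k, n)) / np.sqrt(n)  # hypothetical projection matrix

print(full_attention(Q, K, V).shape)                # (512, 64)
print(linformer_style_attention(Q, K, V, E).shape)  # (512, 64)
```

Both calls return an (n, d) output, but the second materializes an n x k score matrix instead of n x n, which is the kind of computational and memory saving the surveyed "X-former" models pursue by different routes (low-rank projection, locality-sensitive hashing, kernel approximations, sliding windows, and so on).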