Authors
Huixin Sun, Yunhao Wang, Xiaodi Wang, Bin Zhang, Ying Xin, Baochang Zhang, Xianbin Cao, Errui Ding, Shumin Han
Source
Journal: Neurocomputing [Elsevier]
Date: 2024-05-10
Volume: 595, Article 127828
Citations: 1
Identifier
DOI: 10.1016/j.neucom.2024.127828
Abstract
Vision Transformer and its variants have demonstrated great potential in various computer vision tasks. However, conventional vision transformers often model global dependency only at a coarse level, which makes it hard to learn both global relationships and fine-grained representations at the token level. In this paper, we introduce Multi-scale Attention Fusion into the transformer (MAFormer), which explores local aggregation and global feature extraction in a dual-stream framework for visual recognition. We develop a simple but effective module that exploits the full potential of transformers for visual representation by learning fine-grained and coarse-grained features at the token level and dynamically fusing them. Our Multi-scale Attention Fusion (MAF) block consists of: i) a local window attention branch that learns short-range interactions within windows, aggregating fine-grained local features; ii) global feature extraction through a novel Global Learning with Down-sampling (GLD) operation that efficiently captures long-range context across the whole image; iii) a fusion module that learns, via attention, how to integrate the two feature streams. MAFormer achieves state-of-the-art results on several common vision tasks. In particular, MAFormer-L achieves 85.9% Top-1 accuracy on ImageNet, surpassing CSWin-B and LV-ViT-L by 1.7% and 0.6%, respectively. On MSCOCO, MAFormer outperforms the prior art CSWin by 1.7% mAP on object detection and 1.4% on instance segmentation with a similar parameter budget. With this performance, MAFormer demonstrates the ability to generalize across various visual benchmarks and shows promise as a general backbone for different self-supervised pre-training tasks in the future.
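The dual-stream design described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative toy, not the paper's implementation: the function names (`maf_block`, `global_attention_downsampled`), the 1-D token layout, the strided down-sampling used to stand in for the GLD operation, and the per-token softmax gate used as a stand-in for the attention-based fusion module are all simplifying assumptions made here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention over the last two axes.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def window_attention(x, window):
    # Local branch: self-attention restricted to non-overlapping windows,
    # aggregating fine-grained short-range interactions.
    n, c = x.shape
    xw = x.reshape(n // window, window, c)   # (num_windows, window, C)
    out = attention(xw, xw, xw)
    return out.reshape(n, c)

def global_attention_downsampled(x, stride):
    # Global branch (GLD-style sketch): every token queries a strided
    # down-sampling of the sequence, capturing long-range context at
    # reduced cost (keys/values shrink by `stride`).
    q = x[None]                  # (1, N, C)
    kv = x[::stride][None]       # (1, N/stride, C)
    return attention(q, kv, kv)[0]

def maf_block(x, window=4, stride=4):
    # Dual-stream fusion: a per-token softmax gate mixes the local and
    # global streams (a simplified stand-in for the paper's fusion module).
    local_out = window_attention(x, window)
    global_out = global_attention_downsampled(x, stride)
    gate = softmax(np.stack([local_out.sum(-1), global_out.sum(-1)], axis=-1))
    return gate[..., :1] * local_out + gate[..., 1:] * global_out

tokens = np.random.default_rng(0).normal(size=(16, 8))  # 16 tokens, 8 channels
fused = maf_block(tokens)
print(fused.shape)  # (16, 8): same token grid, features fused across scales
```

The key efficiency point the sketch shows: the local branch's cost is quadratic only in the window size, while the global branch attends to a down-sampled key/value set, so neither stream pays full quadratic cost over all tokens.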