Combining CNN and Transformer as Encoder to Improve End-to-End Handwritten Mathematical Expression Recognition Accuracy

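Computer science; Transformer; Encoder; Convolutional neural network; Artificial intelligence; Pattern recognition (psychology); Speech recognition; Voltage; Quantum mechanics; Operating system; Physics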

Authors

Zhang Zhang, Yibo Zhang

Source

Journal: Lecture Notes in Computer Science; Date: 2022-01-01; Pages: 185-197; Citations: 3

Identifier

DOI: 10.1007/978-3-031-21648-0_13

Abstract

Attention-based encoder-decoder (AED) models are increasingly used in handwritten mathematical expression recognition (HMER) tasks. Given the recent success of Transformers in computer vision and the variety of attempts to combine Transformers with convolutional neural networks (CNNs), in this paper we study three ways of leveraging Transformer and CNN designs to improve AED-based HMER models: 1) the Tandem way, which feeds CNN-extracted features to a Transformer encoder to capture global dependencies; 2) the Parallel way, which adds a Transformer encoder branch that takes raw image patches as input and concatenates its output with the CNN's output as the final feature; 3) the Mixing way, which replaces the convolution layers of the CNN's last stage with multi-head self-attention (MHSA). We compared these three methods on the CROHME benchmark. On CROHME 2016 and 2019, the Tandem way attained ExpRates of 54.85% and 58.56%, respectively; the Parallel way attained 55.63% and 57.39%; and the Mixing way achieved 53.93% and 55.64%. These results indicate that the Parallel and Tandem ways perform better than the Mixing way and differ little from each other.
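The comparison stated at the end of the abstract can be checked with a little arithmetic. The sketch below is only an illustration: the ExpRate values are the ones quoted in the abstract, while the helper names (`mean_exp_rate`, `gap`) and the unweighted mean over the two test sets are assumptions introduced here, not something the paper reports.

```python
# ExpRate values (%) exactly as quoted in the abstract.
EXP_RATE = {
    "Tandem":   {"CROHME 2016": 54.85, "CROHME 2019": 58.56},
    "Parallel": {"CROHME 2016": 55.63, "CROHME 2019": 57.39},
    "Mixing":   {"CROHME 2016": 53.93, "CROHME 2019": 55.64},
}


def mean_exp_rate(method):
    """Unweighted mean ExpRate over the two test sets.

    Assumption: a plain average is used only as a rough summary;
    the abstract itself reports no averaged figure.
    """
    scores = list(EXP_RATE[method].values())
    return sum(scores) / len(scores)


def gap(method_a, method_b, benchmark):
    """ExpRate of method_a minus ExpRate of method_b, in percentage points."""
    return EXP_RATE[method_a][benchmark] - EXP_RATE[method_b][benchmark]


if __name__ == "__main__":
    for method in EXP_RATE:
        print(f"{method:<8} mean ExpRate: {mean_exp_rate(method):.2f}%")

    pairs = [("Tandem", "Parallel"), ("Tandem", "Mixing"), ("Parallel", "Mixing")]
    for benchmark in ("CROHME 2016", "CROHME 2019"):
        for a, b in pairs:
            print(f"{benchmark}: {a} - {b} = {gap(a, b, benchmark):+.2f} points")
```

On these figures, Tandem and Parallel trade places between the two test sets (Parallel is ahead by 0.78 points on CROHME 2016, Tandem by 1.17 points on CROHME 2019), while both stay ahead of Mixing by between 0.92 and 2.92 points, which is consistent with the abstract's conclusion.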
