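Computer science
Symbol (formal)
Context (archaeology)
Artificial intelligence
Decoding methods
Process (computing)
Expression (computer science)
Encoder
Pattern recognition (psychology)
Relation (database)
Pixel
Machine learning
Data mining
Algorithm
Paleontology
Biology
Programming language
Operating system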
Authors
Yingnan Fu, Wenyuan Cai, Ming Gao, Aoying Zhou
Identifier
DOI:10.1145/3591106.3592259
Abstract
Most recent handwritten mathematical expression recognition (HMER) methods adopt an attention-based encoder-decoder framework that generates LaTeX sequences from input images. However, the accuracy of the attention mechanism limits the performance of HMER models, and the lack of global context information during decoding is a further challenge. Some methods use symbol-level counting to localize symbols and improve performance, but they do not work well. In this paper, we propose SLAN, short for Symbol Location-Aware Network, to address the HMER problem. Specifically, we propose an advanced relation-level counting method to detect symbols in the image. We address the lack of global context with a new global context-aware decoder. To improve attention accuracy, we design a novel attention alignment loss function based on a dynamic programming algorithm, which learns attention alignment directly without pixel-level labels. We conducted extensive experiments on the CROHME dataset to demonstrate the effectiveness of each component of SLAN and achieved state-of-the-art performance.
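For readers unfamiliar with the attention-based encoder-decoder framework the abstract refers to, the following is a minimal sketch of a single attention step. It is not SLAN's decoder or its alignment loss. The function name `attention_step`, the dot-product scoring, and the toy vectors are all assumptions made for illustration; HMER models often use other scoring functions.

```python
import math

def attention_step(query, features):
    """One generic attention step (illustrative, not the paper's method).

    query:    decoder state, a list of d floats (hypothetical input)
    features: encoder feature vectors, one per image location, each of length d
    Returns (weights, context).
    """
    # Score each image location by its dot product with the decoder state.
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in features]
    # Numerically stable softmax turns scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the attention-weighted sum of the features.
    dim = len(query)
    context = [sum(w * feat[i] for w, feat in zip(weights, features))
               for i in range(dim)]
    return weights, context

# Toy example: two image locations, two-dimensional features.
weights, context = attention_step([1.0, 0.0], [[10.0, 0.0], [0.0, 10.0]])
```

In this toy case nearly all of the weight falls on the first location, because its feature vector matches the decoder state. A decoder would use the resulting context vector to predict the next LaTeX token, so misplaced weights feed it the wrong image region, which is the limitation the abstract describes.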