Authors
Nam Tuan Ly, Atsuhiro Takasu
Identifier
DOI:10.1007/978-3-031-41679-8_2
Abstract
Recently, due to the rapid development of deep learning, and of the Transformer in particular, many Transformer-based methods have been studied and proven to be very powerful for table recognition. However, Transformer-based models usually struggle to process big tables due to the limitation of their global attention mechanism. In this paper, we propose a local attention mechanism to address this limitation. We also present an end-to-end local attention-based model for recognizing both table structure and table cell content from a table image. The proposed model consists of four main components: an encoder for feature extraction, and three decoders for the three sub-tasks of the table recognition problem. In the experiments, we evaluate the performance of the proposed model and the effectiveness of the local attention mechanism on two large-scale datasets: PubTabNet and FinTabNet. The experimental results show that the proposed model outperforms state-of-the-art methods on all benchmark datasets. Furthermore, we demonstrate the effectiveness of the local attention mechanism for table recognition, especially for big tables.
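The abstract contrasts global attention, where every query position attends to the full sequence, with a local attention mechanism that restricts each position to a nearby window. The paper's exact formulation is not given here, so the following is a minimal NumPy sketch of generic windowed local attention (the window size `window` and the distance-based mask are illustrative assumptions, not the authors' design):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(Q, K, V, window=2):
    """Windowed self-attention: each query position attends only to key
    positions within `window` steps of itself, rather than the whole
    sequence as in global attention. Q, K, V have shape (n, d)."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                     # (n, n) scaled dot products
    idx = np.arange(n)
    # Mask out pairs farther apart than the window before the softmax.
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[mask] = -np.inf                            # excluded -> zero weight
    return softmax(scores, axis=-1) @ V               # (n, d)

# Toy usage: 6 positions, 4-dimensional heads.
rng = np.random.default_rng(0)
Q = rng.standard_normal((6, 4))
K = rng.standard_normal((6, 4))
V = rng.standard_normal((6, 4))
out = local_attention(Q, K, V, window=1)
print(out.shape)  # → (6, 4)
```

Because the attention cost is confined to a fixed-width band of the score matrix, the effective work per query stays constant as the sequence grows, which is the intuition behind why such a mechanism scales to big tables better than full global attention.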