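Parsing
Computer science
Table (database)
Artificial intelligence
Convolutional neural network
Pattern recognition (psychology)
Relation (database)
Object (grammar)
Graph
Natural language processing
Data mining
Theoretical computer science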
Authors
Xiaohui Li, Fei Yin, He-Sen Dai, Cheng-Lin Liu
Identifier
DOI:10.1016/j.patcog.2022.108946
Abstract
Recognizing the two-dimensional structure of tables and forms in document images is challenging because of the complexity of document structures and the diversity of layouts. In this paper, we propose a unified framework based on graph neural networks (GNNs), named Table Structure Recognition Network (TSRNet), to jointly detect and recognize the structures of various tables and forms. First, a multi-task fully convolutional network (FCN) segments primitive regions such as text segments and ruling lines from document images; a GNN then classifies and groups these primitive regions into page objects such as tables and cells. Finally, the relationships between neighboring page objects are analyzed by a second GNN-based parsing module. The parameters of all modules in the system can be trained end-to-end to optimize overall performance. Experiments on table detection and structure recognition for modern documents on the POD 2017, cTDaR 2019 and PubTabNet datasets, and on template-free form parsing for historical documents on the NAF dataset, show that the proposed method can handle various table/form structures and achieve superior performance.
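The grouping step described in the abstract, where primitive regions are merged into page objects, can be illustrated with a small sketch. This is not the authors' implementation: it assumes that some upstream model (a stand-in for the GNN) has already produced a link score for each pair of neighboring primitives, and it simply merges pairs whose score passes a threshold using union-find. The function name `group_primitives`, the `edge_scores` dictionary, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def group_primitives(num_primitives, edge_scores, threshold=0.5):
    """Merge primitives into groups via union-find over predicted links.

    edge_scores maps a pair of primitive indices (a, b) to a hypothetical
    "same object" score in [0, 1]; in a real system this would come from
    a trained edge classifier, which is not modeled here.
    """
    parent = list(range(num_primitives))

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), score in edge_scores.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the members of each connected component.
    groups = defaultdict(list)
    for i in range(num_primitives):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Four text segments; segments 0-1 and 2-3 are predicted to belong together.
scores = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8}
print(group_primitives(4, scores))  # [[0, 1], [2, 3]]
```

Each resulting group plays the role of one page object (for example, a cell built from several text segments). Thresholding followed by connected components is only one simple way to turn pairwise predictions into groups; the abstract does not say which grouping rule TSRNet uses.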