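Computer science
Transformer
News aggregator
Data mining
Machine learning
Artificial intelligence
Voltage
Physics
Quantum mechanics
Operating system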
Authors
Hao Liu, Xin Li, Bing Liu, Deqiang Jiang, Yinsong Liu, Bo Ren, Rongrong Ji
Source
Venue: ACM Multimedia
Date: 2021-10-17
Citations: 15
Identifier
DOI: 10.1145/3474085.3481534
Abstract
We investigate the challenging problem of table structure recognition in this work. Many recent methods adopt a graph-based context aggregator with a strong inductive bias to reason about sparse contextual relationships among table elements. However, such strong constraints may be too restrictive to represent complicated table relationships. To learn a more appropriate inductive bias from data, we introduce a Transformer as the context aggregator. Nevertheless, a Transformer that takes dense context as input requires larger-scale data and may suffer from unstable training because of its weaker inductive bias. To overcome these limitations, we design FLAG (FLexible context AGgregator), which marries the Transformer with a graph-based context aggregator in an adaptive way. Based on FLAG, we propose an end-to-end framework, termed FLAG-Net, that requires no extra metadata or OCR information and flexibly modulates the aggregation of dense and sparse context for relational reasoning over table elements. We investigate the modulation pattern in FLAG and show which contextual information it focuses on, which is vital for recognizing table structure. Extensive experimental results on benchmarks demonstrate that the proposed FLAG-Net surpasses the compared methods by a large margin.