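Benchmark (surveying)
Pipeline (software)
Computer science
Key (lock)
Baseline (sea)
Information extraction
Artificial intelligence
Process (computing)
Extraction (chemistry)
Feature extraction
Line (geometry)
Data mining
Pattern recognition (psychology)
Machine learning
Mathematics
Oceanography
Chemistry
Geometry
Computer security
Geodesy
Chromatography
Geology
Programming language
Geography
Operating system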
Authors
Thua Nguyen, Thuyen Tran Doan, Khiem Le, Tien Do, Thanh Duc Ngo, Duy-Dinh Le
Identifier
DOI:10.1109/mapr59823.2023.10288832
Abstract
Localizing and extracting essential information from semi-structured business documents, such as invoices, is crucial in practical applications. This complex problem includes key information localization and extraction (KILE) and line item recognition (LIR), both of which depend on the choice of model and training methodology. This paper presents a novel pipeline that applies RoBERTa and the LION Optimizer as the primary modules for identifying and extracting crucial information on the DocILE benchmark. The experimental results indicate that the proposed method significantly improves the KILE phase, with a 7.24% increase in accuracy compared to the baseline, and also enhances the correct recognition rate at the LIR stage.