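Computer science
Rectification
Image warping
Leverage (statistics)
Artificial intelligence
Digitization
Distortion (music)
Information retrieval
Deep learning
Computer vision
Data mining
Amplifier
Computer network
Power (physics)
Physics
Bandwidth (computing)
Quantum mechanics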
Authors
Shaokai Liu,Hao Feng,Wengang Zhou
Source
Journal: IEEE Transactions on Circuits and Systems for Video Technology
[Institute of Electrical and Electronics Engineers]
Date: 2023-11-23
Volume/Issue: 1-1
Identifier
DOI:10.1109/tcsvt.2023.3336068
Abstract
In recent years, the proliferation of smartphones has led to an upsurge in the digitization of document files via these portable devices. However, images captured by smartphones often suffer from distortions, which negatively affect digital preservation and downstream applications. To address this issue, we introduce DRNet, a novel deep network for document image rectification. Our approach is based on three key designs. First, we exploit the geometric consistency inherent in document images to guide the learning process of distortion rectification. Second, we design a coarse-to-fine rectification network to leverage the representations derived from the distorted document image, thereby enhancing the rectification result. Third, we propose a unique perspective for supervising the learning of rectification networks: undistorted document images are employed for supervision, removing the need for the warping mesh that existing methods use as ground truth. Technically, both low-level pixel alignment and high-level semantic alignment jointly contribute to the learning of the mapping relationship between deformed document images and distortion-free ones. We evaluate our method on the challenging DocUNet Benchmark dataset, where it sets a series of state-of-the-art records, demonstrating the superiority of our approach compared to existing learning-based solutions. Additionally, we conduct a comprehensive series of ablation experiments to further validate the effectiveness and merits of our method.