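Computer science
Preprocessor
Optical character recognition
Grayscale
Artificial intelligence
Text recognition
Character recognition
Character (mathematics)
Document processing
Image file formats
Open source
Information retrieval
Natural language processing
Pattern recognition (psychology)
Image (mathematics)
Software
Operating system
Mathematics
Geometry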
Authors
Tao Ma, Min Yue, Chao Yuan, Haibo Yuan
Identifier
DOI:10.1109/icaml54311.2021.00057
Abstract
Building on a study of image preprocessing techniques, this paper designs and implements a web archive file recognition and management system based on the open-source Tesseract character recognition engine. The system first preprocesses each image by converting it to grayscale and then binarizing it. Second, to improve recognition accuracy on handwritten content, we trained Tesseract's text recognition library. Finally, the characters are recognized and stored for later use. Archivists can use this system to convert paper documents into electronic documents, which can significantly improve the management and digitization efficiency of the archive system.
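The grayscale and binarization steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which grayscale weights or thresholding method the system uses, so the ITU-R BT.601 luminance weights and Otsu's method below are assumptions chosen for illustration, and the function names are hypothetical. The binarized result is the kind of image that would then be handed to an OCR engine such as Tesseract.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]  # rows of (R, G, B) pixels, 0..255
GrayImage = List[List[int]]                  # rows of intensities, 0..255


def to_grayscale(image: RGBImage) -> GrayImage:
    """Convert RGB pixels to 8-bit luminance (assumed BT.601 weights)."""
    return [
        [min(255, int(round(0.299 * r + 0.587 * g + 0.114 * b))) for r, g, b in row]
        for row in image
    ]


def otsu_threshold(gray: GrayImage) -> int:
    """Pick the threshold that maximizes between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        # Pixels with intensity <= t form the dark class.
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


def binarize(gray: GrayImage, threshold: int) -> GrayImage:
    """Map pixels above the threshold to white (255) and the rest to black (0)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]


def preprocess(image: RGBImage) -> GrayImage:
    """Grayscale, then binarize, mirroring the order given in the abstract."""
    gray = to_grayscale(image)
    return binarize(gray, otsu_threshold(gray))
```

A global threshold such as Otsu's is a reasonable default for evenly lit scans; unevenly lit or stained archive pages may call for a locally adaptive threshold instead.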