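Computer science
Artificial intelligence
Optical character recognition
Text detection
Text recognition
Field (mathematics)
Machine learning
Noisy text analytics
Natural language processing
Pattern recognition (psychology)
Information retrieval
Text graph
Image (mathematics)
Automatic summarization
Mathematics
Pure mathematics
Authors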
Xiaofeng Wang,Zhi-Huang He,Kai Wang,Yifan Wang,Le Zou,Zhize Wu
Source
Journal: Neurocomputing
[Elsevier]
Date: 2023-08-18
Volume/Issue: 556: 126702-126702
Citations: 16
Identifier
DOI:10.1016/j.neucom.2023.126702
Abstract
Optical Character Recognition (OCR) poses a crucial challenge within the realm of computer vision research, as it plays a pivotal role in converting vast amounts of unstructured text data into structured formats to support diverse artificial intelligence applications. The OCR process encompasses two core components: text detection and text recognition. Text detection involves identifying and extracting text regions, achieved through either object detection or segmentation techniques, while text recognition focuses on accurately deciphering the content within these identified regions. In recent years, remarkable strides have been made in the domain of text recognition, primarily driven by deep learning-based models. These models eliminate the need for manual feature processing and excel in recognizing text even within complex scenes, surpassing the performance of traditional text recognition methods and subsequently emerging as the dominant approach. The objective of this paper is to present a comprehensive survey of both text detection and text recognition models. Firstly, we systematically categorize and provide an overview of existing off-the-shelf text detection methods. Subsequently, we conduct an in-depth investigation of six distinct text recognition models, taking into account their unique implementations. Additionally, we explore and analyze the principal datasets that currently prevail in the field of text detection and recognition. Furthermore, this research entails a meticulous performance comparison of various text detection algorithms on the CTW1500, TotalText, and ICDAR2015 datasets. Additionally, we evaluate and scrutinize the efficacy of mainstream text recognition algorithms on the IIIT-5K, SVT, ICDAR2013, SVT-P, CUTE80, and ICDAR2015 datasets. Finally, we conclude with a discussion on the future development and research trends concerning text detection and recognition, providing insights that can further drive progress in this crucial area.