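Computer science
Decomposition
Artificial intelligence
Segmentation
Computer vision
Image segmentation
Natural language processing
Computer graphics (images)
Information retrieval
Ecology
Biology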
Authors
Yong Wang, Youguang Chen
Identifier
DOI:10.1109/icicml60161.2023.10424830
Abstract
Scene text segmentation is widely used in image processing, with applications that include image editing and font style transfer. These applications improve image understanding and help boost the performance of numerous computer vision tasks. Deep learning has led to substantial advances in scene text segmentation. However, the limited size of existing scene text segmentation datasets constrains model performance. We therefore propose an algorithm for generating synthetic segmentation data: we first pretrain the model on large-scale synthetic data, then fine-tune it on the target dataset to address the limited dataset size. Existing models segment the entire image end to end, which makes the task difficult. We propose a scene text segmentation method that decomposes the segmentation task into subtasks and solves them one by one. Compared with direct segmentation of the entire image, this reduces the complexity of the task and significantly improves the segmentation result. The proposed method consists of three modules: a fragment crop module, a fragment segmentation module, and a fragment combination module. The fragment crop module consists of an additional crop layer added after DBNet. The fragment segmentation module can embed various segmentation methods. The fragment combination module uses a maximum pixel value pasting algorithm to combine the segmented fragments. We call this method the Crop-Segmentation-Combination Framework (CSCF). We conducted experiments on the ICDAR 2013 and TextSeg datasets. With U-Net embedded in the fragment segmentation module, CSCF improved the text segmentation IoU by 5.80% on the ICDAR 2013 test set. These results show that the proposed approach notably improves the performance of scene text segmentation.
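The abstract does not give implementation details, so the following is only a minimal sketch of the crop, segment, and combine flow under stated assumptions. It assumes text regions arrive as axis-aligned boxes (in the paper they come from DBNet plus a crop layer, which is not reproduced here), that the segmenter is any callable mapping a fragment to a same-sized mask, and that "maximum pixel value pasting" means each output pixel takes the largest value among the fragments covering it. All names (`crop`, `combine_max`, `cscf`, `iou`) are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale image or mask, as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def crop(image: Image, box: Box) -> Image:
    """Cut one rectangular fragment out of the image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def combine_max(shape: Tuple[int, int], boxes: List[Box],
                fragments: List[Image]) -> Image:
    """Paste fragment masks onto an empty canvas, keeping the maximum
    value wherever fragments overlap (assumed reading of the abstract)."""
    height, width = shape
    canvas = [[0.0] * width for _ in range(height)]
    for (top, left, _bottom, _right), fragment in zip(boxes, fragments):
        for r, fragment_row in enumerate(fragment):
            for c, value in enumerate(fragment_row):
                y, x = top + r, left + c
                canvas[y][x] = max(canvas[y][x], value)
    return canvas


def cscf(image: Image, boxes: List[Box],
         segment: Callable[[Image], Image]) -> Image:
    """Crop each box, segment each fragment independently, then recombine."""
    fragments = [segment(crop(image, box)) for box in boxes]
    return combine_max((len(image), len(image[0])), boxes, fragments)


def iou(pred: Image, target: Image, threshold: float = 0.5) -> float:
    """Intersection over union of two masks after binarizing at `threshold`."""
    intersection = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_on, t_on = p >= threshold, t >= threshold
            intersection += p_on and t_on
            union += p_on or t_on
    return intersection / union if union else 1.0


def threshold_segmenter(fragment: Image) -> Image:
    """Stand-in for a learned segmenter such as U-Net: a fixed threshold."""
    return [[1.0 if value > 0.5 else 0.0 for value in row] for row in fragment]
```

Because the segmenter is passed in as a callable, swapping the threshold stand-in for a trained network would not change the cropping or combination steps, which mirrors the abstract's statement that the fragment segmentation module can embed various segmentation methods.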