Computer science
FLOPS
Object detection
Artificial intelligence
Segmentation
Pattern recognition (psychology)
Parallel computing
Authors
Hao Wu,Jun Zhou,Qiong Zhang,Lei Yang,Kun Yu,Wenbo An,Juntao Zhang
Identifier
DOI:10.1007/978-981-99-8543-2_1
Abstract
Attention mechanisms have benefited many visual tasks, such as image classification, object detection, and semantic segmentation. However, few attention modules have been proposed specifically for scene text detection. We propose an attention mechanism based on Quantum-State-based Mapping (QSM) that enhances channel and spatial attention, introduces higher-order representations, and mixes contextual information. Our approach includes two attention modules: the Quantum-based Convolutional Attention Module (QCAM), a plug-and-play module applicable to pre-trained text detection models, and the Adaptive Channel Information Transfer Module (ACTM), which replaces the feature pyramids and complex networks of DBNet++ with a 35.9% reduction in FLOPs. Among CNN-based methods, QCAM achieves state-of-the-art performance on three benchmarks. Compared with Transformer-based methods such as FSG, QCAM remains competitive in F-measure on all benchmarks while using 29.5% fewer parameters, striking a balance between detection accuracy and efficiency. ACTM significantly improves F-measure over DBNet++ on three benchmarks, providing an alternative to feature pyramids in scene text detection. The code, models, and training logs are available at https://github.com/yws-wxs/QCAM .
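The abstract does not say how QSM computes its attention weights, so the sketch below is not QCAM or ACTM. It only illustrates the generic idea the abstract builds on: a plug-and-play block that applies channel attention and then spatial attention while preserving the feature map's shape. All function names are hypothetical, and the learned layers a real module would use are replaced here by a parameter-free sigmoid gate over pooled means, which is an assumption made for brevity.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a parameter-free gate (assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate derived from its global average.

    feat is a list of C channels, each an H x W list of lists of floats.
    """
    out = []
    for channel in feat:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gate = sigmoid(mean)  # one scalar weight per channel
        out.append([[value * gate for value in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each spatial location by a gate derived from the channel mean."""
    num_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gate = [
        [
            sigmoid(sum(feat[k][i][j] for k in range(num_channels)) / num_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]
    return [
        [[feat[k][i][j] * gate[i][j] for j in range(width)] for i in range(height)]
        for k in range(num_channels)
    ]


def attention_block(feat):
    """Channel attention followed by spatial attention; shape is unchanged."""
    return spatial_attention(channel_attention(feat))


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map (made-up values, not from the paper).
    features = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[1.0, 1.0], [1.0, 1.0]],
    ]
    print(attention_block(features))
```

Because the output has the same shape as the input, a block of this kind can be placed between existing layers of a detector without changing the surrounding architecture, which is the sense in which such modules are called plug-and-play.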