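Computer science
Artificial intelligence
Enhanced Data Rates for GSM Evolution
Feature (linguistics)
Pattern recognition (psychology)
Computer vision
Edge detection
Salient
Object (grammar)
Transformer
Backbone network
Image (mathematics)
Image processing
Voltage
Physics
Philosophy
Quantum mechanics
Linguistics
Computer network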
Authors
Zhaojian Yao, Luping Wang
Identifier
DOI:10.1016/j.eswa.2022.118973
Abstract
Most existing methods feed images into a CNN backbone to obtain image features. However, recently emerging transformer features can express the meaningful content of an image more accurately than convolutional features. In this paper, we use a transformer backbone to capture multiple feature layers of an image and design an Object Localization and Edge Refinement (OLER) Network for saliency detection. The network has two stages: the first locates salient objects and the second refines their boundaries. In the first stage, we apply multiple feature layers directly to identify salient regions, using an Information Multiple Selection (IMS) module to capture saliency cues from each feature layer. The IMS module contains multiple pathways, each of which judges where the saliency information is located. Processing an input feature layer with the IMS module mines its potential salient object information. The second stage consists of two modules: an edge generation module and an edge refinement module. The edge generation module takes the original image and the saliency map as inputs and outputs two edge maps that focus on different edge ranges. To sharpen object edges, the original image, the initial saliency map, and the two edge maps are fed into the edge refinement module, which outputs the final saliency map. The network as a whole is relatively simple and easy to build, with no complex components. Experimental results on five public datasets show that our method both significantly improves detection accuracy and achieves better detection efficiency. The code is available at https://github.com/CKYiu/OLER.
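The two-stage data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (their code is at the URL above): the real network uses a transformer backbone and learned modules, whereas every function here is a hand-written stand-in. The names `localize`, `edge_map`, `refine`, and the `band` threshold are hypothetical, the morphological gradient is an assumed substitute for the edge generation module, and the intensity-based relabelling is an assumed substitute for the edge refinement module.

```python
from typing import List

Grid = List[List[float]]  # single-channel map with values in [0, 1]


def localize(cue_maps: List[Grid]) -> Grid:
    """Stage 1 stand-in: average per-layer saliency cues into an initial map.
    (Assumption: the real IMS module is learned, not a plain average.)"""
    h, w = len(cue_maps[0]), len(cue_maps[0][0])
    return [[sum(m[r][c] for m in cue_maps) / len(cue_maps) for c in range(w)]
            for r in range(h)]


def edge_map(saliency: Grid, radius: int) -> Grid:
    """Edge generation stand-in: local max minus local min (morphological
    gradient). A larger radius yields a wider band around the boundary."""
    h, w = len(saliency), len(saliency[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            patch = [saliency[i][j]
                     for i in range(max(0, r - radius), min(h, r + radius + 1))
                     for j in range(max(0, c - radius), min(w, c + radius + 1))]
            row.append(max(patch) - min(patch))
        out.append(row)
    return out


def refine(image: Grid, saliency: Grid, narrow: Grid, wide: Grid,
           band: float = 0.1) -> Grid:
    """Edge refinement stand-in: pixels outside the wide band are treated as
    confident and give mean foreground/background intensities; pixels inside
    the narrow band are relabelled by whichever mean their intensity is closer to."""
    h, w = len(image), len(image[0])
    cells = [(r, c) for r in range(h) for c in range(w)]
    sure = [(r, c) for r, c in cells if wide[r][c] < band]
    fg = [image[r][c] for r, c in sure if saliency[r][c] >= 0.5]
    bg = [image[r][c] for r, c in sure if saliency[r][c] < 0.5]
    out = [row[:] for row in saliency]
    if not fg or not bg:  # nothing confident to compare against
        return out
    fg_mean, bg_mean = sum(fg) / len(fg), sum(bg) / len(bg)
    for r, c in cells:
        if narrow[r][c] >= band:
            px = image[r][c]
            out[r][c] = 1.0 if abs(px - fg_mean) <= abs(px - bg_mean) else 0.0
    return out


# Toy 1x8 "image": a bright object on the left, dark background on the right.
image = [[0.9, 0.9, 0.9, 0.8, 0.2, 0.1, 0.1, 0.1]]
# Two made-up cue maps that disagree near the object boundary.
cues = [[[1.0, 1.0, 1.0, 0.6, 0.6, 0.0, 0.0, 0.0]],
        [[1.0, 1.0, 1.0, 0.2, 0.8, 0.0, 0.0, 0.0]]]

initial = localize(cues)            # stage 1: coarse object location
narrow = edge_map(initial, 1)       # stage 2a: two edge maps of
wide = edge_map(initial, 2)         #           different widths
final = refine(image, initial, narrow, wide)  # stage 2b: sharpened map
print(final)
```

In this toy input the initial map is blurred at the boundary, and the refinement step snaps the boundary pixels to the side whose image intensity they match. The point is only the order in which the original image, initial saliency map, and two edge maps are combined, not the quality of the heuristics.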