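Computer science
Encoder
Benchmark (surveying)
Convolutional neural network
Artificial intelligence
Baseline (sea)
Pattern recognition (psychology)
Machine learning
Decoding methods
Algorithm
Geodesy
Oceanography
Operating system
Geology
Geography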
Authors
Yuzhu Ji,Haijun Zhang,Zhao Zhang,Ming Liu
Identifier
DOI:10.1016/j.ins.2020.09.003
Abstract
Convolutional neural network (CNN)-based encoder-decoder models have profoundly inspired recent work in salient object detection (SOD). Despite the rapid development of encoder-decoder models for most pixel-level dense prediction tasks, no empirical study has yet evaluated a large body of encoder-decoder models on SOD tasks. In this paper, instead of limiting our survey to SOD methods, we take a broader view of the fundamental architectures of key modules and structures in CNN-based encoder-decoder models for pixel-level dense prediction tasks. We then focus on performing SOD with deep encoder-decoder models and present an extensive empirical study of baseline encoder-decoder models across different encoder backbones, loss functions, training batch sizes, and attention structures. State-of-the-art encoder-decoder models adopted from semantic segmentation and deep CNN-based SOD models are also investigated. We discovered new baseline models that can outperform state-of-the-art methods. These newly discovered baseline models were further evaluated on three video-based SOD benchmark datasets. Experimental results demonstrate the effectiveness of these baseline models on both image- and video-based SOD tasks. The study concludes with a comprehensive summary that offers suggestions on future directions.