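Object detection
Computer science
Artificial intelligence
Object (grammar)
Detector
Pattern recognition (psychology)
Computer vision
Set (abstract data type)
Feature (linguistics)
Cognitive neuroscience of visual object recognition
Feature extraction
Convergence (economics)
Image (mathematics)
Telecommunications
Linguistics
Philosophy
Economics
Programming language
Economic growth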
Authors
Peize Sun,Rufeng Zhang,Yi Jiang,Tao Kong,Chenfeng Xu,Wei Zhan,Masayoshi Tomizuka,Zehuan Yuan,Ping Luo
Identifier
DOI:10.1109/tpami.2023.3292030
Abstract
Object detection is one of the most fundamental computer vision tasks. Existing work on object detection relies heavily on dense object candidates, such as k anchor boxes pre-defined on every grid cell of an image feature map of size H×W. In this paper, we present Sparse R-CNN, a very simple and sparse method for object detection in images. In our method, a fixed sparse set of learned object proposals (N in total) is provided to the object recognition head to perform classification and localization. By replacing HWk (up to hundreds of thousands) hand-designed object candidates with N (e.g., 100) learnable proposals, Sparse R-CNN makes all effort related to object candidate design and one-to-many label assignment obsolete. More importantly, Sparse R-CNN outputs predictions directly, without the non-maximum suppression (NMS) post-processing step, and thus establishes an end-to-end object detection framework. Sparse R-CNN demonstrates accuracy, run-time, and training convergence performance that are highly competitive with well-established detector baselines on the challenging COCO and CrowdHuman datasets. We hope that our work can inspire a rethinking of the convention of dense priors in object detectors and the design of new high-performance detectors.
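To make the scale of the HWk versus N comparison concrete, the following is a minimal arithmetic sketch, not code from the paper. The feature-map size (100×167) and the anchor count per cell (k = 9) are illustrative assumptions chosen only to land in the "hundreds of thousands" range the abstract mentions; the function names are hypothetical. Only N = 100 comes from the abstract.

```python
def dense_candidate_count(height: int, width: int, anchors_per_cell: int) -> int:
    """Count candidates when k anchors sit on every cell of an H x W feature map."""
    if min(height, width, anchors_per_cell) <= 0:
        raise ValueError("height, width and anchors_per_cell must be positive")
    return height * width * anchors_per_cell


def reduction_factor(height: int, width: int, anchors_per_cell: int,
                     num_proposals: int) -> float:
    """Ratio of dense candidates (HWk) to a fixed sparse set of N proposals."""
    if num_proposals <= 0:
        raise ValueError("num_proposals must be positive")
    return dense_candidate_count(height, width, anchors_per_cell) / num_proposals


if __name__ == "__main__":
    # Assumed, illustrative values: H, W and k are NOT taken from the paper.
    H, W, K = 100, 167, 9
    N = 100  # the example value of N given in the abstract

    dense = dense_candidate_count(H, W, K)
    print(f"dense candidates (HWk): {dense}")
    print(f"sparse proposals (N):   {N}")
    print(f"reduction factor:       {reduction_factor(H, W, K, N):.0f}x")
```

Under these assumed sizes, a single feature map yields 150,300 dense candidates, against 100 learnable proposals.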