Journal: IEEE Transactions on Intelligent Vehicles [Institute of Electrical and Electronics Engineers]  Date: 2023-06-05  Volume/Issue: 9(1): 1054-1065  Citations: 8
Identifier
DOI: 10.1109/tiv.2023.3282996
Abstract
Previous research on deep convolutional neural networks has made significant progress toward improving the speed and accuracy of object detection. Despite these advances, accurately detecting multiple objects, particularly small objects, remains challenging in traffic environments. In this paper, we propose a new architecture called YOLOM, which is specifically designed to improve detection precision for multiple small objects. YOLOM incorporates several innovations: a multi-spatial pyramid (MSP), an optimized focal loss (OFLoss) function, and an objectness loss based on effective intersection over union (EIoU). Together, these components improve accuracy and reduce the miss rate for small objects, particularly in multi-object scenes. We design the MSP module around pooling layers that capture receptive-field features at different spatial scales. We adopt the optimized focal loss as the classification loss in place of cross-entropy, which alleviates the class-imbalance problems that anchor-free detection encounters on disparate datasets. Because EIoU performs well for confidence scoring, we incorporate it into the objectness loss calculation, substituting an EIoU-based term for YOLOX's objectness loss. Experimental results demonstrate that our approach significantly outperforms several end-to-end object detection methods.
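The abstract does not describe the internal structure of the MSP module. As a rough illustration of pooling-based multi-scale receptive fields, the PyTorch sketch below builds an SPP-style block with parallel max-pooling branches of different kernel sizes that are concatenated and fused by a 1x1 convolution. The class name MSP, the pool sizes, and the fusion convolution are illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

class MSP(nn.Module):
    """Hypothetical multi-spatial-pyramid block: parallel max-pooling branches
    with different kernel sizes (and hence different effective receptive
    fields), concatenated with the input and fused by a 1x1 convolution."""

    def __init__(self, channels, pool_sizes=(5, 9, 13)):
        super().__init__()
        # One pooling branch per spatial scale; stride 1 and "same" padding
        # keep the feature-map resolution unchanged.
        self.pools = nn.ModuleList(
            [nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2) for k in pool_sizes]
        )
        # Fuse the original map plus one branch per pooling scale.
        self.fuse = nn.Conv2d(channels * (len(pool_sizes) + 1), channels, kernel_size=1)

    def forward(self, x):
        feats = [x] + [pool(x) for pool in self.pools]
        return self.fuse(torch.cat(feats, dim=1))
```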
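The exact modifications behind OFLoss are likewise not given in the abstract. The sketch below shows the standard binary focal loss (Lin et al.) that such a classification loss builds on, replacing plain cross-entropy with a modulating factor that down-weights easy examples; the default values alpha=0.25 and gamma=2.0 are conventional choices, not the paper's.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Standard binary focal loss; a stand-in for the paper's OFLoss."""
    # Per-element binary cross-entropy, unreduced so it can be re-weighted.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)            # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t)^gamma suppresses the loss of well-classified examples.
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```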
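For the objectness term, the abstract only states that EIoU replaces YOLOX's original objectness loss. A minimal sketch of a commonly used EIoU loss (an IoU term plus normalized center-distance and width/height difference terms) is given below, assuming boxes in (x1, y1, x2, y2) format; YOLOM's precise formulation may differ.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU loss: 1 - IoU + center-distance term + width/height terms."""
    # Intersection area.
    ix1 = torch.max(pred[..., 0], target[..., 0])
    iy1 = torch.max(pred[..., 1], target[..., 1])
    ix2 = torch.min(pred[..., 2], target[..., 2])
    iy2 = torch.min(pred[..., 3], target[..., 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)

    # Union area and IoU.
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    union = area_p + area_t - inter + eps
    iou = inter / union

    # Smallest enclosing box.
    cw = (torch.max(pred[..., 2], target[..., 2]) -
          torch.min(pred[..., 0], target[..., 0])).clamp(min=eps)
    ch = (torch.max(pred[..., 3], target[..., 3]) -
          torch.min(pred[..., 1], target[..., 1])).clamp(min=eps)

    # Squared center distance, normalized by the enclosing-box diagonal.
    pcx, pcy = (pred[..., 0] + pred[..., 2]) / 2, (pred[..., 1] + pred[..., 3]) / 2
    tcx, tcy = (target[..., 0] + target[..., 2]) / 2, (target[..., 1] + target[..., 3]) / 2
    rho2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2
    diag2 = cw ** 2 + ch ** 2

    # Width/height differences, normalized by the enclosing-box sides.
    pw, ph = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    tw, th = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]

    return 1 - iou + rho2 / diag2 + (pw - tw) ** 2 / cw ** 2 + (ph - th) ** 2 / ch ** 2
```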