Abstract

Computer vision-based deep learning models play an important role in industrial defect detection. Unlike natural objects, defects in industrial products are typically very small and vary widely in scale, so conventional object detectors perform poorly when confronted with complex defect detection tasks. This paper therefore introduces an efficient progressive aggregation enhanced network (EPAE-Net) to strengthen defect detection performance in complex scenarios. First, a global context feature enhancement module (GCFEM) is designed to model the global context of images, enhancing the model’s ability to perceive key information. Second, a downsampling module based on self-calibrated convolution is designed to improve the detection of small targets. Third, a multi-path aggregation module (MAM) is designed to further enhance the interaction between cross-layer features; by integrating multi-scale convolutional attention (MSCA) mechanisms, MAM also improves the network’s ability to detect defects with extreme aspect ratios. A multi-path aggregation feature pyramid network (FPN) is constructed using MAM, and adaptive spatial feature fusion is used to gradually integrate low-level features and alleviate the interference caused by information conflicts during fusion. Finally, the efficient complete intersection over union (E-CIOU) loss function is introduced to refine the network and further improve its defect detection performance. Experimental results on three industrial datasets, namely the Tianchi fabric dataset (mean average precision (mAP) of 77.1%), the printed circuit board (PCB) dataset (mAP of 98.7%), and the steel strip surface defect dataset (NEU-DET; mAP of 81.5%), demonstrate that the proposed EPAE-Net achieves competitive results compared with other state-of-the-art methods.
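For background on the loss family mentioned above, the following is a minimal sketch of the standard complete IoU (CIoU) loss for a single pair of axis-aligned boxes. It is not the E-CIOU loss proposed in this paper, whose exact form is not given in the abstract; the function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabiliser are assumptions made for illustration only.

```python
import math


def ciou_loss(box, gt, eps=1e-9):
    """Standard CIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumes x2 > x1 and y2 > y1 for both boxes. This is the plain CIoU
    formulation, shown as background; it is NOT the paper's E-CIOU.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between the two box centres.
    rho2 = ((x1 + x2 - gx1 - gx2) ** 2 + (y1 + y2 - gy1 - gy2) ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v
```

The loss is close to zero for identical boxes and grows with centre distance and aspect-ratio mismatch, which is why this family of losses is often used for box regression on small or elongated targets.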