Abstract Accurate identification of prohibited items in X-ray security images is essential for ensuring public safety. However, current methods struggle to handle the irregular deformation, multi-scale features, and background occlusion of prohibited items simultaneously, which leads to inadequate detection accuracy. To address these challenges, we propose an Adaptive Efficient Focusing Network (AEFNet) that focuses on target regions, thereby enhancing the automatic detection of prohibited items. Specifically, to accommodate the irregular deformation of target regions, we introduce the DACSP module, which dynamically adjusts sampling positions to enhance the network’s adaptability and its focus on occluded targets. To address detail loss and handle multi-scale features, we propose the Multi-scale Focus Feature (MFF) module and the Focusing Diffusion Pyramid Network (FDPN), which fuse semantic and perceptual features and improve the use of contextual information at different detection scales. Additionally, detail-enhanced convolution improves feature utilization at different scales while facilitating a lightweight network design. Finally, we employ the PIoUv2 function to optimize the localization loss, yielding a significant performance improvement. Experimental results show that AEFNet performs effectively across several X-ray security image datasets (PIDray, CLCXray, and OPIXray), achieving 74.7%, 61.3%, and 89.2% mAP, respectively. AEFNet also demonstrates strong generalization on the PASCAL VOC dataset in non-prohibited item detection scenarios.