To address the low detection accuracy caused by the variety of steel surface defect types, their large differences in shape, and the similarity between defects and the background, this paper proposes an improved RetinaNet-based method for detecting steel surface defects. First, deformable convolutions are integrated into the ResNet backbone for feature extraction, allowing the convolutional kernels to adapt their shapes to defects of varying shape and thereby capture defect regions more accurately. Second, a CA-BiFPN is proposed for feature fusion; it integrates information from different feature layers using attention mechanisms and uses a CA attention module to strengthen the focus on defect features across the complete feature space. Third, an IA-BCELoss is introduced as the classification loss function, coupling the classification and regression predictions so that detection boxes remain high-quality while classification accuracy is maintained. Finally, comparative experiments are conducted on the NEU-DET steel surface defect detection dataset. The results show that the proposed method achieves the highest accuracy among the compared methods, with an mAP of 81.5%, a 6% improvement over the original model. Compared with YOLOv7-X and YOLOX-L, mAP increases by 5.2% and 5.3%, respectively, while the number of parameters is reduced by 37.96 M and 21.23 M. These findings indicate that the proposed method performs better in steel surface defect detection tasks and has significant practical value.
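The idea of coupling classification and regression predictions in the classification loss can be illustrated with a small sketch. The abstract does not give the formula for IA-BCELoss, so the code below is not the authors' definition. It assumes one common IoU-aware formulation, in which the binary cross-entropy target for a positive sample is the IoU between its predicted box and the ground-truth box, and the target for a negative sample is 0. The names `box_iou`, `bce`, and `iou_aware_bce` are hypothetical and chosen only for this illustration.

```python
import math


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def bce(p, t, eps=1e-7):
    """Binary cross-entropy between a probability p and a soft target t in [0, 1]."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))


def iou_aware_bce(score, pred_box, gt_box=None):
    """ASSUMED formulation, not taken from the paper.

    Positive sample: the target is the IoU of the predicted box with the
    ground-truth box, so the classification score is pushed toward the
    localization quality. Negative sample (gt_box is None): the target is 0.
    """
    target = box_iou(pred_box, gt_box) if gt_box is not None else 0.0
    return bce(score, target)


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    well_localized = (0.0, 0.0, 2.0, 2.0)    # IoU with gt = 1
    poorly_localized = (1.0, 1.0, 3.0, 3.0)  # IoU with gt = 1/7
    # A confident score is cheap only when the box is also well localized.
    print(iou_aware_bce(0.9, well_localized, gt))
    print(iou_aware_bce(0.9, poorly_localized, gt))
```

Under this assumed formulation, a high classification score attached to a poorly localized box is penalized, which is one way a loss can favor detections that are both correctly classified and tightly localized.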