The number of wheat ears is one of the three key variables that determine wheat yield and is therefore of great importance in practical agricultural production. However, because of the complex natural environment and unstable tracking, automated and accurate wheat ear counting remains difficult to deploy in practice. This study therefore presents an improved wheat ear counting method that combines object detection (based on YOLOv7), multi-object tracking (based on DeepSORT), and cross-line partition counting. First, DCNv3 was used in place of some of the convolutions in the backbone network to overcome the inherent limitations of standard convolution in modeling long-range dependencies and adaptive spatial aggregation. Second, PConv from FasterNet replaced the standard convolution in the ELAN-W module to reduce redundant computation and memory access during training, resulting in a more lightweight model. In addition, feature fusion in the head was enhanced by improving the Concat operation and replacing the PANet structure with BiFPN, which fuses wheat ear features more efficiently. Furthermore, three CBAM attention modules were added at the connection between the backbone and the head to increase the sensitivity of the network to wheat ear characteristics. Finally, a cross-line partition counting method based on DeepSORT was designed to track wheat ears across consecutive frames and to mitigate ID switches among tracked wheat ears. Experiments on the test set showed that the improved YOLOv7 achieved a detection precision of 93.8 % and an mAP@0.5 of 94.9 %, which is 3.1 % higher than that of the original YOLOv7 model. The improved YOLOv7 model is 57.7 MB in size, 79 % of the size of the original model.
The precision of the improved YOLOv7+DeepSORT multi-object tracking model reached 93.0 %, 6.9 % higher than that of the original YOLOv7+DeepSORT model, and its MOTA was 82.3 %, 9.6 % higher than that of the original model. The counting experiments showed that the average accuracy of the cross-line partition counting method exceeded 97.5 % and that the model ran at 19.2 FPS, enabling stable real-time wheat ear counting.
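To make the idea of cross-line counting concrete, the following is a minimal sketch and not the implementation used in this study. It assumes that the tracker outputs one centroid per track ID per frame, that the counting line is horizontal, and that "partitioning" means splitting the line into equal-width zones with one counter each. The class `CrossLineCounter` and its parameters `line_y`, `frame_width`, and `n_zones` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CrossLineCounter:
    """Illustrative counter; names and zone layout are assumptions."""
    line_y: float        # y-coordinate of the horizontal counting line (pixels)
    frame_width: float   # frame width (pixels)
    n_zones: int = 4     # assumed number of equal-width zones along the line
    last_y: dict = field(default_factory=dict)   # track_id -> previous centroid y
    counted: set = field(default_factory=set)    # track IDs already counted
    zone_counts: list = field(default_factory=list)

    def __post_init__(self):
        self.zone_counts = [0] * self.n_zones

    def update(self, tracks):
        """Process one frame; tracks is an iterable of (track_id, cx, cy)."""
        for track_id, cx, cy in tracks:
            prev = self.last_y.get(track_id)
            self.last_y[track_id] = cy
            # Skip tracks seen for the first time or already counted.
            if prev is None or track_id in self.counted:
                continue
            # Count when the centroid moves from above the line to on or below it.
            if prev < self.line_y <= cy:
                zone = min(int(cx / self.frame_width * self.n_zones),
                           self.n_zones - 1)
                self.zone_counts[zone] += 1
                self.counted.add(track_id)

    @property
    def total(self):
        return sum(self.zone_counts)


# Toy input: three frames of (track_id, cx, cy) tuples, as a tracker might emit.
frames = [
    [(1, 50.0, 90.0), (2, 350.0, 80.0)],
    [(1, 52.0, 105.0), (2, 350.0, 95.0)],
    [(1, 54.0, 120.0), (2, 351.0, 110.0), (3, 200.0, 130.0)],
]
counter = CrossLineCounter(line_y=100.0, frame_width=400.0, n_zones=4)
for frame in frames:
    counter.update(frame)
print(counter.total, counter.zone_counts)
```

In this toy input, tracks 1 and 2 cross the line and are counted once each, while track 3 first appears below the line and is never counted. Counting crossing events rather than unique IDs limits the effect of a new ID on the total to cases where the new track itself crosses the line.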