Authors
Gang Sun, Tonghai Liu, Hang Zhang, Bowen Tan, Yuwei Li
Abstract
The yak is a symbol of the Tibetan Plateau and an indispensable livestock resource at high altitudes, with important ecological, economic, and cultural value. When yaks are sick, their excrement can seriously damage the highland ecosystem, so real-time monitoring of their health status is essential for ecological conservation. The daily behaviors of yaks, such as eating, lying, standing, and walking, contain a wealth of health information. Recognizing these behaviors with computer vision makes real-time monitoring of yak health status possible, which protects the ecological environment while maintaining the economic benefits of yak breeding. This study proposes a non-contact yak behavior recognition method based on the SlowFast model. The method uses two pathways with different sampling rates (Slow and Fast) to extract spatial and action features from the input video. After comparative analysis, the 3D Resnet50 network is selected as the backbone of the SlowFast dual pathway. The size of the 3D convolutional kernel is increased to enlarge the receptive field of feature extraction, which in turn improves the recognition accuracy of the algorithm. A total of 318 videos of yaks in different scenes and poses were captured for testing. Six different networks were selected to verify the performance of the proposed method: SlowFast-3DResnet50, SlowFast-3DResnet101, SlowFast-3DResnet152, 3DResnet50, C3D, and I3D. The experimental results show that the method achieves 96.6% recognition accuracy, 91.3% recall, and 90.5% precision in classifying the basic behaviors of yaks in natural scenes, and 97.3%, 99.1%, 95.9%, and 94.1% for the four basic behaviors, respectively. These results are better overall than those of the other six methods.
In addition, compared with other 3D convolutional neural networks used for video classification, the proposed method can classify the target behavior in each video frame, which gives it broader implications and applications. The algorithm meets the needs of basic yak behavior recognition and lays the foundation for real-time monitoring of yak health status.
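
To illustrate the two sampling rates that the Slow and Fast pathways rely on, the following minimal sketch picks frame indices for each pathway from a single clip. It is not the authors' implementation. The function name `sample_pathway_indices` is hypothetical, and the clip length (64 frames), the Slow stride `tau = 16`, and the speed ratio `alpha = 8` are assumed values chosen for illustration. The abstract does not state the settings used in the study.

```python
def sample_pathway_indices(num_frames: int, tau: int = 16, alpha: int = 8):
    """Return (slow, fast) frame indices for one clip.

    tau   -- temporal stride of the Slow pathway (assumed value)
    alpha -- how many times denser the Fast pathway samples (assumed value)
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if tau % alpha != 0:
        raise ValueError("tau must be divisible by alpha")

    fast_stride = tau // alpha  # Fast pathway samples alpha times more often

    # Slow pathway: few frames, intended to capture spatial appearance.
    slow = list(range(0, num_frames, tau))
    # Fast pathway: many frames, intended to capture motion between frames.
    fast = list(range(0, num_frames, fast_stride))
    return slow, fast


slow_idx, fast_idx = sample_pathway_indices(64)
print(len(slow_idx), len(fast_idx))
```

With these assumed settings, every frame seen by the Slow pathway is also seen by the Fast pathway, which sees `alpha` times as many frames in total. In a full model, each list of indices would select the frames fed to the corresponding 3D convolutional backbone.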