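Computer science
Closed captioning
Artificial intelligence
Short-term memory
Residual neural network
Deep learning
Recurrent neural network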
Authors
Abid Kapadi, Chinmay Ram Kavimandan, Chinmay Sandeep Mandke, Sangita Chaudhari
Source
Journal: Studies in Computational Intelligence
Date: 2021-01-01
Pages: 353-363
Identifier
DOI:10.1007/978-3-030-68291-0_28
Abstract
Wildlife videos often have elaborate dynamics, and generating captions for wildlife clips involves both natural language processing and computer vision. Current video captioning techniques have shown encouraging results. However, these techniques derive captions from video frames only, ignoring audio information. In this paper we propose to generate natural-language video captions using both audio and visual information. We use deep neural networks that combine convolutional neural networks (CNN) and recurrent neural networks (RNN). Experimental results on a corpus of wildlife clips show that fusing audio information greatly improves the efficiency of video description. These superior results are achieved using CNNs and RNNs.
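
The abstract does not say how the audio and visual streams are combined. As a rough illustration only, the sketch below assumes early fusion by concatenating per-time-step feature vectors before they would be passed to a recurrent decoder. The function name `fuse_frame_features` and the toy feature values are hypothetical and are not taken from the paper.

```python
from typing import List


def fuse_frame_features(
    visual: List[List[float]], audio: List[List[float]]
) -> List[List[float]]:
    """Concatenate visual and audio feature vectors at each time step.

    Assumption: both sequences are already aligned to the same number
    of time steps. The paper's actual fusion scheme may differ.
    """
    if len(visual) != len(audio):
        raise ValueError("visual and audio sequences must have the same length")
    return [v + a for v, a in zip(visual, audio)]


# Hypothetical toy inputs: 3 time steps, 4-dim visual features
# (standing in for CNN outputs) and 2-dim audio features.
visual_feats = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8], [0.9, 1.0, 1.1, 1.2]]
audio_feats = [[0.01, 0.02], [0.03, 0.04], [0.05, 0.06]]

fused = fuse_frame_features(visual_feats, audio_feats)
print(len(fused), len(fused[0]))  # 3 time steps, 6 features per step
```

Each fused vector would then serve as one input step to a sequence model such as an RNN. Concatenation is only one possible choice; late fusion or attention-based fusion are alternatives that the abstract does not rule out.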