A Survey of Temporal Activity Localization via Language in Untrimmed Videos
Computer science
Artificial intelligence
Natural language processing
Authors
Yulan Yang,Zhaohui Li,Gangyan Zeng
Source
Venue: 2020 International Conference on Culture-oriented Science & Technology (ICCST); Date: 2020-10-01; Citations: 1
Identifier
DOI:10.1109/iccst50977.2020.00123
Abstract
Video is one of the most informative media, combining visual, textual, and audio content. As the number of videos on the Internet grows explosively, it is increasingly necessary for machines to understand the semantic information in videos accurately. Temporal activity localization via language is one such task: given a natural language query, it localizes the video moment that is most semantically similar to the query. The task is challenging because it requires not only a deep understanding of both sentences and videos, but also modeling of the fine-grained interactions between the two modalities. In this paper, we present a comprehensive survey of existing temporal sentence localization techniques. First, we classify and analyze these methods in detail. Then we discuss the experimental results and performance of existing approaches. Finally, we offer some insights into future research directions.