Computer science
Construct (python library)
Task (project management)
Event (particle physics)
Field (mathematics)
Focus (optics)
Plot (graphics)
Multimedia
Clipping
Artificial intelligence
Physics
Quantum mechanics
Statistics
Mathematics
Management
Pure mathematics
Optics
Economics
Programming language
Authors
Tai-Te Chu, An-Zi Yen, Wei-Hong Ang, Hen-Hsen Huang, Hsin-Hsi Chen
Identifier
DOI: 10.1145/3459637.3482022
Abstract
Filming video blogs, or vlogs, has become a popular way for people to record their life experiences in recent years. In this work, we present a novel task aimed at extracting life events from videos and constructing personal knowledge bases of individuals. In contrast to most existing research in computer vision, which focuses on identifying low-level, script-like activities such as moving boxes, our goal is to extract life events in which high-level activities, such as moving into a new house, are recorded. The challenges to be tackled include: (1) identifying which objects in a given scene are related to the life events of the protagonist of concern, and (2) determining the association between an extracted visual concept and a higher-level description of a video clip. To address these research issues, we construct VidLife, a video life event extraction dataset built from videos of the TV series The Big Bang Theory, whose plot revolves around the daily lives of several characters. A pilot multitask learning model is proposed to extract life events from video clips and subtitles for storage in the personal knowledge base.
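The abstract describes the multitask model only at a high level: it consumes video clips and subtitles and jointly learns life-event-related predictions. The sketch below is a minimal, hypothetical PyTorch illustration of that kind of setup, assuming pre-extracted clip features and subtitle embeddings as inputs and two jointly trained heads (event presence and event type). The class name, feature dimensions, fusion by concatenation, and the label spaces are all assumptions for illustration, not the authors' architecture.

```python
# Hypothetical multitask sketch: fuse video-clip features with subtitle
# embeddings and jointly predict (1) whether the clip contains a life event
# and (2) the event type. Dimensions and heads are illustrative assumptions.
import torch
import torch.nn as nn

class MultitaskLifeEventModel(nn.Module):
    def __init__(self, video_dim=2048, text_dim=768, hidden_dim=512, num_event_types=10):
        super().__init__()
        # Shared encoder over the concatenated video and subtitle features.
        self.shared = nn.Sequential(
            nn.Linear(video_dim + text_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
        )
        # Task 1: does this clip record a life event? (binary)
        self.event_presence_head = nn.Linear(hidden_dim, 2)
        # Task 2: which life event type is depicted? (multi-class)
        self.event_type_head = nn.Linear(hidden_dim, num_event_types)

    def forward(self, video_feat, subtitle_feat):
        # video_feat: (batch, video_dim), e.g. pooled per-clip visual features
        # subtitle_feat: (batch, text_dim), e.g. a sentence embedding of the subtitles
        fused = torch.cat([video_feat, subtitle_feat], dim=-1)
        h = self.shared(fused)
        return self.event_presence_head(h), self.event_type_head(h)

if __name__ == "__main__":
    model = MultitaskLifeEventModel()
    video = torch.randn(4, 2048)      # dummy clip features
    subtitles = torch.randn(4, 768)   # dummy subtitle embeddings
    presence_logits, type_logits = model(video, subtitles)
    # Multitask objective: sum of the two cross-entropy terms.
    loss = nn.CrossEntropyLoss()(presence_logits, torch.randint(0, 2, (4,))) \
         + nn.CrossEntropyLoss()(type_logits, torch.randint(0, 10, (4,)))
    print(presence_logits.shape, type_logits.shape, loss.item())
```

The predicted event labels from such a model would then be candidates for insertion into a personal knowledge base, which is the downstream use the abstract names.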