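Computer science
Automatic summarization
Key frame
Cluster analysis
Artificial intelligence
Key (lock)
Information retrieval
Frame (networking)
Feature extraction
Computer vision
Feature (linguistics)
Multimedia
Computer security
Linguistics
Telecommunications
Philosophy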
Authors
Seema Rani,Mukesh Kumar
Identifier
DOI:10.1016/j.ipm.2019.102190
Abstract
Social networking tools such as Facebook, YouTube, Twitter, and Instagram are becoming major platforms for communication. YouTube, one of the primary video-sharing platforms, serves over 100 million distinct videos, and 300 hours of video are uploaded to it every minute along with textual data. This massive amount of multimedia data needs to be managed efficiently, and irrelevant and redundant data needs to be removed. Video summarization deals with the problem of redundant data in a video. A summarized video contains the most distinct frames, which are termed key frames. Most research on key frame extraction considers only a single visual feature, which is not sufficient to capture the full pictorial detail and hence degrades the quality of the generated video summary. There is therefore a need to explore multiple visual features for key frame extraction. This work proposes a key frame extraction technique based on the fusion of four visual features: correlation of RGB color channels, color histogram, mutual information, and moments of inertia. A Kohonen Self-Organizing Map is used as the clustering approach to find the most representative frames from the list of frames obtained after fusion. Useless frames are discarded, and frames having maximum Euclidean distance within a cluster are selected as the final key frames. The results of the proposed technique are compared with existing video summarization techniques: user-generated summary, Video SUMMarization (VSUMM), and Video Key Frame Extraction through Dynamic Delaunay Clustering (VKEDDCSC). The comparison shows a considerable improvement in terms of fidelity and Shot Reconstruction Degree (SRD) score.
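The pipeline outlined in the abstract (fuse per-frame features, cluster with a self-organizing map, pick one frame per cluster) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It covers only two of the four features (color histogram and RGB channel correlation) and omits mutual information and moments of inertia. The function names `frame_features`, `train_som`, and `select_key_frames`, the 1-D SOM layout, and all hyperparameters are hypothetical choices made for illustration. The abstract does not say what the "maximum Euclidean distance within a cluster" is measured against, so the sketch assumes the cluster mean.

```python
import numpy as np


def frame_features(frame, bins=8):
    """Fuse a per-channel color histogram with pairwise RGB channel correlations."""
    channels = frame.reshape(-1, 3).astype(float)
    hists = [np.histogram(channels[:, c], bins=bins, range=(0, 256))[0] for c in range(3)]
    hist = np.concatenate(hists) / channels.shape[0]  # each channel histogram sums to 1
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.corrcoef(channels, rowvar=False)  # 3x3 matrix over R, G, B
    corr = np.nan_to_num(corr)  # a constant channel yields NaN; treat it as 0
    return np.concatenate([hist, corr[np.triu_indices(3, k=1)]])


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small 1-D Kohonen map (assumed layout) and return its node weights."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(data), n_nodes, replace=len(data) < n_nodes)
    weights = data[start].astype(float)
    grid = np.arange(n_nodes)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        lr, sigma = lr0 * decay, max(sigma0 * decay, 1e-3)
        for x in data[rng.permutation(len(data))]:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            h = np.exp(-((grid - bmu) ** 2) / (2 * sigma ** 2))   # neighborhood kernel
            weights += lr * h[:, None] * (x - weights)
    return weights


def select_key_frames(features, weights):
    """Pick, per SOM cluster, the frame farthest from the cluster mean (assumption)."""
    dists_to_nodes = np.linalg.norm(features[:, None, :] - weights[None, :, :], axis=2)
    labels = np.argmin(dists_to_nodes, axis=1)
    key_frames = []
    for node in np.unique(labels):
        members = np.flatnonzero(labels == node)
        spread = np.linalg.norm(features[members] - features[members].mean(axis=0), axis=1)
        key_frames.append(int(members[np.argmax(spread)]))
    return sorted(key_frames)


if __name__ == "__main__":
    # Synthetic stand-in for decoded video frames: six dark, then six bright.
    rng = np.random.default_rng(1)
    frames = np.concatenate([
        rng.integers(0, 60, size=(6, 16, 16, 3)),
        rng.integers(180, 256, size=(6, 16, 16, 3)),
    ])
    feats = np.stack([frame_features(f) for f in frames])
    print(select_key_frames(feats, train_som(feats, n_nodes=3)))
```

Each frame is expected as an `H x W x 3` array of 8-bit RGB values. In practice the fused features would likely need per-feature scaling before clustering, since histogram bins and correlation coefficients live on different ranges.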