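Computer science
Search engine indexing
Multimedia
Semantics (computer science)
Audio analyzer
Information retrieval
Segmentation
Audiovisual
Testbed
Audio signal processing
Artificial intelligence
Audio signal
Speech recognition
World Wide Web
Speech coding
Programming language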
Authors
Yao Wang, Zhu Liu, Jincheng Huang
Source
Journal: IEEE Signal Processing Magazine
[Institute of Electrical and Electronics Engineers]
Date: 2000-01-01
Volume (issue): 17 (6): 12-36
Citations: 377
Abstract
Multimedia content analysis refers to the computerized understanding of the semantic meanings of a multimedia document, such as a video sequence with an accompanying audio track. With a multimedia document, its semantics are embedded in multiple forms that are usually complementary to each other. Therefore, it is necessary to analyze all types of data: image frames, sound tracks, texts that can be extracted from image frames, and spoken words that can be deciphered from the audio track. This usually involves segmenting the document into semantically meaningful units, classifying each unit into a predefined scene type, and indexing and summarizing the document for efficient retrieval and browsing. We review advances in using audio and visual information jointly for accomplishing the above tasks. We describe audio and visual features that can effectively characterize scene content, present selected algorithms for segmentation and classification, and review some testbed systems for video archiving and retrieval. We also describe audio and visual descriptors and description schemes that are being considered by the MPEG-7 standard for multimedia content description.