计算机科学
水准点(测量)
构造(python库)
数据库
视频质量
质量(理念)
PEVQ公司
情报检索
数据挖掘
人工智能
机器学习
主观视频质量
图像质量
图像(数学)
大地测量学
哲学
经济
公制(单位)
认识论
程序设计语言
地理
运营管理
作者
Yuqin Cao,Xiongkuo Min,Wei Sun,Guangtao Zhai
出处
期刊:IEEE transactions on image processing
[Institute of Electrical and Electronics Engineers]
日期:2023-01-01
卷期号:32: 3847-3861
被引量:14
标识
DOI:10.1109/tip.2023.3290528
摘要
In recent years, User Generated Content (UGC) has grown dramatically in video sharing applications. It is necessary for service-providers to use video quality assessment (VQA) to monitor and control users' Quality of Experience when watching UGC videos. However, most existing UGC VQA studies only focus on the visual distortions of videos, ignoring that the perceptual quality also depends on the accompanying audio signals. In this paper, we conduct a comprehensive study on UGC audio-visual quality assessment (AVQA) from both subjective and objective perspectives. Specially, we construct the first UGC AVQA database named SJTU-UAV database, which includes 520 in-the-wild UGC audio and video (A/V) sequences collected from the YFCC100m database. A subjective AVQA experiment is conducted on the database to obtain the mean opinion scores (MOSs) of the A/V sequences. To demonstrate the content diversity of the SJTU-UAV database, we give a detailed analysis of the SJTU-UAV database as well as other two synthetically-distorted AVQA databases and one authentically-distorted VQA database, from both the audio and video aspects. Then, to facilitate the development of AVQA fields, we construct a benchmark of AVQA models on the proposed SJTU-UAV database and other two AVQA databases, of which the benchmark models consist of AVQA models designed for synthetically distorted A/V sequences and AVQA models built through combining the popular VQA methods and audio features via support vector regressor (SVR). Finally, considering benchmark AVQA models perform poorly in assessing in-the-wild UGC videos, we further propose an effective AVQA model via jointly learning quality-aware audio and visual feature representations in the temporal domain, which is seldom investigated by existing AVQA models. Our proposed model outperforms the aforementioned benchmark AVQA models on the SJTU-UAV database and two synthetically distorted AVQA databases. The SJTU-UAV database and the code of the proposed model will be released to facilitate further research.
科研通智能强力驱动
Strongly Powered by AbleSci AI