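Computer science
Artificial intelligence
Codec
Convolutional neural network
Computer vision
Data compression
Object detection
Quantization (signal processing)
Encoder
Video quality
Video tracking
Deep learning
Multiview video coding
Image compression
Video processing
Pattern recognition (psychology)
Image processing
Image (mathematics)
Computer hardware
Operating system
Operations management
Economics
Metric (unit)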
Authors
Florian Beye, Hayato Itsumi, Charvi Vitthal, Koichi Nihei
Identifier
DOI:10.1109/icip46576.2022.9897530
Abstract
Image recognition techniques such as object detection are useful for assisting humans in remote video surveillance tasks. However, compression algorithms used for efficient video transmission are usually tuned for low reconstruction error and not for machine vision, leading to suboptimal recognition accuracy. In this work, we propose convolutional encoder-decoder neural networks for compressing video data intended for object detection. These networks are trained for optimal detection accuracy and bitrate, and make use of a novel stochastic quantization technique. In our experiments, we evaluate our method using publicly available datasets and show that we can substantially reduce bitrate compared with traditional codecs such as H.265, and also compared with other deep-learning-based compression methods, at identical object detection accuracy. Moreover, we argue that the perceived image quality of our compression method is close to that of H.265 at similar bitrates.
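The abstract does not say how the authors' stochastic quantization works. As a generic illustration only, the sketch below shows plain stochastic rounding, a common way to quantize values so that the result is unbiased on average. It is not the method proposed in the paper, and the names `stochastic_round`, `quantize`, `dequantize`, and the `step` parameter are hypothetical.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x down or up at random so that the expected result equals x.

    Assumption: this is generic stochastic rounding, not the paper's technique.
    """
    lower = math.floor(x)
    frac = x - lower  # probability of rounding up
    return lower + (1 if rng.random() < frac else 0)


def quantize(values, step: float, rng: random.Random):
    """Map each value to an integer index on a uniform grid of width `step`."""
    return [stochastic_round(v / step, rng) for v in values]


def dequantize(indices, step: float):
    """Map integer indices back to grid points."""
    return [i * step for i in indices]


if __name__ == "__main__":
    rng = random.Random(0)
    latent = [0.12, -0.47, 1.30, 0.75]  # hypothetical encoder outputs
    indices = quantize(latent, step=0.5, rng=rng)
    print(indices, dequantize(indices, step=0.5))
```

Unlike deterministic rounding, each value lands on one of its two neighbouring grid points with a probability proportional to its distance from the other, so the quantization error averages out to zero over many samples.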