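Computer science
Artificial intelligence
JPEG 2000
Encoder
Semantic computing
Deep learning
Semantic similarity
JPEG
Computer vision
Image compression
Image processing
Data compression
Image (mathematics)
Semantic Web
Operating system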
Authors
Danlan Huang,Xiaoming Tao,Feifei Gao,Jianhua Lü
Identifier
DOI:10.1109/globecom46510.2021.9685667
Abstract
This paper presents Generative Adversarial Network (GAN)-based image semantic coding, whose goal is semantic exchange rather than symbol transmission. State-of-the-art visually pleasing reconstruction and semantic-preserving performance are obtained at extremely low bitrates via a rate-perception-distortion optimization framework. In particular, we investigate a convolutional encoder, a quantizer, a conditional SPADE generator, residual coding, and perceptual losses. In contrast to previous work, we design a coarse-to-fine image semantic coding model for multimedia semantic communication systems. The base layer of the image is fully generated and preserves semantic information, while the enhancement layer restores the fine details. We explore the trade-off between perception and distortion performance by tuning the rates of the base layer and the enhancement layer. Unlike existing methods that adopt pixel accuracy as the distortion metric, we train and evaluate the proposed image semantic coding model with multiple perception metrics, in line with the purpose of semantic communications. Experimental results demonstrate that our model achieves visually pleasing and semantically consistent reconstruction while requiring several times less bitrate than BPG, WebP, JPEG2000, JPEG, and other deep learning-based image codecs.
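The coarse-to-fine idea in the abstract (a base layer plus a residual enhancement layer, each with its own rate) can be illustrated with a toy sketch. This is not the paper's model: the GAN-based base layer and SPADE generator are replaced here by a plain uniform scalar quantizer, and every name (`quantize`, `encode_two_layer`, `base_step`, `enh_step`) is a hypothetical chosen for illustration only.

```python
# Toy two-layer (coarse-to-fine) coder. ASSUMPTION: a coarse uniform quantizer
# stands in for the paper's generated base layer; nothing here is the authors' code.

def quantize(values, step):
    """Uniform scalar quantizer: map each value to an integer index."""
    return [round(v / step) for v in values]

def dequantize(indices, step):
    """Map integer indices back to reconstruction levels."""
    return [i * step for i in indices]

def encode_two_layer(signal, base_step, enh_step):
    """Encode a coarse base layer, then code the residual it leaves behind."""
    base_idx = quantize(signal, base_step)
    base_rec = dequantize(base_idx, base_step)
    residual = [s - b for s, b in zip(signal, base_rec)]
    enh_idx = quantize(residual, enh_step)  # finer step: more bits, more detail
    return base_idx, enh_idx

def decode_two_layer(base_idx, enh_idx, base_step, enh_step):
    """Decode the base layer alone (enh_idx=None) or base plus enhancement."""
    base_rec = dequantize(base_idx, base_step)
    if enh_idx is None:
        return base_rec
    enh_rec = dequantize(enh_idx, enh_step)
    return [b + e for b, e in zip(base_rec, enh_rec)]

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

signal = [0.1, 0.45, 0.8, 0.33]
base_idx, enh_idx = encode_two_layer(signal, base_step=0.5, enh_step=0.05)
base_only = decode_two_layer(base_idx, None, 0.5, 0.05)
full = decode_two_layer(base_idx, enh_idx, 0.5, 0.05)
print("base-only MSE:", mse(signal, base_only))
print("base+enhancement MSE:", mse(signal, full))
```

Shrinking `enh_step` spends more rate on the enhancement layer and lowers pixel distortion; the sketch only measures mean squared error, whereas the paper evaluates with perception metrics.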