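Computer science
Image (mathematics)
Image editing
Artificial intelligence
Image processing
Computer vision
Domain (mathematical analysis)
Scale (ratio)
Image synthesis
Mathematics
Cartography
Mathematical analysis
Geography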
Authors
K. Yamamoto,Keiji Yanai
Identifier
DOI:10.1145/3552484.3555751
Abstract
Recently, large-scale language-image pre-trained models such as CLIP have drawn much attention due to their remarkable ability on various tasks, including classification and image synthesis. Combining CLIP with a GAN enables text-based image manipulation and text-based image synthesis, and several such combined models have been proposed so far. However, their effectiveness in the food image domain has not yet been examined comprehensively. In this paper, we report the results of experiments on text-based food image manipulation using VQGAN-CLIP and discuss the possibility of manipulating food images by text.