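Computer science
Ground truth
Pixel
Code (set theory)
Segmentation
Computer graphics
Drawing
Artificial intelligence
Set (abstract data type)
Source code
Computer game
Training set
Image (mathematics)
Annotation
Computer vision
Video game
Computer graphics (images)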
Authors
Stephan R. Richter, Vibhav Vineet, Stefan Roth, Vladlen Koltun
Identifier
DOI:10.1007/978-3-319-46475-6_7
Abstract
Recent progress in computer vision has been driven by high-capacity models trained on large datasets. Unfortunately, creating large datasets with pixel-level labels has been extremely costly due to the amount of human effort required. In this paper, we present an approach to rapidly creating pixel-accurate semantic label maps for images extracted from modern computer games. Although the source code and the internal operation of commercial games are inaccessible, we show that associations between image patches can be reconstructed from the communication between the game and the graphics hardware. This enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content. We validate the presented approach by producing dense pixel-level semantic annotations for 25 thousand images synthesized by a photorealistic open-world computer game. Experiments on semantic segmentation datasets show that using the acquired data to supplement real-world images significantly increases accuracy and that the acquired data enables reducing the amount of hand-labeled real-world data: models trained with game data and just $\tfrac{1}{3}$ of the CamVid training set outperform models trained on the complete CamVid training set.
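The label propagation described in the abstract can be illustrated with a minimal sketch. The abstract does not say what form the reconstructed associations take, so the sketch assumes each image patch carries a hypothetical `resource_key` identifying the rendering resources it was drawn with, and that patches sharing a key share a semantic class. The function name, the data layout, and the example keys are all illustrative assumptions, not the authors' implementation.

```python
def propagate_labels(patches, seed_labels):
    """Spread a few hand-assigned labels to all associated patches.

    patches:     list of (patch_id, resource_key) pairs; resource_key is a
                 hypothetical stand-in for the patch associations that the
                 abstract says are reconstructed from game/GPU communication.
    seed_labels: dict mapping a few patch_ids to labels given by an annotator.
    Returns a dict mapping every patch_id to a label, or None if unlabeled.
    """
    key_to_label = {}
    for patch_id, key in patches:
        label = seed_labels.get(patch_id)
        if label is not None:
            # The first annotation seen for a key wins.
            key_to_label.setdefault(key, label)
    return {patch_id: key_to_label.get(key) for patch_id, key in patches}


# Toy data: two images, two patches each (all identifiers are made up).
patches = [
    ("img0/p0", ("mesh_a", "tex_1")),
    ("img0/p1", ("mesh_b", "tex_2")),
    ("img1/p0", ("mesh_a", "tex_1")),
    ("img1/p1", ("mesh_c", "tex_3")),
]
seeds = {"img0/p0": "car", "img0/p1": "road"}
labels = propagate_labels(patches, seeds)
```

In this toy run, the label given to `img0/p0` carries over to `img1/p0` in a different image because both share the same key, while `img1/p1` stays unlabeled until an annotator labels a patch with its key.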