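Computer science
RGB color model
Context (archaeology)
Scalability
Deep learning
Key (lock)
Computer vision
Object (grammar)
Feature (linguistics)
Artificial intelligence
Annotation
Database
Paleontology
Linguistics
Philosophy
Computer security
Biology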
Authors
Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, Matthias Nießner
Source
Journal: Computer Vision and Pattern Recognition
Date: 2017-07-01
Citations: 2116
Identifier
DOI: 10.1109/cvpr.2017.261
Abstract
A key requirement for leveraging supervised deep learning methods is the availability of large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very little data is available: current datasets cover a small range of scene views and have limited semantic annotations. To address this issue, we introduce ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations. To collect this data, we designed an easy-to-use and scalable RGB-D capture system that includes automated surface reconstruction and crowd-sourced semantic annotation. We show that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks, including 3D object classification, semantic voxel labeling, and CAD model retrieval.