Computer science
Encoder
Transformer
Artificial intelligence
Feature extraction
Edge device
Convolutional neural network
Deep learning
Real-time computing
Engineering
Voltage
Electrical engineering
Cloud computing
Operating system
Authors
Xihao Liu, Wei Wei, Cheng Liu, Yuyang Peng, Jinhao Huang, Jun Li
Source
Journal: IEEE Transactions on Instrumentation and Measurement
[Institute of Electrical and Electronics Engineers]
Date: 2023-01-01
Volume 72, pp. 1-9
Citations: 6
Identifiers
DOI:10.1109/tim.2023.3264039
Abstract
Depth estimation is requisite for building the 3D perceiving capability of the artificial intelligence of things (AIoT). Real-time inference with extremely low computing resource consumption is critical on edge devices. However, most single-view depth estimation networks focus on improving accuracy on high-end GPUs, which runs counter to the real-time requirements of edge devices. To address this issue, this article proposes a novel encoder-decoder network that realizes real-time monocular depth estimation on edge devices. The proposed network merges semantic information over the global field via an efficient transformer-based module to provide more object detail for depth assignment. The transformer-based module is integrated at the lowest-resolution level of the encoder-decoder architecture to largely reduce the parameters of the Vision Transformer (ViT). In particular, we propose a novel patch convolutional layer for low-latency feature extraction in the encoder and an SConv5 layer for effective depth assignment in the decoder. The proposed network achieves an outstanding balance between accuracy and speed on the NYU Depth v2 dataset: a low RMSE of 0.554 and a fast speed of 58.98 FPS on an NVIDIA Jetson Nano device with TensorRT optimization, outperforming most state-of-the-art real-time results.
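The abstract does not give the internals of the patch convolutional layer or the SConv5 layer, so the following is only a minimal PyTorch sketch of the general architecture it describes: a convolutional encoder, a transformer block applied at the lowest-resolution feature map to keep the token count (and hence the ViT-style cost) small, and an upsampling decoder that regresses a depth map. All module names, channel widths, and layer choices here are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): conv encoder,
# transformer block at the lowest resolution, upsampling decoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyDepthNet(nn.Module):
    def __init__(self, base=32, heads=4):
        super().__init__()
        # Encoder: three stride-2 conv stages (stand-ins for the paper's
        # patch convolutional layers, whose details the abstract omits).
        self.enc1 = nn.Sequential(nn.Conv2d(3, base, 3, 2, 1), nn.BatchNorm2d(base), nn.ReLU(inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, 2, 1), nn.BatchNorm2d(base * 2), nn.ReLU(inplace=True))
        self.enc3 = nn.Sequential(nn.Conv2d(base * 2, base * 4, 3, 2, 1), nn.BatchNorm2d(base * 4), nn.ReLU(inplace=True))
        # Transformer applied only at the lowest-resolution feature map,
        # so global semantic mixing stays cheap in parameters and compute.
        self.bottleneck = nn.TransformerEncoderLayer(
            d_model=base * 4, nhead=heads, dim_feedforward=base * 8, batch_first=True)
        # Decoder: bilinear upsampling + conv (stand-in for the SConv5 layer).
        self.dec2 = nn.Sequential(nn.Conv2d(base * 4, base * 2, 3, 1, 1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(nn.Conv2d(base * 2, base, 3, 1, 1), nn.ReLU(inplace=True))
        self.head = nn.Conv2d(base, 1, 3, 1, 1)  # single-channel depth map

    def forward(self, x):
        f1 = self.enc1(x)   # 1/2 resolution
        f2 = self.enc2(f1)  # 1/4 resolution
        f3 = self.enc3(f2)  # 1/8 resolution (lowest level)
        b, c, h, w = f3.shape
        tokens = f3.flatten(2).transpose(1, 2)  # (B, H*W, C) token sequence
        tokens = self.bottleneck(tokens)        # global attention over tokens
        f3 = tokens.transpose(1, 2).reshape(b, c, h, w)
        d2 = self.dec2(F.interpolate(f3, scale_factor=2, mode="bilinear", align_corners=False)) + f2
        d1 = self.dec1(F.interpolate(d2, scale_factor=2, mode="bilinear", align_corners=False)) + f1
        return self.head(F.interpolate(d1, scale_factor=2, mode="bilinear", align_corners=False))

# Usage example: one forward pass at NYU Depth v2's typical 640x480 input size.
if __name__ == "__main__":
    net = TinyDepthNet().eval()
    with torch.no_grad():
        pred = net(torch.randn(1, 3, 480, 640))
    print(pred.shape)  # torch.Size([1, 1, 480, 640])
```

For edge deployment as described in the abstract, a model of this kind would typically be exported (e.g., via ONNX) and optimized with TensorRT before benchmarking on a Jetson-class device; those steps are outside this sketch.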