Computer science
Transformer
Optical flow
Artificial intelligence
Electrical engineering
Image (mathematics)
Voltage
Engineering
Authors
Zhaoyang Huang, Xiaoyu Shi, Chao Zhang, Qiang Wang, Ka Chun Cheung, Hongwei Qin, Jifeng Dai, Hongsheng Li
Identifier
DOI:10.1007/978-3-031-19790-1_40
Abstract
We introduce optical Flow transFormer, dubbed FlowFormer, a transformer-based neural network architecture for learning optical flow. FlowFormer tokenizes the 4D cost volume built from an image pair, encodes the cost tokens into a cost memory with alternate-group transformer (AGT) layers in a novel latent space, and decodes the cost memory via a recurrent transformer decoder with dynamic positional cost queries. On the Sintel benchmark, FlowFormer achieves 1.144 and 2.183 average end-point-error (AEPE) on the clean and final passes, a 17.6% and 11.6% error reduction from the best published results (1.388 and 2.47). FlowFormer also generalizes strongly: without being trained on Sintel, it achieves 0.95 AEPE on the clean pass of the Sintel training set, outperforming the best published result (1.29) by 26.9%.
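As a reading aid for the metric quoted above, here is a minimal sketch of how an average end-point error and a relative error reduction are commonly computed. It assumes AEPE is the mean Euclidean distance between predicted and ground-truth flow vectors. The function names `average_endpoint_error` and `relative_reduction` are hypothetical and are not taken from the paper or its code.

```python
import math


def average_endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both arguments are equal-length sequences of (u, v) displacement pairs,
    one pair per pixel. This layout is an assumption made for illustration.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("flow fields must be non-empty and the same length")
    total = sum(
        math.hypot(pu - gu, pv - gv)
        for (pu, pv), (gu, gv) in zip(pred, gt)
    )
    return total / len(pred)


def relative_reduction(baseline, new):
    """Fraction by which `new` lowers the error relative to `baseline`."""
    return (baseline - new) / baseline


# Two-pixel toy example: one pixel is off by a (3, 4) vector, the other is exact.
toy_aepe = average_endpoint_error([(3.0, 4.0), (0.0, 0.0)],
                                  [(0.0, 0.0), (0.0, 0.0)])

# Sintel benchmark figures quoted in the abstract (clean and final pass).
clean_gain = relative_reduction(1.388, 1.144)
final_gain = relative_reduction(2.47, 2.183)
```

Under these assumptions, `clean_gain` and `final_gain` round to the 17.6% and 11.6% reductions stated in the abstract.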