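Computer science
Artificial intelligence
Relation (database)
Pattern recognition (psychology)
Robustness (evolution)
Face (sociological concept)
Estimation
Computer vision
Machine learning
Data mining
Engineering
Sociology
Gene
Biochemistry
Chemistry
Systems engineering
Social science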
Authors
Jiahao Xia,Min Xu,Haimin Zhang,Jianguo Zhang,Wenjian Huang,Hu Cao,Shiping Wen
Identifier
DOI:10.1109/tpami.2023.3260926
Abstract
Humans tend to locate heavily occluded facial landmarks by their position relative to easily identified landmarks. This cue is defined as the inherent relation among landmarks, yet it is ignored by most existing methods. In this paper, we present the Dynamic Sparse Local Patch Transformer (DSLPT), a novel face alignment framework for inherent relation learning and uncertainty estimation. Unlike most existing methods, which regress facial landmarks directly from global features, the DSLPT first generates a rough representation of each landmark from a local patch cropped from the feature map and then adaptively aggregates these representations using a case-dependent inherent relation. Finally, the DSLPT predicts the coordinate and uncertainty of each landmark by regressing its probability distribution from the output features. Moreover, we introduce a coarse-to-fine framework that incorporates the DSLPT for improved results. In this framework, the position and size of each patch are determined by the probability distribution of the corresponding landmark predicted in the previous stage. The dynamic patches ensure a fine-grained landmark representation for inherent relation learning, so that a rough prediction can gradually converge to the target facial landmarks. We integrate the coarse-to-fine model into an end-to-end training pipeline and carry out experiments on mainstream benchmarks. The results demonstrate that the DSLPT achieves state-of-the-art performance with much lower computational complexity. The code and models are available at https://github.com/Jiahao-UTS/DSLPT.
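The coarse-to-fine step described in the abstract (a landmark's predicted probability distribution determines the position and size of its patch in the next stage) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state how the distribution is parameterized, so the sketch assumes a discrete probability grid over a square patch, takes the expected position as the coordinate and the per-axis standard deviation as the uncertainty, and uses a hypothetical rule (`scale`, `min_size`) to size the next patch. All function names are illustrative.

```python
import math


def softmax_2d(scores):
    """Normalize a 2D grid of raw scores into a probability distribution."""
    m = max(v for row in scores for v in row)  # subtract max for stability
    exps = [[math.exp(v - m) for v in row] for row in scores]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


def landmark_from_distribution(prob, center, size):
    """Return ((mean_x, mean_y), (std_x, std_y)) in image coordinates.

    prob   : H x W grid of probabilities over the patch (sums to 1)
    center : (cx, cy) of the square patch in image coordinates
    size   : side length of the patch in image coordinates
    """
    h, w = len(prob), len(prob[0])
    left, top = center[0] - size / 2.0, center[1] - size / 2.0
    # Image coordinates of the grid cell centers.
    xs = [left + (j + 0.5) * size / w for j in range(w)]
    ys = [top + (i + 0.5) * size / h for i in range(h)]
    cells = [(i, j) for i in range(h) for j in range(w)]
    mean_x = sum(prob[i][j] * xs[j] for i, j in cells)
    mean_y = sum(prob[i][j] * ys[i] for i, j in cells)
    var_x = sum(prob[i][j] * (xs[j] - mean_x) ** 2 for i, j in cells)
    var_y = sum(prob[i][j] * (ys[i] - mean_y) ** 2 for i, j in cells)
    return (mean_x, mean_y), (math.sqrt(var_x), math.sqrt(var_y))


def next_patch(mean, std, scale=3.0, min_size=4.0):
    """Assumed rule: recenter on the mean, shrink with lower uncertainty."""
    size = max(min_size, 2.0 * scale * max(std))
    return mean, size


# One refinement step on a toy 2 x 2 score grid peaked at the top-right cell.
prob = softmax_2d([[0.0, 2.0], [0.0, 0.0]])
mean, std = landmark_from_distribution(prob, center=(10.0, 10.0), size=4.0)
new_center, new_size = next_patch(mean, std)
```

Under these assumptions, a sharply peaked distribution yields a small standard deviation and therefore a smaller patch in the next stage, while a flat distribution keeps the patch large. This mirrors the abstract's claim that a rough prediction gradually converges to the target landmark.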