Keywords
Segmentation; point cloud; Transformer; fusion; scale; feature; computer vision; pattern recognition; artificial intelligence; computer science; engineering
Authors
Junli Zhang*, Zhipeng Jiang, Qinjun Qiu, Zheng Liu
Identifier
DOI: 10.1016/j.patcog.2024.110630
Abstract
Point cloud semantic segmentation is a key ingredient in understanding real-world scenes. Most existing approaches perform poorly at scene boundaries and struggle to recognize objects of different scales. In this paper, we propose a novel framework that incorporates the Transformer into the U-Net architecture for inferring pointwise semantics. Specifically, a Transformer-based cross-feature fusion module first employs geometric and semantic information to learn feature offsets that overcome the border ambiguity of segmentation results, and then uses the Transformer to learn cross-feature enhanced and fused encoder features. Additionally, to strengthen the network's structure-to-detail perception, an adaptive perception module employs cross-attention to adaptively allocate weights to encoder features at varying resolutions, establishing long-range contextual dependencies. Ablation studies validate the individual contributions of our module design choices. Compared with existing competitive methods, our approach achieves state-of-the-art performance and exhibits superior results on benchmarks. Code is available at https://github.com/xiluo-cug/TCFAP-Net.
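The adaptive perception module described above relies on cross-attention to re-weight encoder features coming from different resolutions. The sketch below is a minimal, illustrative example of how such a cross-resolution attention block could be wired up in PyTorch; it is not the authors' released TCFAP-Net code, and the class name `AdaptivePerceptionSketch`, the feature dimensions, and the residual/normalization choices are assumptions made only for illustration.

```python
# Minimal sketch (illustration only, not the authors' implementation) of a
# cross-attention block that lets coarse, low-resolution encoder features
# gather long-range context from fine, high-resolution features.
import torch
import torch.nn as nn


class AdaptivePerceptionSketch(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Cross-attention: coarse features act as queries, fine features
        # provide keys and values, so attention weights decide how much each
        # fine-scale feature contributes to each coarse point.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, coarse_feat: torch.Tensor, fine_feat: torch.Tensor) -> torch.Tensor:
        # coarse_feat: (B, N_coarse, C); fine_feat: (B, N_fine, C)
        attended, _ = self.cross_attn(query=coarse_feat, key=fine_feat, value=fine_feat)
        # Residual connection keeps the original coarse features and adds the
        # context gathered from the finer resolution.
        return self.norm(coarse_feat + attended)


if __name__ == "__main__":
    block = AdaptivePerceptionSketch(d_model=64, n_heads=4)
    coarse = torch.randn(2, 128, 64)   # e.g. features of 128 downsampled points
    fine = torch.randn(2, 1024, 64)    # e.g. features of 1024 points
    out = block(coarse, fine)
    print(out.shape)  # torch.Size([2, 128, 64])
```

In this sketch the attention weights play the role of the adaptively allocated weights mentioned in the abstract: each coarse query attends over all fine-scale features, establishing long-range dependencies across resolutions.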