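Computer science
Segmentation
Benchmark (surveying)
Scale (ratio)
Code (set theory)
Artificial intelligence
Margin (machine learning)
Range (aeronautics)
Task (project management)
Aerial imagery
Image (mathematics)
Image segmentation
Orientation (vector space)
Computer vision
Pattern recognition (psychology)
Machine learning
Cartography
Geography
Composite material
Programming language
Materials science
Management
Set (abstract data type)
Economics
Mathematics
Geometry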
Authors
Sihan Liu, Yuefeng Ma, Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji
Source
Journal: Cornell University - arXiv
Date: 2023-12-19
Identifier
DOI: 10.48550/arxiv.2312.12470
Abstract
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing, delineating specific regions in aerial images as described by textual queries. Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery, leading to suboptimal segmentation results. To address these challenges, we introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS. RMSIN incorporates an Intra-scale Interaction Module (IIM) to effectively address the fine-grained detail required at multiple scales and a Cross-scale Interaction Module (CIM) for integrating these details coherently across the network. Furthermore, RMSIN employs an Adaptive Rotated Convolution (ARC) to account for the diverse orientations of objects, a novel contribution that significantly enhances segmentation accuracy. To assess the efficacy of RMSIN, we have curated an expansive dataset comprising 17,402 image-caption-mask triplets, which is unparalleled in terms of scale and variety. This dataset not only presents the model with a wide range of spatial and rotational scenarios but also establishes a stringent benchmark for the RRSIS task, ensuring a rigorous evaluation of performance. Our experimental evaluations demonstrate the exceptional performance of RMSIN, surpassing existing state-of-the-art models by a significant margin. All datasets and code are made available at https://github.com/Lsan2401/RMSIN.