Keywords
全视子, Parsing, Computer science, Artificial intelligence, Line of sight, Line (geometry), Visual acuity, Computer vision, Optics, Mathematics, Physics, Geometry, Politics, Political science, Astrophysics, Law
Authors
Zhongqi Lin, Xudong Jiang, Zengwei Zheng
Identifier
DOI: 10.1109/tip.2025.3540265
Abstract
Current scene parsers have effectively distilled abstract relationships among refined instances, while overlooking the discrepancies arising from variations in scene depth. Hence, their potential to imitate the intrinsic 3D perception ability of humans is constrained. In accordance with the principle of perspective, we advocate first grading the depth of the scene into several slices, and then mining semantic correlations within a slice or between multiple slices. Two attention-based components, namely the Scene Depth Grading Module (SDGM) and the Edge-oriented Correlation Refining Module (EoCRM), comprise our framework, the Line-of-Sight Depth Network (LoSDN). SDGM grades the scene into several slices by calculating depth attention tendencies based on parameters with explicit physical meanings, e.g., albedo, occlusion, and specular embeddings. This process allocates numerous multi-scale instances to each scene slice based on their line-of-sight extension distance, establishing a solid groundwork for ordered association mining in EoCRM. Since the primary step in distinguishing distant, faint targets is boundary delineation, EoCRM implements edge-wise saliency quantification and association mining. Quantitative and diagnostic experiments on the Cityscapes, ADE20K, and PASCAL Context datasets reveal the competitiveness of LoSDN and the individual contribution of each component. Visualizations show that our strategy offers clear benefits in detecting distant, faint targets.
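To make the two-stage design described in the abstract concrete, below is a minimal PyTorch-style sketch of the general idea of softly grading pixels into depth slices and then refining each slice separately. This is not the authors' implementation: the class name DepthSliceRefinement, the plain 1x1 and 3x3 convolutions, the channel width, and the number of slices are all placeholders assumed for illustration.

# Minimal, illustrative sketch (not the authors' released code) of the two-stage idea
# in the abstract: softly grade pixels into depth slices, then refine each slice's
# features separately before recombining them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthSliceRefinement(nn.Module):
    def __init__(self, channels: int, num_slices: int = 4):
        super().__init__()
        # Stand-in for SDGM: predict a soft assignment of every pixel to a depth slice.
        # The paper derives this grading from physically meaningful cues (albedo,
        # occlusion, specular embeddings); a 1x1 conv is used here as a placeholder.
        self.slice_logits = nn.Conv2d(channels, num_slices, kernel_size=1)
        # Stand-in for EoCRM: one refinement branch per depth slice, so near and far
        # regions are processed separately before being recombined.
        self.refine = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            for _ in range(num_slices)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) backbone features.
        slices = F.softmax(self.slice_logits(feats), dim=1)  # (B, S, H, W), sums to 1 over S
        out = feats
        for s, branch in enumerate(self.refine):
            # Weight each branch's output by the pixel-wise membership of slice s.
            out = out + slices[:, s:s + 1] * branch(feats)
        return out

if __name__ == "__main__":
    x = torch.randn(2, 64, 32, 32)
    print(DepthSliceRefinement(channels=64)(x).shape)  # torch.Size([2, 64, 32, 32])

The soft slice assignment keeps the grading differentiable; per the abstract, the actual SDGM derives the grading from explicit physical parameters, and the per-slice refinement in EoCRM is edge-oriented rather than a generic convolution.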