Journal: IEEE Geoscience and Remote Sensing Letters [Institute of Electrical and Electronics Engineers] · Date: 2020-08-28 · Volume 19: 1-5 · Citations: 149
Identifier
DOI: 10.1109/LGRS.2020.3017414
Abstract
Deep learning (DL) has been garnering increasing attention in remote sensing (RS) due to its powerful data representation ability. In particular, deep models have proven effective for RS data classification based on a single given modality. However, with a single modality, the ability to identify materials remains limited due to the lack of feature diversity. To overcome this limitation, we present a simple but effective multimodal DL baseline following a deep encoder–decoder network architecture, EndNet for short, for the classification of hyperspectral and light detection and ranging (LiDAR) data. EndNet fuses the multimodal information by enforcing the fused features to reconstruct the multimodal input in turn. Such a reconstruction strategy is capable of better activating the neurons across modalities than conventional and widely used fusion strategies, e.g., early fusion, middle fusion, and late fusion. Extensive experiments conducted on two popular hyperspectral-LiDAR data sets demonstrate the superiority and effectiveness of the proposed EndNet over several state-of-the-art baselines on the hyperspectral-LiDAR classification task. The code will be available at https://github.com/danfenghong/IEEE_GRSL_EndNet, contributing to the RS community.
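The fusion-by-reconstruction idea described in the abstract can be sketched in a few lines of PyTorch: two per-modality encoders feed a fused latent code that both drives the classifier and is decoded back into each input modality, so a reconstruction loss forces the fused feature to retain cross-modal information. This is a minimal illustrative sketch, not the authors' released implementation (see the GitHub link above); the names `EndNetSketch` and `endnet_loss` are hypothetical, and the sizes `hsi_dim=144`, `lidar_dim=1`, and `n_classes=15` are assumptions loosely modeled on a Houston-style hyperspectral-LiDAR benchmark.

```python
import torch
import torch.nn as nn

class EndNetSketch(nn.Module):
    """Minimal encoder-decoder fusion sketch (illustrative, not the official release)."""

    def __init__(self, hsi_dim=144, lidar_dim=1, hidden=64, n_classes=15):
        super().__init__()
        # Per-modality encoders (layer sizes are assumptions).
        self.enc_hsi = nn.Sequential(
            nn.Linear(hsi_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU())
        self.enc_lidar = nn.Sequential(
            nn.Linear(lidar_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU())
        # Fuse the concatenated per-modality features into one shared code.
        self.fuse = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU())
        # Decoders reconstruct each modality "in turn" from the fused code.
        self.dec_hsi = nn.Linear(hidden, hsi_dim)
        self.dec_lidar = nn.Linear(hidden, lidar_dim)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, x_hsi, x_lidar):
        z = self.fuse(torch.cat([self.enc_hsi(x_hsi),
                                 self.enc_lidar(x_lidar)], dim=1))
        return self.classifier(z), self.dec_hsi(z), self.dec_lidar(z)


def endnet_loss(logits, rec_h, rec_l, y, x_h, x_l, lam=0.1):
    # Joint objective: classification plus cross-modal reconstruction.
    # The weight lam is an assumed hyperparameter, not taken from the paper.
    ce = nn.functional.cross_entropy(logits, y)
    rec = (nn.functional.mse_loss(rec_h, x_h)
           + nn.functional.mse_loss(rec_l, x_l))
    return ce + lam * rec


# Usage with random per-pixel spectra and LiDAR values:
model = EndNetSketch()
x_h, x_l = torch.randn(8, 144), torch.randn(8, 1)
logits, rec_h, rec_l = model(x_h, x_l)
loss = endnet_loss(logits, rec_h, rec_l, torch.randint(0, 15, (8,)), x_h, x_l)
```

Attaching the reconstruction loss to the fused code, rather than to each modality's own code, is what distinguishes this strategy from plain early/middle/late fusion: the shared representation cannot discard either modality without paying a reconstruction penalty.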