This paper proposes a substorm recognition method based on multimodal data. The multimodal data comprise auroral images from a satellite imager, plasma velocities from ground-based radar stations, and space parameters from spacecraft probes. Semantic features of the auroral images are extracted with a VGG-16 network. A group of LSTM networks then extracts sequential features from each modality independently, and a memory fusion network fuses the resulting multimodal sequential features. Substorm recognition experiments with different embedding features indicate that particle precipitation, as recorded in the auroral images, is a key physical process in substorm events.
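The data flow described above (one sequence encoder per modality, followed by a fusion step and a classifier) can be illustrated with a minimal, untrained sketch in pure Python. Several things here are assumptions rather than details from the paper: the image sequence is taken to be already reduced to feature vectors (for example by a VGG-16 network), all dimensions and names such as `LSTMCell` and `recognise_substorm` are hypothetical, the weights are random, and the fusion step is plain concatenation of the final hidden states, used only as a simple stand-in for the memory fusion network.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """Minimal LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_size for g in "ifgo"}

    def _gate(self, name, z, act):
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.W[name], self.b[name])]

    def step(self, x, h, c):
        z = list(x) + list(h)  # concatenate input and previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        o = self._gate("o", z, sigmoid)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new

    def encode(self, sequence):
        """Run the cell over a sequence and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            h, c = self.step(x, h, c)
        return h


def recognise_substorm(image_seq, radar_seq, probe_seq, encoders, w_out, b_out):
    """Encode each modality separately, fuse, and return a probability."""
    fused = []
    for seq, enc in zip((image_seq, radar_seq, probe_seq), encoders):
        # Assumption: concatenation stands in for the memory fusion network.
        fused.extend(enc.encode(seq))
    logit = sum(w * v for w, v in zip(w_out, fused)) + b_out
    return sigmoid(logit)


# Dummy data: 5 time steps per modality, hypothetical feature sizes.
rng = random.Random(0)
T, hidden = 5, 4
dims = {"image": 8, "radar": 3, "probe": 4}
encoders = [LSTMCell(d, hidden, rng) for d in dims.values()]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(len(dims) * hidden)]
seqs = [[[rng.random() for _ in range(d)] for _ in range(T)]
        for d in dims.values()]
prob = recognise_substorm(*seqs, encoders, w_out, 0.0)
print(f"substorm probability (untrained): {prob:.3f}")
```

Because the weights are random, the printed probability carries no physical meaning; the sketch only shows how the three modalities are encoded independently before being fused. Dropping one modality from the fused vector is one way to mimic the kind of comparison between embedding features that the abstract mentions.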