In recent years, many deep learning algorithms based on seismic signals have been proposed for moving target recognition in unattended ground sensor systems. Although these deep networks perform well, two problems remain. First, most of them are too large to deploy on low-power hardware and can run only on cloud-based devices. Second, because seismic signals are affected by the terrain, unattended ground sensors that rely on seismic signals alone cannot adapt to multiple terrain types. In response, this paper proposes MFC-TinyNet, a method designed for multiterrain conditions. The method adds depthwise separable convolutional layers to the network, which effectively reduces the network size while maintaining target recognition accuracy and addresses the difficulty of deploying the model on low-power hardware. It also uses Mel-frequency spectrum feature extraction to fuse sound and seismic signals, improving the model’s moving target recognition accuracy across multiple terrains. Experiments demonstrate that the method combines the advantages of a small network model and multiterrain applicability.
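
To illustrate why depthwise separable convolution shrinks a network, the following minimal sketch compares the weight counts of a standard convolution and its depthwise separable counterpart. The layer sizes (64 input channels, 128 output channels, a 3x3 kernel) are hypothetical values chosen for illustration and are not taken from MFC-TinyNet; a square 2D kernel is assumed and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3

std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
ratio = sep / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {sep}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 12% of the weights of the standard layer. The ratio 1/c_out + 1/k² shows that the saving grows with the number of output channels and the kernel size.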