A multi-sensor network usually produces a large volume of data, some of which represents specific meaningful events. For event-driven multi-sensor networks, event classification is the basis of subsequent high-level decisions and control. However, improving classification accuracy remains a challenge. Recently, deep learning methods have achieved great success in many conventional fields, and one of the most popular deep architectures is the convolutional neural network (CNN), which effectively exploits local features of input images. In this paper, we draw an analogy between an image and sensor data, and then propose a CNN-based method to improve event classification accuracy for homogeneous multi-sensor networks. A variant of AlexNet is designed and implemented to classify events from acoustic signals. The results indicate that this CNN-based classifier achieves higher accuracy than the k-nearest neighbor (kNN) and support vector machine (SVM) methods on our data set.
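
The analogy between an image and sensor data can be illustrated with a minimal sketch. It is an assumption for illustration only, not the construction used in this paper: each homogeneous sensor contributes one row of time samples, and the stacked rows are min-max scaled to [0, 1] so the grid resembles a single-channel image that a CNN could take as input. The function name `sensors_to_image` and the sample readings are hypothetical.

```python
from typing import List, Sequence


def sensors_to_image(signals: Sequence[Sequence[float]]) -> List[List[float]]:
    """Stack equal-length per-sensor signals into a 2D grid scaled to [0, 1].

    Rows index sensors and columns index time samples (an assumed layout).
    """
    if not signals:
        raise ValueError("need at least one sensor signal")
    width = len(signals[0])
    if width == 0 or any(len(row) != width for row in signals):
        raise ValueError("all sensor signals must have the same non-zero length")

    # Global min-max scaling, analogous to normalizing pixel intensities.
    lo = min(min(row) for row in signals)
    hi = max(max(row) for row in signals)
    span = hi - lo
    if span == 0:
        # A constant input carries no contrast; map it to an all-zero grid.
        return [[0.0] * width for _ in signals]
    return [[(x - lo) / span for x in row] for row in signals]


# Hypothetical readings: 3 acoustic sensors, 4 time samples each.
readings = [
    [0.0, 1.0, 2.0, 1.0],
    [0.5, 1.5, 4.0, 1.5],
    [0.0, 0.5, 1.0, 0.5],
]
image = sensors_to_image(readings)
```

Under this layout, neighboring entries of the grid correspond to adjacent time samples or adjacent sensors, which is the kind of local structure that convolutional filters exploit in images.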