Electroencephalography (EEG) based motor imagery (MI) is one of the promising brain–computer interface (BCI) paradigms that enable humans to communicate with the outside world based solely on brain intentions. Although convolutional neural networks have gradually been applied to the MI classification task and have achieved high performance, the following problems still make effective decoding of raw EEG signals challenging: 1) EEG signals are non-linear and non-stationary and have a low signal-to-noise ratio. 2) Most existing end-to-end MI models use single-scale convolution, which limits classification performance because the best convolution scale varies across subjects (called subject difference). In addition, even for the same subject, the best convolution scale differs from time to time (called time difference). In this paper, we propose a novel end-to-end model, named Multi-branch Multi-scale Convolutional Neural Network (MMCNN), for motor imagery classification. The MMCNN model effectively decodes raw EEG signals without any pre-processing, including filtering. Meanwhile, the multi-branch multi-scale convolution structure addresses the problems of subject difference and time difference by processing the signal at several scales in parallel. In addition, multi-scale convolution can characterize information in different frequency bands, thereby effectively addressing the problem that the best convolution scale is difficult to determine. Experiments on two public BCI competition datasets demonstrate that the proposed MMCNN model outperforms state-of-the-art models. The implementation code is publicly available at https://github.com/jingwang2020/ECML-PKDD_MMCNN.
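To illustrate why convolution scale relates to frequency content, the following is a minimal sketch of parallel multi-scale 1-D convolution in plain Python. It is not the MMCNN architecture: fixed moving-average kernels stand in for learned filters, and the kernel sizes (3, 9, 27), the 250 Hz sampling rate, and all function names are assumptions chosen only for illustration.

```python
import math


def conv1d_same(signal, kernel):
    """Cross-correlate a 1-D signal with a kernel, zero-padded so the
    output has the same length as the input."""
    k = len(kernel)
    left = (k - 1) // 2
    right = k - 1 - left
    padded = [0.0] * left + list(signal) + [0.0] * right
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal))
    ]


def multi_scale_features(signal, kernel_sizes=(3, 9, 27)):
    """Apply one branch per kernel size to the same input and return the
    branch outputs side by side (one list per scale).

    Assumption: each branch is a fixed moving-average filter; a trained
    network would learn its kernel weights instead."""
    return [conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes]


def sine(freq_hz, n_samples, fs_hz):
    """Sampled sine wave at freq_hz, with sampling rate fs_hz."""
    return [math.sin(2 * math.pi * freq_hz * t / fs_hz) for t in range(n_samples)]


def energy(values):
    """Sum of squared samples."""
    return sum(v * v for v in values)


# Hypothetical demo: a 40 Hz oscillation sampled at 250 Hz for 2 seconds.
fast = sine(40.0, 500, 250.0)
for size, branch in zip((3, 9, 27), multi_scale_features(fast)):
    print(size, round(energy(branch) / energy(fast), 3))
```

In this sketch, longer kernels suppress the fast oscillation more strongly, so each scale responds to a different frequency range. This is the intuition behind running several scales in parallel rather than committing to a single kernel size.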