Imagined speech is an intuitive paradigm for brain-computer interfaces (BCIs). In recent years, the accuracy of classifying imagined speech from electroencephalogram (EEG) signals has improved, largely owing to increasingly complex models. To further improve accuracy while reducing model complexity to a practical level, this paper proposes an efficient architecture consisting of a Large Kernel module and ConvMixer. The architecture raises classification accuracy to 75.73%, outperforming other models, including the combination of EEGNet and a Transformer that was previously the best-performing approach. Further experiments show that the Large Kernel extracts more task-related features from imagined-speech EEG, and that ConvMixer provides a measurable performance gain even with a greatly simplified model structure.
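To make the described design concrete, the following is a minimal PyTorch sketch of a large-kernel temporal convolution followed by ConvMixer-style blocks for EEG classification. It is an assumption-laden illustration, not the authors' implementation: the channel count, kernel lengths, block depth, and number of classes are placeholders chosen for demonstration.

```python
# Illustrative sketch only (assumed hyperparameters): large-kernel temporal
# convolution + ConvMixer-style blocks for EEG trial classification.
import torch
import torch.nn as nn


class Residual(nn.Module):
    """Adds the block input back to its output (ConvMixer-style skip connection)."""
    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def forward(self, x):
        return self.fn(x) + x


def large_kernel_convmixer(n_channels=64, n_classes=5, dim=64,
                           large_kernel=64, mixer_kernel=9, depth=2):
    return nn.Sequential(
        # Large temporal kernel: a 1-D convolution spanning a long EEG window
        nn.Conv1d(n_channels, dim, kernel_size=large_kernel,
                  stride=large_kernel // 4, padding=large_kernel // 2),
        nn.GELU(),
        nn.BatchNorm1d(dim),
        # ConvMixer blocks: depthwise "token mixing" + pointwise "channel mixing"
        *[nn.Sequential(
            Residual(nn.Sequential(
                nn.Conv1d(dim, dim, kernel_size=mixer_kernel,
                          groups=dim, padding="same"),
                nn.GELU(),
                nn.BatchNorm1d(dim))),
            nn.Conv1d(dim, dim, kernel_size=1),
            nn.GELU(),
            nn.BatchNorm1d(dim))
          for _ in range(depth)],
        nn.AdaptiveAvgPool1d(1),
        nn.Flatten(),
        nn.Linear(dim, n_classes),
    )


if __name__ == "__main__":
    # Fake batch: 8 trials, 64 EEG channels, 1000 time samples (assumed shape).
    x = torch.randn(8, 64, 1000)
    model = large_kernel_convmixer()
    print(model(x).shape)  # torch.Size([8, 5])
```

The sketch reflects the general idea stated in the abstract: a long temporal kernel for feature extraction and a lightweight ConvMixer stage for mixing, kept deliberately shallow to match the stated goal of reducing model complexity.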