Abstract

Objective: Humans spend a significant portion of their lives asleep, and sleep is an essential driver of body metabolism. Because sleep deprivation can cause various health complications, an automatic sleep stage detection model is needed to ease the tedious manual labeling process. However, recently proposed sleep staging algorithms lack model explainability and still leave room for performance improvement. Approach: We implemented multiscale neurophysiology-mimicking kernels to capture sleep-related electroencephalogram (EEG) activities at varying frequencies and temporal lengths; we named the resulting model the “Multiscale Temporal Convolutional Neural Network” (MTCNN). We evaluated its performance on an open-source dataset, the Sleep-EDF Database Expanded, which comprises 153 days of polysomnogram data. Main results: By investigating the learned kernel weights, we observed that MTCNN detected the EEG activities specific to each sleep stage, such as characteristic frequencies, K-complexes, and sawtooth waves. By characterizing these neurophysiologically significant features, MTCNN achieved an overall accuracy (OAcc) of 91.12% and a Cohen's kappa coefficient of 0.86 in the cross-subject paradigm, and an OAcc of 88.24% and a Cohen's kappa coefficient of 0.80 in the leave-few-days-out analysis. MTCNN also outperformed existing deep learning models in sleep stage classification even when trained with only 16% of the total EEG data, achieving an OAcc of 85.62% and a Cohen's kappa coefficient of 0.75 on the remaining 84% of the data, which were used for testing. Significance: The proposed MTCNN enables model explainability and can be trained with less data, which benefits real-world application because large amounts of training data are often not readily available.
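For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy (OAcc) and Cohen's kappa are conventionally computed from per-epoch stage labels. It is not the authors' evaluation code. The function names, the five stage labels (W, N1, N2, N3, REM), and the toy label sequences are assumptions made only for illustration.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of epochs whose predicted stage equals the scored stage."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = overall_accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over stages of the product of marginal frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both sequences use one identical label")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels for ten epochs (not data from the study).
y_true = ["W", "W", "N1", "N2", "N2", "N2", "N3", "REM", "REM", "N2"]
y_pred = ["W", "W", "N2", "N2", "N2", "N2", "N3", "REM", "N1", "N2"]

print(f"OAcc  = {overall_accuracy(y_true, y_pred):.2f}")  # OAcc  = 0.80
print(f"kappa = {cohen_kappa(y_true, y_pred):.2f}")       # kappa = 0.72
```

Kappa is lower than OAcc in this toy case because part of the raw agreement is expected by chance, particularly for frequent stages such as N2. This is why both metrics are reported together.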