Unlike RGB images, near-infrared (NIR) images are widely used for night vision and surveillance because they are less noisy and preserve details and textures even in low-light environments. However, NIR images are of limited use to the human visual system and to computer vision algorithms that expect color input, because NIR light is invisible and carries no color information. Several recent studies have tried to solve this problem through so-called NIR colorization, and recent deep learning techniques have achieved superior performance. Nevertheless, NIR colorization still has fundamental limitations: restoring the original color information from an NIR image that contains no color is highly challenging. In other words, NIR-to-RGB conversion is a severely ill-posed problem, which motivated us to concentrate on effective priors. In this paper, we propose a novel network architecture in which a teacher network learns the conversion from 'RGB+NIR' to RGB and transfers this knowledge to a student network through a distillation loss. We also built a new dataset of multi-band NIR images with corresponding RGB ground truth. Comparative results show that multi-band NIR input and feature distillation contribute to higher-quality NIR colorization.
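
To make the teacher-to-student transfer concrete, the following is a minimal sketch of how a feature distillation term can be combined with a reconstruction term in a student's training objective. It is an illustration under stated assumptions, not the formulation used in this paper: the function names, the L1 reconstruction term, the mean-squared feature term, and the `distill_weight` parameter are all hypothetical, and features and images are represented as flat lists of floats for simplicity.

```python
from typing import Sequence


def mean_abs_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute (L1) difference between two equal-length sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mean_sq_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two equal-length sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def student_loss(
    student_rgb: Sequence[float],
    target_rgb: Sequence[float],
    student_feat: Sequence[float],
    teacher_feat: Sequence[float],
    distill_weight: float = 1.0,  # hypothetical weighting, not from the paper
) -> float:
    """Assumed objective: reconstruction loss plus weighted feature distillation.

    student_rgb / target_rgb: flattened predicted and ground-truth RGB values.
    student_feat / teacher_feat: flattened intermediate features; the teacher
    features are treated as fixed targets that the student tries to match.
    """
    reconstruction = mean_abs_error(student_rgb, target_rgb)
    distillation = mean_sq_error(student_feat, teacher_feat)
    return reconstruction + distill_weight * distillation
```

In this sketch the distillation term pulls the student's intermediate features toward those of the teacher, which has access to the additional RGB input, while the reconstruction term keeps the student's output close to the RGB ground truth.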