Safe driving has always been at the core of traditional traffic, and it is also a top priority for the future of autonomous driving. In recent years, with the development of object detection, a large number of proven techniques have entered the realm of driverless cars. Traffic sign detection has long been an important pattern recognition task in the traffic field. After the rise of deep learning, it quickly replaced traditional methods and became the mainstream technical approach. Object detection algorithms represented by the R-CNN family and the YOLO series have received extensive attention and seen wide application. However, these algorithms struggle to strike a good balance between speed and detection quality, especially when deployed on mobile platforms with limited computing power. In this paper, the YOLOv5 algorithm is selected as the basis for knowledge distillation, several attention mechanisms are used to improve the accuracy of the algorithm, and sparse training is used to further reduce the size of the model and make it ultra-lightweight. The TT100K data set is used for training and validation. However, the 45 types of traffic signs in this data set do not include traffic lights, the most important signal in traffic indication, so we produced a large amount of such data to extend TT100K and named the result the ETT100K data set. Experimental results show that adding attention to YOLOv5 does not effectively improve model accuracy, whereas knowledge distillation improves the results significantly.
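
For readers unfamiliar with knowledge distillation, the following is a minimal sketch of the generic soft-target loss for a single classification output. It is not the distillation scheme used in this paper, which is not specified in the abstract; a detector such as YOLOv5 would also need terms for box regression and objectness. The function names and the `temperature` and `alpha` parameters are illustrative assumptions.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    Assumed generic formulation (not taken from the paper):
      soft term: KL(teacher || student) on temperature-softened outputs,
                 scaled by temperature**2
      hard term: cross-entropy of the student against the true label
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL divergence between the softened teacher and student distributions.
    kl = sum(math.exp(lt) * (lt - ls)
             for lt, ls in zip(log_p_teacher, log_p_student))

    # Ordinary cross-entropy at temperature 1 against the ground-truth class.
    hard = -log_softmax(student_logits)[label]

    return alpha * temperature ** 2 * kl + (1.0 - alpha) * hard


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical logits from a large model
    student = [1.0, 0.8, -0.2]   # hypothetical logits from a small model
    print(distillation_loss(student, teacher, label=0))
```

In this formulation, `alpha` controls how much the small model follows the teacher's softened outputs versus the ground-truth labels, and a higher `temperature` exposes more of the teacher's relative ranking of the non-target classes.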