To protect the substrate during laser cleaning of aircraft coatings and to evaluate the cleaning effect and surface quality afterwards, a visual monitoring method based on deep learning is proposed. Two data sets, "flame recognition-cleaning quality evaluation" and "optical image-surface roughness", are constructed and expanded by data augmentation. The SSEResNet backbone network, which uses feature fusion to extract fine details from the input image, is designed. The Cascade R-CNN object detection model is improved with SSEResNet, BiFPN and Soft-NMS, and an SSEResNet101 regression model, based on ResNet101, is proposed to predict surface roughness directly from optical images. Model comparison and ablation experiments show that the two proposed models have excellent detection and regression performance, and can realize flame recognition and cleaning effect judgment during laser cleaning as well as surface quality evaluation after cleaning. The effects of four learning rate decay strategies on the models are also studied, and cosine annealing with warm restarts gives the best training results. For the SSEResNet101 model, the training mean square error (MSE) loss is 0.0249, the training mean absolute error (MAE) is 0.278 μm, and the test MAE is 0.245 μm; for the improved Cascade R-CNN model, the mean average precision (mAP) at an intersection over union (IoU) threshold of 0.6 reaches 93.6%.
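For readers unfamiliar with the learning rate strategy reported above as the best performer, the following is a minimal sketch of cosine annealing with warm restarts. It is not the authors' training code: the function name `cosine_warm_restart_lr` and all hyperparameter values (`base_lr`, `min_lr`, `first_cycle`, `cycle_mult`) are assumptions chosen for illustration, since the abstract does not state the settings used.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=0.01, min_lr=0.0,
                           first_cycle=10, cycle_mult=2):
    """Learning rate at `epoch` under cosine annealing with warm restarts.

    All default values are illustrative assumptions, not settings from the paper.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    if first_cycle < 1 or cycle_mult < 1:
        raise ValueError("first_cycle and cycle_mult must be >= 1")

    cycle_len = first_cycle
    t = epoch
    # Find the position t inside the current cycle; each cycle is
    # cycle_mult times longer than the previous one.
    while t >= cycle_len:
        t -= cycle_len
        cycle_len *= cycle_mult

    # Within a cycle the rate decays from base_lr towards min_lr along a
    # half cosine, then jumps back to base_lr at the next restart.
    cos_term = 0.5 * (1.0 + math.cos(math.pi * t / cycle_len))
    return min_lr + (base_lr - min_lr) * cos_term


if __name__ == "__main__":
    for epoch in range(0, 31, 5):
        print(epoch, round(cosine_warm_restart_lr(epoch), 6))
```

With these assumed defaults the rate restarts at epochs 10 and 30, because the second cycle is twice as long as the first. In PyTorch, the same schedule is available as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, with `T_0`, `T_mult` and `eta_min` playing the roles of `first_cycle`, `cycle_mult` and `min_lr`.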