Current unsupervised domain adaptation (UDA) techniques for semantic segmentation effectively reduce the discrepancy between the labeled source domain and the unlabeled target domain, thereby enhancing the model's pixel-wise discriminative capability on target domain data. However, our study finds that on remote sensing images (RSIs) these approaches perform poorly when the class distributions of the source and target domains are inconsistent. In this work, we propose a one-stage mean teacher framework with a novel auxiliary prototype classifier, named MTA, to address this issue. Specifically, the teacher model assigns pixel-level pseudo labels to target samples and captures knowledge from the student model via an exponential moving average (EMA). Trained on labeled source samples and pseudo-labeled target samples, the student model alleviates the divergence between the source and target domains. In addition, the auxiliary prototype classifier (APC) reduces the performance degradation that class distribution divergence causes in the parametric softmax classifier of the student model. We also propose a prototype computation scheme to obtain each class prototype in the APC. Concretely, we build a memory bank for each class in each of the two domains to store feature embeddings dynamically. We then compute the class prototype by applying a clustering algorithm to the memory banks of that class. Meanwhile, the APC reduces the intra-class domain discrepancy by optimizing a cross-entropy loss, which brings the feature distribution of each class in both domains closer to the class prototype. Experimental results on UDA semantic segmentation tasks for RSIs show that our approach outperforms the compared methods.
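
The two mechanisms named above, the EMA teacher update and a prototype-based classifier trained with cross-entropy, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the `decay` and `temperature` parameters, the use of cosine similarity as the prototype score, and the use of a plain mean in place of the unspecified clustering algorithm are all assumptions made for illustration.

```python
import numpy as np


def ema_update(teacher, student, decay=0.99):
    """One EMA step: move each teacher weight toward the student weight.

    `teacher` and `student` are dicts of arrays with the same keys
    (a hypothetical stand-in for model parameters).
    """
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


def class_prototype(memory_bank):
    """Prototype of one class from its memory bank of embeddings, shape (M, D).

    Assumption: a plain mean (a single cluster center) stands in for the
    clustering algorithm, which the abstract does not specify.
    """
    return np.mean(memory_bank, axis=0)


def prototype_logits(features, prototypes, temperature=0.1):
    """Scaled cosine similarity between features (N, D) and prototypes (C, D).

    Assumption: cosine similarity and the temperature are illustrative choices.
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return f @ p.T / temperature


def cross_entropy(logits, labels):
    """Mean cross-entropy of integer labels (N,) under softmax over logits (N, C)."""
    z = logits - logits.max(axis=1, keepdims=True)  # subtract max for stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy memory banks for two classes in a 4-dimensional embedding space.
    banks = [rng.normal(loc=c, size=(8, 4)) for c in (-2.0, 2.0)]
    prototypes = np.stack([class_prototype(b) for b in banks])
    # Toy pixel features with (pseudo) labels; lower loss means features
    # sit closer to the prototype of their own class.
    feats = np.concatenate([banks[0][:2], banks[1][:2]])
    labels = np.array([0, 0, 1, 1])
    print(cross_entropy(prototype_logits(feats, prototypes), labels))
```

Minimizing this loss with respect to the feature extractor pulls source and target features of a class toward a shared prototype, which is the intuition behind reducing the intra-class domain discrepancy.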