Semi-supervised multi-label video action detection aims to localize every person in a video and recognize each person's multiple action labels by leveraging both labeled and unlabeled videos. Compared with the single-label scenario, semi-supervised learning for multi-label video action detection is more challenging because of two issues: the generation of multiple pseudo labels and the class-imbalanced data distribution. In this paper, we propose an effective semi-supervised learning method to tackle these challenges. First, to make full use of the informative unlabeled data, we design a multiple pseudo labeling strategy that sets a dynamic, learnable threshold for each class. Second, to handle the long-tailed class distribution, we propose an unlabeled class balancing strategy: we select training samples according to the multiple pseudo labels generated during each training iteration, rather than relying on the usual data re-sampling, which requires label information before training. Balanced re-weighting is then applied to mitigate the class imbalance caused by multi-label co-occurrence. Extensive experiments on two challenging benchmarks, AVA and UCF101-24, demonstrate the effectiveness of the proposed designs. By using the unlabeled data effectively, our method achieves state-of-the-art performance in video action detection on both AVA and UCF101-24. Moreover, with limited annotations on the AVA dataset, it remains competitive with fully-supervised methods.
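
The idea of per-class thresholds for multiple pseudo labels can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the fixed threshold values, and the inverse-frequency weighting are assumptions made for illustration. In particular, the thresholds here are plain constants, whereas the method describes them as dynamic and learnable, and the weighting is a simple stand-in for the balanced re-weighting mentioned above.

```python
def multi_label_pseudo_labels(scores, thresholds):
    """Return one multi-hot pseudo label per sample.

    scores: per-class probabilities for each unlabeled sample, values in [0, 1].
    thresholds: one threshold per class (hypothetical fixed values here; the
        method described in the text learns them during training).
    """
    if any(len(row) != len(thresholds) for row in scores):
        raise ValueError("each score row needs one entry per class")
    # A class is assigned whenever its score reaches that class's threshold,
    # so a single sample can receive zero, one, or several labels.
    return [[1 if p >= t else 0 for p, t in zip(row, thresholds)] for row in scores]


def inverse_frequency_weights(pseudo_labels, num_classes):
    """Per-class weights proportional to 1 / pseudo-label count.

    This is an assumed weighting scheme, used only to show how pseudo-label
    counts gathered during training could drive re-weighting.
    """
    counts = [sum(row[c] for row in pseudo_labels) for c in range(num_classes)]
    # Classes with no pseudo labels get weight 0 to avoid dividing by zero.
    raw = [1.0 / n if n > 0 else 0.0 for n in counts]
    total = sum(raw)
    return [w / total if total > 0 else 0.0 for w in raw]


# Toy scores for three unlabeled samples and three action classes.
scores = [
    [0.90, 0.20, 0.60],
    [0.80, 0.70, 0.10],
    [0.95, 0.10, 0.30],
]
thresholds = [0.85, 0.50, 0.50]  # stricter threshold for the frequent class 0

labels = multi_label_pseudo_labels(scores, thresholds)
weights = inverse_frequency_weights(labels, num_classes=3)
print(labels)
print(weights)
```

In this toy example the frequent class 0 receives a stricter threshold and a smaller weight than the rarer classes 1 and 2, which is the kind of behavior a class-balancing scheme aims for.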