One of the primary challenges facing medical visual question answering (Med-VQA) is the lack of large-scale, well-annotated datasets for training. To overcome this challenge, this paper proposes a two-stage pre-training framework that learns transferable feature representations of radiology images and distills a lightweight visual feature extractor for Med-VQA. Specifically, we leverage large amounts of unlabeled radiology images to train three teacher models, one each for the brain, chest, and abdomen body regions, via contrastive learning. We then distill the teacher models into a lightweight student model that can serve as a universal visual feature extractor for any Med-VQA system. The lightweight feature extractor can be readily fine-tuned on the training radiology images of any Med-VQA dataset, reducing annotation effort while preventing overfitting to small-scale training data. The effectiveness and advantages of the pre-trained model are demonstrated by extensive experiments with state-of-the-art Med-VQA methods on existing benchmarks. The source code and the pre-training dataset can be downloaded from https://github.com/awenbocc/cprd.
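To make the two-stage idea concrete, the sketch below illustrates one possible instantiation in PyTorch: region-specific teachers pre-trained with an InfoNCE-style contrastive loss, then a smaller student trained to match the frozen teachers' features. The backbone choices (ResNet-50 teachers, ResNet-18 student), the feature-matching loss, and the data-loading conventions are illustrative assumptions, not the released configuration; consult the repository above for the actual implementation.

```python
# Minimal sketch of the two-stage pre-training framework (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


def info_nce_loss(z1, z2, temperature=0.07):
    """Simplified InfoNCE contrastive loss between two augmented views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)


def build_teacher():
    """Stage 1 teacher: ResNet-50 backbone plus a projection head (assumed sizes)."""
    backbone = models.resnet50(weights=None)
    backbone.fc = nn.Identity()                       # expose 2048-d features
    projector = nn.Sequential(nn.Linear(2048, 256), nn.ReLU(), nn.Linear(256, 128))
    return nn.Sequential(backbone, projector)


def build_student():
    """Lightweight student: ResNet-18 backbone projected to the teachers' feature size."""
    backbone = models.resnet18(weights=None)
    backbone.fc = nn.Identity()                       # expose 512-d features
    return nn.Sequential(backbone, nn.Linear(512, 128))


def pretrain_teacher(teacher, loader, epochs=1, lr=1e-3):
    """Stage 1: contrastive pre-training on unlabeled images of one body region."""
    opt = torch.optim.Adam(teacher.parameters(), lr=lr)
    for _ in range(epochs):
        for view1, view2 in loader:                   # two augmented views per image
            loss = info_nce_loss(teacher(view1), teacher(view2))
            opt.zero_grad()
            loss.backward()
            opt.step()


def distill_student(student, teachers, loader, epochs=1, lr=1e-3):
    """Stage 2: match student features to the frozen region-specific teachers.

    `teachers` maps an integer region id (e.g. 0=brain, 1=chest, 2=abdomen)
    to its pre-trained teacher; `loader` yields (images, region_ids).
    """
    for t in teachers.values():
        t.eval()
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    for _ in range(epochs):
        for images, region_ids in loader:
            loss = 0.0
            for region, teacher in teachers.items():
                mask = region_ids == region           # route images to their teacher
                if mask.any():
                    with torch.no_grad():
                        target = teacher(images[mask])
                    loss = loss + F.mse_loss(student(images[mask]), target)
            if torch.is_tensor(loss):                 # skip batches with no matched region
                opt.zero_grad()
                loss.backward()
                opt.step()
```

After distillation, the student backbone can be plugged into any Med-VQA model as its visual encoder and fine-tuned end-to-end on that dataset's radiology images.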