• A novel model is proposed to predict the severity of Parkinson's disease.
• More effective frequency features are obtained by graph wavelet transform.
• An attention mechanism weights the decision trees in the random forest.
• The method achieves better prediction performance than state-of-the-art methods.

Predicting the progression of Parkinson's disease (PD) is one of the most important issues in the early diagnosis of PD. Much research has been conducted in this field; however, most existing methods focus on selecting baseline features and regressors to reduce prediction errors. In contrast to previous studies, the main goal of this paper is to obtain more effective features by transforming the baseline features, thereby improving prediction performance. To this end, this paper proposes a prediction model based on the graph wavelet transform (GWT) and an attention-weighted random forest (RF). First, a clustering algorithm is adopted to reduce the prediction error of the model. Next, GWT is used to perform a multi-scale analysis of the feature vectors, yielding a frequency feature representation that is more relevant to the target value. Finally, the frequency features are input into the attention-weighted RF to predict the severity of PD, which highlights the results of the decision trees with better predictive performance in the RF while reducing the risk of overfitting. The effectiveness of the method is evaluated on the Parkinson's telemonitoring dataset collected by the University of Oxford. The experimental results show that the mean absolute error and root mean squared error of the proposed method for predicting PD severity (motor- and total-UPDRS) are 1.53, 2.13 and 1.91, 2.70, respectively. Compared with the best-performing method cited for comparison, the errors are reduced by 7.27%, 4.05% and 5.45%, 1.10%, respectively. This indicates that the proposed method has better prediction performance.
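
The GWT and attention-weighting steps described above can be illustrated with a minimal NumPy sketch. It is not the paper's implementation, and several choices are assumptions made only for illustration: the graph is built over features with absolute correlations as edge weights, the wavelet kernel is the band-pass function g(x) = x·exp(-x), and the "attention" weights are a softmax over each tree's negative validation error. The function names, the `scales` values, and the `temperature` parameter are hypothetical, and the clustering step is omitted.

```python
import numpy as np

def graph_wavelet_features(X, scales=(1.0, 2.0, 4.0)):
    """Multi-scale spectral graph wavelet coefficients for each row of X."""
    # Assumed graph: nodes are features, edges are absolute correlations.
    W = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    lam, U = np.linalg.eigh(L)  # graph frequencies and Fourier basis
    blocks = []
    for s in scales:
        g = (s * lam) * np.exp(-s * lam)  # assumed kernel g(x) = x * exp(-x)
        # Apply the symmetric operator U diag(g) U^T to every sample (row).
        blocks.append(X @ (U * g) @ U.T)
    return np.hstack(blocks)

def attention_weights(tree_preds, y_val, temperature=1.0):
    """Softmax weights that favor trees with lower validation MAE.

    tree_preds has shape (n_trees, n_samples): one row per decision tree.
    """
    errors = np.mean(np.abs(tree_preds - y_val), axis=1)
    scores = -errors / temperature
    scores -= scores.max()  # shift for numerical stability
    w = np.exp(scores)
    return w / w.sum()

def weighted_forest_predict(tree_preds, weights):
    """Combine per-tree predictions with the given weights."""
    return weights @ tree_preds

def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))

def rmse(y, y_hat):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))
```

In practice, `tree_preds` could be stacked from the individual trees of a fitted forest (for example, the `estimators_` of a scikit-learn `RandomForestRegressor`), with the weights learned on a held-out validation split and then applied to test-set predictions. An unweighted forest corresponds to uniform weights, so the softmax weighting only changes the result when trees differ in validation error.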