Wind power is a clean, widely used renewable energy source, and accurate wind power forecasting is important for the efficient and stable use of wind energy. However, the stochastic nature of wind power generation and the complexity of the data pose a significant challenge for short-term forecasting. Extracting informative features from complex wind power data can improve prediction models and is therefore a key issue in short-term forecasting. In this paper, a feature-weighted principal component analysis (WPCA) method and an improved gated recurrent unit (GRU) neural network model, whose hyperparameters are optimized using a particle swarm optimization (PSO) algorithm, are proposed. The resulting hybrid WPCA-PSO-GRU model is used to predict the power output of a real-world wind farm and is compared with other well-performing machine learning models. The results show that the mean absolute error (MAE) and root mean square error (RMSE) of the WPCA-PSO-GRU model are reduced by 5.3%–16% and 10%–16%, respectively, and the coefficient of determination (R²) is increased by 2.1%–3.1% compared with conventional models. The proposed model can reduce the impact of noisy data, randomness, and the volatility of wind power generation on model training. The approach may also be applicable to other problems involving complex data samples.
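For reference, the three evaluation metrics named above follow their standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the `relative_reduction` helper, and the toy values are assumptions introduced here and are not taken from the study. It also assumes that the reported percentage changes are relative to the baseline model's metric value.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted power."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted power."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline` (hypothetical helper)."""
    return 100.0 * (baseline - proposed) / baseline

# Toy values for illustration only; not data from the wind farm in the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print(mae(observed, predicted))
print(rmse(observed, predicted))
print(r2(observed, predicted))
```

On these toy values every prediction is off by 0.5, so MAE and RMSE coincide at 0.5 and R² is 0.95. RMSE exceeds MAE whenever the errors are unequal in magnitude, which is one reason both are usually reported together.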