This paper applies the AlphaZero algorithm to the problem of finding an optimal path for a ship in a complex navigation environment. Neural network training is combined with Monte Carlo Tree Search, and a sufficient number of paths are sampled from the replay buffer to pursue a higher cumulative reward and to obtain a decision policy that improves the safety and efficiency of navigation. The simulation results indicate that the AlphaZero algorithm is more adaptable and more accurate in policy evaluation, and that its path planning capability improves accordingly.
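As a rough illustration of the replay-buffer step mentioned above, the following minimal Python sketch stores completed paths, samples a batch of them, and compares their cumulative rewards. It is not the implementation used in this paper: the `ReplayBuffer` class, the `cumulative_reward` function, the representation of a path as a list of `(position, reward)` pairs, the uniform sampling scheme, and the discount factor `gamma` are all assumptions made for the example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of completed paths (hypothetical structure)."""

    def __init__(self, capacity, seed=None):
        # deque with maxlen silently drops the oldest path once full
        self.paths = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, path):
        # A path is assumed to be a list of (position, reward) pairs
        self.paths.append(path)

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the buffer size
        k = min(batch_size, len(self.paths))
        return self.rng.sample(list(self.paths), k)


def cumulative_reward(path, gamma=1.0):
    """Discounted sum of the rewards along one path."""
    total = 0.0
    # Accumulate backwards so each step adds gamma times the future return
    for _, reward in reversed(path):
        total = reward + gamma * total
    return total


# Illustrative data only: two made-up paths on a grid
buffer = ReplayBuffer(capacity=100, seed=0)
buffer.add([((0, 0), -1.0), ((0, 1), -1.0), ((1, 1), 10.0)])  # reaches the goal
buffer.add([((0, 0), -1.0), ((1, 0), -5.0)])                  # enters a penalised cell
batch = buffer.sample(2)
best_path = max(batch, key=cumulative_reward)
```

In a full AlphaZero-style training loop the sampled paths would serve as training targets for the policy and value network rather than being ranked directly; the ranking here only makes the notion of "higher cumulative reward" concrete.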