The widespread deployment of Autonomous Underwater Vehicles (AUVs) highlights the need for autonomous docking, in which accurate pose estimation and navigation play a vital role. This paper proposes a multi-sensor fusion navigation framework based on factor graph optimization that tightly couples visual measurements of a light array to provide high-accuracy, high-frequency relative pose estimates between the AUV and its mobile dock during the terminal docking stage. Simulation results demonstrate that the proposed algorithm outperforms the Perspective-n-Point (PnP) method, achieving smaller RMSE in both relative attitude and translation estimation. Experiments further show that the proposed algorithm yields smoother estimates and has the potential to be deployed in embedded applications.
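The fusion idea behind factor graph optimization can be illustrated with a toy example. The sketch below is a hypothetical 1-D illustration only, not the paper's implementation: two scalar poses are constrained by a prior factor, an odometry factor, and an absolute "visual" measurement factor, and the maximum-a-posteriori estimate is obtained by solving the weighted least-squares normal equations. All function names and weights are assumptions made for the sketch.

```python
# Hypothetical 1-D factor-graph sketch (illustration only, not the paper's
# algorithm): poses x0, x1 constrained by a prior on x0, a relative
# odometry factor between x0 and x1, and an absolute visual measurement
# of x1. Fusion = weighted least squares over all factors.

def solve_2x2(A, b):
    # Solve the 2x2 linear system A @ x = b via Cramer's rule.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def fuse(prior, odom, visual, w_prior=1.0, w_odom=1.0, w_vis=1.0):
    """Minimize  w_prior*(x0 - prior)^2
               + w_odom*(x1 - x0 - odom)^2
               + w_vis*(x1 - visual)^2
    by assembling and solving the normal equations A x = b."""
    A = [[w_prior + w_odom, -w_odom],
         [-w_odom,           w_odom + w_vis]]
    b = [w_prior * prior - w_odom * odom,
         w_odom * odom + w_vis * visual]
    return solve_2x2(A, b)

# Visual factor weighted heavily, so x1 is pulled toward the measurement.
x0, x1 = fuse(prior=0.0, odom=1.0, visual=1.2, w_vis=4.0)
```

Because every factor contributes a quadratic term, adding more sensors (or more poses) only appends rows to the same least-squares problem, which is why the factor-graph formulation scales naturally to tightly coupled multi-sensor fusion.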