Optical transport networks (OTN) and software-defined networking (SDN) are widely used in backbone and metropolitan transmission networks to improve transmission capacity. In an OTN, it is particularly important to allocate routes correctly so as to maximize network capacity. To address this problem, this paper designs a random-walk algorithm, an offline routing strategy based on Q-learning, to optimize routing in OTN. The strategy uses traffic demand as the reward that drives the decision-making agent: the agent randomly explores the network topology, learns the current network state, and makes better routing decisions. We model the software-defined OTN (SD-OTN) scenario considered in this paper and evaluate the strategy experimentally. The experimental results show that the proposed routing strategy achieves superior performance in this network architecture.
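To make the idea concrete, the following is a minimal sketch of offline tabular Q-learning driven by a random walk over a topology. It is an illustration under stated assumptions and not the paper's implementation: the 5-node graph, the function names `train` and `greedy_route`, the hyperparameters, and the reward shape (a cost of 1 per hop, and a reward equal to the traffic demand when the destination is reached) are all hypothetical.

```python
import random

# Hypothetical 5-node topology as an adjacency list (not from the paper).
GRAPH = {
    0: [1, 2],
    1: [0, 3],
    2: [0, 3, 4],
    3: [1, 2, 4],
    4: [2, 3],
}


def train(graph, dst, demand, episodes=2000, alpha=0.5, gamma=0.9,
          max_hops=20, seed=0):
    """Learn Q(node, next_hop) offline by walking the graph at random.

    Assumed reward: -1 for every hop, `demand` for the hop that
    delivers the traffic to `dst`.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in graph for a in graph[s]}
    sources = [n for n in graph if n != dst]
    for _ in range(episodes):
        state = rng.choice(sources)
        for _ in range(max_hops):
            # Random walk: the behavior policy picks a neighbor uniformly.
            action = rng.choice(graph[state])
            if action == dst:
                reward, future = demand, 0.0
            else:
                reward = -1.0
                future = max(q[(action, a)] for a in graph[action])
            # Standard Q-learning update (off-policy, so random
            # exploration still learns the greedy values).
            q[(state, action)] += alpha * (
                reward + gamma * future - q[(state, action)]
            )
            if action == dst:
                break
            state = action
    return q


def greedy_route(graph, q, src, dst, max_hops=20):
    """Follow the highest-valued next hop from src until dst is reached."""
    path = [src]
    while path[-1] != dst and len(path) <= max_hops:
        s = path[-1]
        path.append(max(graph[s], key=lambda a: q[(s, a)]))
    return path


q_table = train(GRAPH, dst=4, demand=10.0)
print(greedy_route(GRAPH, q_table, src=0, dst=4))
```

Because Q-learning is off-policy, the purely random exploration used during training does not prevent the learned table from yielding a greedy route afterwards; a realistic SD-OTN model would additionally track link capacities and multiple concurrent demands, which this sketch omits.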