Traditional robotic arm control requires the operator to drive the arm along a preset, fixed trajectory designed for a specific task and working environment, which in turn requires a high-precision model of that environment. Such control lacks adaptability and does not transfer to other working scenarios. To overcome these problems, this paper proposes an end-to-end robotic arm control method based on deep reinforcement learning (DRL). The policy control module of the robotic arm uses the proximal policy optimization (PPO) algorithm, which enables the arm to learn autonomously in a complex working environment and to achieve adaptive control. Reward shaping is applied while training the agent, which accelerates learning and makes the algorithm converge faster. Experimental results show that the DRL algorithm converges in a shorter time and performs well in motion planning, collision avoidance, and overall policy control of the robotic arm in the simulation environment.
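
As a rough illustration of the two ingredients named above, the following sketch shows one common form of reward shaping (a potential-based term built from the distance between the end effector and the target) together with the standard PPO clipped surrogate for a single sample. The abstract does not specify the reward actually used, so the function names, the potential, the collision penalty, the goal tolerance, and all numeric values here are assumptions made for illustration only, not the paper's implementation.

```python
import math


def shaped_reward(ee_prev, ee_next, target, collided,
                  gamma=0.99, goal_tol=0.02):
    """Sparse task reward plus a potential-based shaping term.

    All quantities are hypothetical: positions are (x, y, z) tuples in
    metres, and the potential is the negative distance to the target.
    """
    # Potential before and after the step (closer to target = higher).
    phi_prev = -math.dist(ee_prev, target)
    phi_next = -math.dist(ee_next, target)

    # Sparse part: assumed penalty for a collision, bonus for reaching.
    sparse = 0.0
    if collided:
        sparse -= 1.0
    if math.dist(ee_next, target) < goal_tol:
        sparse += 1.0

    # Potential-based shaping: F = gamma * phi(s') - phi(s).
    return sparse + gamma * phi_next - phi_prev


def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO clipped surrogate for one sample (to be maximised).

    ratio is pi_new(a|s) / pi_old(a|s); eps is the clipping range.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    target = (0.0, 0.0, 0.0)
    start = (0.5, 0.0, 0.0)
    # Moving toward the target earns more than moving away from it.
    print(shaped_reward(start, (0.4, 0.0, 0.0), target, collided=False))
    print(shaped_reward(start, (0.6, 0.0, 0.0), target, collided=False))
    # A large policy ratio with positive advantage is clipped.
    print(ppo_clip_objective(1.5, 1.0))
```

The shaping term gives the agent a dense signal at every step rather than only at success or collision, which is the usual motivation for reward shaping; the clipping in the PPO objective limits how far a single update can move the policy away from the one that collected the data.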