Automated orchard operation has long been a goal of fruit growers. Deep learning-based approaches have been widely used to improve fruit detection, branch pruning, yield estimation, and other agricultural operations. This paper proposes a novel method for detecting keypoints on the branch, which enables branch pruning during fruit picking. Specifically, a top-down framework for bearing branch keypoint detection is developed. First, a candidate area is generated from the fruit-growing position and the detected fruit stem keypoint, providing an attention region for subsequent keypoint detection. Second, a multi-level feature fusion network, which combines features of the same spatial size (intra-level) and of different spatial sizes (inter-level), is proposed to detect keypoints within the candidate area. The network learns spatial and semantic information and models the relationships among bearing branch keypoints. In addition, this paper constructs a citrus bearing branch dataset to support a comprehensive evaluation of the proposed method. On this dataset, the proposed method reaches an AP of 77.4% and an accuracy score of 84.7% with a smaller model size and lower computational cost, significantly outperforming several state-of-the-art keypoint detection methods. This study lays a foundation for automatic branch pruning during fruit harvesting.