Attention deficit hyperactivity disorder (ADHD) is a common childhood mental disorder that encompasses three subtypes. Distinguishing among these subtypes has practical significance. However, the gold standard for subtype diagnosis relies on face-to-face consultation with a psychiatrist, and access to it is limited by available medical resources. This paper proposes a graph-based multimodal fusion approach that classifies the subtypes objectively, alleviating the pressure on psychiatrists. The proposed method leverages heterogeneous signals, including motion and speech, which are significant indicators of ADHD. We construct a personal graph in which each child is a vertex and each edge is weighted by the similarity of the two children's personal information. Because the associations between subjects modeled by the personal graph provide rich prior knowledge, we cast subtype classification as the problem of predicting vertex labels on a graph. We propose a novel graph neural network model that passes information between children and fuses motion and speech features under the guidance of the personal graph. We design a reading scenario and collect a multimodal dataset of 56 children with ADHD and 50 typically developing children. Results on ADHD subtype classification demonstrate the practical value of the proposed approach. We also perform ablation studies to verify the validity of the proposed method.
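
To make the graph formulation concrete, the following is a minimal sketch of one way a personal graph and a single graph-guided fusion step could be set up. It is an illustration under stated assumptions, not the model proposed in this paper: the Gaussian similarity, the threshold, the feature dimensions, the random placeholder data, and the function names `build_personal_graph`, `normalize_adjacency`, and `graph_fusion_layer` are all hypothetical, and the propagation rule is a generic graph-convolution step swapped in for the paper's own network.

```python
import numpy as np

def build_personal_graph(info, threshold=0.5):
    """Assumed construction: edge weight is a Gaussian similarity between
    standardized personal-information vectors; weak edges are dropped."""
    z = (info - info.mean(axis=0)) / (info.std(axis=0) + 1e-8)
    sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    sim = np.exp(-sq_dist / (2.0 * z.shape[1]))
    adj = np.where(sim >= threshold, sim, 0.0)
    np.fill_diagonal(adj, 0.0)  # no self-edges; self-loops are added later
    return adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_fusion_layer(a_norm, motion, speech, weight):
    """One generic graph-convolution step over concatenated modalities:
    each child's fused features are mixed with those of similar children."""
    fused = np.concatenate([motion, speech], axis=1)
    return np.maximum(a_norm @ fused @ weight, 0.0)  # ReLU

# Placeholder data only: 106 vertices matches the 56 + 50 children in the
# dataset, but every value below is random and every dimension is assumed.
rng = np.random.default_rng(0)
n_children = 106
info = rng.normal(size=(n_children, 3))     # hypothetical personal information
motion = rng.normal(size=(n_children, 8))   # hypothetical motion features
speech = rng.normal(size=(n_children, 6))   # hypothetical speech features
weight = rng.normal(size=(14, 4)) * 0.1     # untrained layer weights

adj = build_personal_graph(info)
a_norm = normalize_adjacency(adj)
hidden = graph_fusion_layer(a_norm, motion, speech, weight)
```

In a full vertex-classification setup, `hidden` would feed further layers and a softmax over subtype labels, with the loss computed only on the labeled vertices.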