Bus delays significantly affect urban public transportation by reducing operational efficiency and incurring high costs. Understanding the causes of these delays is essential for developing targeted mitigation strategies. Traditional research relies largely on correlation-based analysis, which often fails to uncover the underlying causal mechanisms. This study examines several causal graph discovery algorithms, combined with structural equation models (SEMs), to infer the causal relationships among factors that affect bus delays. The algorithms generate causal graphs for bus delays that reveal how operational factors relate to and influence one another, and the SEMs quantify the corresponding causal effects. The algorithms are evaluated on two criteria: statistical fit to the data and the causal relationships they generate. A case study is conducted using General Transit Feed Specification (GTFS) data from high-frequency bus routes in Stockholm, Sweden. The validation results demonstrate the effectiveness of data-driven causal discovery models in identifying causal links, particularly when combined with domain knowledge. The empirical analysis shows the complexity of the factors contributing to bus delays and emphasizes the need to integrate causality into bus delay analysis. For example, a high correlation between origin delay and bus arrival delay (coefficient = 0.63) does not indicate direct causation, and a strong causal effect of dwell time on arrival delay does not imply a high correlation (coefficient = 0.12). Comparing variable importance against linear regression (LR) reveals notable differences: origin delay, which is often overlooked in previous studies, is significant in the causal graph model (standardized coefficient = 0.601) but ranks much lower in LR (standardized coefficient = 0.003). These insights underscore the importance of automated, data-driven causal discovery in enhancing decision-making processes and improving the efficiency and reliability of transit services.
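
The gap between correlation and direct causation noted above can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the variable names, the chain structure (origin delay acting on arrival delay only through an intermediate upstream delay), and the coefficients are invented and are not taken from the Stockholm data or from the models in this study. The partial correlation used here is the kind of conditional-independence measure that constraint-based causal discovery commonly relies on; it is not presented as the specific test used in this work.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical linear SEM (all coefficients are illustrative assumptions):
#   origin -> upstream -> arrival <- dwell
rng = random.Random(0)
n = 20_000
origin = [rng.gauss(0, 1.0) for _ in range(n)]
upstream = [0.9 * o + rng.gauss(0, 0.5) for o in origin]
dwell = [rng.gauss(0, 0.2) for _ in range(n)]
arrival = [u + d + rng.gauss(0, 0.3) for u, d in zip(upstream, dwell)]

# Marginal correlations with arrival delay.
r_origin = pearson(origin, arrival)
r_dwell = pearson(dwell, arrival)

# Correlation between origin and arrival delay once the mediator is held fixed.
p_origin = partial_corr(origin, arrival, upstream)

print(f"corr(origin, arrival)            = {r_origin:.3f}")
print(f"corr(dwell, arrival)             = {r_dwell:.3f}")
print(f"corr(origin, arrival | upstream) = {p_origin:.3f}")
```

In this toy model, origin delay has no direct edge to arrival delay yet correlates strongly with it, and that association vanishes once the upstream delay is controlled for. Dwell time enters the arrival delay with a unit coefficient but correlates only weakly with it, because its variance is small relative to the other terms.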