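Computer science
Gradient boosting
Support vector machine
Machine learning
Artificial intelligence
Data mining
Feature selection
Random forest
Boosting (machine learning)
Multilayer perceptron
Poison control
Artificial neural network
Statistics
Mathematics
Medicine
Environmental health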
Authors
Yuan Chen,Ye Li,Helai Huang,Shiqi Wang,Zhenhao Sun,Yan Li
Identifier
DOI:10.1016/j.amar.2022.100217
Abstract
Real-time conflict prediction models based on traffic flow characteristics are much less studied than crash-based models. This study explores the relationship between conflicts and traffic flow features while accounting for heterogeneity, and develops predictive models to identify conflict-prone conditions in real time. High-resolution trajectory data from the HighD dataset serve as the empirical data. A novel method that uses a virtual detector approach for traffic feature extraction, together with a two-step framework, is proposed for the trajectory data analysis. The framework consists of an exploratory study using a random parameter logit model with heterogeneity in means and variances, and a comparative study of several machine learning methods: eXtreme Gradient Boosting (Boosting), Random Forest (Bagging), Support Vector Machine (Single-classifier), and Multilayer Perceptron (Deep neural network). Results indicate that (1) traffic flow characteristics have significant impacts on the probability of conflict occurrence; (2) the statistical model that considers mean heterogeneity outperforms its counterpart, and lane difference variables are found to significantly affect the means of the random parameters for both lane variables and lane difference variables; (3) eXtreme Gradient Boosting trained on an under-sampled dataset is the best model, with the highest AUC of 0.871 and a precision of 0.867, showing that re-sampling techniques can significantly improve model performance. The proposed model is found to be sensitive to the conflict threshold. Sensitivity analysis on feature selection further confirms that conflict risk prediction should consider both subject lane features and lane difference features, which is consistent with the exploratory analysis based on the statistical model. This consistency between the statistical models and the machine learning methods improves the interpretability of the latter's results.
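The abstract credits part of the best model's performance to training on an under-sampled dataset but does not say which under-sampling scheme was used. The following is a minimal sketch of one common variant, random under-sampling, in plain Python. The function name `random_undersample`, the `seed` parameter, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter


def random_undersample(features, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest class.

    Hypothetical helper for illustration; the paper's own re-sampling
    procedure is not described in the abstract.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Every class is cut down to the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())

    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, target))
    kept.sort()  # preserve the original row order

    return [features[i] for i in kept], [labels[i] for i in kept]


# Toy data (made up): 10 "conflict" rows (label 1) among 100 observations.
X = [[float(i)] for i in range(100)]
y = [1 if i % 10 == 0 else 0 for i in range(100)]

X_bal, y_bal = random_undersample(X, y)
print(Counter(y_bal))
```

On this toy input the balanced set holds 10 rows per class. In practice, re-sampling of this kind is usually applied to the training split only, so that metrics such as AUC and precision are measured on data with the original class ratio.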