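Computer science
Generalization
Feature selection
Sample (material)
Ensemble learning
Feature (linguistics)
Data mining
Machine learning
Accident (philosophy)
Key (lock)
Variable (mathematics)
Artificial intelligence
Random forest
Selection (genetic algorithm)
Computer security
Mathematics
Philosophy
Mathematical analysis
Chemistry
Epistemology
Chromatography
Linguistics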
Authors
Leipeng Zhu, Zhiqing Zhang, Dongdong Song, Biao Chen
Identifier
DOI:10.1016/j.eswa.2023.121782
Abstract
Cause analysis of road traffic accidents is often modelled on high-dimensional, small-sample data; however, such models often have low predictive accuracy and poor generalization performance. An analytical framework that considers both data augmentation and model optimization can enhance variable interpretation and predictive model performance, thereby addressing the shortcomings of existing accident analysis methods. Our approach is as follows: 1) Starting with an analysis of the nature of road accidents, a symbolic operation is used to design a feature-cross algorithm. A random variable is added to construct a quantifiable feature selection algorithm, which can form a data augmentation method that conforms to the accident rules. 2) A highly reliable framework for analysing accident causes is constructed by using forward selection to optimize an ensemble learning model subset, combined with feature crosses, feature selection and multiple classification algorithms. A case study with accident data from a city in China shows that ensemble learning has the advantages of high predictive accuracy and strong generalization performance. It can accurately identify the key causes of accidents from high-dimensional, small-sample data. Driving behaviours such as lane changes and turns are the key causes of accidents. Giving drivers effective traffic environment information in a timely manner can significantly improve driving performance and reduce accident risk. This research provides a reference for the analysis of road accidents.
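
The abstract does not spell out how forward selection picks the ensemble subset, so the following is only a minimal sketch of one common reading: greedily add the candidate classifier whose inclusion most improves the accuracy of a soft-voting ensemble on held-out data, and stop when no candidate helps. The function names (`soft_vote`, `accuracy`, `forward_select`), the model labels and the probability values are all hypothetical toy inputs, not the authors' code or data.

```python
# Sketch of greedy forward selection over candidate classifiers.
# Assumption: each model is represented only by its predicted
# probability of the positive class on a held-out validation set.

def soft_vote(prob_lists):
    """Average per-model probabilities and threshold at 0.5."""
    n = len(prob_lists)
    return [1 if sum(col) / n > 0.5 else 0 for col in zip(*prob_lists)]

def accuracy(pred, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def forward_select(model_probs, truth):
    """Return (selected model names, validation accuracy of the ensemble)."""
    selected, best = [], 0.0
    remaining = sorted(model_probs)  # fixed order makes ties deterministic
    while remaining:
        scored = []
        for cand in remaining:
            members = [model_probs[m] for m in selected + [cand]]
            scored.append((accuracy(soft_vote(members), truth), cand))
        score, cand = max(scored, key=lambda sc: sc[0])
        if score <= best:  # no candidate improves the ensemble: stop
            break
        selected.append(cand)
        remaining.remove(cand)
        best = score
    return selected, best

# Toy validation labels and made-up model outputs (illustrative only).
truth = [1, 0, 1, 1, 0, 0, 1, 0]
model_probs = {
    "rf":  [0.9, 0.2, 0.8, 0.7, 0.1, 0.3, 0.4, 0.6],
    "svm": [0.8, 0.1, 0.4, 0.45, 0.2, 0.1, 0.9, 0.1],
    "knn": [0.3, 0.7, 0.9, 0.8, 0.3, 0.2, 0.7, 0.3],
    "nb":  [0.5] * 8,
}

selected, score = forward_select(model_probs, truth)
print(selected, score)
```

In practice the scores used for selection should come from data not used to train the candidate models, since selecting and evaluating on the same small sample tends to overstate generalization performance.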