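Undersampling
Oversampling
Computer science
Intrusion detection system
Class (philosophy)
Random forest
Artificial intelligence
Machine learning
Data mining
Statistics
Mathematics
Bandwidth (computing)
Computer network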
Authors
Fayruz Rahma, Reza Fuad Rachmadi, Baskoro Adi Pratomo, Mauridhi Hery Purnomo
Identifier
DOI:10.1109/ieacon57683.2023.10370430
Abstract
Imbalanced class distribution is a significant issue in intrusion detection systems: it can degrade their performance, because the resulting models may be biased towards the majority class. We explore the effectiveness of oversampling and undersampling techniques in addressing this issue. Both aim to balance the class distribution and thereby improve the performance of the intrusion detection system. Oversampling increases the number of records in the minority class to bring it closer in size to the majority class. Conversely, undersampling reduces the number of records in the majority class so that it is closer in size to the minority class. We assess the effectiveness of different oversampling and undersampling techniques, including Random OverSampling, SMOTE, ADASYN, Random UnderSampling, AllKNN, TomekLinks, SMOTEENN, and SMOTETomek. In our experiments, the raw data achieved the highest accuracy, 0.965, whereas Random OverSampling yielded the highest F1 score, 0.589. In the per-class evaluation scores, recall and F1 still differ sharply between classes with a large amount of data and classes that originally had a small amount of data, even though the training data had been made more balanced. We found that oversampling and undersampling can improve the performance of intrusion detection systems in specific respects, but further improvement is still needed. These results can serve as a reference for researchers developing intrusion detection systems.
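The two simplest techniques named in the abstract, random oversampling and random undersampling, can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names, the toy records, and the "benign"/"attack" labels are all assumptions made for the example. Classes named after the listed techniques exist in the imbalanced-learn library, but the abstract does not say which tooling was used.

```python
import random
from collections import Counter


def _group_by_class(records, labels):
    """Collect records into a dict keyed by class label (hypothetical helper)."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(label, []).append(record)
    return groups


def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen records until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(records, labels)
    target = max(len(group) for group in groups.values())
    out_records, out_labels = [], []
    for label, group in groups.items():
        # Draw with replacement from the class's own records to fill the gap.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_records.extend(group + extra)
        out_labels.extend([label] * target)
    return out_records, out_labels


def random_undersample(records, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(records, labels)
    target = min(len(group) for group in groups.values())
    out_records, out_labels = [], []
    for label, group in groups.items():
        # Sample without replacement so no record is kept twice.
        out_records.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_records, out_labels


# Toy data (assumed for illustration): 95 majority records, 5 minority records.
records = [[i, i % 7] for i in range(100)]
labels = ["benign"] * 95 + ["attack"] * 5

over_X, over_y = random_oversample(records, labels)
under_X, under_y = random_undersample(records, labels)

print("original:    ", Counter(labels))
print("oversampled: ", Counter(over_y))
print("undersampled:", Counter(under_y))
```

Resampling of this kind is normally applied to the training split only, so that evaluation still reflects the original class distribution. Note also that random oversampling only repeats existing minority records, whereas SMOTE and ADASYN synthesize new ones.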