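Computer science
Consistency (knowledge bases)
Usability
Quality (philosophy)
Data mining
Domain (mathematical analysis)
Machine learning
Traffic classification
Artificial intelligence
Data science
Quality of service
Computer network
Human-computer interaction
Epistemology
Mathematical analysis
Philosophy
Mathematics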
Authors
Dominik Soukup, Peter Tisovcik, Karel Hynek, Tomáš Čejka
Identifier
DOI:10.23919/cnsm52442.2021.9615601
Abstract
This paper deals with the quality of network traffic datasets created to train and validate machine learning classification and detection methods. Data quality has long been a subject of research; however, that work focuses mainly on data consistency, validity, precision, and other metrics, which are insufficient for network traffic use cases. The rise of machine learning in network monitoring applications requires a new methodology for evaluating datasets. There is a need to evaluate and compare traffic samples captured under different conditions and to decide on the usability of already captured and annotated data. This paper aims to explain a use case of dataset creation, propose definitions regarding the quality of network traffic datasets, and, finally, describe a framework for dataset analysis.