计算机科学
协议(科学)
聚类分析
代表(政治)
嵌入
数据挖掘
交通生成模型
分布式计算
机器学习
人工智能
计算机网络
医学
替代医学
病理
政治
政治学
法学
作者
Junchen Li,Guang Cheng,Zongyao Chen,Peng Zhao
标识
DOI:10.1016/j.cose.2023.103575
摘要
Protocol Reverse Engineering (PRE) has been widely studied in recent years as the most direct approach for analyzing unknown traffic, which is predominantly generated by private protocols. With the increase in private protocols, network traffic keeps deepening the unknown, leading to supervised learning methods struggling to obtain effective models when prior knowledge is absent. Furthermore, the unknown traffic captured in the real-world environment is actually mixed, and it cannot be directly provided to PRE for further analysis due to the lack of labels associated with private protocols. To address this issue in PRE, we propose an approach for dividing the unknown traffic into clusters with the labels of different private protocols in this paper, named FEAC. Firstly, we propose the general structure of protocol specification through an extensive investigation of protocols. Then, the unknown traffic is characterized as the Protocol Specification Fusion Vector (PSFV) based on word embedding, fusing the multidimensional information of protocol specification introduced before. After that, representation learning is employed in refining the information of the PSFVs to compress the dimension, reducing the complexity of computation. Finally, we combine the refined PSFVs and DBSCAN algorithm to implement the protocol clustering of unknown traffic, improving the analysis ability of PRE on unknown traffic. We carry out comprehensive experiments for comparison on real-world network traffic, and the experimental results demonstrate that FEAC achieves the ideal clustering performance and has advantages over previous work.
科研通智能强力驱动
Strongly Powered by AbleSci AI