Botnet
Malware
Computer science
Domain Name System
Domain (mathematical analysis)
Command and control
Leverage (statistics)
Network packet
Nanogram
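Data mining
Network security
Machine learning
Artificial intelligence
Computer security
Language model
World Wide Web
Internet
Mathematical analysis
Mathematics
Telecommunications
Identification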
DOI:10.1109/bdicn55575.2022.00014
Abstract
The botnet is a severe threat to computer networks, and detecting botnet behavior is an important research area in cyber security. Malware authors leverage Domain Generation Algorithms (DGAs) to generate large numbers of pseudo-random domain names for connecting to the Command and Control (C&C) server, which makes detection and prevention extremely difficult. Previous work mostly defended against DGA domains by pre-registering, sinkholing, or publishing blacklists after reverse engineering the malware. However, these approaches can be easily bypassed by malware authors. For most communications between the botnet and the C&C server, the first step is generally to send Domain Name System (DNS) request packets. Thus, an alternative approach is to capture and analyze DNS traffic and classify the domains. Most previous work tried to cluster the domains, and these techniques relied on contextual information. As a result, the algorithms take a long time to run, which means these techniques cannot be used for real-time detection. In contrast to these traditional methods, recent methods attempt to predict whether a domain is DGA-generated based solely on the domain name string. Nevertheless, these methods rely on human-engineered features that can be readily circumvented by attackers. In this paper, we propose a method that extracts linguistic features and applies machine learning algorithms to classify domain names. To verify the performance of the proposed method, we designed and implemented a botnet detection system, and trained and tested the model with real data. The results demonstrate that the proposed method is able to capture suspicious packets and accurately classify the domains. Evaluated on real traffic, our system correctly classifies DGA domains in 95% of cases. Furthermore, when detecting unknown DGA domains, our system achieves an accuracy of 88.5%.
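To make the idea of classifying a domain from its name string alone more concrete, the following is a minimal Python sketch of string-level feature extraction. The abstract does not specify which linguistic features the paper uses, so the features here (label length, Shannon entropy, vowel ratio, digit ratio) and the names `shannon_entropy` and `domain_features` are illustrative assumptions, not the authors' method. In practice, such a feature vector would be passed to a trained machine learning classifier.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy (in bits per character) of a string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def domain_features(domain: str) -> dict:
    """Compute simple string-level features for a domain name.

    Assumption: only the leftmost label is analyzed (e.g. "google" in
    "google.com"). This is a simplification chosen for illustration.
    """
    label = domain.lower().rstrip(".").split(".")[0]
    letters = [c for c in label if c.isalpha()]
    digits = [c for c in label if c.isdigit()]
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        # Share of vowels among alphabetic characters.
        "vowel_ratio": sum(c in VOWELS for c in letters) / len(letters) if letters else 0.0,
        # Share of digits among all characters of the label.
        "digit_ratio": len(digits) / len(label) if label else 0.0,
    }


if __name__ == "__main__":
    # "xkqzvbnw.net" is a made-up random-looking name, not a real DGA sample.
    for name in ("google.com", "xkqzvbnw.net"):
        print(name, domain_features(name))
```

Random-looking labels tend to have higher character entropy and fewer vowels than dictionary-like labels, which is one reason such features are commonly tried. As the abstract notes, hand-picked features of this kind can be circumvented by attackers, so this sketch should be read as a baseline illustration only.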