Classifying Malicious Domains using DNS Traffic Analysis

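Phishing, Malware, Botnet, Computer science, Domain Name System, Computer security, Domain (mathematical analysis), Vetting, Internet, Blacklist, Domain name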
Authors
Samaneh Mahdavifar, Nasim Maleki, Arash Habibi Lashkari, Matt Broda, Amir H. Razavi
Identifier
DOI:10.1109/dasc-picom-cbdcom-cyberscitech52372.2021.00024
Abstract

Malicious domains are one of the major threats that have jeopardized the viability of the Internet over the years. Threat actors usually abuse the Domain Name System (DNS) to lure users into becoming victims of malicious domains that host drive-by-download malware, botnets, phishing websites, or spam messages. Each year, many large corporations are affected by these threats, and a single attack can cause huge financial losses. Detecting and classifying a malicious domain in a timely manner is therefore essential. Previously, filtering domains against blacklists was the only way to detect malicious domains; however, this approach could not detect newly generated domains. More recently, Machine Learning (ML) techniques have helped to enhance the detection capability of domain vetting systems. A solid feature engineering mechanism plays a pivotal role in boosting the performance of any ML model. We have therefore extracted effective and practical features from DNS traffic and categorized them into three groups: lexical-based, DNS statistical-based, and third-party-based features. Third-party features are biographical information about a specific domain extracted from third-party APIs. The ratio of benign to malicious domains is also critical for simulating the real-world setting, in which approximately 99% of the traffic is benign. In this paper, we generate and release a large DNS features dataset of 400,000 benign and 13,011 malicious samples, processed from a million benign and 51,453 known-malicious domains taken from publicly available datasets. The malicious samples span three categories: spam, phishing, and malware. Our dataset, named CIC-Bell-DNS2021, replicates real-world scenarios with frequent benign traffic and diverse malicious domain types. We train and validate a classification model that, unlike previous works that focus on binary detection, detects the type of the attack, i.e., spam, phishing, or malware. The classification performance of various ML algorithms on our generated dataset demonstrates the effectiveness of our model: we achieved the best results with $k$-Nearest Neighbors ($k$-NN), with F1-scores of 94.8% and 99.4% for the balanced data ratio (60/40%) and the imbalanced data ratio (97/3%), respectively. Finally, we evaluated the features using information gain analysis to determine the merit of each feature in each category, which showed the third-party features to be the most influential among the top 13 features.

Keywords: Malicious Domain, DNS, Feature Engineering, Lexical, Statistical, Third Party, Classification
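To make the notions of lexical features and information gain more concrete, the following Python sketch computes a few simple lexical features of a domain name and the information gain of a discrete feature with respect to class labels. The function names (`shannon_entropy`, `lexical_features`, `information_gain`) and the particular features chosen are assumptions made for illustration only; they are not the feature set or the implementation used for CIC-Bell-DNS2021.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the characters in text."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def lexical_features(domain: str) -> dict:
    """Hypothetical lexical features; not the paper's actual feature list."""
    name = domain.lower().rstrip(".")
    digits = sum(ch.isdigit() for ch in name)
    return {
        "length": len(name),
        "num_labels": len(name.split(".")),
        "digit_ratio": digits / len(name) if name else 0.0,
        "entropy": shannon_entropy(name),
    }


def information_gain(feature_values, labels) -> float:
    """Information gain of a discrete feature: H(labels) - H(labels | feature)."""
    if not labels:
        return 0.0

    def label_entropy(items):
        total = len(items)
        return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())

    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)

    total = len(labels)
    conditional = sum(len(g) / total * label_entropy(g) for g in groups.values())
    return label_entropy(labels) - conditional
```

In this sketch a feature that separates the classes perfectly has an information gain equal to the label entropy, while a feature that is independent of the labels has a gain of zero. Continuous features such as entropy would need to be discretized before being passed to `information_gain`.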