Phishing Website Detection: An In‐Depth Investigation of Feature Selection and Deep Learning

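Keywords: Phishing, Computer science, Machine learning, Artificial intelligence, Random forest, Overfitting, Convolutional neural network, Deep learning, Feature selection, Decision tree, Support vector machine, Scalability, Perceptron, Sandbox (software development), Data mining, Artificial neural network, Internet, World Wide Web, Database, Software engineering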
Authors
Seyed Amin Mousavi, Mahdi Bahaghighat
Source
Journal: Expert Systems [Wiley]
Volume (Issue): 42 (3)
Identifier
DOI: 10.1111/exsy.13824
Abstract

Cloud and fog computing technologies benefit from integrating AI‐driven phishing detection, as it enhances security, scalability, real‐time response, and privacy. Illegal online activity has risen noticeably. One such practice is phishing, in which attackers impersonate authentic websites to trick users into disclosing sensitive information. According to research, phishing attacks have increased dramatically in recent years. Machine learning (ML) and deep learning (DL) techniques have shown encouraging progress in thwarting these attacks. Consequently, we employed ML and DL techniques to identify phishing websites in this study. This article presents four scenarios: two based on ML and two based on DL. The outcomes of the four scenarios were compared to determine which algorithm better distinguishes legitimate websites from phishing websites. Several popular ML techniques were used, including K‐nearest neighbour, random forest (RF), decision trees, and support vector machines (SVMs). Principal component analysis (PCA) and feature importance are applied in the ML scenarios to find the best features. RF reached an accuracy of 97.82% using the feature importance technique, whereas PCA failed to improve the performance of the ML algorithms. Based on the ML scenarios, 98 features are selected for the final DL scenarios. In the DL‐based scenarios, architecture design is essential to avoid overfitting and bias arising from the various hyperparameters. The third scenario therefore focuses on DL architecture design. Multilayer perceptron and convolutional neural networks (CNNs) are employed to detect phishing websites. Finally, our proposed 1D CNN model, using stratified k‐fold cross‐validation, outperformed the classical ML algorithms, achieving 98.94% accuracy and an AUC‐ROC score of 0.99 in detecting phishing websites.
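The abstract credits stratified k‐fold cross‐validation for the final 1D CNN result but does not describe how the folds were built. The following is a minimal sketch of the general technique, not the authors' implementation: the function name `stratified_k_fold`, the toy label list, and the choice of five folds are all assumptions made for illustration. It shows only the splitting step, in which every test fold keeps roughly the same phishing-to-legitimate ratio as the full dataset.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep the class ratio.

    Indices of each class are shuffled, then dealt round-robin into k folds,
    so every fold receives a near-equal share of every class.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices across the k folds.
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Hypothetical toy labels (not from the paper): 1 = phishing, 0 = legitimate.
labels = [1] * 30 + [0] * 70
splits = list(stratified_k_fold(labels, k=5, seed=42))

for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    phishing_share = sum(labels[i] for i in test_idx) / len(test_idx)
    print(f"fold {fold_no}: test size={len(test_idx)}, "
          f"phishing share={phishing_share:.2f}")
```

With these toy labels, each of the five test folds holds 20 samples, 6 of them phishing, matching the 30% share of the whole set. In practice a model would be trained on `train_idx` and scored on `test_idx` for each fold, and the scores averaged; scikit-learn's `StratifiedKFold` provides the same kind of split for real datasets.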
