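Computer science
Malware
Android malware
Cluster analysis
Android (operating system)
Artificial intelligence
Malware analysis
Call graph
Graph
Machine learning
Data mining
Programming language
Theoretical computer science
Operating system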
Authors
Hongyu Yang,Youwei Wang,Liang Zhang,Xiang Cheng,Ze Hu
Identifier
DOI:10.1016/j.cose.2023.103651
Abstract
Because both the Android framework and malware evolve continuously, conventional malware detection methods trained on outdated apps are inadequate for identifying sophisticated evolved malware. To address this issue, we propose a novel Android malware detection method with API semantics extraction (AMDASE), which can effectively identify evolved malware instances. First, before malware detection, AMDASE performs API clustering to obtain cluster centers that represent API functions. We design an API sentence to summarize API features and employ natural language processing (NLP) tools to obtain embeddings of the API sentences for clustering. The API sentence makes it possible to extract the semantics carried by API features such as the method name, which accurately reflects the API's intended functionality; this also makes the clustering results more accurate. Second, AMDASE extracts a call graph from each app and optimizes it by removing nodes that correspond to unknown functions while preserving connectivity between their predecessor and successor nodes. The optimized call graph yields more robust API contextual information that accurately represents the behavior of each app. Third, to remain resilient against the evolution of Android malware, AMDASE extracts function call pairs from the optimized call graph and abstracts the APIs in those pairs into the cluster centers obtained during API clustering. Finally, feature vectors are generated using one-hot mapping, and machine learning classifiers are used for malware detection. We evaluate AMDASE on a dataset of 42,154 benign and 42,450 malicious apps developed over a seven-year period. The experimental results demonstrate that AMDASE greatly outperforms existing state-of-the-art methods and ages significantly more slowly.
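The call graph optimization, call pair abstraction, and one-hot mapping steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`prune_unknown`, `abstract_pairs`, `one_hot`), the toy node names, and the cluster labels are all hypothetical, and the sketch assumes that a node is "unknown" when it has no entry in an API-to-cluster mapping and that the feature vector marks which abstracted call pairs are present.

```python
from collections import defaultdict


def prune_unknown(edges, known):
    """Remove unknown nodes, linking each predecessor to each successor."""
    succ, pred, nodes = defaultdict(set), defaultdict(set), set()
    for caller, callee in edges:
        succ[caller].add(callee)
        pred[callee].add(caller)
        nodes.update((caller, callee))
    for node in sorted(nodes - set(known)):
        # Ignore self-loops on the node being removed.
        preds = pred.pop(node, set()) - {node}
        succs = succ.pop(node, set()) - {node}
        for p in preds:
            succ[p].discard(node)
            succ[p].update(succs)  # bypass edge: predecessor -> successor
        for s in succs:
            pred[s].discard(node)
            pred[s].update(preds)
    return {(a, b) for a, callees in succ.items() for b in callees}


def abstract_pairs(edges, api_to_cluster):
    """Replace both ends of every call pair with its cluster label."""
    return {(api_to_cluster[a], api_to_cluster[b]) for a, b in edges}


def one_hot(pairs, clusters):
    """Binary vector over all ordered cluster pairs (assumed encoding)."""
    vocabulary = [(a, b) for a in clusters for b in clusters]
    return [1 if pair in pairs else 0 for pair in vocabulary]


# Toy data: names and cluster labels are made up for illustration.
api_to_cluster = {
    "getDeviceId": "device_info",
    "openConnection": "network",
    "sendTextMessage": "sms",
}
call_edges = [
    ("getDeviceId", "obf_a"),       # obf_a and obf_b stand in for
    ("obf_a", "obf_b"),             # unknown (e.g. obfuscated) functions
    ("obf_b", "openConnection"),
    ("openConnection", "sendTextMessage"),
]

pruned = prune_unknown(call_edges, api_to_cluster)
pairs = abstract_pairs(pruned, api_to_cluster)
features = one_hot(pairs, sorted(set(api_to_cluster.values())))
print(features)
```

In this toy run the two unknown nodes are bypassed, so `getDeviceId` is linked directly to `openConnection`. Because the features are defined over cluster labels rather than concrete API names, a newly introduced API that falls into an existing cluster would map to the same feature, which is the intuition behind the slower aging the abstract reports. The resulting vector could then be passed to any standard classifier.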