Who is Who on Ethereum? Account Labeling Using Heterophilic Graph Convolutional Network
Computer science
Graph
Theoretical computer science
Authors
Dan Lin, Jiajing Wu, Tao Huang, Kaixin Lin, Zibin Zheng
Source
Journal: IEEE Transactions on Systems, Man, and Cybernetics [Institute of Electrical and Electronics Engineers] Date: 2023-11-21 Volume (Issue): 54 (3): 1541-1553 Citations: 4
Identifier
DOI:10.1109/tsmc.2023.3329520
Abstract
To combat cybercrimes and maintain financial security for the blockchain ecosystem, "know your customer" (KYC) is an essential and also challenging process due to the pseudonymous nature of blockchain technology. To unlock the potential of KYC on blockchain-based platforms like Ethereum, account labeling is a powerful means which can de-anonymize addresses by mining public transaction records. Existing studies on account labeling are mainly conducted via machine learning (ML) methods fed with hand-crafted features or graph neural networks based on the modeled transaction network. However, ML approaches based on hand-crafted features ignore the global interaction information between accounts, making it easy for criminals to evade detection. Moreover, the performance of traditional GCN methods is limited when they are applied to the Ethereum transaction network, owing to its label sparsity, heterophily, and large size. In this article, we first analyze Ethereum accounts involved in typical businesses, in terms of both account and topological features. Then, based on the analytical results, we propose a novel GCN method named know-your-customer graph convolutional network (KYC-GCN), which contains two key designs: 1) multihop aggregators and importance-based sampling are designed to tackle the dilemma between accuracy and efficiency, and 2) the GCN architecture is improved to explicitly capture local and more global information. Experimental results on a realistic Ethereum dataset show that the proposed KYC-GCN (90.2% accuracy, 86.2% Macro-F1) achieves state-of-the-art classification performance, and results on six benchmarks demonstrate that it performs well under both homophily and heterophily.
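The abstract names network heterophily as one obstacle for standard GCNs. As a rough illustration only, the sketch below computes the commonly used edge homophily ratio (the fraction of edges whose two endpoints share a label) on a tiny made-up transaction graph. The function name `edge_homophily`, the account names, and the labels are all hypothetical and are not taken from the paper; the paper's own analysis may use a different measure.

```python
from typing import Hashable, Iterable, Mapping, Tuple


def edge_homophily(
    edges: Iterable[Tuple[Hashable, Hashable]],
    labels: Mapping[Hashable, str],
) -> float:
    """Fraction of edges joining two accounts with the same label.

    Edges with an unlabeled endpoint are skipped (an assumption made here
    to cope with sparse labels). Returns NaN if no edge can be evaluated.
    """
    same = 0
    counted = 0
    for u, v in edges:
        if u in labels and v in labels:
            counted += 1
            if labels[u] == labels[v]:
                same += 1
    return same / counted if counted else float("nan")


# Hypothetical toy graph: four labeled accounts and one unlabeled account "e".
labels = {"a": "exchange", "b": "exchange", "c": "phishing", "d": "phishing"}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("a", "e")]

ratio = edge_homophily(edges, labels)
print(f"edge homophily ratio: {ratio:.2f}")
```

A ratio near 1 indicates a homophilic graph, where neighbors mostly share labels; a ratio near 0 indicates a heterophilic one, which is the setting in which plain neighbor averaging tends to blur class information.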