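Computer science
Payment
Big data
Database transaction
SPARK (programming language)
Preprocessor
Transaction processing
Data pre-processing
Transaction data
Data mining
Computer security
Database
Artificial intelligence
World Wide Web
Programming language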
Authors
Samiksha Dattaprasad Tawde,Sandhya Arora,Yashasvee Shitalkumar Thakur
Identifier
DOI:10.1007/978-3-031-50583-6_22
Abstract
Online payment systems now greatly facilitate modern economic life by allowing seamless financial transactions. However, the risk of online payment fraud has increased sharply along with the growth of digital transactions. This calls for sophisticated fraud detection systems that can instantly evaluate huge amounts of transaction data. This study proposes a novel method for identifying online payment fraud using big data management techniques, specifically the capabilities of PySpark. PySpark builds on the Resilient Distributed Dataset (RDD) data structure, which keeps data in RAM instead of writing it to disk after each operation. RDD transformations are lazy, i.e., they do not execute until an action is called on them. After the data is preprocessed, machine learning algorithms from the Spark ML package are applied; this library provides optimized machine learning capabilities for classification problems that require distributed computing. The classification models that achieve the best metrics on our dataset are then used to make accurate detections. Our fraud detection system aims to help large organizations assess their enormous volumes of transaction data and detect possible anomalies or fraudulent activity.
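The lazy evaluation mentioned in the abstract can be illustrated with a small sketch. This is not PySpark and not the authors' code: `LazyDataset` is a hypothetical pure-Python stand-in that only mimics the split between transformations (`map`, `filter`), which are merely recorded, and actions (`collect`, `count`), which trigger the work. The transaction records and the 10,000 threshold are made-up values for illustration.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, source: Iterable[Any],
                 steps: List[Tuple[str, Callable]] = None):
        self._source = source
        self._steps = list(steps or [])  # pending transformations, in order

    # Transformations: return a new dataset and compute nothing.
    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + [("map", fn)])

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        return LazyDataset(self._source, self._steps + [("filter", pred)])

    def _run(self) -> Iterator[Any]:
        # Chain the recorded steps into one iterator pipeline.
        stream = iter(self._source)
        for kind, fn in self._steps:
            stream = map(fn, stream) if kind == "map" else filter(fn, stream)
        return stream

    # Actions: actually pull the data through the pipeline.
    def collect(self) -> List[Any]:
        return list(self._run())

    def count(self) -> int:
        return sum(1 for _ in self._run())


calls = []  # records every record the transformation actually touches


def flag_large(tx: dict) -> dict:
    calls.append(tx["id"])
    # Hypothetical rule: amounts above 10,000 are marked suspicious.
    return {**tx, "suspicious": tx["amount"] > 10_000}


# Made-up transactions, for illustration only.
transactions = [
    {"id": 1, "amount": 250.0},
    {"id": 2, "amount": 48_000.0},
    {"id": 3, "amount": 12_500.0},
]

flagged = LazyDataset(transactions).map(flag_large).filter(
    lambda tx: tx["suspicious"]
)
calls_before_action = len(calls)  # nothing has executed yet

suspicious_ids = [tx["id"] for tx in flagged.collect()]  # action triggers work
calls_after_action = len(calls)
```

In this sketch `flag_large` is not called until `collect()` runs, which mirrors the transformation/action distinction in RDDs. A real RDD additionally partitions the data across a cluster and can cache intermediate results in memory, neither of which this toy class attempts.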