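Principal component analysis
Dimensionality reduction
Computer science
Data processing
Data mining
Big data
Curse of dimensionality
Sparse PCA
Data quality
Pattern recognition (psychology)
Information processing
Artificial intelligence
Database
Engineering
Metric (unit)
Operations management
Neuroscience
Biology
Identifier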
DOI:10.2478/amns-2024-0664
Abstract
With the arrival of the big data era, processing large-scale multidimensional data efficiently has become a challenge. Principal Component Analysis (PCA) is a powerful dimensionality reduction tool that plays a vital role in big data processing, with particular advantages in information extraction and data simplification. This research aims to simplify data processing and improve its efficiency by means of PCA. The method combines the basic theory of PCA, an improved weighted principal component analysis algorithm, and data standardization and homogenization techniques to process large-scale multidimensional data sets. The results show that data dimensionality is significantly reduced after applying PCA. For example, in the analysis of the earnings quality of listed companies in the e-commerce industry, the cumulative variance contribution rate of the first four principal components extracted by PCA reaches 81.623%, which effectively retains the primary information of the original data. PCA not only reduces the complexity of the data but also retains a large amount of crucial information, which gives it significant application value for big data processing, especially in the fields of data compression and pattern recognition.
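The cumulative variance contribution rate mentioned in the abstract can be illustrated with a minimal sketch of standard PCA on standardized data. This is not the paper's weighted algorithm or its data set. The function names `pca_cumulative_variance` and `components_needed`, the 0.80 threshold, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pca_cumulative_variance(X):
    """Standardize the columns of X, then run PCA on the result.

    Returns the standardized data, the eigenvalues in descending order,
    the matching eigenvectors (as columns), and the cumulative variance
    contribution rate of the components.
    """
    # Z-score standardization: zero mean and unit variance per variable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # The covariance of standardized data is the correlation matrix of X.
    corr = np.cov(Z, rowvar=False)
    # eigh is appropriate for symmetric matrices; it returns ascending order.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum_ratio = np.cumsum(eigvals) / eigvals.sum()
    return Z, eigvals, eigvecs, cum_ratio


def components_needed(cum_ratio, threshold=0.80):
    """Smallest number of components whose cumulative rate reaches threshold."""
    k = int(np.searchsorted(cum_ratio, threshold)) + 1
    return min(k, len(cum_ratio))


if __name__ == "__main__":
    # Synthetic data (an assumption): 10 observed variables driven by
    # 3 latent factors plus noise.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 3))
    loadings = rng.normal(size=(3, 10))
    X = latent @ loadings + 0.3 * rng.normal(size=(200, 10))

    Z, eigvals, eigvecs, cum_ratio = pca_cumulative_variance(X)
    k = components_needed(cum_ratio, threshold=0.80)
    scores = Z @ eigvecs[:, :k]  # data projected onto the first k components
    print("cumulative contribution rates:", np.round(cum_ratio, 3))
    print("components kept:", k, "reduced shape:", scores.shape)
```

Under this reading, a reported rate of 81.623% for four components corresponds to `cum_ratio[3]` being about 0.816, so the first four components would be kept at an 80% threshold.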