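Principal component analysis
Outlier
Matrix (chemical analysis)
Multidimensional scaling
Data matrix
Scaling
Disjoint sets
Set (abstract data type)
Variance (accounting)
Variable (mathematics)
Statistics
Data set
Data analysis
Computer science
Data mining
Mathematics
Combinatorics
Phylogenetic tree
Materials science
Chemistry
Mathematical analysis
Programming language
Geometry
Composite material
Business
Accounting
Gene
Clade
Biochemistry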
Authors
Svante Wold, Kim H. Esbensen, Paul Geladi
Identifier
DOI:10.1016/0169-7439(87)80084-9
Abstract
Principal component analysis of a data matrix extracts the dominant patterns in the matrix in terms of a complementary set of score and loading plots. It is the responsibility of the data analyst to formulate the scientific issue at hand in terms of PC projections, PLS regressions, etc. Ask yourself, or the investigator, why the data matrix was collected, and for what purpose the experiments and measurements were made. Specify before the analysis what kinds of patterns you would expect and what you would find exciting. The results of the analysis depend on the scaling of the matrix, which therefore must be specified. Variance scaling, where each variable is scaled to unit variance, can be recommended for general use, provided that almost constant variables are left unscaled. Combining different types of variables warrants blockscaling. In the initial analysis, look for outliers and strong groupings in the plots, which may indicate that the data matrix should be “polished” or that disjoint modeling is the proper course. For plotting purposes, two or three principal components are usually sufficient, but for modeling purposes the number of significant components should be properly determined, e.g. by cross-validation. Use the resulting principal components to guide your continued investigation or chemical experimentation, not as an end in themselves.
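The variance scaling and the score/loading decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `variance_scale` and `pca_svd`, the `min_std` threshold used to decide which variables count as "almost constant", and the synthetic data are all assumptions made for illustration. In practice the analyst would choose that threshold from knowledge of the measurement noise.

```python
import numpy as np

def variance_scale(X, min_std=1e-8):
    """Mean-center each column and scale it to unit variance.

    Columns whose standard deviation is below `min_std` (a hypothetical,
    analyst-chosen threshold) are centered but left unscaled, so that
    near-constant variables are not inflated into pure noise.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    scale = np.where(std > min_std, std, 1.0)
    return centered / scale

def pca_svd(X_scaled, n_components=2):
    """Return (scores, loadings, explained variance fractions) via SVD."""
    U, s, Vt = np.linalg.svd(X_scaled, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one row per object
    loadings = Vt[:n_components].T                     # one row per variable
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained[:n_components]

# Synthetic data (assumed): 20 objects, 4 variables on very different
# scales, the last of which is exactly constant.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 1))
X = np.hstack([
    latent * 10 + rng.normal(scale=0.5, size=(20, 1)),
    latent * 0.1 + rng.normal(scale=0.005, size=(20, 1)),
    rng.normal(size=(20, 1)),
    np.full((20, 1), 3.0),
])

Xs = variance_scale(X)
scores, loadings, explained = pca_svd(Xs, n_components=2)
print("scores shape:", scores.shape)
print("loadings shape:", loadings.shape)
print("explained variance fractions:", explained)
```

Plotting the two columns of `scores` against each other gives a score plot in which outliers and groupings can be looked for, and the rows of `loadings` give the corresponding loading plot. Two components are used here only for plotting; choosing the number of components for modeling, e.g. by cross-validation, is not covered by this sketch.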