MachineLearning.pca Method (InArray<Double>, OutArray<Double>, OutArray<Double>, OutArray<Double>)

Principal Component Analysis (PCA).

[ILNumerics Machine Learning Toolbox]

Namespace:  ILNumerics.Toolboxes
Assembly:  ILNumerics.Toolboxes.MachineLearning (in ILNumerics.Toolboxes.MachineLearning.dll) Version: 5.5.0.0 (5.5.7503.3146)
Syntax

public static RetArray<double> pca(
	InArray<double> A,
	OutArray<double> outWeights = null,
	OutArray<double> outCenter = null,
	OutArray<double> outScores = null
)

Parameters

A
Type: ILNumerics.InArray<Double>
Data matrix of size (m,n); each of the n observations is expected as a column of m variables.
outWeights (Optional)
Type: ILNumerics.OutArray<Double>
[Output] Weights for scaling the components to match the scale of the original data.
outCenter (Optional)
Type: ILNumerics.OutArray<Double>
[Output] Vector pointing to the center of the input data A.
outScores (Optional)
Type: ILNumerics.OutArray<Double>
[Output] Scaling factors for the components, needed to recreate the original data.

Return Value

Type: ILNumerics.RetArray<Double>
PCA components; weights, center and scores are returned via the optional output parameters.
Remarks

Principal Component Analysis (PCA) is commonly used as a method for dimension reduction. It computes a number of 'principal components' which span a space of orthogonal directions. These directions are chosen so as to maximize the variance of the original data once it is projected onto them. One can therefore pick only the subset of components associated with a high variance and leave out those components which do not contribute much to the distribution of the original data. The resulting subspace has fewer dimensions than the original space, while keeping the reconstruction error small. PCA is therefore commonly used for visualizing higher dimensional data in two or three dimensional plots. It helps analyzing datasets which otherwise could only be visualized by picking individual dimensions. With the help of PCA, 'interesting' directions in the data are identified.
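
The following sketch illustrates this dimension reduction use case. It is not part of the toolbox's own samples; it assumes ILNumerics' implicit conversions between Array<double> and the In-/Out-/RetArray types, and the ILMath functions randn, empty, repmat, multiply, r and full. The test data and variable names are made up for this example.

using ILNumerics;
using ILNumerics.Toolboxes;
using static ILNumerics.ILMath;

// made-up test data: 5 variables (rows), 200 observations (columns)
Array<double> A = randn(5, 200);

// all 5 components, plus the center of the data as optional output
Array<double> center = empty();
Array<double> components = MachineLearning.pca(A, outCenter: center);

// keep only the two highest-variance components and project the
// centered data onto them - yields 2 x 200 coordinates for a 2D plot
Array<double> centered = A - repmat(center, 1, 200);
Array<double> projected = multiply(components[full, r(0, 1)].T, centered);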

All output parameters are optional and may be omitted or provided as null:

  • components (return value): the principal components. Matrix of size m x m; the m components are provided in columns. The first component marks the direction in the data A which corresponds to the largest variance (i.e. projecting the data onto that direction would create the largest variance). All components are orthogonal to each other, and the columns are ordered by decreasing variance.
  • weights vector. While the returned components are normalized to length 1, the 'weights' vector contains the factors needed to scale the components in order to reflect the real spatial distances in the original data.
  • center of the original data. The vector points to the centroid (the weighted middle) of A.
  • scores is a matrix of size m by n. For each data point given in A it contains the factors needed for each component in order to reproduce that data point in terms of the components (see the sketch in the Examples section below).

[ILNumerics Machine Learning Toolbox]
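
Examples

The following sketch retrieves all optional outputs. It is an illustration under the same assumptions as above (ILMath's randn, empty, multiply and repmat, plus the implicit Array conversions). The reconstruction line follows one plausible reading of the parameter descriptions, namely original data ≈ components · scores + center; verify this against your version of the toolbox before relying on it.

using ILNumerics;
using ILNumerics.Toolboxes;
using static ILNumerics.ILMath;

// made-up test data: 3 variables, 50 observations
Array<double> A = randn(3, 50);

// receivers for the optional outputs
Array<double> weights = empty(), center = empty(), scores = empty();

// components: 3 x 3, weights: one factor per component,
// center: 3 x 1, scores: 3 x 50 (one column of factors per observation)
Array<double> components = MachineLearning.pca(A, weights, center, scores);

// hedged reconstruction attempt, following the parameter descriptions:
// each data point ~ components * (its score column) + center
Array<double> Arec = multiply(components, scores) + repmat(center, 1, 50);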
