Principal components analysis
In statistics, principal components analysis (PCA) is a technique that can be used to simplify a dataset. More formally, it is a linear transform that chooses a new coordinate system for the data set such that the greatest variance by any projection of the data set comes to lie on the first axis (then called the first principal component), the second greatest variance on the second axis, and so on. PCA can be used to reduce the dimensionality of a dataset while retaining those characteristics that contribute most to its variance, by discarding the later principal components (how many to discard is a more or less heuristic decision). These characteristics may be the 'most important' ones, but this is not necessarily the case; it depends on the application.
PCA is also called the Karhunen-Loève transform (named after Kari Karhunen and Michel Loève) or the Hotelling transform (in honor of Harold Hotelling). Among linear transformations, PCA is optimal for keeping the subspace that has the largest variance. However, this comes at the price of greater computational cost compared with, for example, the discrete cosine transform. Unlike other linear transforms such as the discrete cosine transform, PCA does not have a fixed set of basis vectors: its basis vectors depend on the data set.
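The first principal component w1 of a dataset x can be defined as follows, assuming zero empirical mean (i.e. the empirical mean of the distribution has been subtracted from the data set):
$$\mathbf{w}_1 = \arg\max_{\|\mathbf{w}\| = 1} E\left\{ \left( \mathbf{w}^T \mathbf{x} \right)^2 \right\}$$
That is, w1 is the unit vector along which the projected data have the greatest variance.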
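As an illustration of this definition, the following is a minimal sketch, assuming NumPy and a data layout with one observation per row (both are assumptions, not part of the article). It takes the eigenvector of the empirical covariance matrix with the largest eigenvalue as w1 and compares its projected variance with that of random unit directions. The function name `first_principal_component` and the synthetic data are hypothetical.

```python
import numpy as np

def first_principal_component(X):
    """Return a unit vector w1 that maximizes the variance of the projection X @ w.

    X is an (n, d) array with one observation per row (an assumed layout).
    """
    Xc = X - X.mean(axis=0)                # subtract the empirical mean
    C = Xc.T @ Xc / Xc.shape[0]            # empirical covariance matrix, d x d
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    return eigvecs[:, -1]                  # eigenvector of the largest eigenvalue

rng = np.random.default_rng(0)

# Hypothetical 2-D data: wide along one axis, then rotated by 45 degrees,
# so the direction of greatest variance is (1, 1) / sqrt(2).
Z = rng.normal(size=(500, 2)) @ np.diag([3.0, 0.5])
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = Z @ R.T

w1 = first_principal_component(X)
Xc = X - X.mean(axis=0)
var_w1 = np.mean((Xc @ w1) ** 2)           # variance of the projection onto w1

# Projected variance along 1000 random unit directions, for comparison.
W = rng.normal(size=(1000, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)
var_random = np.mean((Xc @ W.T) ** 2, axis=0)

print("w1 =", w1)
print("variance along w1:", var_w1)
print("largest variance among random directions:", var_random.max())
```

The sign of w1 is arbitrary: both w1 and −w1 give the same projected variance.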
PCA is equivalent to empirical orthogonal functions (EOF).
PCA is a popular technique in pattern recognition. However, PCA is not optimized for class separability. An alternative is linear discriminant analysis, which does take this into account. Among linear projections onto a subspace of a given dimension, PCA minimizes the reconstruction error under the L2 norm.
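The reconstruction-error property can be checked numerically. The sketch below assumes NumPy, synthetic data, and a covariance normalized by n (all assumptions for illustration). It projects centered data onto the first k eigenvectors of the covariance matrix, measures the mean squared L2 reconstruction error, and compares it with the error for a random orthonormal basis of the same size. The helper `reconstruction_error` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 200 observations in 5 dimensions with unequal spread.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Xc = X - X.mean(axis=0)                    # zero empirical mean
C = Xc.T @ Xc / Xc.shape[0]                # empirical covariance (normalized by n)

eigvals, eigvecs = np.linalg.eigh(C)       # ascending order
order = np.argsort(eigvals)[::-1]          # reorder by decreasing eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

def reconstruction_error(Xc, B):
    """Mean squared L2 error after projecting rows of Xc onto the columns of B.

    B is assumed to have orthonormal columns.
    """
    X_hat = Xc @ B @ B.T                   # project, then map back to d dimensions
    return np.mean(np.sum((Xc - X_hat) ** 2, axis=1))

k = 2
P = eigvecs[:, :k]                         # first k principal directions
err_pca = reconstruction_error(Xc, P)

# A random k-dimensional orthonormal basis, for comparison.
Q, _ = np.linalg.qr(rng.normal(size=(5, k)))
err_random = reconstruction_error(Xc, Q)

discarded = eigvals[k:].sum()              # variance carried by the dropped components

print("PCA reconstruction error:   ", err_pca)
print("sum of discarded eigenvalues:", discarded)
print("random-basis error:          ", err_random)
```

With this normalization, the PCA reconstruction error equals the sum of the discarded eigenvalues, which is one way to judge how many components to keep.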
Algorithm details
Following is a detailed English description of PCA using the covariance method. Suppose you have n data vectors of d dimensions each, and you want to project your data into a k-dimensional subspace.
Find the basis vectors
Projecting new data
Suppose you have a d×1 data vector D. Then the k×1 projected vector is v = P^T(D − M). Here M is the d×1 empirical mean vector that was subtracted from the data, and P is the d×k matrix whose columns are the k basis vectors.
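The covariance method and the projection v = P^T(D − M) could be sketched as follows. This is one common way to carry out the procedure, not necessarily the exact steps intended by the article. It assumes NumPy, data stored as a d×n array with one data vector per column, and a covariance normalized by n. The function names `find_basis_vectors` and `project` and the intermediate names B, C and V are hypothetical.

```python
import numpy as np

def find_basis_vectors(data, k):
    """Covariance-method PCA sketch.

    data: (d, n) array whose n columns are the d-dimensional data vectors.
    Returns M, the d x 1 empirical mean, and P, the d x k matrix of basis vectors.
    """
    M = data.mean(axis=1, keepdims=True)   # empirical mean along each dimension
    B = data - M                           # mean-centered data, d x n
    C = B @ B.T / data.shape[1]            # empirical covariance, d x d (assumes 1/n)
    eigvals, V = np.linalg.eigh(C)         # C is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]      # sort by decreasing eigenvalue
    P = V[:, order[:k]]                    # keep the first k eigenvectors
    return M, P

def project(D, M, P):
    """Project a d x 1 data vector D into k dimensions: v = P^T (D - M)."""
    return P.T @ (D - M)

rng = np.random.default_rng(2)
data = rng.normal(size=(4, 100))           # hypothetical data: d = 4, n = 100
M, P = find_basis_vectors(data, k=2)

D = rng.normal(size=(4, 1))                # a new d x 1 data vector
v = project(D, M, P)                       # k x 1 projected vector

print("M shape:", M.shape, "P shape:", P.shape, "v shape:", v.shape)
```

Because `project` relies on broadcasting, it also accepts a d×m array of several new vectors at once and returns a k×m array.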
