Principal Component Analysis

Principal Components Analysis (PCA)

Objective

Capture the intrinsic variability in the data.

Reduce the dimensionality of a data set, either to ease interpretation or as a way to avoid overfitting and to prepare for subsequent analysis.

Let $\mathbf{X}$ be the $N \times p$ data matrix with centered columns. The sample covariance matrix of $\mathbf{X}$ is $\mathbf{S} = \mathbf{X}^T\mathbf{X}/N$, since each column of $\mathbf{X}$ has zero mean.
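As a quick numerical check of the covariance formula, the sketch below centers a synthetic data matrix and compares $\mathbf{X}^T\mathbf{X}/N$ with NumPy's covariance estimate. The data, the sizes `N` and `p`, and the variable names are illustrative assumptions, not part of the notes.

```python
import numpy as np

# Synthetic data (assumption): N observations, p correlated features.
rng = np.random.default_rng(0)
N, p = 200, 3
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X_raw = rng.normal(size=(N, p)) @ mixing

# Center each column so that X has zero mean.
X = X_raw - X_raw.mean(axis=0)

# Sample covariance with the 1/N normalization used in the text.
S = X.T @ X / N

# Reference: np.cov with bias=True also divides by N.
S_ref = np.cov(X_raw, rowvar=False, bias=True)
```

Note that many libraries default to the $1/(N-1)$ normalization instead; the two differ only by a constant factor, so the eigenvectors are unaffected.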

Eigendecomposition of $\mathbf{X}^T\mathbf{X}$, using the singular value decomposition $\mathbf{X} = \mathbf{U}\mathbf{D}\mathbf{V}^T$:

$\mathbf{X}^T\mathbf{X} = (\mathbf{U}\mathbf{D}\mathbf{V}^T)^T (\mathbf{U}\mathbf{D}\mathbf{V}^T) =\mathbf{V}\mathbf{D}^T\mathbf{U}^T\mathbf{U}\mathbf{D}\mathbf{V}^T = \mathbf{V}\mathbf{D}^2\mathbf{V}^T$
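The identity above can be checked numerically. The following is a minimal sketch on synthetic data (the matrix sizes and names are assumptions): it computes the thin SVD of a centered matrix and confirms that $\mathbf{V}\mathbf{D}^2\mathbf{V}^T$ reproduces $\mathbf{X}^T\mathbf{X}$ and that the eigenvalues of $\mathbf{X}^T\mathbf{X}$ equal the squared singular values.

```python
import numpy as np

# Synthetic centered data matrix (assumption): 50 rows, 4 columns.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
X = X - X.mean(axis=0)

# Thin SVD: X = U @ diag(d) @ Vt, singular values d in descending order.
U, d, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T

gram = X.T @ X
recon = V @ np.diag(d**2) @ V.T

# eigvalsh returns ascending eigenvalues; reverse to match the order of d.
eigvals = np.linalg.eigvalsh(gram)[::-1]
```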

The eigenvectors of $\mathbf{X}^T\mathbf{X}$ (i.e., $\mathbf{v}_j$, $j = 1, \dots, p$, the columns of $\mathbf{V}$) are called the principal component directions of $\mathbf{X}$.

The first principal component direction $\mathbf{v}_1$ has the following properties:

$\mathbf{v}_1$ is the eigenvector associated with the largest eigenvalue, $d_1^2$, of $\mathbf{X}^T\mathbf{X}$.
$\mathbf{z}_1 = \mathbf{X}\mathbf{v}_1$ has the largest sample variance among all normalized linear combinations of the columns of $\mathbf{X}$.
$\mathbf{z}_1$ is called the first principal component of $\mathbf{X}$. Moreover, $Var(\mathbf{z}_1) = d_1^2 / N$.
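These properties of $\mathbf{v}_1$ can be illustrated numerically. In the sketch below (synthetic data; the column scales and the number of random directions are arbitrary assumptions), the variance of $\mathbf{z}_1$ is compared with $d_1^2/N$ and with the projected variance along many random unit vectors. A random search does not prove the maximality claim; it only shows that no sampled direction does better.

```python
import numpy as np

# Synthetic centered data (assumption): columns with different scales.
rng = np.random.default_rng(2)
N, p = 300, 4
X = rng.normal(size=(N, p)) * np.array([3.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)

# First principal component direction: first row of Vt.
_, d, Vt = np.linalg.svd(X, full_matrices=False)
v1 = Vt[0]

# First principal component and its sample variance (np.var divides by N).
z1 = X @ v1
var_z1 = np.var(z1)

# Projected variance along 1000 random unit-norm directions.
W = rng.normal(size=(1000, p))
W /= np.linalg.norm(W, axis=1, keepdims=True)
var_random = np.var(X @ W.T, axis=0)
```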

The second principal component direction $\mathbf{v}_2$ (the direction orthogonal to the first component that has the largest projected variance) is the eigenvector corresponding to the second largest eigenvalue, $d_2^2$, of $\mathbf{X}^T\mathbf{X}$, and so on. (The eigenvector for the $k^{th}$ largest eigenvalue corresponds to the $k^{th}$ principal component direction $\mathbf{v}_k$.)

The $k^{th}$ principal component of $\mathbf{X}$, $\mathbf{z}_k = \mathbf{X}\mathbf{v}_k$, has maximum variance $d_k^2 / N$, subject to being orthogonal to the earlier ones.
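To see all components at once, the sketch below computes $\mathbf{Z} = \mathbf{X}\mathbf{V}$ on synthetic data (sizes and names are assumptions) and checks two things: the variance of the $k^{th}$ column is the $k^{th}$ squared singular value divided by $N$, and the components are mutually orthogonal, since $\mathbf{Z}^T\mathbf{Z} = \mathbf{V}^T\mathbf{X}^T\mathbf{X}\mathbf{V} = \mathbf{D}^2$ is diagonal.

```python
import numpy as np

# Synthetic centered data (assumption): columns with different scales.
rng = np.random.default_rng(3)
N, p = 250, 5
X = rng.normal(size=(N, p)) * np.array([4.0, 3.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)

_, d, Vt = np.linalg.svd(X, full_matrices=False)

# Column k of Z is the k-th principal component z_k = X v_k.
Z = X @ Vt.T

# Sample variances of the components (np.var divides by N).
var_Z = np.var(Z, axis=0)

# Gram matrix of the components; off-diagonal entries should vanish.
gram_Z = Z.T @ Z
```

The variances come out in decreasing order because the singular values returned by `np.linalg.svd` are sorted in descending order.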