Principal Component Analysis (PCA) is one of the most widely applied statistical techniques for reducing the dimensionality of data while losing as little information as possible. Nowadays, PCA is also popular in unsupervised machine learning. PCA leverages the spectral decomposition of a matrix, a well-known concept in linear algebra.

Source: Understanding Principal Component Analysis | by Trist'n Joseph | Towards Data Science

In this article, we discuss how to implement PCA manually, though many statistical software packages, such as EViews and STATA, will conduct PCA within a few clicks. Open-source languages, such as R and Python, also have packages that produce principal components with ease.

Principal Component Analysis in Python Manually - Jovian

We use matrices to derive principal components and denote a matrix with a capital letter.

Suppose we have a data matrix `X_{m \times n}`, with `m` observations (rows) and `n` variables (columns), and we need to reduce its dimension.

We need to carry out some preprocessing steps.

First, we standardize `X` and then mean-center the standardized `X`. (Standardization already centers each column, so the second step changes nothing numerically, but we write it out for completeness.)

` X_\text{std}=\frac{X-\bar{X}}{\sigma_X}`

`Z = X_\text{std} - \bar{X}_\text{std}`

We store the value of mean centered standardized `X` into `Z`.
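As a sketch of the preprocessing in NumPy (the data matrix here is made up purely for illustration):

```python
import numpy as np

# Hypothetical data: m = 5 observations (rows), n = 3 variables (columns)
X = np.array([[2.5, 2.4, 1.2],
              [0.5, 0.7, 0.3],
              [2.2, 2.9, 1.0],
              [1.9, 2.2, 0.9],
              [3.1, 3.0, 1.4]])

# Standardize: subtract each column's mean and divide by its standard deviation
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Mean-center the standardized data; standardization already centers each
# column, so this is a no-op numerically, but it mirrors the steps in the text
Z = X_std - X_std.mean(axis=0)
```

Each column of `Z` now has mean 0 and standard deviation 1.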

We finish the preprocessing steps after completing these two steps.

Now, we proceed to estimate the variance-covariance matrix of `Z`.

`C = \frac{1}{m-1}Z^TZ`

`C` contains the variances and covariances of the columns of `Z`. Because `Z` has `m` rows (observations), dividing by `m-1` gives the sample covariance. `C` is a square matrix of dimension `n \times n` in our example.
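A minimal NumPy sketch of this step, using synthetic standardized data; dividing by `m - 1` gives the sample covariance, which is also what `np.cov` computes by default:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic standardized data: m = 100 observations, n = 4 variables
X = rng.normal(size=(100, 4))
Z = (X - X.mean(axis=0)) / X.std(axis=0)

m, n = Z.shape
C = Z.T @ Z / (m - 1)  # n x n sample variance-covariance matrix
```

As a sanity check, `C` agrees with `np.cov(Z, rowvar=False)` and is symmetric.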

Next, we estimate the eigenvalues and eigenvectors of `C` using spectral decomposition, also called eigendecomposition, a standard technique in linear algebra that factors a matrix into its eigenvalues and eigenvectors.

` C = \Phi \Lambda \Phi^{-1}`

The spectral decomposition of `C` (where `C` must be a non-defective square matrix) decomposes `C` into `\Phi`, the matrix of eigenvectors, and `\Lambda`, a diagonal matrix containing the eigenvalues. Since a covariance matrix is symmetric, `\Phi` is orthogonal and `\Phi^{-1} = \Phi^T`.

Now, we sort the eigenvalues in descending order and reorder the eigenvectors (the columns of `\Phi`) to match.
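This step can be sketched as follows (again on synthetic data), using `np.linalg.eigh`, which is appropriate for symmetric matrices and returns eigenvalues in ascending order, so we reverse them:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
Z = (X - X.mean(axis=0)) / X.std(axis=0)
C = Z.T @ Z / (Z.shape[0] - 1)

# eigh is designed for symmetric matrices; eigenvalues come back ascending
eigvals, eigvecs = np.linalg.eigh(C)

# Sort eigenvalues in descending order; reorder eigenvector columns to match
order = np.argsort(eigvals)[::-1]
eigvals_sorted = eigvals[order]
Phi_star = eigvecs[:, order]
```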

Finally, we obtain the principal components.

` PC = Z\Phi^*`

Here, `\Phi^*` is the matrix of eigenvectors whose columns are arranged according to the eigenvalues sorted in descending order.
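With `Z` and the sorted eigenvector matrix from the previous steps (reconstructed here on synthetic data), the projection is a single matrix product:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(150, 3))
Z = (X - X.mean(axis=0)) / X.std(axis=0)
C = Z.T @ Z / (Z.shape[0] - 1)

eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]            # eigenvalues, descending
Phi_star = eigvecs[:, order]    # matching eigenvector columns

# Project the preprocessed data onto the sorted eigenvectors
PC = Z @ Phi_star
```

A useful sanity check: the sample variance of each principal component equals its eigenvalue.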

Finally, the factor loadings are obtained by multiplying each eigenvector in `\Phi^*` by the square root of its corresponding eigenvalue.

`\text{Loadings} = \Phi^* \sqrt{\Lambda^*}`

where `\Lambda^*` is the diagonal matrix of eigenvalues sorted in descending order.
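A sketch of the loading computation (synthetic data again); NumPy broadcasting multiplies each eigenvector column by the square root of its eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(120, 3))
Z = (X - X.mean(axis=0)) / X.std(axis=0)
C = Z.T @ Z / (Z.shape[0] - 1)

eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
lam = eigvals[order]
Phi_star = eigvecs[:, order]

# Scale each eigenvector column by the square root of its eigenvalue
loadings = Phi_star * np.sqrt(lam)
```

Since each eigenvector has unit length, the squared loadings in column `j` sum to the eigenvalue `\lambda_j`.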