Principal Component Analysis

PCA3D Analysis

tSNE Analysis

Tables View

Development Information

1. Application Introduction:

App Name: PCA & t-SNE Analysis Online Serve

Platform: Shinyapps, based on R 3.6.1

R Packages: stats, ggplot2, factoextra, Rtsne, ggbiplot, scatterplot3d, DT, shiny, shinyjs

Note: Please cite the R packages listed above.

2. Author Introduction:

Author: benben miao

Email: benben.miao@outlook.com

Github: https://github.com/benben-miao/

Omics: https://omics.netlify.app

Pro: Bioinformatics, AI

Program: Python, R, Java, Julia, Shell, HTML, CSS, JavaScript, Ruby, Perl, C, C++, SQL, Go, Linux, etc.

3. Using Application:

DevFor: Bioinformatics Analysis

Principal component analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables (entities each of which takes on various numerical values) into a set of values of linearly uncorrelated variables called principal components.

This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components.

The resulting vectors (each being a linear combination of the variables and containing n observations) are an uncorrelated orthogonal basis set.
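The procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code this app runs: the function names (`center`, `covariance_matrix`, `leading_eigenpair`, `pca`) are hypothetical, and the eigenvectors are found with power iteration plus deflation, chosen here because it needs no external library.

```python
import math

def center(data):
    """Subtract each column's mean so every variable has mean zero."""
    n = len(data)
    means = [sum(column) / n for column in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]

def covariance_matrix(centered):
    """Sample covariance matrix (variables x variables) of centered rows."""
    n, p = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(p)] for i in range(p)]

def leading_eigenpair(matrix, iterations=1000):
    """Power iteration on a symmetric positive semi-definite matrix.

    Assumption: the fixed start vector is not orthogonal to the leading
    eigenvector; if it is, the method cannot find that direction.
    """
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # no variance left along v
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v[i] * mv[i] for i in range(p))  # Rayleigh quotient
    return eigenvalue, v

def pca(data, n_components):
    """Return (components, variances, scores) for the leading components."""
    centered = center(data)
    cov = covariance_matrix(centered)
    p = len(cov)
    components, variances = [], []
    for _ in range(n_components):
        eigenvalue, v = leading_eigenpair(cov)
        components.append(v)
        variances.append(eigenvalue)
        # Deflation: remove the direction just found, so the next pass
        # finds the largest variance orthogonal to it.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    # Scores: each centered observation projected onto each component.
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return components, variances, scores
```

Calling `pca(rows, 2)` on a list of numeric rows returns the two leading directions, the variance along each, and the projected observations. The returned components are mutually orthogonal and the scores are uncorrelated, which matches the description above.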

PCA is sensitive to the relative scaling of the original variables, so variables measured in different units are commonly standardized before the analysis.
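A small sketch can make the scaling issue concrete. It assumes just two variables, so the first principal direction has a closed form from the 2x2 covariance matrix. The height and weight values, and the helper names `covariance`, `first_pc_direction`, and `standardize`, are hypothetical and chosen only for illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def first_pc_direction(xs, ys):
    """Unit vector of the first principal component for two variables.

    Closed form for a symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]:
    the major axis lies at angle 0.5 * atan2(2 * sxy, sxx - syy).
    """
    sxx, syy, sxy = covariance(xs, xs), covariance(ys, ys), covariance(xs, ys)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.cos(theta), math.sin(theta)

def standardize(xs):
    """Rescale to mean 0 and sample standard deviation 1 (z-scores)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / sd for x in xs]

# Hypothetical measurements, for illustration only.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]    # meters
weights_kg = [55.0, 65.0, 72.0, 80.0, 90.0]   # kilograms
heights_mm = [h * 1000.0 for h in heights_m]  # same heights in millimeters

direction_m = first_pc_direction(heights_m, weights_kg)
direction_mm = first_pc_direction(heights_mm, weights_kg)
direction_std = first_pc_direction(standardize(heights_mm),
                                   standardize(weights_kg))
```

With these values, `direction_m` points almost entirely along the weight axis, while `direction_mm` points almost entirely along the height axis, although the underlying data are identical. After standardization the direction no longer depends on the units in which height was recorded.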

4. Knowledge of Chinese:

Principal Component Analysis (PCA) is a statistical method. An orthogonal transformation converts a set of possibly correlated variables into a set of linearly uncorrelated variables. The transformed variables are called the principal components.

Principal component analysis was first introduced by Karl Pearson for non-random variables, and Harold Hotelling later extended the method to the case of random vectors. The amount of information is usually measured by the sum of squared deviations, or variance.

The idea is to recombine the original variables into a new set of mutually uncorrelated composite variables and then, according to actual needs, to keep only a few of these composite variables that reflect as much of the information in the original variables as possible. This statistical method is called principal component analysis, and it is also used mathematically to reduce dimensions.

Principal component analysis removes the redundancy among the originally proposed variables (closely related variables) and creates as few new variables as possible, so that the new variables are pairwise uncorrelated and retain as much of the original information about the subject as possible.
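Since information is measured by variance, one common way to decide how few new variables to keep is to look at the share of total variance each component explains. The sketch below assumes the component variances (eigenvalues) are already available. The function names and the default threshold of 0.95 are hypothetical choices for illustration, not settings taken from this app.

```python
def explained_variance_ratio(variances):
    """Share of the total variance carried by each component."""
    total = sum(variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [v / total for v in variances]

def components_needed(variances, threshold=0.95):
    """Smallest number of leading components whose cumulative share
    of the total variance reaches the threshold."""
    ordered = sorted(variances, reverse=True)
    cumulative = 0.0
    for k, ratio in enumerate(explained_variance_ratio(ordered), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k
    return len(ordered)
```

For example, if the first of three components carries 90% of the variance and the second another 9%, `components_needed` with the default threshold returns 2, so the third component can be dropped with little loss of information.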

t-SNE (t-distributed stochastic neighbor embedding) is a machine learning algorithm for dimensionality reduction, proposed by Laurens van der Maaten and Geoffrey Hinton in 2008. It is a nonlinear dimensionality reduction algorithm that is well suited to reducing high-dimensional data to 2 or 3 dimensions for visualization.

SNE maps data points to probability distributions by converting the affinities (similarities) between them into probabilities. It has two main steps:

First, SNE constructs a probability distribution over pairs of high-dimensional objects, so that similar objects have a higher probability of being selected and dissimilar objects have a lower probability of being selected.

Second, SNE constructs a probability distribution over the same points in a low-dimensional space so that the two probability distributions are as similar as possible.
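The two steps can be sketched as follows. This is a simplified illustration, not the Rtsne implementation. It uses a Gaussian kernel with one fixed bandwidth `sigma` for every point, whereas SNE normally tunes a bandwidth per point from a perplexity setting, and t-SNE additionally uses a Student t kernel in the low-dimensional space. The mismatch between the two distributions is measured with the Kullback-Leibler divergence, which is the cost SNE minimizes. All function names are hypothetical.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def conditional_probabilities(points, sigma=1.0):
    """Row i holds p(j | i): the probability that point i selects point j
    as its neighbor. Nearby points get high probability, distant ones low.
    Assumption: one fixed sigma for all points (a simplification)."""
    n = len(points)
    table = []
    for i in range(n):
        weights = [0.0 if i == j else
                   math.exp(-squared_distance(points[i], points[j])
                            / (2.0 * sigma ** 2))
                   for j in range(n)]
        total = sum(weights)
        table.append([w / total for w in weights])
    return table

def kl_divergence(p_table, q_table, floor=1e-12):
    """Sum over all rows of KL(P_i || Q_i); smaller means the
    low-dimensional distribution matches the high-dimensional one better."""
    cost = 0.0
    for p_row, q_row in zip(p_table, q_table):
        for p, q in zip(p_row, q_row):
            if p > 0.0:
                cost += p * math.log(p / max(q, floor))
    return cost
```

Given high-dimensional points and a candidate low-dimensional layout, `kl_divergence(conditional_probabilities(high), conditional_probabilities(low))` scores how well the layout preserves neighborhoods. A full SNE implementation would then move the low-dimensional points by gradient descent to reduce this cost, which is omitted here.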

t-SNE has high computational complexity: on data sets with millions of samples it consumes substantial resources and takes a long time to run, whereas PCA can complete its calculation in seconds or minutes.