Why is Matlab PCA calculation different from results from R and Orange3?
I am trying to use MATLAB to compute the PCA of the iris dataset.
First, I went step by step:
(1) centring the data,
(2) calculating the correlation matrix,
(3) calculating the eigenvectors of the correlation matrix, and
(4) multiplying the centred data by the eigenvectors.
The results were different from the figure below.
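The four steps above can be sketched in MATLAB roughly as follows. This is only a sketch under assumptions: the `fisheriris` sample data from the Statistics and Machine Learning Toolbox (variable `meas`) stands in for the poster's own matrix, and all variable names are illustrative.

```matlab
% Assumption: the fisheriris sample data stands in for the poster's matrix.
load fisheriris            % provides meas (150x4) and species
X = meas;

% (1) Centre each column.
Xc = X - mean(X);          % implicit expansion, R2016b or later

% (2) Correlation matrix of the four variables.
R = corrcoef(Xc);

% (3) Eigen-decomposition, sorted explicitly by decreasing eigenvalue.
[V, D] = eig(R);
[lambda, order] = sort(diag(D), 'descend');
V = V(:, order);

% (4) Project the centred data onto the eigenvectors.
scores = Xc * V;

disp(lambda.');            % eigenvalues of the correlation matrix
disp(scores(1:5, :));      % first few projected observations
```

A correlation matrix describes data that are both centred and scaled, so scores computed this way from data that are only centred will in general not match a routine that applies one convention consistently to both the matrix and the data.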
The MATLAB results also differ from what R calculates. The R results match the results in Orange.
I also used the pca function in MATLAB, and those results were also different from the figure below.
Any ideas why, please?
Many thanks.
6 comments
the cyclist
on 4 Nov 2019
Edited: the cyclist
on 4 Nov 2019
Another observation: The following R code agrees with MATLAB on the principal components:
library(datasets)
data(iris)
print(prcomp(iris[,1:4]))
Standard deviations (1, .., p=4):
[1] 2.0562689 0.4926162 0.2796596 0.1543862
Rotation (n x k) = (4 x 4):
                     PC1         PC2         PC3        PC4
Sepal.Length  0.36138659 -0.65658877  0.58202985  0.3154872
Sepal.Width  -0.08452251 -0.73016143 -0.59791083 -0.3197231
Petal.Length  0.85667061  0.17337266 -0.07623608 -0.4798390
Petal.Width   0.35828920  0.07548102 -0.54583143  0.7536574
There is a sign disagreement in PC2 and PC3, but eigenvectors are only determined up to a sign.
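To illustrate the sign ambiguity, here is a small MATLAB sketch. It assumes the `fisheriris` sample data as a stand-in for the iris matrix, and the sign-fixing convention at the end is just one illustrative choice, not something this thread prescribes.

```matlab
% Assumption: fisheriris sample data stands in for the iris matrix in the thread.
load fisheriris
C = cov(meas);                 % covariance matrix (cov centres the data itself)
[V, D] = eig(C);
[~, idx] = max(diag(D));
v = V(:, idx);                 % eigenvector with the largest eigenvalue
lambda = D(idx, idx);

% Both v and -v satisfy C*v = lambda*v, so a solver may return either one.
residualPlus  = norm(C*v    - lambda*v);
residualMinus = norm(C*(-v) - lambda*(-v));
fprintf('%g  %g\n', residualPlus, residualMinus);

% Illustrative convention: flip each column so that its
% largest-magnitude entry is positive.
[~, rowOfMax] = max(abs(V), [], 1);
signs = sign(V(sub2ind(size(V), rowOfMax, 1:size(V, 2))));
Vfixed = V .* signs;           % implicit expansion, R2016b or later
```

Applying the same convention to the loadings from both tools before comparing them removes differences that are purely a matter of sign.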
Answers (1)
the cyclist
on 4 Nov 2019
Edited: the cyclist
on 4 Nov 2019
Here's my guess:
The difference between R and MATLAB is that in R, you scaled the data, in addition to centering them -- each column was divided by its standard deviation. In MATLAB, they were only centered.
If you do this to standardize in MATLAB:
iris_save_standardized = bsxfun(@minus,iris_save,mean(iris_save))./std(iris_save); %center and scale the dataset
you will get the same result.
Specifically, the following R code gives me the same output as MATLAB does, using the scaling in the above line. (The editor here of course assumes MATLAB code, so excuse the weird formatting. It still makes it easier for you to cut and paste into R if you want.)
library(datasets)
data(iris)
scaled_iris <- scale(x=iris[,1:4])
pc <- prcomp(scaled_iris)
score <- data.matrix(scaled_iris) %*% data.matrix(pc$rotation)
print(head(score))
This is a bit different from the output you pasted in your question, though. Not sure about that.
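For comparison with the R snippet above, a MATLAB sketch of the same standardise-then-PCA procedure might look like this. It assumes the `fisheriris` sample data in place of the poster's `iris_save` matrix and uses the `pca` function from the Statistics and Machine Learning Toolbox.

```matlab
% Assumption: fisheriris sample data stands in for the poster's iris_save matrix.
load fisheriris
iris_save = meas;

% Centre and scale each column (same operation as the bsxfun line; R2016b or later).
Z = (iris_save - mean(iris_save)) ./ std(iris_save);

% pca centres its input by default; Z is already centred, so that step changes nothing.
[coeff, score, latent] = pca(Z);

% The component variances should match the eigenvalues of the correlation matrix.
eigvals = sort(eig(corrcoef(iris_save)), 'descend');
disp([latent, eigvals]);

disp(score(1:6, :));   % comparable to head(score) in R, up to column signs
```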
3 comments
the cyclist
on 5 Nov 2019
I've never used Orange. (I'd never heard of it until this question.)
The fact that the results are very close, but not exact, suggests to me some minor algorithmic difference. But I really don't know.
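One kind of minor difference that would give close-but-not-identical scores is the normalisation used for the standard deviation when scaling (dividing by N versus N-1). Whether that is what happens in Orange is not established in this thread; the sketch below only illustrates the size of such an effect, again assuming the `fisheriris` sample data as a stand-in.

```matlab
% Assumption: fisheriris sample data stands in for the iris matrix in the thread.
load fisheriris
X = meas;
N = size(X, 1);

Z0 = (X - mean(X)) ./ std(X, 0);   % sample standard deviation (divide by N-1)
Z1 = (X - mean(X)) ./ std(X, 1);   % population standard deviation (divide by N)

[coeff0, score0] = pca(Z0);
[coeff1, score1] = pca(Z1);

% The scores differ by the constant factor sqrt(N/(N-1)), about 1.0034 for N = 150.
ratio = norm(score1(:, 1)) / norm(score0(:, 1));
fprintf('ratio = %.6f, sqrt(N/(N-1)) = %.6f\n', ratio, sqrt(N/(N-1)));
```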