Reducing dimensionality with PCA in MATLAB

I just want to run a simple PCA to reduce the dimensionality of my data from, say, 400 x 5000 to 400 x 4,
that is, from 5000 variables down to 4.
I am not sure where I can set the number of dimensions to keep.
coeff = pca(X)
I am trying to follow:
load hald
Then:
The ingredients dataset is 13 x 4.
coeff = pca(ingredients)
Output:
coeff = 4×4
-0.0678 -0.6460 0.5673 0.5062
-0.6785 -0.0200 -0.5440 0.4933
0.0290 0.7553 0.4036 0.5156
0.7309 -0.1085 -0.4684 0.4844
I am wondering whether I can change this so that the output is 13 x 2.

6 Comments

Adam on 20 Feb 2019
coeff gives you the principal component vectors as its columns. score is your data projected onto those vectors, i.e. your data expressed along the principal component axes rather than the original variable axes (x, y, z, etc.).
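To make the relationship between coeff and score concrete, here is a minimal sketch using the hald data from the question. It assumes the default pca settings (columns are mean-centered, no variable weights); the names centered, manualScore, maxDiff and orthoErr are made up for illustration.

```matlab
% Requires the Statistics and Machine Learning Toolbox (pca, hald data).
load hald                              % provides ingredients (13 x 4)
[coeff, score] = pca(ingredients);     % coeff: 4 x 4, score: 13 x 4

% With default options pca centers each column, so score is the
% centered data projected onto the component directions in coeff.
centered = ingredients - mean(ingredients);    % implicit expansion (R2016b+)
manualScore = centered * coeff;

% Largest difference between the manual projection and score
% (expected to be at rounding-error level).
maxDiff = max(abs(manualScore(:) - score(:)));

% The columns of coeff are orthonormal, so coeff' * coeff is
% numerically the identity matrix.
orthoErr = norm(coeff' * coeff - eye(4));
```

So coeff describes the new axes (one column per component, one row per original variable), while score holds one row per observation, which is why score is the matrix to index into when you want fewer columns.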
Matlaber on 20 Feb 2019
Thanks!
Let's say I have a dataset that is a 13 x 8 matrix.
coeff is an 8 x 8 matrix.
score is a 13 x 8 matrix.
That makes sense.
Now let's say I want the final output to be only a 13 x 2 matrix, or even a 13 x 6 matrix.
How can I do that?
Adam on 20 Feb 2019
Just do as Elysi Cochin's answer shows and index into the scores. The columns are ordered from the 1st principal component onwards, so just throw away those you don't want.
Thanks.
I did that. However, it seems to simply discard the columns I do not want. Does that mean I am losing some information by throwing them away?
For example:
load hald
[coeff, score] = pca(ingredients);
reducedDimension = score(:,1:3);
score is a 13 x 4 matrix.
reducedDimension is a 13 x 3 matrix.
It looks like the 4th column is simply thrown away. Is that what dimension reduction using PCA means?
It looks like throwing away the 4th column will lose some information?
Adam on 20 Feb 2019
Edited: Adam on 20 Feb 2019
Dimension reduction is 'throwing some information away'. It isn't magic, unfortunately. Unless you have perfectly correlated, redundant variables, if you have 8 variables and reduce down to 3 dimensions you will obviously lose some information.
Of course, without PCA you would lose a huge amount of information if you just chopped off 5 variables.
Because you have used PCA, though, you are throwing away the dimensions that contain the least information about the data.
Looking at the explained output from pca will help you see what you are throwing away. It is a measure of how much of the variation in the data is captured by each dimension, as a percentage. You will usually see a large number (between 0 and 100, e.g. 80) for the first component, then progressively smaller numbers. Unless your data is very random, you will often find that after the first few principal components the values in the explained vector are < 1 (i.e. that dimension holds less than 1% of the information, so that is all you lose if you throw that dimension away).
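As a rough sketch of how to inspect this on the hald data: explained and mu are the fifth and sixth outputs of pca, and everything else here (cumExplained, k, reduced, approx, relErr) is a name chosen for illustration. It assumes default pca options.

```matlab
% Requires the Statistics and Machine Learning Toolbox (pca, hald data).
load hald                                   % provides ingredients (13 x 4)
[coeff, score, ~, ~, explained, mu] = pca(ingredients);

% explained(i) is the percentage of total variance captured by
% component i; the running total shows what the first k components keep.
cumExplained = cumsum(explained);

% Keep the first k components and rebuild an approximation of the data
% in the original variables from them.
k = 3;
reduced = score(:, 1:k);                    % 13 x 3
approx = reduced * coeff(:, 1:k)' + mu;     % 13 x 4, mu is a 1 x 4 row

% Relative error (Frobenius norm) against the centered data. Its square
% equals the fraction of variance in the discarded component(s).
relErr = norm(ingredients - approx, 'fro') / norm(ingredients - mu, 'fro');
```

If cumExplained(k) is already close to 100, then dropping the remaining columns of score costs very little, and relErr will be correspondingly small.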
Matlaber on 21 Feb 2019
Thanks for your reply.
Yes, I checked the PCA output and you are correct: there is usually a large number in the first row and progressively smaller numbers after that.
Thanks once again.
Do you have any idea how to use Linear Discriminant Analysis (LDA), also known as Fisher Discriminant Analysis (FDA), in MATLAB? There does not seem to be a function for it.


Accepted Answer

Elysi Cochin on 20 Feb 2019
load hald                              % provides ingredients (13 x 4)
[coeff, score] = pca(ingredients);
requiredResult = score(:,1:2);         % 13 x 2: scores on the first two components
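Or, if you want to change coeff itself into a 13 x 2 matrix, you would have to use the reshape function, but reshape cannot change the number of elements, so coeff would need exactly 13 x 2 = 26 elements (here it has 4 x 4 = 16).
Or you can use repmat, which repeats copies of the array coeff. Either way the result would no longer be a set of principal component coefficients, so indexing into score as above is the way to get reduced data.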
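For the 400 x 5000 case in the question, an alternative sketch is to ask pca for only the first few components through its 'NumComponents' name-value option, instead of indexing afterwards. The data below is random and only a placeholder for the real matrix.

```matlab
% Requires the Statistics and Machine Learning Toolbox (pca).
rng(0);                          % reproducible placeholder data
X = randn(400, 5000);            % stand-in for the real 400 x 5000 matrix

% Request only the first 4 principal components.
[coeff, score] = pca(X, 'NumComponents', 4);

coeffSize = size(coeff);         % 5000 x 4: one column per kept component
scoreSize = size(score);         % 400 x 4: the reduced data
```

The reduced 400 x 4 data is score here, which should match score(:,1:4) from a plain pca(X) call.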

2 Comments

Thanks!
Would you mind explaining the difference between "coeff" and "score"?
I did read the documentation, but I was unable to understand it.
load hald
[coeff, score] = pca(ingredients);
requiredResultscore = score(:,1:3);
requiredResultcoeff = coeff(:,1:3);
ingredients is a 13 x 4 matrix
coeff is a 4 x 4 matrix
score is a 13 x 4 matrix
requiredResultscore is a 13 x 3 matrix
requiredResultcoeff is a 4 x 3 matrix
The original dataset, ingredients:
>> ingredients
ingredients =
7 26 6 60
1 29 15 52
11 56 8 20
11 31 8 47
7 52 6 33
11 55 9 22
3 71 17 6
1 31 22 44
2 54 18 22
21 47 4 26
1 40 23 34
11 66 9 12
10 68 8 12
After PCA:
load hald
coeff = pca(ingredients)
The output coeff is a 4 x 4 matrix.
>> coeff
coeff =
-0.0678 -0.6460 0.5673 0.5062
-0.6785 -0.0200 -0.5440 0.4933
0.0290 0.7553 0.4036 0.5156
0.7309 -0.1085 -0.4684 0.4844
I am wondering how I can get a 13 x 2 matrix as output.
Your answer says that to use reshape, coeff needs to have 13 x 2 elements. How can I get that many elements?
Thanks


Asked: 19 Feb 2019

Commented: 21 Feb 2019
