Calculating principal component scores from principal component coefficients of the new data

Hi all,
I performed a PCA on a dataset using the function
[coeff,score,latent,~,explained,mu]=pca(TrainingSet.X);
Then I generated new shapes (in Cartesian space) using a reduced number of principal components. Now I need the principal component scores for these new shapes, but I can't figure out how to compute them!
Based on the fact that the original centered training data can be retrieved using
centeredData= score*coeff'
I used the following statements, which did not generate relevant results.
for i= 1:newShapesNum
newShapeScore(i,:)=newShape(i,:)*pinv(coeff(:,1:shapeModesNum)'); % i is the counter of new (generated) observations.
newSvalid=newShapeScore(i,:)*coeff(:,1:shapeModesNum)';
end
UPDATE
I also tried running a PCA on the new instances and requested score and coeff. The mean shape looked good, but using the centeredData formula above did not regenerate the original shape! I don't understand why, though.
I'd appreciate your help in finding the principal component scores for the new shapes.
Many thanks
Amin
  2 Comments
Amin Kassab-Bachi on 11 May 2021
Thanks for responding. Actually, I'm creating new instances with good quality, but it's my first time working with PCA, so I'm not familiar with the terms. The new instances (in Cartesian space) are created from randomly generated standard deviation values. I'm trying to recover their scores in principal component space because I need to correlate the scores with some output from another analysis later on. After many tests I finally came to the conclusion that the scores are the standard deviation values I used. So for each principal component, for each new instance, I saved the generated SD [i.e. a random weight×sqrt(latent)]. Hopefully you can confirm this is correct.
Thanks
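The relationship described in this comment can be checked numerically. The following is a minimal sketch, not the poster's actual code: the data is random, and the names X, k, nNew, w, genScore and recScore are illustrative assumptions. It builds shapes from scores of the form weight×sqrt(latent) and shows that projecting the shapes back returns those same scores, because the columns of coeff are orthonormal.

```matlab
% Sketch: shapes generated from chosen scores project back onto those scores.
% All variable names and the random data are illustrative assumptions.
rng(0);                                   % reproducible random numbers
X = rand(100, 5);                         % stand-in for TrainingSet.X
[coeff, ~, latent, ~, ~, mu] = pca(X);

k    = 3;                                 % number of modes used for generation
nNew = 4;                                 % number of new shapes
w    = randn(nNew, k);                    % random weights, in units of SD

% Score on mode j is the weight times the SD of that mode, sqrt(latent(j))
genScore = w .* sqrt(latent(1:k))';       % nNew-by-k

% New shapes in Cartesian space: mean shape plus weighted modes
newShape = mu + genScore * coeff(:, 1:k)';

% Projecting the centered shapes recovers the scores used to build them
recScore = (newShape - mu) * coeff(:, 1:k);
disp(max(abs(recScore(:) - genScore(:)))) % round-off level only
```

This only holds exactly for shapes built from the first k modes and the training mean; a shape from any other source would generally lose information when projected onto k components.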


Accepted Answer

Aditya Patil on 12 May 2021
To get the scores for new data, you first need the outputs mu and coeff from running pca on the training data.
X = rand(100, 5);
XTrain = X(1:75, :)
XTrain = 75×5
    0.1441    0.3071    0.3775    0.8840    0.6683
    0.8057    0.3544    0.5524    0.7381    0.9861
    0.7959    0.0033    0.3544    0.6425    0.4665
    0.9191    0.7689    0.0454    0.1116    0.5821
    0.7176    0.1236    0.6015    0.8224    0.3409
    0.2391    0.1492    0.9006    0.5579    0.6631
    0.1738    0.4541    0.5185    0.6817    0.8653
    0.6194    0.2851    0.5203    0.8938    0.2486
    0.0550    0.3670    0.9562    0.1952    0.4238
    0.2783    0.3371    0.4914    0.6739    0.2944
XTest = X(76:100,:)
XTest = 25×5
    0.4050    0.8916    0.0311    0.9368    0.4693
    0.4280    0.2849    0.0614    0.1172    0.3371
    0.9347    0.9498    0.3593    0.3842    0.0361
    0.6781    0.4363    0.2563    0.5025    0.2534
    0.6973    0.2147    0.0580    0.2153    0.6004
    0.9774    0.1824    0.5365    0.0387    0.3407
    0.6281    0.8394    0.6062    0.0771    0.7966
    0.1263    0.8900    0.5766    0.7521    0.1489
    0.4293    0.8312    0.9448    0.5362    0.1901
    0.4643    0.9553    0.6214    0.8245    0.4738
[coeff,scoreTrain,~,~,explained,mu] = pca(XTrain);
Now, to apply the same transformation, that is to get scores for new data, apply the following equation.
idx = 3; % Keep 3 principal components
scoreTest = (XTest-mu)*coeff(:,1:idx)
scoreTest = 25×3
    0.1243    0.3578    0.3699
    0.2510   -0.1932   -0.3583
    0.5351   -0.2519    0.0646
    0.1803   -0.2631    0.0597
    0.3561   -0.1946   -0.0985
    0.3395   -0.6057   -0.2079
    0.3735    0.2247   -0.2527
   -0.2488    0.1930   -0.0451
   -0.1706   -0.0489   -0.1127
   -0.0553    0.2642    0.2388
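One way to gain confidence in this formula is to apply it to the training data, where the expected scores are already known. The sketch below is an illustration under assumptions (random data, and the hypothetical names scoreCheck, XTestApprox and relErr); it also maps the reduced test scores back to the original space to see how much is lost by keeping only some components.

```matlab
% Sketch: sanity checks for the projection formula (illustrative names only).
rng(1);                                   % reproducible random numbers
X      = rand(100, 5);
XTrain = X(1:75, :);
XTest  = X(76:100, :);
[coeff, scoreTrain, ~, ~, ~, mu] = pca(XTrain);

% Applied to the training data, the formula should reproduce scoreTrain
scoreCheck = (XTrain - mu) * coeff;
disp(max(abs(scoreCheck(:) - scoreTrain(:))))   % round-off level only

% Project the test data onto k components, then map back to data space
k           = 3;
scoreTest   = (XTest - mu) * coeff(:, 1:k);
XTestApprox = scoreTest * coeff(:, 1:k)' + mu;

% Relative reconstruction error; it drops to round-off when k equals size(X,2)
relErr = norm(XTest - XTestApprox, 'fro') / norm(XTest - mu, 'fro');
disp(relErr)
```

The check assumes pca was called with its default settings (centered, no variable weights); with other options the projection would need to be adjusted accordingly.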
For more details, see the Apply PCA to New Data and Generate C/C++ Code documentation.
  1 Comment
Amin Kassab-Bachi on 12 May 2021
Thank you very much. This also confirmed what I calculated was correct. When testing my results previously I did not include mu, so the results did not look like anything useful! But now it's all starting to make more sense. Thanks.
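The effect of leaving out mu can be made concrete. In the sketch below (random data; scoreCentered, scoreRaw and offset are illustrative names, not from the thread), the uncentered projection differs from the correct scores by the same constant row, mu*coeff(:,1:k), for every observation. It also checks that pinv(coeff(:,1:k)') equals coeff(:,1:k) when the columns of coeff are orthonormal, which suggests the loop in the question was this projection without the mean subtraction.

```matlab
% Sketch: what omitting mu does to the scores (illustrative names only).
rng(2);                                   % reproducible random numbers
X = rand(100, 5);
[coeff, ~, ~, ~, ~, mu] = pca(X(1:75, :));
XNew = X(76:100, :);
k    = 3;

scoreCentered = (XNew - mu) * coeff(:, 1:k);   % correct scores
scoreRaw      = XNew * coeff(:, 1:k);          % mean not subtracted

% Every row of the difference equals the projected mean, mu*coeff(:,1:k)
offset = scoreRaw - scoreCentered;
d      = offset - mu * coeff(:, 1:k);
disp(max(abs(d(:))))                           % round-off level only

% With orthonormal columns, the pseudoinverse of coeff' is coeff itself
p = pinv(coeff(:, 1:k)') - coeff(:, 1:k);
disp(max(abs(p(:))))                           % round-off level only
```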
