# Distinguish between 2 variables using PCA

moose on 9 Aug 2015
Edited: Sagar on 10 Aug 2015
Hello, I am trying to understand the pca function. I have 6 recordings of heart rate: four from person A and two from person B. Can PCA help me distinguish between the two people? When I run coeff = pca(signal_matrix); (where 'signal_matrix' is the matrix of my 6 recordings), what exactly does the coeff matrix tell me? How should I interpret it?

Sagar on 9 Aug 2015
PCA can certainly give some insight into your problem. Run PCA on your data and look at the individual principal components. In your case, I would guess that the four recordings from A dominate one principal component (presumably the first), which would then represent the characteristics of A, and that the remaining two recordings dominate another principal component (presumably the second), which would represent the characteristics of B. To make this concrete: when you look at the coefficients of the first principal component, the first four values should be larger in magnitude than the last two. Similarly, in the second principal component, the last two coefficients should be larger in magnitude than the first four. Of course, I am presuming that A and B are distinguishable.
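A minimal sketch of that check, using synthetic data rather than the actual recordings. It assumes each recording is stored as a column of signal_matrix (pca treats rows as observations and columns as variables) and that the Statistics and Machine Learning Toolbox is available. The names srcA, srcB, noise and loadings, the sample count and the noise level are all made up for illustration.

```matlab
% Synthetic stand-in for the six recordings (NOT the poster's data).
rng(0);                              % reproducible random numbers
n = 500;                             % samples per recording (assumed)
srcA = randn(n, 1);                  % hypothetical pattern shared by A's recordings
srcB = randn(n, 1);                  % hypothetical pattern shared by B's recordings
noise = 0.3 * randn(n, 6);           % independent noise in every recording

% One recording per column: columns 1-4 from A, columns 5-6 from B.
signal_matrix = [repmat(srcA, 1, 4), repmat(srcB, 1, 2)] + noise;

coeff = pca(signal_matrix);          % 6-by-6, one column per principal component

% Magnitude of each recording's coefficient in the first two components.
loadings = abs(coeff(:, 1:2));
disp(array2table(loadings, ...
    'VariableNames', {'PC1', 'PC2'}, ...
    'RowNames', {'A1', 'A2', 'A3', 'A4', 'B1', 'B2'}));
```

The sign of a principal component is arbitrary, so the sketch compares magnitudes. With real recordings the split is unlikely to be this clean, and it may not appear at all if the two people's heart rates behave similarly.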
##### Comments
Sagar on 10 Aug 2015
In your first principal component, the second element has the highest weight (0.9975), which means this component mostly represents the characteristics of the second recording of A. Similarly, in the second principal component, the first value has the largest weight (0.9966), so it mostly represents the characteristics of the first recording of A. Look at the highest values in the other columns in the same way. Most importantly, look at the percentage of variance explained, using the full call syntax [coeff,score,latent] = pca(___), where latent holds the variance of each principal component. The first value in latent divided by the sum of all values in latent gives the fraction of variance explained by the first principal component (multiply by 100 for a percentage). From those values you can tell which components are important and which you can drop. For further understanding, read this post: https://onlinecourses.science.psu.edu/stat505/node/54
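The variance calculation described above could be sketched as follows. The matrix X is random placeholder data, and the 95% cutoff is an arbitrary illustrative choice, not a recommendation from this thread. The fifth output of pca, explained, returns the same percentages directly.

```matlab
% Placeholder data: 200 observations of 6 variables with unequal spread.
rng(1);
X = randn(200, 6) * diag([3 2 1 0.5 0.5 0.5]);

[coeff, score, latent] = pca(X);

% Percentage of total variance carried by each principal component.
pct = 100 * latent / sum(latent);

% pca can also return these percentages itself as its fifth output.
[~, ~, ~, ~, explained] = pca(X);

% Keep the smallest number of components reaching an assumed 95% cutoff.
cumPct = cumsum(pct);
k = find(cumPct >= 95, 1);
fprintf('Components needed for 95%% of the variance: %d\n', k);
disp([pct, cumPct]);
```

Components beyond k contribute little to the total variance and are the natural candidates to drop.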