Different correlation results using matrix with NaN values

16 views (last 30 days)
Hello, I am having problems when calculating the correlation coefficient in two different ways:
the first way is by eliminating all pairs of NaN values before correlating:
ign_nan = isfinite(M1) & isfinite(M2);
[RHO1,P] = corrcoef(M1',M2');
The second way is by leaving the matrix as the original one, but adding
'rows','pairwise' to ignore NaN
M1=reshape(M1,[size(M1,1)*size(M1,2) 1]);
M2=reshape(M2,[size(M2,1)*size(M2,2) 1]);
[RHO2,P] = corrcoef(M1',M2','rows','pairwise');
Can someone tell me why is RHO1 different from RHO2?
Thank you! Magui

Answers (1)

Kirby Fears
Kirby Fears on 18 Apr 2016
The documentation for corrcoef indicates that 'complete' is the 'rows' value corresponding to your first calculation.
Try using the following for RHO2 and compare it to RHO1.
[RHO2,P] = corrcoef(M1',M2','rows','complete');
Documentation for the 'rows' setting is below.
'rows' — Use of NaN option 'all' (default) | 'complete' | 'pairwise'
Use of NaN option, specified as one of these values:
'all' — Include all NaN values in the input before computing the correlation coefficients.
'complete' — Omit any rows of the input containing NaN values before computing the correlation coefficients. This option always returns a positive definite matrix.
'pairwise' — Omit any rows containing NaN only on a pairwise basis for each two-column correlation coefficient calculation. This option can return a matrix that is not positive definite.


Find more on Descriptive Statistics in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by