How to set a condition for sequential number of non NaN values in a vector?

1 visualización (últimos 30 días)
Hello, I have a matrix like that:
NaN -0.00042 0.00073 -0.00298 NaN
NaN -0.00114 0.00029 0.00265 NaN
NaN -0.00028 NaN -0.00066 NaN
NaN 0.00197 0.00219 0.00099 NaN
I would like to write a condition for each column: if the number of SEQUENTIAL non-NaN values >=300 (no matter where exactly in a column and non NaNs have not be the same, just any non NaNs), then leave the column, otherwise - delete the whole column. I'm stuck with that "sequential" - how can I say to matlab look for 300 consecutive non-NaNs in a row??
I would appreciate any help.
  1 comentario
dpb
dpb el 13 de Mayo de 2016
Gotta' run; realized I'm late for a meeting...but look for a "runs" submission on FileExchange (maybe Bruno, I forget???).

Iniciar sesión para comentar.

Respuesta aceptada

Roger Stafford
Roger Stafford el 13 de Mayo de 2016
Editada: Roger Stafford el 13 de Mayo de 2016
Let M be your matrix.
[m,n] = size(M);
T = diff([true(1,n);isnan(M);true(1,n)],1,1);
T2 = false(1,n);
for k = 1:n
f = find(T(:,k));
T2(k) = any((f(2:2:end)-f(1:2:end-1))>=300); % <-- Corrected
end
M2 = M(:,T2);
M2 will be your result. I assume that where you said "delete" you meant that literally - such columns should be removed from the matrix.
  5 comentarios
Ekaterina Serikova
Ekaterina Serikova el 19 de Mayo de 2016
Dear Roger, could you please advise how can I adjust the code for zeroes instead of NaNs? How can I Change isnan condition to zeroes? Thanks a lot.
dpb
dpb el 19 de Mayo de 2016
Should be to just replace isnan(M) with M==0 in the first expression (although you may need to check on the initial condition as as long as there are no NaN in M the first value can't be but might it be 0?

Iniciar sesión para comentar.

Más respuestas (1)

the cyclist
the cyclist el 13 de Mayo de 2016
Download RunLengths from the FEX.
Then run this code. (I used a length threshold of 3 for illustration. You'll want to use 300.)
LENGTH_THRESHOLD = 3;
M = [NaN -0.00042 0.00073 -0.00298 NaN
NaN -0.00114 0.00029 0.00265 NaN
NaN -0.00028 NaN -0.00066 NaN
NaN 0.00197 0.00219 0.00099 NaN]
numberColumns = size(M,2);
maxNonNanLength = zeros(1,numberColumns);
for nc = 1:numberColumns
[b,L] = RunLength(M(:,nc));
currentNonNanLength = 0;
for nb = 1:numel(b)
if isnan(b(nb))
currentNonNanLength = 0;
else
currentNonNanLength = currentNonNanLength + L(nb);
maxNonNanLength(nc) = max(maxNonNanLength(nc),currentNonNanLength);
end
end
end
hasSufficientNonNanLength = maxNonNanLength >= LENGTH_THRESHOLD;
M = M(:,hasSufficientNonNanLength);

Categorías

Más información sobre Genomics and Next Generation Sequencing en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by