Vectorize a loop to save time

Question

Filip el 3 de Feb. de 2019

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time

Comentada: Walter Roberson el 4 de Feb. de 2019

I have a big data set and my current code takes 2 hours. I am hoping to save time by vectorization if that is possible in my case.

I have a table Table with variables ID, t1, tend, p. My code is sth like:

x=zeros(size(Table.ID,1));
for i=1:size(Table.ID,1)
x(i)=sum(Table.t1<Table.t1(i) & Table.tend>Table.tend(i) & abs(Table.p-Table.p(i))>1);
end

So for each observation, I want to find number of observations that start before, ends after and have a p value in the neighborhood of 1. It takes 2 hours to run this loop. Any suggestion?

Thanks in advance!

2 comentarios
Mostrar NingunoOcultar Ninguno

Walter Roberson el 4 de Feb. de 2019

How are the t1 and tend values arranged? Are tend(i+1) = t1(i) such that together they partition into consecutive ranges that are completely filled between the first and last? Do they act to partition into non-overlapping ranges but with gaps? Are there overlapping regions? Are the boundaries already sorted?

Filip el 4 de Feb. de 2019

There is no arrangement between t1 and tend values across observations. They might overlap for some observations, there might be gaps in time too.

All I know is that t1<tend for an observation.

Table is sorted wrt ID.

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Jan el 4 de Feb. de 2019

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359428

Editada: Jan el 4 de Feb. de 2019

Abrir en MATLAB Online

2 hours sounds long. Is the memory exhausted and the virtual memory slows down the execution? How large is the input?

Is this a typo:

x = zeros(size(Table.ID,1))

It creates a square matrix, but you access it as vector obly.

Does the table access need a remarkable amount of time?

n    = size(Table.ID,1);
t1   = Table.t1;
tend = Table.tend;
p    = Table.p;
x    = zeros(n, 1);
for i = 1:n
  x(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

If you sort one of the vectors, you could save some time:

[t1s, index] = sort(t1);
tends        = tend(index);
ps           = p(index);
for i = 2:n
  m    = t1s < t1s(i);
  x(i) = sum(tends(m) > tends(i) & ...
             abs(ps(m) - ps(i)) > 1);
end

Afterwards x has to be sorted inversly. If you provide some inputs, I could check the code before posting. I'm tired, perhaps I've overseen an obvious indexing error.

Is the shown code really the bottleneck of the original code?

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Filip el 4 de Feb. de 2019

I have more variables in the table and do more comparisons, but they are all similar. So, I wrote a sample here to give the idea.

x = zeros(size(Table.ID,1)) is obviously a typo.

I guess, sorting t1 will work, and also accessing table might be time consuming. I will update when I apply the changes but this seems promising. Thanks!

Iniciar sesión para comentar.

Answer 2

Walter Roberson el 4 de Feb. de 2019

1
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359344

Abrir en MATLAB Online

My mind is headed towards creating a pairwise mask matrix,

M = squareform(pdist(Table.p) > 1);    %important that Table.p is a column vector

That would be comparatively fast. If the table is very big then it could fill up memory, though.

abs() is not needed for this; pdist will already have calculated distance as a non-negative number.

Now

Mi = M(i,:);
x(i)=sum(Table.t1(Mi)<Table.t1(i) & Table.tend(Mi)>Table.tend(i));

However you should do timing tests against

Mi = M(i,:);
x(i)=sum(Mi & Table.t1<Table.t1(i) & Table.tend>Table.tend(i));

and

Mi = M(i,:);
Tt = Table(Mi);
x(i)=sum(Tt.t1<Table.t1(i) & Tt.tend>Table.tend(i));

2 comentarios
Mostrar NingunoOcultar Ninguno

Filip el 4 de Feb. de 2019

Unfortunately, this answer does not exactly work. But inspired by your answer, I believe that creating pairwise difference matrix by "bsxfun(@minus, T.t1, T.t1')" might work. I am not sure how faster it is gonna be and if I will have memory issues. I will try and update after.

Walter Roberson el 4 de Feb. de 2019

Abrir en MATLAB Online

abs(T.t1 - T.t1.')

would work as a distance function for you in R2016b and later.

Iniciar sesión para comentar.

Vectorize a loop to save time

2 comentarios
Mostrar NingunoOcultar Ninguno

Respuesta aceptada

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (1)

2 comentarios
Mostrar NingunoOcultar Ninguno

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

Vectorize a loop to save time

2 comentarios Mostrar NingunoOcultar Ninguno

Respuesta aceptada

1 comentario Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Más respuestas (1)

2 comentarios Mostrar NingunoOcultar Ninguno

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

2 comentarios
Mostrar NingunoOcultar Ninguno

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

2 comentarios
Mostrar NingunoOcultar Ninguno