Vectorize a loop to save time

1 visualización (últimos 30 días)
Filip
Filip el 3 de Feb. de 2019
Comentada: Walter Roberson el 4 de Feb. de 2019
I have a big data set and my current code takes 2 hours. I am hoping to save time by vectorization if that is possible in my case.
I have a table Table with variables ID, t1, tend, p. My code is sth like:
x=zeros(size(Table.ID,1));
for i=1:size(Table.ID,1)
x(i)=sum(Table.t1<Table.t1(i) & Table.tend>Table.tend(i) & abs(Table.p-Table.p(i))>1);
end
So for each observation, I want to find number of observations that start before, ends after and have a p value in the neighborhood of 1. It takes 2 hours to run this loop. Any suggestion?
Thanks in advance!
  2 comentarios
Walter Roberson
Walter Roberson el 4 de Feb. de 2019
How are the t1 and tend values arranged? Are tend(i+1) = t1(i) such that together they partition into consecutive ranges that are completely filled between the first and last? Do they act to partition into non-overlapping ranges but with gaps? Are there overlapping regions? Are the boundaries already sorted?
Filip
Filip el 4 de Feb. de 2019
There is no arrangement between t1 and tend values across observations. They might overlap for some observations, there might be gaps in time too.
All I know is that t1<tend for an observation.
Table is sorted wrt ID.

Iniciar sesión para comentar.

Respuesta aceptada

Jan
Jan el 4 de Feb. de 2019
Editada: Jan el 4 de Feb. de 2019
2 hours sounds long. Is the memory exhausted and the virtual memory slows down the execution? How large is the input?
Is this a typo:
x = zeros(size(Table.ID,1))
It creates a square matrix, but you access it as vector obly.
Does the table access need a remarkable amount of time?
n = size(Table.ID,1);
t1 = Table.t1;
tend = Table.tend;
p = Table.p;
x = zeros(n, 1);
for i = 1:n
x(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end
If you sort one of the vectors, you could save some time:
[t1s, index] = sort(t1);
tends = tend(index);
ps = p(index);
for i = 2:n
m = t1s < t1s(i);
x(i) = sum(tends(m) > tends(i) & ...
abs(ps(m) - ps(i)) > 1);
end
Afterwards x has to be sorted inversly. If you provide some inputs, I could check the code before posting. I'm tired, perhaps I've overseen an obvious indexing error.
Is the shown code really the bottleneck of the original code?
  1 comentario
Filip
Filip el 4 de Feb. de 2019
I have more variables in the table and do more comparisons, but they are all similar. So, I wrote a sample here to give the idea.
x = zeros(size(Table.ID,1)) is obviously a typo.
I guess, sorting t1 will work, and also accessing table might be time consuming. I will update when I apply the changes but this seems promising. Thanks!

Iniciar sesión para comentar.

Más respuestas (1)

Walter Roberson
Walter Roberson el 4 de Feb. de 2019
My mind is headed towards creating a pairwise mask matrix,
M = squareform(pdist(Table.p) > 1); %important that Table.p is a column vector
That would be comparatively fast. If the table is very big then it could fill up memory, though.
abs() is not needed for this; pdist will already have calculated distance as a non-negative number.
Now
Mi = M(i,:);
x(i)=sum(Table.t1(Mi)<Table.t1(i) & Table.tend(Mi)>Table.tend(i));
However you should do timing tests against
Mi = M(i,:);
x(i)=sum(Mi & Table.t1<Table.t1(i) & Table.tend>Table.tend(i));
and
Mi = M(i,:);
Tt = Table(Mi);
x(i)=sum(Tt.t1<Table.t1(i) & Tt.tend>Table.tend(i));
  2 comentarios
Filip
Filip el 4 de Feb. de 2019
Unfortunately, this answer does not exactly work. But inspired by your answer, I believe that creating pairwise difference matrix by "bsxfun(@minus, T.t1, T.t1')" might work. I am not sure how faster it is gonna be and if I will have memory issues. I will try and update after.
Walter Roberson
Walter Roberson el 4 de Feb. de 2019
abs(T.t1 - T.t1.')
would work as a distance function for you in R2016b and later.

Iniciar sesión para comentar.

Categorías

Más información sobre Matrix Indexing en Help Center y File Exchange.

Productos


Versión

R2018a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by