Fast way to perform multiple searches on a large array

Question

0 votes

I have a large time series array (10,000,000 elements) :

ts = [2; 1; 3; 4; 6; 7; .......]

I have a corresponding time array (same size as the above) :

times = [d1; d2; d3; d4; d5.......]

I have 2 arrays of start times and end times (also large ~ 30000 elements):

st = [dd1 dd2 dd3 ....]
en = [de1 de2 de3 ....]

I need to build a new matrix from many find calls. The logic is:

results = NaN(300, numel(st));
for i = 1:numel(st)
  temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
  results(:,i) = temp;
end

Is there any way I can do this faster (ideally without a loop)?

I have a 64 bit version so I can try a large in-memory solution.

Many thanks in advance, Nigel

Daniel Shub on 4 Oct 2011

Just to confirm: times, st and en are all sorted?

Nigel on 4 Oct 2011

Yes, they are sorted by st, and en(i) - st(i) = 300 seconds.


Answer 1

Daniel Shub on 4 Oct 2011

0 votes

I think by dumping the past times you might be able to speed up the find. If st(i+1) > en(i), then you could dump even more elements, but I think the savings will be small. This code relies on times, st, and en being sorted.

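results = NaN(300, numel(st));
offset = 0;  % number of elements already dropped from the front of times
for i = 1:numel(st)
  idx = find(times > st(i), 1, 'first');
  offset = offset + idx - 1;   % times(1) will correspond to ts(offset+1)
  times = times(idx:end);      % dump everything before the first hit
  results(:,i) = ts(offset + (1:300));  % 300 samples starting at the hit
end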

Nigel on 10 Oct 2011

Hi Daniel,

I used a modified version of your solution. Indeed it is a LOT quicker to search over smaller sized arrays.

Thank you all for your help.

N.


Answer 2

Jan on 4 Oct 2011

0 votes

Never let an array grow in each iteration! Pre-allocate the output:

results = NaN(300, numel(st));
for i = 1:numel(st)   % Not size(st), which is a vector!
  temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
  if length(temp) == 300
    results(:, i) = temp;
  else
    results(1:length(temp), i) = temp;  % shorter windows stay NaN-padded
  end
end
results = results(~isnan(results));  % note: returns the non-NaN values as a column vector

If st and times are sorted, comparing all values in every iteration wastes a lot of time. But vectorizing this would need a very large matrix, so I assume it would be slower than the loop.

Can you solve the problem by using HISTC?
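
A minimal sketch of how the HISTC idea might look, assuming times is strictly increasing and using small made-up values for times and st. The second output of histc gives, for each start time, the bin of times it falls into, so the first sample after st(k) is the next index. The variable names bin and first are illustrative only. Note that newer MATLAB releases mark histc as not recommended and point to histcounts or discretize instead.

```matlab
% Toy data (hypothetical values, for illustration only)
times = [10 20 30 40 50 60];      % sorted, strictly increasing
st    = [5 20 35 60 70];          % sorted start times

% bin(k) = j means times(j) <= st(k) < times(j+1); 0 means out of range
[~, bin] = histc(st, times);

% The first index with times > st(k) is the element right after the bin.
% For st(k) < times(1) the bin is 0, so this correctly gives index 1.
first = bin + 1;

% For st(k) >= times(end) there is no later sample; mark those as NaN
first(st >= times(end)) = NaN;
```

This replaces the per-iteration find with a single call; the 300 samples per window still have to be gathered from ts afterwards.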

Daniel Shub on 4 Oct 2011

and since times and st are sorted

(0:299) + find(times > st(i), 1, 'first')

Nigel on 4 Oct 2011

Wow, by removing the < en(i) the processing time nearly halved!


Answer 3

Nigel on 4 Oct 2011

0 votes

Certainly taking away the < en(i) helped. I'm a little hesitant to implement the part that dumps the past times, because I need the data for something a little later on.

Just for my own learning I would really like to know how could I vectorise this operation such that I didn't need to do this in a loop.

Thank you all once again for taking the time to look at and respond to my question.

N.
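
One possible way to vectorise the gathering step, sketched on toy data: once the index of the first sample of each window is known, all windows can be read from ts with one index matrix. The names starts, n, idx and valid are hypothetical, and the sketch assumes each window is simply the n consecutive samples from its start index (n = 300 in the question). For 300 rows and 30,000 columns the index matrix holds 9,000,000 doubles, roughly 72 MB.

```matlab
% Toy data (hypothetical): 10 samples, windows of n = 3 instead of 300
ts     = (101:110).';
starts = [2 5 9];                 % index of the first sample in each window
n      = 3;

% n-by-numel(starts) index matrix: column k is starts(k) + (0:n-1)
idx = bsxfun(@plus, starts(:).', (0:n-1).');

% Entries that would run past the end of ts are left as NaN
valid   = idx <= numel(ts);
results = NaN(size(idx));
results(valid) = ts(idx(valid));
```

Here the last window starts at index 9 and only two samples remain, so its third row stays NaN.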

Bjorn Gustavsson on 10 Oct 2011

Well then at least do the consecutive 'find's on shortened sections of times (with 'offset' as in Daniel's example):

idx = find(times(offset+1:end) > st(i), 1,'first');

Then you'd get the benefit of increasingly shorter arrays to search over, but without losing the data.
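
The suggestion above could be sketched as a complete loop along these lines. This is an illustration on toy data, not the code from the thread: lo, rel, last and n are hypothetical names, and the window is taken to be the n consecutive samples after the first hit (n = 300 in the question). The times array is never modified, so it remains available afterwards.

```matlab
% Toy data (hypothetical), windows of n = 2 samples instead of 300
times = [10 20 30 40 50 60];      % sorted
ts    = [1 2 3 4 5 6];
st    = [15 35 45];               % sorted start times
n     = 2;

results = NaN(n, numel(st));
lo = 1;                           % the search never restarts before this index
for i = 1:numel(st)
  rel = find(times(lo:end) > st(i), 1, 'first');
  if isempty(rel), break; end     % no sample after st(i), nor after later starts
  lo   = lo + rel - 1;            % absolute index of the first hit
  last = min(lo + n - 1, numel(ts));
  results(1:last-lo+1, i) = ts(lo:last);   % short windows stay NaN-padded
end
```

Because st is sorted, the first hit for st(i+1) can never lie before the hit for st(i), which is why lo only moves forward.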

Daniel Shub on 10 Oct 2011

I wonder if this would be faster. I would hope MATLAB is smart enough not to have to reallocate memory for my method. Yours is probably a little safer. I was also thinking that working from the end backwards might ultimately be the fastest.
