How to efficiently compute a sum of large vectors with elements depend on the sum index?

3 visualizaciones (últimos 30 días)
I have to compute the sum: , where are large column vectors ( dimension is about 10 millions) and κ is about 100 to 200.
My first attempt is , with (a row vector) and denotes the element-wise product.
My second attempt is to use for loop.
Both attempt are still very slow but for loop is slightly faster.
My questions is : Is there any way to compute the sum more efficiently?
Any suggestion would be much appreciated !

Respuesta aceptada

Walter Roberson
Walter Roberson el 6 de Feb. de 2021
%assume that z and y are COLUMN vectors
K = (1:kappa).'; %COLUMN vector not row vector
result = sum( K.*sin(K.*z.').*exp(K.*y.') );
What this does is improve locality of reference. The values to be summed will be consecutive ones in the matrix, instead of being scattered around the matrix.
  2 comentarios
Hung Dao
Hung Dao el 7 de Feb. de 2021
Thanks. I've tried your suggestion and it slightly improves.
My first attempt: 4.29s vs your suggestion: 3.8s which is good but is not as effecient as I expected.
Is it possible to reduce the computational time even more?
Thanks
Walter Roberson
Walter Roberson el 7 de Feb. de 2021
In the case that there are a lot of duplicate values in z or y, then it might become practical to find the unique values, take the sin only for them, and get back to the appropriate extended value using indexing. This would, however, generally be slower because of the overhead in finding the duplicates. For Intel i9 "Coffee Lake" the cycle time for FSIN is 53 to 105 cyles, with another result available after 50 to 120 cycles (so if you have a bunch queued up you might shave 3 cycles off.) Therefore if you were able to do duplicate detection in less than 50 cycles, you might be able to save time overall. As a rough estimation, that would start to become practical if you had on the order of less than 10 unique values (maybe 20-ish even.)
Another situation in which you might be able to get some improvement is the case in which you are tight on memory, so that forming the full arrays is causing a lot of overhead on memory management.
What magnitude range do the z values have? If they are negative then K*z < roughly -36 is going to have an exp() that does not contribute significantly to the sum and it might be possible to filter out values. exp(400*z) is going to be either very big or very small unless abs(z) < 1/10 (approximately)

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Data Type Identification en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by