Can I speed up this matrix multiplication?

81 visualizaciones (últimos 30 días)
Zenan Li
Zenan Li el 7 de Mayo de 2021
Editada: Bruno Luong el 18 de Mayo de 2021
Sorry for this stupid question. I know that the matrix multiplication has been highly optimized by Matlab, but I am still want to figure out that can I further speed up only for this special case?
I have no idea for this bottleneck in my code. The simplified code is uploaded in the following.
% X, Q are given symmetric matrix
[P, ~] = eig(X);
ans = P*(Q.*(P'*Y*P))*P'; % Y will change in each loop.
The X, Y, and Q are 1000x1000 matrices. In each loop, I have to compute four matrix multiplications,
But actually, P and Q are constant matrices in each loop, so I want to know what I could do to further speed up this code?
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Any suggestion will be appreciated.
  6 comentarios
Matt J
Matt J el 10 de Mayo de 2021
Is Q low rank, by any chance?
Zenan Li
Zenan Li el 10 de Mayo de 2021
Thanks, it seems Q can be low rank..

Iniciar sesión para comentar.

Respuesta aceptada

Matt J
Matt J el 10 de Mayo de 2021
Editada: Matt J el 10 de Mayo de 2021
For example, the size of Q is 1000x1000, but its rank is 10.
If so, then we can decompose Q into a short sum where and are the non-zero eigenvalues and vectors respectively of Q. Then the computation becomes, with representing the Kronecker product and representing diag(v_i),
or
So, you can now pre-compute the matrices and reduce the number of matrix multiplications in each of the above terms to two. Also, you have a 16-core CPU, so each of the terms can be computed in parallel and summed using parfor.
  10 comentarios
Nguyen Van Hieu
Nguyen Van Hieu el 18 de Mayo de 2021
My problem is very similar this question. I need to compute:
ans=A*P*A', where A is sparse, P is symetric and number of zero elements in P are 0.
How can I speed up the calculation using paralell toolbox
Bruno Luong
Bruno Luong el 18 de Mayo de 2021
Editada: Bruno Luong el 18 de Mayo de 2021
MATLAB matruix multiplication is perhaps multi-threaded.
You won't be able to speed up with putting parrallel on the single processor (PC).

Iniciar sesión para comentar.

Más respuestas (1)

Matt J
Matt J el 8 de Mayo de 2021
Editada: Matt J el 8 de Mayo de 2021
I have tried gpuArray, but it cannot be faster than normal Matlab * operation (maybe my GPU is not good enough..)
Depending on your GPU, and your precision requirements, it may be worth pre-casting the matrices to single precision. Some GPUs are not well-optimized for double float operations. For example, on my machine with the GeForce GTX 1050,
dtype={'double','single'};
for i=1:2
[P,Q,Y]=deal(rand(1000,dtype{i}));
timeit(@() P*(Q.*(P'*Y*P))*P')
[P,Q,Y]=deal(gpuArray.rand(1000,dtype{i}));
gputimeit(@() P*(Q.*(P'*Y*P))*P.')
end
I obtain,
ans = %double CPU
0.0607
ans = %double GPU
0.1678
ans = %single CPU
0.0261
ans = %single GPU
0.0065
  13 comentarios
Zenan Li
Zenan Li el 10 de Mayo de 2021
Thanks, and I find that although Q is not sparse but it is low rank. How could I do?
Zenan Li
Zenan Li el 10 de Mayo de 2021
For example, the size of Q is 1000x1000, but its rank is 10.

Iniciar sesión para comentar.

Categorías

Más información sobre GPU Computing en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by