Massive slowdown for Apple Silicon in computing SVD
I recently noticed an extreme slowdown in my version of MATLAB when computing an SVD once the size of the matrix crosses some threshold. I came up with the following example that demonstrates the issue:
N = [10000 11000 12000 13000];
for i = 1:4
A = randn(N(i),3);
tic;
[U,S,V] = svd(A,0);
toc;
end
When I run this in Matlab R2024b (macOS Apple silicon), the output is:
Elapsed time is 0.000396 seconds.
Elapsed time is 0.000275 seconds.
Elapsed time is 0.000264 seconds.
Elapsed time is 0.083150 seconds.
Of course the exact numbers vary trial to trial, but the speed for the last run (where N = 13000) is consistently orders of magnitude slower.
When I run this same code in MATLAB R2024b (Intel version) on the same computer, this slowdown does not happen. I was able to replicate the issue across two different Macs (one with an M1 and another with an M3) and different versions of MATLAB (going back to R2023b).
Any idea why this might be happening in the Apple silicon version?
Edit: I'm running macOS 15.1.1
Accepted Answer
More Answers (1)
Heiko Weichelt
on 21 Dec 2024
2 votes
Thanks for reporting this.
We identified the problem and are working on improving this in a future release.
As a temporary workaround, we recommend replacing:
[U,S,V]=svd(A,0);
with
[Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U;
In general, this step is not needed because SVD performs the QR factorization internally. The LAPACK library currently used on Apple Silicon, however, had suboptimal tuning parameters for this case.
On my machine, the time for the largest example improved as follows:
>> tic; [U,S,V]=svd(A,0); toc
Elapsed time is 0.086851 seconds.
>> tic; [Q,R]=qr(A,"econ"); [U,S,V]=svd(R); U=Q*U; toc
Elapsed time is 0.000977 seconds.
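The workaround above can be checked for correctness as well as speed. The following is a minimal sketch, not part of the original answer: the variable names `Ur`, `relErr`, and `orthErr` are illustrative, and it assumes A has at least as many rows as columns.

```matlab
% Economy-size SVD of a tall, skinny matrix via an explicit QR step.
A = randn(13000,3);

[Q,R] = qr(A,"econ");    % A = Q*R, where R is only 3-by-3
[Ur,S,V] = svd(R);       % SVD of the small triangular factor
U = Q*Ur;                % left singular vectors of A (13000-by-3)

% Sanity checks: A should be reconstructed and U should have
% orthonormal columns, both up to rounding error.
relErr  = norm(A - U*S*V','fro') / norm(A,'fro');
orthErr = norm(U'*U - eye(3));
```

Both `relErr` and `orthErr` should be on the order of machine precision if the factorization is behaving as expected.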
3 comments
P Jeffrey Ungar
on 26 Mar 2025
Just diagonalizing the overlap matrix is much faster (but still not as fast as the proper implementation). For a 1000000 x 3 matrix (on an M4 Max Mac):
>> tm = tic(); [V,S] = eig(A.'*A); toc(tm);
Elapsed time is 0.003637 seconds.
Heiko Weichelt
on 26 Mar 2025
For the initial example, we also compute U, which has the same dimensions as A, i.e., tall and skinny. Your solution isn't computing that yet.
Furthermore, the condition number of A.'*A might be as bad as the square of the condition number of A itself, which can cause additional trouble for EIG. So I wouldn't advise this workaround as a general solution.
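To illustrate the conditioning concern, here is a small sketch under stated assumptions: the test matrix is constructed by hand with singular values 1, 1e-3, and 1e-6 (these values and all variable names are chosen for illustration only and do not come from the thread).

```matlab
% Build a 1000-by-3 matrix with known singular values.
rng(0);
sTrue = [1 1e-3 1e-6];
[Q1,~] = qr(randn(1000,3),"econ");   % orthonormal columns
[Q2,~] = qr(randn(3));               % 3-by-3 orthogonal matrix
A = Q1*diag(sTrue)*Q2';

condA = cond(A);        % close to 1e6
condG = cond(A'*A);     % close to 1e12, roughly condA^2

% Smallest singular value recovered two ways.
sSvd = min(svd(A));
sEig = sqrt(min(eig(A'*A)));
errSvd = abs(sSvd - 1e-6)/1e-6;
errEig = abs(sEig - 1e-6)/1e-6;
```

With this construction, `errEig` is typically several orders of magnitude larger than `errSvd`, because forming A'*A squares the condition number before EIG ever sees the data.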
P Jeffrey Ungar
on 26 Mar 2025
Edited: P Jeffrey Ungar on 26 Mar 2025
I neglected part of the solution. Yours is more complete, but the condition number consideration is hardly a problem for a small number of vectors, even very long ones. My application is to get an orthonormal basis for a small set of long vectors that are guaranteed to be linearly independent. They are, in fact, a set of eigenvectors for a degenerate eigenvalue already obtained by the likes of eigs(). These are not guaranteed to be orthogonal. The timings below show that finishing the work still gives much faster performance.
The performance of svd() right now on R2025a (prerelease) makes it virtually unusable. For laughs give it a single vector of length 1000000 and watch it take 12 seconds on M4 Max! I sincerely hope this problem is addressed properly by the time it is released.
>> A = randn(10000000,10);
tm = tic(); [V,S] = eig(A.'*A); U = A*V./sqrt(diag(S).'); delt=toc(tm)
delt =
0.1727
>> tm = tic(); [Q,R] = qr(A,"econ"); [U,~,~] = svd(R); U = Q*U; toc(tm);
Elapsed time is 0.390255 seconds.
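For the orthonormal-basis use case described in this comment, the Gram-matrix approach can be verified along the following lines. This is a minimal sketch that assumes the columns of A are well conditioned (here, random Gaussian data); the smaller matrix size and the names `D`, `orthErr`, and `spanErr` are illustrative choices, not taken from the thread.

```matlab
% Orthonormal basis for the column space of a tall matrix via eig(A.'*A).
rng(0);
A = randn(100000,10);

[V,D] = eig(A.'*A);              % 10-by-10 symmetric eigenproblem
U = A*V ./ sqrt(diag(D).');      % scale each column to unit length

% Checks: columns of U are orthonormal and span the columns of A.
orthErr = norm(U.'*U - eye(10));
spanErr = norm(A - U*(U.'*A),'fro') / norm(A,'fro');
```

If the input vectors are close to linearly dependent, `orthErr` can grow noticeably, in which case the QR-based route is the safer choice.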