Nested for-loops too slow
9 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Hello,
I'm having trouble to compute a summation with four indices. My code below is very time consuming. The problem is have to compute for every i,j,k,l the three smallest elements of A. Any other suggestions that run a bit faster than my code? (N is approximately 156)
sum = 0;
for i=1:N
for j=1:N
for k=1:N
for l=1:N
A = [N-(i-1),N-(j-1),N-(k-1),N-(l-1)];
B = sort(A);
sum = sum + B(1)*B(2)*B(3);
end
end
end
end
Thank you very much!
0 comentarios
Respuesta aceptada
Egon Geerardyn
el 18 de Abr. de 2011
You might want to work more on that code manually before implementing it like that. As the only thing that really matters is N, I think it should be possible to derive a more closed-form formula for the total sum and at least it should be possible to reduce the number of iterations by factoring out some constants.
E.g. I note [i,j,k,l] as your complete index: going from [1,1,1,1] to [1,1,1,X], you know what the results will be for any X (larger than 1): you already know the smallest values so you are wasting time iterating of all other values, sorting, multiplying and summing while you could just see that in that case you can add (N-1) * 1 * 1 * 1
Also remember that your case is quite symmetric towards your indexes: [i,j,k,l] will produce the same output as e.g. [j,i,k,l] and [i,j,l,k] (and so forth). When you think about it, you will see that there are 24 possible permutations for the most general case (where [i,j,k,l] are all different). So by taking that into account, I assume it should at least be possible to limit the iteration range you took by calculating the output once and adding 24*sum for the general case. You will get different situations for [i,i,k,l] (12 cases for every i); [i,i,i,l] (in 4 cases for every i) and [i,i,i,i] (1 case for every i). You will need to account for that in your code.
When doing something like this, I would suggest you first leave out some dimensions, start to work with a 2 dimensional loop, first of all that will allow you to use this crude code and secondly that also allows you to study a simpler case first, find a faster solution and generalize that solution to your exact case.
For the 2D case, [i,j] represents a matrix and you will see that you can see that [i,j] and [j,i] gives the same output. However, a special case exists [i,i] only occurs once.
So for 2D, a naive implemtation might be something like:
sum = 0;
N=156;
for i=1:N
for j=1:N
A = N - [i,j] + 1;
sum = sum + min(A);
end
end
sum
An implementation with fewer iterations (approximately half as many, if I'm right):
sum = 0;
N=156;
for i=1:N
A = N - [i,i] + 1;
sum = sum + min(A);
for j=(i+1):N
A = N - [i,j] + 1;
sum = sum + 2*min(A);
end
end
sum
You can of course combine this with Robert Cumming's answer, precalculate the parts you already know.
3 comentarios
Egon Geerardyn
el 18 de Abr. de 2011
@Niels: How do you expect to get real answers when we don't get the real problem? Due to the symmetry you can reduce the execution time, but if your real problem doesn't show any symmetry, you might as well be stuck with the nested for loops.
Something you might also try is to vectorize the inner loop (or even multiple inner loops), that will give you some improvement in performance, trading off memory usage.
If algorithmic optimizations are not possible, you might want to take a look at MEX/Java/.NET/... functionality since your problem size is rather large (156^4, approx. half a billion executions). These will execute faster, but their limiting behavior will be the same, such that they might not allow you to calculate N = 500 (or 2000) while a better algorithm might.
Más respuestas (2)
Teja Muppirala
el 18 de Abr. de 2011
You really shouldn't use "sum" as a variable name. That is the name of a very useful built-in function. I'll call your variable "S" instead.
Looking at the code, you can deduce that S should be a polynomial function of N. Doing a little bit of curve fitting you can indeed get an analytic expression: S(N) =
(1/420)*(30*N.^7 + 105*N.^6 + 147*N.^5 + 105*N.^4 + 35*N.^3 - 2*N)
So If were to type into MATLAB:
S = @(N)(1/420)*(30*N.^7 + 105*N.^6 + 147*N.^5 + 105*N.^4 + 35*N.^3 - 2*N);
S(uint64(156))
Then you have your very long answer:
S = 164235165014258
3 comentarios
Teja Muppirala
el 18 de Abr. de 2011
I believe the polynomial given is correct for all N, not just around 156. The code below confirms this for N = 1 to 20. I'm not sure what discrepancies you are referring to. There are 4 for loops, and in the inner most loop you are adding factors of order N^3, so it's not quite unexpected that the final result would be a polynomial of order 7. What I meant by "curve fitting", is an analytic curve fitting of a 7th order polynomial at 7 points.
clear S;
for N = 1:20
S(N,1) = 0;
for i=1:N
for j=1:N
for k=1:N
for l=1:N
A = [N-(i-1),N-(j-1),N-(k-1),N-(l-1)];
B = sort(A);
S(N) = S(N) + B(1)*B(2)*B(3);
end
end
end
end
end
S_ANALYTIC = @(N)(1/420)*(30*N.^7 + 105*N.^6 + 147*N.^5 + 105*N.^4 + 35*N.^3 - 2*N);
S2 = S_ANALYTIC(uint64(1:20)')
[S S2]
aretheyequal = isequal(S2,S)
Teja Muppirala
el 18 de Abr. de 2011
That should be:
A 7th order polynomial fit to "8 points" not "7 points"
Robert Cumming
el 18 de Abr. de 2011
Pulling out the i, j and k elements only when they changewill speed it up:
sum = 0;
A = zeros(4,1);
for i=1:N
A(1) = N-(i-1);
for j=1:N
A(2) = N-(j-1);
for k=1:N
A(3) = N-(k-1);
for l=1:N
A(4) = N-(l-1);
B = sort(A);
sum = sum + B(1)*B(2)*B(3);
end
end
end
end
Ver también
Categorías
Más información sobre Loops and Conditional Statements en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!