# Moving sum of a matrix

11 views (last 30 days)
Furkan Küçük on 4 Oct 2017
Commented: Jan on 4 Oct 2017
I need to calculate a moving sum of a matrix between rows and I want it without a for loop, just with matrix manipulation for speed of code concerns. The sum can be illustrated like this:
3 2 5 6 1 4 0 0 0 0 0 0
0 0 3 6 9 3 4 1 0 0 0 0
0 0 0 0 8 1 9 0 5 2 0 0
0 0 0 0 0 0 4 8 7 6 2 5
so the sum vector I am looking for is
3 2 8 12 18 8 17 9 12 8 2 5
I am doing this with a for loop like this (N denotes for the moving size(the example I gave above is N = 2)):
for ii = 1:length(sum)
the_vector_I_want_to_get(1, (ii - 1) * N + 1: (ii + 2) * N) = the_vector_I_want_to_get(1, (ii - 1) * N + 1: (ii + 2) * N) + vector_being_added
end
I hope I could explain the question well.
Edit: well, I see that I couldn't explain my question well. I am not looking for a simple "sum()" function. Actually my matrix is like this:
A = [3 2 5 6 1 4;
3 6 9 3 4 1;
8 1 9 0 5 2;
4 8 7 6 2 5];
B = [3 2 5 6 1 4 0 0 0 0 0 0;
0 0 3 6 9 3 4 1 0 0 0 0;
0 0 0 0 8 1 9 0 5 2 0 0;
0 0 0 0 0 0 4 8 7 6 2 5];
So I want to do the job as
sum(B);
with the "A" matrix. "padArray" function is out of question since I want to avoid "for" loop.
Sorry for the unclear question. I hope it is clearer now.

Furkan Küçük on 4 Oct 2017
They are very beneficial with the algorithms, however since Matlab is based on matrix manipulations and operations, and as a result of Matlab being an interpreting language, matrix operations are much faster.
I have a bit series of 1000000 (million) and the simulation should run 1000*20 so it is a lot of numbers over here.
I always construct my simulation algorithms based on for loops and then I try to speed it up. The last simulation I built with "for" loops take 220 seconds for each trial. It could sucked worse if I wouldn't have some experience in Matlab and use larger number of for loops. Now it takes 3,5 seconds for each trial thanks to trying to avoid for loops in order to shorten the simulation time.
Hence, yes they are evil for simulations that requires massive trials :)
Jan on 4 Oct 2017
@Furkan: Try my code and compare the run times. Note that
R(i1:i2) = R(i1:i2) + B(:, k);
is an efficient matrix manipulation also, although it is embedded in a loop. As soon as a vectorized code requires to create a large index matrix or an huge temporary array, loops can be remarkably faster, if they are programmed efficiently. For a [1e5 x 1e3] matrix transposing the array before the summing let the code run 3 times faster, because the columnwise processing is more efficient.
It is a too coarse rumor, that loops are slow in general.
Please post the typical size of your inputs to allow realistic tests.
Furkan Küçük on 4 Oct 2017
The size is 1000000 * 26 for the A matrix. Yeah just saw that a loop structure designed intelligently can be fast too

Guillaume on 4 Oct 2017
Edited: Guillaume on 4 Oct 2017
Following the edit to the question, one way to do what you want:
A = [3 2 5 6 1 4;
3 6 9 3 4 1;
8 1 9 0 5 2;
4 8 7 6 2 5];
step = 2;
subs = (1:size(A, 2)) + (0:size(A, 1)-1).' * step;
result = accumarray(subs(:), A(:)).'
edit: Another way that creates the B matrix:
subs = (1:size(A, 2)) + (0:size(A, 1)-1).' * step;
B = zeros(size(A, 1), max(subs(:)));
B(sub2ind(size(B), repmat((1:size(A, 1))', 1, size(A, 2)), subs)) = A;
result = sum(B, 1)

Guillaume on 4 Oct 2017
I can't see that matrix dimensions would agree
This requires the implicit expansion that was introduced in R2016b. In earlier versions, the implicit expansion has to be done explicitly with bsxfun as shown in Andrei's comment.
Furkan Küçük on 4 Oct 2017
Jan on 4 Oct 2017
+1. Fast and compact.

Jan on 4 Oct 2017
Edited: Jan on 4 Oct 2017
I know, you do not like loops, but for a comparison of the speed this might be interesting:
A = [3 2 5 6 1 4;
3 6 9 3 4 1;
8 1 9 0 5 2;
4 8 7 6 2 5];
[s1, s2] = size(A);
R = zeros(1, s2 + 2 * (s1 - 1));
i1 = 1;
i2 = s2;
for k = 1:s1
R(i1:i2) = R(i1:i2) + A(k, :);
i1 = i1 + 2;
i2 = i2 + 2;
end
A vectorized version would require to create a large index matrix, or create the padded array explicitly. I assume, that this will be slower in consequence.
Do you have a compiler? A C-mex would increase the speed here most likely. But then try it the other way around and process the data columnwise. It matters if you process [6 x 1e6] or [1e6 x 6] matrices. Please explain this, when you ask for a speed optimization.
[EDITED] The columnwise processing is much faster inside Matlab already:
[s1, s2] = size(A);
B = A.';
R = zeros(s2 + 2 * (s1 - 1), 1);
i1 = 1;
i2 = s2;
for k = 1:s1
R(i1:i2) = R(i1:i2) + B(:, k);
i1 = i1 + 2;
i2 = i2 + 2;
end
This would be even faster, if you can store the data in the transposed orientation already.

Furkan Küçük on 4 Oct 2017
Oh and the point of approach is quite brilliant :)
Furkan Küçük on 4 Oct 2017
A little update: columnwise processing made no difference. But my code is %50 faster now thanks to your approach :)
Jan on 4 Oct 2017
Speed test, R2009a:
A = rand(1e5, 26);
tic;
[s1, s2] = size(A);
R = zeros(1, s2 + 2 * (s1 - 1));
i1 = 1;
i2 = s2;
for k = 1:s1
R(i1:i2) = R(i1:i2) + A(k, :);
i1 = i1 + 2;
i2 = i2 + 2;
end
toc;
tic;
subs = bsxfun(@plus,1:size(A, 2),(0:size(A, 1)-1).' * 2);
R2 = accumarray(subs(:), A(:)).';
toc;
tic;
B = reshape(permute(cat(3,A,zeros([s1,s2,2-1])),[3,1,2]),[],s2);
R3 = sum(spdiags(B(1:end-1,:),0:s2-1,s1*2-1,s2 + s1*2-2));
toc
Elapsed time is 0.255235 seconds. Loop
Elapsed time is 0.039196 seconds. Accumarray
Elapsed time is 1.389395 seconds. spdiags
The loop method is efficient for A=rand(26, 1e5), but even then Guillaume's ACCUMARRAY is better:
Elapsed time is 0.059586 seconds.
Elapsed time is 0.037151 seconds.
Elapsed time is 1.610867 seconds.

Andrei Bobrov on 4 Oct 2017
Edited: Andrei Bobrov on 4 Oct 2017
exotics:
[m,n] = size(A);
B = reshape(permute(cat(3,A,zeros([m,n,N-1])),[3,1,2]),[],n);
out = sum(spdiags(B(1:end-1,:),0:n-1,m*N-1,n + m*N-2));