Vectorized method to sum missed one values

Hi there,
I am trying to achieve something, but I can't think of a vectorized way of doing this. The problem is as follows.
Say I have a vector of 0s and 1s, e.g. [0, 1, 1, 0, 0, 1, 0, 1]. I want to transform it into the following vector: [0, 2, 1, 0, 0, 3, 0, 2]. That is, going from left to right, every 1 is increased by the number of consecutive zeros immediately preceding it, if any.
This can easily be done in a loop, but because of the vast number of computations, I am looking for a vectorized way to achieve this.
Any help is appreciated!
Best, Robert

Accepted Answer

Amir Xz on 19 Sep 2018
Edited: Amir Xz on 19 Sep 2018
A=[0, 1, 1, 0, 0, 1, 0, 1];
[~,NonZr] = find(A~=0);
A(NonZr) = [NonZr(1),NonZr(2:end)-NonZr(1:end-1)];
Result:
A =
0 2 1 0 0 3 0 2

4 comments

This is probably the simplest solution.
There's no point in using the two outputs version of find, so:
NonZr = find(A~=0);
which then works equally well for column vectors.
Amir Xz on 20 Sep 2018
Edited: Amir Xz on 20 Sep 2018
Thank you Guillaume. An even simpler way to find the indices of the nonzero elements is:
NonZr = find(A);
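Combining the suggestions above, the whole approach can be sketched in two lines with diff. This is only a sketch, not part of the original answer; the variable name idx and the (:) reshaping are assumptions added here so that the same code handles both row and column vectors.

```matlab
A = [0, 1, 1, 0, 0, 1, 0, 1];
idx = find(A);                 % positions of the nonzero elements
% Distance from each nonzero element to the previous one
% (the leading 0 makes the first distance equal to the first position)
A(idx) = diff([0; idx(:)]);
```

For the example vector this gives the same result as the accepted answer.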
Robert Vullings on 20 Sep 2018
Thank you very much, this is indeed a very smart way to do it!
Any ideas on how to expand this (or another method) to a 2D array?
For example, say
A=[0, 1, 1, 0, 0, 1, 0, 1;
1, 0, 1, 0, 1, 1, 0, 1];
would then become
A=[0, 2, 1, 0, 0, 3, 0, 2;
1, 0, 2, 0, 2, 1, 0, 2];
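One possible way to extend the find/diff idea to a 2D array is sketched here. It is not taken from any of the answers in this thread, and it assumes that every row is processed independently, as in the example above. The names B, c, r, d and first are illustrative only.

```matlab
A = [0, 1, 1, 0, 0, 1, 0, 1;
     1, 0, 1, 0, 1, 1, 0, 1];
B = A.';                       % transpose so that linear indexing runs along the rows of A
[c, r] = find(B);              % c: column of A, r: row of A, ordered row by row
c = c(:);  r = r(:);           % force column vectors (find returns rows if B is a row vector)
d = diff([0; c]);              % distance to the previous nonzero element
first = diff([0; r]) ~= 0;     % true at the first nonzero element of each row of A
d(first) = c(first);           % restart the count at the beginning of each row
B(B ~= 0) = d;                 % logical indexing visits elements in the same order as find
A = B.';
```

The transpose is there because MATLAB stores arrays column by column; without it, find would walk down the columns instead of along the rows.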


More Answers (2)

Guillaume on 19 Sep 2018
Edited: Guillaume on 19 Sep 2018
You can replace the earlier part of this answer by the compiled version of rcumsumc for speed
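v = [0, 1, 1, 0, 0, 1, 0, 1];
rcum = double(~v);                        % 1 at every zero of v
csum = cumsum(rcum);                      % running count of zeros
rcum(v == 1) = -diff([0, csum(v == 1)]);  % at each 1, subtract the zeros counted since the previous 1
rcsum = cumsum(rcum) + 1;                 % length of the current zero run + 1, reset to 1 at each 1
% all of the above can be replaced by rcumsum:
% rcsum = rcumsum(~v) + 1;
reploc = diff(v) == 1;                    % true where a 0 is immediately followed by a 1
v([false, reploc]) = rcsum([reploc, false])  % write the run length + 1 into the 1 that ends each zero run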
Note that I'm not convinced that it will be faster than a well written loop (which can do the job in only one pass over the data).
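For comparison, a single-pass loop of the kind mentioned above could look like the following. This is a minimal sketch and not code from the original answer; the names out and nZeros are illustrative.

```matlab
v = [0, 1, 1, 0, 0, 1, 0, 1];
out = v;                       % zeros stay zeros, ones get overwritten below
nZeros = 0;                    % number of consecutive zeros seen since the last 1
for k = 1:numel(v)
    if v(k) == 0
        nZeros = nZeros + 1;
    else
        out(k) = nZeros + 1;   % the 1 itself plus the zeros preceding it
        nZeros = 0;            % reset the counter after each 1
    end
end
```

The loop touches each element exactly once and needs no temporary arrays, which is why it may compete well with the vectorized versions.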

1 comment

Christopher Wallace on 19 Sep 2018
This is about 20x faster than my answer when run on my machine. Nice work!


Christopher Wallace on 19 Sep 2018
startingData = [0, 1, 1, 0, 0, 1, 0, 1];
stringArr = sprintf('%d', startingData ); % Convert to string for use with regexp
zerosLoc = regexp(stringArr , '(0*)'); % Find starting index of groups of 0's
onesLoc = regexp(stringArr , '(1*)'); % Find starting index of groups of 1's
% The difference between the start of each group of 1's and the start of the
% preceding group of 0's is the number of zeros leading up to that 1
startingData(onesLoc) = (onesLoc - zerosLoc) + 1;

1 comment

Conversions from numbers to strings are never fast, but
stringArr = char(startingData + '0');
will be a lot faster than using sprintf.
However, you don't need regexp and strings to find the start of the sequences.
zerosLoc = find(diff([1, startingData]) == -1); %find [1 0] transitions
onesLoc = find(diff([0, startingData]) == 1); %find [0 1] transitions
With this it may actually be faster than my solution.
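Put together with the assignment from the answer above, the string-free version might look like this. It is a sketch under one explicit assumption: the data starts with a 0 and ends with a 1, so that every run of zeros is paired with the run of ones that follows it. The variable result and the assert check are additions for illustration.

```matlab
startingData = [0, 1, 1, 0, 0, 1, 0, 1];
% Assumption: first element is 0 and last element is 1, so the runs pair up one to one
assert(startingData(1) == 0 && startingData(end) == 1, ...
    'This sketch assumes the data starts with 0 and ends with 1');
zerosLoc = find(diff([1, startingData]) == -1);  % start of each run of 0's
onesLoc  = find(diff([0, startingData]) ==  1);  % start of each run of 1's
result = startingData;
result(onesLoc) = onesLoc - zerosLoc + 1;        % zeros before the run, plus the 1 itself
```

Inputs that start with a 1 or end with a 0 would need the unmatched run handled separately before the subtraction.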


Asked: 19 Sep 2018

Commented: 2 Nov 2018
