Updating mean and standard deviation

Hi
I am working on a project which includes Bayesian inversion. And am working on an idea about updating the prior model iteratively, in a loop.
Here is a simplification of the code that troubles me.
xAll = zeros(4, 400000);
for i = 1:400000
x = randi(5, 4, 1);
xAll(:, i) = x;
mu = mean(xAll, 2);
sigma = std(xAll, 0, 2);
fvec = (mu-x)./sigma;
logprior = -1/2*sum(fvec.^2);
% Then comes lots of other stuff that uses the above, but is not relevant for the time being.
% But normally this section would determine a new "x".
end
When determining the "logprior"-value I need the mean and std of "xAll". This takes time to do over and over again. Do any of you know a way of updating "mu" and "sigma", without recomputing them?
Thanks in advance

 Respuesta aceptada

dpb
dpb el 9 de Abr. de 2018
Editada: dpb el 9 de Abr. de 2018
You can keep the partial sums
mu=SUM(x)/N --> SUMX/N
SUMX=SUMX+xnew-xold;
xold=xnew;
in each iteration. Similarly for STD excepting need SUMXSQ as well;
v = [sum(x.^2)-1/N*(sum(x))^2]/(N-1)
sd=sqrt(v)
or, simplifying the sums to variables;
v=[SUMXSQ-N*MU^2]/(N-1)
IOW, you keep the two running sums of sum(x) and sum(x^2) and update them for each iteration by removing the previous and adding the new terms.
From what's shown it doesn't appear that there's any reason you couldn't vectorize the i loop away, but perhaps what's not shown precludes doing so...

6 comentarios

Thanks for the answer.
I am not sure what you mean by:
SUMX=SUMX+xnew-xold;
In by head the sum will just end up being the "x"-value of the given iteration. Can you elaborate?
dpb
dpb el 9 de Abr. de 2018
Editada: dpb el 10 de Abr. de 2018
That assumes a moving average of N elements; if it really is the accumulating sum that you're using then you're correct, the new sum is just SUMX=SUMX+x
My background for doing such machinations is so tied with process control, Kalman filtering and the like it just automagic to write it that way without thinking about it...
ADDENDUM
If you're confused by the +xnew-xold nomenclature, that is simply taking the previous iteration sum, adding the new x for the current generation, then subtracting the old value. What isn't shown explicitly is that you have to keep N values of x, the new one gets added and the oldest drops off the end each time step; it isn't actual code, sorry if that misled.
Michael Madelaire
Michael Madelaire el 10 de Abr. de 2018
Okay, so you choose how many of the previous values are needed. Let's say 5000. When I reach iteration 5001 I'll remove the first "x", making sure that it is only an average over the last 5000?
dpb
dpb el 10 de Abr. de 2018
Editada: dpb el 10 de Abr. de 2018
That's what is showing how to do; yes.
It's up to what you're trying to do...your code as written just computes over the number of trials; that's just the sums without removing anything; the other is a running average of the last N observations.
What you want/need is up to your problem; not enough information to know...
Michael Madelaire
Michael Madelaire el 10 de Abr. de 2018
Thanks for the clarification.
I will accept the answer since it is a solution. Although it is not specifically what I am looking for.
dpb
dpb el 10 de Abr. de 2018
What, then, specifically, are you looking for? The only way I know to compute mean/var on the fly is as shown; keep the intermediary terms and update. If one talks of exceedingly long sequences it becomes rather impractical, unfortunately. You could make approximations; but that's different and if the difference in computed value once you've gotten several K of observations is significant enough to bother to recompute then the approximation probably isn't good enough, either, as compared to just using the previously calculated value.

Iniciar sesión para comentar.

Más respuestas (0)

Preguntada:

el 9 de Abr. de 2018

Comentada:

dpb
el 10 de Abr. de 2018

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by