Calculating the Gaussian distribution parameters

Adrien on 21 Jun 2022
Commented: Adrien on 21 Jun 2022
Hello,
I'm trying to write a small script to try out the EM algorithm. I have 2 sets of 1-dimensional points that belong to 2 different Gaussians, but I don't know which point belongs to which data set, and the EM algorithm should estimate the Gaussian parameters (mean, variance) for both.
For that I first create a small data set
data1 = normrnd(-6,3,[200 1]);
data2 = normrnd(6,1,[200 1]);
data = [data1;data2];
Then, to have something to compare the EM algorithm's output against, I first calculate the Gaussian distribution parameters directly. However, the result I get is slightly different depending on whether I use the MATLAB function fitdist or code the math myself (left is manual math, right is fitdist):
Why is that?
PS:
The math I did was for mu and sigma:
The manual math is coded as:
% (μ,σ²)
distGauss1.mu = mean(data1);
distGauss1.sigma = mean((data1-distGauss1.mu).^2);
distGauss2.mu = mean(data2);
distGauss2.sigma = mean((data2-distGauss2.mu).^2);

Accepted Answer

dpb on 21 Jun 2022
Let's try your formula with numbers...
>> data1 = normrnd(-6,3,[200 1]);
>> mean(data1)
ans =
-6.1098
>> std(data1)
ans =
3.0128
>>
OK, that returns what we would expect, pretty close to the input parameters of the RNG...
Now what does your calculation give...
>> mean((data1-mean(data1)).^2)
ans =
9.0315
>>
Whoops!!! You forgot two things -- the first is the square root, since your expression computes the variance rather than sigma:
sqrt(mean((data1-mean(data1)).^2))
ans =
3.0052
>>
That's much closer, but still not identical to what std returned. That is the second thing: you used mean, which divides by n, whereas std by default divides by n-1 (the normalization that makes the variance estimate unbiased).
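In symbols, with $n$ samples $x_i$ and sample mean $\bar{x}$ (notation assumed here for illustration, not taken from the post), the two estimates and the factor between them can be written as:

$$
\hat{\sigma}_n = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2},
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}
  = \hat{\sigma}_n\,\sqrt{\frac{n}{n-1}}
$$

For $n = 200$ the factor $\sqrt{200/199}$ is about 1.0025, which is consistent with the gap between 3.0052 and 3.0128 in the session above.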
So, as the left-hand plot shows, your distribution is much fatter than it should be...3X the width, because the variance (about 9) was used where sigma (about 3) belongs. The result is much closer for the other data set because sqrt(1) --> 1, so the missing square root just doesn't show up numerically there.
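The comparison can be collected into one short script. This is only a sketch: the variable names (varN, sigmaN, sigmaN1) are made up for illustration, and randn is used in place of normrnd so that it needs no toolbox.

```matlab
rng(0);                               % fixed seed, only so the run is repeatable
data1 = -6 + 3*randn(200,1);          % same distribution as normrnd(-6,3,[200 1])
n     = numel(data1);
mu    = mean(data1);

varN    = mean((data1 - mu).^2);      % variance with 1/n normalization
sigmaN  = sqrt(varN);                 % standard deviation with 1/n normalization
sigmaN1 = std(data1);                 % std default: 1/(n-1) normalization

% The two standard deviations differ only by the factor sqrt(n/(n-1)).
assert(abs(sigmaN1 - sigmaN*sqrt(n/(n-1))) < 1e-12);

% Passing 1 as the second argument switches std and var to 1/n.
assert(abs(std(data1,1) - sigmaN) < 1e-12);
assert(abs(var(data1,1) - varN)   < 1e-12);

fprintf('variance, 1/n : %.4f\n', varN);
fprintf('sigma, 1/n    : %.4f\n', sigmaN);
fprintf('sigma, 1/(n-1): %.4f\n', sigmaN1);
```

If fitdist follows the same n-1 convention as std for the normal distribution (which is how I read its documentation), its sigma should match the last line rather than the second one.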
2 comments
Adrien on 21 Jun 2022
Ooh I see, thank you. The notation for the parameters was (μ, σ²) and I just forgot about the "²".
Adrien on 21 Jun 2022
Oh, and that also fixed the ones inside the EM. I also had an issue where the EM algorithm wasn't working properly: the best it could do was a really rough approximation of what it was supposed to find, and sometimes it didn't work at all, computing a mu and/or sigma so small that MATLAB just said "NaN". This comment doesn't add anything, I just wanted to say thanks again.
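For anyone landing here with the same EM problem, below is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture, showing where the square root belongs in the M-step. It is not the script discussed above: the names (gauss, r, nk, w), the starting values and the fixed iteration count are all assumptions made for illustration.

```matlab
rng(1);                                          % repeatable run
data = [-6 + 3*randn(200,1); 6 + randn(200,1)];  % same mixture as in the question
n    = numel(data);

% Starting values (an arbitrary choice for this sketch).
mu    = [min(data); max(data)];
sigma = [std(data); std(data)];                  % standard deviations, not variances
w     = [0.5; 0.5];                              % mixing weights

% Normal pdf written out by hand; s is the standard deviation.
gauss = @(x,m,s) exp(-0.5*((x - m)./s).^2) ./ (s*sqrt(2*pi));

for iter = 1:200
    % E-step: responsibility of each component for each point (n-by-2).
    p = [w(1)*gauss(data,mu(1),sigma(1)), w(2)*gauss(data,mu(2),sigma(2))];
    r = p ./ sum(p,2);

    % M-step: weighted mean, weighted variance, then the square root.
    nk = sum(r,1)';                              % effective counts (2-by-1)
    mu = (r' * data) ./ nk;
    for k = 1:2
        sigma(k) = sqrt(sum(r(:,k) .* (data - mu(k)).^2) / nk(k));
    end
    w = nk / n;
end

fprintf('mu    = %.3f  %.3f\n', mu);
fprintf('sigma = %.3f  %.3f\n', sigma);
```

With well-separated components like these, the estimates should land near the generating values (mu around -6 and 6, sigma around 3 and 1). If the weighted variance is stored without the square root and then fed back into a pdf that expects sigma, the components are too wide on every pass, which fits the rough approximation described above.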
