Gaussian mixture model sometimes seems to fit very badly
In the following code, I fit a Gaussian mixture model (GMM) to randomly sampled data. I do this twice. Each time, the data are drawn from two well-separated Gaussians; the only difference between the two runs is the seed I use for the random number generator.
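N = 100000;          % samples drawn from each Gaussian
EFFECT_SIZE = 5;     % distance between the two means, in standard deviations
seedList = [1 6];    % the two seeds being compared
for s = seedList
    rng(s)           % seed the random number generator
    X = [randn(N,1); randn(N,1)+EFFECT_SIZE];   % N(0,1) and N(EFFECT_SIZE,1) samples, stacked
    figure
    hist(X,101)      % histogram with 101 bins
    GMModel = fitgmdist(X,2)   % two-component fit; no semicolon, so the model is displayed
end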
If you run that code (you will need the Statistics Toolbox), you will see that the first distribution is fit very well and the second one terribly. I am trying to understand why. I would expect such well-separated peaks to be fit well essentially every time.
This is not a fluke. I ran 1,000 different seeds and got a bad fit about 18% of the time. Also, the bad fits tend to cluster relatively close to the same parameter values.
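A seed survey like the one described above could be sketched roughly as follows. This is a minimal sketch, not the exact script behind the 18% figure: the tolerance BAD_TOL and the rule for calling a fit "bad" (either sorted fitted mean lying more than BAD_TOL from its true value) are assumptions made for illustration, as are the variable names isBad and fittedMu.

```matlab
% Sketch: count how often fitgmdist misses the two true components.
% BAD_TOL and the "bad fit" rule are illustrative assumptions.
N = 100000;
EFFECT_SIZE = 5;
nSeeds = 1000;                  % number of seeds to try
BAD_TOL = 0.5;                  % assumed tolerance on the fitted means
trueMu = [0; EFFECT_SIZE];      % true component means

isBad = false(nSeeds,1);        % flag for each seed
fittedMu = nan(nSeeds,2);       % sorted fitted means for each seed
for s = 1:nSeeds
    rng(s)
    X = [randn(N,1); randn(N,1)+EFFECT_SIZE];
    GMModel = fitgmdist(X,2);
    mu = sort(GMModel.mu);      % component order is arbitrary, so sort
    fittedMu(s,:) = mu';
    isBad(s) = any(abs(mu - trueMu) > BAD_TOL);
end

fprintf('Bad fits: %.1f%% of %d seeds\n', 100*mean(isBad), nSeeds);

% Scatter the fitted means of the bad fits to see whether they cluster.
figure
plot(fittedMu(isBad,1), fittedMu(isBad,2), '.')
xlabel('smaller fitted mean')
ylabel('larger fitted mean')
```

Sorting the fitted means before comparing them matters because fitgmdist does not return the components in any guaranteed order. The scatter plot at the end is one way to check whether the bad fits land near the same parameter values.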
Any thoughts? I am a novice at using GMM, so maybe I am just naive about how well this should do.
I am running R2014b on Mac OS X Yosemite.