normal distribution fit vs histogram

Jurgen Casha on 10 Mar 2017
Commented: Jurgen Casha on 12 Mar 2017
The question is not exactly MATLAB-related, but I cannot think of anywhere else to find an answer.
I have a sample of sugar and, after a sieve analysis, used Excel to create a histogram of its particle size distribution (histogram image not reproduced here).
I worked out the mean by multiplying the percentage retained in each bin by the bin's midpoint, summing over all bins, and dividing by the total frequency (100 in this case, since the values are percentages). For the open-ended >1000 bin I approximated the midpoint as 1200. This gave a mean of 448.
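The weighted-mean calculation described above can be sketched in a few lines of MATLAB. The bin midpoints and percentages below are hypothetical placeholders, not the values from the histogram in the question, so the result will not reproduce the 448 quoted above.

```matlab
% Hypothetical sieve data (NOT the values from the question's histogram).
binMid      = [100 250 400 550 700 900 1200];  % bin midpoints; 1200 is the guess for the >1000 bin
pctRetained = [5 20 30 25 12 6 2];             % percentage retained per bin (sums to 100)

% Normalise to weights so the code also works if the percentages
% do not add up to exactly 100.
w = pctRetained / sum(pctRetained);

% Weighted mean and weighted (population) standard deviation
% of the binned data, treating every particle as sitting at its bin midpoint.
mu    = sum(w .* binMid);
sigma = sqrt(sum(w .* (binMid - mu).^2));

fprintf('weighted mean = %.1f, weighted std = %.1f\n', mu, sigma);
```

Note that the result depends on the midpoint assumed for the open-ended top bin, so it is worth trying a couple of values there to see how sensitive the mean is.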
Next I fitted a normal distribution in MATLAB, using the bin midpoints as the x values and the percentage retained in each bin as the y values. The fit had an R-squared of about 0.97, and its mean was 460, which is not quite the same as the value above.
My questions are: first, should the two values be identical? Second, if they should not be identical, which value makes more sense? Finally, are there better ways of fitting a distribution to my histogram in MATLAB?

Answers (1)

Image Analyst on 10 Mar 2017
You can use the function fitdist() in the Statistics and Machine Learning Toolbox. It can fit normal, lognormal, and a number of other distributions. The toolbox also has functions such as probplot() and kstest() that show how well your data match a theoretical distribution.
As you can see, you don't have a normal distribution. Like almost all particle characteristic distributions, it is skewed, more like a lognormal than a normal. I suggest you use a lognormal and not a normal. The normal is nice for explaining the theory in college classes because the math is simple, but the real world is not so simple.
  1 Comment
Jurgen Casha on 12 Mar 2017
I am trying to use the fitdist() function, but the problem is that it only accepts raw data. Unfortunately I do not have the raw data; I only have the binned ranges shown in the histogram. Should I approximate each bar of the histogram by a single value, such as the arithmetic or geometric mean of its range, or is there a better way of handling the problem?
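One possible way to handle binned data, offered here as a sketch and not as the thread's accepted method, is to fit the lognormal by maximising the likelihood of the bin counts directly: each bin contributes its percentage times the log of the probability mass the candidate distribution puts between the bin edges. This avoids picking a representative value per bin and copes with the open-ended >1000 bin. The edges and percentages are hypothetical placeholders, and the helper names (`Phi`, `lognCdf`, `binProb`, `negLogLik`) are made up for this example. The main fit uses only base MATLAB (`erfc`, `fminsearch`). The optional last part uses `fitdist` with its `'Frequency'` argument on the bin midpoints, which is the single-value approximation asked about above and needs the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical sieve data (NOT the values from the question's histogram).
edges = [0 200 300 500 600 800 1000 Inf];  % bin edges; the last bin is open-ended (>1000)
pct   = [5 20 30 25 12 6 2];               % percentage retained per bin
mids  = [100 250 400 550 700 900 1200];    % midpoints; 1200 is a guess for the >1000 bin

% Standard normal CDF written with erfc, so no toolbox is required.
Phi = @(z) 0.5 * erfc(-z / sqrt(2));

% Lognormal CDF with parameters p = [mu, log(sigma)].
% Optimising log(sigma) keeps sigma positive without constraints.
lognCdf = @(x, p) Phi((log(x) - p(1)) ./ exp(p(2)));

% Probability mass in each bin, and the binned negative log-likelihood.
% max(..., realmin) guards against log(0) for very poor parameter guesses.
binProb   = @(p) diff(lognCdf(edges, p));
negLogLik = @(p) -sum(pct .* log(max(binProb(p), realmin)));

% Starting point: weighted moments of log(midpoint).
w   = pct / sum(pct);
mu0 = sum(w .* log(mids));
s0  = sqrt(sum(w .* (log(mids) - mu0).^2));

pHat     = fminsearch(negLogLik, [mu0, log(s0)]);
muHat    = pHat(1);
sigmaHat = exp(pHat(2));
meanHat  = exp(muHat + sigmaHat^2 / 2);    % mean of the fitted lognormal

fprintf('binned fit:   mu = %.3f, sigma = %.3f, mean = %.1f\n', ...
        muHat, sigmaHat, meanHat);

% Optional comparison: treat each bin as repeated observations at its midpoint.
% 'Frequency' must contain nonnegative integers, hence the rounding.
if ~isempty(which('fitdist'))
    pdMid = fitdist(mids(:), 'Lognormal', 'Frequency', round(pct(:)));
    fprintf('midpoint fit: mu = %.3f, sigma = %.3f, mean = %.1f\n', ...
            pdMid.mu, pdMid.sigma, mean(pdMid));
end
```

Using percentages in place of actual counts does not change the fitted parameters, but it does mean the likelihood value itself should not be used for standard errors or goodness-of-fit tests unless the true sample size is known.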
