Chi-squared test to check whether two data sets come from the same distribution

Hello,
I have recorded some discrete data with an unspecified distribution.
I have generated some discrete data from a model.
I am looking to check whether the generated data has the same distribution as the real data.
If the data were continuous, I would use a Q-Q plot, and a straight line would indicate that the distributions match.
As the data is discrete, I need another test.
I was thinking a chi-squared test would be suitable?
Does MATLAB have such a function? I would be grateful if somebody could demonstrate an example.
Kind regards

Accepted Answer

You could use a two-sample Kolmogorov-Smirnov test. This tests the hypothesis that the two samples come from the same distribution.
doc kstest2
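
As a minimal sketch of how that call might look: the variable names `realDist` and `modelDist` and the simulated values below are placeholders, not the poster's data, and `kstest2` and `exprnd` assume the Statistics and Machine Learning Toolbox is installed.

```matlab
% Hypothetical stand-in data: journey distances rounded to the nearest mile.
rng(0);                                   % make the example reproducible
realDist  = round(exprnd(10, 5000, 1));   % placeholder for recorded distances
modelDist = round(exprnd(10, 5000, 1));   % placeholder for model output

% Two-sample Kolmogorov-Smirnov test at the default 5% significance level.
% h = 1 rejects the null hypothesis that both samples share one distribution.
[h, p, ksStat] = kstest2(realDist, modelDist);
fprintf('h = %d, p = %.4f, KS statistic = %.4f\n', h, p, ksStat);
```

Because rounded data contain many tied values, the reported p-value is best treated as approximate.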

3 Comments

Hi,
Thank you for your reply. The documentation says that the Kolmogorov-Smirnov test is for continuous variables. Do you know if it matters if you use it for discrete variables?
Thank you
I am not sure I follow. It sounds like the KS test is what you are looking for. The documentation says that the samples come from continuous distributions. It says nothing about the samples themselves, which are what you are comparing. Or maybe I am missing something.
Thanks Jose, I confused the two terms. That is what I wanted.
Cheers


More Answers (1)

How about anything here:
Or some of the anova tests:
doc anova1
doc anova2
doc anovan

4 Comments

Thanks, my problem is that I do not know which test is suitable for discrete data.
What does your data look like?
I'm not a stats expert so I won't discuss the theoretical stuff (Tom, Ilya or Peter may chime in...)
But I would highly recommend reading the doc because it will describe exactly what is going on for your data.
John
John on 8 Feb 2013
Edited: John on 8 Feb 2013
Thanks,
I have recorded the distances of thousands of car journeys (to the nearest mile). I also have a model that generates journey distances. I want to determine whether the journey distances produced by the model are from the same distribution as the real-world data. I'm not a stats expert either :( . I have looked at the docs and they refer to continuous data (mine is discrete), so I'm not sure if they are suitable?
The KS test can be used on discrete data. What you assume is that the distribution the data come from is continuous. That's a different thing.
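
For the chi-squared test asked about in the original question, here is a rough sketch of one way it could be set up by hand for two samples of whole-mile distances. It is not a method proposed by anyone in this thread. The data, the variable names, and the 40-mile pooling cutoff `cap` are all assumptions for illustration, and `chi2cdf` and `exprnd` assume the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical stand-in data: journey distances rounded to the nearest mile.
rng(1);                                   % make the example reproducible
realDist  = round(exprnd(10, 5000, 1));   % placeholder for recorded distances
modelDist = round(exprnd(10, 5000, 1));   % placeholder for model output

% Pool the sparse upper tail: every journey of cap miles or more shares one bin.
cap   = 40;                               % assumed cutoff, adjust to your data
edges = -0.5:1:(cap + 0.5);               % one bin per whole-mile value 0..cap

% 2-by-K table of observed counts (row 1 = real data, row 2 = model data).
observed = [histcounts(min(realDist, cap), edges); ...
            histcounts(min(modelDist, cap), edges)];
observed = observed(:, any(observed > 0, 1));   % drop bins empty in both samples

% Expected counts if both samples came from the same distribution.
expected = sum(observed, 2) * sum(observed, 1) / sum(observed(:));

% Pearson chi-squared statistic, degrees of freedom, and upper-tail p-value.
chi2Stat = sum((observed(:) - expected(:)).^2 ./ expected(:));
dof      = (size(observed, 1) - 1) * (size(observed, 2) - 1);
p        = chi2cdf(chi2Stat, dof, 'upper');
fprintf('chi2 = %.2f, dof = %d, p = %.4f\n', chi2Stat, dof, p);
```

A small p-value would suggest the model and the recorded data are distributed differently. Bins with very small expected counts make the approximation unreliable, which is why the tail is pooled here.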


Asked:

8 Feb 2013
