N-way ANOVA wrong results

Elsa Martinez on 3 Jun 2019
Edited: Adam Danz on 5 May 2020
Hi everyone,
I am running some statistical analyses in MATLAB and checking them against SPSS. I am running an N-way ANOVA in MATLAB, but its results do not match those from SPSS. As you can see below, the results differ completely; they are not even close. This is my code (I am showing a shortened version of the data to illustrate the problem):
connection= [0.3950; 0.3774; 0.0622; 0.0140; 0.1424; 0.1029; 0.1711; ...
0.1595; 0.0774; 0.0111];
abstinence=[12;6;8;11;14;6;3;6;3;4];
relapse=[0;1;2;0;1;1;3;0;1;0];
[~,tbl]=anovan(connection,{abstinence, relapse},'display','on','model','full',...
'varnames',{'Abstinence','Relapse'}, 'continuous',[1,2]);
I am specifying a full-interaction model with continuous predictors, the same as in SPSS. This gives me these results in MATLAB:
Source                Sum Sq.   d.f.   Mean Sq.   F      Prob>F
----------------------------------------------------------------
Abstinence            0.02207   1      0.02207    0.93   0.3714
Relapse               0.01428   1      0.01428    0.6    0.4667
Abstinence*Relapse    0.01587   1      0.01587    0.67   0.4441
Error                 0.14196   6      0.02366
Total                 0.1653    9
And in SPSS (sorry it is in Spanish: Abstinencia is Abstinence and Recaidas is Relapse, just in case):
[Image: 5.JPG, screenshot of the SPSS output table]
I don't know why this is happening or whether I am doing something wrong. I have also changed 'model' to 'interaction', and that does not help either. Any help is welcome!
Thanks in advance.

Accepted Answer

Jeff Miller on 4 Jun 2019
Both of the MATLAB output tables make sense to me, but I don't really know what SPSS is doing.
For the MATLAB continuous case, the model says that connection changes linearly with Abstinence, with Relapse, and with Abstinence*Relapse. The 1 df for each of those terms is a slope relating connection to each of these three numerical predictors. (This is really a regression model.)
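The continuous model described here can also be written as an explicit regression. The following is a minimal sketch, assuming the Statistics and Machine Learning Toolbox is available and using the ten sample values from the question. The table name `data` and the formula are illustrative choices, not something taken from the original analysis.

```matlab
% Sample data from the question
connection = [0.3950; 0.3774; 0.0622; 0.0140; 0.1424; 0.1029; 0.1711; ...
              0.1595; 0.0774; 0.0111];
abstinence = [12; 6; 8; 11; 14; 6; 3; 6; 3; 4];
relapse    = [0; 1; 2; 0; 1; 1; 3; 0; 1; 0];

% Collect the variables in a table (the name "data" is illustrative)
data = table(connection, abstinence, relapse);

% Intercept plus one slope each for abstinence, relapse, and their product
mdl = fitlm(data, 'connection ~ abstinence*relapse');

disp(mdl.NumCoefficients)   % number of estimated coefficients
disp(mdl.DFE)               % error df = observations minus coefficients
```

With ten observations and four coefficients, the error degrees of freedom come out to 6, which agrees with the Error row of the anovan table in the question.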
For the MATLAB non-continuous case, each different score on Abstinence and Relapse indicates a qualitatively different condition (i.e., imagine that the numbers were just arbitrary labels). Since you have 7 different Abstinence values, this model requires 6 df for Abstinence differences. Similarly, you have a lot of distinct Relapse values and a whole lot of possible interaction terms. Basically, you have nowhere near enough data to estimate the parameters of this model, hence all those zeros in the table.
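The parameter count for the categorical model can be checked directly from the sample data. This is a rough sketch in base MATLAB; the variable names `dfA`, `dfR`, `dfAR`, and `nParams` are introduced here only for illustration.

```matlab
% Sample predictors from the question
abstinence = [12; 6; 8; 11; 14; 6; 3; 6; 3; 4];
relapse    = [0; 1; 2; 0; 1; 1; 3; 0; 1; 0];

nA = numel(unique(abstinence));   % number of distinct Abstinence values
nR = numel(unique(relapse));      % number of distinct Relapse values

dfA  = nA - 1;                    % df for the Abstinence main effect
dfR  = nR - 1;                    % df for the Relapse main effect
dfAR = dfA * dfR;                 % df for the full interaction
nParams = 1 + dfA + dfR + dfAR;   % total parameters, including the intercept

fprintf('Parameters: %d, observations: %d\n', nParams, numel(abstinence));
```

For this sample the full categorical model asks for 28 parameters from only 10 observations, so most of its terms cannot be estimated.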
SPSS is claiming 5 df for Abstinence and 2 for Relapse, but I don't know why those aren't 6 and 3 (number of distinct scores minus 1). It has excluded the interaction term completely (0 df), presumably because it recognizes that there are not enough data to estimate those terms.
Bottom line: With your data, it does not seem feasible to use ANOVA as you are trying to do it. This analysis is meant to be used with categorical predictors (i.e., having a few distinct values, which are not assumed to be numerically meaningful), and with lots of data for all possible combinations of values on all the different predictors. For example, you would need several cases with Abstinence=4 & Relapse=0, several with Abstinence=4 & Relapse=1, and so on for all possible cases. I think regression modelling is more suitable for the type of data that you seem to have.
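One way to see how sparsely the sample covers the factor combinations is to count the cases in each Abstinence-by-Relapse cell. This is a minimal sketch in base MATLAB using the sample data from the question; the variable names are illustrative.

```matlab
% Sample predictors from the question
abstinence = [12; 6; 8; 11; 14; 6; 3; 6; 3; 4];
relapse    = [0; 1; 2; 0; 1; 1; 3; 0; 1; 0];

% Map each value to the index of its level
[levelsA, ~, iA] = unique(abstinence);
[levelsR, ~, iR] = unique(relapse);

% Rows are Abstinence levels, columns are Relapse levels
counts = accumarray([iA iR], 1, [numel(levelsA) numel(levelsR)]);
disp(counts)

fprintf('%d of %d cells are empty; largest cell has %d cases\n', ...
        nnz(counts == 0), numel(counts), max(counts(:)));
```

For the sample data, 19 of the 28 cells are empty and no cell holds more than two cases, which is far from the "several cases per combination" that the categorical analysis needs.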
  2 Comments
Elsa Martinez on 5 Jun 2019
Well, I have posted only a small sample of the real data here; the full dataset is much longer. With the complete dataset, the Abstinence*Relapse term matches between SPSS and MATLAB (only in the non-continuous case), but the individual Abstinence and Relapse terms do not, and I don't know why. So I was wondering whether you know the difference between SPSS and anovan in MATLAB, and whether I could do something to make them give the same results.
Elsa Martinez on 5 Jun 2019
I have discovered that the Type II sum of squares in MATLAB corresponds to the Type III sum of squares in SPSS, so the problem is solved!
Anyway, thanks a lot for your help, Jeff!
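For reference, anovan selects the sum-of-squares type through its 'sstype' option, which defaults to Type III. The sketch below requests Type II and Type III side by side, assuming the Statistics and Machine Learning Toolbox. It keeps the predictors continuous only so that the ten-row sample is estimable; for the categorical comparison described in this comment, the 'continuous' option would be dropped and the full dataset used. The names `t2` and `t3` are illustrative.

```matlab
% Sample data from the question
connection = [0.3950; 0.3774; 0.0622; 0.0140; 0.1424; 0.1029; 0.1711; ...
              0.1595; 0.0774; 0.0111];
abstinence = [12; 6; 8; 11; 14; 6; 3; 6; 3; 4];
relapse    = [0; 1; 2; 0; 1; 1; 3; 0; 1; 0];

% Type II sums of squares
[~, t2] = anovan(connection, {abstinence, relapse}, ...
    'model', 'interaction', 'sstype', 2, 'continuous', [1 2], ...
    'varnames', {'Abstinence', 'Relapse'}, 'display', 'off');

% Type III sums of squares (the anovan default)
[~, t3] = anovan(connection, {abstinence, relapse}, ...
    'model', 'interaction', 'sstype', 3, 'continuous', [1 2], ...
    'varnames', {'Abstinence', 'Relapse'}, 'display', 'off');

% Show source names and sums of squares for both types
disp([t2(:, 1:2), t3(:, 2)])
```

The Total sum of squares is the same under either type; the main-effect rows are where the two types are expected to differ when an interaction is in the model.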


More Answers (1)

Jeff Miller on 4 Jun 2019
The SPSS output does not match up with the data that you posted, so maybe you should check that analysis first. For your connection values, the correct total SS and total df, as produced by MATLAB, are 0.1653 and 9. Since those numbers don't appear in the SPSS output table, I am wondering whether VAR00005 is some other variable. Even if the programs were fitting slightly different models, MATLAB's SS total and SPSS's "Total corregido" (corrected total) should be identical if the data values were the same.
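The total sum of squares and its degrees of freedom depend only on the response values, so they can be verified by hand before comparing models. This is a minimal sketch in base MATLAB using the sample connection values from the question.

```matlab
% Sample response values from the question
connection = [0.3950; 0.3774; 0.0622; 0.0140; 0.1424; 0.1029; 0.1711; ...
              0.1595; 0.0774; 0.0111];

% Corrected total SS: squared deviations from the mean
ssTotal = sum((connection - mean(connection)).^2);
dfTotal = numel(connection) - 1;

fprintf('SS total = %.4f, df = %d\n', ssTotal, dfTotal);   % SS total = 0.1653, df = 9
```

If the corrected total in the other program differs from this value, the two programs are not analyzing the same response data.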
  6 Comments
Adam Danz on 5 May 2020
Edited: Adam Danz on 5 May 2020
@John TS, you could go to the anovan page of the online documentation and scroll to the very bottom, where you can give the page a star rating. After that, you have the opportunity to type suggestions for improving the documentation. The doc team will get that feedback.
@Jeff Miller, thanks for answering stats questions and offering great advice. Not many contributors here take on the stats questions.
Jeff Miller on 5 May 2020
Thanks, Adam, I appreciate the feedback.


Release

R2018b
