Inconsistent results with fitlme

11 visualizaciones (últimos 30 días)
Maria Garcia Garrido
Maria Garcia Garrido el 27 de Jul. de 2022
Editada: Darshak el 29 de Abr. de 2025
I am trying to analyse some data with linear mixed-effects models using the function 'fitlme', but I noticed that I obtain different results depending on the alphabetical order of a categorical grouping variable. Could you let me know if I am doing something wrong?
I first create a table with the dataset:
grouping_variable1 = categorical(grouping_variable1);
grouping_variable2 = categorical(grouping_variable2);
ID = categorical(ID);
tbl1 = table(ID,grouping_variable1,grouping_variable2,dependent_variable);
I then create the linear mixed-effects model with the following line:
lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)');
These are the two results I get deppending on the alphabetical order of grouping_variable2/4:
Option 1
grouping_variable2 = string(zeros(23*2,1));
grouping_variable2(1:23) = 'Training';
grouping_variable2(23+1:23*2) = 'Test';
grouping_variable2 = categorical(grouping_variable2);
tbl1 = table(animal_ID,grouping_variable1,grouping_variable2,dependent_variable);
lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)')
lme =
Linear mixed-effects model fit by ML
Model information:
Number of observations 46
Fixed effects coefficients 8
Random effects coefficients 23
Covariance parameters 2
Formula:
dependent_variable ~ 1 + grouping_variable1*grouping_variable2 + (1 | ID)
Model fit statistics:
AIC BIC LogLikelihood Deviance
-28.98 -10.693 24.49 -48.98
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower
{'(Intercept)' } 0.46417 0.058778 7.8969 38 1.557e-09 0.34518
{'grouping_variable1_2' } 0.16375 0.092937 1.7619 38 0.086122 -0.024391
{'grouping_variable1_3' } 0.023056 0.083125 0.27736 38 0.78301 -0.14522
{'grouping_variable1_4' } 0.20131 0.080102 2.5132 38 0.016325 0.039152
{'grouping_variable2_Training' } 0.020833 0.073085 0.28506 38 0.77715 -0.12712
{'grouping_variable1_2:grouping_variable2_Training'} -0.19375 0.11556 -1.6767 38 0.10182 -0.42768
{'grouping_variable1_3:grouping_variable2_Training' } -0.060833 0.10336 -0.58857 38 0.55963 -0.27007
{'grouping_variable1_4:grouping_variable2_Training'} -0.20012 0.099597 -2.0093 38 0.051649 -0.40174
Upper
0.58316
0.35189
0.19133
0.36347
0.16879
0.040182
0.1484
0.0015053
Random effects covariance parameters (95% CIs):
Group: ID (23 Levels)
Name1 Name2 Type Estimate Lower Upper
{'(Intercept)'} {'(Intercept)'} {'std'} 0.068596 0.027252 0.17266
Group: Error
Name Estimate Lower Upper
{'Res Std'} 0.12659 0.094816 0.169
results = anova(lme)
results =
ANOVA marginal tests: DFMethod = 'Residual'
Term FStat DF1 DF2 pValue
{'(Intercept)' } 62.361 1 38 1.557e-09
{'grouping_variable1' } 2.9461 3 38 0.045032
{'grouping_variable2' } 0.081258 1 38 0.77715
{'grouping_variable1:grouping_variable2'} 1.789 3 38 0.16568
Option 2
grouping_variable4 = string(zeros(23*2,1));
grouping_variable4(1:23) = 'ATraining';
grouping_variable4(23+1:23*2) = 'BTest';
grouping_variable4 = categorical(grouping_variable4);
tbl1 = table(animal_ID,grouping_variable1,grouping_variable4,dependent_variable);
lme = fitlme(tbl1,'dependent_variable ~ grouping_variable1*grouping_variable4 + (1|ID)')
lme =
Linear mixed-effects model fit by ML
Model information:
Number of observations 46
Fixed effects coefficients 8
Random effects coefficients 23
Covariance parameters 2
Formula:
dependent_variable ~ 1 + grouping_variable1*grouping_variable4 + (1 | ID)
Model fit statistics:
AIC BIC LogLikelihood Deviance
-28.98 -10.693 24.49 -48.98
Fixed effects coefficients (95% CIs):
Name Estimate SE tStat DF pValue Lower Upper
{'(Intercept)' } 0.485 0.058778 8.2513 38 5.3474e-10 0.36601 0.60399
{'grouping_variable1_2' } -0.03 0.092937 -0.3228 38 0.74862 -0.21814 0.15814
{'grouping_variable1_3' } -0.037778 0.083125 -0.45447 38 0.65208 -0.20606 0.1305
{'grouping_variable1_4' } 0.0011905 0.080102 0.014862 38 0.98822 -0.16097 0.16335
{'grouping_variable4_BTest' } -0.020833 0.073085 -0.28506 38 0.77715 -0.16879 0.12712
{'grouping_variable1_2:grouping_variable4_BTest'} 0.19375 0.11556 1.6767 38 0.10182 -0.040182 0.42768
{'grouping_variable1_3:grouping_variable4_BTest' } 0.060833 0.10336 0.58857 38 0.55963 -0.1484 0.27007
{'grouping_variable1_4:grouping_variable4_BTest'} 0.20012 0.099597 2.0093 38 0.051649 -0.0015053 0.40174
Random effects covariance parameters (95% CIs):
Group: ID (23 Levels)
Name1 Name2 Type Estimate Lower Upper
{'(Intercept)'} {'(Intercept)'} {'std'} 0.068596 0.027252 0.17266
Group: Error
Name Estimate Lower Upper
{'Res Std'} 0.12659 0.094816 0.169
results = anova(lme)
results =
ANOVA marginal tests: DFMethod = 'Residual'
Term FStat DF1 DF2 pValue
{'(Intercept)' } 68.084 1 38 5.3474e-10
{'grouping_variable1' } 0.11571 3 38 0.95036
{'grouping_variable4' } 0.081258 1 38 0.77715
{'grouping_variable1:grouping_variable4'} 1.789 3 38 0.16568
The p value of grouping_variable1 is different in the ANOVA results, depending of the other variable. I am not sure of what I may be doing wrong, or why the results would be different because of this, so any help or guidance would be greatly appreciated.
  1 comentario
Maria Garcia Garrido
Maria Garcia Garrido el 29 de Jul. de 2022
The problem occurred when interactions were included in the model. Setting 'DummyVasCoding' to 'effects' to create dummy variables using effects coding solved the problem.

Iniciar sesión para comentar.

Respuestas (1)

Darshak
Darshak el 28 de Abr. de 2025
Editada: Darshak el 29 de Abr. de 2025
Hello Maria,
This behavior with “fitlme” is a known thing when working with categorical variables in MATLAB. It happens because of how MATLAB internally handles categorical predictors. When you fit a linear mixed-effects model like:
lme = fitlme(tbl1, 'dependent_variable ~ grouping_variable1*grouping_variable2 + (1|ID)');
and “grouping_variable2 ” is a categorical variable, MATLAB automatically picks one of the categories as a reference level. By default, it picks the one that comes first alphabetically.
The important part is:
  • The reference level affects how the model interprets both main effects and interaction terms.
  • If the order of categories is changed (like renaming “Training” and 'Test' to “ATraining” and “BTest”), you're effectively changing the reference category.
  • That means the model's coefficients, p-values, and ANOVA results can shift, even though the overall fit (AIC, BIC, LogLikelihood) stays identical.
It’s just the way one-hot encoding and contrast settings work.
To avoid getting different results depending on naming or alphabetical order, it's a good idea to explicitly set the reference category for categorical variables.
In MATLAB, this can be done with “reordercats”:
grouping_variable2 = reordercats(grouping_variable2, {'Test', 'Training'});
Or
“setcats”:
grouping_variable2 = setcats(grouping_variable2, {'Test', 'Training'});
The following documentations can be referred for these functions:
The following documentation can be used to refer to ‘fitlme’:
Hope this helps with the issue!

Categorías

Más información sobre Analysis of Variance and Covariance en Help Center y File Exchange.

Etiquetas

Productos


Versión

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by