Which formula fits better for this Linear Mixed-Effects Model?
Miquel Bosch Bruguera on 10 Mar 2020
I am currently analyzing a dataset that contains a list of flight simulator tests performed by different pilots. I want to analyze whether certain flight parameters (e.g. number of input errors during flight, lateral deviation from the ideal path, etc.) are affected by the categorical variables in the following list:
- Experimental Campaign: the pilots flew the same flights but in different places (and environmental conditions, such as lack of oxygen, isolation, etc). 5 different campaigns were done. A performance difference is likely to appear depending on the campaigns.
- Group: in each campaign, the pilots were divided into two groups: frequent flyers (FF) and infrequent flyers (IF). A performance difference is expected between the two groups.
- Session: During each campaign, the same number of flights was performed, each month for FF pilots, or every three months for IF pilots. In total, 10 sessions were made. Performance might vary throughout the experiment, and that variation may also depend on Group and Campaign.
- Flight Scenario: three different flight scenarios were flown, which required different skill levels. The performance is also expected to vary between type of scenario.
Additionally, an extra list of categorical variables could be considered (Gender, Age, Background, etc.).
Could you please tell me which LME model formula you would use to understand the dataset described above? And, if you wish, how would you best plot the results of such an analysis?
Paul on 24 Jul 2020
Edited: Paul on 24 Jul 2020
From what you describe it seems you only have fixed effects, no random effects. Is that correct? Random effects are factors that vary between data points and can affect your data, but that you are not interested in themselves, for instance the ID of the pilot (one pilot might have produced multiple data points because he participated in multiple tests) or the time of day when the measurement happened (morning/afternoon/evening) or something like that.
If you have no random effects, your formula could be this:
'FlightParameter ~ ExperimentalCampaign*Group*Session*FlightScenario'
In this case you are comparing interactions between all four fixed effects, which would produce a rather complicated model, but LMEs should be able to handle this fairly easily as long as you have enough datapoints.
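To get a feeling for how large this model is, here is a minimal sketch on purely synthetic data. It assumes a balanced design with the level counts from the question (5 campaigns, 2 groups, 10 sessions, 3 scenarios) and 2 simulated flights per cell. Because the formula has no random-effects term, the sketch uses `fitlm` rather than `fitlme`; the table `tbl` and its values are made up for illustration.

```matlab
% Synthetic, balanced design (assumption): 5 campaigns x 2 groups x
% 10 sessions x 3 scenarios, with 2 simulated flights per cell.
rng(0);                                   % reproducible random numbers
[c, g, s, f, ~] = ndgrid(1:5, 1:2, 1:10, 1:3, 1:2);

tbl = table(categorical(c(:)), categorical(g(:)), ...
            categorical(s(:)), categorical(f(:)), ...
    'VariableNames', {'ExperimentalCampaign', 'Group', 'Session', 'FlightScenario'});
tbl.FlightParameter = randn(height(tbl), 1);   % placeholder response values

% Full factorial model: all main effects and all interactions up to four-way
lm = fitlm(tbl, 'FlightParameter ~ ExperimentalCampaign*Group*Session*FlightScenario');

% One coefficient per cell of the design: 5 * 2 * 10 * 3
fprintf('Coefficients in the full factorial model: %d\n', lm.NumCoefficients);
```

With every factor treated as categorical, the model estimates one coefficient per cell of the design, so the real dataset would need clearly more observations than that for the fit to be meaningful.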
How you can best plot this depends entirely on what the goal of the visualization is. What is the goal of your visualization?
Peng Li on 5 Aug 2020
To understand your dataset, I still believe that descriptive statistics can give you the broad general picture. In terms of statistical analysis, the hypothesis always comes first and then the model, not the other way around. So, getting back to your question, I don't really recommend that you run linear regressions of any type blindly before you have a hypothesis in mind. The software will certainly produce results regardless, even if your design has problems, and those results will then make no sense.
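As a rough illustration of such a descriptive first look, the sketch below summarizes a response per group and scenario and draws box plots. The data are synthetic stand-ins, and the variable names (`Group`, `FlightScenario`, `FlightParameter`) are assumptions borrowed from the question, not the real table.

```matlab
% Synthetic stand-in data (assumption): 2 groups x 3 scenarios, 20 flights each
rng(2);                                   % reproducible random numbers
[g, f, ~] = ndgrid(1:2, 1:3, 1:20);

tbl = table(categorical(g(:), [1 2], {'FF', 'IF'}), categorical(f(:)), ...
    'VariableNames', {'Group', 'FlightScenario'});
tbl.FlightParameter = randn(height(tbl), 1) + double(f(:));   % made-up values

% Mean and standard deviation of the response in every Group x Scenario cell
stats = groupsummary(tbl, {'Group', 'FlightScenario'}, {'mean', 'std'}, 'FlightParameter');
disp(stats)

% Box plot per Group x Scenario combination for a quick visual comparison
boxplot(tbl.FlightParameter, {tbl.Group, tbl.FlightScenario});
ylabel('FlightParameter');
```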
It might be okay to explore around. But the four-way interaction is way too complicated to be interpretable! It seems that in your question, the same pilot may have completed multiple sessions. That will be where random effects are needed.
To showcase a simpler scenario: if, for example, you'd like to test whether frequent and infrequent pilots (group factor) perform differently in different places, and you'd like to control for demographic variations, you may want to apply this LME model: outcome ~ group * place + age + sex + background + (1|pilot)
The (1|pilot) part of the formula takes the within-pilot correlation into consideration (random effect).
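A minimal sketch of how a formula like this could be passed to `fitlme` is given below. Everything in it is an assumption for illustration: the data are simulated, each hypothetical pilot belongs to one group and one place and flies six sessions, and `background` is left out to keep the example short.

```matlab
% Synthetic data only (assumption): 24 pilots, 6 repeated sessions each
rng(1);                                        % reproducible random numbers
nPilots = 24;  nSessions = 6;
pilotId = repelem((1:nPilots)', nSessions);    % pilot index for every row

% Pilot-level attributes, constant across that pilot's sessions
groupOfPilot = mod((1:nPilots)', 2) + 1;       % 1 = FF, 2 = IF
placeOfPilot = mod((1:nPilots)', 3) + 1;       % three hypothetical places
sexOfPilot   = 1 + ((1:nPilots)' > nPilots/2); % two hypothetical categories
ageOfPilot   = randi([25 55], nPilots, 1);
pilotOffset  = 2 * randn(nPilots, 1);          % simulated between-pilot variation

tbl = table();
tbl.pilot = categorical(pilotId);
tbl.group = categorical(groupOfPilot(pilotId), [1 2], {'FF', 'IF'});
tbl.place = categorical(placeOfPilot(pilotId));
tbl.sex   = categorical(sexOfPilot(pilotId));
tbl.age   = ageOfPilot(pilotId);
tbl.outcome = 10 + 1.5 * (tbl.group == 'IF') + pilotOffset(pilotId) ...
              + randn(height(tbl), 1);

% Random intercept per pilot models the correlation among a pilot's own flights
lme = fitlme(tbl, 'outcome ~ group*place + age + sex + (1|pilot)');
disp(lme)      % fixed-effects coefficients and variance components
anova(lme)     % F-tests for each fixed-effects term
```

On real data, `tbl` would be the flight table itself, with `pilot`, `group`, `place`, and `sex` stored as categorical variables so that `fitlme` treats them as factors.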