Real or categorical predictors: which one is faster?
In regression, is there a guideline for treating predictors as real-valued or categorical?
In a fitting problem with inputs X, y where X contains the hour of the day (e.g. 1, 2, 3, etc.), I tend to treat it as a categorical predictor because the number of unique values, length(unique(X)), is limited (i.e. 24). Surprisingly, the fit seems slower than when the same predictor is treated as real-valued in a Gaussian process fitted with fitrgp.
My questions are:
- why does it take longer with categorical predictor?
- in a similar situation, is there a guideline for deciding whether to treat the predictors as real values or as categorical inputs?
3 comments
Walter Roberson
17 Sep 2023
Have you experimented with passing uint8 data? I don't know if that is permitted; if it is, then it would signal that discrete algorithms are to be used.
mono
17 Sep 2023
"why does it take longer with categorical predictor?"
I'd venture it's owing to the large number of dummy variables introduced by modeling 24 levels of time as categorical instead of continuous/discrete. You could try artificially reducing the same data set to 24, 12, and 2 levels and see whether that hypothesis is correct.
Regardless of whether it's true or not, it's still the model definition and purpose that should be controlling decisions such as this, not anything to do with compute time.
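One way to try the experiment suggested above is sketched here. It is only a rough sketch on made-up data: the synthetic hour/response data, the bin widths, and all variable names are assumptions for illustration, not the original poster's setup. It fits fitrgp once with the hour as a numeric predictor, then with the hour binned to 24, 12, and 2 levels and flagged through the 'CategoricalPredictors' option, and records the fit time and the number of expanded (dummy) predictors reported by the model.

```matlab
% Hypothetical synthetic data: hour of day (1..24) with a noisy daily cycle.
rng(0);                                   % make the run reproducible
n = 500;
hour = randi(24, n, 1);
y = sin(2*pi*hour/24) + 0.1*randn(n, 1);

% Baseline: hour as a single numeric predictor.
tic;
mdlNum = fitrgp(hour, y);
tNum = toc;

% Hour as a categorical predictor, coarsened by binning.
binWidths = [1 2 12];                     % intended to give 24, 12, and 2 levels
nLevels   = zeros(numel(binWidths), 1);
nExpanded = zeros(numel(binWidths), 1);
tCat      = zeros(numel(binWidths), 1);
for k = 1:numel(binWidths)
    binned = ceil(hour / binWidths(k));   % merge adjacent hours into one level
    nLevels(k) = numel(unique(binned));
    tic;
    mdlCat = fitrgp(binned, y, 'CategoricalPredictors', 1);
    tCat(k) = toc;
    % Number of predictors after the categorical variable is expanded.
    nExpanded(k) = numel(mdlCat.ExpandedPredictorNames);
end

% Summarize: categorical fits first, then the numeric baseline.
results = table(nLevels, nExpanded, tCat, ...
    'VariableNames', {'Levels', 'ExpandedPredictors', 'FitSeconds'});
disp(results);
fprintf('Numeric predictor fit time: %.3f s\n', tNum);
```

If the fit time drops as the number of levels drops, that would support the dummy-variable explanation. Single tic/toc measurements are noisy, so repeating each fit a few times, or using timeit, would give a fairer comparison.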