How to do feature selection by maximizing R-squared for a linear regression model?

Hi everyone,
I am working on a project to build a predictive model, but first I want to keep only the important features for the model, so I am using feature selection.
I went through the example at this link, and the code works properly: https://www.mathworks.com/help/stats/feature-selection.html
However, I want to use R-squared instead of the deviance used in that example. That is, I want to select the features that give a good R-squared value (> 0.85).
Can anyone help me with the code? Thanks!
1 comment
the cyclist on 11 Jun 2019
Also, I should mention that if your full model (i.e. with all features) does not achieve R^2 > 0.85, then a reduced feature set cannot achieve it either, because the in-sample R^2 of a least-squares fit never increases when features are removed. Is that what you were hoping for?
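The point about the full model can be checked directly. The following is a minimal sketch on synthetic data (the data, the column choice `[2 5]`, and the variable names are all made up for illustration); it fits the full model and one nested subset with `fitlm` and compares their ordinary R^2 values.

```matlab
% Synthetic data (hypothetical): six candidate features, two informative.
rng(1);
X = randn(150, 6);
Y = 1.5*X(:,2) - 2*X(:,5) + randn(150, 1);

fullModel = fitlm(X, Y);             % uses all six features
subModel  = fitlm(X(:, [2 5]), Y);   % nested model using columns 2 and 5 only

r2Full = fullModel.Rsquared.Ordinary;
r2Sub  = subModel.Rsquared.Ordinary;

% The subset's in-sample R^2 cannot exceed the full model's.
fprintf('R^2 full: %.4f   R^2 subset: %.4f\n', r2Full, r2Sub);
```

This holds for ordinary in-sample R^2 only. Adjusted R^2 (`Rsquared.Adjusted`) or R^2 measured on held-out data can be higher for a smaller model.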


Answers (1)

the cyclist on 10 Jun 2019
Edited: the cyclist on 10 Jun 2019
I believe you just need to redefine the critfun function from the one in the example:
function dev = critfun(X,Y)
model = fitglm(X,Y,'Distribution','binomial');
dev = model.Deviance;
end
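replacing the criterion value with
dev = model.Rsquared.Ordinary;
(The model property is spelled Rsquared, and it is a structure whose fields include Ordinary and Adjusted.) You might want to rename that variable to something like rsqr, to avoid confusion.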
EDIT:
After reading that example and thinking about it a bit more, there might be some other nuances. That example states, "Adding a feature with no effect reduces the deviance by an amount that has a chi-square distribution with one degree of freedom". I'm not sure the same is true for R^2, so that might bear some thought.
Also, the deviance is something that sequentialfs minimizes, whereas R^2 should be maximized, so an adjustment is probably needed for that as well. (One simple possibility would be to return 1-R^2 from the criterion function, I guess.)
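The two adjustments discussed above (read R^2 from the model, and return 1-R^2 so that minimizing the criterion maximizes R^2) might look like the sketch below. Several things here are assumptions rather than part of the documentation example: the data are synthetic, `fitlm` replaces the binomial `fitglm` because the question concerns linear regression, `minGain` is an arbitrary threshold, and the empty-input guard is there only so the `'nullmodel'` option has a defined starting value.

```matlab
% Synthetic data (hypothetical): only columns 1 and 3 drive the response.
rng(0);
X = randn(200, 6);
Y = 2*X(:,1) - 3*X(:,3) + 0.5*randn(200, 1);

% Assumed threshold: stop once a new feature lowers 1-R^2 by less than this.
minGain = 0.01;
opt = statset('Display', 'off', 'TolFun', minGain, 'TolTypeFun', 'abs');

inmodel = sequentialfs(@critfun, X, Y, ...
    'cv', 'none', 'nullmodel', true, ...
    'options', opt, 'direction', 'forward');

selected = find(inmodel);                  % indices of the chosen columns
finalR2  = 1 - critfun(X(:, inmodel), Y);  % R^2 of the selected model
fprintf('Selected columns: %s   R^2 = %.4f\n', mat2str(selected), finalR2);

function crit = critfun(X, Y)
    % Criterion for sequentialfs, which minimizes: return 1 - R^2.
    if isempty(X)
        crit = 1;   % intercept-only model has R^2 = 0
        return
    end
    model = fitlm(X, Y);
    crit = 1 - model.Rsquared.Ordinary;
end
```

Because `'cv'` is `'none'`, this criterion is in-sample R^2, which can only go up as features are added, so the threshold in `TolFun` is what actually stops the search. Whether a fixed `minGain` is a sensible stopping rule is a separate question from the chi-square argument used in the documentation example.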
4 comments
Shreeraksha Raviprakash on 11 Jun 2019
That was just a copy-paste error. I had run it with the correct variable and still could not get the code to select features.
the cyclist on 11 Jun 2019
I have to admit that I have not tried to deeply understand the example. But it seems to me that you still need to deal with the fact that you want to maximize R^2, not minimize it.
Also, I think you have not fully understood the purpose of the lines
maxR=chi2inv(0.4,1);
...
'TolFun',maxdev,...
(where I assume the mismatch here is another typo).
That line is not about defining the absolute level of R^2 that serves as the stopping criterion. It is about the improvement relative to the previous models with fewer features (I think).
All in all, my impression is that you are trying to make these changes without getting a deeper understanding of what everything is doing, which is hazardous to getting the correct result.
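To make the distinction above concrete, here is a rough sketch that keeps the two thresholds separate: `TolFun` receives a per-step improvement threshold, and the absolute R^2 > 0.85 target from the question is checked only after selection has finished. The data, the names `oneMinusR2`, `minGain`, and `targetR2`, and the value of `minGain` are all assumptions made for illustration.

```matlab
% Synthetic data (hypothetical): columns 2, 4 and 7 carry the signal.
rng(2);
X = randn(300, 8);
Y = X(:,2) + 0.8*X(:,4) - 1.2*X(:,7) + 0.4*randn(300, 1);

targetR2 = 0.85;    % absolute goal taken from the question
minGain  = 0.005;   % assumed per-step improvement threshold passed as TolFun

opt = statset('Display', 'off', 'TolFun', minGain, 'TolTypeFun', 'abs');
[inmodel, history] = sequentialfs(@oneMinusR2, X, Y, ...
    'cv', 'none', 'nullmodel', true, ...
    'options', opt, 'direction', 'forward');

r2Path  = 1 - history.Crit;                   % R^2 recorded at each step
finalR2 = 1 - oneMinusR2(X(:, inmodel), Y);   % R^2 of the selected model
meetsTarget = finalR2 > targetR2;             % absolute check, done afterwards

fprintf('Selected columns: %s\n', mat2str(find(inmodel)));
fprintf('Final R^2 = %.4f, meets target: %d\n', finalR2, meetsTarget);

function crit = oneMinusR2(X, Y)
    % sequentialfs minimizes its criterion, so return 1 - R^2.
    if isempty(X)
        crit = 1;   % intercept-only model has R^2 = 0
        return
    end
    model = fitlm(X, Y);
    crit = 1 - model.Rsquared.Ordinary;
end
```

If `meetsTarget` comes out false, lowering `minGain` lets the search add more features, but no setting can push R^2 above what the full feature set achieves.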
