Need Predictor Importance in Random Forest Expressed as a Percentage

Hi. I'm running code to estimate the importance of demographics (predictors) for my response (complaints). I need to express the importance as a percentage, on a scale of 0 to 1 (or 0% to 100%). The figure I am currently getting is attached as "RF Importance Chart". My predictor data is attached as "PredictorsOnly.xlsx" and my response data is attached as "TotalComplaintsRF.xlsx".
X = readtable('PredictorsOnly.xlsx','PreserveVariableNames',true)
Y = readtable('TotalComplaintsRF.xlsx','PreserveVariableNames',true)
t = templateTree('NumVariablesToSample','all',...
'PredictorSelection','interaction-curvature','Surrogate','on');
rng(1); % For reproducibility
Mdl = fitrensemble(X,Y,'Method','Bag','NumLearningCycles',200, ...
'Learners',t);
yHat = oobPredict(Mdl);
R2 = corr(Mdl.Y,yHat)^2
impOOB = oobPermutedPredictorImportance(Mdl);
figure
bar(impOOB)
title('Unbiased Predictor Importance Estimates')
xlabel('Predictor variable')
ylabel('Importance')
h = gca;
h.XTickLabel = Mdl.PredictorNames;
h.XTickLabelRotation = 45;
h.TickLabelInterpreter = 'none';

Accepted Answer

Pratyush Roy on 9 Apr 2021
Hi,
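oobPermutedPredictorImportance divides each predictor's mean increase in out-of-bag error by the standard deviation of that increase across the trees (this kind of normalization is common practice in the field), so the values are not confined to the range 0 to 1 and can even be negative. However, you can rescale the importance values so that they are non-negative and sum to 1, for example (here imp starts as the output of oobPermutedPredictorImportance, which is impOOB in your code):
imp = impOOB;
imp(imp<0) = 0; % treat negative importance as zero
imp = imp./sum(imp); % shares sum to 1; multiply by 100 for percentages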
Hope this helps!
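
In case it is useful, here is a minimal sketch of the rescaling applied end to end and plotted as percentages. The importance values and predictor names below are made up for illustration so that the snippet runs on its own; in your script you would use impOOB and Mdl.PredictorNames instead. The result is each predictor's share of the total (clipped) importance, not a percentage of variance explained.

```matlab
% Hypothetical importance values standing in for impOOB from the question
impOOB = [0.8 -0.1 1.5 0.3];
% Hypothetical predictor names standing in for Mdl.PredictorNames
predictorNames = {'Age','Income','Education','Household'};

imp = impOOB;
imp(imp < 0) = 0;                 % treat negative importance as zero
impPct = 100 * imp ./ sum(imp);   % each predictor's share, summing to 100

figure
bar(impPct)
title('Relative Predictor Importance')
xlabel('Predictor variable')
ylabel('Relative importance (%)')
h = gca;
h.XTick = 1:numel(impPct);        % one tick per bar so the labels line up
h.XTickLabel = predictorNames;
h.XTickLabelRotation = 45;
h.TickLabelInterpreter = 'none';
```

Note that if every importance value is zero or negative, sum(imp) is 0 and the division returns NaN, so you may want to check for that case before rescaling.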
