How to use the ReliefF algorithm for feature selection?
I want to use the ReliefF algorithm for a feature selection problem. I have a dataset (CNS.mat), and I want to apply ReliefF to it, obtain the top 30 features, and then apply a classifier to the selected features. I read how the algorithm works in the MATLAB Help:
[RANKED,WEIGHT] = relieff(X,Y,K)
[RANKED,WEIGHT] = relieff(X,Y,K,'PARAM1',val1,'PARAM2',val2,...)
I also studied this ReliefF example in the MATLAB Help:
load fisheriris
[ranked,weight] = relieff(meas,species,10)
ranked =
4 3 1 2
weight =
0.1399 0.1226 0.3590 0.3754
But I don't know whether this code works the way I described (selecting the top features and saving them as the input for classification). My aim is to apply ReliefF as a feature selection method on the CNS data and compare its results with other algorithms such as SVM-RFE and InfoGain.
I would be very grateful for your opinions on how to use ReliefF for feature selection.
Answers (2)
MeLearningProgramming
on 23 Jul 2020
Edited: MeLearningProgramming
on 23 Jul 2020
Hey guy,
I am using relieff as well. You have to watch out for how the outputs are arranged.
weight = 0.1399 0.1226 0.3590 0.3754
means that the first parameter (column) of meas has the weight 0.1399: the weights are listed in the original column order (first entry = first parameter of meas).
ranked = 4 3 1 2 does not mean that the first parameter of meas has rank 4.
It lists the column indices from most to least important: column 4 is ranked first, and the number 1 sits in position 3, so the first parameter of meas has rank position 3.
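To make the relationship between the two outputs concrete, here is a minimal sketch using the fisheriris example from the question. The variable names sortedWeights and rankPos are my own, not part of relieff; rankPos is just the inverse of the permutation in ranked.

```matlab
% Fisher iris example from the question, k = 10 nearest neighbors
load fisheriris
[ranked,weight] = relieff(meas,species,10);

% weight is in original column order; ranked lists column indices, best first,
% so indexing weight with ranked lists the weights from best to worst feature
sortedWeights = weight(ranked);

% Invert the permutation: rankPos(j) is the rank position of column j of meas
rankPos = zeros(1,numel(ranked));
rankPos(ranked) = 1:numel(ranked);
```

With ranked = 4 3 1 2 as quoted in the question, rankPos comes out as 3 4 2 1, i.e. the first column of meas is ranked third.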
How to use relieff?
X should be a matrix of size data points x parameters (in my case, for example, 147510x10) and y should be a vector of size data points x 1 (147510x1).
First you should estimate the best k value, like this:
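ParamLabels = {'P1','P2','P3','P4','P5','P6','P7','P8','P9','P10'};
kMax = 200;
RankImportanceIdx = zeros(size(X,2),kMax);    % preallocate: one column per k
RankImportanceWeight = zeros(size(X,2),kMax);
for k=1:kMax %or parfor
[idx,weights] = relieff(X,y,k);
RankImportanceIdx(:,k) = idx';        % column indices, most important first
RankImportanceWeight(:,k) = weights'; % weights in original column order
end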
With a simple plot of RankImportanceWeight against k you can see from which k value on the results stop changing => best k value.
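If you prefer a number over reading it off the plot, one possible check is to look for the last k at which the ranking still changes. This is only a sketch of my own, not something relieff provides: it uses fisheriris as stand-in data, kMax = 50 is an arbitrary upper bound, and sameAsNext and kStable are hypothetical names. It compares the rankings only, not the weights.

```matlab
% Stand-in data; replace with your own X and y
load fisheriris
X = meas; y = species;
kMax = 50;                                  % arbitrary upper bound (assumption)
RankImportanceIdx = zeros(size(X,2),kMax);
for k = 1:kMax
    idx = relieff(X,y,k);
    RankImportanceIdx(:,k) = idx(:);        % ranking for this k as a column
end

% sameAsNext(k) is true when the ranking at k equals the ranking at k+1
sameAsNext = all(RankImportanceIdx(:,1:end-1) == RankImportanceIdx(:,2:end),1);

% Smallest k from which the ranking stays the same up to kMax
lastChange = find(~sameAsNext,1,'last');
if isempty(lastChange)
    kStable = 1;
else
    kStable = lastChange + 1;
end
```

If kStable equals kMax, the ranking was still changing at the end of the sweep, so a larger range of k would be needed before drawing any conclusion.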
In my case, for example, the best k value is 75. Afterwards you could plot the results like this:
plot(RankImportanceWeight(RankImportanceIdx(1:end,75),1:end)','LineWidth',2);
title('Relief algorithm weights vs. k-values','FontWeight','normal')
xlabel('size of k-nearest neighbor'); ylabel('weights');
legend(ParamLabels(RankImportanceIdx(1:end,75)),'Box','off');
set(gca,'FontName','Arial','FontSize',16);
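And/or you could create a table of rank positions, like this:
RankImportanceTbl = cell(numel(ParamLabels),1);
for pidx=1:numel(ParamLabels)
% rank position of parameter pidx at k = 75 (1 = most important)
a = find(strcmp(ParamLabels(RankImportanceIdx(1:end,75)),ParamLabels{pidx}));
RankImportanceTbl{pidx,1} = a;
end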
From this you can choose the 30 best-ranked parameters for your y.
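For the original question (keep the top 30 features, then classify), a rough sketch could look like the following. I don't know the variable names inside CNS.mat, so this uses the ionosphere sample data (34 predictors) as a stand-in; k = 10, the choice of a kNN classifier, and the names nTop, topIdx and Xsel are all assumptions for illustration.

```matlab
% Stand-in data: X is 351x34, Y holds the class labels
load ionosphere
k = 10;          % placeholder; use the k value you found to be stable
nTop = 30;       % number of features to keep

% Rank the features and keep the nTop most important columns
ranked = relieff(X,Y,k);
topIdx = ranked(1:nTop);
Xsel = X(:,topIdx);

% Train any classifier on the reduced data, here kNN with 5-fold cross-validation
mdl = fitcknn(Xsel,Y,'NumNeighbors',5);
cvmdl = crossval(mdl,'KFold',5);
cvErr = kfoldLoss(cvmdl);     % cross-validated misclassification rate
```

Note that ranking the features on the full data set before cross-validating lets the test folds influence the selection. For a fair comparison with SVM-RFE or InfoGain, it is safer to run relieff on the training part of each fold only.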
Hope this helps you adapt it to your problem,
regards,
MLP