Oversampling method (SMOTE)

Maryam Samami on 14 Aug 2017
Edited: Walter Roberson on 4 Jul 2018
Dear all, I have used SMOTE (an oversampling method for balancing a data set), but the balanced data set it returns has no label column. The number of rows increases after balancing, but the label column is not part of the result. The original data set is 1000*25. The balanced data set is 2200*24, without the label column. The labels are returned separately in the "final_labels" output. It is 2200*1, but it contains only label 1. It should contain both labels 1 and 2.
I would be very grateful if anyone could guide me. Any suggestion will be appreciated.
------------------------------------------------
This is the script I use to balance the data set.
-----------------------------------------------------
load creditgerman.mat
a=creditgerman;
[n,m]=size(a);
total_rows=(1:n);
original_features=a(:,1:m-1);
original_mark=a(:,m);
[creditgerman_balanced_SMOTE,final_labels]=SMOTE(original_features, original_mark);
--------------------------------------------------------------------------
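If a single matrix with the label in the last column is wanted, as in the original 1000*25 layout, the two outputs of the call above can be concatenated. A minimal sketch on toy data follows; the values and the names `features`, `labels` and `balanced` are made up for illustration and are not taken from creditgerman.mat.

```matlab
% Toy stand-ins for the two outputs of SMOTE (hypothetical values).
features = [0.1 0.2; 0.3 0.4; 0.5 0.6; 0.7 0.8];   % 4 samples x 2 features
labels   = [1; 1; 2; 2];                            % one label per row

% Append the labels as the last column, mirroring the original layout.
balanced = [features, labels];

% Count how many rows carry each label.
n_label1 = sum(labels == 1);
n_label2 = sum(labels == 2);
fprintf('label 1: %d rows, label 2: %d rows\n', n_label1, n_label2);
```

With the real outputs the same pattern would read `[creditgerman_balanced_SMOTE, final_labels]`, and the two counts give a quick check of whether both classes survived the balancing.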
And this is the SMOTE code I used.
function [final_features, final_mark] = SMOTE(original_features, original_mark)
    ind = find(original_mark == 2);
    % P = candidate points
    P = original_features(ind, :);
    T = P';
    % X = Complete Feature Vector
    X = T;
    % Finding the 5 positive nearest neighbours of all the positive blobs
    I = nearestneighbour(T, X, 'NumberOfNeighbours', 6);
    I = I';
    [r, c] = size(I);
    S = [];
    th = 0.3;
    for i = 1:r
        for j = 2:c
            index = I(i, j);
            new_P = P(i, :) + ((P(index, :) - P(i, :)) * rand);
            S = [S; new_P];
        end
    end
    original_features = [original_features; S];
    [r, c] = size(S);
    mark = ones(r, 1);
    original_mark = [original_mark; mark];
    train_incl = ones(length(original_mark), 1);
    I = nearestneighbour(original_features', original_features', 'NumberOfNeighbours', 6);
    I = I';
    for j = 1:length(original_mark)
        neighbors = I(j, 2:6);
        len = length(find(original_mark(neighbors) ~= original_mark(j, 1)));
        if (len >= 2)
            if (original_mark(j, 1) == 1)
                train_incl(neighbors(original_mark(neighbors) ~= original_mark(j, 1)), 1) = 0;
            else
                train_incl(j, 1) = 0;
            end
        end
    end
    final_features = original_features(train_incl == 1, :);
    final_mark = original_mark(train_incl == 1, :);
end
-----------------------------------------------------------
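One possible explanation for seeing only label 1, offered as a guess and not checked against creditgerman.mat: the function selects class 2 as the minority (`original_mark == 2`) but labels every synthetic row with `ones(r,1)`, that is, as class 1. The toy sketch below compares that labelling with an assumed fix; the names `minority_label`, `mark_posted` and `mark_fixed` are hypothetical and the label vector is invented.

```matlab
% Toy label vector: 7 rows of class 1 and 3 rows of class 2 (the minority here).
original_mark  = [ones(7, 1); 2 * ones(3, 1)];
minority_label = 2;                          % assumed minority class
n_minority     = sum(original_mark == minority_label);
n_synthetic    = n_minority * 5;             % 5 neighbours per minority row

% As posted: every synthetic row is labelled 1.
mark_posted = ones(n_synthetic, 1);
% Assumed fix: give synthetic rows the label of the class they were built from.
mark_fixed  = minority_label * ones(n_synthetic, 1);

labels_posted = [original_mark; mark_posted];
labels_fixed  = [original_mark; mark_fixed];

fprintf('posted: %d rows of label 2\n', sum(labels_posted == 2));
fprintf('fixed:  %d rows of label 2\n', sum(labels_fixed == 2));
```

The cleaning loop has a similar hard-coded test, `original_mark(j,1) == 1`, which looks written for a data set whose minority class is 1; with class 2 as the minority it would presumably need to test `== 2`. If the data has the usual 700/300 split of the German credit data, the posted code would add 300*5 = 1500 rows labelled 1, and dropping all 300 original class 2 rows in the cleaning loop would leave 2500 - 300 = 2200 rows, which is consistent with the size reported above.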

Answers (0)
