How to use KNN to classify data in MATLAB?

Pedro on 11 Jul 2014
Commented: merlin toche on 2 Dec 2022
I'm having trouble understanding how K-NN classification works in MATLAB. Here's the problem: I have a large dataset (65 features for over 1500 subjects) and its respective class labels (0 or 1). According to what's been explained to me, I have to divide the data into training, test, and validation subsets to perform supervised training, and then classify with K-NN. First of all, what's the best ratio for dividing into the 3 subgroups (1/3 of the dataset each?).
I've looked into the ClassificationKNN/fitcknn functions, as well as the crossval function (ideally to divide the data), but I'm really not sure how to use them.
To sum up, I want to:
- divide the data into 3 groups
- "train" the KNN (I know it's not a method that requires training, but the equivalent of training) with the training subset
- classify the test subset and get its classification error/performance
- understand what the point of having a validation set is
I hope you can help me, thank you in advance
  1 comment
merlin toche on 2 Dec 2022
Hi,
please, can anyone help me with how to create an open-circuit fault inverter using Simulink? (I need a tutorial, please.)


Answers (1)

Shashank Prasanna on 11 Jul 2014
You need a validation set if you want to tune certain parameters in the classifier. For example, if you were to use an SVM with an RBF kernel, you could choose the kernel parameters using the validation set. A model trained on the training data is then tested on the test data to see how it performs on unseen data. If these concepts are not clear, I recommend reading some literature on this topic before proceeding.
The approach you mention is holdout. There are other ways to do cross-validation. For holdout, how much to divide the data is up to you and, of course, depends on the nature of the dataset.
Take a look at how you can use cvpartition to try different cross validation techniques:
All the functions have examples you can run directly.
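A minimal sketch of a holdout split with cvpartition, sized like your dataset (the variable names X and y are illustrative, not from your code):

```matlab
% Illustrative sketch: stratified holdout split with cvpartition
X = rand(1500, 65);          % 1500 subjects x 65 features (placeholder data)
y = randi([0 1], 1500, 1);   % binary class labels (placeholder)

c = cvpartition(y, 'HoldOut', 0.3);   % 70% train, 30% test, stratified by y
Xtrain = X(training(c), :);  ytrain = y(training(c));
Xtest  = X(test(c), :);      ytest  = y(test(c));

mdl = fitcknn(Xtrain, ytrain, 'NumNeighbors', 5);
err = loss(mdl, Xtest, ytest);   % misclassification rate on held-out data
```

Passing the label vector y (rather than a count) as the first argument makes the split stratified, so both partitions keep roughly the same class proportions.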
  4 comments
Pedro on 11 Jul 2014
I think I was able to do it, but, if that's not asking too much, could you see if I missed something? This is my code, for a random case:
nfeats = 60; ninds = 1000;
trainRatio = 0.8; valRatio = 0.1; testRatio = 0.1;
kmax = 100; % for instance...
data = randi(100, nfeats, ninds);
class = randi(2, 1, ninds);
[trainInd, valInd, testInd] = dividerand(ninds, trainRatio, valRatio, testRatio);
train = data(:, trainInd);
test  = data(:, testInd);
val   = data(:, valInd);
train_class = class(:, trainInd);
test_class  = class(:, testInd);
val_class   = class(:, valInd);
precisionmax = 0;
koptimal = 0;
for know = 1:kmax
    % is it the same thing to use knnclassify or fitcknn+predict??
    predicted_class = knnclassify(val', train', train_class', know);
    mdl = fitcknn(train', train_class', 'NumNeighbors', know);
    label = predict(mdl, val');
    consistency = sum(label == val_class') / length(val_class);
    if consistency > precisionmax
        precisionmax = consistency;
        koptimal = know;
    end
end
% retrain with the best k found on the validation set, then evaluate on the test set
mdl_final = fitcknn(train', train_class', 'NumNeighbors', koptimal);
label_final = predict(mdl_final, test');
consistency_final = sum(label_final == test_class') / length(test_class);
Thank you very much for all your help, Shashank Prasanna.
Pedro on 11 Jul 2014
I don't know if there's a better way of comparing the true test_label and the predicted label...
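One common alternative (a sketch, reusing the variables from the code above) is to use confusionmat or the fitted model's loss method instead of comparing the label vectors by hand:

```matlab
% Sketch using variables from the code above (mdl_final, test, test_class, label_final)
C = confusionmat(test_class', label_final);      % rows = true class, cols = predicted
testErr = loss(mdl_final, test', test_class');   % misclassification rate
accuracy = 1 - testErr;
```

The confusion matrix also shows which class is being misclassified, which a single accuracy number hides.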

