MATLAB Answers

How does crossval (for k-fold CV) work in MATLAB after training a classifier?

Sanjay Yadav on 7 Mar 2016
Commented: seung ho yeom on 1 Feb 2019
To my knowledge, k-fold CV is a technique for model selection in which the data is first divided into k folds, with each fold stratified by class. Now, consider the following code:
trainedClassifier = fitcnb(X, Y);
partitionedModel = crossval(trainedClassifier, 'KFold', 10);
accuracy = 1 - kfoldLoss(partitionedModel, 'LossFun', 'ClassifError');
The above code first trains a naive Bayes classifier on the data in matrix X using the class labels in vector Y. The trainedClassifier is then passed to the function crossval(). My question is very simple. Does this line of code
partitionedModel = crossval(trainedClassifier, 'KFold', 10);
divide the matrix X into ten folds, train on 9 folds, and test on the remaining fold, repeating this 10 times so that each fold serves as the test set once? Or does it simply take the trainedClassifier that was trained in the previous line on the whole matrix X and test it on each fold? I ask because I can see fitcnb being called only once. Does the function crossval() handle the retraining internally? If it doesn't, then the training is being done on the whole data instead of on the 9 folds in each iteration, as cross-validation requires.
Fellow members of the community, I will be highly obliged if this doubt of mine can be cleared. Thanking you in anticipation.


Answers (3)

Don Mathis on 30 Nov 2018
The answer is that it divides the dataset into 10 folds and trains the model 10 times, each time on 9 folds, using the remaining fold as the test set. Apart from the training data stored in the model, the only information taken from 'trainedClassifier' is the set of hyperparameter values, which are reused in each of the 10 trainings. 'fitcnb' is not called 10 times; 'ClassificationNaiveBayes.fit' is.
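One way to check this yourself is to do the 10 trainings by hand on the same partition and compare the result with kfoldLoss. The following is only a sketch under some assumptions: it uses the fisheriris sample data in place of your X and Y, passes an explicit cvpartition object to crossval through the 'CVPartition' option so that both routes see identical folds, and calls fitcnb with default settings in the manual loop. The variable names cvp, foldModel, numWrong, manualLoss and numFoldModels are made up for illustration.

```matlab
% Sketch only: fisheriris stands in for the asker's X and Y (an assumption).
load fisheriris                          % meas: 150x4 features, species: 150x1 labels
X = meas;
Y = species;

rng(1);                                  % make the fold assignment repeatable
cvp = cvpartition(Y, 'KFold', 10);       % stratified 10-fold partition

% Built-in route, as in the question, but with an explicit partition
trainedClassifier = fitcnb(X, Y);
partitionedModel  = crossval(trainedClassifier, 'CVPartition', cvp);
builtinLoss = kfoldLoss(partitionedModel, 'LossFun', 'ClassifError');

% Manual route: refit on 9 folds, test on the held-out fold, 10 times
numWrong = 0;
for k = 1:cvp.NumTestSets
    trainIdx = training(cvp, k);         % logical index of the 9 training folds
    testIdx  = test(cvp, k);             % logical index of the held-out fold
    foldModel = fitcnb(X(trainIdx, :), Y(trainIdx));
    predicted = predict(foldModel, X(testIdx, :));
    numWrong  = numWrong + sum(~strcmp(predicted, Y(testIdx)));
end
manualLoss = numWrong / numel(Y);        % pooled misclassification rate

numFoldModels = numel(partitionedModel.Trained);   % one compact model per fold
fprintf('crossval loss: %.4f, manual loss: %.4f, fold models: %d\n', ...
    builtinLoss, manualLoss, numFoldModels);
```

If the two losses agree, that is consistent with crossval retraining on each set of 9 folds rather than reusing the model fitted to all of X. The Trained property holds the per-fold compact models, so you can also inspect partitionedModel.Trained{1} directly and compare its fitted parameters with those of trainedClassifier.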

  11 Comments

seung ho yeom on 17 Jan 2019
Yes, I understand now. Thank you.
I ran the code above directly and verified that the optimized hyperparameter sigma has the same value in all 10 partitioned models. However, the other hyperparameter, the kernel length scales, had different values in each of the 10 partitioned models. What happened?
