MATLAB Answers

Prior probability in binary fitcsvm to take into account different class proportions in training and test sets

Hello,
I am working on a binary classification problem using an SVM. For unavoidable reasons, my training and test sets have different class proportions (roughly 1:3 vs. 1:5). If I pass the test-set prior probabilities through the 'Prior' option when training with fitcsvm, will the model take this difference into account when predicting on the test set?


Accepted Answer

Carl on 10 Oct 2017
Edited: Carl on 10 Oct 2017
Hi Alexis. Specifying a value for 'Prior' affects how the SVM is trained, which in turn changes its predictions on the test set. However, the values you pass to 'Prior' should not necessarily be the class proportions of your test set; they should be the realistic prior probabilities of the classes in the population you want to classify.
Problems can arise when the real prior probabilities differ significantly from the class proportions in your training set. If your training set is representative of the population, you should not have to specify 'Prior' at all, because by default fitcsvm uses the empirical class frequencies of the training data.
This is a more general problem known as class imbalance, or imbalanced data sets. You can see the Answers post below for previous suggestions on how to account for this problem:
https://www.mathworks.com/matlabcentral/answers/11549-leraning-classification-with-most-training-samples-in-one-category
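As a minimal sketch of passing realistic priors to fitcsvm: the synthetic data, the variable names, and the 1:5 population ratio below are illustrative assumptions, not taken from the original question. 'ClassNames' is set explicitly so that the order of the numeric 'Prior' vector is unambiguous.

```matlab
% Synthetic two-class training data with roughly 1:3 class proportions
% (illustrative only; replace with your own X and Y).
rng(0);                                   % reproducible random numbers
nPos = 50;
nNeg = 150;
X = [randn(nPos, 2) + 1.5; randn(nNeg, 2)];
Y = [ones(nPos, 1); -ones(nNeg, 1)];

% Assumed realistic class ratio of 1:5, converted to probabilities.
classNames = [1; -1];
ratio      = [1 5];
prior      = ratio / sum(ratio);          % one entry per class, sums to 1

% The elements of 'Prior' follow the class order given in 'ClassNames'.
Mdl = fitcsvm(X, Y, 'ClassNames', classNames, 'Prior', prior);

% Baseline trained with the default ('empirical') prior, for comparison.
MdlEmp = fitcsvm(X, Y, 'ClassNames', classNames);

% Inspect the priors stored in each trained model.
disp(Mdl.Prior);
disp(MdlEmp.Prior);
```

Comparing the predictions of Mdl and MdlEmp on held-out data is one way to see how much the choice of prior actually changes the decision boundary for your problem.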

  1 Comment

Alexis Moscoso Rial on 17 Oct 2017
In that case I'm going to switch the prior probabilities to those of the test set, which are the realistic ones. Thank you very much.
