Misclassification Costs in Classification Learner App
By default, the Classification Learner app creates models that assign the same penalty to all misclassifications during training. For a given observation, the app assigns a penalty of 0 if the observation is classified correctly and a penalty of 1 if the observation is classified incorrectly. In some cases, this assignment is inappropriate. For example, suppose you want to classify patients as either healthy or sick. The cost of misclassifying a sick person as healthy might be five times the cost of misclassifying a healthy person as sick. For cases where you know the cost of misclassifying observations of one class into another, and the costs vary across the classes, specify the misclassification costs before training your models.
Note
Custom misclassification costs are not supported for logistic regression models.
Specify Misclassification Costs
In the Classification Learner app, in the Options section of the Learn tab, select Costs. The app opens a dialog box that shows the default misclassification costs (cost matrix) as a table with row and column labels determined by the classes in the response variable. The rows of the table correspond to the true classes, and the columns correspond to the predicted classes. You can interpret the cost matrix in this way: the entry in row i and column j is the cost of misclassifying ith class observations into the jth class. The diagonal entries of the cost matrix must be 0, and the off-diagonal entries must be nonnegative real numbers.
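As a rough illustration of this layout, the patient example from the introduction could be written as a 2-by-2 matrix. The class order (healthy first, then sick) and the variable names are assumptions made for this sketch; in the app, the class order is the one shown in the dialog box.

```matlab
% Hypothetical class order: row/column 1 = healthy, row/column 2 = sick.
classNames = ["healthy" "sick"];

% Rows are true classes, columns are predicted classes.
% Misclassifying a sick patient as healthy costs 5;
% misclassifying a healthy patient as sick costs 1.
costMat = [0 1
           5 0];

% Entry (i,j): cost of classifying a true class i observation into class j.
costSickAsHealthy = costMat(classNames == "sick", classNames == "healthy");
costHealthyAsSick = costMat(classNames == "healthy", classNames == "sick");
```

The diagonal stays at 0 because correct classifications carry no penalty.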
You can specify your own misclassification costs in two ways: by entering values directly into the table in the dialog box or by importing a workspace variable that contains the cost values.
Note
A scaled version of the cost matrix gives the same classification results (for example, confusion matrix and accuracy), but with a different total misclassification cost. That is, if CostMat is the misclassification cost matrix and a is a positive, real scalar, then a model trained with the cost matrix a*CostMat has the same confusion matrix as the same model trained with CostMat.
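The effect on the total cost can be sketched with plain arithmetic. The snippet below holds a confusion matrix fixed and applies the total cost formula sum(CostMat.*ConfusionMat,"all") with and without scaling. All numbers are hypothetical, and no model is trained here.

```matlab
% Hypothetical values for illustration only.
CostMat = [0 1; 5 0];
ConfusionMat = [90 10; 4 46];   % rows: true class, columns: predicted class
a = 3;                          % positive real scale factor

% Total cost with the original and the scaled cost matrix.
totalCost = sum(CostMat .* ConfusionMat, "all");
scaledTotalCost = sum((a*CostMat) .* ConfusionMat, "all");

% With the confusion matrix unchanged, the total cost scales by a.
costRatio = scaledTotalCost / totalCost;
```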
Enter Costs Directly in Dialog Box
In the misclassification costs dialog box, double-click an entry in the table that you want to edit. Delete the value and type the correct misclassification cost for the entry. When you are done editing the table, click Save and Apply to save your changes. The changes apply to all existing draft models and to any new draft models you create using the Models gallery on the Learn tab.
Import Workspace Variable Containing Costs
In the misclassification costs dialog box, click Import from Workspace. The app opens a dialog box for importing costs from a variable in the MATLAB® workspace.
From the Cost variable list, select the cost matrix or structure that contains the misclassification costs.
Cost matrix — The matrix must contain the misclassification costs. The diagonal entries must be 0, and the off-diagonal entries must be nonnegative real numbers. By default, the app uses the class order shown in the previous misclassification costs dialog box to interpret the cost matrix values.
To specify the order of the classes in the cost matrix, create a separate workspace variable containing the class names in the correct order. In the import dialog box, select the appropriate variable from the Class order in cost variable list. The workspace variable containing the class names must be a categorical vector, logical vector, numeric vector, string array, or cell array of character vectors. The class names must match (in spelling and capitalization) the class names in the response variable.
Structure — The structure must contain the fields ClassificationCosts and ClassNames with these specifications:
ClassificationCosts — Matrix that contains misclassification costs.
ClassNames — Names of the classes. The order of the classes in ClassNames determines the order of the rows and columns of ClassificationCosts. The variable ClassNames must be a categorical vector, logical vector, numeric vector, string array, or cell array of character vectors. The class names must match (in spelling and capitalization) the class names in the response variable.
After specifying the cost variable and the class order in the cost variable, click Import. The app updates the table in the misclassification costs dialog box.
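The two kinds of workspace variables described in these steps might be prepared as follows before opening the import dialog box. This is a minimal sketch: the class names and variable names are hypothetical, and the class names must be replaced with the ones in your response variable.

```matlab
% Option 1: a cost matrix plus a separate class-order variable.
% Hypothetical class names; they must match the response variable exactly.
classOrder = ["healthy"; "sick"];
costMat = [0 1
           5 0];

% Option 2: one structure that carries both required fields.
costStruct.ClassificationCosts = costMat;
costStruct.ClassNames = classOrder;

% Quick check of the stated requirements: zero diagonal, nonnegative
% entries, and one row and one column per class.
isValid = all(diag(costMat) == 0) && all(costMat(:) >= 0) && ...
    isequal(size(costMat), numel(classOrder) * [1 1]);
```

With Option 1, select costMat in the Cost variable list and classOrder in the Class order in cost variable list. With Option 2, select costStruct in the Cost variable list.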
After you specify a cost matrix that differs from the default, the app updates the Summary tab of existing draft models. In the Summary tab, the app displays a Misclassification Costs: Custom section. For models that use the default misclassification costs, the app displays a Misclassification Costs: Default section.
You can click Misclassification Costs: Custom to expand the section and view the table of misclassification costs.
Assess Model Performance
After specifying misclassification costs, you can train and tune your models as usual. However, using custom misclassification costs can change how you assess the performance of a model. For example, instead of choosing the model with the best accuracy, choose a model that has good accuracy and a low total misclassification cost. The total misclassification cost for a model is sum(CostMat.*ConfusionMat,"all"), where CostMat is the misclassification cost matrix and ConfusionMat is the confusion matrix for the model. The confusion matrix shows how the model classifies observations in each class. See Check Performance Per Class in the Confusion Matrix.
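To make the trade-off concrete, the following sketch compares two models by accuracy and by total misclassification cost. The confusion matrices are made-up numbers for two hypothetical models, not output from the app.

```matlab
% Hypothetical cost matrix (rows: true class, columns: predicted class).
CostMat = [0 1; 5 0];

% Hypothetical confusion matrices for two models on the same 150 observations.
ConfusionMatA = [95 5; 10 40];
ConfusionMatB = [85 15; 3 47];

% Accuracy: correctly classified observations over all observations.
accuracyA = sum(diag(ConfusionMatA)) / sum(ConfusionMatA, "all");
accuracyB = sum(diag(ConfusionMatB)) / sum(ConfusionMatB, "all");

% Total misclassification cost, using the formula given above.
totalCostA = sum(CostMat .* ConfusionMatA, "all");
totalCostB = sum(CostMat .* ConfusionMatB, "all");
```

Here model A is slightly more accurate (0.90 versus 0.88), but model B has the lower total cost (30 versus 55) because it makes fewer of the expensive errors.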
To inspect the total misclassification cost of a trained model, select the model in the Models pane. Right-click the model and select Summary. In the Summary tab, look at the Training Results section. The total misclassification cost is listed below the accuracy of the model.
Misclassification Costs in Exported Model and Generated Code
After you train a model with custom misclassification costs and export it from the app, you can find the custom costs inside the exported model. For example, if you export a tree model as a structure named trainedModel, you can use the following code to access the cost matrix and the order of the classes in the matrix.
trainedModel.ClassificationTree.Cost
trainedModel.ClassificationTree.ClassNames
When you generate MATLAB code for a model trained with custom misclassification costs, the generated code includes a cost matrix that is passed to the training function through the Cost name-value argument.
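The following sketch shows what passing a cost matrix through the Cost name-value argument can look like at the command line. It is not the code the app generates. It assumes Statistics and Machine Learning Toolbox is installed, uses the fisheriris sample data set, and the cost values are hypothetical.

```matlab
% Requires Statistics and Machine Learning Toolbox.
load fisheriris                      % meas (predictors), species (labels)

% Class order that the rows and columns of the cost matrix refer to.
classNames = {'setosa'; 'versicolor'; 'virginica'};

% Hypothetical costs: misclassifying versicolor as virginica costs 5.
costMat = [0 1 1
           1 0 5
           1 1 0];

% Train a classification tree with the custom costs.
tree = fitctree(meas, species, ...
    'ClassNames', classNames, ...
    'Cost', costMat);

% Inspect the stored costs and the class order they refer to.
tree.Cost
tree.ClassNames
```

Specifying ClassNames together with Cost makes the row and column order of the cost matrix explicit instead of relying on the default class order.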
Related Topics
- Train and Compare Classifiers Using Misclassification Costs in Classification Learner App
- Train Classification Models in Classification Learner App
- Select Data for Classification or Open Saved App Session
- Feature Selection and Feature Transformation Using Classification Learner App
- Choose Classifier Options
- Visualize and Assess Classifier Performance in Classification Learner
- Export Classification Model to Predict New Data