confusionmat

Compute confusion matrix for classification problem

Syntax

C = confusionmat(group,grouphat)
C = confusionmat(group,grouphat,'Order',grouporder)
[C,order] = confusionmat(___)

Description

C = confusionmat(group,grouphat) returns the confusion matrix C determined by the known and predicted groups in group and grouphat, respectively.

C = confusionmat(group,grouphat,'Order',grouporder) uses grouporder to order the rows and columns of C.

[C,order] = confusionmat(___) also returns the order of the rows and columns of C in the variable order, using any of the input argument combinations in the previous syntaxes.

Examples

Display the confusion matrix for data with two misclassifications and one missing classification.

Create vectors for the known groups and the predicted groups.

g1 = [3 2 2 3 1 1]';	% Known groups
g2 = [4 2 3 NaN 1 1]';	% Predicted groups

Return the confusion matrix.

C = confusionmat(g1,g2)
C = 4×4

     2     0     0     0
     0     1     1     0
     0     0     0     1
     0     0     0     0

The indices of the rows and columns of the confusion matrix C are identical and arranged by default in the sorted order of [g1;g2], that is, (1,2,3,4).

The confusion matrix shows that the two data points known to be in group 1 are classified correctly. For group 2, one of the data points is misclassified into group 3. Also, one of the data points known to be in group 3 is misclassified into group 4. confusionmat treats the NaN value in the grouping variable g2 as a missing value and does not include it in the rows and columns of C.

Plot the confusion matrix as a confusion matrix chart by using confusionchart.

confusionchart(C);

You do not need to calculate the confusion matrix first and then plot it. Instead, plot a confusion matrix chart directly from the true and predicted labels by using confusionchart.

cm = confusionchart(g1,g2)

cm = 
  ConfusionMatrixChart with properties:

    NormalizedValues: [4x4 double]
         ClassLabels: [4x1 double]

  Show all properties

The ConfusionMatrixChart object stores the numeric confusion matrix in the NormalizedValues property and the classes in the ClassLabels property. Display these properties using dot notation.

cm.NormalizedValues
ans = 4×4

     2     0     0     0
     0     1     1     0
     0     0     0     1
     0     0     0     0

cm.ClassLabels
ans = 4×1

     1
     2
     3
     4

Display the confusion matrix for data with two misclassifications and one missing classification, and specify the group order.

Create vectors for the known groups and the predicted groups.

g1 = [3 2 2 3 1 1]';	% Known groups
g2 = [4 2 3 NaN 1 1]';	% Predicted groups

Specify the group order and return the confusion matrix.

C = confusionmat(g1,g2,'Order',[4 3 2 1])
C = 4×4

     0     0     0     0
     1     0     0     0
     0     1     1     0
     0     0     0     2

The indices of the rows and columns of the confusion matrix C are identical and arranged in the order specified by the group order, that is, (4,3,2,1).

The second row of the confusion matrix C shows that one of the data points known to be in group 3 is misclassified into group 4. The third row of C shows that one of the data points belonging to group 2 is misclassified into group 3, and the fourth row shows that the two data points known to be in group 1 are classified correctly. confusionmat treats the NaN value in the grouping variable g2 as a missing value and does not include it in the rows and columns of C.
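
Specifying a group order only permutes the rows and columns, so the reordered matrix can be recovered from the default-order matrix by indexing. The following is a minimal sketch of that relationship, not a description of how confusionmat works internally. The names Cdefault, defaultOrder, newOrder, perm, and Creordered are illustrative, and the matrix values are copied from the first example.

```matlab
% Default-order confusion matrix from the first example (groups 1,2,3,4).
Cdefault = [2 0 0 0
            0 1 1 0
            0 0 0 1
            0 0 0 0];

defaultOrder = [1 2 3 4];   % sorted order of [g1;g2]
newOrder     = [4 3 2 1];   % order passed with 'Order'

% Position of each requested group within the default order.
[~,perm] = ismember(newOrder,defaultOrder);

% Apply the same permutation to rows and columns.
Creordered = Cdefault(perm,perm);
disp(Creordered)
```

Because the same permutation is applied to both dimensions, the diagonal entries (correct classifications) stay on the diagonal.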

Perform classification on a sample of the fisheriris data set and display the confusion matrix for the resulting classification.

Load Fisher's iris data set.

load fisheriris

Randomize the measurements and groups in the data.

rng(0,'twister'); % For reproducibility
numObs = length(species);
p = randperm(numObs);
meas = meas(p,:);
species = species(p);

Train a discriminant analysis classifier by using measurements in the first half of the data.

half = floor(numObs/2);
training = meas(1:half,:);
trainingSpecies = species(1:half);
Mdl = fitcdiscr(training,trainingSpecies);

Predict labels for the measurements in the second half of the data by using the trained classifier.

sample = meas(half+1:end,:);
grouphat = predict(Mdl,sample);

Specify the group order and display the confusion matrix for the resulting classification.

group = species(half+1:end);
[C,order] = confusionmat(group,grouphat,'Order',{'setosa','versicolor','virginica'})
C = 3×3

    29     0     0
     0    22     2
     0     0    22

order = 3x1 cell array
    {'setosa'    }
    {'versicolor'}
    {'virginica' }

The confusion matrix shows that the measurements belonging to setosa and virginica are classified correctly, while two of the measurements belonging to versicolor are misclassified as virginica. The output order contains the order of the rows and columns of the confusion matrix in the sequence specified by the group order {'setosa','versicolor','virginica'}.
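
Summary statistics can be read directly off the confusion matrix. As a hedged sketch (the variable names accuracy, recall, and precision are illustrative, and the matrix values are copied from the output above), overall accuracy and the per-class rates could be computed like this:

```matlab
% Confusion matrix from the example above
% (rows: known class, columns: predicted class).
C = [29  0  0
      0 22  2
      0  0 22];

accuracy  = trace(C)/sum(C(:));    % fraction of observations classified correctly
recall    = diag(C)./sum(C,2);     % per class: correct / known (row sums)
precision = diag(C)./sum(C,1)';    % per class: correct / predicted (column sums)
```

Here the versicolor recall is 22/24 because two of its 24 test observations are predicted as virginica, and the virginica precision is 22/24 for the same reason.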

Perform classification on a tall array of the fisheriris data set, compute a confusion matrix for the known and predicted tall labels by using the confusionmat function, and plot the confusion matrix by using the confusionchart function.

When you execute calculations on tall arrays, the default execution environment uses either the local MATLAB session or a local parallel pool (if you have Parallel Computing Toolbox™). You can use the mapreducer function to change the execution environment.

Load Fisher's iris data set.

load fisheriris

Convert the in-memory arrays meas and species to tall arrays.

tx = tall(meas);
ty = tall(species);

Find the number of observations in the tall array.

numObs = gather(length(ty));   % gather collects tall array into memory
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.23 sec

Set the random number stream for reproducibility, and randomly select training samples.

s = RandStream('mlfg6331_64'); % For reproducibility
numTrain = floor(numObs/2);
[txTrain,trIdx] = datasample(s,tx,numTrain,'Replace',false);
tyTrain = ty(trIdx); 

Fit a decision tree classifier model on the training samples.

mdl = fitctree(txTrain,tyTrain); 
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.4 sec
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.78 sec
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.45 sec
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.37 sec
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.35 sec

Predict labels for the test samples by using the trained model.

txTest = tx(~trIdx,:);
label = predict(mdl,txTest);

Compute the confusion matrix for the resulting classification.

tyTest = ty(~trIdx);
[C,order] = confusionmat(tyTest,label)
C =

  3×3 tall double matrix

    20     0     0
     0    27     2
     0     1    25


order =

  3×1 tall cell array

    {'setosa'    }
    {'versicolor'}
    {'virginica' }

The confusion matrix shows that two measurements in the versicolor class are misclassified as virginica, and one measurement in the virginica class is misclassified as versicolor. All the measurements belonging to setosa are classified correctly.
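
The misclassified pairs can also be listed programmatically. The sketch below assumes the values of C and order have been copied from the output above into ordinary in-memory arrays (for tall results, this would mean calling gather first). The names offDiag, trueIdx, predIdx, count, and numErrors are illustrative.

```matlab
% Values copied from the tall-array output above.
C = [20  0  0
      0 27  2
      0  1 25];
order = {'setosa';'versicolor';'virginica'};

% Zero out the diagonal so only misclassifications remain.
offDiag = C - diag(diag(C));
[trueIdx,predIdx,count] = find(offDiag);

% Report each (known, predicted) pair with a nonzero count.
for k = 1:numel(count)
    fprintf('%d %s observation(s) predicted as %s\n', ...
        count(k),order{trueIdx(k)},order{predIdx(k)});
end
numErrors = sum(count);   % total number of misclassified observations
```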

To plot the confusion matrix, gather the tall labels into memory and pass them to confusionchart, which computes and plots the matrix.

cm = confusionchart(gather(tyTest),gather(label))
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.021 sec
Evaluating tall expression using the Parallel Pool 'local':
Evaluation completed in 0.021 sec

cm = 
  ConfusionMatrixChart with properties:

    NormalizedValues: [3×3 double]
         ClassLabels: [3×1 categorical]

  Show all properties

Input Arguments

group: Known groups for categorizing observations, specified as a numeric vector, logical vector, character array, string array, cell array of character vectors, or categorical vector.

group is a grouping variable of the same type as grouphat. The group argument must have the same number of observations as grouphat, as described in Grouping Variables. The confusionmat function treats character arrays and string arrays as cell arrays of character vectors. Additionally, confusionmat treats NaN, empty, and 'undefined' values in group as missing values and does not count them as distinct groups or categories.

Example: {'Male','Female','Female','Male','Female'}

Data Types: single | double | logical | char | string | cell | categorical

grouphat: Predicted groups for categorizing observations, specified as a numeric vector, logical vector, character array, string array, cell array of character vectors, or categorical vector.

grouphat is a grouping variable of the same type as group. The grouphat argument must have the same number of observations as group, as described in Grouping Variables. The confusionmat function treats character arrays and string arrays as cell arrays of character vectors. Additionally, confusionmat treats NaN, empty, and 'undefined' values in grouphat as missing values and does not count them as distinct groups or categories.
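
To illustrate what it means for character arrays and string arrays to be treated as cell arrays of character vectors, the following sketch uses cellstr to perform an equivalent conversion. Using cellstr is an assumption made for illustration, not a statement about confusionmat internals, and the variable names are hypothetical.

```matlab
% Character array: each row is one label (rows must have equal length).
labelsChar = ['cat';'dog';'cat'];

% String array: each element is one label.
labelsStr = ["cat";"cat";"dog"];

% Equivalent cell arrays of character vectors, one cell per observation.
cellFromChar = cellstr(labelsChar);
cellFromStr  = cellstr(labelsStr);

% Both forms describe the same number of observations.
numObsChar = numel(cellFromChar);
numObsStr  = numel(cellFromStr);
```

Note that a character array contributes one observation per row, not one per character.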

Example: [1 0 0 1 0]

Data Types: single | double | logical | char | string | cell | categorical

grouporder: Group order, specified as a numeric vector, logical vector, character array, string array, cell array of character vectors, or categorical vector.

grouporder is a grouping variable containing all the distinct elements in group and grouphat. Specify grouporder to define the order of the rows and columns of C. If grouporder contains elements that are not in group or grouphat, the corresponding entries in C are 0.

By default, the group order depends on the data type of s = [group;grouphat]:

  • For numeric and logical vectors, the order is the sorted order of s.

  • For categorical vectors, the order is the order returned by categories(s).

  • For other data types, the order is the order of first appearance in s.
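
The three default-order rules above can be mimicked with standard functions. This is a sketch of the documented behavior using unique and categories, not the function's actual implementation; the sample label vectors sCell and sCat are made up for illustration.

```matlab
% Numeric labels: sorted order, with NaN (missing) values removed first.
g1 = [3 2 2 3 1 1]';
g2 = [4 2 3 NaN 1 1]';
s = [g1;g2];
numericOrder = unique(s(~isnan(s)));

% Cell array of character vectors: order of first appearance.
sCell = {'virginica';'setosa';'virginica';'versicolor'};
cellOrder = unique(sCell,'stable');

% Categorical labels: order of the defined categories.
sCat = categorical({'low';'high';'low'},{'low','medium','high'});
catOrder = categories(sCat);
```

In this sketch, categories returns every defined category of sCat, including 'medium', which no element uses.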

Example: 'Order',{'setosa','versicolor','virginica'}

Data Types: single | double | logical | char | string | cell | categorical

Output Arguments

C: Confusion matrix, returned as a square matrix with size equal to the total number of distinct elements in the group and grouphat arguments. C(i,j) is the count of observations known to be in group i but predicted to be in group j.
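
The definition of C(i,j) can be reproduced by hand for numeric labels. The following is a minimal sketch that counts (known, predicted) pairs with accumarray, under the assumption that observations with a missing label in either vector are dropped. It is not the function's implementation, and the names labels, valid, rowIdx, colIdx, and Cmanual are illustrative.

```matlab
g1 = [3 2 2 3 1 1]';      % known groups
g2 = [4 2 3 NaN 1 1]';    % predicted groups

% Distinct non-missing labels across both vectors, in sorted order.
s = [g1;g2];
labels = unique(s(~isnan(s)));
n = numel(labels);

% Keep only observations where both labels are present.
valid = ~isnan(g1) & ~isnan(g2);
[~,rowIdx] = ismember(g1(valid),labels);   % row index: known group
[~,colIdx] = ismember(g2(valid),labels);   % column index: predicted group

% Count each (known, predicted) pair.
Cmanual = accumarray([rowIdx colIdx],1,[n n]);
disp(Cmanual)
```

The result matches the 4×4 matrix shown in the first example, and its entries sum to 5 because the observation with a NaN prediction is not counted.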

The rows and columns of C follow the same group order. By default, the group order depends on the data type of s = [group;grouphat]:

  • For numeric and logical vectors, the order is the sorted order of s.

  • For categorical vectors, the order is the order returned by categories(s).

  • For other data types, the order is the order of first appearance in s.

To change the order, specify grouporder.

The confusionmat function treats NaN, empty, and 'undefined' values in the grouping variables as missing values and does not include them in the rows and columns of C.

order: Order of rows and columns in C, returned as a numeric vector, logical vector, categorical vector, or cell array of character vectors. If group and grouphat are character arrays, string arrays, or cell arrays of character vectors, then the variable order is a cell array of character vectors. Otherwise, order is of the same type as group and grouphat.

Alternative Functionality

  • Use confusionchart to calculate and plot a confusion matrix. Additionally, confusionchart displays summary statistics about your data and sorts the classes of the confusion matrix according to the class-wise precision (positive predictive value), class-wise recall (true positive rate), or total number of correctly classified observations.

Extended Capabilities

Introduced in R2008b