
predict

Predict responses for new observations from naive Bayes classification model for incremental learning

Description


label = predict(Mdl,X) returns the predicted responses (or labels) label of the observations in the predictor data X from the naive Bayes classification model for incremental learning Mdl.


[label,Posterior,Cost] = predict(Mdl,X) also returns the posterior probabilities (Posterior) and predicted (expected) misclassification costs (Cost) corresponding to the observations (rows) in X. For each observation in X, the predicted class label corresponds to the minimum expected classification cost among all classes.
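
For example, given a configured model and a predictor matrix (the names Mdl and X here are placeholders), a single call returns all three outputs; the examples below show complete setups.

% Minimal calling sketch; Mdl and X are assumed to already exist.
[label,Posterior,Cost] = predict(Mdl,X);
% label has n rows; Posterior and Cost are n-by-numel(Mdl.ClassNames),
% where n = size(X,1).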

Examples


Load the human activity data set.

load humanactivity

For details on the data set, enter Description at the command line.

Fit a naive Bayes classification model to the entire data set.

TTMdl = fitcnb(feat,actid)
TTMdl = 
  ClassificationNaiveBayes
              ResponseName: 'Y'
     CategoricalPredictors: []
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
           NumObservations: 24075
         DistributionNames: {1×60 cell}
    DistributionParameters: {5×60 cell}



TTMdl is a ClassificationNaiveBayes model object representing a traditionally trained model.

Convert the traditionally trained model to a naive Bayes classification model for incremental learning.

IncrementalMdl = incrementalLearner(TTMdl)
IncrementalMdl = 
  incrementalClassificationNaiveBayes

                    IsWarm: 1
                   Metrics: [1×2 table]
                ClassNames: [1 2 3 4 5]
            ScoreTransform: 'none'
         DistributionNames: {1×60 cell}
    DistributionParameters: {5×60 cell}



IncrementalMdl is an incrementalClassificationNaiveBayes model object prepared for incremental learning.

  • The incrementalLearner function initializes the incremental learner by passing learned conditional predictor distribution parameters to it, along with other information TTMdl learned from the training data.

  • IncrementalMdl is warm (IsWarm is 1), which means that incremental learning functions can start tracking performance metrics.

An incremental learner created from converting a traditionally trained model can generate predictions without further processing.

Predict class labels for all observations using both models.

ttlabels = predict(TTMdl,feat);
illabels = predict(IncrementalMdl,feat);
sameLabels = sum(ttlabels ~= illabels) == 0
sameLabels = logical
   1

Both models predict the same labels for each observation.

Load the human activity data set. Randomly shuffle the data.

load humanactivity
n = numel(actid);
rng(10); % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);

For details on the data set, enter Description at the command line.

Create a naive Bayes classification model for incremental learning; specify the class names. Prepare it for predict by fitting the model to the first 10 observations.

Mdl = incrementalClassificationNaiveBayes('ClassNames',unique(Y));
initobs = 10;
Mdl = fit(Mdl,X(1:initobs,:),Y(1:initobs));
canPredict = size(Mdl.DistributionParameters,1) == numel(Mdl.ClassNames)
canPredict = logical
   1

Mdl is an incrementalClassificationNaiveBayes model. All its properties are read-only. The model is configured to generate predictions.

Simulate a data stream, and perform the following actions on each incoming chunk of 100 observations.

  1. Call predict to compute class posterior probabilities for each observation in the incoming chunk of data.

  2. Consider incrementally measuring how well the model predicts whether a subject is dancing (Y is 5). You can accomplish this by computing the AUC of an ROC curve: for each observation in the chunk, pass to perfcurve the difference between the posterior probability of class 5 and the maximum posterior probability among the other classes.

  3. Call fit to fit the model to the incoming chunk. Overwrite the previous incremental model with a new one fitted to those observations.

numObsPerChunk = 100;
nchunk = floor((n - initobs)/numObsPerChunk);
Posterior = zeros(n,numel(Mdl.ClassNames));   % rows 1:initobs remain zero
auc = zeros(nchunk,1);
classauc = 5;

% Incremental learning
for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1 + initobs);
    iend   = min(n,numObsPerChunk*j + initobs);
    idx = ibegin:iend;
    [~,Posterior(idx,:)] = predict(Mdl,X(idx,:));   % class posterior probabilities
    % Difference between the posterior probability of class 5 (dancing)
    % and the maximum posterior probability among the other classes
    diffscore = Posterior(idx,classauc) - max(Posterior(idx,setdiff(Mdl.ClassNames,classauc)),[],2);
    [~,~,~,auc(j)] = perfcurve(Y(idx),diffscore,Mdl.ClassNames(classauc)); % chunk AUC
    Mdl = fit(Mdl,X(idx,:),Y(idx));                 % update the model with the chunk
end

Mdl is an incrementalClassificationNaiveBayes model object trained on all the data in the stream.

Plot the AUC on the incoming chunks of data.

plot(auc)
ylabel('AUC')
xlabel('Iteration')

The AUC values suggest that the classifier predicts dancing subjects well throughout incremental learning.

Input Arguments


Mdl

Naive Bayes classification model for incremental learning, specified as an incrementalClassificationNaiveBayes model object. You can create Mdl directly or by converting a supported, traditionally trained machine learning model using the incrementalLearner function. For more details, see the corresponding reference page.

You must configure Mdl to predict labels for a batch of observations.

  • If Mdl is a converted, traditionally trained model, you can predict labels without any modifications.

  • Otherwise, Mdl.DistributionParameters must be a cell matrix with Mdl.NumPredictors > 0 columns and at least one row, where each row corresponds to a class name in Mdl.ClassNames (see the readiness sketch below).
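
A readiness check, mirroring the canPredict test in the example above; Mdl stands for any incremental model you have created:

% Sketch: confirm that Mdl has distribution parameters for every class and a
% positive number of predictors before calling predict.
canPredict = size(Mdl.DistributionParameters,1) == numel(Mdl.ClassNames) ...
    && Mdl.NumPredictors > 0;
if ~canPredict
    disp('Fit Mdl to observations from every class before calling predict.')
end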

X

Batch of predictor data for which to predict labels, specified as an n-by-Mdl.NumPredictors floating-point matrix.

predict returns a predicted label for each of the n observations (rows) in X.

Note

predict supports only floating-point input predictor data. If the input model Mdl represents a converted, traditionally trained model fit to categorical data, use dummyvar to convert each categorical variable to a numeric matrix of dummy variables, and concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.

Data Types: single | double
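
For illustration, a sketch of the dummy-variable conversion described in the note above; the table and variable names (Tbl, Color, Height, Weight) are hypothetical:

% Hypothetical predictors: Tbl.Color is categorical; Height and Weight are numeric.
D = dummyvar(Tbl.Color);            % one dummy column per category of Color
Xnum = [Tbl.Height Tbl.Weight D];   % concatenate numeric predictors and dummies
label = predict(Mdl,Xnum);          % Mdl must have been trained on this same layout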

Output Arguments


label

Predicted responses (or labels), returned as a categorical or character array; floating-point, logical, or string vector; or cell array of character vectors with n rows. n is the number of observations in X, and label(j) is the predicted response for observation j.

label has the same data type as the class names stored in Mdl.ClassNames. (The software treats string arrays as cell arrays of character vectors.)

Posterior

Class posterior probabilities, returned as an n-by-numel(Mdl.ClassNames) floating-point matrix. Posterior(j,k) is the posterior probability that observation j is in class k. Mdl.ClassNames specifies the order of the classes.

Cost

Expected misclassification costs, returned as an n-by-numel(Mdl.ClassNames) floating-point matrix.

Cost(j,k) is the expected misclassification cost of the observation in row j of X predicted into class k (Mdl.ClassNames(k)).
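
The label output is consistent with Cost: each predicted label is the class that minimizes the expected cost in its row. A quick consistency check, assuming the outputs of a single predict call:

% label(j) minimizes row j of Cost (up to ties); rows of Posterior sum to 1.
[label,Posterior,Cost] = predict(Mdl,X);
[~,minIdx] = min(Cost,[],2);
labelFromCost = Mdl.ClassNames(minIdx);
isequal(label(:),labelFromCost(:))   % logical 1, barring ties
max(abs(sum(Posterior,2) - 1))       % approximately 0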

More About


Misclassification Cost

A misclassification cost is the relative severity of a classifier labeling an observation into the wrong class.

There are two types of misclassification costs: true and expected. Let K be the number of classes.

  • True misclassification cost — A K-by-K matrix, where element (i,j) indicates the misclassification cost of predicting an observation into class j if its true class is i. The software stores the misclassification cost in the property Mdl.Cost, and uses it in computations. By default, Mdl.Cost(i,j) = 1 if i ≠ j, and Mdl.Cost(i,j) = 0 if i = j. In other words, the cost is 0 for correct classification and 1 for any incorrect classification.

  • Expected misclassification cost — A K-dimensional vector, where element k is the weighted average misclassification cost of classifying an observation into class k, weighted by the class posterior probabilities.

    $c_k = \sum_{j=1}^{K} \hat{P}(Y = j \mid x_1, \ldots, x_P)\,\mathrm{Cost}_{jk}.$

    In other words, the software classifies an observation into the class with the lowest expected misclassification cost; the sketch below computes these costs in matrix form.
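
A sketch, assuming Posterior from a predict call and the stored cost matrix Mdl.Cost:

% Element (j,k) of the product is the sum over classes i of
% Posterior(j,i)*Mdl.Cost(i,k), which is the expected cost of class k for row j.
ExpectedCost = Posterior * Mdl.Cost;   % n-by-K; should match the Cost output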

Posterior Probability

The posterior probability is the probability that an observation belongs in a particular class, given the data.

For naive Bayes, the posterior probability that the class is k for a given observation $(x_1, \ldots, x_P)$ is

$\hat{P}(Y = k \mid x_1, \ldots, x_P) = \dfrac{P(X_1, \ldots, X_P \mid y = k)\,\pi(Y = k)}{P(X_1, \ldots, X_P)},$

where:

  • $P(X_1, \ldots, X_P \mid y = k)$ is the conditional joint density of the predictors given that they are in class k. Mdl.DistributionNames stores the distribution names of the predictors.

  • $\pi(Y = k)$ is the class prior probability distribution. Mdl.Prior stores the prior distribution.

  • $P(X_1, \ldots, X_P)$ is the joint density of the predictors. Because the classes are discrete, $P(X_1, \ldots, X_P) = \sum_{k=1}^{K} P(X_1, \ldots, X_P \mid y = k)\,\pi(Y = k)$ (see the sketch below).
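
As a worked sketch of Bayes' rule above, assuming hypothetical vectors jointDensity (the K values of $P(X_1, \ldots, X_P \mid y = k)$) and prior (the values of $\pi(Y = k)$) for a single observation:

% Per-class numerator of Bayes' rule, then normalization.
numer = jointDensity .* prior;     % P(X1,...,XP|y=k)*pi(Y=k) for each class k
posterior = numer / sum(numer);    % divide by P(X1,...,XP); entries sum to 1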

Introduced in R2021a