# predict

Class: ClassificationLinear

Predict labels for linear classification models

## Description

Label = predict(Mdl,X) returns predicted class labels for each observation in the predictor data X based on the trained, binary, linear classification model Mdl. Label contains class labels for each regularization strength in Mdl.

Label = predict(Mdl,X,Name,Value) returns predicted class labels with additional options specified by one or more Name,Value pair arguments. For example, you can specify that columns in the predictor data correspond to observations.

[Label,Score] = predict(___) also returns classification scores for both classes using any of the previous syntaxes. Score contains classification scores for each regularization strength in Mdl.
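The three syntaxes can be tried on a small synthetic problem. This is a minimal sketch, assuming Statistics and Machine Learning Toolbox is available; the toy data and the variable names LabelCols and Label2 are illustrative and do not come from the examples on this page.

```matlab
% Synthetic data (hypothetical): 100 observations in rows, 5 predictors in columns
rng(0);
X = randn(100,5);
Y = X(:,1) + 0.5*X(:,2) > 0;          % logical class labels

Mdl = fitclinear(X,Y);                % one regularization strength by default

% Syntax 1: labels only, observations in rows
Label = predict(Mdl,X);

% Syntax 2: same data transposed, so observations are in columns
LabelCols = predict(Mdl,X','ObservationsIn','columns');

% Syntax 3: labels and classification scores
[Label2,Score] = predict(Mdl,X);

disp(size(Label))    % n-by-1 here, because Mdl has one regularization strength
disp(size(Score))    % n-by-2, one column per class in Mdl.ClassNames
```

Because Y is logical in this sketch, the returned labels are logical as well.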

## Input Arguments

Mdl is the binary, linear classification model, specified as a ClassificationLinear model object. You can create a ClassificationLinear model object using fitclinear.

X is the predictor data, specified as an n-by-p full or sparse matrix. This orientation of X indicates that rows correspond to individual observations, and columns correspond to individual predictor variables.

### Note

If you orient your predictor matrix so that observations correspond to columns and specify 'ObservationsIn','columns', then you might experience a significant reduction in computation time.

Data Types: single | double

### Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Predictor data observation dimension, specified as the comma-separated pair consisting of 'ObservationsIn' and 'columns' or 'rows'.

### Note

If you orient your predictor matrix so that observations correspond to columns and specify 'ObservationsIn','columns', then you might experience a significant reduction in computation time.
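One way to check whether the column orientation helps on a given machine is to time both calls. The sketch below uses hypothetical synthetic sparse data and the standard timeit function. It makes no claim about which orientation is faster, because the result depends on the data and hardware.

```matlab
% Synthetic sparse data (hypothetical sizes): 5000 observations, 2000 predictors
rng(0);
X = sprandn(5000,2000,0.01);
Y = full(sum(X,2)) > 0;                % logical class labels

Mdl = fitclinear(X,Y);
Xt = X';                               % same data with observations in columns

% Time each orientation; timeit calls the function handle several times
tRows = timeit(@() predict(Mdl,X));
tCols = timeit(@() predict(Mdl,Xt,'ObservationsIn','columns'));
fprintf('rows: %.4g s, columns: %.4g s\n',tRows,tCols);
```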

## Output Arguments

Label contains the predicted class labels, returned as a categorical or character array, logical or numeric matrix, or cell array of character vectors.

In most cases, Label is an n-by-L array of the same data type as the observed class labels (Y) used to train Mdl. (The software treats string arrays as cell arrays of character vectors.) n is the number of observations in X and L is the number of regularization strengths in Mdl.Lambda. That is, Label(i,j) is the predicted class label for observation i using the linear classification model that has regularization strength Mdl.Lambda(j).

If Y is a character array and L > 1, then Label is a cell array of class labels.

Score contains the classification scores, returned as an n-by-2-by-L numeric array. n is the number of observations in X and L is the number of regularization strengths in Mdl.Lambda. Score(i,k,j) is the score for classifying observation i into class k using the linear classification model that has regularization strength Mdl.Lambda(j). Mdl.ClassNames stores the order of the classes.

If Mdl.Learner is 'logistic', then classification scores are posterior probabilities.
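The output dimensions can be checked on a small model trained with more than one regularization strength. This is a sketch on hypothetical synthetic data; the two Lambda values are arbitrary illustrations.

```matlab
% Synthetic data (hypothetical): 200 observations, 4 predictors
rng(0);
X = randn(200,4);
Y = X(:,1) - X(:,2) > 0;

% Two regularization strengths, so L = 2
Lambda = [1e-4 1e-2];
Mdl = fitclinear(X,Y,'Learner','logistic','Lambda',Lambda);

[Label,Score] = predict(Mdl,X);
disp(size(Label))                      % n-by-L
disp(size(Score))                      % n-by-2-by-L

% With a logistic learner the scores are posterior probabilities,
% so the two class scores of each observation sum to 1 on every page
rowSums = sum(Score,2);
maxDeviation = max(abs(rowSums(:) - 1));
```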

## Examples

Load the NLP data set.

load nlpdata

X is a sparse matrix of predictor data, and Y is a categorical vector of class labels. There are more than two classes in the data.

The models should identify whether the word counts in a web page are from the Statistics and Machine Learning Toolbox™ documentation. So, identify the labels that correspond to the Statistics and Machine Learning Toolbox™ documentation web pages.

Ystats = Y == 'stats';

Train a binary, linear classification model using the entire data set, which can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation.

rng(1); % For reproducibility
Mdl = fitclinear(X,Ystats);

Mdl is a ClassificationLinear model.

Predict the training-sample, or resubstitution, labels.

label = predict(Mdl,X);

Because there is one regularization strength in Mdl, label is a column vector with length equal to the number of observations.

Construct a confusion matrix.

ConfusionTrain = confusionchart(Ystats,label);

The model misclassifies only one 'stats' documentation page as being outside of the Statistics and Machine Learning Toolbox documentation.
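When a numeric summary is more convenient than a chart, the same counts can be tabulated with confusionmat. The sketch below repeats the steps of this example; C and resubErr are illustrative variable names.

```matlab
load nlpdata
Ystats = Y == 'stats';

rng(1); % For reproducibility
Mdl = fitclinear(X,Ystats);
label = predict(Mdl,X);

% Rows of C are true classes and columns are predicted classes,
% in sorted class order (false, then true)
C = confusionmat(Ystats,label);

% Resubstitution error: fraction of training observations misclassified
resubErr = mean(label ~= Ystats);
```

The off-diagonal entries of C count the misclassified pages, so their sum divided by the number of observations equals resubErr.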

Load the NLP data set and preprocess it as in Predict Training-Sample Labels. Transpose the predictor data matrix.

load nlpdata
Ystats = Y == 'stats';
X = X';

Train a binary, linear classification model that can identify whether the word counts in a documentation web page are from the Statistics and Machine Learning Toolbox™ documentation. Specify to hold out 30% of the observations. Optimize the objective function using SpaRSA.

rng(1) % For reproducibility
CVMdl = fitclinear(X,Ystats,'Solver','sparsa','Holdout',0.30,...
'ObservationsIn','columns');
Mdl = CVMdl.Trained{1};

CVMdl is a ClassificationPartitionedLinear model. It contains the property Trained, which is a 1-by-1 cell array holding a ClassificationLinear model that the software trained using the training set.

Extract the training and test data from the partition definition.

trainIdx = training(CVMdl.Partition);
testIdx = test(CVMdl.Partition);

Predict the training- and test-sample labels.

labelTrain = predict(Mdl,X(:,trainIdx),'ObservationsIn','columns');
labelTest = predict(Mdl,X(:,testIdx),'ObservationsIn','columns');

Because there is one regularization strength in Mdl, labelTrain and labelTest are column vectors with lengths equal to the number of training and test observations, respectively.

Construct a confusion matrix for the training data.

ConfusionTrain = confusionchart(Ystats(trainIdx),labelTrain);

The model misclassifies only three documentation pages as being outside of Statistics and Machine Learning Toolbox documentation.

Construct a confusion matrix for the test data.

ConfusionTest = confusionchart(Ystats(testIdx),labelTest);

The model misclassifies three documentation pages as being outside the Statistics and Machine Learning Toolbox, and two pages as being inside.

Estimate test-sample posterior class probabilities, and determine the quality of the model by plotting an ROC curve. Linear classification models return posterior probabilities for logistic regression learners only.

Load the NLP data set and preprocess it as in Predict Test-Sample Labels.

load nlpdata
Ystats = Y == 'stats';
X = X';

Randomly partition the data into training and test sets by specifying a 30% holdout sample. Identify the test-set indices.

cvp = cvpartition(Ystats,'Holdout',0.30);
idxTest = test(cvp);

Train a binary linear classification model. Fit logistic regression learners using SpaRSA. To hold out the test set, specify the partitioned model.

CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns','CVPartition',cvp,...
'Learner','logistic','Solver','sparsa');
Mdl = CVMdl.Trained{1};

Mdl is a ClassificationLinear model trained using the training set specified in the partition cvp only.

Predict the test-sample posterior class probabilities.

[~,posterior] = predict(Mdl,X(:,idxTest),'ObservationsIn','columns');

Because there is one regularization strength in Mdl, posterior is a matrix with 2 columns and rows equal to the number of test-set observations. Column i contains posterior probabilities of Mdl.ClassNames(i) given a particular observation.

Obtain false and true positive rates, and estimate the AUC. Specify that the second class is the positive class.

[fpr,tpr,~,auc] = perfcurve(Ystats(idxTest),posterior(:,2),Mdl.ClassNames(2));
auc
auc = 0.9986

The AUC is 0.9986, which is close to 1 and indicates a model that predicts well.
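For intuition about what the AUC measures, it equals the fraction of (positive, negative) pairs in which the positive observation receives the higher score. The sketch below computes this with the rank-sum formula on six made-up scores and compares it with perfcurve. The scores and labels are hypothetical, and perfcurve may compute the area differently internally; the two values should agree here because the scores have no ties.

```matlab
% Hypothetical scores and true labels (true marks the positive class)
scores = [0.9; 0.8; 0.7; 0.6; 0.55; 0.4];
labels = logical([1; 1; 0; 1; 0; 0]);

nPos = sum(labels);
nNeg = sum(~labels);

% Rank-sum (Mann-Whitney) form of the AUC
r = tiedrank(scores);                               % ranks in ascending order
aucRank = (sum(r(labels)) - nPos*(nPos + 1)/2)/(nPos*nNeg);

% Same quantity from perfcurve, with true as the positive class
[~,~,~,aucPerf] = perfcurve(labels,scores,true);

fprintf('rank-sum AUC: %.4f, perfcurve AUC: %.4f\n',aucRank,aucPerf);
```

In this toy set, 8 of the 9 positive-negative pairs are ordered correctly, so the rank-sum value is 8/9.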

Plot an ROC curve.

figure;
plot(fpr,tpr)
h = gca;
h.XLim(1) = -0.1;
h.YLim(2) = 1.1;
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve')

The ROC curve and AUC indicate that the model classifies the test-sample observations almost perfectly.

To determine a good lasso-penalty strength for a linear classification model that uses a logistic regression learner, compare test-sample values of the AUC.

Load the NLP data set. Preprocess the data as in Predict Test-Sample Labels.

load nlpdata
Ystats = Y == 'stats';
X = X';

Create a data partition that holds out 10% of the observations. Extract the test-sample indices.

rng(10); % For reproducibility
Partition = cvpartition(Ystats,'Holdout',0.10);
testIdx = test(Partition);
XTest = X(:,testIdx);
n = sum(testIdx)
n = 3157
YTest = Ystats(testIdx);

There are 3157 observations in the test sample.

Create a set of 11 logarithmically spaced regularization strengths from $10^{-6}$ through $10^{-0.5}$.

Lambda = logspace(-6,-0.5,11);

Train binary, linear classification models that use each of the regularization strengths. Optimize the objective function using SpaRSA. Lower the tolerance on the gradient of the objective function to 1e-8.

CVMdl = fitclinear(X,Ystats,'ObservationsIn','columns',...
'CVPartition',Partition,'Learner','logistic','Solver','sparsa',...
'Regularization','lasso','Lambda',Lambda,'GradientTolerance',1e-8)
CVMdl =
classreg.learning.partition.ClassificationPartitionedLinear
CrossValidatedModel: 'Linear'
ResponseName: 'Y'
NumObservations: 31572
KFold: 1
Partition: [1x1 cvpartition]
ClassNames: [0 1]
ScoreTransform: 'none'

Extract the trained linear classification model.

Mdl1 = CVMdl.Trained{1}
Mdl1 =
ClassificationLinear
ResponseName: 'Y'
ClassNames: [0 1]
ScoreTransform: 'logit'
Beta: [34023x11 double]
Bias: [1x11 double]
Lambda: [1x11 double]
Learner: 'logistic'

Mdl1 is a ClassificationLinear model object. Because Lambda is a sequence of regularization strengths, you can think of Mdl1 as 11 models, one for each regularization strength in Lambda.

Estimate the test-sample predicted labels and posterior class probabilities.

[label,posterior] = predict(Mdl1,XTest,'ObservationsIn','columns');
Mdl1.ClassNames;
posterior(3,1,5)
ans = 1.0000

label is a 3157-by-11 matrix of predicted labels. Each column corresponds to the predicted labels of the model trained using the corresponding regularization strength. posterior is a 3157-by-2-by-11 array of posterior class probabilities. Columns correspond to classes and pages correspond to regularization strengths. For example, posterior(3,1,5) is the posterior probability that observation 3 belongs to the first class (label 0) according to the model that uses Lambda(5) as its regularization strength. Here, that probability is 1.0000.
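The page dimension can be awkward to work with directly. The following sketch, on hypothetical synthetic data with three regularization strengths, shows one way to pull the positive-class probabilities into an ordinary n-by-L matrix with squeeze; pPos and pExample are illustrative names.

```matlab
% Synthetic data (hypothetical): 4 predictors in rows, 150 observations in columns
rng(0);
X = randn(4,150);
Y = (X(1,:) + X(2,:) > 0)';            % 150-by-1 logical labels

Lambda = logspace(-4,-1,3);            % L = 3 regularization strengths
Mdl = fitclinear(X,Y,'ObservationsIn','columns',...
    'Learner','logistic','Lambda',Lambda);

[label,posterior] = predict(Mdl,X,'ObservationsIn','columns');

% Column 2 holds the probabilities for Mdl.ClassNames(2);
% squeeze drops the singleton class dimension, leaving n-by-L
pPos = squeeze(posterior(:,2,:));

% Probability of the first class for observation 3 under Lambda(2)
pExample = posterior(3,1,2);
```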

For each model, compute the AUC. Designate the second class as the positive class.

auc = zeros(1,numel(Lambda));  % Preallocation
for j = 1:numel(Lambda)
[~,~,~,auc(j)] = perfcurve(YTest,posterior(:,2,j),Mdl1.ClassNames(2));
end

Higher values of Lambda lead to predictor variable sparsity, which is a good quality of a classifier. For each regularization strength, train a linear classification model using the entire data set and the same options as when you trained the model. Determine the number of nonzero coefficients per model.

Mdl = fitclinear(X,Ystats,'ObservationsIn','columns',...
'Learner','logistic','Solver','sparsa','Regularization','lasso',...
'Lambda',Lambda,'GradientTolerance',1e-8);
numNZCoeff = sum(Mdl.Beta~=0);

In the same figure, plot the test-sample AUC values and the frequency of nonzero coefficients for each regularization strength. Plot all variables on the log scale.

figure;
[h,hL1,hL2] = plotyy(log10(Lambda),log10(auc),...
log10(Lambda),log10(numNZCoeff + 1));
hL1.Marker = 'o';
hL2.Marker = 'o';
ylabel(h(1),'log_{10} AUC')
ylabel(h(2),'log_{10} nonzero-coefficient frequency')
xlabel('log_{10} Lambda')
title('Test-Sample Statistics')
hold off

Choose the index of the regularization strength that balances predictor variable sparsity and high AUC. In this case, a value between $10^{-2}$ and $10^{-1}$ should suffice.

idxFinal = 9;
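As a quick check that index 9 falls in the stated range, the grid can be inspected directly. The exponents produced by logspace(-6,-0.5,11) are spaced 0.55 apart, so the ninth value is $10^{-1.6}$. The names lambdaFinal and stepLog10 below are illustrative.

```matlab
Lambda = logspace(-6,-0.5,11);
idxFinal = 9;

lambdaFinal = Lambda(idxFinal);        % 10^(-1.6), approximately 0.0251
stepLog10 = diff(log10(Lambda));       % constant spacing of 0.55 in log10 units

fprintf('Lambda(%d) = %.4g\n',idxFinal,lambdaFinal);
```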

Select the model from Mdl with the chosen regularization strength.

MdlFinal = selectModels(Mdl,idxFinal);

MdlFinal is a ClassificationLinear model containing one regularization strength. To estimate labels for new observations, pass MdlFinal and the new data to predict.
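That last step might look like the following sketch. It uses hypothetical synthetic data in place of the NLP data set, and XNew stands in for new observations; the index 3 is an arbitrary choice for illustration.

```matlab
% Synthetic training data (hypothetical): 5 predictors in rows, 200 observations in columns
rng(0);
X = randn(5,200);
Y = (X(1,:) > 0)';

% Train one lasso-penalized logistic model per regularization strength
Lambda = logspace(-4,-1,4);
Mdl = fitclinear(X,Y,'ObservationsIn','columns',...
    'Learner','logistic','Solver','sparsa','Regularization','lasso',...
    'Lambda',Lambda);

% Keep only the model for the third regularization strength
MdlFinal = selectModels(Mdl,3);

% New observations must have the same predictors and orientation
XNew = randn(5,10);
labelNew = predict(MdlFinal,XNew,'ObservationsIn','columns');
disp(size(labelNew))                   % one label per new observation
```

Because MdlFinal holds a single regularization strength, labelNew is a column vector rather than a matrix with one column per strength.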
