cvpartition

Class: cvpartition

Create cross validation partition for data

Syntax

c = cvpartition(n,'KFold',k)
c = cvpartition(group,'KFold',k)
c = cvpartition(n,'HoldOut',p)
c = cvpartition(group,'HoldOut',p)
c = cvpartition(n,'LeaveOut')
c = cvpartition(n,'resubstitution')

Description

c = cvpartition(n,'KFold',k) constructs an object c of the cvpartition class defining a random partition for k-fold cross validation on n observations. The partition divides the observations into k disjoint subsamples (or folds), chosen randomly but with roughly equal size. The default value of k is 10.
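
As a minimal sketch of this syntax, the following builds a 5-fold partition and checks that every observation falls in exactly one test fold. The values of n and k and the fixed random seed are illustrative assumptions, not part of the original page.

```matlab
rng(1);                          % fix the seed so the random partition is repeatable
n = 100;                         % number of observations (illustrative value)
k = 5;                           % number of folds (illustrative value)
c = cvpartition(n,'KFold',k);

disp(c.NumTestSets)              % number of folds
disp(c.TestSize)                 % size of each test fold, roughly n/k

% Count how many test folds each observation belongs to.
counts = zeros(n,1);
for i = 1:c.NumTestSets
    counts = counts + test(c,i); % test(c,i) is a logical n-by-1 vector
end
allOnce = all(counts == 1);      % true when the folds are disjoint and cover the data
```

The training set for fold i is the complement of its test set, available as training(c,i).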

c = cvpartition(group,'KFold',k) creates a random partition for a stratified k-fold cross validation. group is a numeric vector, categorical array, string array, or cell array of strings indicating the class of each observation. Each subsample has roughly equal size and roughly the same class proportions as in group. cvpartition treats NaNs or empty strings in group as missing values.
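
The stratification can be inspected by counting class members in each test fold. This sketch assumes the fisheriris sample data set (also used in the example at the end of this page), which has 50 observations in each of three classes. The choice of 5 folds is arbitrary.

```matlab
load fisheriris                  % provides species, a 150-by-1 cell array of class labels
rng(1);                          % fix the seed for repeatability
c = cvpartition(species,'KFold',5);

classes = unique(species);       % the three class names
foldCounts = zeros(numel(classes),c.NumTestSets);
for i = 1:c.NumTestSets
    idx = test(c,i);             % logical index of the i-th test fold
    % count the members of each class that fall in this test fold
    foldCounts(:,i) = cellfun(@(name) sum(strcmp(species(idx),name)), classes);
end
disp(foldCounts)                 % one row per class, one column per fold
```

With balanced classes, each column of foldCounts should hold roughly equal counts, mirroring the class proportions in species.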

c = cvpartition(n,'HoldOut',p) creates a random partition for holdout validation on n observations. This partition divides the observations into a training set and a test (or holdout) set. The parameter p must be a scalar. When 0 < p < 1, cvpartition randomly selects approximately p*n observations for the test set. When p is an integer, cvpartition randomly selects p observations for the test set. The default value of p is 1/10.
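
The two interpretations of p can be compared side by side. In this sketch, the values n = 200, p = 0.25, and p = 40 are illustrative assumptions.

```matlab
rng(1);                                   % fix the seed for repeatability
n = 200;                                  % number of observations (illustrative value)
cFrac  = cvpartition(n,'HoldOut',0.25);   % fraction: about 0.25*n test observations
cCount = cvpartition(n,'HoldOut',40);     % integer: 40 test observations

trainIdx = training(cFrac);               % logical vector, true for training observations
testIdx  = test(cFrac);                   % logical vector, true for test observations

disp([cFrac.TrainSize cFrac.TestSize])    % sizes of the two sets for the fractional split
disp([cCount.TrainSize cCount.TestSize])  % sizes of the two sets for the integer split
```

The logical vectors trainIdx and testIdx can index directly into the rows of a data matrix to extract the two subsets.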

c = cvpartition(group,'HoldOut',p) randomly partitions observations into a training set and a test set with stratification, using the class information in group; that is, both training and test sets have roughly the same class proportions as in group.

c = cvpartition(n,'LeaveOut') creates a random partition for leave-one-out cross validation on n observations. Leave-one-out is a special case of 'KFold', in which the number of folds equals the number of observations.
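
A short sketch makes the relationship to 'KFold' concrete: with n observations there are n test sets, each holding a single observation. The value n = 8 is an illustrative assumption.

```matlab
n = 8;                                   % number of observations (illustrative value)
c = cvpartition(n,'LeaveOut');

nSets      = c.NumTestSets;              % one test set per observation
oneEach    = all(c.TestSize == 1);       % every test set holds one observation
restTrains = all(c.TrainSize == n - 1);  % the remaining n-1 observations train the model
```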

c = cvpartition(n,'resubstitution') creates an object c that does not partition the data. Both the training set and the test set contain all of the original n observations.
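
As a quick check of this behavior, the sketch below confirms that the training and test indices both select every observation. The value n = 20 is an illustrative assumption.

```matlab
n = 20;                                  % number of observations (illustrative value)
c = cvpartition(n,'resubstitution');

trainIdx = training(c);                  % logical vector of training observations
testIdx  = test(c);                      % logical vector of test observations
sameSets = all(trainIdx) && all(testIdx);  % true: both sets contain all n observations
```

Because the model is evaluated on the data it was trained on, an error estimate from this partition is typically optimistic.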

Examples

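Use stratified 10-fold cross validation to compute the misclassification rate of the classify function on the Fisher iris data:

load fisheriris;
y = species;
% Stratified partition: each fold keeps roughly the class proportions of y
c = cvpartition(y,'KFold',10);

% Count the misclassified test observations in one fold
fun = @(xT,yT,xt,yt)(sum(~strcmp(yt,classify(xt,xT,yT))));

% Total misclassifications over all folds, divided by the total test size
rate = sum(crossval(fun,meas,y,'partition',c))...
           /sum(c.TestSize)

rate =
    0.0200

Because the partition is random, the computed rate can differ from run to run.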