# loss

Classification loss for multiclass error-correcting output codes (ECOC) model

## Syntax

``L = loss(Mdl,tbl,ResponseVarName)``
``L = loss(Mdl,tbl,Y)``
``L = loss(Mdl,X,Y)``
``L = loss(___,Name,Value)``

## Description

`L = loss(Mdl,tbl,ResponseVarName)` returns the classification loss (`L`), a scalar representing how well the trained multiclass error-correcting output codes (ECOC) model `Mdl` classifies the predictor data in the table `tbl` as compared to the true class labels in `tbl.ResponseVarName`. Each row of `tbl` is an observation.

`L = loss(Mdl,tbl,Y)` returns the classification loss (`L`), a scalar representing how well the trained multiclass ECOC model `Mdl` classifies the predictor data in the table `tbl` as compared to the true class labels in `Y`. Each row of `tbl` and `Y` is an observation.

`L = loss(Mdl,X,Y)` returns the classification loss (`L`), a scalar representing how well the trained multiclass ECOC model `Mdl` classifies the predictor data in the matrix `X` as compared to the true class labels in `Y`. Each row of `X` and `Y` is an observation.

`L = loss(___,Name,Value)` returns the classification loss with additional options specified by one or more `Name,Value` pair arguments, using any of the previous syntaxes. For example, you can specify a decoding scheme, classification loss function, or verbosity level.

## Input Arguments

`Mdl`: Full or compact multiclass ECOC model, specified as a `ClassificationECOC` or `CompactClassificationECOC` model object.

To create a full or compact ECOC model, see `ClassificationECOC` or `CompactClassificationECOC`.

`tbl`: Sample data, specified as a table. Each row of `tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `tbl` can contain additional columns for the response variable and observation weights. `tbl` must contain all the predictors used to train `Mdl`. Multi-column variables and cell arrays other than cell arrays of character vectors are not allowed.

If you trained `Mdl` using sample data contained in a `table`, then the input data for this method must also be in a table.

### Note

If `Mdl.BinaryLearners` contains linear or kernel classification models (that is, `ClassificationLinear` or `ClassificationKernel` model objects), then you cannot specify sample data in a `table`. Instead, pass a matrix (`X`) and class labels (`Y`).

Data Types: `table`

`ResponseVarName`: Response variable name, specified as the name of a variable in `tbl`.

You must specify `ResponseVarName` as a character vector or string scalar. For example, if the response variable `y` is stored as `tbl.y`, then specify it as `'y'`. Otherwise, the software treats all columns of `tbl`, including `y`, as predictors when training the model.

The response variable must be a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. If the response variable is a character array, then each element must correspond to one row of the array.

Data Types: `char` | `string`

`X`: Predictor data, specified as a numeric matrix.

Each row of `X` corresponds to one observation, and each column corresponds to one variable. The variables in the columns of `X` must be the same as the variables used to train `Mdl`.

The length of `Y` and the number of rows of `X` must be equal.

Data Types: `double` | `single`

`Y`: Class labels, specified as a categorical, character, or string array, logical or numeric vector, or cell array of character vectors. `Y` must have the same data type as `Mdl.ClassNames`. (The software treats string arrays as cell arrays of character vectors.)

The length of `Y` and the number of rows of `X` must be equal.

Data Types: `categorical` | `char` | `string` | `logical` | `single` | `double` | `cell`

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

`'BinaryLoss'`: Binary learner loss function, specified as the comma-separated pair consisting of `'BinaryLoss'` and a built-in loss function name or function handle.

• This table contains names and descriptions of the built-in functions, where $y_j$ is a class label for a particular binary learner (in the set $\{-1,1,0\}$), $s_j$ is the score for observation $j$, and $g(y_j,s_j)$ is the binary loss formula.

| Value | Description | Score Domain | $g(y_j,s_j)$ |
|---|---|---|---|
| `'binodeviance'` | Binomial deviance | $(-\infty,\infty)$ | $\log[1 + \exp(-2y_js_j)]/[2\log(2)]$ |
| `'exponential'` | Exponential | $(-\infty,\infty)$ | $\exp(-y_js_j)/2$ |
| `'hamming'` | Hamming | $[0,1]$ or $(-\infty,\infty)$ | $[1 - \operatorname{sign}(y_js_j)]/2$ |
| `'hinge'` | Hinge | $(-\infty,\infty)$ | $\max(0, 1 - y_js_j)/2$ |
| `'linear'` | Linear | $(-\infty,\infty)$ | $(1 - y_js_j)/2$ |
| `'logit'` | Logistic | $(-\infty,\infty)$ | $\log[1 + \exp(-y_js_j)]/[2\log(2)]$ |
| `'quadratic'` | Quadratic | $[0,1]$ | $[1 - y_j(2s_j - 1)]^2/2$ |

The software normalizes binary losses such that the loss is 0.5 when $y_j = 0$. Also, the software calculates the mean binary loss for each class.

• For a custom binary loss function, for example, `customFunction`, specify its function handle `'BinaryLoss',@customFunction`.

`customFunction` has this form:

`bLoss = customFunction(M,s)`
where:

• `M` is the K-by-L coding matrix stored in `Mdl.CodingMatrix`.

• `s` is the 1-by-L row vector of classification scores.

• `bLoss` is the classification loss. This scalar aggregates the binary losses for every learner in a particular class. For example, you can use the mean binary loss to aggregate the loss over the learners for each class.

• K is the number of classes.

• L is the number of binary learners.

For an example of passing a custom binary loss function, see Predict Test-Sample Labels of ECOC Models Using Custom Binary Loss Function.
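
As a rough illustration of the form described above, the following sketch defines a hypothetical hinge-style custom binary loss and evaluates it on a small, made-up coding matrix and score vector. The handle name `customBL`, the toy values of `M` and `s`, and the choice to return one aggregated loss per class (a K-by-1 vector) are assumptions for illustration only. The expression relies on implicit expansion, so it assumes R2016b or later.

```matlab
% Toy one-versus-one coding matrix (K = 3 classes, L = 3 learners).
% All values are made up for illustration.
M = [ 1  1  0;
     -1  0  1;
      0 -1 -1];

% Hypothetical scores from the three binary learners for one observation.
s = [2 -0.5 0];

% Hinge-style binary loss, averaged over the learners for each class.
% Entries with M == 0 contribute max(0,1)/2 = 0.5.
customBL = @(M,s) mean(max(0, 1 - M.*s), 2)/2;

bLoss = customBL(M,s);   % K-by-1 vector, one aggregated loss per class
```

A handle of this form could then be supplied as `'BinaryLoss',customBL`.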

By default, if all binary learners are:

• SVMs or either linear or kernel classification models of SVM learners, then `BinaryLoss` is `'hinge'`

• Ensembles trained by `AdaboostM1` or `GentleBoost`, then `BinaryLoss` is `'exponential'`

• Ensembles trained by `LogitBoost`, then `BinaryLoss` is `'binodeviance'`

• Linear or kernel classification models of logistic regression learners, or you specify to predict class posterior probabilities (that is, set `'FitPosterior',1` in `fitcecoc`), then `BinaryLoss` is `'quadratic'`

Otherwise, the default value for `'BinaryLoss'` is `'hamming'`. To check the default value, use dot notation to display the `BinaryLoss` property of the trained model at the command line.

Example: `'BinaryLoss','binodeviance'`

Data Types: `char` | `string` | `function_handle`
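
To make the built-in formulas in the table above concrete, this sketch writes each one as an anonymous function and evaluates it at a class label of 0. The struct name `g` and the score value `s0` are arbitrary choices for illustration. The functions only mirror the tabulated formulas and are not the software's internal implementation.

```matlab
% Binary loss formulas g(y,s) from the table, as anonymous functions.
g.binodeviance = @(y,s) log(1 + exp(-2*y.*s)) / (2*log(2));
g.exponential  = @(y,s) exp(-y.*s) / 2;
g.hamming      = @(y,s) (1 - sign(y.*s)) / 2;
g.hinge        = @(y,s) max(0, 1 - y.*s) / 2;
g.linear       = @(y,s) (1 - y.*s) / 2;
g.logit        = @(y,s) log(1 + exp(-y.*s)) / (2*log(2));
g.quadratic    = @(y,s) (1 - y.*(2*s - 1)).^2 / 2;

% Evaluate every formula at y = 0 for an arbitrary score.
s0 = 0.3;
names = fieldnames(g);
atZero = cellfun(@(f) feval(g.(f), 0, s0), names);
```

Each entry of `atZero` works out to 0.5, which matches the normalization stated for a class label of 0.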

`'Decoding'`: Decoding scheme that aggregates the binary losses, specified as the comma-separated pair consisting of `'Decoding'` and `'lossweighted'` or `'lossbased'`. For more information, see Binary Loss.

Example: `'Decoding','lossbased'`

`'LossFun'`: Loss function, specified as the comma-separated pair consisting of `'LossFun'` and `'classiferror'` or a function handle.

You can:

• Specify the built-in function `'classiferror'` for classification error, i.e., the proportion of misclassified observations.

• Specify your own function using function handle notation.

Suppose that `n` = `size(X,1)` is the sample size and `k` is the number of classes. Your function must have the signature ```lossvalue = lossfun(C,S,W,Cost)```, where:

• The output argument `lossvalue` is a scalar.

• You choose the function name (`lossfun`).

• `C` is an `n`-by-`k` logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in `Mdl.ClassNames`.

Construct `C` by setting `C(p,q) = 1` if observation `p` is in class `q`, for each row. Set all other elements of row `p` to `0`.

• `S` is an `n`-by-`k` numeric matrix of negated loss values for classes. Each row corresponds to an observation. The column order corresponds to the class order in `Mdl.ClassNames`. `S` resembles the output argument `NegLoss` of `predict`.

• `W` is an `n`-by-1 numeric vector of observation weights. If you pass `W`, the software normalizes its elements to sum to `1`.

• `Cost` is a `k`-by-`k` numeric matrix of misclassification costs. For example, `Cost = ones(k) - eye(k)` specifies a cost of 0 for correct classification and 1 for misclassification.

Specify your function using `'LossFun',@lossfun`.

Data Types: `char` | `string` | `function_handle`
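
The signature above can be exercised without a trained model. This sketch defines a hypothetical weighted misclassification rate and applies it to made-up inputs. The handle name, the toy matrices, and the tie-handling rule (a tie for the largest negated loss counts as correct) are assumptions for illustration. It relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical custom loss: weighted misclassification rate.
% An observation counts as correct when its true class attains the
% largest negated loss in its row (ties count as correct).
lossfun = @(C,S,W,~) sum(W .* ~any(C & (S == max(S,[],2)), 2)) / sum(W);

% Toy inputs (made up): n = 4 observations, k = 3 classes.
C = logical([1 0 0; 0 1 0; 0 0 1; 1 0 0]);   % true class indicators
S = [-0.1 -0.5 -0.9;                         % negated losses per class
     -0.6 -0.2 -0.7;
     -0.3 -0.2 -0.8;
     -0.9 -0.1 -0.4];
W = [0.25; 0.25; 0.25; 0.25];                % weights summing to 1
Cost = ones(3) - eye(3);                     % unused by this loss

% Observations 3 and 4 are misclassified in this toy data.
lossvalue = lossfun(C,S,W,Cost);
```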

`'ObservationsIn'`: Predictor data observation dimension, specified as the comma-separated pair consisting of `'ObservationsIn'` and `'columns'` or `'rows'`. `Mdl.BinaryLearners` must contain linear classification models.

### Note

If you orient your predictor matrix so that observations correspond to columns and specify `'ObservationsIn','columns'`, you can experience a significant reduction in execution time.
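
A minimal sketch of the column-oriented workflow, assuming Statistics and Machine Learning Toolbox is available and using Fisher's iris data with default linear learners (both choices are illustrative, not prescribed by this page):

```matlab
load fisheriris
X = meas';                       % predictors-by-observations (4-by-150)
Y = species;

rng(1)                           % for reproducibility
t = templateLinear();            % linear binary learners, default settings
Mdl = fitcecoc(X,Y,'Learners',t,'ObservationsIn','columns');

% Pass the same orientation to loss.
L = loss(Mdl,X,Y,'ObservationsIn','columns');
```

With a data set this small, any difference in execution time is unlikely to be noticeable. The orientation matters more for large, high-dimensional data.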

`'Options'`: Estimation options, specified as the comma-separated pair consisting of `'Options'` and a structure array returned by `statset`.

To invoke parallel computing:

• You need a Parallel Computing Toolbox™ license.

• Specify `'Options',statset('UseParallel',1)`.

`'Verbose'`: Verbosity level, specified as the comma-separated pair consisting of `'Verbose'` and `0` or `1`. `Verbose` controls the number of diagnostic messages that the software displays in the Command Window.

If `Verbose` is `0`, then the software does not display diagnostic messages. Otherwise, the software displays diagnostic messages.

Example: `'Verbose',1`

Data Types: `single` | `double`

`'Weights'`: Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `tbl`. If you supply weights, then `loss` computes the weighted loss.

The length of `Weights` must equal the number of observations in `X` or `tbl`.

If you specify `Weights` as the name of a variable in `tbl`, you must do so as a character vector or string scalar. For example, if the weights are stored as `tbl.w`, then specify it as `'w'`. Otherwise, the software treats all columns of `tbl`, including `tbl.w`, as predictors.

If you do not specify your own loss function (using `LossFun`), then the software normalizes `Weights` to sum up to the value of the prior probability in the respective class.

If `Mdl.BinaryLearners` contains linear classification models, then you must specify a vector.

Data Types: `single` | `double` | `char` | `string`
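
As a hedged sketch of passing a weight vector, the following trains a default ECOC model on Fisher's iris data and computes the resubstitution loss with and without weights. The weight values are arbitrary and chosen only to show the call. Whether the two losses differ depends on which observations are misclassified.

```matlab
load fisheriris
X = meas;
Y = species;

rng(1)                            % for reproducibility
Mdl = fitcecoc(X,Y);              % default SVM binary learners

n = size(X,1);
w = linspace(1,2,n)';             % arbitrary positive weights, one per row

LUnweighted = loss(Mdl,X,Y);
LWeighted   = loss(Mdl,X,Y,'Weights',w);
```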

## Output Arguments

`L`: Classification loss, returned as a numeric scalar or row vector. `L` is a generalization or resubstitution quality measure. Its interpretation depends on the loss function and weighting scheme, but, in general, better classifiers yield smaller loss values.

If `Mdl.BinaryLearners` contains linear classification models, then `L` is a 1-by-λ vector, where λ is the number of regularization strengths in the linear classification models (that is, `numel(Mdl.BinaryLearners{1}.Lambda)`). `L(j)` is the loss for the model trained using regularization strength `Mdl.BinaryLearners{1}.Lambda(j)`.

Otherwise, `L` is a scalar.

## Examples

```
load fisheriris
X = meas;
Y = categorical(species);
classOrder = unique(Y); % Class order
rng(1); % For reproducibility
```

Train an ECOC model using SVM binary classifiers, and specify a 15% holdout sample. It is good practice to standardize the predictors and define the class order. Standardize the predictors by using an SVM template.

```
t = templateSVM('Standardize',1);
CVMdl = fitcecoc(X,Y,'Holdout',0.15,'Learners',t,'ClassNames',classOrder);
CMdl = CVMdl.Trained{1};           % Extract trained, compact classifier
testInds = test(CVMdl.Partition);  % Extract the test indices
XTest = X(testInds,:);
YTest = Y(testInds,:);
```

`CVMdl` is a `ClassificationPartitionedECOC` model. It contains the property `Trained`, which is a 1-by-1 cell array holding a `CompactClassificationECOC` model that the software trained using the training set.

Estimate the test-sample loss.

```
L = loss(CMdl,XTest,YTest)
```
```
L = 0
```

The ECOC model correctly classifies all out-of-sample irises.

Suppose that you want a measure of model quality other than the classification error. This example shows how to pass a custom loss function to `loss`.

```
load fisheriris
X = meas;
Y = categorical(species);
n = numel(Y);           % Sample size
classOrder = unique(Y)  % Class order
```
```
classOrder = 3x1 categorical array
     setosa
     versicolor
     virginica
```
```
K = numel(classOrder); % Number of classes
rng(1)                 % For reproducibility
```

Train an ECOC model using SVM binary classifiers, and specify a 15% holdout sample. It is good practice to define the class order. Standardize the predictors by using an SVM template.

```
t = templateSVM('Standardize',1);
CVMdl = fitcecoc(X,Y,'Holdout',0.15,'Learners',t,'ClassNames',classOrder);
CMdl = CVMdl.Trained{1};           % Extract trained, compact classifier
testInds = test(CVMdl.Partition);  % Extract the test indices
XTest = X(testInds,:);
YTest = Y(testInds,:);
```

`CVMdl` is a `ClassificationPartitionedECOC` model. It contains the property `Trained`, which is a 1-by-1 cell array holding a `CompactClassificationECOC` model that the software trained using the training set.

Compute the negated losses for the test-sample observations.

`[~,negLoss] = predict(CMdl,XTest);`

Create a function that takes the minimal loss for each observation, and then averages the minimal losses across all observations.

`lossfun = @(~,S,~,~)mean(min(-S,[],2));`

Compute the test-sample custom loss.

```
loss(CMdl,XTest,YTest,'LossFun',lossfun)
```
```
ans = 0.0033
```

The average, minimal, binary loss in the test sample is 0.0033.

## Algorithms

If you specified to standardize the predictor data when training `Mdl`, then the software standardizes the columns of `X` using the corresponding means and standard deviations stored in `Mdl.BinaryLearners{j}.Mu` and `Mdl.BinaryLearners{j}.Sigma` for learner `j`.
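
The standardization step amounts to centering and scaling each column. A minimal sketch follows, using made-up data and using the sample mean and standard deviation as stand-ins for the values a learner would have stored at training time (`mu` and `sigma` are illustrative names, not read from a model here):

```matlab
% Made-up predictor data: 5 observations, 2 predictors.
X = [5.1 3.5; 4.9 3.0; 6.2 2.9; 5.9 3.2; 6.7 3.1];

% Stand-ins for the stored Mu and Sigma of one binary learner.
mu    = mean(X,1);
sigma = std(X,0,1);

% Center and scale each column (implicit expansion, R2016b or later).
XStd = (X - mu) ./ sigma;
```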
