loss
Class: classreg.learning.regr.CompactRegressionSVM, RegressionSVM
Namespace: classreg.learning.regr
Regression error for support vector machine regression model
Syntax
L = loss(mdl,Tbl,ResponseVarName)
L = loss(mdl,Tbl,Y)
L = loss(mdl,X,Y)
L = loss(___,Name,Value)
Description
L = loss(mdl,Tbl,ResponseVarName) returns the loss for the predictions of the support vector machine (SVM) regression model, mdl, based on the predictor data in the table Tbl and the true response values in Tbl.ResponseVarName.
L = loss(mdl,Tbl,Y) returns the loss for the predictions of the support vector machine (SVM) regression model, mdl, based on the predictor data in the table Tbl and the true response values in the vector Y.
L = loss(mdl,X,Y) returns the loss for the predictions of the support vector machine (SVM) regression model, mdl, based on the predictor data in X and the true responses in Y.
L = loss(___,Name,Value) returns the loss with additional options specified by one or more name-value arguments, using any of the previous syntaxes. For example, you can specify the loss function or observation weights.
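As a minimal sketch of the two table-based syntaxes, the following assumes the carsmall sample data set and a model trained on a table with fitrsvm. The variable names L1 and L2 are illustrative. Because the model is evaluated on its own training data here, the values are resubstitution losses, not test losses.

```matlab
% Build a table of predictors and response, dropping rows with missing values
load carsmall
Tbl = rmmissing(table(Horsepower,Weight,MPG));

% Train on the table, naming MPG as the response variable
mdl = fitrsvm(Tbl,'MPG','Standardize',true);

% Response identified by its variable name in Tbl
L1 = loss(mdl,Tbl,'MPG');

% Response passed as a separate vector
L2 = loss(mdl,Tbl,Tbl.MPG);
```

Both calls evaluate the same predictions against the same responses, so L1 and L2 should agree.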
Input Arguments
mdl — SVM regression model
RegressionSVM model | CompactRegressionSVM model
SVM regression model, specified as a RegressionSVM model or CompactRegressionSVM model returned by fitrsvm or compact, respectively.
Tbl — Sample data
table
Sample data, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain additional columns for the response variable and observation weights. Tbl must contain all of the predictors used to train mdl. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.
If you trained mdl using sample data contained in a table, then the input data for this method must also be in a table.
Data Types: table
ResponseVarName — Response variable name
name of a variable in Tbl
Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.
You must specify ResponseVarName as a character vector or string scalar. For example, if the response variable Y is stored as Tbl.Y, then specify ResponseVarName as 'Y'. Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.
Data Types: char | string
X — Predictor data
numeric matrix
Predictor data, specified as a numeric matrix or table. Each row of X corresponds to one observation (also known as an instance or example), and each column corresponds to one variable (also known as a feature).
If you trained mdl using a matrix of predictor values, then X must be a numeric matrix with p columns, where p is the number of predictors used to train mdl.
The length of Y and the number of rows of X must be equal.
Data Types: single | double
Y — Observed response values
vector of numeric values
Observed response values, specified as a vector of length n containing numeric values. Each entry in Y is the observed response based on the predictor data in the corresponding row of X.
Data Types: single | double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
LossFun — Loss function
'mse' (default) | 'epsiloninsensitive' | function handle
Loss function, specified as the comma-separated pair consisting of 'LossFun' and 'mse', 'epsiloninsensitive', or a function handle.
The following table lists the available loss functions.
Value | Loss Function |
---|---|
'mse' | Weighted Mean Squared Error |
'epsiloninsensitive' | Epsilon-Insensitive Loss Function |
Specify your own function using function handle notation.
Your function must have the signature lossvalue = lossfun(Y,Yfit,W), where:
The output argument lossvalue is a scalar value.
You choose the function name (lossfun).
Y is an n-by-1 numeric vector of observed response values.
Yfit is an n-by-1 numeric vector of predicted response values, calculated using the corresponding predictor values in X (similar to the output of predict).
W is an n-by-1 numeric vector of observation weights. If you pass W, the software normalizes them to sum to 1.
Specify your function using 'LossFun',@lossfun.
Example: 'LossFun','epsiloninsensitive'
Data Types: char | string | function_handle
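The custom-function signature described above can be sketched as follows. This is an illustration only: the handle name maeLoss and the choice of a weighted mean absolute error are assumptions, not part of the loss interface, and the model is evaluated on its own training data from carsmall.

```matlab
% Train a small SVM regression model on the carsmall data
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);
mdl = fitrsvm(X,Y,'Standardize',true);

% Hypothetical custom loss: weighted mean absolute error.
% The handle takes (Y,Yfit,W) and returns a scalar, as the signature requires.
maeLoss = @(Y,Yfit,W) sum(W.*abs(Y - Yfit))/sum(W);

% Pass the handle through the LossFun name-value argument
L = loss(mdl,X,Y,'LossFun',maeLoss);
```

Dividing by sum(W) makes the handle insensitive to whether the weights it receives are already normalized.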
PredictionForMissingValue — Predicted response value to use for observations with missing predictor values
"median" (default) | "mean" | "omitted" | numeric scalar
Since R2023b
Predicted response value to use for observations with missing predictor values, specified as "median", "mean", "omitted", or a numeric scalar.
Value | Description |
---|---|
"median" | loss uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
"mean" | loss uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
"omitted" | loss excludes observations with missing predictor values from the loss computation. |
Numeric scalar | loss uses this value as the predicted response value for observations with missing predictor values. |
If an observation is missing an observed response value or an observation weight, then loss does not use the observation in the loss computation.
Example: PredictionForMissingValue="omitted"
Data Types: single | double | char | string
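The options in the table above can be compared with a short sketch. It assumes R2023b or later, the carsmall data set, and an artificial set of missing values (XMiss) introduced only for illustration.

```matlab
% Train on complete data from carsmall
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);
mdl = fitrsvm(X,Y,'Standardize',true);

% Illustrative evaluation data: blank out one predictor in the first five rows
XMiss = X;
XMiss(1:5,1) = NaN;

% Default behavior ("median"): affected rows get the training-response median
LMedian = loss(mdl,XMiss,Y);

% Exclude the affected rows from the loss computation
LOmitted = loss(mdl,XMiss,Y,PredictionForMissingValue="omitted");

% For comparison: loss on the complete rows only
LComplete = loss(mdl,X(6:end,:),Y(6:end));
```

With "omitted", the result is expected to match the loss computed on the complete rows alone (LComplete).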
Weights — Observation weights
ones(size(X,1),1) (default) | numeric vector
Observation weights, specified as the comma-separated pair consisting of 'Weights' and a numeric vector. The length of Weights must equal the number of rows in X. The software weights the observations in each row of X using the corresponding value in Weights.
Weights are normalized to sum to 1.
Data Types: single | double
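A minimal sketch of weighted loss follows. The weighting scheme (doubling the weight of observations with MPG above 30) is arbitrary and chosen only for illustration. The manual check uses the weighted mean squared error definition from the More About section.

```matlab
% Train a small SVM regression model on the carsmall data
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);
mdl = fitrsvm(X,Y,'Standardize',true);

% Illustrative weights: count high-MPG observations twice as heavily
w = ones(size(X,1),1);
w(Y > 30) = 2;

% Weighted mean squared error reported by loss
Lw = loss(mdl,X,Y,'Weights',w);

% The same quantity computed by hand from the predictions
Yfit = predict(mdl,X);
LManual = sum(w.*(Yfit - Y).^2)/sum(w);
```

Because the weights are normalized internally, scaling w by any positive constant should leave Lw unchanged.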
Output Arguments
L — Regression loss
scalar value
Regression loss, returned as a scalar value.
Examples
Calculate Test Sample Loss for SVM Regression Model
Calculate the test set mean squared error (MSE) and epsilon-insensitive error of an SVM regression model.
Load the carsmall sample data. Specify Horsepower and Weight as the predictor variables (X), and MPG as the response variable (Y).
load carsmall
X = [Horsepower,Weight];
Y = MPG;
Delete rows of X and Y where either array has NaN values.
R = rmmissing([X Y]);
X = R(:,1:2);
Y = R(:,end);
Reserve 10% of the observations as a holdout sample, and extract the training and test indices.
rng default % For reproducibility
N = length(Y);
cv = cvpartition(N,'HoldOut',0.10);
trainInds = training(cv);
testInds = test(cv);
Specify the training and test data sets.
XTrain = X(trainInds,:);
YTrain = Y(trainInds);
XTest = X(testInds,:);
YTest = Y(testInds);
Train a linear SVM regression model and standardize the data.
mdl = fitrsvm(XTrain,YTrain,'Standardize',true)
mdl =
  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [68x1 double]
                     Bias: 23.0248
         KernelParameters: [1x1 struct]
                       Mu: [108.8810 2.9419e+03]
                    Sigma: [44.4943 805.1412]
          NumObservations: 84
           BoxConstraints: [84x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [84x1 logical]
                   Solver: 'SMO'
mdl is a RegressionSVM model.
Determine how well the trained model generalizes to new predictor values by estimating the test sample mean squared error and epsilon-insensitive error.
lossMSE = loss(mdl,XTest,YTest)
lossMSE = 32.0268
lossEI = loss(mdl,XTest,YTest,'LossFun','epsiloninsensitive')
lossEI = 3.2919
More About
Weighted Mean Squared Error
The weighted mean squared error is calculated as follows:
$$\mathrm{mse} = \frac{\sum_{j=1}^{n} w_j \left(f(x_j) - y_j\right)^2}{\sum_{j=1}^{n} w_j}$$
where:
n is the number of rows of data
x_j is the jth row of data
y_j is the true response to x_j
f(x_j) is the response prediction of the SVM regression model mdl to x_j
w is the vector of weights.
The weights in w are all equal to one by default. You can
specify different values for weights using the 'Weights'
name-value pair argument. If you specify weights, each value is divided by the sum
of all weights, such that the normalized weights add to one.
Epsilon-Insensitive Loss Function
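The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. It is formally described as:
$$\mathrm{Loss}_{\varepsilon} = \begin{cases} 0, & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise} \end{cases}$$
The mean epsilon-insensitive loss is calculated as follows:
$$\mathrm{Loss} = \frac{\sum_{j=1}^{n} w_j \max\left(0, |y_j - f(x_j)| - \varepsilon\right)}{\sum_{j=1}^{n} w_j}$$
where:
n is the number of rows of data
x_j is the jth row of data
y_j is the true response to x_j
f(x_j) is the response prediction of the SVM regression model mdl to x_j
w is the vector of weights.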
The weights in w are all equal to one by default. You can
specify different values for weights using the 'Weights'
name-value pair argument. If you specify weights, each value is divided by the sum
of all weights, such that the normalized weights add to one.
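The mean epsilon-insensitive loss can be reproduced by hand as a sanity check. This sketch assumes that loss takes ε from the Epsilon property of the trained model. That assumption is consistent with the definition above, but the page does not state it explicitly. The data and variable names are illustrative.

```matlab
% Train a small SVM regression model on the carsmall data
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);
mdl = fitrsvm(X,Y,'Standardize',true);

% Built-in epsilon-insensitive loss
LEI = loss(mdl,X,Y,'LossFun','epsiloninsensitive');

% Manual computation with unit weights.
% Assumption: epsilon is the model's Epsilon property.
Yfit = predict(mdl,X);
w = ones(numel(Y),1);
LManual = sum(w.*max(0,abs(Y - Yfit) - mdl.Epsilon))/sum(w);
```

Residuals smaller than ε contribute nothing to the sum, which is why this loss is typically much smaller than the mean squared error on the same data.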
Tips
If mdl is a cross-validated RegressionPartitionedSVM model, use kfoldLoss instead of loss to calculate the regression error.
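As a brief sketch of the cross-validated case, the following trains a 5-fold cross-validated model directly with fitrsvm and evaluates it with kfoldLoss. The number of folds and the variable names are arbitrary choices for illustration.

```matlab
% Prepare complete-case data from carsmall
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);

rng default % For reproducibility of the fold assignment

% Passing 'KFold' makes fitrsvm return a cross-validated (partitioned) model
cvmdl = fitrsvm(X,Y,'Standardize',true,'KFold',5);

% Cross-validated mean squared error, computed on the held-out folds
Lcv = kfoldLoss(cvmdl);
```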
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
The loss function fully supports tall arrays. For more information, see Tall Arrays.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™. (since R2023a)
This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2015b
R2023b: Specify predicted response value to use for observations with missing predictor values
Starting in R2023b, when you predict or compute the loss, some regression models allow you to specify the predicted response value for observations with missing predictor values. Specify the PredictionForMissingValue name-value argument to use a numeric scalar, the training set median, or the training set mean as the predicted value. When computing the loss, you can also specify to omit observations with missing predictor values.
This table lists the object functions that support the PredictionForMissingValue name-value argument. By default, the functions use the training set median as the predicted response value for observations with missing predictor values.
Model Type | Model Objects | Object Functions |
---|---|---|
Gaussian process regression (GPR) model | RegressionGP, CompactRegressionGP | loss, predict, resubLoss, resubPredict |
Gaussian process regression (GPR) model | RegressionPartitionedGP | kfoldLoss, kfoldPredict |
Gaussian kernel regression model | RegressionKernel | loss, predict |
Gaussian kernel regression model | RegressionPartitionedKernel | kfoldLoss, kfoldPredict |
Linear regression model | RegressionLinear | loss, predict |
Linear regression model | RegressionPartitionedLinear | kfoldLoss, kfoldPredict |
Neural network regression model | RegressionNeuralNetwork, CompactRegressionNeuralNetwork | loss, predict, resubLoss, resubPredict |
Neural network regression model | RegressionPartitionedNeuralNetwork | kfoldLoss, kfoldPredict |
Support vector machine (SVM) regression model | RegressionSVM, CompactRegressionSVM | loss, predict, resubLoss, resubPredict |
Support vector machine (SVM) regression model | RegressionPartitionedSVM | kfoldLoss, kfoldPredict |
In previous releases, the regression model loss and predict functions listed above used NaN predicted response values for observations with missing predictor values. The software omitted observations with missing predictor values from the resubstitution ("resub") and cross-validation ("kfold") computations for prediction and loss.
R2023a: GPU array support
Starting in R2023a, loss fully supports GPU arrays.
R2022a: loss can return NaN for predictor data with missing values
The loss function no longer omits an observation with a NaN prediction when computing the weighted average regression loss. Therefore, loss can now return NaN when the predictor data X or the predictor variables in Tbl contain any missing values. In most cases, if the test set observations do not contain missing predictors, the loss function does not return NaN.
This change improves the automatic selection of a regression model when you use fitrauto. Before this change, the software might select a model (expected to best predict the responses for new data) with few non-NaN predictors.
If loss in your code returns NaN, you can update your code to avoid this result. Remove or replace the missing values by using rmmissing or fillmissing, respectively.
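The two remedies can be sketched as follows. The evaluation data (XNew, YNew) and the choice to fill with training-set column means are assumptions made for illustration. Other fill strategies supported by fillmissing may suit your data better.

```matlab
% Train on complete data from carsmall
load carsmall
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);
Y = R(:,3);
mdl = fitrsvm(X,Y,'Standardize',true);

% Hypothetical new data with one missing predictor value
XNew = X(1:20,:);
YNew = Y(1:20);
XNew(3,2) = NaN;

% Option 1: remove incomplete rows, keeping the responses aligned.
% The second output of rmmissing flags the rows that were removed.
[XClean,removed] = rmmissing(XNew);
LRemoved = loss(mdl,XClean,YNew(~removed));

% Option 2: replace missing values with the training-set column means
XFilled = fillmissing(XNew,'constant',mean(X,1));
LFilled = loss(mdl,XFilled,YNew);
```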
The following table shows the regression models for which the loss object function might return NaN. For more details, see the Compatibility Considerations for each loss function.
Model Type | Full or Compact Model Object | loss Object Function |
---|---|---|
Gaussian process regression (GPR) model | RegressionGP , CompactRegressionGP | loss |
Gaussian kernel regression model | RegressionKernel | loss |
Linear regression model | RegressionLinear | loss |
Neural network regression model | RegressionNeuralNetwork , CompactRegressionNeuralNetwork | loss |
Support vector machine (SVM) regression model | RegressionSVM , CompactRegressionSVM | loss |