fitrsvm

Fit a support vector machine regression model

Syntax

Mdl = fitrsvm(Tbl,ResponseVarName)

Mdl = fitrsvm(Tbl,formula)

Mdl = fitrsvm(Tbl,Y)

Mdl = fitrsvm(X,Y)

Mdl = fitrsvm(___,Name,Value)

[Mdl,AggregateOptimizationResults] = fitrsvm(___)

Description

fitrsvm trains or cross-validates a support vector machine (SVM) regression model on a low- through moderate-dimensional predictor data set. fitrsvm supports mapping the predictor data using kernel functions, and minimizes the objective function using SMO, ISDA, or L1 soft-margin minimization via quadratic programming.

To train a linear SVM regression model on a high-dimensional data set, that is, a data set that includes many predictor variables, use fitrlinear instead.

To train an SVM model for binary classification, see fitcsvm for low- through moderate-dimensional predictor data sets, or fitclinear for high-dimensional data sets.

Mdl = fitrsvm(Tbl,ResponseVarName) returns a full, trained support vector machine (SVM) regression model Mdl trained using the predictor values in the table Tbl and the response values in Tbl.ResponseVarName.

Mdl = fitrsvm(Tbl,formula) returns a full SVM regression model trained using the predictor values in the table Tbl. formula is an explanatory model of the response and a subset of predictor variables in Tbl used to fit Mdl.

Mdl = fitrsvm(Tbl,Y) returns a full, trained SVM regression model trained using the predictor values in the table Tbl and the response values in the vector Y.

Mdl = fitrsvm(X,Y) returns a full, trained SVM regression model trained using the predictor values in the matrix X and the response values in the vector Y.

Mdl = fitrsvm(___,Name,Value) returns an SVM regression model with additional options specified by one or more name-value pair arguments, using any of the previous syntaxes. For example, you can specify the kernel function or train a cross-validated model.

[Mdl,AggregateOptimizationResults] = fitrsvm(___) also returns AggregateOptimizationResults, which contains hyperparameter optimization results when you specify the OptimizeHyperparameters and HyperparameterOptimizationOptions name-value arguments. You must also specify the ConstraintType and ConstraintBounds options of HyperparameterOptimizationOptions. You can use this syntax to optimize on compact model size instead of cross-validation loss, and to perform a set of multiple optimization problems that have the same options but different constraint bounds.

Examples

Train Linear Support Vector Machine Regression Model

Train a support vector machine (SVM) regression model using sample data stored in matrices.

Load the carsmall data set.

load carsmall
rng 'default'  % For reproducibility

Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower,Weight];
Y = MPG;

Train a default SVM regression model.

Mdl = fitrsvm(X,Y)

Mdl = 
  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [75x1 double]
                     Bias: 57.3800
         KernelParameters: [1x1 struct]
          NumObservations: 94
           BoxConstraints: [94x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [94x1 logical]
                   Solver: 'SMO'

Mdl is a trained RegressionSVM model.

Check the model for convergence.

Mdl.ConvergenceInfo.Converged

ans = logical
   0

0 indicates that the model did not converge.

Retrain the model using standardized data.

MdlStd = fitrsvm(X,Y,'Standardize',true)

MdlStd = 
  RegressionSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
                    Alpha: [77x1 double]
                     Bias: 22.9131
         KernelParameters: [1x1 struct]
                       Mu: [109.3441 2.9625e+03]
                    Sigma: [45.3545 805.9668]
          NumObservations: 94
           BoxConstraints: [94x1 double]
          ConvergenceInfo: [1x1 struct]
          IsSupportVector: [94x1 logical]
                   Solver: 'SMO'

Check the model for convergence.

MdlStd.ConvergenceInfo.Converged

ans = logical
   1

1 indicates that the model did converge.

Compute the resubstitution (in-sample) mean-squared error for the new model.

lStd = resubLoss(MdlStd)

lStd = 
16.8551
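
As a hedged cross-check of the resubstitution loss, the mean-squared error can be recomputed by hand from the in-sample predictions. This sketch assumes default (equal) observation weights and that the Y property of the model holds the responses actually used in training. The names yHat and mseManual are illustrative.

```matlab
load carsmall
rng 'default'  % For reproducibility
X = [Horsepower,Weight];
Y = MPG;
MdlStd = fitrsvm(X,Y,'Standardize',true);

% In-sample predictions for the observations stored in the model
yHat = resubPredict(MdlStd);

% Mean-squared error computed by hand (assumes equal observation weights)
mseManual = mean((MdlStd.Y - yHat).^2);

% Compare with the built-in resubstitution loss
lStd = resubLoss(MdlStd);
fprintf('resubLoss: %.4f, manual MSE: %.4f\n',lStd,mseManual)
```

If the two values differ, the most likely cause is nonuniform observation weights, which resubLoss takes into account and this sketch does not.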

Train Support Vector Machine Regression Model

Train a support vector machine regression model using the abalone data from the UCI Machine Learning Repository.

Download the data and save it in your current folder with the name 'abalone.csv'.

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data';
websave('abalone.csv',url);

Read the data into a table. Specify the variable names.

varnames = {'Sex'; 'Length'; 'Diameter'; 'Height'; 'Whole_weight';...
    'Shucked_weight'; 'Viscera_weight'; 'Shell_weight'; 'Rings'};
Tbl = readtable('abalone.csv','Filetype','text','ReadVariableNames',false);
Tbl.Properties.VariableNames = varnames;

The sample data contains 4177 observations. All the predictor variables are continuous except for Sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings (stored in Rings) on the abalone and determine its age using physical measurements.

Train an SVM regression model, using a Gaussian kernel function with an automatic kernel scale. Standardize the data.

rng default  % For reproducibility
Mdl = fitrsvm(Tbl,'Rings','KernelFunction','gaussian','KernelScale','auto',...
    'Standardize',true)

Mdl = 
  RegressionSVM
           PredictorNames: {'Sex'  'Length'  'Diameter'  'Height'  'Whole_weight'  'Shucked_weight'  'Viscera_weight'  'Shell_weight'}
             ResponseName: 'Rings'
    CategoricalPredictors: 1
        ResponseTransform: 'none'
                    Alpha: [3635×1 double]
                     Bias: 10.8144
         KernelParameters: [1×1 struct]
                       Mu: [0 0 0 0.5240 0.4079 0.1395 0.8287 0.3594 0.1806 0.2388]
                    Sigma: [1 1 1 0.1201 0.0992 0.0418 0.4904 0.2220 0.1096 0.1392]
          NumObservations: 4177
           BoxConstraints: [4177×1 double]
          ConvergenceInfo: [1×1 struct]
          IsSupportVector: [4177×1 logical]
                   Solver: 'SMO'


  Properties, Methods

The Command Window shows that Mdl is a trained RegressionSVM model and displays a property list.

Display the properties of Mdl using dot notation. For example, check to confirm whether the model converged and how many iterations it completed.

conv = Mdl.ConvergenceInfo.Converged

conv = logical
   1

iter = Mdl.NumIterations

iter = 2759

The returned results indicate that the model converged after 2759 iterations.

Cross-Validate SVM Regression Model

Load the carsmall data set.

load carsmall
rng 'default'  % For reproducibility

Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower Weight];
Y = MPG;

Cross-validate two SVM regression models using 5-fold cross-validation. For both models, specify to standardize the predictors. For one of the models, specify to train using the default linear kernel, and the Gaussian kernel for the other model.

MdlLin = fitrsvm(X,Y,'Standardize',true,'KFold',5)

MdlLin = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'

MdlGau = fitrsvm(X,Y,'Standardize',true,'KFold',5,'KernelFunction','gaussian')

MdlGau = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
         PredictorNames: {'x1'  'x2'}
           ResponseName: 'Y'
        NumObservations: 94
                  KFold: 5
              Partition: [1x1 cvpartition]
      ResponseTransform: 'none'

MdlLin.Trained

ans=5×1 cell array
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}

MdlLin and MdlGau are RegressionPartitionedSVM cross-validated models. The Trained property of each model is a 5-by-1 cell array of CompactRegressionSVM models. Each model in the cell array is trained on four folds of observations, with the remaining fold held out.

Compare the generalization error of the models. In this case, the generalization error is the out-of-sample mean-squared error.

mseLin = kfoldLoss(MdlLin)

mseLin = 
17.2987

mseGau = kfoldLoss(MdlGau)

mseGau = 
16.5978

The SVM regression model using the Gaussian kernel performs better than the one using the linear kernel.

Create a model suitable for making predictions by passing the entire data set to fitrsvm, and specify all name-value pair arguments that yielded the better-performing model. However, do not specify any cross-validation options.

MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');

To predict the MPG of a set of cars, pass MdlGau and a two-column matrix containing the horsepower and weight measurements of the cars to predict.
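
A minimal sketch of that prediction step follows. It assumes the final Gaussian-kernel model was trained on a numeric matrix, so new data are passed as a matrix with the same column order (horsepower, then weight). The values in XNew are hypothetical cars, not part of carsmall.

```matlab
load carsmall
rng 'default'  % For reproducibility
X = [Horsepower Weight];
Y = MPG;
MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');

% Hypothetical new cars: each row is [horsepower weight]
XNew = [100 2500;
        150 3500];

% Predicted MPG, one value per row of XNew
mpgHat = predict(MdlGau,XNew)
```

Because the model was trained with 'Standardize',true, predict is given the raw, unstandardized measurements.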

Optimize SVM Regression

This example shows how to optimize hyperparameters automatically using fitrsvm. The example uses the carsmall data.

Load the carsmall data set.

load carsmall

Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).

X = [Horsepower Weight];
Y = MPG;

Delete rows of X and Y where either array has missing values.

R = rmmissing([X Y]);
X = R(:,1:end-1);
Y = R(:,end);

Find hyperparameters that minimize five-fold cross-validation loss by using automatic hyperparameter optimization.

For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.

rng default
Mdl = fitrsvm(X,Y,'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus'))

|===================================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | BoxConstraint|  KernelScale |      Epsilon |  Standardize |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |              |              |
|===================================================================================================================================|
|    1 | Best   |       2.935 |     0.22434 |       2.935 |       2.935 |        294.5 |        11.95 |       0.4572 |         true |
|    2 | Accept |      3.1124 |    0.046208 |       2.935 |      2.9771 |       0.3265 |       938.31 |      0.26184 |        false |
|    3 | Accept |      11.104 |      6.0718 |       2.935 |      3.0485 |       439.19 |     0.047381 |     0.060061 |        false |
|    4 | Accept |      14.705 |      7.2936 |       2.935 |      2.9355 |    0.0086399 |    0.0027446 |      0.61439 |        false |
|    5 | Accept |      4.1988 |    0.050408 |       2.935 |      3.0066 |        0.123 |        999.3 |          201 |         true |
|    6 | Accept |      3.0084 |     0.21611 |       2.935 |      2.9355 |      0.89057 |    0.0080922 |       8.0144 |         true |
|    7 | Accept |      4.1988 |    0.031932 |       2.935 |      3.5404 |    0.0010016 |      0.62201 |       32.871 |         true |
|    8 | Accept |      4.1418 |    0.031055 |       2.935 |      2.9345 |    0.0037482 |    0.0010004 |       16.616 |         true |
|    9 | Accept |       8.042 |      6.1329 |       2.935 |      2.9354 |       995.25 |    0.0010955 |      0.14275 |         true |
|   10 | Accept |      4.1862 |    0.030103 |       2.935 |      2.9355 |       620.98 |       986.71 |       1.5902 |         true |
|   11 | Best   |      2.9241 |    0.088256 |      2.9241 |       2.926 |       2.1316 |       997.42 |    0.0096788 |        false |
|   12 | Accept |      4.1988 |    0.029241 |      2.9241 |      2.9246 |    0.0010101 |     0.016239 |       147.67 |         true |
|   13 | Accept |      2.9598 |    0.068985 |      2.9241 |      2.9247 |       1.4657 |       1.4793 |      0.38864 |         true |
|   14 | Best   |      2.9088 |    0.056299 |      2.9088 |      2.9121 |       959.16 |       995.08 |      0.54066 |        false |
|   15 | Accept |      4.1988 |    0.044734 |      2.9088 |      2.9102 |       754.18 |       993.03 |       275.59 |        false |
|   16 | Accept |      4.1988 |    0.028712 |      2.9088 |      2.9102 |       16.919 |       0.9408 |       921.29 |         true |
|   17 | Accept |      2.9568 |    0.066562 |      2.9088 |      2.9106 |     0.050235 |     0.039749 |    0.0093077 |         true |
|   18 | Accept |      4.1487 |     0.10195 |      2.9088 |      2.9097 |        5.277 |       46.186 |    0.0095359 |         true |
|   19 | Best   |       2.905 |     0.15923 |       2.905 |       2.875 |      0.27078 |     0.061962 |      0.84063 |         true |
|   20 | Accept |      2.9578 |    0.061801 |       2.905 |      2.8776 |       201.57 |       1.1402 |    0.0094718 |         true |
|===================================================================================================================================|
| Iter | Eval   | Objective:  | Objective   | BestSoFar   | BestSoFar   | BoxConstraint|  KernelScale |      Epsilon |  Standardize |
|      | result | log(1+loss) | runtime     | (observed)  | (estim.)    |              |              |              |              |
|===================================================================================================================================|
|   21 | Accept |      2.9308 |    0.048277 |       2.905 |      2.9026 |        546.6 |       882.55 |    0.0094502 |        false |
|   22 | Accept |      2.9098 |    0.031697 |       2.905 |      2.9032 |       32.703 |       984.05 |      0.12178 |        false |
|   23 | Accept |      4.1988 |    0.086435 |       2.905 |      2.9025 |       958.12 |       89.399 |       893.27 |         true |
|   24 | Accept |      2.9651 |    0.039784 |       2.905 |      2.9021 |      0.62018 |      0.28426 |    0.0093797 |         true |
|   25 | Accept |      4.1989 |    0.031419 |       2.905 |      2.9021 |    0.0010514 |       988.23 |     0.011796 |        false |
|   26 | Accept |      2.9381 |     0.04209 |       2.905 |      2.8933 |       86.303 |       2.3086 |      0.20666 |         true |
|   27 | Accept |       2.962 |     0.10533 |       2.905 |      2.8932 |        915.6 |       7.2222 |    0.0093543 |         true |
|   28 | Accept |      2.9341 |    0.096364 |       2.905 |      2.8946 |      0.13906 |     0.013474 |      0.35647 |         true |
|   29 | Accept |      2.9494 |     0.03663 |       2.905 |      2.9029 |       966.28 |       4.0378 |     0.088829 |         true |
|   30 | Accept |      2.9464 |     0.49554 |       2.905 |       2.903 |       986.98 |       248.46 |      0.11212 |        false |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 31.4991 seconds
Total objective function evaluation time: 21.8477

Best observed feasible point:
    BoxConstraint    KernelScale    Epsilon    Standardize
    _____________    ___________    _______    ___________

       0.27078        0.061962      0.84063       true    

Observed objective function value = 2.905
Estimated objective function value = 2.903
Function evaluation time = 0.15923

Best estimated feasible point (according to models):
    BoxConstraint    KernelScale    Epsilon    Standardize
    _____________    ___________    _______    ___________

       0.27078        0.061962      0.84063       true    

Estimated objective function value = 2.903
Estimated function evaluation time = 0.095415

[Figure: Min objective vs. Number of function evaluations. x-axis: Function evaluations; y-axis: Min objective. Lines: Min observed objective, Estimated min objective.]

Mdl = 
  RegressionSVM
                         ResponseName: 'Y'
                CategoricalPredictors: []
                    ResponseTransform: 'none'
                                Alpha: [81x1 double]
                                 Bias: 22.9779
                     KernelParameters: [1x1 struct]
                                   Mu: [109.3441 2.9625e+03]
                                Sigma: [45.3545 805.9668]
                      NumObservations: 93
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
                       BoxConstraints: [93x1 double]
                      ConvergenceInfo: [1x1 struct]
                      IsSupportVector: [93x1 logical]
                               Solver: 'SMO'

The optimization searched over BoxConstraint, KernelScale, Epsilon, and Standardize. The output is the regression with the minimum estimated cross-validation loss.

Input Arguments

`Tbl` — Predictor data
table

Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If Tbl contains the response variable, and you want to use all remaining variables in Tbl as predictors, then specify the response variable using ResponseVarName.

If Tbl contains the response variable, and you want to use only a subset of the remaining variables in Tbl as predictors, then specify a formula using formula.

If Tbl does not contain the response variable, then specify a response variable using Y. The length of the response variable and the number of rows of Tbl must be equal.

If a row of Tbl or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.

To specify the names of the predictors in the order of their appearance in Tbl, use the PredictorNames name-value pair argument.

Data Types: table

`ResponseVarName` — Response variable name
name of variable in `Tbl`

Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.

You must specify ResponseVarName as a character vector or string scalar. For example, if Tbl stores the response variable Y as Tbl.Y, then specify it as "Y". Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.

Data Types: char | string

`formula` — Explanatory model of response variable and subset of predictor variables
character vector | string scalar

Explanatory model of the response variable and a subset of the predictor variables, specified as a character vector or string scalar in the form "Y~x1+x2+x3". In this form, Y represents the response variable, and x1, x2, and x3 represent the predictor variables.

To specify a subset of variables in Tbl as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in Tbl that do not appear in formula.

The variable names in the formula must be both variable names in Tbl (Tbl.Properties.VariableNames) and valid MATLAB® identifiers. You can verify the variable names in Tbl by using the isvarname function. If the variable names are not valid, then you can convert them by using the matlab.lang.makeValidName function.
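
The following sketch illustrates the formula syntax on the carsmall data. The table Tbl built here and the name rawName are assumptions made for illustration. The formula keeps Horsepower and Weight as predictors and leaves Acceleration unused.

```matlab
load carsmall
Tbl = table(Horsepower,Weight,Acceleration,MPG);

% Use only Horsepower and Weight as predictors; Acceleration is not used
Mdl = fitrsvm(Tbl,'MPG~Horsepower+Weight','Standardize',true);

% List the predictors that the model actually uses
Mdl.PredictorNames

% Check a candidate variable name and convert it if it is not valid
rawName = 'Whole weight';                        % hypothetical name with a space
isValid = isvarname(rawName);                    % false, because of the space
validName = matlab.lang.makeValidName(rawName);  % converted to a valid identifier
```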

Data Types: char | string

`Y` — Response data
numeric vector

Response data, specified as an n-by-1 numeric vector. The length of Y and the number of rows of Tbl or X must be equal.

If a row of Tbl or X, or an element of Y, contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.

To specify the response variable name, use the ResponseName name-value pair argument.

Data Types: single | double

`X` — Predictor data
numeric matrix

Predictor data to which the SVM regression model is fit, specified as an n-by-p numeric matrix. n is the number of observations and p is the number of predictor variables.

The length of Y and the number of rows of X must be equal.

If a row of X or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments.

To specify the names of the predictors in the order of their appearance in X, use the PredictorNames name-value pair argument.

Data Types: single | double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'KernelFunction','gaussian','Standardize',true,'CrossVal','on' trains a 10-fold cross-validated SVM regression model using a Gaussian kernel and standardized training data.
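
As a brief illustration of the two spellings described above, this sketch passes the same options in both forms. It assumes a release that accepts Name=Value syntax (R2021a or later) for the first call. The names MdlA and MdlB are illustrative.

```matlab
load carsmall
X = [Horsepower Weight];
Y = MPG;

% R2021a and later: Name=Value syntax
rng default  % For reproducibility
MdlA = fitrsvm(X,Y,KernelFunction="gaussian",Standardize=true);

% All releases: quoted names, with commas between each name and value
rng default  % For reproducibility
MdlB = fitrsvm(X,Y,'KernelFunction','gaussian','Standardize',true);
```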

Note

You cannot use any cross-validation name-value argument together with the OptimizeHyperparameters name-value argument. You can modify the cross-validation for OptimizeHyperparameters only by using the HyperparameterOptimizationOptions name-value argument.

Support Vector Machine Options

`BoxConstraint` — Box constraint
positive scalar value

Box constraint for the alpha coefficients, specified as the comma-separated pair consisting of 'BoxConstraint' and a positive scalar value.

The absolute value of the Alpha coefficients cannot exceed the value of BoxConstraint.

The default BoxConstraint value for the 'gaussian' or 'rbf' kernel function is iqr(Y)/1.349, where iqr(Y) is the interquartile range of response variable Y. For all other kernels, the default BoxConstraint value is 1.

Example: 'BoxConstraint',10

Data Types: single | double

`KernelFunction` — Kernel function
`'linear'` (default) | `'gaussian'` | `'rbf'` | `'polynomial'` | function name

Kernel function used to compute the Gram matrix, specified as the comma-separated pair consisting of 'KernelFunction' and a value in this table.

Value	Description	Formula
`'gaussian'` or `'rbf'`	Gaussian or Radial Basis Function (RBF) kernel	$G (x_{j}, x_{k}) = \exp (- {‖ x_{j} - x_{k} ‖}^{2})$
`'linear'`	Linear kernel	$G (x_{j}, x_{k}) = x_{j}' x_{k}$
`'polynomial'`	Polynomial kernel. Use `'PolynomialOrder',q` to specify a polynomial kernel of order `q`.	$G (x_{j}, x_{k}) = {(1 + x_{j}' x_{k})}^{q}$

You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. kernel must have the following form:

function G = kernel(U,V)

where:

U is an m-by-p matrix.
V is an n-by-p matrix.
G is an m-by-n Gram matrix of the rows of U and V.

The file kernel.m must be on the MATLAB path.

It is good practice to avoid using generic names for kernel functions. For example, call a sigmoid kernel function 'mysigmoid' rather than 'sigmoid'.
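
A minimal sketch of a custom kernel file that follows the required form is shown here. The function name mysigmoid comes from the naming advice above. The slope gamma and intercept c are illustrative values chosen for this sketch, not recommended settings.

```matlab
function G = mysigmoid(U,V)
% MYSIGMOID Sigmoid kernel for use with 'KernelFunction','mysigmoid'.
%   U is an m-by-p matrix, V is an n-by-p matrix, and G is the
%   m-by-n Gram matrix of the rows of U and V.
gamma = 1;    % slope (illustrative value)
c = -1;       % intercept (illustrative value)
G = tanh(gamma*(U*V') + c);
end
```

After saving this code as mysigmoid.m in a folder on the MATLAB path, you can pass 'KernelFunction','mysigmoid' to fitrsvm.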

Example: 'KernelFunction','gaussian'

Data Types: char | string

`KernelScale` — Kernel scale parameter
`1` (default) | `'auto'` | positive scalar

Kernel scale parameter, specified as the comma-separated pair consisting of 'KernelScale' and 'auto' or a positive scalar. The software divides all elements of the predictor matrix X by the value of KernelScale. Then, the software applies the appropriate kernel norm to compute the Gram matrix.

If you specify 'auto', then the software selects an appropriate scale factor using a heuristic procedure. This heuristic procedure uses subsampling, so estimates can vary from one call to another. Therefore, to reproduce results, set a random number seed using rng before training.
If you specify KernelScale and your own kernel function, for example, 'KernelFunction','kernel', then the software throws an error. You must apply scaling within kernel.
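
To make the scaling step concrete, the following sketch computes a Gaussian Gram matrix by hand after dividing the predictors by a kernel scale. It illustrates the description above and is not the internal implementation. The data in X and the scale s are hypothetical.

```matlab
% Three observations, two predictors (illustrative values)
X = [1 2; 3 4; 5 6];
s = 2;                         % hypothetical KernelScale value

% Step 1: divide every element of the predictor matrix by the scale
Xs = X./s;

% Step 2: squared Euclidean distance between every pair of rows
% (uses implicit expansion, available in R2016b and later)
sq = sum(Xs.^2,2);
D2 = sq + sq' - 2*(Xs*Xs');

% Step 3: Gaussian kernel, G(j,k) = exp(-||x_j/s - x_k/s||^2)
G = exp(-D2)
```

A larger scale shrinks the distances between observations, so the off-diagonal entries of G move toward 1 and the fitted function becomes smoother.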

Example: 'KernelScale','auto'

Data Types: double | single | char | string

`PolynomialOrder` — Polynomial kernel function order
`3` (default) | positive integer

Polynomial kernel function order, specified as the comma-separated pair consisting of 'PolynomialOrder' and a positive integer.

If you set 'PolynomialOrder' and KernelFunction is not 'polynomial', then the software throws an error.

Example: 'PolynomialOrder',2

Data Types: double | single

`KernelOffset` — Kernel offset parameter
nonnegative scalar

Kernel offset parameter, specified as the comma-separated pair consisting of 'KernelOffset' and a nonnegative scalar.

The software adds KernelOffset to each element of the Gram matrix.

The defaults are:

0 if the solver is SMO (that is, you set 'Solver','SMO')
0.1 if the solver is ISDA (that is, you set 'Solver','ISDA')

Example: 'KernelOffset',0

Data Types: double | single

`Epsilon` — Half the width of epsilon-insensitive band
`iqr(Y)/13.49` (default) | nonnegative scalar value

Half the width of the epsilon-insensitive band, specified as the comma-separated pair consisting of 'Epsilon' and a nonnegative scalar value.

The default Epsilon value is iqr(Y)/13.49, which is an estimate of a tenth of the standard deviation using the interquartile range of the response variable Y. If iqr(Y) is equal to zero, then the default Epsilon value is 0.1.
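
The default rule can be reproduced by hand, as in this sketch. For normally distributed data the interquartile range is about 1.349 standard deviations, so iqr(Y)/13.49 approximates one tenth of the standard deviation. The response values in Y are hypothetical.

```matlab
Y = [18; 15; 36; 24; 30; 22; 27; 33];   % hypothetical response values

% Default half-width of the epsilon-insensitive band
epsDefault = iqr(Y)/13.49;

% Fallback described above for a response with zero interquartile range
if epsDefault == 0
    epsDefault = 0.1;
end
```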

Example: 'Epsilon',0.3

Data Types: single | double

`Standardize` — Flag to standardize predictor data
`false` (default) | `true`

Flag to standardize the predictor data, specified as the comma-separated pair consisting of 'Standardize' and true (1) or false (0).

If you set 'Standardize',true:

The software centers and scales each column of the predictor data (X) by the weighted column mean and standard deviation, respectively (for details on weighted standardizing, see Algorithms). MATLAB does not standardize the data contained in the dummy variable columns generated for categorical predictors.
The software trains the model using the standardized predictor matrix, but stores the unstandardized data in the model property X.
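
The following sketch shows the centering and scaling for the simplest case. It assumes equal observation weights, where the weighted mean and standard deviation reduce to the ordinary sample statistics, and it standardizes numeric predictors only. The names mu, sigma, and Z are illustrative.

```matlab
load carsmall

% Keep only complete rows of the two predictors and the response
R = rmmissing([Horsepower Weight MPG]);
X = R(:,1:2);

% Column means and sample standard deviations
mu = mean(X,1);
sigma = std(X,0,1);

% Standardized predictors (implicit expansion, R2016b and later)
Z = (X - mu)./sigma;
```

When you train with 'Standardize',true, compare these values against the Mu and Sigma properties of the trained model.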

Example: 'Standardize',true

Data Types: logical

`Solver` — Optimization routine
`'ISDA'` | `'L1QP'` | `'SMO'`

Optimization routine, specified as the comma-separated pair consisting of 'Solver' and a value in this table.

Value	Description
`'ISDA'`	Iterative Single Data Algorithm (see [3])
`'L1QP'`	Uses `quadprog` (Optimization Toolbox) to implement L1 soft-margin minimization by quadratic programming. This option requires an Optimization Toolbox™ license. For more details, see Quadratic Programming Definition (Optimization Toolbox).
`'SMO'`	Sequential Minimal Optimization (see [2])

The defaults are:

'ISDA' if you set 'OutlierFraction' to a positive value
'SMO' otherwise
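
As a short illustration of these defaults, the sketch below trains one model without 'OutlierFraction' and one with a positive value, then displays the solver each model used. The outlier fraction of 0.05 is an arbitrary illustrative choice.

```matlab
load carsmall
X = [Horsepower Weight];
Y = MPG;

% No outlier fraction: the default solver applies
MdlDefault = fitrsvm(X,Y,'Standardize',true);

% Positive outlier fraction: the default solver changes
MdlRobust = fitrsvm(X,Y,'Standardize',true,'OutlierFraction',0.05);

disp(MdlDefault.Solver)
disp(MdlRobust.Solver)
```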

Example: 'Solver','ISDA'

`Alpha` — Initial estimates of alpha coefficients
numeric vector

Initial estimates of alpha coefficients, specified as the comma-separated pair consisting of 'Alpha' and a numeric vector. The length of Alpha must be equal to the number of rows of X.

Each element of Alpha corresponds to an observation in X.
Alpha cannot contain any NaNs.
If you specify Alpha and any one of the cross-validation name-value pair arguments ('CrossVal', 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'), then the software returns an error.

If Y contains any missing values, then remove all rows of Y, X, and Alpha that correspond to the missing values. That is, enter:

% Keep only the observations with a nonmissing response
idx = ~isnan(Y);
Y = Y(idx);
X = X(idx,:);
alpha = alpha(idx);

Then, pass Y, X, and alpha as the response, predictors, and initial alpha estimates, respectively.

The default is zeros(size(Y,1),1).

Example: 'Alpha',0.1*ones(size(X,1),1)

Data Types: single | double

`CacheSize` — Cache size
`1000` (default) | `'maximal'` | positive scalar

Cache size, specified as the comma-separated pair consisting of 'CacheSize' and 'maximal' or a positive scalar.

If CacheSize is 'maximal', then the software reserves enough memory to hold the entire n-by-n Gram matrix.

If CacheSize is a positive scalar, then the software reserves CacheSize megabytes of memory for training the model.

Example: 'CacheSize','maximal'

Data Types: double | single | char | string

`ClipAlphas` — Flag to clip alpha coefficients
`true` (default) | `false`

Flag to clip alpha coefficients, specified as the comma-separated pair consisting of 'ClipAlphas' and either true or false.

Suppose that the alpha coefficient for observation j is α_j and the box constraint of observation j is C_j, j = 1,...,n, where n is the training sample size.

Value	Description
`true`	At each iteration, if α_j is near 0 or near C_j, then MATLAB sets α_j to 0 or to C_j, respectively.
`false`	MATLAB does not change the alpha coefficients during optimization.

MATLAB stores the final values of α in the Alpha property of the trained SVM model object.

ClipAlphas can affect SMO and ISDA convergence.

Example: 'ClipAlphas',false

Data Types: logical

`NumPrint` — Number of iterations between optimization diagnostic message output
`1000` (default) | nonnegative integer

Number of iterations between optimization diagnostic message output, specified as the comma-separated pair consisting of 'NumPrint' and a nonnegative integer.

If you specify 'Verbose',1 and 'NumPrint',numprint, then the software displays all optimization diagnostic messages from SMO and ISDA every numprint iterations in the Command Window.

Example: 'NumPrint',500

Data Types: double | single

`OutlierFraction` — Expected proportion of outliers in training data
0 (default) | numeric scalar in the interval [0,1)

Expected proportion of outliers in training data, specified as the comma-separated pair consisting of 'OutlierFraction' and a numeric scalar in the interval [0,1). fitrsvm removes observations with large gradients, ensuring that fitrsvm removes the fraction of observations specified by OutlierFraction by the time convergence is reached. This name-value pair is only valid when 'Solver' is 'ISDA'.

Example: 'OutlierFraction',0.1

Data Types: single | double

`RemoveDuplicates` — Flag to replace duplicate observations with single observations
`false` (default) | `true`

Flag to replace duplicate observations with single observations in the training data, specified as the comma-separated pair consisting of 'RemoveDuplicates' and true or false.

If RemoveDuplicates is true, then fitrsvm replaces duplicate observations in the training data with a single observation of the same value. The weight of the single observation is equal to the sum of the weights of the corresponding removed duplicates (see Weights).

Tip

If your data set contains many duplicate observations, then specifying 'RemoveDuplicates',true can decrease convergence time considerably.

Data Types: logical

`Verbose` — Verbosity level
`0` (default) | `1` | `2`

Verbosity level, specified as the comma-separated pair consisting of 'Verbose' and 0, 1, or 2. The value of Verbose controls the amount of optimization information that the software displays in the Command Window. The software also saves this information as a structure in Mdl.ConvergenceInfo.History.

This table summarizes the available verbosity level options.

Value	Description
`0`	The software does not display or save convergence information.
`1`	The software displays diagnostic messages and saves convergence criteria every `numprint` iterations, where `numprint` is the value of the name-value pair argument `'NumPrint'`.
`2`	The software displays diagnostic messages and saves convergence criteria at every iteration.

Example: 'Verbose',1

Data Types: double | single
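
The interaction between Verbose and NumPrint can be sketched as follows. This is an illustrative call that assumes the carsmall sample data and an arbitrary choice of predictors.

```matlab
load carsmall
X = [Horsepower Weight];          % predictors chosen for illustration

% Print diagnostics and record convergence criteria every 500 iterations
Mdl = fitrsvm(X,MPG,'Standardize',true, ...
    'Verbose',1,'NumPrint',500);

% History is saved only when Verbose is greater than 0
history = Mdl.ConvergenceInfo.History;
disp(history)
```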

Other Regression Options

`CategoricalPredictors` — Categorical predictors list
vector of positive integers | logical vector | character matrix | string array | cell array of character vectors | `'all'`

Categorical predictors list, specified as one of the values in this table.

Value	Description
Vector of positive integers	Each entry in the vector is an index value indicating that the corresponding predictor is categorical. The index values are between 1 and `p`, where `p` is the number of predictors used to train the model. If `fitrsvm` uses a subset of input variables as predictors, then the function indexes the predictors using only the subset. The `CategoricalPredictors` values do not count any response variable, observation weights variable, or other variable that the function does not use.
Logical vector	A `true` entry means that the corresponding predictor is categorical. The length of the vector is `p`.
Character matrix	Each row of the matrix is the name of a predictor variable. The names must match the entries in `PredictorNames`. Pad the names with extra blanks so each row of the character matrix has the same length.
String array or cell array of character vectors	Each element in the array is the name of a predictor variable. The names must match the entries in `PredictorNames`.
`"all"`	All predictors are categorical.

By default, if the predictor data is in a table (Tbl), fitrsvm assumes that a variable is categorical if it is a logical vector, categorical vector, character array, string array, or cell array of character vectors. If the predictor data is a matrix (X), fitrsvm assumes that all predictors are continuous. To identify any other predictors as categorical predictors, specify them by using the CategoricalPredictors name-value argument.

For the identified categorical predictors, fitrsvm creates dummy variables using two different schemes, depending on whether a categorical variable is unordered or ordered. For an unordered categorical variable, fitrsvm creates one dummy variable for each level of the categorical variable. For an ordered categorical variable, fitrsvm creates one less dummy variable than the number of categories. For details, see Automatic Creation of Dummy Variables.

Example: 'CategoricalPredictors','all'
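
For matrix input, where all predictors are treated as continuous by default, a categorical column can be flagged by index. The sketch below assumes the carsmall sample data and treats Cylinders as categorical purely for illustration.

```matlab
load carsmall
X = [Horsepower Weight Cylinders];   % third column treated as categorical

Mdl = fitrsvm(X,MPG,'Standardize',true, ...
    'CategoricalPredictors',3, ...
    'PredictorNames',{'Horsepower','Weight','Cylinders'});

disp(Mdl.CategoricalPredictors)      % indices of the categorical predictors
disp(Mdl.ExpandedPredictorNames)     % includes one name per dummy variable
```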

`PredictorNames` — Predictor variable names
string array of unique names | cell array of unique character vectors

Predictor variable names, specified as a string array of unique names or cell array of unique character vectors. The functionality of PredictorNames depends on the way you supply the training data.

If you supply X and Y, then you can use PredictorNames to assign names to the predictor variables in X.
- The order of the names in PredictorNames must correspond to the column order of X. That is, PredictorNames{1} is the name of X(:,1), PredictorNames{2} is the name of X(:,2), and so on. Also, size(X,2) and numel(PredictorNames) must be equal.
- By default, PredictorNames is {'x1','x2',...}.
If you supply Tbl, then you can use PredictorNames to choose which predictor variables to use in training. That is, fitrsvm uses only the predictor variables in PredictorNames and the response variable during training.
- PredictorNames must be a subset of Tbl.Properties.VariableNames and cannot include the name of the response variable.
- By default, PredictorNames contains the names of all predictor variables.
- A good practice is to specify the predictors for training using either PredictorNames or formula, but not both.

Example: "PredictorNames",["SepalLength","SepalWidth","PetalLength","PetalWidth"]

Data Types: string | cell

`ResponseName` — Response variable name
`"Y"` (default) | character vector | string scalar

Response variable name, specified as a character vector or string scalar.

If you supply Y, then you can use ResponseName to specify a name for the response variable.
If you supply ResponseVarName or formula, then you cannot use ResponseName.

Example: ResponseName="response"

Data Types: char | string

`ResponseTransform` — Function for transforming raw response values
`"none"` (default) | function handle | function name

Function for transforming raw response values, specified as a function handle or function name. The default is "none", which means @(y)y, or no transformation. The function should accept a vector (the original response values) and return a vector of the same size (the transformed response values).

Example: Suppose you create a function handle that applies an exponential transformation to an input vector by using myfunction = @(y)exp(y). Then, you can specify the response transformation as ResponseTransform=myfunction.

Data Types: char | string | function_handle

`Weights` — Observation weights
`ones(size(X,1),1)` (default) | vector of numeric values

Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of numeric values. The size of Weights must equal the number of rows in X. fitrsvm normalizes the values of Weights to sum to 1.

Data Types: single | double
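
A short sketch of weighted training follows. The weighting rule (doubling the weight of cars from model year 82) is a hypothetical choice, and the check on the stored weights assumes that the W property of the trained model holds the normalized values.

```matlab
load carsmall
X = [Horsepower Weight];             % predictors chosen for illustration

% One weight per row of X; emphasize the newest cars (hypothetical rule)
w = ones(size(X,1),1);
w(Model_Year == 82) = 2;

Mdl = fitrsvm(X,MPG,'Standardize',true,'Weights',w);

% Sum of the weights stored with the model
disp(sum(Mdl.W))
```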

Cross-Validation Options

`CrossVal` — Cross-validation flag
`'off'` (default) | `'on'`

Cross-validation flag, specified as the comma-separated pair consisting of 'CrossVal' and either 'on' or 'off'.

If you specify 'on', then the software implements 10-fold cross-validation.

To override this cross-validation setting, use one of these name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout. To create a cross-validated model, you can use only one cross-validation name-value pair argument at a time.

Alternatively, you can cross-validate the model later using the crossval method.

Example: 'CrossVal','on'

`CVPartition` — Cross-validation partition
`[]` (default) | `cvpartition` object

Cross-validation partition, specified as a cvpartition object that specifies the type of cross-validation and the indexing for the training and validation sets.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,KFold=5). Then, you can specify the cross-validation partition by setting CVPartition=cvp.
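
The partition workflow can be sketched end to end as follows. The data set (carsmall), the predictors, and the random seed are assumptions for illustration; rows with missing values are removed first so that the partition size matches the data passed to fitrsvm.

```matlab
load carsmall
X = [Horsepower Weight];
Y = MPG;

% Drop rows with missing predictors or response before partitioning
ok = ~any(isnan([X Y]),2);
X = X(ok,:);
Y = Y(ok);

rng(1)                                   % for a repeatable partition
cvp = cvpartition(size(X,1),'KFold',5);

% Mdl is a cross-validated model because CVPartition is specified
CVMdl = fitrsvm(X,Y,'Standardize',true,'CVPartition',cvp);

% Cross-validated mean squared error
mse = kfoldLoss(CVMdl);
disp(mse)
```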

`Holdout` — Fraction of data for holdout validation
scalar value in the range (0,1)

Fraction of the data used for holdout validation, specified as a scalar value in the range (0,1). If you specify Holdout=p, then the software completes these steps:

Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.
Store the compact trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Holdout=0.1

Data Types: double | single
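
The holdout steps above can be sketched as follows, again assuming the carsmall sample data, an arbitrary pair of predictors, and a 20% holdout fraction. The compact model is read from the Trained property, and the reserved observations are identified through the Partition property of the cross-validated model.

```matlab
load carsmall
X = [Horsepower Weight];
Y = MPG;

% Drop rows with missing values so indices line up with the partition
ok = ~any(isnan([X Y]),2);
X = X(ok,:);
Y = Y(ok);

rng(1)                                   % for a repeatable split
CVMdl = fitrsvm(X,Y,'Standardize',true,'Holdout',0.2);

CompactMdl = CVMdl.Trained{1};           % model trained on the other 80%
isTest = test(CVMdl.Partition);          % logical index of reserved rows

% Mean squared error on the reserved observations
mseTest = loss(CompactMdl,X(isTest,:),Y(isTest));
disp(mseTest)
```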

`KFold` — Number of folds
`10` (default) | positive integer value greater than 1

Number of folds to use in the cross-validated model, specified as a positive integer value greater than 1. If you specify KFold=k, then the software completes these steps:

Randomly partition the data into k sets.
For each set, reserve the set as validation data, and train the model using the other k – 1 sets.
Store the k compact trained models in a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: KFold=5

Data Types: single | double

`Leaveout` — Leave-one-out cross-validation flag
`"off"` (default) | `"on"`

Leave-one-out cross-validation flag, specified as "on" or "off". If you specify Leaveout="on", then for each of the n observations (where n is the number of observations, excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:

Reserve the one observation as validation data, and train the model using the other n – 1 observations.
Store the n compact trained models in an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.

Example: Leaveout="on"

Data Types: char | string

Convergence Controls

`DeltaGradientTolerance` — Tolerance for gradient difference
0 (default) | nonnegative scalar

Tolerance for gradient difference between upper and lower violators obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'DeltaGradientTolerance' and a nonnegative scalar.

Example: 'DeltaGradientTolerance',1e-4

Data Types: single | double

`GapTolerance` — Feasibility gap tolerance
`1e-3` (default) | nonnegative scalar

Feasibility gap tolerance obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'GapTolerance' and a nonnegative scalar.

If GapTolerance is 0, then fitrsvm does not use this parameter to check convergence.

Example: 'GapTolerance',1e-4

Data Types: single | double

`IterationLimit` — Maximal number of numerical optimization iterations
`1e6` (default) | positive integer

Maximal number of numerical optimization iterations, specified as the comma-separated pair consisting of 'IterationLimit' and a positive integer.

The software returns a trained model regardless of whether the optimization routine successfully converges. Mdl.ConvergenceInfo contains convergence information.

Example: 'IterationLimit',1e8

Data Types: double | single

`KKTTolerance` — Tolerance for KKT violation
0 | nonnegative scalar value

Tolerance for Karush-Kuhn-Tucker (KKT) violation, specified as the comma-separated pair consisting of 'KKTTolerance' and a nonnegative scalar value.

This name-value pair applies only if 'Solver' is 'SMO' or 'ISDA'.

If KKTTolerance is 0, then fitrsvm does not use this parameter to check convergence.

Example: 'KKTTolerance',1e-4

Data Types: single | double

`ShrinkagePeriod` — Number of iterations between reductions of active set
`0` (default) | nonnegative integer

Number of iterations between reductions of the active set, specified as the comma-separated pair consisting of 'ShrinkagePeriod' and a nonnegative integer.

If you set 'ShrinkagePeriod',0, then the software does not shrink the active set.

Example: 'ShrinkagePeriod',1000

Data Types: double | single
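
The convergence controls in this group can be combined in one call, and the outcome inspected afterward. The following is a sketch under assumed settings (carsmall data, Gaussian kernel, and tolerance values picked only for illustration), not a recommended configuration.

```matlab
load carsmall
X = [Horsepower Weight];                 % predictors chosen for illustration

Mdl = fitrsvm(X,MPG,'Standardize',true, ...
    'KernelFunction','gaussian', ...
    'GapTolerance',1e-4, ...             % tighter than the 1e-3 default
    'IterationLimit',1e5, ...
    'ShrinkagePeriod',1000);             % shrink the active set periodically

% A model is returned even if the solver did not converge, so check
info = Mdl.ConvergenceInfo;
disp(info.Converged)
disp(info.ReasonForConvergence)
disp(Mdl.NumIterations)
```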

Hyperparameter Optimization

`OptimizeHyperparameters` — Parameters to optimize
`'none'` (default) | `'auto'` | `'all'` | string array or cell array of eligible parameter names | vector of `optimizableVariable` objects

Parameters to optimize, specified as the comma-separated pair consisting of 'OptimizeHyperparameters' and one of the following:

'none' — Do not optimize.
'auto' — Use {'BoxConstraint','KernelScale','Epsilon','Standardize'}.
'all' — Optimize all eligible parameters.
String array or cell array of eligible parameter names.
Vector of optimizableVariable objects, typically the output of hyperparameters.

The optimization attempts to minimize the cross-validation loss (error) for fitrsvm by varying the parameters. To control the cross-validation type and other aspects of the optimization, use the HyperparameterOptimizationOptions name-value argument. When you use HyperparameterOptimizationOptions, you can use the (compact) model size instead of the cross-validation loss as the optimization objective by setting the ConstraintType and ConstraintBounds options.

Note

The values of OptimizeHyperparameters override any values you specify using other name-value arguments. For example, setting OptimizeHyperparameters to "auto" causes fitrsvm to optimize hyperparameters corresponding to the "auto" option and to ignore any specified values for the hyperparameters.

The eligible parameters for fitrsvm are:

BoxConstraint — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
Epsilon — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e2]*iqr(Y)/1.349.
KernelFunction — fitrsvm searches among 'gaussian', 'linear', and 'polynomial'.
KernelScale — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
PolynomialOrder — fitrsvm searches among integers in the range [2,4].
Standardize — fitrsvm searches among 'true' and 'false'.

Set nondefault parameters by passing a vector of optimizableVariable objects that have nondefault values. For example,

load carsmall
params = hyperparameters('fitrsvm',[Horsepower,Weight],MPG);
params(1).Range = [1e-4,1e6];

Pass params as the value of OptimizeHyperparameters.

By default, the iterative display appears at the command line, and plots appear according to the number of hyperparameters in the optimization. For the optimization and plots, the objective function is log(1 + cross-validation loss). To control the iterative display, set the Verbose field of the 'HyperparameterOptimizationOptions' name-value argument. To control the plots, set the ShowPlots field of the 'HyperparameterOptimizationOptions' name-value argument.

For an example, see Optimize SVM Regression.

Example: 'OptimizeHyperparameters','auto'
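
The params workflow described above can be sketched in full as follows. Looking up the variable by name instead of by position, and the small evaluation budget, are choices made here for illustration; the evaluation budget in particular is an assumption intended to keep the run short.

```matlab
load carsmall
X = [Horsepower Weight];
params = hyperparameters('fitrsvm',X,MPG);

% Locate BoxConstraint by name and widen its search range
isBox = arrayfun(@(p) strcmp(p.Name,'BoxConstraint'),params);
params(isBox).Range = [1e-4,1e6];

rng default                              % for repeatable cross-validation
opts = struct('MaxObjectiveEvaluations',10, ...
    'ShowPlots',false,'Verbose',0);

Mdl = fitrsvm(X,MPG,'OptimizeHyperparameters',params, ...
    'HyperparameterOptimizationOptions',opts);
```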

`HyperparameterOptimizationOptions` — Options for optimization
`HyperparameterOptimizationOptions` object | structure

Options for optimization, specified as a HyperparameterOptimizationOptions object or a structure. This argument modifies the effect of the OptimizeHyperparameters name-value argument. If you specify HyperparameterOptimizationOptions, you must also specify OptimizeHyperparameters. All the options are optional. However, you must set ConstraintBounds and ConstraintType to return AggregateOptimizationResults. The options that you can set in a structure are the same as those in the HyperparameterOptimizationOptions object.

Option	Values	Default
`Optimizer`	`"bayesopt"` — Use Bayesian optimization. Internally, this setting calls `bayesopt`. `"gridsearch"` — Use grid search with `NumGridDivisions` values per dimension. `"gridsearch"` searches in a random order, using uniform sampling without replacement from the grid. After optimization, you can get a table in grid order by using the command `sortrows(Mdl.HyperparameterOptimizationResults)`. `"randomsearch"` — Search at random among `MaxObjectiveEvaluations` points.	`"bayesopt"`
`ConstraintBounds`	Constraint bounds for N optimization problems, specified as an N-by-2 numeric matrix or `[]`. The columns of `ConstraintBounds` contain the lower and upper bound values of the optimization problems. If you specify `ConstraintBounds` as a numeric vector, the software assigns the values to the second column of `ConstraintBounds`, and zeros to the first column. If you specify `ConstraintBounds`, you must also specify `ConstraintType`.	`[]`
`ConstraintTarget`	Constraint target for the optimization problems, specified as `"matlab"` or `"coder"`. If `ConstraintBounds` and `ConstraintType` are `[]` and you set `ConstraintTarget`, then the software sets `ConstraintTarget` to `[]`. The values of `ConstraintTarget` and `ConstraintType` determine the objective and constraint functions. For more information, see `HyperparameterOptimizationOptions`.	If you specify `ConstraintBounds` and `ConstraintType`, then the default value is `"matlab"`. Otherwise, the default value is `[]`.
`ConstraintType`	Constraint type for the optimization problems, specified as `"size"` or `"loss"`. If you specify `ConstraintType`, you must also specify `ConstraintBounds`. The values of `ConstraintTarget` and `ConstraintType` determine the objective and constraint functions. For more information, see `HyperparameterOptimizationOptions`.	`[]`
`AcquisitionFunctionName`	Type of acquisition function: `"expected-improvement-per-second-plus"` `"expected-improvement"` `"expected-improvement-plus"` `"expected-improvement-per-second"` `"lower-confidence-bound"` `"probability-of-improvement"` Acquisition functions whose names include `per-second` do not yield reproducible results, because the optimization depends on the run time of the objective function. Acquisition functions whose names include `plus` modify their behavior when they overexploit an area. For more details, see Acquisition Function Types.	`"expected-improvement-per-second-plus"`
`MaxObjectiveEvaluations`	Maximum number of objective function evaluations. If you specify multiple optimization problems using `ConstraintBounds`, the value of `MaxObjectiveEvaluations` applies to each optimization problem individually.	`30` for `"bayesopt"` and `"randomsearch"`, and the entire grid for `"gridsearch"`
`MaxTime`	Time limit for the optimization, specified as a nonnegative real scalar. The time limit is in seconds, as measured by `tic` and `toc`. The software performs at least one optimization iteration, regardless of the value of `MaxTime`. The run time can exceed `MaxTime` because `MaxTime` does not interrupt function evaluations. If you specify multiple optimization problems using `ConstraintBounds`, the time limit applies to each optimization problem individually.	`Inf`
`NumGridDivisions`	For `Optimizer="gridsearch"`, the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This option is ignored for categorical variables.	`10`
`ShowPlots`	Logical value indicating whether to show plots of the optimization progress. If this option is `true`, the software plots the best observed objective function value against the iteration number. If you use Bayesian optimization (`Optimizer="bayesopt"`), then the software also plots the best estimated objective function value. The best observed objective function values and best estimated objective function values correspond to the values in the `BestSoFar (observed)` and `BestSoFar (estim.)` columns of the iterative display, respectively. You can find these values in the properties `ObjectiveMinimumTrace` and `EstimatedObjectiveMinimumTrace` of `Mdl.HyperparameterOptimizationResults`. If the problem includes one or two optimization parameters for Bayesian optimization, then `ShowPlots` also plots a model of the objective function against the parameters.	`true`
`SaveIntermediateResults`	Logical value indicating whether to save the optimization results. If this option is `true`, the software overwrites a workspace variable named `"BayesoptResults"` at each iteration. The variable is a `BayesianOptimization` object. If you specify multiple optimization problems using `ConstraintBounds`, the workspace variable is an `AggregateBayesianOptimization` object named `"AggregateBayesoptResults"`.	`false`
`Verbose`	Display level at the command line: `0` — No iterative display `1` — Iterative display `2` — Iterative display with additional information For details, see the `bayesopt` `Verbose` name-value argument and the example Optimize Classifier Fit Using Bayesian Optimization.	`1`
`UseParallel`	Logical value indicating whether to run the Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. For details, see Parallel Bayesian Optimization.	`false`
`Repartition`	Logical value indicating whether to repartition the cross-validation at every iteration. If this option is `false`, the optimizer uses a single partition for the optimization. A value of `true` usually gives the most robust results because this setting takes partitioning noise into account. However, for optimal results, `true` requires at least twice as many function evaluations.	`false`
Specify only one of the following three options.
`CVPartition`	`cvpartition` object created by `cvpartition`	`Kfold=5` if you do not specify a cross-validation option
`Holdout`	Scalar in the range `(0,1)` representing the holdout fraction
`Kfold`	Integer greater than 1

Example: HyperparameterOptimizationOptions=struct(UseParallel=true)
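
Several options from the table can be set together in one structure. The sketch below assumes the carsmall sample data and uses random search with a small, arbitrary budget; the field names are those listed in the table above.

```matlab
load carsmall
X = [Horsepower Weight];                 % predictors chosen for illustration

opts = struct('Optimizer','randomsearch', ...
    'MaxObjectiveEvaluations',15, ...
    'Kfold',5, ...                       % only one cross-validation option
    'ShowPlots',false, ...
    'Verbose',0);

rng default                              % for repeatable results
Mdl = fitrsvm(X,MPG,'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions',opts);

% Results of the search are stored with the trained model
disp(Mdl.HyperparameterOptimizationResults)
```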

Output Arguments

`Mdl` — Trained SVM regression model
`RegressionSVM` model object | `RegressionPartitionedSVM` cross-validated model object

Trained SVM regression model, returned as a RegressionSVM model object or RegressionPartitionedSVM cross-validated model object.

If you set any of the name-value pair arguments KFold, Holdout, Leaveout, CrossVal, or CVPartition, then Mdl is a RegressionPartitionedSVM cross-validated model object. Otherwise, Mdl is a RegressionSVM model object.

If you specify OptimizeHyperparameters and set the ConstraintType and ConstraintBounds options of HyperparameterOptimizationOptions, then Mdl is an N-by-1 cell array of model objects, where N is equal to the number of rows in ConstraintBounds. If none of the optimization problems yields a feasible model, then each cell array value is [].

`AggregateOptimizationResults` — Aggregate optimization results
`AggregateBayesianOptimization` object

Aggregate optimization results for multiple optimization problems, returned as an AggregateBayesianOptimization object. To return AggregateOptimizationResults, you must specify OptimizeHyperparameters and HyperparameterOptimizationOptions. You must also specify the ConstraintType and ConstraintBounds options of HyperparameterOptimizationOptions. For an example that shows how to produce this output, see Hyperparameter Optimization with Multiple Constraint Bounds.

Limitations

fitrsvm supports low- through moderate-dimensional data sets. For high-dimensional data sets, use fitrlinear instead.

Tips

Unless your data set is large, always try to standardize the predictors (see Standardize). Standardization makes predictors insensitive to the scales on which they are measured.
It is good practice to cross-validate using the KFold name-value pair argument. The cross-validation results indicate how well the SVM model generalizes.
Sparsity in support vectors is a desirable property of an SVM model. To decrease the number of support vectors, set the BoxConstraint name-value pair argument to a large value. This action also increases the training time.
For optimal training time, set CacheSize as high as the memory limit on your computer allows.
If you expect many fewer support vectors than observations in the training set, then you can significantly speed up convergence by shrinking the active set using the name-value pair argument 'ShrinkagePeriod'. It is good practice to use 'ShrinkagePeriod',1000.
Duplicate observations that are far from the regression line do not affect convergence. However, just a few duplicate observations that occur near the regression line can slow down convergence considerably. To speed up convergence, specify 'RemoveDuplicates',true if:
- Your data set contains many duplicate observations.
- You suspect that a few duplicate observations can fall near the regression line.
However, to maintain the original data set during training, fitrsvm must temporarily store separate data sets: the original and one without the duplicate observations. Therefore, if you specify true for data sets containing few duplicates, then fitrsvm consumes close to double the memory of the original data.
After training a model, you can generate C/C++ code that predicts responses for new data. Generating C/C++ code requires MATLAB Coder™. For details, see Introduction to Code Generation.

Algorithms

For the mathematical formulation of linear and nonlinear SVM regression problems and the solver algorithms, see Understanding Support Vector Machine Regression.
NaN, <undefined>, empty character vector (''), empty string (""), and <missing> values indicate missing data values. fitrsvm removes entire rows of data corresponding to a missing response. When normalizing weights, fitrsvm ignores any weight corresponding to an observation with at least one missing predictor. Consequently, observation box constraints might not equal BoxConstraint.
fitrsvm removes observations that have zero weight.
If you set 'Standardize',true and 'Weights', then fitrsvm standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, fitrsvm standardizes predictor j (x_j) using

$x_{j}^{*} = \frac{x_{j} - \mu_{j}^{*}}{\sigma_{j}^{*}} .$
- $\mu_{j}^{*} = \frac{1}{\sum_{k} w_{k}} \sum_{k} w_{k} x_{jk} .$
- x_jk is observation k (row) of predictor j (column).
- $(\sigma_{j}^{*})^{2} = \frac{v_{1}}{v_{1}^{2} - v_{2}} \sum_{k} w_{k} (x_{jk} - \mu_{j}^{*})^{2} .$
- $v_{1} = \sum_{k} w_{k} .$
- $v_{2} = \sum_{k} (w_{k})^{2} .$
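
The weighted standardization formulas above can be checked numerically with a few lines of plain MATLAB. This is a sketch for a single predictor column with hypothetical values and weights, not the internal code of fitrsvm.

```matlab
x = [1; 2; 4; 7];                % one predictor column (hypothetical values)
w = [1; 1; 2; 4];                % observation weights (hypothetical values)

v1 = sum(w);                     % sum of the weights
v2 = sum(w.^2);                  % sum of the squared weights

mu = sum(w.*x)/v1;                               % weighted mean
sigma2 = v1/(v1^2 - v2)*sum(w.*(x - mu).^2);     % weighted variance
xStd = (x - mu)/sqrt(sigma2);                    % standardized predictor

% With equal weights the formula reduces to the usual sample variance
wEq = ones(size(x));
muEq = sum(wEq.*x)/sum(wEq);
sigma2Eq = sum(wEq)/(sum(wEq)^2 - sum(wEq.^2))*sum(wEq.*(x - muEq).^2);
disp(abs(sigma2Eq - var(x)) < 1e-12)
```

Because both expressions are unchanged when every weight is multiplied by the same constant, normalizing the weights to sum to 1 does not alter the standardized values.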
If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.
- The PredictorNames property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then PredictorNames is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.
- The ExpandedPredictorNames property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then ExpandedPredictorNames is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.
- Similarly, the Beta property stores one beta coefficient for each predictor, including the dummy variables.
- The SupportVectors property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then SupportVectors is an m-by-5 matrix.
- The X property stores the training data as originally input. It does not include the dummy variables. When the input is a table, X contains only the columns used as predictors.
For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables.
- For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is -1 for levels up to j, and +1 for levels j + 1 through k.
- The names of the dummy variables stored in the ExpandedPredictorNames property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
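
The ordinal encoding rule above can be written out explicitly. The following sketch builds the dummy-variable pattern for a hypothetical variable with k = 4 ordered levels, one row per level; it illustrates the rule as stated and is not the internal code of fitrsvm.

```matlab
k = 4;                           % number of ordered levels (hypothetical)
levels = (1:k)';                 % row i corresponds to level i

% Start with +1 everywhere, then set dummy j to -1 for levels up to j
D = ones(k,k-1);
for j = 1:k-1
    D(levels <= j,j) = -1;
end

disp(D)
```

Each row of D is the encoding of one level: the lowest level is all -1, the highest level is all +1, and each intermediate level switches from +1 to -1 one column later than the level below it.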

All solvers implement L1 soft-margin minimization.
Let p be the proportion of outliers that you expect in the training data. If you set 'OutlierFraction',p, then the software implements robust learning. In other words, the software attempts to remove 100p% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.

References

[1] Clark, D., Z. Schreter, A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." submitted to the Australian Conference on Neural Networks, 1996.

[2] Fan, R.-E., P.-H. Chen, and C.-J. Lin. “Working set selection using second order information for training support vector machines.” Journal of Machine Learning Research, Vol 6, 2005, pp. 1889–1918.

[3] Kecman V., T. -M. Huang, and M. Vogt. “Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance.” In Support Vector Machines: Theory and Applications. Edited by Lipo Wang, 255–274. Berlin: Springer-Verlag, 2005.

[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

[5] Nash, W.J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait." Sea Fisheries Division, Technical Report No. 48, 1994.

[6] Waugh, S. "Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks." University of Tasmania Department of Computer Science thesis, 1995.

Extended Capabilities

Automatic Parallel Support
Accelerate code by automatically running computation in parallel using Parallel Computing Toolbox™.

To perform parallel hyperparameter optimization, use the UseParallel=true option in the HyperparameterOptimizationOptions name-value argument in the call to the fitrsvm function.

For more information on parallel hyperparameter optimization, see Parallel Bayesian Optimization.

For general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

The fitrsvm function supports GPU array input with these usage notes and limitations:

You cannot specify the KernelFunction name-value argument as a custom kernel function.
You can specify the Solver name-value argument as "SMO" only.
You cannot specify the OutlierFraction or ShrinkagePeriod name-value argument.
The predictor data cannot contain infinite values.
fitrsvm fits the model on a GPU if at least one of the following applies:
- The input argument X is a gpuArray object.
- The input argument Y is a gpuArray object.
- The input argument Tbl contains gpuArray predictor or response variables.

Version History

Introduced in R2015b

R2023b: `"auto"` option of `OptimizeHyperparameters` includes `Standardize`

Starting in R2023b, when you specify "auto" as the OptimizeHyperparameters value, fitrsvm includes Standardize as an optimizable hyperparameter.

R2023a: `fitrsvm` now accepts `gpuArray` inputs (requires Parallel Computing Toolbox)

Starting in R2023a, fitrsvm supports GPU arrays with some limitations.
