ocsvm
Fit one-class support vector machine (SVM) model for anomaly detection
Since R2022b
Syntax
Description
Use the ocsvm function to fit a one-class support vector
machine (SVM) model for outlier detection and novelty detection.
Outlier detection (detecting anomalies in training data) — Use the output argument tf of ocsvm to identify anomalies in training data.
Novelty detection (detecting anomalies in new data with uncontaminated training data) — Create a OneClassSVM object by passing uncontaminated training data (data with no outliers) to ocsvm. Detect anomalies in new data by passing the object and the new data to the object function isanomaly.
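The two workflows above can be sketched as follows. The random data and the contamination fraction are illustrative assumptions, not part of this reference page.

```matlab
% Outlier detection: flag anomalies in the training data itself.
rng("default")                      % illustrative data, not from this page
X = randn(1000,3);
[Mdl,tf,scores] = ocsvm(X,ContaminationFraction=0.05);
find(tf)                            % indices of flagged training observations

% Novelty detection: train on data assumed uncontaminated,
% then score new observations with the isanomaly object function.
MdlClean = ocsvm(X);
Xnew = randn(50,3);
[tfNew,scoresNew] = isanomaly(MdlClean,Xnew);
```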
Mdl = ocsvm(Tbl) returns a OneClassSVM object (one-class SVM model object) for predictor data in the table Tbl.
Mdl = ocsvm(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in the previous syntaxes. For example, ContaminationFraction=0.1 instructs the function to process 10% of the training data as anomalies.
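A minimal call using this name-value argument might look like the sketch below; the random data is an illustrative assumption.

```matlab
rng("default")                              % reproducible illustrative data
X = randn(500,2);
% Treat roughly the 10% of training observations with the largest
% anomaly scores as anomalies.
[Mdl,tf] = ocsvm(X,ContaminationFraction=0.1,StandardizeData=true);
nnz(tf)/numel(tf)                           % fraction flagged, about 0.1
```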
Examples
Input Arguments
Name-Value Arguments
Output Arguments
More About
Tips
After training a model, you can generate C/C++ code that finds anomalies for new data. Generating C/C++ code requires MATLAB® Coder™. For details, see Code Generation of the isanomaly function and Introduction to Code Generation.
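A typical entry-point pattern for code generation is sketched below. The use of saveLearnerForCoder and loadLearnerForCoder with a OneClassSVM model is an assumption here; see the linked Code Generation pages for the authoritative workflow.

```matlab
% Save the trained model to disk for code generation (assumption:
% OneClassSVM supports saveLearnerForCoder; see the linked pages).
saveLearnerForCoder(Mdl,"OneClassSVMModel");

% detectAnomaly.m — entry-point function suitable for codegen.
function tf = detectAnomaly(Xnew) %#codegen
Mdl = loadLearnerForCoder("OneClassSVMModel");
tf = isanomaly(Mdl,Xnew);
end
```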
Algorithms
ocsvm considers NaN, '' (empty character vector), "" (empty string), <missing>, and <undefined> values in Tbl and NaN values in X to be missing values. ocsvm does not use observations that contain missing values. The function assigns an anomaly score of NaN and an anomaly indicator of false (logical 0) to these observations.
ocsvm minimizes the regularized objective function using a limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) solver with ridge (L2) regularization. If ocsvm requires more memory than the value of BlockSize to hold the transformed predictor data, then the function uses a block-wise strategy.
When ocsvm uses a block-wise strategy, it implements LBFGS by distributing the calculation of the loss and gradient among different parts of the data at each iteration. Also, ocsvm refines the initial estimates of the linear coefficients and the bias term by fitting the model locally to parts of the data and combining the coefficients by averaging. If you specify Verbose=1, then ocsvm displays diagnostic information for each data pass.
When ocsvm does not use a block-wise strategy, the initial estimates are zeros. If you specify Verbose=1, then ocsvm displays diagnostic information for each iteration.
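The missing-value behavior described above can be checked directly; the data here is an illustrative assumption.

```matlab
% The last observation contains NaN, so ocsvm skips it during training
% and assigns it score NaN and indicator false, per the text above.
X = [randn(100,2); NaN 0.5];
[Mdl,tf,scores] = ocsvm(X);
tf(end)          % logical 0 (false)
scores(end)      % NaN
```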
Alternative Functionality
You can also use the fitcsvm function to train a one-class SVM model for
anomaly detection.
The ocsvm function provides a simpler and preferred workflow for anomaly detection than the fitcsvm function.
The ocsvm function returns a OneClassSVM object, anomaly indicators, and anomaly scores. You can use the outputs to identify anomalies in training data. To find anomalies in new data, you can use the isanomaly object function of OneClassSVM. The isanomaly function returns anomaly indicators and scores for the new data.
The fitcsvm function supports both one-class and binary classification. If the class label variable contains only one class (for example, a vector of ones), fitcsvm trains a model for one-class classification and returns a ClassificationSVM object. To identify anomalies, you must first compute anomaly scores by using the resubPredict or predict object function of ClassificationSVM, and then identify anomalies by finding observations that have negative scores. Note that a large positive anomaly score indicates an anomaly in ocsvm, whereas a negative score indicates an anomaly in predict of ClassificationSVM.
The ocsvm function finds the decision boundary based on the primal form of SVM, whereas the fitcsvm function finds the decision boundary based on the dual form of SVM.
The solver in ocsvm is computationally less expensive than the solver in fitcsvm for a large data set (large n). Unlike the solvers in fitcsvm, which require computation of the n-by-n Gram matrix, the solver in ocsvm only needs to form a matrix of size n-by-m. Here, m is the number of dimensions of the expanded space, which is typically much less than n for big data.
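The two workflows and their opposite score conventions can be compared with a short sketch; the random data is an illustrative assumption.

```matlab
rng("default")
X = randn(200,2);

% ocsvm workflow: a large positive score indicates an anomaly.
[MdlOC,tfOC,sOC] = ocsvm(X,ContaminationFraction=0.05);

% fitcsvm one-class workflow: pass a single-class label vector, then
% treat observations with negative resubstitution scores as anomalies.
SVMModel = fitcsvm(X,ones(size(X,1),1));
[~,s] = resubPredict(SVMModel);
tfSVM = s < 0;
```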

