reset

Reset incremental robust random cut forest model

Since R2023b

    Description

    forest = reset(forest) returns the incremental robust random cut forest (RRCF) model forest with its learned parameters reset. If any hyperparameters of forest were estimated during incremental training, the reset function resets these hyperparameters as well. reset always preserves the forest.NumPredictors property.

    Examples

    Create an RRCF model for incremental anomaly detection. Specify 50 robust random cut trees and standardize the predictor data. Reset the model after incremental training and see which parameters are reset.

    IncrementalMdl = incrementalRobustRandomCutForest(NumLearners=50, ...
        StandardizeData=true);

    IncrementalMdl is an incrementalRobustRandomCutForest model object. All its properties are read-only. By default, the software sets the anomaly contamination fraction to 0 and the score threshold to 0.

    IncrementalMdl must be fit to data before you can use it to perform any other operations.

    Load Data

    Load the 1994 census data stored in census1994.mat. The data set consists of demographic data from the US Census Bureau.

    load census1994.mat

    The fit function of incrementalRobustRandomCutForest does not use observations with missing values. Remove missing values and categorical variables in the data to reduce memory consumption and speed up training. Use only the first 5000 observations in the data for training and anomaly detection.

    adultdata = rmmissing(adultdata);
    adultdata = removevars(adultdata,["workClass","education","marital_status", ...
        "occupation","relationship","race","sex","native_country","salary"]);
    adultdata = adultdata(1:5000,:);
    rng("default") % For reproducibility

    Fit Incremental Model

    Fit the incremental model IncrementalMdl to the data by using the fit function. To simulate a data stream, fit the model in chunks of 100 observations at a time. At each iteration:

    • Process 100 observations.

    • Overwrite the previous incremental model with a new one fitted to the incoming observations.

    n = height(adultdata); % Number of observations (table rows)
    numObsPerChunk = 100;
    nchunk = floor(n/numObsPerChunk);
    
    % Incremental fitting
    rng("default"); % For reproducibility
    for j = 1:nchunk
        ibegin = min(n,numObsPerChunk*(j-1) + 1);
        iend = min(n,numObsPerChunk*j);
        idx = ibegin:iend;    
        IncrementalMdl = fit(IncrementalMdl,adultdata(idx,:));
    end

    Display all the properties of the trained model object IncrementalMdl.

    details(IncrementalMdl)
      incrementalRobustRandomCutForest with properties:
    
            CollusiveDisplacement: 'maximal'
                      NumLearners: 50
        NumObservationsPerLearner: 256
               ObservationRemoval: 'oldest'
            NumObservationsToKeep: 256
                               Mu: [37.9400 1.9217e+05 10.1980 567.7170 102.5340 40.7060]
                            Sigma: [12.8905 1.0789e+05 2.5006 2.4309e+03 431.7485 11.7970]
            CategoricalPredictors: []
                 EstimationPeriod: 1000
                           IsWarm: 1
            ContaminationFraction: 0
          NumTrainingObservations: 4000
                    NumPredictors: 6
                   ScoreThreshold: 176.3187
                ScoreWarmupPeriod: 0
                   PredictorNames: {'age'  'fnlwgt'  'education_num'  'capital_gain'  'capital_loss'  'hours_per_week'}
                  ScoreWindowSize: 1000
    

    Reset Incremental Model

    Reset the learned parameters by using the reset function, and compare the properties of the reset model to those of the trained model to see which parameters are reset.

    newMdl = reset(IncrementalMdl);
    details(newMdl)
      incrementalRobustRandomCutForest with properties:
    
            CollusiveDisplacement: 'maximal'
                      NumLearners: 50
        NumObservationsPerLearner: 256
               ObservationRemoval: 'oldest'
            NumObservationsToKeep: 256
                               Mu: [0 0 0 0 0 0]
                            Sigma: [1 1 1 1 1 1]
            CategoricalPredictors: []
                 EstimationPeriod: 1000
                           IsWarm: 0
            ContaminationFraction: 0
          NumTrainingObservations: 0
                    NumPredictors: 6
                   ScoreThreshold: 0
                ScoreWarmupPeriod: 0
                   PredictorNames: {'age'  'fnlwgt'  'education_num'  'capital_gain'  'capital_loss'  'hours_per_week'}
                  ScoreWindowSize: 1000
    

    The reset function resets the warm-up status of the model (IsWarm = 0), the score threshold, the number of training observations, and the estimated hyperparameters (Mu and Sigma).
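
    The same comparison can be made programmatically rather than by reading the two details listings. The following is a minimal sketch, assuming IncrementalMdl and newMdl from this example are still in the workspace. The variable names propNames and isChanged are illustrative and are not part of the toolbox.

```matlab
% Properties to compare (names taken from the details output in this example)
propNames = ["IsWarm","NumTrainingObservations","ScoreThreshold", ...
    "Mu","Sigma","NumPredictors","NumLearners"];

% Flag each property whose value differs between the trained and reset models
isChanged = false(size(propNames));
for k = 1:numel(propNames)
    p = propNames(k);                                   % string scalar
    isChanged(k) = ~isequal(IncrementalMdl.(p),newMdl.(p)); % dynamic property access
end

% Display the result as a table (1 means the value differs after reset)
disp(table(propNames',isChanged',VariableNames=["Property","Changed"]))
```

    Dynamic property access keeps the list of compared properties in one place, so you can extend propNames with any other property shown by details.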

    Input Arguments

    forest: Incremental RRCF model, specified as an incrementalRobustRandomCutForest model object. You can create forest directly or by converting a supported, traditionally trained RRCF model using the incrementalLearner function. For more details, see the incrementalRobustRandomCutForest object page.
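
    As a hedged illustration of the conversion route, the following sketch trains a traditional RRCF model with rrcforest, converts it with incrementalLearner, and then resets the result. The synthetic matrix X is an assumption made only for this illustration, and the sketch does not show how the reset model compares to the converted one.

```matlab
rng("default")                     % For reproducibility
X = randn(500,3);                  % Hypothetical synthetic predictor data (assumption)

Mdl = rrcforest(X);                % Traditionally trained RRCF model
forest = incrementalLearner(Mdl);  % Convert to an incremental RRCF model
forest = reset(forest);            % Reset the learned parameters

class(forest)                      % Confirm the object type
forest.NumPredictors               % reset preserves this property
```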

    Version History

    Introduced in R2023b