# RegressionBaggedEnsemble

Package: classreg.learning.regr
Superclasses: RegressionEnsemble

Regression ensemble grown by resampling

## Description

RegressionBaggedEnsemble combines a set of trained weak learner models and data on which these learners were trained. It can predict ensemble response for new data by aggregating predictions from its weak learners.

## Construction

Create a bagged regression ensemble object using fitrensemble. Set the 'Method' name-value pair argument of fitrensemble to 'Bag' to use bootstrap aggregation (bagging), for example, to grow a random forest.
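As a minimal sketch of the construction described above, the following call trains a bagged ensemble with an explicit number of learning cycles. The carsmall data set, the choice of 50 cycles, and the variable names are illustrative assumptions, not requirements of the class.

```matlab
% Illustrative sketch: train a bagged regression ensemble.
load carsmall                      % sample data set shipped with the toolbox
X = [Weight Cylinders];            % predictors (assumed choice)
Y = MPG;                           % response

rng(1)                             % make the bootstrap samples reproducible
ens = fitrensemble(X,Y, ...
    'Method','Bag', ...            % bootstrap aggregation
    'NumLearningCycles',50);       % number of weak learners (assumed value)

disp(class(ens))                   % class name of the returned object
disp(ens.NumTrained)               % number of learners actually trained
```

Fixing the random seed before training is a convenience for repeatable results; it is not required by fitrensemble.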

## Properties

BinEdges
Bin edges for numeric predictors, specified as a cell array of p numeric vectors, where p is the number of predictors. Each vector includes the bin edges for a numeric predictor. The element in the cell array for a categorical predictor is empty because the software does not bin categorical predictors.

The software bins numeric predictors only if you specify the 'NumBins' name-value pair argument as a positive integer scalar when training a model with tree learners. The BinEdges property is empty if the 'NumBins' value is empty (default).

You can reproduce the binned predictor data Xbinned by using the BinEdges property of the trained model mdl.

X = mdl.X; % Predictor data
Xbinned = zeros(size(X));
edges = mdl.BinEdges;
% Find indices of binned predictors.
idxNumeric = find(~cellfun(@isempty,edges));
if iscolumn(idxNumeric)
    idxNumeric = idxNumeric';
end
for j = idxNumeric
    x = X(:,j);
    % Convert x to array if x is a table.
    if istable(x)
        x = table2array(x);
    end
    % Group x into bins by using the discretize function.
    xbinned = discretize(x,[-inf; edges{j}; inf]);
    Xbinned(:,j) = xbinned;
end

Xbinned contains the bin indices, ranging from 1 to the number of bins, for numeric predictors. Xbinned values are 0 for categorical predictors. If X contains NaNs, then the corresponding Xbinned values are NaNs.

CategoricalPredictors
Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values corresponding to the columns of the predictor data that contain categorical predictors. If none of the predictors are categorical, then this property is empty ([]).

CombineWeights
A character vector describing how the ensemble combines learner predictions.

ExpandedPredictorNames
Expanded predictor names, stored as a cell array of character vectors. If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.

FitInfo
A numeric array of fit information. The FitInfoDescription property describes the content of this array.

FitInfoDescription
Character vector describing the meaning of the FitInfo array.

FResample
A numeric scalar between 0 and 1. FResample is the fraction of training data fitrensemble resampled at random for every weak learner when constructing the ensemble.

HyperparameterOptimizationResults
Description of the cross-validation optimization of hyperparameters, stored as a BayesianOptimization object or a table of hyperparameters and associated values. Nonempty when the OptimizeHyperparameters name-value pair is nonempty at creation. Value depends on the setting of the HyperparameterOptimizationOptions name-value pair at creation:

- 'bayesopt' (default): Object of class BayesianOptimization
- 'gridsearch' or 'randomsearch': Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)

LearnerNames
Cell array of character vectors with names of the weak learners in the ensemble. The name of each learner appears just once. For example, if you have an ensemble of 100 trees, LearnerNames is {'Tree'}.

Method
A character vector with the name of the algorithm fitrensemble used for training the ensemble.

ModelParameters
Parameters used in training ens.

NumObservations
Numeric scalar containing the number of observations in the training data.

NumTrained
Number of trained learners in the ensemble, a positive scalar.

PredictorNames
A cell array of names for the predictor variables, in the order in which they appear in X.

ReasonForTermination
A character vector describing the reason fitrensemble stopped adding weak learners to the ensemble.

Regularization
A structure containing the result of the regularize method. Use Regularization with shrink to lower resubstitution error and shrink the ensemble.

Replace
Boolean flag indicating if training data for weak learners in this ensemble were sampled with replacement. Replace is true for sampling with replacement, false otherwise.

ResponseName
A character vector with the name of the response variable Y.

ResponseTransform
Function handle for transforming scores, or character vector representing a built-in transformation function. 'none' means no transformation; equivalently, 'none' means @(x)x.

Add or change a ResponseTransform function using dot notation:

ens.ResponseTransform = @function

Trained
The trained learners, a cell array of compact regression models.

TrainedWeights
A numeric vector of weights the ensemble assigns to its learners. The ensemble computes predicted response by aggregating weighted predictions from its learners.

UseObsForLearner
A logical matrix of size N-by-NumTrained, where N is the number of rows (observations) in the training data X, and NumTrained is the number of trained weak learners. UseObsForLearner(I,J) is true if observation I was used for training learner J, and is false otherwise.

W
The scaled weights, a vector with length n, the number of rows in X. The sum of the elements of W is 1.

X
The matrix or table of predictor values that trained the ensemble. Each column of X represents one variable, and each row represents one observation.

Y
The numeric column vector with the same number of rows as X that trained the ensemble. Each entry in Y is the response to the data in the corresponding row of X.
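The resampling-related properties can be inspected directly on a trained model. The sketch below assumes the carsmall data set and an ensemble named ens (both illustrative); it checks that W is normalized and uses UseObsForLearner to count how many observations each learner left out of its bootstrap sample.

```matlab
% Illustrative sketch: inspect resampling properties of a bagged ensemble.
load carsmall
X = [Weight Cylinders];
Y = MPG;
rng(1)                                   % reproducible bootstrap samples
ens = fitrensemble(X,Y,'Method','Bag');

% W holds the scaled observation weights; they should sum to 1.
sumW = sum(ens.W);

% UseObsForLearner(i,j) is true if observation i was used to train learner j.
inBag = ens.UseObsForLearner;
oobPerLearner  = sum(~inBag,1);          % out-of-bag observations for each learner
learnersPerObs = sum(~inBag,2);          % learners for which each observation is out of bag

fprintf('Sum of W: %.4f\n',sumW);
fprintf('Mean out-of-bag fraction per learner: %.3f\n', ...
    mean(oobPerLearner)/ens.NumObservations);
```

Observations that are out of bag for at least one learner are the ones that can receive an out-of-bag prediction.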

## Methods

| Method | Description |
| --- | --- |
| oobLoss | Out-of-bag regression error |
| oobPermutedPredictorImportance | Predictor importance estimates by permutation of out-of-bag predictor observations for random forest of regression trees |
| oobPredict | Predict out-of-bag response of ensemble |
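A short sketch of how the three out-of-bag methods might be called together follows. The data set, the variable names, and the use of the 'Mode','cumulative' option are illustrative choices rather than a prescribed workflow.

```matlab
% Illustrative sketch: out-of-bag diagnostics for a bagged ensemble.
load carsmall
X = [Weight Cylinders];
Y = MPG;
rng(1)                                              % reproducible bootstrap samples
ens = fitrensemble(X,Y,'Method','Bag');

oobMSE        = oobLoss(ens);                       % out-of-bag mean squared error
oobCumulative = oobLoss(ens,'Mode','cumulative');   % error as learners are added one by one
yOOB          = oobPredict(ens);                    % out-of-bag prediction per training observation
imp           = oobPermutedPredictorImportance(ens); % one importance estimate per predictor

fprintf('Out-of-bag MSE: %.4f\n',oobMSE);
```

Because each out-of-bag prediction uses only learners that did not see the observation, oobLoss gives an error estimate that does not require a separate test set.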

### Inherited Methods

| Method | Description |
| --- | --- |
| compact | Create compact regression ensemble |
| crossval | Cross validate ensemble |
| cvshrink | Cross validate shrinking (pruning) ensemble |
| regularize | Find weights to minimize resubstitution error plus penalty term |
| resubLoss | Regression error by resubstitution |
| resubPredict | Predict response of ensemble by resubstitution |
| resume | Resume training ensemble |
| shrink | Prune ensemble |
| loss | Regression error |
| predict | Predict responses using ensemble of regression models |
| predictorImportance | Estimates of predictor importance for regression ensemble |
| removeLearners | Remove members of compact regression ensemble |
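The following sketch exercises a few of the inherited methods: compact, predict, and crossval. The new predictor rows in Xnew are hypothetical values chosen only for illustration, and the 5-fold setting is likewise an assumption.

```matlab
% Illustrative sketch: compact, predict, and cross-validate a bagged ensemble.
load carsmall
X = [Weight Cylinders];
Y = MPG;
rng(1)                                   % reproducible bootstrap samples and folds
ens = fitrensemble(X,Y,'Method','Bag');

cens = compact(ens);                     % same learners, without the stored training data
Xnew = [3000 4; 4000 8];                 % hypothetical cars: [Weight Cylinders]
yhat = predict(cens,Xnew);               % one predicted MPG per row of Xnew

cvens = crossval(ens,'KFold',5);         % 5-fold cross-validated ensemble
cvMSE = kfoldLoss(cvens);                % cross-validated mean squared error

fprintf('5-fold cross-validated MSE: %.4f\n',cvMSE);
```

A compact ensemble is usually the better object to keep for prediction only, because it does not carry X and Y.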

## Copy Semantics

Value. To learn how value classes affect copy operations, see Copying Objects (MATLAB).

## Examples


Load the carsmall data set. Consider a model that explains a car's fuel economy (MPG) using its weight (Weight) and number of cylinders (Cylinders).

load carsmall
X = [Weight Cylinders];
Y = MPG;

Train a bagged ensemble of 100 regression trees using all measurements.

Mdl = fitrensemble(X,Y,'Method','bag')

Mdl =
  classreg.learning.regr.RegressionBaggedEnsemble
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'
          NumObservations: 94
               NumTrained: 100
                   Method: 'Bag'
             LearnerNames: {'Tree'}
     ReasonForTermination: 'Terminated normally after completing the requested number of training cycles.'
                  FitInfo: []
       FitInfoDescription: 'None'
           Regularization: []
                FResample: 1
                  Replace: 1
         UseObsForLearner: [94x100 logical]

Properties, Methods

Mdl is a RegressionBaggedEnsemble model object.

Mdl.Trained is the property that stores a 100-by-1 cell vector of the trained, compact regression trees (CompactRegressionTree model objects) that compose the ensemble.

Plot a graph of the first trained regression tree.

view(Mdl.Trained{1},'Mode','graph')

By default, fitrensemble grows deep trees for bags of trees.
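To see the effect of tree depth, one option is to pass a tree template that limits the number of splits and compare node counts with the default ensemble. This is a sketch under assumptions: the limit of 5 splits and the names deepEns, shallowEns, and t are illustrative.

```matlab
% Illustrative sketch: compare default (deep) trees with depth-limited trees.
load carsmall
X = [Weight Cylinders];
Y = MPG;

rng(1)
deepEns = fitrensemble(X,Y,'Method','Bag');                 % default tree learners

t = templateTree('MaxNumSplits',5);                         % at most 5 branch nodes per tree (assumed value)
rng(1)
shallowEns = fitrensemble(X,Y,'Method','Bag','Learners',t);

% NumNodes counts branch and leaf nodes of each compact regression tree.
nodesDeep    = cellfun(@(tr) tr.NumNodes,deepEns.Trained);
nodesShallow = cellfun(@(tr) tr.NumNodes,shallowEns.Trained);

fprintf('Mean nodes per tree (default): %.1f\n',mean(nodesDeep));
fprintf('Mean nodes per tree (MaxNumSplits = 5): %.1f\n',mean(nodesShallow));
```

A binary tree with at most 5 splits has at most 11 nodes, so the second figure is bounded while the first depends on the data.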

Estimate the in-sample mean-squared error (MSE).

L = resubLoss(Mdl)
L = 13.6835
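The in-sample MSE can also be approximated by hand from the stored training data, which makes explicit what resubLoss reports. The sketch below assumes the default loss is the weighted mean squared error computed with the observation weights in W; the variable names are illustrative.

```matlab
% Illustrative sketch: recompute the resubstitution MSE manually.
load carsmall
X = [Weight Cylinders];
Y = MPG;
rng(1)                                   % reproducible bootstrap samples
ens = fitrensemble(X,Y,'Method','Bag');

yfit = resubPredict(ens);                % predictions for the stored training data ens.X

% Weighted mean squared error (assumed to match the default 'mse' loss).
manualMSE  = sum(ens.W .* (ens.Y - yfit).^2) / sum(ens.W);
builtinMSE = resubLoss(ens);

fprintf('Manual MSE:    %.4f\n',manualMSE);
fprintf('resubLoss MSE: %.4f\n',builtinMSE);
```

The calculation uses ens.Y rather than the original MPG vector because the stored training data contains only the 94 observations used for training.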

## Tips

For a bagged ensemble of regression trees, the Trained property of ens stores a cell vector of ens.NumTrained CompactRegressionTree model objects. For a textual or graphical display of tree t in the cell vector, enter

view(ens.Trained{t})
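Beyond viewing a single tree, the cell vector in Trained can be processed with cellfun to summarize all trees at once. As a hedged sketch, the code below tallies which predictor each tree splits on at its root node, using the CutPredictor property of the compact trees; the data set and variable names are illustrative.

```matlab
% Illustrative sketch: tally the root-node split predictor across all trees.
load carsmall
X = [Weight Cylinders];
Y = MPG;
rng(1)                                   % reproducible bootstrap samples
ens = fitrensemble(X,Y,'Method','Bag');

% CutPredictor{1} is the name of the predictor used at node 1 (the root);
% it is empty if the root is a leaf.
rootPredictor = cellfun(@(tr) tr.CutPredictor{1},ens.Trained, ...
    'UniformOutput',false);

[names,~,idx] = unique(rootPredictor);   % distinct root predictors
counts = accumarray(idx,1);              % number of trees per root predictor

for k = 1:numel(names)
    fprintf('%s: %d trees\n',names{k},counts(k));
end
```

When the predictor data is a numeric matrix, the predictor names default to generated names such as x1 and x2, so the tally refers to column positions in X.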