preparedPredictors
Obtain prepared data used for training or testing in direct forecasting
Since R2023b
Syntax
Description
Examples
Prepared Predictor Data for Forecasting
When you perform direct forecasting using directforecaster, the function creates lagged and leading predictors from the training data before fitting a DirectForecaster model. Similarly, the loss and predict object functions reformat the test data before computing loss and prediction values, respectively.
This example shows how to access the prepared predictor data used by direct forecasting models for training and testing.
Load the sample file TemperatureData.csv, which contains average daily temperatures from January 2015 through July 2016. Read the file into a table. Observe the first eight observations in the table.
temperatures = readtable("TemperatureData.csv");
head(temperatures)
    Year       Month       Day    TemperatureF
    ____    ___________    ___    ____________
    2015    {'January'}     1          23
    2015    {'January'}     2          31
    2015    {'January'}     3          25
    2015    {'January'}     4          39
    2015    {'January'}     5          29
    2015    {'January'}     6          12
    2015    {'January'}     7          10
    2015    {'January'}     8           4
For this example, use a subset of the temperature data that omits the first 100 observations.
Tbl = temperatures(101:end,:);
Create a datetime variable t that contains the year, month, and day information for each observation in Tbl. Then, use t to convert Tbl into a timetable.
numericMonth = month(datetime(Tbl.Month, ...
    InputFormat="MMMM",Locale="en_US"));
t = datetime(Tbl.Year,numericMonth,Tbl.Day);
Tbl.Time = t;
Tbl = table2timetable(Tbl);
Plot the temperature values in Tbl over time.
plot(Tbl.Time,Tbl.TemperatureF)
xlabel("Date")
ylabel("Temperature in Fahrenheit")
Partition the temperature data into training and test sets by using tspartition. Reserve 20% of the observations for testing.
partition = tspartition(size(Tbl,1),"Holdout",0.20);
trainingTbl = Tbl(training(partition),:);
testTbl = Tbl(test(partition),:);
Create a full direct forecasting model by using the data in trainingTbl. Specify the horizon steps as one to seven steps ahead. Train a model at each horizon step using a boosted ensemble of trees. All three of the predictors (Year, Month, and Day) are leading predictors because their future values are known.
To create new predictors by shifting the leading predictor and response variables backward in time, specify the leading predictor lags and the response variable lags. For this example, use the following as predictor values: the current and previous Year values, the current and previous Month values, the current and previous seven Day values, and the previous seven TemperatureF values.
Mdl = directforecaster(trainingTbl,"TemperatureF", ...
    Horizon=1:7,LeadingPredictors="all", ...
    LeadingPredictorLags={0:1,0:1,0:7}, ...
    ResponseLags=1:7)
Mdl =
  DirectForecaster
                  Horizon: [1 2 3 4 5 6 7]
             ResponseLags: [1 2 3 4 5 6 7]
        LeadingPredictors: [1 2 3]
     LeadingPredictorLags: {[0 1]  [0 1]  [0 1 2 3 4 5 6 7]}
             ResponseName: 'TemperatureF'
           PredictorNames: {'Year'  'Month'  'Day'}
    CategoricalPredictors: 2
                 Learners: {7x1 cell}
                   MaxLag: 7
          NumObservations: 372
Mdl is a DirectForecaster model object. Mdl consists of seven regression models: Mdl.Learners{1}, which predicts one step into the future; Mdl.Learners{2}, which predicts two steps into the future; and so on.
Compare the first and seventh regression models in Mdl.
Mdl.Learners{1}
ans =
  CompactRegressionEnsemble
           PredictorNames: {1x19 cell}
             ResponseName: 'TemperatureF_Step1'
    CategoricalPredictors: [10 11]
        ResponseTransform: 'none'
               NumTrained: 100
Mdl.Learners{7}
ans =
  CompactRegressionEnsemble
           PredictorNames: {1x19 cell}
             ResponseName: 'TemperatureF_Step7'
    CategoricalPredictors: [10 11]
        ResponseTransform: 'none'
               NumTrained: 100
The regression models in Mdl are all CompactRegressionEnsemble objects. Because the models are compact, they do not include the predictor data used to train them.
To see the data used to train the regression models in Mdl, use the preparedPredictors object function.
Observe the prepared predictor data used to train Mdl.Learners{1}. By default, preparedPredictors returns the prepared predictor data used at horizon step Mdl.Horizon(1), which in this case is one step ahead.
prepTrainingTbl1 = preparedPredictors(Mdl,trainingTbl)
prepTrainingTbl1=372×19 timetable
Time TemperatureF_Lag1 TemperatureF_Lag2 TemperatureF_Lag3 TemperatureF_Lag4 TemperatureF_Lag5 TemperatureF_Lag6 TemperatureF_Lag7 Year_Step1 Year_Lag1 Month_Step1 Month_Lag1 Day_Step1 Day_Lag1 Day_Lag2 Day_Lag3 Day_Lag4 Day_Lag5 Day_Lag6 Day_Lag7
___________ _________________ _________________ _________________ _________________ _________________ _________________ _________________ __________ _________ ___________ __________ _________ ________ ________ ________ ________ ________ ________ ________
10-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 NaN {'April'} {0x0 char} 10 NaN NaN NaN NaN NaN NaN NaN
11-Apr-2015 41 NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 11 10 NaN NaN NaN NaN NaN NaN
12-Apr-2015 45 41 NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 12 11 10 NaN NaN NaN NaN NaN
13-Apr-2015 49 45 41 NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 13 12 11 10 NaN NaN NaN NaN
14-Apr-2015 50 49 45 41 NaN NaN NaN 2015 2015 {'April'} {'April' } 14 13 12 11 10 NaN NaN NaN
15-Apr-2015 54 50 49 45 41 NaN NaN 2015 2015 {'April'} {'April' } 15 14 13 12 11 10 NaN NaN
16-Apr-2015 54 54 50 49 45 41 NaN 2015 2015 {'April'} {'April' } 16 15 14 13 12 11 10 NaN
17-Apr-2015 46 54 54 50 49 45 41 2015 2015 {'April'} {'April' } 17 16 15 14 13 12 11 10
18-Apr-2015 51 46 54 54 50 49 45 2015 2015 {'April'} {'April' } 18 17 16 15 14 13 12 11
19-Apr-2015 47 51 46 54 54 50 49 2015 2015 {'April'} {'April' } 19 18 17 16 15 14 13 12
20-Apr-2015 41 47 51 46 54 54 50 2015 2015 {'April'} {'April' } 20 19 18 17 16 15 14 13
21-Apr-2015 41 41 47 51 46 54 54 2015 2015 {'April'} {'April' } 21 20 19 18 17 16 15 14
22-Apr-2015 51 41 41 47 51 46 54 2015 2015 {'April'} {'April' } 22 21 20 19 18 17 16 15
23-Apr-2015 50 51 41 41 47 51 46 2015 2015 {'April'} {'April' } 23 22 21 20 19 18 17 16
24-Apr-2015 40 50 51 41 41 47 51 2015 2015 {'April'} {'April' } 24 23 22 21 20 19 18 17
25-Apr-2015 39 40 50 51 41 41 47 2015 2015 {'April'} {'April' } 25 24 23 22 21 20 19 18
⋮
prepTrainingTbl1 contains lagged predictors (with Lag in their names) and leading predictors (with Step in their names). The table contains missing values due to the creation of these prepared predictors. For example, TemperatureF_Lag1 contains a missing value at time 10-Apr-2015 because the temperature at time 09-Apr-2015 is not known.
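The pattern of missing values in the lagged response columns can be reproduced with a few lines of base MATLAB. The following is a minimal sketch, not the toolbox implementation. It takes the first five temperature values that appear in the table (41, 45, 49, 50, and 54) and shifts them backward in time to build three lag columns. The names y, numLags, and laggedY are hypothetical.

```matlab
% First five temperatures from the training data (10-Apr-2015 onward)
y = [41; 45; 49; 50; 54];
numLags = 3;                     % hypothetical number of lags to build
n = numel(y);

% Start with all-missing columns, then fill the rows that have history
laggedY = NaN(n,numLags);
for k = 1:numLags
    % Column k at row t holds the value observed k rows earlier
    laggedY(k+1:end,k) = y(1:end-k);
end
disp(laggedY)
```

Each column starts with as many NaN values as its lag, which matches the staircase of NaN values in TemperatureF_Lag1 through TemperatureF_Lag3.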
Observe the prepared predictor data used to train Mdl.Learners{7}.
prepTrainingTbl7 = preparedPredictors(Mdl,trainingTbl, ...
HorizonStep=7)
prepTrainingTbl7=372×19 timetable
Time TemperatureF_Lag1 TemperatureF_Lag2 TemperatureF_Lag3 TemperatureF_Lag4 TemperatureF_Lag5 TemperatureF_Lag6 TemperatureF_Lag7 Year_Step7 Year_Step6 Month_Step7 Month_Step6 Day_Step7 Day_Step6 Day_Step5 Day_Step4 Day_Step3 Day_Step2 Day_Step1 Day_Lag1
___________ _________________ _________________ _________________ _________________ _________________ _________________ _________________ __________ __________ ___________ ___________ _________ _________ _________ _________ _________ _________ _________ ________
10-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 NaN {'April'} {0x0 char} 10 NaN NaN NaN NaN NaN NaN NaN
11-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 11 10 NaN NaN NaN NaN NaN NaN
12-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 12 11 10 NaN NaN NaN NaN NaN
13-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 13 12 11 10 NaN NaN NaN NaN
14-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 14 13 12 11 10 NaN NaN NaN
15-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 15 14 13 12 11 10 NaN NaN
16-Apr-2015 NaN NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 16 15 14 13 12 11 10 NaN
17-Apr-2015 41 NaN NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 17 16 15 14 13 12 11 10
18-Apr-2015 45 41 NaN NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 18 17 16 15 14 13 12 11
19-Apr-2015 49 45 41 NaN NaN NaN NaN 2015 2015 {'April'} {'April' } 19 18 17 16 15 14 13 12
20-Apr-2015 50 49 45 41 NaN NaN NaN 2015 2015 {'April'} {'April' } 20 19 18 17 16 15 14 13
21-Apr-2015 54 50 49 45 41 NaN NaN 2015 2015 {'April'} {'April' } 21 20 19 18 17 16 15 14
22-Apr-2015 54 54 50 49 45 41 NaN 2015 2015 {'April'} {'April' } 22 21 20 19 18 17 16 15
23-Apr-2015 46 54 54 50 49 45 41 2015 2015 {'April'} {'April' } 23 22 21 20 19 18 17 16
24-Apr-2015 51 46 54 54 50 49 45 2015 2015 {'April'} {'April' } 24 23 22 21 20 19 18 17
25-Apr-2015 47 51 46 54 54 50 49 2015 2015 {'April'} {'April' } 25 24 23 22 21 20 19 18
⋮
Because Mdl.Learners{7} predicts seven steps ahead, prepTrainingTbl7 contains different predictors from the predictors in prepTrainingTbl1. For example, prepTrainingTbl7 contains the predictors Year_Step7 and Year_Step6 instead of the predictors Year_Step1 and Year_Lag1 in prepTrainingTbl1. The step numbers indicate the horizon steps (that is, the number of time steps ahead).
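The relationship between the leading predictor lags, the horizon step, and the resulting variable names can be sketched in base MATLAB. The rule used here is inferred only from the two tables in this example and is an assumption, not a documented specification: a lag applied at horizon step h gives an offset of h minus the lag, offsets of 1 or more are labeled Step, and the remaining offsets are labeled Lag. The names varName, lags, steps, and namesByStep are hypothetical.

```matlab
varName = "Day";     % leading predictor from this example
lags = 0:7;          % leading predictor lags used for Day
steps = [1 7];       % horizon steps to compare
namesByStep = cell(1,numel(steps));

for j = 1:numel(steps)
    % Offset of each lagged value relative to the current time
    offsets = steps(j) - lags;
    names = strings(size(offsets));
    isLead = offsets >= 1;
    % Future values are labeled by how many steps ahead they are
    names(isLead) = varName + "_Step" + offsets(isLead);
    % Current and past values are labeled as lags, starting at Lag1
    names(~isLead) = varName + "_Lag" + (1 - offsets(~isLead));
    namesByStep{j} = names;
    disp(join(names," "))
end
```

Under this assumed rule, the sketch reproduces the Day columns of both tables: Day_Step1 followed by Day_Lag1 through Day_Lag7 at horizon step 1, and Day_Step7 through Day_Step1 followed by Day_Lag1 at horizon step 7.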
Compute the test set mean squared error at each horizon step.
mse = loss(Mdl,testTbl)
mse = 1×7
32.1256 45.3297 49.8831 49.3660 55.7613 50.4300 53.6758
Obtain the prepared test set predictor data used by Mdl.Learners{1} to compute mse(1). Compare the variables in prepTestTbl1 and prepTrainingTbl1.
prepTestTbl1 = preparedPredictors(Mdl,testTbl);
isequal(prepTrainingTbl1.Properties.VariableNames, ...
prepTestTbl1.Properties.VariableNames)
ans = logical
1
The prepared predictor variables in prepTestTbl1 and prepTrainingTbl1 are the same.
Similarly, obtain the prepared test set predictor data used by Mdl.Learners{7} to compute mse(7). Compare the variables in prepTestTbl7 and prepTrainingTbl7.
prepTestTbl7 = preparedPredictors(Mdl,testTbl, ...
    HorizonStep=7);
isequal(prepTrainingTbl7.Properties.VariableNames, ...
    prepTestTbl7.Properties.VariableNames)
ans = logical
1
The prepared predictor variables in prepTestTbl7 and prepTrainingTbl7 are also the same.
Input Arguments
Mdl — Direct forecasting model
DirectForecaster model object | CompactDirectForecaster model object
Direct forecasting model, specified as a DirectForecaster or CompactDirectForecaster model object.
Tbl — Training or test set data
table | timetable
Training or test set data, specified as a table or timetable. Each row of Tbl corresponds to one observation, and each column corresponds to one variable. Tbl must have the same data type as the predictor data argument used to train Mdl, and must include all exogenous predictors and the response variable.
X — Training or test set exogenous predictor data
numeric matrix | table | timetable
Training or test set exogenous predictor data, specified as a numeric matrix, table, or timetable. Each row of X corresponds to one observation, and each column corresponds to one predictor. X must have the same data type as the predictor data argument used to train Mdl, and must consist of the same exogenous predictors.
Y — Training or test set response data
numeric vector | one-column table | one-column timetable
Training or test set response data, specified as a numeric vector, one-column table, or one-column timetable. Each row of Y corresponds to one observation.
- If X is a numeric matrix, then Y must be a numeric vector.
- If X is a table, then Y must be a numeric vector or one-column table.
- If X is a timetable or it is not specified, then Y must be a numeric vector, one-column table, or one-column timetable.
If you specify both X and Y, then they must have the same number of observations.
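As an illustration of the matrix and vector input types, the following sketch trains a small model on synthetic data and then requests the prepared predictors with X and Y passed separately. The data, the variable names, and the choice of lags are all hypothetical, and the call form preparedPredictors(Mdl,X,Y,HorizonStep=step) is assumed from the X, Y, and step argument descriptions on this page. The example requires Statistics and Machine Learning Toolbox.

```matlab
rng(0)                               % for reproducibility of the synthetic data
n = 60;
X = (1:n)';                          % hypothetical exogenous predictor (time index)
Y = sin(X/5) + 0.1*randn(n,1);       % hypothetical response, numeric vector

% Train on matrix data, so X must be a numeric matrix and Y a numeric vector
Mdl = directforecaster(X,Y,Horizon=1:2, ...
    LeadingPredictors="all",ResponseLags=1:3);

% Prepared predictors for the second horizon step
preparedX = preparedPredictors(Mdl,X,Y,HorizonStep=2);
size(preparedX)
```

Because the model is trained on a numeric matrix, preparedX is expected to be a numeric matrix with one row per observation in X.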
step — Horizon step at which to prepare data
Mdl.Horizon(1) (default) | positive integer scalar
Horizon step at which to prepare data, specified as a positive integer scalar. step must be one of the values in Mdl.Horizon.
If step is element i in Mdl.Horizon, then Mdl.PreparedPredictorsPerHorizon(i,:) indicates the prepared predictors in preparedX.
Example: 2
Data Types: single | double
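The mapping from a horizon step to its element index i can be sketched in base MATLAB. This is a minimal illustration with a hypothetical horizon vector standing in for Mdl.Horizon. It shows only the lookup and validation, not how the software stores the prepared predictors.

```matlab
horizon = [2 4 6];   % hypothetical stand-in for Mdl.Horizon
step = 4;            % requested horizon step

% ismember returns whether step is valid and its position in horizon
[isValid,i] = ismember(step,horizon);
assert(isValid,'step must be one of the values in the horizon vector.')

fprintf("Step %d is element %d of the horizon vector.\n",step,i)
```

With a trained model, the same index i selects the row of Mdl.PreparedPredictorsPerHorizon that applies to the requested step.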
Output Arguments
preparedX — Prepared predictor data used for training or testing
numeric matrix | table | timetable
Prepared predictor data used for training or testing at the specified horizon step, returned as a numeric matrix, table, or timetable. preparedX has the same data type as the data used to train Mdl and is of size n-by-p, where n is the number of observations in Tbl, X, or Y, and p is the number of prepared predictors at the specified horizon step.
Limitations
When you use the preparedPredictors object function, the data set must contain at least Mdl.MaxLag + max(Mdl.Horizon) observations. The software requires these observations for creating lagged and leading predictors.
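As a worked instance of this requirement, the following sketch uses the values displayed for the model in the example on this page (MaxLag of 7, Horizon of 1:7, and 372 training observations). The variable names are hypothetical stand-ins for the model properties and the data set size.

```matlab
maxLag = 7;              % value of Mdl.MaxLag in the example
horizon = 1:7;           % value of Mdl.Horizon in the example
numObservations = 372;   % number of training observations in the example

% Minimum number of observations needed to build the prepared predictors
minObservations = maxLag + max(horizon);
hasEnoughData = numObservations >= minObservations;

fprintf("Need at least %d observations; data set has %d.\n", ...
    minObservations,numObservations)
```

For this model the minimum is 14 observations, so both the training data and any test set passed to preparedPredictors must contain at least that many rows.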
Version History
Introduced in R2023b