Main Content

simulate

Monte Carlo simulation of regression model with ARIMA errors

Description

example

Y = simulate(Mdl,numObs) returns a length numObs random sample path of observations Y simulated from the regression model with ARIMA time series errors Mdl.

example

Y = simulate(Mdl,numObs,Name=Value) specifies additional options using one or more name-value arguments. For example, simulate(Mdl,10,NumPaths=1000,X=Pred) simulates 1000 sample paths of length 10 from the regression model with ARIMA errors Mdl, and uses the predictor data in Pred for the model regression component.

example

[Y,E,U] = simulate(___) also returns simulated innovation paths E and unconditional disturbance paths U using any input argument combination in the previous syntaxes.

Examples

collapse all

Create the following regression model with ARMA(2,1) errors:

yt=1+utut=0.5ut-1-0.8ut-2+εt-0.5εt-1,

where εt is Gaussian with variance 0.1.

Mdl = regARIMA(Intercept=1,AR={0.5 -0.8},MA=-0.5, ...
    Variance=0.1);

Mdl is a fully specified regARIMA object.

Simulate a path of responses of length 100.

rng(1) % For reproducibility
y = simulate(Mdl,100);

y is a 100-by-1 vector of containing the response path simulated from Mdl.

Plot the simulated path.

plot(y)

Figure contains an axes object. The axes object contains an object of type line.

Regress the stationary, quarterly log GDP onto the CPI using a regression model with ARMA(1,1) errors, and forecast log GDP using Monte Carlo simulation.

Load the US Macroeconomic data set and preprocess the data.

load Data_USEconModel;
logGDP = log(DataTimeTable.GDP);
dlogGDP = diff(logGDP);              % Stabilize series
dCPI = diff(DataTimeTable.CPIAUCSL); % Stabilize series
numObs = length(dlogGDP);
gdp = dlogGDP(1:end-15);   % Estimation sample
cpi = dCPI(1:end-15);
T = length(gdp);          % Effective sample size
frstHzn =  (T+1):numObs;  % Forecast horizon
hoCPI = dCPI(frstHzn);    % Holdout sample
dts = DataTimeTable.Time(2:end);

Fit a regression model with ARMA(1,1) errors.

Mdl = regARIMA(ARLags=1,MALags=1);
EstMdl = estimate(Mdl,gdp,X=cpi);
 
    Regression with ARMA(1,1) Error Model (Gaussian Distribution):
 
                   Value       StandardError    TStatistic      PValue  
                 __________    _____________    __________    __________

    Intercept      0.014793      0.0016289        9.0818      1.0684e-19
    AR{1}           0.57601        0.10009        5.7548      8.6755e-09
    MA{1}          -0.15258        0.11978       -1.2738         0.20272
    Beta(1)       0.0028972      0.0013989         2.071        0.038355
    Variance     9.5734e-05     6.5562e-06        14.602       2.723e-48

Infer unconditional disturbances.

[~,u0] = infer(EstMdl,gdp,X=cpi);

Simulate 1000 paths with 15 observations each. Use the inferred unconditional disturbances as presample data.

rng(1); % For reproducibility
gdpF = simulate(EstMdl,15,NumPaths=1000, ...
   U0=u0,X=hoCPI);

Plot the simulation mean forecast and approximate 95% forecast intervals.

lower = prctile(gdpF,2.5,2);
upper = prctile(gdpF,97.5,2);
mn = mean(gdpF,2);

figure
plot(dts(end-65:end),dlogGDP(end-65:end),Color=[.7,.7,.7])
hold on
h1 = plot(dts(frstHzn),lower,"r:",LineWidth=2);
plot(dts(frstHzn),upper,"r:",LineWidth=2)
h2 = plot(dts(frstHzn),mn,"k",LineWidth=2);
h = gca;
ph = patch([repmat(dts(frstHzn(1)),1,2) repmat(dts(frstHzn(end)),1,2)], ...
   [h.YLim fliplr(h.YLim)], ...
   [0 0 0 0],"b");
ph.FaceAlpha = 0.1;
legend([h1 h2],["95% Interval" "Simulation mean"],Location="northwest", ...
    AutoUpdate="off")
axis tight
title("log GDP Forecast - 15 Quarter Horizon")
hold off

Figure contains an axes object. The axes object with title log GDP Forecast - 15 Quarter Horizon contains 5 objects of type line, patch. These objects represent 95% Interval, Simulation mean.

Regress the unit root nonstationary, quarterly log GDP onto the CPI using a regression model with ARIMA(1,1,1) errors with known intercept. Forecast log GDP using Monte Carlo simulation.

Load the US Macroeconomic data set and preprocess the data.

load Data_USEconModel
numObs = length(DataTimeTable.GDP);
logGDP = log(DataTimeTable.GDP(1:end-15));
cpi = DataTimeTable.CPIAUCSL(1:end-15);
T = length(logGDP);
frstHzn = (T+1):numObs;                   % Forecast horizon
hoCPI = DataTimeTable.CPIAUCSL(frstHzn);  % Holdout sample

Fit a regression model with ARIMA(1,1,1). The intercept is not identifiable in a model with integrated errors, so fix its value before estimation.

intercept = 5.867;
Mdl = regARIMA(ARLags=1,MALags=1,D=1,Intercept=intercept);
EstMdl = estimate(Mdl,logGDP,X=cpi);
 
    Regression with ARIMA(1,1,1) Error Model (Gaussian Distribution):
 
                   Value       StandardError    TStatistic      PValue   
                 __________    _____________    __________    ___________

    Intercept         5.867              0           Inf                0
    AR{1}           0.92271       0.030978        29.786      5.8732e-195
    MA{1}          -0.38785       0.060354       -6.4263       1.3076e-10
    Beta(1)       0.0039668      0.0016498        2.4044         0.016199
    Variance     0.00010894      7.272e-06        14.981       9.7346e-51

Infer unconditional disturbances.

[~,u0] = infer(EstMdl,logGDP,X=cpi);

Simulate 1000 response paths with 15 observations each. Use the inferred unconditional disturbances as presample data.

rng(1); % For reproducibility
GDPF = simulate(EstMdl,15,NumPaths=1000, ...
    U0=u0,X=hoCPI);

Plot the simulation mean forecast and approximate 95% forecast intervals.

lower = prctile(GDPF,2.5,2);
upper = prctile(GDPF,97.5,2);
mn = mean(GDPF,2);
dt = DataTimeTable.Time;

figure
plot(dt(end-65:end),log(DataTimeTable.GDP(end-65:end)),'Color',...
   [.7,.7,.7])
hold on
h1 = plot(dt(frstHzn),lower,'r:','LineWidth',2);
plot(dt(frstHzn),upper,'r:','LineWidth',2)
h2 = plot(dt(frstHzn),mn,'k','LineWidth',2);
h = gca;
ph = patch([repmat(dt(frstHzn(1)),1,2) repmat(dt(frstHzn(end)),1,2)],...
    [h.YLim fliplr(h.YLim)],...
    [0 0 0 0],'b');
ph.FaceAlpha = 0.1;
legend([h1 h2],{'95% Interval','Simulation Mean'},'Location','NorthWest',...
    'AutoUpdate','off')
axis tight
title('{\bf log GDP Forecast - 15 Quarter Horizon}')
hold off

Figure contains an axes object. The axes object with title blank l o g blank G D P blank F o r e c a s t blank - blank 1 5 blank Q u a r t e r blank H o r i z o n contains 5 objects of type line, patch. These objects represent 95% Interval, Simulation Mean.

The unconditional disturbances, ut, are nonstationary, therefore the widths of the forecast intervals grow with time.

Simulate paths of responses, innovations, and unconditional disturbances from a regression model with SARIMA(2,1,1)12 errors.

Specify the model:

yt=X[1.5-2]+ut(1-0.2L-0.1L2)(1-L)(1-0.01L12)(1-L12)ut=(1+0.5L)(1+0.02L12)εt,

where εt follows a t-distribution with 15 degrees of freedom.

dstr = struct("Name","t","DoF",15);
Mdl = regARIMA(AR={0.2 0.1},MA=0.5,SAR=0.01,SARLags=12, ...
    SMA=0.02,SMALags=12,D=1,Seasonality=12,Beta=[1.5; -2], ...
    Intercept=0,Variance=0.1,Distribution=dstr)
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(2,1,1) Error Model Seasonally Integrated with Seasonal AR(12) and MA(12) (t Distribution)"
    Distribution: Name = "t", DoF = 15
       Intercept: 0
            Beta: [1.5 -2]
               P: 27
               D: 1
               Q: 13
              AR: {0.2 0.1} at lags [1 2]
             SAR: {0.01} at lag [12]
              MA: {0.5} at lag [1]
             SMA: {0.02} at lag [12]
     Seasonality: 12
        Variance: 0.1

Simulate and plot 500 paths with 25 observations each.

T = 25;
rng(1) % For reproducibility
Pred = randn(T,2);
[Y,E,U] = simulate(Mdl,T,NumPaths=500,X=Pred);

figure
tiledlayout(3,1)
nexttile
plot(Y)
axis tight
title("Simulated Response Paths")
nexttile
plot(E)
axis tight
title("Simulated Innovations Paths")
nexttile
plot(U)
axis tight
title("Simulated Unconditional Disturbances Paths")

Figure contains 3 axes objects. Axes object 1 with title Simulated Response Paths contains 500 objects of type line. Axes object 2 with title Simulated Innovations Paths contains 500 objects of type line. Axes object 3 with title Simulated Unconditional Disturbances Paths contains 500 objects of type line.

Plot the 2.5th, 50th (median), and 97.5th percentiles of the simulated response paths.

lower = prctile(Y,2.5,2);
middle = median(Y,2);
upper = prctile(Y,97.5,2);

figure
plot(1:25,lower,"r:",1:25,middle,"k",1:25,upper,"r:")
title("95% Percentile Confidence Interval for Response")
legend("95% Interval","Median",Location="best")

Figure contains an axes object. The axes object with title 95% Percentile Confidence Interval for Response contains 3 objects of type line. These objects represent 95% Interval, Median.

Compute statistics across the second dimension (across paths) to summarize the sample paths.

Plot a histogram of the simulated paths at time 20.

figure
histogram(Y(20,:),10)
title("Response Distribution at Time 20")

Figure contains an axes object. The axes object with title Response Distribution at Time 20 contains an object of type histogram.

Input Arguments

collapse all

Fully specified regression model with ARIMA errors, specified as a regARIMA model object created by regARIMA or estimate.

The properties of Mdl cannot contain NaN values.

Number of random observations to generate per output path, specified as a positive integer. The output arguments Y, E, and U, have numObs rows.

Data Types: double

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: simulate(Mdl,10,NumPaths=1000,X=Pred) simulates 1000 sample paths of length 10 from the regression model with ARIMA errors Mdl, and uses the predictor data in Pred for the model regression component.

Number of sample paths to generate, specified as a positive integer. The outputs Y, E, and U have NumPaths columns.

Example: NumPaths=1000

Data Types: double

Presample innovations εt that provide initial values for the ARIMA error model, specified as a length numpreobs numeric column vector or a numpreobs-by-numprepaths numeric matrix.

numpreobs is the number of presample observations. numprepaths is the number of presample response paths.

Each row is a presample innovation, and measurements in each row occur simultaneously. The last row contains the latest presample innovation for each path. E0 must have at least Mdl.Q rows. If you supply more rows than necessary, simulate uses the latest Mdl.Q innovations only.

Columns correspond to separate, independent paths.

  • If E0 is a column vector, simulate applies it to simulate each sample path (column). Therefore, all output sample paths derive from common initial conditions.

  • Otherwise, simulate applies E0(:,j) to initialize simulating path j. E0 must have at least NumPaths columns, and simulate uses only the first NumPaths columns.

Data Types: double

Presample unconditional disturbances ut that provide initial values for the ARIMA error model, specified as a length numpreobs numeric column vector or a numpreobs-by-numprepaths numeric matrix.

Each row is a presample unconditional disturbance, and measurements in each row occur simultaneously. The last row contains the latest presample unconditional disturbance for each path. U0 must have at least Mdl.P rows. If you supply more rows than necessary, simulate uses the latest Mdl.P unconditional disturbances only.

Columns correspond to separate, independent paths.

  • If U0 is a column vector, simulate applies it to simulate each sample path (column). Therefore, all output sample paths derive from common initial conditions.

  • Otherwise, simulate applies U0(:,j) to initialize simulating path j. U0 must have at least NumPaths columns, and simulate uses only the first NumPaths columns.

Data Types: double

Predictor data for the regression component in the model, specified as a numeric matrix containing numpreds columns.

numpreds is the number of predictor variables (numel(Mdl.Beta)).

Each row corresponds to an observation, and measurements in each row occur simultaneously. The last row contains the latest observation. X must have at least numObs rows. If you supply more rows than necessary, simulate uses only the latest numObs observations. simulate does not use the regression component in the presample period.

Each column is an individual predictor variable.

simulate applies X to each path (page); that is, X represents one path of observed predictors.

By default, simulate excludes the regression component, regardless of its presence in Mdl.

Data Types: double

Note

  • NaNs in input data indicate missing values. simulate uses listwise deletion to delete all sampled times (rows) in the input data containing at least one missing value. Specifically, simulate performs these steps:

    1. Synchronize, or merge, the presample data sets E0 and U0 to create the set Presample.

    2. Remove all rows from Presample and the predictor data X containing at least one NaN.

    Listwise deletion applied to the in-sample data can reduce the sample size and create irregular time series.

  • simulate assumes that you synchronize the predictor series such that the latest observations occur simultaneously. The software also assumes that you synchronize the presample series similarly.

Output Arguments

collapse all

Simulated response paths, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix. Y represents the continuation of input presample observations.

Each row is a time point in the simulation horizon. Values in a row, among all paths (columns), occur simultaneously. The last row contains the latest simulated values.

Columns correspond to separate, independently simulated paths.

Simulated model innovation paths, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix.

The dimensions of E correspond to the dimensions of Y.

Simulated unconditional disturbance paths, returned as a length numObs numeric column vector or a numObs-by-NumPaths numeric matrix.

The dimensions of U correspond to the dimensions of Y.

References

[1] Box, George E. P., Gwilym M. Jenkins, and Gregory C. Reinsel. Time Series Analysis: Forecasting and Control. 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 1994.

[2] Davidson, R., and J. G. MacKinnon. Econometric Theory and Methods. Oxford, UK: Oxford University Press, 2004.

[3] Enders, Walter. Applied Econometric Time Series. Hoboken, NJ: John Wiley & Sons, Inc., 1995.

[4] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[5] Pankratz, A. Forecasting with Dynamic Regression Models. John Wiley & Sons, Inc., 1991.

[6] Tsay, R. S. Analysis of Financial Time Series. 2nd ed. Hoboken, NJ: John Wiley & Sons, Inc., 2005.

Version History

Introduced in R2013b