estimate

Class: arima

Estimate ARIMA or ARIMAX model parameters

Syntax

```
EstMdl = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)
[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)
```

Description

`EstMdl = estimate(Mdl,y)` uses maximum likelihood to estimate the parameters of the ARIMA(p,D,q) model `Mdl` given the observed univariate time series `y`. `EstMdl` is an `arima` model that stores the results.

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y)` additionally returns `EstParamCov`, the variance-covariance matrix associated with the estimated parameters; `logL`, the optimized loglikelihood objective function value; and `info`, a data structure of summary information.

`[EstMdl,EstParamCov,logL,info] = estimate(Mdl,y,Name,Value)` estimates the model with additional options specified by one or more `Name,Value` pair arguments.

Input Arguments

`Mdl`: ARIMA or ARIMAX model, specified as an `arima` model returned by `arima` or `estimate`.

`estimate` treats non-`NaN` elements in `Mdl` as equality constraints and does not estimate the corresponding parameters.
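As a minimal sketch of how equality constraints behave, the following fixes the constant at 0 and leaves the remaining parameters as `NaN` for `estimate` to fit. The simulated AR(2) series, the seed, and all numeric values are assumptions chosen only for illustration.

```matlab
% Illustrative data: simulate from an assumed AR(2) process.
rng(1);                                   % for reproducibility
TrueMdl = arima('AR',{0.5,-0.3},'Constant',0,'Variance',0.1);
y = simulate(TrueMdl,300);

% Template model: NaN-valued parameters are estimated.
Mdl = arima(2,0,0);
Mdl.Constant = 0;                         % non-NaN value, so held fixed

[EstMdl,EstParamCov] = estimate(Mdl,y,'Display','off');

disp(EstMdl.Constant)                     % stays at the fixed value
disp(EstParamCov(1,:))                    % row for the fixed constant
```

The constant keeps the value you assigned, and its row in `EstParamCov` contains zeros because the optimizer never varies it.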

`y`: Single path of response data to which the model is fit, specified as a numeric column vector. The last observation of `y` is the latest.

Data Types: `double`

Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Initial estimates of the nonseasonal autoregressive coefficients for the ARIMA model, specified as the comma-separated pair consisting of `'AR0'` and a numeric vector.

The number of coefficients in `AR0` must equal the number of lags associated with nonzero coefficients in the nonseasonal autoregressive polynomial, `ARLags`.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Initial estimates of regression coefficients for the regression component, specified as the comma-separated pair consisting of `'Beta0'` and a numeric vector.

The number of coefficients in `Beta0` must equal the number of columns of `X`.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Initial ARIMA model constant estimate, specified as the comma-separated pair consisting of `'Constant0'` and a scalar.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Command Window display option, specified as the comma-separated pair consisting of `'Display'` and a value or any combination of values in this table.

| Value | `estimate` Displays |
| --- | --- |
| `'diagnostics'` | Optimization diagnostics |
| `'full'` | Maximum likelihood parameter estimates, standard errors, t statistics, iterative optimization information, and optimization diagnostics |
| `'iter'` | Iterative optimization information |
| `'off'` | No display in the Command Window |
| `'params'` | Maximum likelihood parameter estimates, standard errors, and t statistics |

For example:

• To run a simulation where you are fitting many models, and therefore want to suppress all output, use `'Display','off'`.

• To display all estimation results and the optimization diagnostics, use `'Display',{'params','diagnostics'}`.

Data Types: `char` | `cell` | `string`
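The following sketch shows one way `'Display','off'` might be used when fitting several candidate models in a loop. The simulated AR(1) series, the seed, and the range of lag orders are assumptions for illustration, not a recommended model-selection procedure.

```matlab
% Illustrative data: simulate from an assumed AR(1) process.
rng(2);                                   % for reproducibility
y = simulate(arima('AR',0.6,'Constant',0,'Variance',1),200);

maxLag = 3;                               % hypothetical largest AR order
logL = zeros(maxLag,1);
for p = 1:maxLag
    % Suppress all Command Window output for each fit.
    [~,~,logL(p)] = estimate(arima(p,0,0),y,'Display','off');
end

disp(logL')                               % optimized loglikelihood per order
```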

Initial t-distribution degrees-of-freedom parameter estimate, specified as the comma-separated pair consisting of `'DoF0'` and a positive scalar. `DoF0` must exceed 2.

Data Types: `double`

Presample innovations that have mean 0 and provide initial values for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'E0'` and a numeric column vector.

`E0` must contain at least `Mdl.Q` rows. If you use a conditional variance model, such as a `garch` model, then the software might require more than `Mdl.Q` presample innovations.

If `E0` contains extra rows, then `estimate` uses the latest `Mdl.Q` presample innovations. The last row contains the latest presample innovation.

By default, `estimate` sets the necessary presample innovations to `0`.

Data Types: `double`

Initial estimates of nonseasonal moving average coefficients for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'MA0'` and a numeric vector.

The number of coefficients in `MA0` must equal the number of lags associated with nonzero coefficients in the nonseasonal moving average polynomial, `MALags`.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Optimization options, specified as the comma-separated pair consisting of `'Options'` and an `optimoptions` optimization controller. For details on altering the default values of the optimizer, see `optimoptions` or `fmincon` in Optimization Toolbox™.

For example, to change the constraint tolerance to `1e-6`, set `Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6,'Algorithm','sqp')`. Then, pass `Options` into `estimate` using `'Options',Options`.

By default, `estimate` uses the same default options as `fmincon`, except `Algorithm` is `'sqp'` and `ConstraintTolerance` is `1e-7`.
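A short sketch of passing custom optimizer settings, under the assumption that Optimization Toolbox is installed. The simulated ARMA(1,1) series and the seed are illustrative assumptions; the option values are the ones quoted in the text above.

```matlab
% Illustrative data: simulate from an assumed ARMA(1,1) process.
rng(3);                                   % for reproducibility
y = simulate(arima('AR',0.5,'MA',0.2,'Constant',0,'Variance',0.1),300);

% Build the options object, then hand it to estimate.
Options = optimoptions(@fmincon,'ConstraintTolerance',1e-6, ...
    'Algorithm','sqp');
[EstMdl,~,~,info] = estimate(arima(1,0,1),y, ...
    'Options',Options,'Display','off');

disp(info.exitflag)                       % fmincon exit flag for this fit
```

Because `'Display'` is specified here, it takes precedence over any display settings stored in `Options`.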

Initial estimates of seasonal autoregressive coefficients for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'SAR0'` and a numeric vector.

The number of coefficients in `SAR0` must equal the number of lags associated with nonzero coefficients in the seasonal autoregressive polynomial, `SARLags`.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Initial estimates of seasonal moving average coefficients for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'SMA0'` and a numeric vector.

The number of coefficients in `SMA0` must equal the number of lags with nonzero coefficients in the seasonal moving average polynomial, `SMALags`.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double`

Presample conditional variances that provide initial values for any conditional variance model, specified as the comma-separated pair consisting of `'V0'` and a numeric column vector with positive entries.

The software requires `V0` to have at least the number of observations required to initialize the variance model. If the number of rows in `V0` exceeds the number necessary, then `estimate` only uses the latest observations. The last row contains the latest observation.

If the variance of the model is constant, then `V0` is unnecessary.

By default, `estimate` sets the necessary presample conditional variances to the average of the squared inferred residuals.

Data Types: `double`

Initial estimates of variances of innovations for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'Variance0'` and a positive scalar or a cell vector of positive scalars. If `Variance0` is a cell vector, then the conditional variance model must recognize the parameter names as valid coefficients.

By default, `estimate` derives initial estimates using standard time series techniques.

Data Types: `double` | `cell`

Exogenous predictors in the regression model, specified as the comma-separated pair consisting of `'X'` and a matrix.

The columns of `X` are separate, synchronized time series, with the last row containing the latest observations.

If you do not specify `Y0`, then the number of rows of `X` must be at least `numel(y) + Mdl.P`. Otherwise, the number of rows of `X` must be at least the length of `y`.

If the number of rows of `X` exceeds the number necessary, then `estimate` uses the latest observations and synchronizes `X` with the response series `y`.

By default, `estimate` does not estimate the regression coefficients regardless of their presence in `Mdl`.

Data Types: `double`
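The row requirement for `X` can be checked before calling `estimate`. The sketch below uses plain MATLAB with hypothetical sizes (`T`, `P`, and the random data are assumptions); it only mimics the "latest rows" rule described above and is not the toolbox's internal code.

```matlab
% Hypothetical sizes, for illustration only.
T = 100;                                  % length of the response y
P = 3;                                    % stands in for Mdl.P
y = randn(T,1);
X = randn(T + P + 5,2);                   % five more rows than needed

% Without Y0, X needs at least numel(y) + P rows.
requiredRows = numel(y) + P;
assert(size(X,1) >= requiredRows,'X has too few rows.');

% Extra rows at the start are ignored; the latest rows are kept.
XUsed = X(end-requiredRows+1:end,:);
disp(size(XUsed))
```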

Presample response data that provides initial values for the ARIMA(p,D,q) model, specified as the comma-separated pair consisting of `'Y0'` and a numeric column vector.

`Y0` is a column vector with at least `Mdl.P` rows. If the number of rows in `Y0` exceeds `Mdl.P`, `estimate` only uses the latest `Mdl.P` observations. The last row contains the latest observation.

By default, `estimate` backcasts (backward forecasts) the necessary number of presample observations.

Data Types: `double`

Notes

• `NaN`s indicate missing values, and `estimate` removes them. The software merges the presample data (`E0`, `V0`, and `Y0`) separately from the effective sample data (`X` and `y`), then uses list-wise deletion to remove any `NaN`s.

• Removing `NaN`s in the data reduces the sample size, and can also create irregular time series.

• `estimate` assumes that you synchronize the response and exogenous predictors such that the last (latest) observation of each occurs simultaneously. The software also assumes that you synchronize the presample series similarly.

• If you specify a value for `Display`, then it takes precedence over the specifications of the optimization options `Diagnostics` and `Display`. Otherwise, `estimate` honors all selections related to the display of optimization information in the optimization options.
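The list-wise deletion described in the notes above can be sketched in plain MATLAB. The small `y` and `X` arrays are made-up values, and the snippet assumes the two are already synchronized and of equal length; it approximates the described behavior rather than reproducing the toolbox code.

```matlab
% Made-up effective sample data with missing values.
y = [1.2; NaN; 0.7; 0.9; 1.1];
X = [0.5 1.0; 0.4 0.9; NaN 0.8; 0.6 1.1; 0.3 1.2];

% List-wise deletion: drop every row in which any series is NaN.
keep = ~any(isnan([y X]),2);
yClean = y(keep);
XClean = X(keep,:);

disp(find(~keep)')                        % rows removed from the sample
```

Rows 2 and 3 are removed here, so the remaining observations are no longer equally spaced in time, which is the irregularity the notes warn about.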

Output Arguments

`EstMdl`: Model containing parameter estimates, returned as an `arima` model. `estimate` uses maximum likelihood to calculate all parameter estimates not constrained by `Mdl` (that is, all parameters in `Mdl` that you set to `NaN`).

`EstParamCov`: Variance-covariance matrix of maximum likelihood estimates of model parameters known to the optimizer, returned as a matrix.

The rows and columns contain the covariances of the parameter estimates. The standard errors of the parameter estimates are the square roots of the entries along the main diagonal.

The rows and columns associated with any parameters held fixed as equality constraints contain `0`s.

`estimate` uses the outer product of gradients (OPG) method to perform covariance matrix estimation.

`estimate` orders the parameters in `EstParamCov` as follows:

• Constant

• Nonzero `AR` coefficients at positive lags

• Nonzero `SAR` coefficients at positive lags

• Nonzero `MA` coefficients at positive lags

• Nonzero `SMA` coefficients at positive lags

• Regression coefficients (when you specify `X` in `estimate`)

• Variance parameters (scalar for constant-variance models, vector of additional parameters otherwise)

• Degrees of freedom (t innovation distribution only)

Data Types: `double`
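As a rough sketch, standard errors and t statistics can be recovered from `EstParamCov` and `info.X` using the parameter ordering listed above. The simulated ARMA(1,1) series, the seed, and the row labels are assumptions for illustration; the model has no fixed parameters, so every diagonal entry is expected to be positive.

```matlab
% Illustrative data: simulate from an assumed ARMA(1,1) process.
rng(5);                                   % for reproducibility
y = simulate(arima('AR',0.5,'MA',0.2,'Constant',0.1,'Variance',0.1),500);

[EstMdl,EstParamCov,~,info] = estimate(arima(1,0,1),y,'Display','off');

% Standard errors: square roots of the main diagonal.
se = sqrt(diag(EstParamCov));

% t statistics: estimate divided by its standard error.
est = info.X(:);
tStat = est ./ se;

% Labels follow the documented order: constant, AR, MA, variance.
names = {'Constant';'AR{1}';'MA{1}';'Variance'};
disp(table(est,se,tStat,'RowNames',names, ...
    'VariableNames',{'Value','StandardError','TStatistic'}))
```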

`logL`: Optimized loglikelihood objective function value, returned as a scalar.

Data Types: `double`

`info`: Summary information, returned as a structure.

| Field | Description |
| --- | --- |
| `exitflag` | Optimization exit flag (see `fmincon` in Optimization Toolbox) |
| `options` | Optimization options controller (see `optimoptions` and `fmincon` in Optimization Toolbox) |
| `X` | Vector of final parameter estimates |
| `X0` | Vector of initial parameter estimates |

For example, you can display the vector of final estimates by typing `info.X` in the Command Window.

Data Types: `struct`
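One possible use of `info` is a quick convergence check after a quiet fit. In this sketch the simulated AR(1) series and the seed are assumptions, and the rule that a nonpositive exit flag signals a problem follows the usual `fmincon` convention.

```matlab
% Illustrative data: simulate from an assumed AR(1) process.
rng(4);                                   % for reproducibility
y = simulate(arima('AR',0.5,'Constant',0,'Variance',0.1),300);

[~,~,~,info] = estimate(arima(1,0,0),y,'Display','off');

% fmincon reports a positive exit flag when it stops at a solution.
if info.exitflag <= 0
    warning('Optimizer did not report convergence (exitflag = %d).', ...
        info.exitflag);
end

% Compare the starting values with the final estimates.
disp(info.X0(:)')
disp(info.X(:)')
```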

Examples

Fit an ARMA(2,1) model to simulated data.

Simulate 500 data points from the ARMA(2,1) model

`${y}_{t}=0.5{y}_{t-1}-0.3{y}_{t-2}+{\epsilon }_{t}+0.2{\epsilon }_{t-1},$`

where ${\epsilon }_{t}$ follows a Gaussian distribution with mean 0 and variance 0.1.

```
Mdl = arima('AR',{0.5,-0.3},'MA',0.2,...
    'Constant',0,'Variance',0.1);
rng(5); % For reproducibility
y = simulate(Mdl,500);
```

The simulated data is stored in the column vector `y`.

Specify an ARMA(2,1) model with no constant and unknown coefficients and variance.

```
ToEstMdl = arima(2,0,1);
ToEstMdl.Constant = 0
```
```
ToEstMdl = 
  arima with properties:

     Description: "ARIMA(2,0,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 2
               D: 0
               Q: 1
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

Fit the ARMA(2,1) model to `y`.

`EstMdl = estimate(ToEstMdl,y);`
```
    ARIMA(2,0,1) Model (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue  
                ________    _____________    __________    __________
    Constant           0              0             NaN           NaN
    AR{1}        0.49404        0.10321          4.7866    1.6961e-06
    AR{2}       -0.25348        0.06993         -3.6248    0.00028921
    MA{1}        0.27958        0.10721          2.6078     0.0091132
    Variance     0.10009      0.0066403          15.073    2.4228e-51
```

The result is a new `arima` model called `EstMdl`. The estimates in `EstMdl` resemble the parameter values that generated the simulated data.

Fit an integrated ARIMA(1,1,1) model to the daily close of the NASDAQ Composite Index.

Load the NASDAQ data included with the toolbox. Extract the first 1500 observations of the Composite Index (January 1990 to December 1995).

```
load Data_EquityIdx
nasdaq = DataTable.NASDAQ(1:1500);
```

Specify an ARIMA(1,1,1) model for fitting.

`Mdl = arima(1,1,1);`

The model is nonseasonal, so you can use shorthand syntax.

Fit the model to the first half of the data.

`EstMdl = estimate(Mdl,nasdaq(1:750));`
```
    ARIMA(1,1,1) Model (Gaussian Distribution):

                 Value     StandardError    TStatistic      PValue   
                _______    _____________    __________    ___________
    Constant     0.2234        0.18418           1.213        0.22514
    AR{1}       0.11434        0.11944         0.95733         0.3384
    MA{1}       0.12764        0.11925          1.0703        0.28448
    Variance     18.983        0.68999          27.512    1.2547e-166
```

The result is a new `arima` model (`EstMdl`). The estimated parameters, their standard errors, and $t$ statistics display in the Command Window.

Use the estimated parameters as initial values for fitting the second half of the data.

```
con0 = EstMdl.Constant;
ar0 = EstMdl.AR{1};
ma0 = EstMdl.MA{1};
var0 = EstMdl.Variance;
[EstMdl2,EstParamCov2,logL2,info2] = estimate(Mdl,...
    nasdaq(751:end),'Constant0',con0,'AR0',ar0,...
    'MA0',ma0,'Variance0',var0);
```
```
    ARIMA(1,1,1) Model (Gaussian Distribution):

                 Value      StandardError    TStatistic      PValue   
                ________    _____________    __________    ___________
    Constant     0.61143        0.32675          1.8712       0.061313
    AR{1}       -0.15071        0.11782         -1.2792        0.20084
    MA{1}        0.38569        0.10905          3.5366     0.00040529
    Variance      36.493          1.227          29.742    2.1903e-194
```

The parameter estimates are stored in the `info2` data structure. Display the final parameter estimates.

`info2.X`
```
ans = 4×1

    0.6114
   -0.1507
    0.3857
   36.4933
```

Fit an ARIMAX model to a simulated time series without specifying initial values for the response or the parameters.

Define the ARIMAX(2,1,1) model

`$\left(1-0.5L+0.3L^{2}\right)\left(1-L\right)^{1}y_{t}=1.5x_{1,t}+2.6x_{2,t}-0.3x_{3,t}+\epsilon_{t}+0.2\epsilon_{t-1}$`

to eventually simulate a time series of length 500, where ${\epsilon }_{t}$ follows a Gaussian distribution with mean 0 and variance 0.1.

```
Mdl = arima('AR',{0.5,-0.3},'MA',0.2,'D',1,...
    'Constant',0,'Variance',0.1,'Beta',[1.5 2.6 -0.3]);
T = 500;
```

Simulate three stationary AR(1) series and presample values:

`$\begin{array}{c}{x}_{1,t}=0.1{x}_{1,t-1}+{\eta }_{1,t}\\ {x}_{2,t}=0.2{x}_{2,t-1}+{\eta }_{2,t}\\ {x}_{3,t}=0.3{x}_{3,t-1}+{\eta }_{3,t},\end{array}$`

where ${\eta }_{i,t}$ follows a Gaussian distribution with mean 0 and variance 0.01 for i = {1,2,3}.

```
numObs = Mdl.P + T;
MdlX1 = arima('AR',0.1,'Constant',0,'Variance',0.01);
MdlX2 = arima('AR',0.2,'Constant',0,'Variance',0.01);
MdlX3 = arima('AR',0.3,'Constant',0,'Variance',0.01);
X1 = simulate(MdlX1,numObs);
X2 = simulate(MdlX2,numObs);
X3 = simulate(MdlX3,numObs);
Xmat = [X1 X2 X3];
```

The simulated exogenous predictors are stored in the `numObs`-by-3 matrix `Xmat`.

Simulate 500 data points from the ARIMAX(2,1,1) model.

`y = simulate(Mdl,T,'X',Xmat);`

The simulated response is stored in the column vector `y`.

Create an ARIMA(2,1,1) model with known `0`-valued constant and unknown coefficients and variance.

```
ToEstMdl = arima(2,1,1);
ToEstMdl.Constant = 0
```
```
ToEstMdl = 
  arima with properties:

     Description: "ARIMA(2,1,1) Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
               P: 3
               D: 1
               Q: 1
        Constant: 0
              AR: {NaN NaN} at lags [1 2]
             SAR: {}
              MA: {NaN} at lag [1]
             SMA: {}
     Seasonality: 0
            Beta: [1×0]
        Variance: NaN
```

`ToEstMdl` is an ARIMA(2,1,1) model. `estimate` changes this designation to ARIMAX(2,1,1) when you pass the exogenous predictors into the `X` argument. `estimate` estimates all parameters with the value `NaN` in `ToEstMdl`.

Fit the ARIMAX(2,1,1) model to `y` including regression matrix `Xmat`.

`EstMdl = estimate(ToEstMdl,y,'X',Xmat);`
```
    ARIMAX(2,1,1) Model (Gaussian Distribution):

                 Value       StandardError    TStatistic      PValue  
                _________    _____________    __________    __________
    Constant            0              0             NaN           NaN
    AR{1}         0.41634       0.046067          9.0376     1.601e-19
    AR{2}        -0.27405       0.040645         -6.7427    1.5552e-11
    MA{1}          0.3346       0.057208          5.8488      4.95e-09
    Beta(1)        1.4194        0.14242          9.9662    2.1429e-23
    Beta(2)         2.542         0.1331          19.098    2.6194e-81
    Beta(3)      -0.28767        0.14035         -2.0496      0.040399
    Variance     0.096777       0.005791          16.712      1.08e-62
```

`EstMdl` is a new `arima` model designated as ARIMAX(2,1,1) since exogenous predictors enter the model. The estimates in `EstMdl` resemble the parameter values that generated the simulated data.

Fit an ARIMAX model to a time series specifying initial values for the response and the parameters.

The Credit Defaults data set contains four variables:

• Default rate on investment-grade corporate bonds (IGD)

• Percentage of investment-grade bond issuers first rated 3 years ago (AGE)

• One-year-ahead forecast of the change in corporate profits, adjusted for inflation (CPF)

• Spread between corporate bond yields and those of comparable government bonds (SPR)

Assume that an ARIMAX(1,0,0) model is appropriate to fit IGD using AGE, CPF, and SPR as exogenous predictors. Load the Credit Defaults data set. Assign the response IGD to `y`. Assign the predictors AGE, CPF, and SPR to the matrix `X`.

```
load Data_CreditDefaults
X = Data(:,[1 3:4]);
T = size(X,1);
y = Data(:,5);
```

The response and exogenous predictor series should be stationary before you continue. If the response is not stationary, specify the degree of integration in the `arima` statement. If the exogenous predictors are not stationary, difference them using `diff`. The series in this example are stationary, which keeps the focus on the example's main purpose.

Separate the initial values from the main response and exogenous predictors. Choose initial values for the regression coefficients `Beta0`.

```
y0 = y(1);
yEst = y(2:T);
XEst = X(2:end,:);
Beta0 = [0.5 0.5 0.5];
```

`y0` initializes the response series and `yEst` is the main response series for estimation. `XEst` is the main exogenous predictor matrix for estimation.

Specify the model `Mdl` to fit to the data.

`Mdl = arima(1,0,0);`

Fit the model to the data and specify the initial values.

```
EstMdl = estimate(Mdl,yEst,'X',XEst,...
    'Y0',y0,'Beta0',Beta0);
```
```
    ARIMAX(1,0,0) Model (Gaussian Distribution):

                  Value       StandardError    TStatistic     PValue 
                _________    _____________    __________    ________
    Constant     -0.20477        0.26608        -0.76958     0.44155
    AR{1}       -0.017311        0.56562       -0.030606     0.97558
    Beta(1)      0.023933       0.021842          1.0957     0.27319
    Beta(2)      -0.01246      0.0074992         -1.6615    0.096603
    Beta(3)      0.068087       0.074504         0.91388     0.36078
    Variance    0.0053946      0.0022439          2.4041    0.016212
```
