kpsstest

KPSS test for stationarity

Syntax

``h = kpsstest(y)``
``[h,pValue,stat,cValue] = kpsstest(y)``
``StatTbl = kpsstest(Tbl)``
``[___] = kpsstest(___,Name=Value)``
``[___,reg] = kpsstest(___)``

Description

`h = kpsstest(y)` returns the rejection decision from conducting the Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test for a unit root in the input univariate time series `y`.

`[h,pValue,stat,cValue] = kpsstest(y)` also returns the p-value `pValue`, test statistic `stat`, and critical value `cValue` of the test.

`StatTbl = kpsstest(Tbl)` returns a table containing variables for the test results, statistics, and settings from conducting the KPSS test for a unit root in the last variable of the input table or timetable `Tbl`. To select a different variable in `Tbl` to test, use the `DataVariable` name-value argument.

`[___] = kpsstest(___,Name=Value)` specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. `kpsstest` returns the output argument combination for the corresponding input arguments.

Some options control the number of tests to conduct. The following conditions apply when `kpsstest` conducts multiple tests:

• `kpsstest` treats each test as separate from all other tests.

• If you specify `y`, all outputs are vectors.

• If you specify `Tbl`, each row of `StatTbl` contains the results of the corresponding test.

For example, `kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1])` conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable `GDP` of the table `Tbl`. The first test includes `0` autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes `1` autocovariance lag.

`[___,reg] = kpsstest(___)` additionally returns `reg`, a structure of regression statistics for the hypothesis test.

Examples

Test a time series for a unit root using the default options of `kpsstest`. Input the time series data as a numeric vector.

Load the Nelson-Plosser macroeconomic series data set. Plot the real gross national product (RGNP).

```
load Data_NelsonPlosser
rgnp = DataTable.GNPR;
dt = datetime(dates,ConvertFrom="datenum");
plot(dt,rgnp)
title("Real Gross National Product")
```

The series exhibits exponential growth.

Linearize the RGNP series.

`linRGNP = log(rgnp);`

Assess the null hypothesis of the KPSS test, which is that the series is trend stationary. Use default options.

`h = kpsstest(linRGNP)`
```
h = logical
   1
```

`h = 1` indicates that, at a 5% level of significance, the test rejects the null hypothesis that the linearized Real GNP series is trend stationary, which suggests that the series is unit root nonstationary.

Load the Nelson-Plosser macroeconomic series data set, and linearize the RGNP series.

```
load Data_NelsonPlosser
linRGNP = log(DataTable.GNPR);
```

Assess the null hypothesis that the series is trend stationary. Return the test decision, $\mathit{p}$-value, test statistic, and critical value.

`[h,pValue,stats,cValue] = kpsstest(linRGNP)`
```
h = logical
   1

pValue = 0.0100
stats = 0.6299
cValue = 0.1460
```

Test whether a time series, which is one variable in a table, is trend stationary using the default options.

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table `DataTable`. Linearize the RGNP series by applying the log transformation, and store the result in `DataTable`.

```
load Data_NelsonPlosser
DataTable.LinRGNP = log(DataTable.GNPR);
DataTable.Properties.VariableNames{end}
```
```
ans = 'LinRGNP'
```

Test the null hypothesis that the linearized RGNP series is trend stationary.

`StatTbl = kpsstest(DataTable)`
```
StatTbl=1×7 table
                h      pValue     stat      cValue    Lags    Alpha    Trend
              _____    ______    _______    ______    ____    _____    _____

    Test 1    true      0.01     0.62989    0.146      0      0.05     true
```

`kpsstest` returns test results and settings in the table `StatTbl`, where variables correspond to test results (`h`, `pValue`, `stat`, and `cValue`) and settings (`Lags`, `Alpha`, `Trend`), and rows correspond to individual tests (in this case, `kpsstest` conducts one test).

By default, `kpsstest` tests the last variable in the table. To select a variable from an input table to test, set the `DataVariable` option.

Conduct multiple tests on the linearized RGNP series that reproduce the first row of the second half of Table 5 in [2].

Load the Nelson-Plosser macroeconomic series data set, which contains annual measurements of macroeconomic variables in the table `DataTable`. Apply the log transformation to all variables in the table.

```
load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
LogDT.Properties.VariableNames{end}
```
```
ans = 'log_SP'
```

`varfun` applies `log` to all variables in `DataTable`, prepends `log_` to all transformed variable names, and stores the result in the table `LogDT`. The final variable is the log of the stock price index series (`SP`).

Assess the null hypothesis that the linearized RGNP series is trend stationary over a range of lags. Specify the variable name of the linearized RGNP series `log_GNPR`.

```
lags = (0:8);
StatTbl = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags)
```
```
StatTbl=9×7 table
                h       pValue      stat      cValue    Lags    Alpha    Trend
              _____    ________    _______    ______    ____    _____    _____

    Test 1    true         0.01    0.62989    0.146      0      0.05     true
    Test 2    true         0.01    0.33666    0.146      1      0.05     true
    Test 3    true         0.01    0.24209    0.146      2      0.05     true
    Test 4    true       0.0169     0.1976    0.146      3      0.05     true
    Test 5    true     0.027579    0.17291    0.146      4      0.05     true
    Test 6    true      0.04015    0.15782    0.146      5      0.05     true
    Test 7    true     0.048417     0.1479    0.146      6      0.05     true
    Test 8    false     0.05886    0.14122    0.146      7      0.05     true
    Test 9    false    0.066757    0.13695    0.146      8      0.05     true
```

The tests corresponding to 0 $\le$ `lags` $\le$ 2 produce $\mathit{p}$-values that are less than 0.01; `kpsstest` reports the smallest tabulated value, 0.01. For all `lags` < 7, the tests reject the null hypothesis at the default 5% level, which suggests that log RGNP is unit root nonstationary (as opposed to trend stationary). For `lags` of 7 and 8, the tests fail to reject the trend-stationary null hypothesis.

Test whether the wage series in the manufacturing sector (1900–1970) has a unit root. Use the advice in [2] to select the number of lags in the Newey-West estimator of the long-run variance.

Load the Nelson-Plosser macroeconomic data set. Remove all missing values from the data relative to the wage series `WN`.

```
load Data_NelsonPlosser
[DataTable,idx] = rmmissing(DataTable,DataVariables="WN");
dt = dates(~idx);
```

Compute the effective sample size $\mathit{T}$ and its square root, where the latter is approximately the number of lags recommended for the Newey-West estimator.

```
T = height(DataTable);
sqrtT = sqrt(T);
```

Plot the wage series.

```
plot(dt,DataTable.WN)
title("Wages")
```

The wage series appears to grow exponentially.

Linearize the wages series by applying the log transformation to all variables in the table.

```
LogDT = varfun(@log,DataTable);
plot(dt,LogDT.log_WN)
title("Log Wages")
```

The log wage series appears to have a linear trend.

Test the null hypothesis that the log wage series is trend stationary (no unit root) against the alternative hypothesis that the log wage series is difference stationary. Conduct the test by setting a range of lags for the Newey-West estimator around $\sqrt{T}$.

`StatTbl = kpsstest(LogDT,DataVariable="log_WN",Lags=7:10)`
```
StatTbl=4×7 table
                h      pValue      stat      cValue    Lags    Alpha    Trend
              _____    ______    ________    ______    ____    _____    _____

    Test 1    false     0.1       0.10678    0.146       7     0.05     true
    Test 2    false     0.1       0.10074    0.146       8     0.05     true
    Test 3    false     0.1      0.096634    0.146       9     0.05     true
    Test 4    false     0.1      0.094058    0.146      10     0.05     true
```

All tests fail to reject the null hypothesis that the log wages series is trend stationary.

The actual $\mathit{p}$-values are larger than 0.1, the largest tabulated value, so `kpsstest` reports 0.1. The software compares the test statistic to critical values and computes $\mathit{p}$-values by interpolating from tables in [2].

Load the Nelson-Plosser macroeconomic series data set. Apply the log transformation to all variables in the table.

```
load Data_NelsonPlosser
LogDT = varfun(@log,DataTable);
```

Assess the null hypothesis that the linearized RGNP series is trend stationary. Use the `Trend` option to conduct the test with (`true`) and without (`false`) a deterministic time trend term in the response model. Return the regression statistics.

`[~,reg] = kpsstest(LogDT,DataVariable="log_GNPR",Trend=[true false]);`

`reg` is a structure array of length 2 with fields that store the OLS regression results. Each element corresponds to a test.

Compare the coefficient estimates.

`withTrend = reg(1).coeff`
```
withTrend = 2×1

    4.5834
    0.0310
```
`woTrend = reg(2).coeff`
```
woTrend = 5.5595
```

For the first test, the response model for the regression includes a trend term, so the regression coefficients `withTrend` include a model intercept (under the null hypothesis) `4.5834` and the coefficient of the time trend `0.0310`. For the second test, the response model includes an intercept only for the regression, so the intercept `woTrend` is `5.5595`.

Display the coefficient standard errors for the first test.

`reg(1).se`
```
ans = 2×1

    0.0344
    0.0010
```

The `Lags` option affects only the Newey-West estimator of the long-run variance. Therefore, the option does not affect the estimated OLS coefficients, standard errors, or MSE.

Conduct a KPSS test for each lag from 0 through 4. Compare the standard OLS and the Newey-West estimates.

```
lags = 0:4;
[~,regLags] = kpsstest(LogDT,DataVariable="log_GNPR",Lags=lags);
coeffs = table(regLags.coeff,VariableNames="Lags_"+lags, ...
    RowNames=["Intercept" "Trend"]);
se = table(regLags.se,VariableNames="Lags_"+lags, ...
    RowNames=["SE_Intercept" "SE_Trend"]);
mse = table(regLags.MSE,VariableNames="Lags_"+lags, ...
    RowNames="MSE");
nw = table(regLags.NWEst,VariableNames="Lags_"+lags, ...
    RowNames="NWVar");
[coeffs; se; mse; nw]
```
```
ans=6×5 table
                      Lags_0        Lags_1        Lags_2        Lags_3        Lags_4
                    __________    __________    __________    __________    __________

    Intercept           4.5834        4.5834        4.5834        4.5834        4.5834
    Trend             0.030988      0.030988      0.030988      0.030988      0.030988
    SE_Intercept       0.03443       0.03443       0.03443       0.03443       0.03443
    SE_Trend        0.00095035    0.00095035    0.00095035    0.00095035    0.00095035
    MSE               0.017933      0.017933      0.017933      0.017933      0.017933
    NWVar             0.017354       0.03247      0.045154      0.055321      0.063222
```

Input Arguments

`y`: Univariate time series data, specified as a numeric vector. Each element of `y` represents an observation.

Data Types: `double`

`Tbl`: Time series data, specified as a table or timetable. Each row of `Tbl` is an observation.

Specify a single series (variable) to test by using the `DataVariable` argument. The selected variable must be numeric.

Note

`kpsstest` removes missing observations, represented by `NaN` values, from the input series.

Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: ```kpsstest(Tbl,DataVariable="GDP",Alpha=0.025,Lags=[0 1])``` conducts two tests, at a level of significance of 0.025, for the presence of a unit root in the variable `GDP` of the table `Tbl`. The first test includes `0` autocovariance lags in the Newey-West estimator of the long-run variance and the second test includes `1` autocovariance lag.

`Lags`: Number of autocovariance lags to include in the Newey-West estimator of the long-run variance, specified as a nonnegative integer or vector of nonnegative integers. If `Lags(j)` > 0, `kpsstest` includes lags 1 through `Lags(j)` in the estimator for test `j`.

`kpsstest` conducts a separate test for each element in `Lags`.

Example: `Lags=0:2` includes zero lagged autocovariance terms in the Newey-West estimator for the first test, the lag 1 autocovariance term for the second test, and autocovariance lags 1 and 2 in the third test.

Data Types: `double`
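To make the role of `Lags` concrete, the following formula sketches the Newey-West estimator of the long-run variance as defined by Kwiatkowski et al. [2]. It assumes Bartlett weights, which is the choice in [2]; the source does not state which weights `kpsstest` uses. Here $l$ stands for `Lags(j)`, and $e_t$ and $T$ are the OLS residuals and the effective sample size.

$$
s^2(l) = \frac{1}{T}\sum_{t=1}^{T} e_t^2 + \frac{2}{T}\sum_{j=1}^{l} w(j,l) \sum_{t=j+1}^{T} e_t e_{t-j}, \qquad w(j,l) = 1 - \frac{j}{l+1}.
$$

Under this assumption, $l = 0$ keeps only the first term, $l = 1$ adds the lag 1 autocovariance with weight $1/2$, and $l = 2$ adds lags 1 and 2 with weights $2/3$ and $1/3$.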

`Trend`: Flag for including deterministic trend δt in the model, specified as a logical scalar or vector.

`kpsstest` conducts a separate test for each element in `Trend`.

Example: `Trend=false` excludes δt from the response model for all tests.

Data Types: `logical`

`Alpha`: Significance level for the hypothesis test, specified as a numeric scalar or vector with entries between 0.01 and 0.10.

`kpsstest` conducts a separate test for each element in `Alpha`.

Example: `Alpha=[0.01 0.05]` uses a level of significance of `0.01` for the first test, and then uses a level of significance of `0.05` for the second test.

Data Types: `double`

`DataVariable`: Variable in `Tbl` to test, specified as a string scalar or character vector containing a variable name in `Tbl.Properties.VariableNames`, or an integer or logical vector representing the index of a name. The selected variable must be numeric.

Example: `DataVariable="GDP"`

Example: `DataVariable=[false true false false]` or `DataVariable=2` tests the second table variable.

Data Types: `double` | `logical` | `char` | `string`

Note

• When `kpsstest` conducts multiple tests, the function applies all single settings (scalars or character vectors) to each test.

• All vector-valued specifications that control the number of tests must have equal length.

• If you specify the vector `y` and any value is a row vector, all outputs are row vectors.
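The following sketch illustrates these rules with a made-up series. It requires Econometrics Toolbox, and the data `t` and `y` are hypothetical; only the `kpsstest` call follows the syntax documented on this page.

```matlab
% Hypothetical series: linear trend plus a bounded disturbance.
t = (1:100)';
y = 0.1*t + sin(t);

% Three tests: Lags and Trend are vectors of equal length (3),
% and the scalar Alpha applies to every test.
[h,pValue,stat,cValue] = kpsstest(y,Lags=[0 1 2], ...
    Trend=[true true false],Alpha=0.05);
```

Each output has three elements, one per test. Supplying `Lags` and `Trend` with different lengths violates the equal-length rule above.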

Output Arguments

`h`: Test rejection decisions, returned as a logical scalar or vector with length equal to the number of tests. `kpsstest` returns `h` when you supply the input `y`.

• Values of `1` indicate rejection of the trend-stationary null hypothesis in favor of the unit root alternative.

• Values of `0` indicate failure to reject the trend-stationary null hypothesis.

`pValue`: Test statistic p-values, returned as a numeric scalar or vector with length equal to the number of tests. `kpsstest` returns `pValue` when you supply the input `y`.

The p-values are right-tail probabilities.

When test statistics are outside tabulated critical values, `kpsstest` returns maximum (`0.10`) or minimum (`0.01`) p-values.

`stat`: Test statistics, returned as a numeric scalar or vector with length equal to the number of tests. `kpsstest` returns `stat` when you supply the input `y`.

`kpsstest` computes test statistics by using an ordinary least squares (OLS) regression (for more details, see KPSS Test).

• If you set `Trend=false`, `kpsstest` regresses `y` on an intercept.

• Otherwise, `kpsstest` regresses `y` on an intercept and trend term.

`cValue`: Critical values, returned as a numeric scalar or vector with length equal to the number of tests. `kpsstest` returns `cValue` when you supply the input `y`.

Critical values are for right-tail probabilities.

`StatTbl`: Test summary, returned as a table with variables for the outputs `h`, `pValue`, `stat`, and `cValue`, and with a row for each test. `kpsstest` returns `StatTbl` when you supply the input `Tbl`.

`StatTbl` contains variables for the test settings specified by `Lags`, `Alpha`, and `Trend`.

`reg`: Regression statistics for OLS estimation of the coefficients in the model, returned as a structure array with the number of records equal to the number of tests.

Each element of `reg` has the fields in this table. You can access a field using dot notation, for example, `reg(1).coeff` contains the coefficient estimates of the first test.

| Field | Description |
| --- | --- |
| `num` | Length of input series with `NaN`s removed |
| `size` | Effective sample size $T$, adjusted for lags |
| `names` | Regression coefficient names |
| `coeff` | Estimated coefficient values |
| `se` | Estimated coefficient standard errors |
| `Cov` | Estimated coefficient covariance matrix |
| `tStats` | t statistics of coefficients and p-values |
| `FStat` | F statistic and p-value |
| `yMu` | Mean of the lag-adjusted input series |
| `ySigma` | Standard deviation of the lag-adjusted input series |
| `yHat` | Fitted values of the lag-adjusted input series |
| `res` | Regression residuals |
| `autoCov` | Estimated residual autocovariances |
| `NWEst` | Newey-West estimate of the long-run variance |
| `DWStat` | Durbin-Watson statistic |
| `SSR` | Regression sum of squares |
| `SSE` | Error sum of squares |
| `SST` | Total sum of squares |
| `MSE` | Mean square error |
| `RMSE` | Standard error of the regression |
| `RSq` | $R^2$ statistic |
| `aRSq` | Adjusted $R^2$ statistic |
| `LL` | Loglikelihood of data under Gaussian innovations |
| `AIC` | Akaike information criterion |
| `BIC` | Bayesian (Schwarz) information criterion |
| `HQC` | Hannan-Quinn information criterion |

Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) Test

The KPSS test assesses the null hypothesis that a univariate time series is trend stationary against the alternative that it is a nonstationary unit root process.

The test uses the structural model

$$\begin{array}{l} y_t = c_t + \delta t + u_{1t} \\ c_t = c_{t-1} + u_{2t}, \end{array}$$

where

• $\delta$ is the trend coefficient (see the `Trend` argument).

• $u_{1t}$ is a stationary process.

• $u_{2t}$ is an independent and identically distributed process with mean 0 and variance $\sigma^2$.

The null hypothesis is that $\sigma^2 = 0$, which implies that the random walk term ($c_t$) is constant and acts as the model intercept. The alternative hypothesis is that $\sigma^2 > 0$, which introduces the unit root in the random walk.
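A short derivation makes the two hypotheses concrete. It uses only the structural model above; $c_0$, the initial value of the random walk, is a symbol introduced here for illustration.

$$
\begin{aligned}
H_0:\ \sigma^2 = 0 &\;\Rightarrow\; c_t = c_0 \text{ for all } t, \quad y_t = c_0 + \delta t + u_{1t},\\
H_1:\ \sigma^2 > 0 &\;\Rightarrow\; c_t = c_0 + \sum_{i=1}^{t} u_{2i}, \quad \operatorname{Var}(c_t - c_0) = t\,\sigma^2.
\end{aligned}
$$

Under the null, the series is a stationary process around a linear trend. Under the alternative, the variance of the random walk term grows linearly with $t$, which is the unit root behavior that the test is designed to detect.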

An OLS regression of $y_t$ onto $X_t$ yields the residual series $\{e_t\}$, where $X_t$ has one of the following forms:

• $X_t = 1$ for all $t$ when `Trend` is `false`.

• $X_t = [1\ \ t]$ when `Trend` is `true`.

The test statistic is

$$\frac{\sum_{t=1}^{T} S_t^2}{s^2 T^2},$$

where

• $T$ is the effective sample size.

• $s^2$ is the Newey-West estimate of the long-run variance.

• $S_t = e_1 + e_2 + \dots + e_t$ is the partial sum of the residuals through time $t$.
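The definitions above can be turned into a short computation. The following sketch uses base MATLAB only and is not the `kpsstest` implementation. It assumes Bartlett weights in the Newey-West estimator, as in Kwiatkowski et al. [2], and a made-up deterministic series; the names `y`, `q`, `s2`, and `stat` are illustrative.

```matlab
% Hypothetical data: linear trend plus a bounded, deterministic disturbance.
T = 100;
t = (1:T)';
y = 2 + 0.05*t + sin(t);

% OLS regression of y on an intercept and a time trend (Trend = true case).
X = [ones(T,1) t];
b = X\y;
e = y - X*b;                 % residual series e_t

% Partial sums S_t = e_1 + ... + e_t.
S = cumsum(e);

% Newey-West long-run variance with q lags (Bartlett weights assumed).
q = 2;
s2 = (e'*e)/T;
for j = 1:q
    w = 1 - j/(q+1);
    s2 = s2 + 2*w*(e(j+1:T)'*e(1:T-j))/T;
end

% KPSS statistic: sum of squared partial sums divided by s^2*T^2.
stat = sum(S.^2)/(s2*T^2);
```

Because the regression includes an intercept, the residuals sum to zero, so `S(end)` is numerically zero. Setting `q = 0` skips the loop and reduces `s2` to the mean squared residual.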

Tips

• To draw valid inferences from a KPSS test, you must determine a suitable value for the `Lags` argument. The following methods can determine a suitable number of lags:

  • Begin with a small number of lags, and then evaluate the sensitivity of the results by adding more lags.

  • Kwiatkowski et al. [2] suggest that a number of lags on the order of $\sqrt{T}$, where $T$ is the effective sample size, is often satisfactory under both the null and the alternative.

  For consistency of the Newey-West estimator, the number of lags must approach infinity as the sample size increases.

• With a specific testing strategy in mind, determine the value of the `Trend` argument by the growth characteristics of the input time series.

  • If the input series grows, include a trend term by setting `Trend` to `true` (default). This setting provides a reasonable comparison of a trend stationary null and a unit root process with drift.

  • If a series does not exhibit long-term growth characteristics, exclude a trend term by setting `Trend` to `false`.
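As a minimal sketch of the $\sqrt{T}$ guideline, the following base MATLAB snippet builds a small grid of candidate lags centered on $\sqrt{T}$. The sample size and the grid width of one lag on either side are arbitrary choices made for illustration, not recommendations from [2].

```matlab
% Hypothetical effective sample size (for example, 71 annual observations).
T = 71;

% Center a small grid of candidate lags on sqrt(T), never going below zero.
center = round(sqrt(T));
lags = max(center-1,0):(center+1);
```

For `T = 71`, `lags` is `[7 8 9]`. Passing this vector as the `Lags` argument runs one test per candidate, which makes it easy to check whether the test decision is stable across nearby lag choices.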

Algorithms

• Test statistics follow nonstandard distributions under the null, even asymptotically. Kwiatkowski et al. [2] use Monte Carlo simulations, for models with and without a trend, to tabulate asymptotic critical values for a standard set of significance levels between 0.01 and 0.1. `kpsstest` interpolates critical values and p-values from these tables.
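The interpolation step can be sketched in base MATLAB. The critical values below are the trend-case entries from Table 1 of Kwiatkowski et al. [2]; the assumption that interpolation is linear over these four levels is for illustration only, and the table that `kpsstest` uses internally may be finer.

```matlab
% Asymptotic critical values for the model with a trend, from Table 1 of [2].
alphaTab = [0.10 0.05 0.025 0.01];
cvTab    = [0.119 0.146 0.176 0.216];

stat = 0.1976;   % Test 4 (Lags = 3) in the log RGNP example on this page

% Clamp the statistic to the tabulated range, then interpolate linearly.
statClamped = min(max(stat,cvTab(1)),cvTab(end));
pValue = interp1(cvTab,alphaTab,statClamped);

% Critical value at the 5% level and the resulting right-tail decision.
cValue = interp1(alphaTab,cvTab,0.05);
h = stat > cValue;
```

This yields a `pValue` of approximately 0.0169 and a `cValue` of 0.146, which agree with the values reported for Test 4 in the multiple-lags example. The clamping step mirrors the documented behavior of returning 0.10 or 0.01 when the statistic falls outside the tabulated critical values.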

References

[1] Hamilton, James D. Time Series Analysis. Princeton, NJ: Princeton University Press, 1994.

[2] Kwiatkowski, D., P. C. B. Phillips, P. Schmidt, and Y. Shin. "Testing the Null Hypothesis of Stationarity against the Alternative of a Unit Root." Journal of Econometrics. Vol. 54, 1992, pp. 159–178.

[3] Newey, W. K., and K. D. West. "A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix." Econometrica. Vol. 55, 1987, pp. 703–708.

Version History

Introduced in R2009b