How to Interpret Cointegration Test Results

by Eric · Published May 26, 2020 · Updated July 12, 2021

Introduction

In this blog we will explore how to set up and interpret cointegration results using a real-world time series example. We will cover the case with no structural breaks as well as the case with one unknown structural break using tools from the GAUSS tspdlib library.

Dataset

In this blog, we will use the famous Nelson-Plosser time series data. The dataset contains macroeconomic fundamentals for the United States.

We will be using three of these fundamentals:

M2 money stock.
Bond yield (measured by the basic yields of 30-year corporate bonds).
S&P 500 index stock prices.

The time series data is annual data, covering 1900 - 1970.

Preparing for Cointegration

Before testing for cointegration, we will take three preliminary time series modeling steps: establish an underlying theory, visualize the data, and test for unit roots.

Establishing an Underlying Theory

In this example, we will examine the macroeconomic question of whether stock prices are linked to macroeconomic indicators. In particular, we will examine if there is a cointegrated, long-run relationship between the S&P 500 price index and monetary policy indicators of the M2 money stock and the bond yields.

Mathematically we will consider the cointegrated relationship:

$y_{sp, t} = c + \beta_1 y_{money, t} + \beta_2y_{bond, t} + u_t$
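In this setting, "cointegrated" has a precise meaning, stated here in standard notation as a reminder (the $I(1)$ and $I(0)$ labels are not used elsewhere in this post): each series is integrated of order one, while the error from the long-run relationship is stationary.

$y_{sp, t},\ y_{money, t},\ y_{bond, t} \sim I(1), \qquad u_t = y_{sp, t} - c - \beta_1 y_{money, t} - \beta_2 y_{bond, t} \sim I(0)$

This is why the steps that follow first check each series for a unit root and then check the residuals $u_t$ for stationarity.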

Time Series Visualization

When visualizing time series data, we look for visual evidence of:

The comovements between our variables.
The presence of deterministic components such as constants and time trends.
Potential structural breaks.

Our time series plots give us some important considerations for our testing, providing visual evidence to support:

Comovements between the variables.
At least one structural break in the time series dynamics of all three of our variables.
A potential time trend in the datasets, especially in the later years of the sample.

Unit Root Testing

Prior to testing for cointegration between our time series data, we should check for unit roots in the data. We will do this using the adf procedure in the tspdlib library to conduct the Augmented Dickey-Fuller unit root test.

Variable	Test Statistic	1% Critical Value	5% Critical Value	10% Critical Value	Conclusion
Money	1.621	-4.04	-3.45	-3.15	Cannot reject the null
Bond yield	-1.360	-4.04	-3.45	-3.15	Cannot reject the null
S&P 500	-0.3842	-4.04	-3.45	-3.15	Cannot reject the null

Our ADF test statistics are greater than the 10% critical value for all of our time series. This implies that we cannot reject the null hypothesis of a unit root for any of our time series data.
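The decision rule behind this table can be written out in a few lines of GAUSS. This is only a sketch: it hard-codes the statistics and critical values reported above rather than rerunning the test, and the variable names are illustrative.

```gauss
// ADF statistics from the table above: money, bond yield, S&P 500
adf_stat = { 1.621, -1.360, -0.3842 };

// Critical values at the 1%, 5%, and 10% levels
cv = { -4.04 -3.45 -3.15 };

// The ADF test is left-tailed: reject the unit root null only
// when the statistic is below the critical value.
// Rows are variables, columns are significance levels.
reject = adf_stat .< cv;

print "Reject unit root null (1 = yes, 0 = no):";
print reject;
```

Every entry of `reject` is 0, matching the "Cannot reject the null" conclusion for all three series.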

For detailed information on conducting unit root tests in GAUSS see our previous blog on “How to Conduct Unit Root Tests in GAUSS”.

Unit Root Testing with Structural Breaks

What about the potential structural break that we see in our time series data? Does this have an impact on our unit root testing?

Using the adf_1break procedure in the tspdlib library to test for unit roots with a single structural break in the trend and constant we get the following results.

Variable	Test Statistic	Break Date	1% Critical Value	5% Critical Value	10% Critical Value	Conclusion
Money	-4.844	1948	-5.57	-5.08	-4.82	Cannot reject the null at the 5% level (marginal rejection at the 10% level)
Bond yield	-3.226	1963	-5.57	-5.08	-4.82	Cannot reject the null
S&P 500	-4.639	1945	-5.57	-5.08	-4.82	Cannot reject the null

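Our ADF test statistics again suggest that, even when accounting for the structural break, we cannot reject the null hypothesis of a unit root at the 5% level for any of our time series. The one borderline case is money, whose statistic of -4.844 falls just below the 10% critical value of -4.82.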

Conducting our Cointegration Tests

Having concluded that there is evidence for unit roots in our data, we can now run our cointegration tests.

When setting up cointegration tests, there are a number of assumptions that we must specify:

Which normalization we want to use.
The deterministic components to include in our model.
The maximum number of lags to allow in our test.
The information criterion to use to select the optimal number of lags.

To better understand these general assumptions, let’s look at the simplest of our tests, the Engle-Granger cointegration test.

Normalization

In the two-stage, residual-based cointegration tests which we will consider today, normalization amounts to deciding which variable is our dependent variable and which variables are our independent variables in the cointegration regression.

We will choose our normalization to reflect our theoretical question of whether the S&P 500 index is cointegrated with the money stock and the bond yield. As we mentioned earlier, this means we will consider the cointegrated relationship:

$y_{sp, t} = c + \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$


// Set fname to name of dataset
fname = "nelsonplosser.dta";
 
// Load three variables from the dataset 
// and remove rows with missing values
coint_data = packr(loadd(fname, "sp500 + m + bnd"));
 
// Define y and x matrix
y = coint_data[., 1];
x = coint_data[., 2 3];

The Deterministic Component

The second assumption we must make about our Engle-Granger test is which model we wish to use. To understand how to make this decision, let's look closer at what this input means.

The Engle-Granger test is a two-step test:

Estimate the cointegration regression.
Test for stationarity in the residuals using the ADF unit root test.
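To make the two steps concrete, here is a minimal hand-rolled sketch. It assumes a constant and trend in the first stage and, as a simplification, uses no lagged differences in the second stage. It is not the internal implementation of coint_egranger, and the names z, u, rho, and tau are illustrative.

```gauss
// Load the data as in the example above
fname = "nelsonplosser.dta";
coint_data = packr(loadd(fname, "sp500 + m + bnd"));
y = coint_data[., 1];
x = coint_data[., 2 3];
n = rows(y);
m = n - 1;

// Step 1: cointegration regression with constant and trend
z = ones(n, 1) ~ seqa(1, 1, n) ~ x;
b = olsqr(y, z);
u = y - z * b;

// Step 2: Dickey-Fuller regression on the residuals
// (no lagged differences, a simplification)
du = u[2:n, 1] - u[1:m, 1];
ulag = u[1:m, 1];
rho = olsqr(du, ulag);
resid = du - ulag * rho;

// t-statistic on rho
s2 = (resid' * resid) / (rows(resid) - 1);
se = sqrt(s2 / (ulag' * ulag));
tau = rho / se;

print "Zero-lag residual-based statistic: " tau;
```

Because the residuals come from an estimated regression, the statistic must be compared with Engle-Granger critical values rather than the standard Dickey-Fuller ones. It will also generally differ from the library's reported statistic, which selects the number of lags.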

When we specify which model to use we impact two things:

The deterministic components which are used in the first-stage cointegration regression.
The distribution of the test statistic.

There are three options to choose from:

No constant or trend (model = 0) $y_{sp, t} = \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$
Constant (model = 1) $y_{sp, t} = \alpha + \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$
Constant and trend (model = 2) $y_{sp, t} = \alpha + \delta t + \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$

For our example, we will include a constant and trend in our first-stage cointegration regression by setting:


// Select model with constant and trend
model = 2;

The Lag Specifications

In the second-stage ADF residual unit root test, the error terms should be serially independent. To account for possible autocorrelation, lags of the first differences of the residuals can be included in the ADF test regression.

The GAUSS coint_egranger procedure will automatically determine the optimal number of lags to include in the second-stage regression based on two user inputs:

The maximum number of lags to allow.
The criterion to use to determine the optimal number of lags:
- The Akaike information criterion (AIC) [ic = 1]
- The Schwarz information criterion (SIC) [ic = 2]
- The t-stat criterion [ic = 3]
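For reference, the second-stage regression with $p$ lagged differences takes the usual augmented Dickey-Fuller form. The notation is generic and introduced here for illustration: $\hat{u}_t$ denotes the first-stage residuals, $\gamma_j$ the coefficients on the lagged differences, and $e_t$ the error term.

$\Delta \hat{u}_t = \rho \hat{u}_{t-1} + \sum_{j=1}^{p} \gamma_j \Delta \hat{u}_{t-j} + e_t$

The test statistic is the t-ratio on $\rho$, with the null hypothesis $\rho = 0$ (a unit root in the residuals). The maximum lag setting caps $p$, and the information criterion chooses among the values of $p$ up to that cap.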


/*
** Information Criterion: 
** 1=Akaike; 
** 2=Schwarz; 
** 3=t-stat sign.
*/
ic = 2; 
 
// Maximum number of lags 
pmax = 12;

Calling our Cointegration Test

Now that we have loaded our data and chosen the test settings, we can call the coint_egranger procedure:


// Perform Engle-Granger Cointegration Test
{ tau_eg, cvADF_eg } = coint_egranger(y, x, model, pmax, ic);

Interpreting Our Cointegration Results

In order to interpret our cointegration results, let's revisit the two steps of the Engle-Granger test:

Estimate the cointegration regression.
Test the residuals from the cointegration regression for unit roots.

The Engle-Granger test statistic for cointegration reduces to an ADF unit root test of the residuals of the cointegration regression:

If the residuals contain a unit root, then there is no cointegration.
The null hypothesis of the ADF test is that the residuals have a unit root. Therefore, the Engle-Granger test considers the null hypothesis that there is no cointegration.
As the Engle-Granger test statistic decreases:
- We are more likely to reject the null hypothesis of no cointegration.
- We have stronger evidence that the variables are cointegrated.

After running our cointegration test we obtain the following results:

-----------Engle-Granger Test---------------------------
-----------Constant and Trend---------------------------
H0: no co-integration (EG, 1987 & P0, 1990)

     Test      Statistic   CV(1%,      5%,      10%)
   ------      -------------------------------------
   EG_ADF         -2.105   -4.645   -4.157   -3.843

We can see that:

Our test statistic of -2.105 is larger than the critical values at the 1%, 5%, and 10% levels.
We cannot reject the null hypothesis of no cointegration.
We do not find evidence in support of the cointegration of the S&P 500 with the U.S. money stock and bond yield.

Conducting our Cointegration Tests with One Structural Break

Earlier we saw that the potential structural break in our data did not change our unit root test conclusion. We should also see if the structural break has an impact on our cointegration testing.

To do this we will use the Gregory-Hansen cointegration test which can be implemented using the coint_ghansen test in the tspdlib library.

We can carry over all of our coint_egranger testing specifications, except our model specification.

The Model Specification

When implementing the Gregory-Hansen test, we must decide on a model which specifies:

Which deterministic components are present in the cointegration regression.
How the structural break affects the cointegration regression.

There are four modeling options to choose from:

The level shift [model = 1]

$y_{sp, t} = \mu_1(1 - d_{\tau}) + \mu_{1,\tau} d_{\tau} + \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$
In this model, there is a structural break at time $\tau$ and $d_{\tau}$ is an indicator variable equal to 1 when $t \geq \tau$ and 0 otherwise. The constant before the structural break is $\mu_1$ and the constant after the structural break is $\mu_{1,\tau}$ .
The level shift with trend [model = 2]

$y_{sp, t} = \mu_1(1 - d_{\tau}) + \mu_{1,\tau} d_{\tau} + \delta t + \beta_1 y_{money, t} + \beta_2 y_{bond, t} + u_t$
In this model, the structural break again affects the constant. However, there is also a time trend included in the model.
The regime shift [model = 3]

$y_{sp, t} = \mu_1(1 - d_{\tau}) + \mu_{1,\tau} d_{\tau} + \beta_1(1 - d_{\tau})y_{money, t} +$ $\beta_{1,\tau}d_{\tau}y_{money, t} + \beta_2(1 - d_{\tau}) y_{bond, t} + \beta_{2,\tau}d_{\tau}y_{bond, t} + u_t$
In this model, the structural break affects the constant and regression coefficients.
The regime and trend shift [model = 4]

$y_{sp, t} = \mu_1(1 - d_{\tau}) + \mu_{1,\tau} d_{\tau} + \delta_1(1 - d_{\tau}) t + \delta_{1,\tau}d_{\tau}t + \beta_1(1 - d_{\tau})y_{money, t} +$ $\beta_{1,\tau}d_{\tau}y_{money, t} + \beta_2(1 - d_{\tau}) y_{bond, t} + \beta_{2,\tau}d_{\tau}y_{bond, t} + u_t$

In this model, the structural break again affects the constant, the regression coefficients, and the trend.
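The role of the indicator $d_{\tau}$ is easiest to see by building the regressors directly. The sketch below uses a hypothetical sample size, a hypothetical break position, and simulated placeholder regressors. It mirrors the equations above and is not a description of how coint_ghansen constructs its regressors internally.

```gauss
// Hypothetical sample size and break position, for illustration only
n = 10;
tb = 6;

// Time index 1, 2, ..., n
trend = seqa(1, 1, n);

// Indicator d_tau: 1 when t >= tb, 0 otherwise
d = trend .>= tb;

// Simulated placeholder regressors standing in for money and bond yield
x = rndn(n, 2);

// Level shift (model = 1): separate constants before and after the break
z1 = (1 - d) ~ d ~ x;

// Regime and trend shift (model = 4): constants, trends, and slopes
// all switch at the break
z4 = (1 - d) ~ d ~ ((1 - d) .* trend) ~ (d .* trend) ~ ((1 - d) .* x) ~ (d .* x);

print "Columns in level shift design: " cols(z1);
print "Columns in regime and trend shift design: " cols(z4);
```

The level shift design has 4 columns and the regime and trend shift design has 8, one for each coefficient in the corresponding equation (the column order differs slightly from the order of terms in the equation).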

For example, let's consider the last case, where the constant, coefficients, and trend are all impacted by the structural break:


// Set fname to name of dataset
fname = "nelsonplosser.dta";
 
// Load three variables from the dataset
// and remove rows with missing values
coint_data = packr(loadd(fname, "sp500 + m + bnd"));
 
// Define y and x matrix
y = coint_data[., 1];
x = coint_data[., 2 3];
 
// Regime and trend shift
model = 4; 
 
/*
** Information Criterion: 
** 1=Akaike; 
** 2=Schwarz; 
** 3=t-stat sign.
*/
ic = 2; 
 
// Maximum number of lags 
pmax = 12;  
 
/*
** Long run variance computation
** 1 = iid
** 2 = Bartlett
** 3 = Quadratic Spectral (QS);
** 4 = SPC with Bartlett /see (Sul, Phillips & Choi, 2005)
** 5 = SPC with QS;
** 6 = Kurozumi with Bartlett
** 7 = Kurozumi with QS
*/ 
varm = 1;
 
// Bandwidth for variance 
bwl=1;
 
// Data trimming
trimm=0.1;
 
// Perform cointegration test
{ ADF_min_gh, TBadf_gh, Zt_min_gh, TBzt_gh, Za_min_gh, TBza_gh, cvADFZt_gh, cvZa_gh } =
    coint_ghansen(y, x, model, bwl, ic, pmax, varm, trimm);

Interpreting Our Cointegration Results with One Structural Break

The coint_ghansen procedure provides more extensive results than the coint_egranger test. In particular, the Gregory-Hansen test:

Performs Augmented Dickey-Fuller testing on the residuals from the cointegration regression.
Performs Phillips-Perron testing on the residuals from the cointegration regression.
Identifies structural breaks.
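Because the break date is unknown, each Gregory-Hansen statistic is the smallest value of the corresponding statistic over all candidate break points. In the notation of Gregory and Hansen (the set $T$ is introduced here for illustration):

$ADF^{*} = \inf_{\tau \in T} ADF(\tau), \qquad Z_t^{*} = \inf_{\tau \in T} Z_t(\tau), \qquad Z_{\alpha}^{*} = \inf_{\tau \in T} Z_{\alpha}(\tau)$

Here $T$ is the set of candidate break points that remains after trimming the ends of the sample. The trimm input in the code above controls that trimming, although the exact rule the library applies is not shown in this post. The break date reported for each statistic is the $\tau$ at which the minimum occurs.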

Cointegration results with one structural break

Cointegration test results
After calling the coint_ghansen procedure and testing all possible models, we obtain the following test statistic results:

Test	$ADF$ Test Statistic	$Z_t$ Test Statistic	$Z_{\alpha}$ Test Statistic	10% Critical Value $ADF$ , $Z_t$	10% Critical Value $Z_{\alpha}$	Conclusion
Gregory-Hansen, Level shift	-4.004	-3.819	-27.858	-4.690	-42.490	Cannot reject the null of no cointegration for $ADF$ , $Z_t$ , or $Z_{\alpha}$ .
Gregory-Hansen, Level shift with trend	-3.889	-3.751	-27.618	-5.030	-48.94	Cannot reject the null of no cointegration for $ADF$ , $Z_t$ , or $Z_{\alpha}$ .
Gregory-Hansen, Regime change	-4.658	-4.539	-32.766	-5.23	-52.85	Cannot reject the null of no cointegration for $ADF$ , $Z_t$ , or $Z_{\alpha}$ .
Gregory-Hansen, Regime change with trend	-5.834	-4.484	-32.411	-5.72	-63.10	Rejects the null of no cointegration at the 10% level for $ADF$ only; cannot reject for $Z_t$ or $Z_{\alpha}$ .

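Taken together, these results provide little evidence that the S&P 500 index is cointegrated with the money stock and bond yield. The one exception is the $ADF$ statistic in the regime change with trend model (-5.834), which falls just below its 10% critical value (-5.72); neither $Z_t$ nor $Z_{\alpha}$ supports that rejection.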
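The comparisons in the table can be checked mechanically. The sketch below hard-codes the tabulated statistics and 10% critical values (it does not rerun coint_ghansen) and reports both the rejection flags and how far each statistic sits from its critical value. The variable names are illustrative.

```gauss
// Rows: level shift, level shift with trend,
//       regime change, regime change with trend
// Columns: ADF, Zt, Za
stat = { -4.004 -3.819 -27.858,
         -3.889 -3.751 -27.618,
         -4.658 -4.539 -32.766,
         -5.834 -4.484 -32.411 };

// 10% critical values; ADF and Zt share one critical value per model
cv_adfzt = { -4.690, -5.030, -5.23, -5.72 };
cv_za = { -42.490, -48.94, -52.85, -63.10 };
cv = cv_adfzt ~ cv_adfzt ~ cv_za;

// 1 where the statistic is below its 10% critical value
reject = stat .< cv;

// Positive margin: statistic is above the critical value (no rejection)
margin = stat - cv;

print "Reject at 10% (1 = yes):";
print reject;
print "Statistic minus critical value:";
print margin;
```

With the tabulated values, the only entry flagged is the $ADF$ statistic for the regime change with trend model (-5.834 versus -5.72, a margin of -0.114). The $Z_t$ and $Z_{\alpha}$ statistics for that model, and all statistics for the other three models, sit above their critical values.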

Structural break results
The coint_ghansen procedure also returns estimates for break dates based on the $ADF$ , $Z_t$ , and $Z_{\alpha}$ tests:

Test	$ADF$ Break Date	$Z_t$ Break Date	$Z_{\alpha}$ Break Date
Gregory-Hansen, Level shift	1958	1956	1956
Gregory-Hansen, Level shift with trend	1958	1956	1956
Gregory-Hansen, Regime change	1955	1955	1955
Gregory-Hansen, Regime change with trend	1951	1953	1947

What can we Conclude from the Gregory-Hansen Cointegration Test?

The results from our Gregory-Hansen cointegration test provide some important conclusions:

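There is no consistent support for cointegration: only one of the twelve test statistics falls below its 10% critical value.
Incorporating a structural break does not change our overall conclusion that the evidence does not support cointegration.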

Note that while the Gregory-Hansen test does estimate break dates, it does not provide the statistical evidence to conclude whether these are statistically significant break dates or not.

Conclusion

Today's blog looks closer at the Engle-Granger and Gregory-Hansen residual-based cointegration tests. By building a better understanding of how the tests work and what assumptions we make when running the tests, you will be better equipped to interpret the test results.

In particular, today we learned:

How to prepare for cointegration testing.
How to set up the specifications for cointegration tests.
How to interpret the results from the Engle-Granger and Gregory-Hansen cointegration tests.

Eric (Director of Applications and Training at Aptech Systems, Inc.)

Eric has been working to build, distribute, and strengthen the GAUSS universe since 2012. He is an economist skilled in data analysis and software development. He has earned a B.A. and MSc in economics and engineering and has over 18 years of combined industry and academic experience in data analysis and research.

2 thoughts on “How to Interpret Cointegration Test Results”

jamels July 9, 2021 at 1:34 am

Nice post, very pedagogical, these three parameters need to be specified:

// To be specified
bwl=1;

trimm=0.1;

varm=1;

Best,
JS

Erica (post author) July 12, 2021 at 10:01 am

Hello Jamel,

Thank you for your comment! I've updated the blog to reflect this.

Also, it should be noted that since the last update of TSPDLIB, the bwl, ic, pmax, varm, and trimm arguments are all optional arguments. This allows you to call coint_ghansen using internal defaults for these parameters:


// Set fname to name of dataset
fname = "nelsonplosser.dta";
 
// Load three variables from the dataset
// and remove rows with missing values
coint_data = packr(loadd(fname, "sp500 + m + bnd"));
 
// Define y and x matrix
y = coint_data[., 1];
x = coint_data[., 2 3];
 
// Regime and trend shift
model = 4; 
 
// Perform cointegration test
{ ADF_min_gh, TBadf_gh, Zt_min_gh, TBzt_gh, Za_min_gh, TBza_gh, cvADFZt_gh, cvZa_gh } =
    coint_ghansen(y, x, model);

More information about the default values can be found in the TSPDLIB documentation.

Best,
Erica
