Introduction to Granger Causality

Introduction

Multivariate time series analysis turns to vector autoregressive models not only for understanding the relationships between variables but also for forecasting. In today’s blog, we look at how to improve VAR model selection and achieve better forecasts using Granger causality.

Today’s blog explores the questions:

  • What is Granger causality?
  • When to use Granger causality?
  • How to use Granger causality?

What is Granger causality?

If you’ve explored the vector autoregressive literature, it is likely that you have come across the term Granger causality. Granger causality is an econometric test used to verify the usefulness of one variable to forecast another.

A variable is said to:

  • Granger-cause another variable if it is helpful for forecasting the other variable.
  • Fail to Granger-cause if it is not helpful for forecasting the other variable.

At this point, you may be asking what it means for a variable to be “helpful” in forecasting. In simple terms, a variable is “helpful” for forecasting if adding it to the forecast model reduces the forecasting error.
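One common way to state this more formally is in terms of mean squared forecast error (MSE). The notation here is assumed for illustration and is not used elsewhere in this post: $\Omega_t$ stands for all information available at time $t$, and $\Omega_t \setminus \{x_s : s \leq t\}$ for the same information with the history of $X$ removed. In a one-step-ahead version of the idea, $X$ Granger-causes $Y$ if

$$\text{MSE}\left(\hat{y}_{t+1} \mid \Omega_t\right) < \text{MSE}\left(\hat{y}_{t+1} \mid \Omega_t \setminus \{x_s : s \leq t\}\right)$$

In practice the full information set is never observed, so the comparison is made within a chosen forecasting model.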

In the context of the vector autoregressive models, a variable fails to Granger-cause another variable if its:

  • Lags are not statistically significant in the equation for another variable.
  • Past values aren’t significant in predicting the future values of another.

Example applications of Granger causality:

  • Do sunspots help forecast real GDP growth?
  • Does the price of Amazon stock help forecast UPS stock prices?
  • What is the functional connectivity of brain structure to underlying perception, cognition, and behavior?

When do we use Granger causality?

To understand when to use Granger causality testing, it helps to consider what Granger causality doesn’t tell us. Granger causality only provides information about forecasting ability; it does not provide insight into the true causal relationship between two variables.

This should be considered in conjunction with some of the statistical requirements for using Granger causality testing.

In particular, we should use Granger causality testing when:

  • We are interested in forecasting performance, not the theoretical model behind the forecast.
  • Our data is stationary.

How do we test for Granger causality?

Testing for Granger causality is relatively simple, though it is important to consider a few nuances.

Bivariate system

To start, let’s consider the simple case that we have two time-series, $X$ and $Y$, and are modeling them in a VAR(3) system.

The VAR(3) model is made up of two equations:

$$x_t = c_1 + \sum_{i=1}^3 \alpha_{1,i} y_{t-i} + \sum_{i=1}^3 \beta_{1,i} x_{t-i} + \epsilon_{x,t}$$

$$y_t = c_2 + \sum_{i=1}^3 \alpha_{2,i} y_{t-i} + \sum_{i=1}^3 \beta_{2,i} x_{t-i} + \epsilon_{y,t}$$

To test if $X$ Granger-causes $Y$, we need to determine if any lags of $X$ are statistically significant in our model. We can do this using a Wald test for linear restrictions.

The Wald test is based on the fairly simple premise that we wish to compare the performance of a restricted model for $Y$, which excludes $X$, against an unrestricted model for $Y$, which includes $X$.

Granger causality comparisons

| Model | Regression | $X$ coefficients | Wald test |
|---|---|---|---|
| Restricted | $y_t = c_2 + \sum_{i=1}^3 \alpha_{2,i} y_{t-i} + \epsilon_{y,t}$ | $\beta_{2,1} = \beta_{2,2} = \beta_{2,3} = 0$ | Null hypothesis |
| Unrestricted | $y_t = c_2 + \sum_{i=1}^3 \alpha_{2,i} y_{t-i} + \sum_{i=1}^3 \beta_{2,i} x_{t-i} + \epsilon_{y,t}$ | At least one of $\beta_{2,1}, \beta_{2,2}, \beta_{2,3} \neq 0$ | Alternative hypothesis |

When testing for Granger causality:

  • We test the null hypothesis of non-causality $(H_0: \beta_{2,1} = \beta_{2,2} = \beta_{2,3} = 0)$.
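  • Under the null hypothesis, the Wald test statistic asymptotically follows a $\chi^2$ distribution with degrees of freedom equal to the number of restrictions (three in this example).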
  • We are more likely to reject the null hypothesis of non-causality as the test statistic gets larger.
  • We should test both directions $X \Rightarrow Y$ and $X \Leftarrow Y$.
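The restricted-versus-unrestricted comparison can be sketched by hand in GAUSS. The example below is a minimal illustration under stated assumptions, not the implementation used by any library procedure: the data are simulated (so that lagged $X$ enters the equation for $Y$ by construction), all variable names are made up for this sketch, and the statistic is the homoskedastic Wald form $W = T\,(RSS_r - RSS_u)/RSS_u$, compared against a $\chi^2$ distribution with three degrees of freedom.

```gauss
// Simulated data for illustration only: by construction,
// lagged x helps predict y
rndseed 4242;
n = 500;
p = 3;

e = rndn(n, 2);
x = zeros(n, 1);
y = zeros(n, 1);

for t (2, n, 1);
    x[t] = 0.5 * x[t-1] + e[t, 1];
    y[t] = 0.3 * y[t-1] + 0.4 * x[t-1] + e[t, 2];
endfor;

// Align the dependent variable with its three lags
n_obs = n - p;
y_t = y[p+1:n];
y_lags = y[3:n-1] ~ y[2:n-2] ~ y[1:n-3];
x_lags = x[3:n-1] ~ x[2:n-2] ~ x[1:n-3];

// Restricted model: constant and lags of y only
z_r = ones(n_obs, 1) ~ y_lags;
b_r = olsqr(y_t, z_r);
u_r = y_t - z_r * b_r;
rss_r = u_r' * u_r;

// Unrestricted model: add the lags of x
z_u = z_r ~ x_lags;
b_u = olsqr(y_t, z_u);
u_u = y_t - z_u * b_u;
rss_u = u_u' * u_u;

// Wald statistic for H0: all coefficients on lagged x are zero
wald = n_obs * (rss_r - rss_u) / rss_u;

// Asymptotic p-value from the chi-square distribution with p restrictions
p_val = cdfChic(wald, p);

print "Wald statistic: " wald;
print "p-value:        " p_val;
```

To test the opposite direction, swap the roles of `x` and `y`: regress `x` on its own lags, then add the lags of `y`.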

Multivariate system

Now let’s consider a system with more than two variables, $X$, $Y$, and $Z$. Testing for Granger causality is more complicated in this model.

Suppose we are modeling this system as a VAR(2) model such that:

$$x_t = c_1 + \sum_{i=1}^2 \alpha_{1,i} y_{t-i} + \sum_{i=1}^2 \beta_{1,i} x_{t-i} + \sum_{i=1}^2 \gamma_{1,i} z_{t-i} + \epsilon_{x,t}$$

$$y_t = c_2 + \sum_{i=1}^2 \alpha_{2,i} y_{t-i} + \sum_{i=1}^2 \beta_{2,i} x_{t-i} + \sum_{i=1}^2 \gamma_{2,i} z_{t-i} + \epsilon_{y,t}$$

$$z_t = c_3 + \sum_{i=1}^2 \alpha_{3,i} y_{t-i} + \sum_{i=1}^2 \beta_{3,i} x_{t-i} + \sum_{i=1}^2 \gamma_{3,i} z_{t-i} + \epsilon_{z,t}$$

We can again test if $X$ Granger-causes $Y$ by testing the hypothesis that $\beta_{2,1} = \beta_{2,2} = 0$. Many researchers will report the results of this test.

However, this may not give a complete picture regarding causality, because it only accounts for direct causality but does not acknowledge the indirect causality that $X$ may have on $Y$ through its impacts on $Z$.

One solution proposed for this issue is to consider the impact of $X$ on $Y$ and $Z$ combined. Very generally, this is done by considering the "variable" $W = \{Y, Z\}$ and testing whether $X$ Granger-causes $W$.

In our system, this is the same as testing the null hypothesis $(H_0: \beta_{2,1} = \beta_{2,2} = \beta_{3,1} = \beta_{3,2} = 0)$.
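Written as a set of linear restrictions, this joint hypothesis can be tested with the same Wald approach as in the bivariate case. In the sketch below, $\theta$, $R$, and $\hat{V}$ are notation assumed for illustration: $\theta$ stacks all of the VAR coefficients, $R$ is a $4 \times \dim(\theta)$ selection matrix of zeros and ones that picks out $\beta_{2,1}, \beta_{2,2}, \beta_{3,1}, \beta_{3,2}$, and $\hat{V}$ is the estimated covariance matrix of $\hat{\theta}$.

$$H_0: R\theta = 0, \qquad \text{Wald} = (R\hat{\theta})' \left( R \hat{V} R' \right)^{-1} (R\hat{\theta})$$

Under the null hypothesis, this statistic asymptotically follows a $\chi^2$ distribution with four degrees of freedom, one for each restricted coefficient.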

Example:

Let's look at a simple example to help solidify some of these concepts. In this example, we will look at the relationship between West Texas Intermediate oil prices and gold prices.

In this example, we walk through all the steps of testing Granger causality including:

  1. Viewing the time series plot of our data.
  2. Checking for stationarity.
  3. Testing for Granger causality using the granger procedure in GAUSS.

Data information

| Series | Units | Dates | Source |
|---|---|---|---|
| West Texas Intermediate oil prices | USD per barrel | 2016-06 through 2021-06 | FRED DCOILWTICO |
| Gold Fixing Price 10:30 A.M. (London time) in London Bullion Market | USD per troy ounce | 2016-06 through 2021-06 | FRED GOLDAMGBD228NLBM |

Time series plot

Time series plot of oil and gold prices

Before any time series modeling, it is generally helpful to plot your data. Our time series plot provides some useful insights:

  • Both of our series have non-zero means so we should include a constant in our model.
  • Neither series appears to have a time trend.
  • Both series appear to have structural breaks, which for the sake of simplicity we will ignore in this post.

Checking for stationarity

To test for stationarity we will use two fundamental tests: the Augmented Dickey-Fuller (ADF) unit root test and the KPSS stationarity test.

We'll use the adf and lmkpss procedures from the free GAUSS library tspdlib to run them.

library tspdlib;

// Load data
price_data = loadd( "price_data.xls", "date($observation_date) + 
                                       price_gold + price_oil");

// Set model to include constant
model = 1;

// Call ADF unit root test
call adf(price_data[., "price_gold"], model);
call adf(price_data[., "price_oil"], model);

// Call KPSS stationarity test
call lmkpss(price_data[., "price_gold"], model);
call lmkpss(price_data[., "price_oil"], model);

The results of these tests suggest:

  • Our data does not meet the stationarity requirements for Granger causality testing.
  • We need to transform our data using first differences prior to testing.

Testing for stationarity

| Test | Series | Statistic | Conclusion |
|---|---|---|---|
| ADF | Oil | -1.953 | Cannot reject the null hypothesis of a unit root. |
| KPSS | Oil | 12.037 | Reject the null hypothesis of stationarity at the 1% level. |
| ADF | Gold | -0.343 | Cannot reject the null hypothesis of a unit root. |
| KPSS | Gold | 101.374 | Reject the null hypothesis of stationarity at the 1% level. |
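
For reference, taking first differences means replacing each observation with its change from the previous period, $\Delta p_t = p_t - p_{t-1}$. A minimal sketch in GAUSS, using a short vector of made-up values rather than the price data above (the names `p`, `n`, and `dp` are chosen only for this illustration):

```gauss
// Hypothetical price levels, for illustration only
p = { 100, 102, 101, 105, 110 };

// Number of observations
n = rows(p);

// First difference: p_t - p_{t-1}.
// The result has one fewer observation than the original series.
dp = p[2:n] - p[1:n-1];

// dp now holds 2, -1, 4, 5
print dp;
```

It is the differenced series, rather than the levels, that needs to satisfy the stationarity requirement, so the unit root and stationarity tests could be repeated on it as a check.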

Testing for Granger causality

We will again turn to the tspdlib library to test for Granger causality using the granger procedure. This built-in procedure requires two inputs:


  • data: Matrix or dataframe, data to be tested.
  • test: Scalar, type of Granger causality test to use:

| test | Method |
|---|---|
| 0 | Granger causality (Granger, 1969) |
| 1 | Toda & Yamamoto (Toda & Yamamoto, 1995) |
| 2 | Single Fourier-frequency Granger causality (Enders & Jones, 2016) |
| 3 | Single Fourier-frequency Toda & Yamamoto (Nazlioglu et al., 2019) |
| 4 | Cumulative Fourier-frequency Granger causality (Enders & Jones, 2016) |
| 5 | Cumulative Fourier-frequency Toda & Yamamoto (Nazlioglu et al., 2019) |

There are some helpful things to note about this procedure:

  • It offers a number of advanced causality testing options. These are beyond the scope of this blog and we will just stick to standard Granger causality testing.
  • The procedure tests for Granger causality across all columns in both directions.
  • For test options 0, 2, and 4, the data is first-differenced before testing. This means we don't have to take any additional steps to deal with the non-stationarity of our data.

Continuing with our price_data from earlier:

/*
** Granger causality test
*/

// This specifies to use
// the standard Granger causality test.
// Note that data will be tested in 
// differences.
test = 0;

// Run test
call granger(price_data[., "price_gold" "price_oil"], test);

The procedure prints the Wald statistic, along with its respective p-value:

   Standard Granger Causality Test
------------------------------------------------------------
Direction                   Wald         Bootstrap p-val
price_oil => price_gold    6.860               0.093
price_gold => price_oil    5.982               0.109

Our results suggest that we:

  • Reject the null hypothesis that oil prices fail to Granger-cause gold prices at the 10% level.
  • Cannot reject the null hypothesis that gold prices fail to Granger-cause oil prices.

Conclusion

In today’s blog, we explored how to improve model selection using Granger causality. Proper model selection upfront can:

  • Reduce time spent running invalid, computationally expensive models.
  • Improve model reliability.
  • Improve forecasting.

After today's blog, you should have a better understanding of what Granger causality is and how to use it.

The code and data used in this blog can be downloaded from the Aptech GitHub repository.

Further reading

  1. Introduction to the Fundamentals of Autoregressive Models
  2. Introduction to the Fundamentals of Time Series Data and Analysis
  3. How to Conduct Unit Root Tests in GAUSS

4 thoughts on “Introduction to Granger Causality”

  1. jamels

    Thank you very much for this fantastic blog, could you indicate the references for the causality tests:
    4 Cumulative Fourier-frequency Granger causality (Enders & Jones, 2019)
    5 Cumulative Fourier-frequency Toda & Yamamoto (Nazlioglu et al., 2019)
    Best regards,
    JS

  2. Erica (post author)

    Hi Jamel,

    Thank you for your kind comment! I am glad you enjoyed the blog on Granger causality. I do have the full references for the tests you are inquiring about. (It appears that the Enders & Jones paper is actually a 2016 paper.)

    Enders, W., & Jones, P. (2016). Grain prices, oil prices, and multiple smooth breaks in a VAR. Studies in Nonlinear Dynamics & Econometrics, 20(4), 399-419.

    Nazlioglu, S., Soytas, U., & Gormus, A. (2019). Oil prices and monetary policy in emerging markets: Structural shifts in causal linkages. Emerging Markets Finance and Trade, 55(1), 105-117.

    I hope this helps!

    Erica

  3. rant

    Hello Eric,
    You have done an amazing work and thank you very much for that.
    Can I ask, please: if I have 3 variables (commodity prices) and I want to explore whether each one Granger-causes the others, in both directions, should I fit a trivariate VAR model or every possible pairwise combination, and then apply the causality test to the derived residuals? I plan to use the nonparametric test of Diks and Panchenko (2006), which needs the data to be stationary.
    Can this causality test be conducted in GAUSS?
    Thank you very much and congratulations again on your work!
