Apples to Apples: The case for cluster-robust standard errors

Introduction

Linear regression commonly assumes that the error terms of a model are independently and identically distributed (i.i.d). However, when datasets contain groups, the potential for correlated error terms within groups arises.

Example: Weather shocks to apple orchards

For example, consider a model of the supply of apples from various orchards across the United States. Naturally, we would expect that orchards within Washington may all face similar weather-related "shocks" to their supply. However, we would not expect the weather shocks to the orchards in Washington to be the same as the weather shocks to the orchards in New York.

[Figure: Weather shock to apple production.]

When these correlated within-group shocks occur, the i.i.d assumption on the error terms is no longer valid, and conventional standard errors can lead to misleading inference about the coefficient estimates.

In these cases, it is important to use standard errors that appropriately account for the correlation between error terms within groups.

The Model

Let's consider our hypothetical apple supply model, which has one observation for each individual orchard, $i = 1, 2, \ldots, N$, in each time period, $t = 1, 2, \ldots, T$:

$$y_{it} = x_{it}\beta + u_{it}$$

We can further assign each orchard to a state-level group, $g = 1, 2, \ldots, G$, such that:

$$y_{igt} = x_{igt}\beta + u_{igt}$$

The cluster-robust approach assumes that the error term $u_{igt}$ may be correlated within groups but is independent across groups. More formally, for any two distinct observations, one in group $g$ and one in group $h$:

$$E[u_{igt}u_{jhs}] \begin{cases} = 0 & \text{ if }g \neq h \\ \neq 0 & \text{ if }g = h \end{cases} $$
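
To make this concrete, here is a small illustrative case. The three-orchard setup and the $\sigma$ symbols are assumptions for illustration, not part of the dataset. Suppose orchards 1 and 2 are in Washington, orchard 3 is in New York, and we observe a single time period. Stacking the errors as $u = (u_1, u_2, u_3)'$, the assumption above implies a block-diagonal error covariance matrix:

$$E[uu'] = \begin{pmatrix} \sigma_{11} & \sigma_{12} & 0 \\ \sigma_{12} & \sigma_{22} & 0 \\ 0 & 0 & \sigma_{33} \end{pmatrix}$$

The off-diagonal term $\sigma_{12}$ is left unrestricted because both orchards share Washington's shocks, while every term linking Washington to New York is zero. Under i.i.d errors, by contrast, $\sigma_{12} = 0$ and all diagonal terms are equal.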

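The cluster-robust variance estimator allows for this correlation:

$$V_{clu}[\hat{\beta}] = (X'X)^{-1} \left( \sum_{g=1}^G u_g' u_g \right) (X'X)^{-1}$$

where $u_g$ is the row vector obtained by summing over the observations in cluster $g$, and $\hat{u}_{igt}$ is the OLS residual:

$$u_g = \sum_{(i,t) \in \text{cluster } g} \hat{u}_{igt} x_{igt} .$$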
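
The sandwich formula above can be sketched by hand in GAUSS matrix code. This is a minimal illustration on simulated data, not the routine that `olsmt` uses internally. The seed, sample sizes, coefficient values, and all variable names below are assumptions chosen for the example, and no finite-sample correction is applied, so the numbers would not necessarily match a packaged estimator.

```gauss
// Hypothetical simulated data, for illustration only
rndseed 4521;

n_clu = 10;                 // number of clusters (e.g. states)
n_per = 50;                 // observations per cluster
n = n_clu * n_per;

// Cluster id: 1,...,1,2,...,2,... (Kronecker product with a column of ones)
clu_id = seqa(1, 1, n_clu) .*. ones(n_per, 1);

// Regressor and error each contain a shock shared within the cluster
acres = (rndn(n_clu, 1) .*. ones(n_per, 1)) + rndn(n, 1);
shock = rndn(n_clu, 1) .*. ones(n_per, 1);
y = 0.03 + 1.03 * acres + shock + rndn(n, 1);

// OLS estimates and residuals
x = ones(n, 1) ~ acres;
k = cols(x);
xxi = inv(x'x);
b = xxi * (x'y);
e = y - x * b;

// Conventional i.i.d variance: s^2 * (X'X)^-1
v_iid = (e'e / (n - k)) * xxi;

// "Meat" of the sandwich: sum over clusters of u_clu' * u_clu,
// where u_clu is the 1 x k sum of residual-times-regressor rows
meat = zeros(k, k);
for j(1, n_clu, 1);
    idx = (clu_id .== j);
    x_j = selif(x, idx);
    e_j = selif(e, idx);
    u_clu = e_j' * x_j;
    meat = meat + u_clu' * u_clu;
endfor;

// Cluster-robust variance: (X'X)^-1 * meat * (X'X)^-1
v_clu = xxi * meat * xxi;

se_iid = sqrt(diag(v_iid));
se_clu = sqrt(diag(v_clu));

print "i.i.d SE (constant, acres):";
print se_iid';
print "Cluster-robust SE (constant, acres):";
print se_clu';
```

Because both the regressor and the error share a within-cluster component in this simulation, the cluster-robust standard errors should typically come out larger than the i.i.d ones, though the exact values depend on the random draw.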

Estimating our model in GAUSS

Let's look more formally at our apple production model using the apples_cluster.dat dataset. Using this data, we will model apple production as a function of orchard acreage:

$$prod = \beta_0 + \beta_1 \cdot acres + u$$

Estimating with i.i.d standard errors

First, let's model the data using i.i.d standard errors:

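// Specify filename
fname = __FILE_DIR $+ "apples_cluster.dat";

// Declare control structure and fill with default settings
struct olsmtControl oCtl;
oCtl = olsmtControlCreate();

// Estimate model using ols
struct olsmtOut oOut;
oOut = olsmt(fname, "prod ~ acres", oCtl);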

This yields the following results:

                         Standard                 Prob   Standardized  Cor with
Variable     Estimate      Error      t-value     >|t|     Estimate    Dep Var
-------------------------------------------------------------------------------
CONSTANT    0.0296797   0.0283593     1.04656     0.295       ---         ---
acres         1.03483   0.0285833     36.2041     0.000    0.455813    0.455813 
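
Each t-value in the table is the estimate divided by its standard error, which can be checked directly from the reported numbers:

$$t_{acres} = \frac{1.03483}{0.0285833} \approx 36.20, \qquad t_{CONSTANT} = \frac{0.0296797}{0.0283593} \approx 1.047$$

This is why a change in the standard errors alone, with the estimates held fixed, is enough to change the t-values and p-values.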

Estimating with cluster-robust standard errors

Now, we specify cluster-robust errors using two members in the olsmtControl structure:


oCtl.cov
String, the type of covariance matrix to be computed:
  • "iid" for i.i.d errors.
  • "cluster" for cluster-robust errors.
  • "robust" for the Huber/White sandwich estimator.
oCtl.clusterId
String, the name of the variable containing data groups.

Our code for estimation now becomes:

// Declare control structure and fill with default settings
struct olsmtControl oCtl;
oCtl = olsmtControlCreate();

// Set up cluster id variable
oCtl.clusterId = "state";

// Turn on cluster-robust covariance
oCtl.cov = "cluster";

// Estimate model
struct olsmtOut oOut;
oOut = olsmt(fname, "prod ~ acres", oCtl);

Which yields:

                         Standard                 Prob   Standardized  Cor with
Variable     Estimate      Error      t-value     >|t|     Estimate    Dep Var
-------------------------------------------------------------------------------
CONSTANT    0.0296797   0.0670127    0.442897     0.658       ---         ---
acres         1.03483   0.0505957      20.453     0.000    0.455813    0.455813 
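
The same control structure can also request the "robust" option listed earlier, which gives the Huber/White sandwich estimator without any clustering. The following is a minimal sketch that uses only the settings described in this post and assumes apples_cluster.dat sits in the same directory as the program file. Its output is not shown here.

```gauss
// Specify filename (assumes the data file is next to this program)
fname = __FILE_DIR $+ "apples_cluster.dat";

// Declare control structure and fill with default settings
struct olsmtControl oCtl;
oCtl = olsmtControlCreate();

// Request the Huber/White sandwich estimator; no cluster id is set
oCtl.cov = "robust";

// Estimate model
struct olsmtOut oOut;
oOut = olsmt(fname, "prod ~ acres", oCtl);
```

This option guards against heteroskedasticity but still treats observations as independent, so it would not be expected to account for shocks shared by orchards in the same state.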

Comparing results

There are several key things to note about the two sets of results.

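  1. Using cluster-robust standard errors has no impact on the coefficient estimates. Both runs report 0.0296797 for the constant and 1.03483 for acres.
  2. The cluster-robust standard errors are larger than the i.i.d standard errors: 0.0670127 versus 0.0283593 for the constant, and 0.0505957 versus 0.0285833 for acres.

In this case, the larger standard errors do not change our conclusions regarding the significance of the estimated coefficients: acres remains significant (p = 0.000) and the constant remains insignificant under both sets of standard errors. However, this may not always be true.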
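
To put a number on the difference, we can take the ratio of the cluster-robust to the i.i.d standard errors reported in the two tables:

$$\frac{0.0505957}{0.0285833} \approx 1.77 \;\; (acres), \qquad \frac{0.0670127}{0.0283593} \approx 2.36 \;\; (CONSTANT)$$

Since the coefficient estimates are unchanged, each t-value shrinks by the same factor. For acres, $36.2041 / 1.77 \approx 20.45$, which matches the cluster-robust t-value of 20.453. A coefficient whose i.i.d t-value was only slightly above conventional critical values would lose significance under a change of this size.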

Conclusions

In today's discussion of cluster-robust standard errors we have learned:

  1. What types of models may introduce within-cluster correlation in error terms.
  2. The potential impacts of ignoring within-cluster correlations in error terms.
  3. How to estimate cluster-robust standard errors in GAUSS.

Code and data from this blog can be found here.
