Using Feasible Generalized Least Squares To Improve Estimates

Introduction

Data analysis in reality is rarely as clean and tidy as it is presented in the textbooks. Consider linear regression -- real-world data rarely meets the stringent assumptions required for ordinary least squares (OLS). Failing to recognize this and incorrectly applying OLS can lead to embarrassing, inaccurate conclusions.

In today's blog, we'll look at how to use feasible generalized least squares to deal with data that does not meet the OLS assumption of Independent and Identically Distributed (IID) error terms.

What Is Feasible Generalized Least Squares (FGLS)?

FGLS is a flexible and powerful tool that provides a reliable approach for regression analysis in the presence of non-constant variances and correlated errors.

Why is this important?

Recall the fundamental OLS IID assumption, which implies that the error terms have constant variance and are uncorrelated. When this assumption is violated:

  • OLS estimators are no longer efficient.
  • The estimated covariance matrix of the coefficients will be inconsistent.
  • Standard inferences will be incorrect.

Unfortunately, many real-world cross-sectional, panel data, and time series datasets do violate this fundamental assumption.

By accommodating heteroscedasticity and autocorrelation in the error terms, FGLS allows for more accurate modeling of complex, realistic data structures.

How Does FGLS Work?

FGLS uses a weighting matrix that captures the structure of the variance-covariance matrix of the errors.

This allows FGLS to:

  • Give more weight to observations with smaller variances.
  • Account for correlations.
  • Provide more efficient estimates in the presence of non-constant variance and serial correlation.
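
Concretely, writing the error covariance matrix as $\Omega$, the FGLS estimator replaces the OLS formula with a weighted version built from an estimate $\hat{\Omega}$:

$$\hat{\beta}_{FGLS} = \left(X'\hat{\Omega}^{-1}X\right)^{-1}X'\hat{\Omega}^{-1}y$$

When $\hat{\Omega}$ is proportional to the identity matrix, this reduces to ordinary OLS; otherwise, observations and their correlations are re-weighted as described above.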

The method uses a relatively simple iterative process:

  1. Pick a method for estimating the covariance matrix based on the believed structure of the data.
  2. Make initial OLS parameter estimates.
  3. Use the OLS residuals and the chosen method to estimate an initial covariance matrix.
  4. Compute FGLS estimates using the estimated covariance matrix for weighting.
  5. Calculate residuals and refine the weighting matrix.
  6. Repeat steps 3, 4, and 5 until convergence.
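
To make the loop concrete, here is a minimal sketch of the iteration in GAUSS for the pure heteroscedasticity case, where the weighting matrix is diagonal. The names y and x are hypothetical placeholders (x is assumed to contain an intercept column), and this illustrates the logic only -- it is not the implementation behind the fgls procedure used later in this post.

// Step 2: initial OLS parameter estimates
b = y / x;

// Steps 3-6: iterate until the estimates settle
for i (1, 25, 1);

    // Step 3: use the residuals to estimate the
    // diagonal of the error covariance matrix
    resid = y - x * b;
    w = 1 ./ (resid .* resid + 1e-8);

    // Step 4: weighted (FGLS) estimates
    b_new = invpd(x' * (w .* x)) * (x' * (w .* y));

    // Steps 5-6: refine and check for convergence
    if maxc(abs(b_new - b)) < 1e-8;
        b = b_new;
        break;
    endif;

    b = b_new;
endfor;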

How Do I Know If I Should Use FGLS?

We've already noted that you should use FGLS when you encounter heteroscedasticity and/or autocorrelation. That's easy to say, but how do you identify when it's the case?

There are a number of tools that can help.


Example Tools for Identifying Heteroscedasticity and Autocorrelation

Scatter plots (identifies heteroscedasticity and autocorrelation)
  • Plot the dependent variable against each independent variable and look for patterns that suggest a relationship between the variance and the variables.
  • Plot the residuals over time and look for cycles or trends.

Residual plots (identifies heteroscedasticity and autocorrelation)
  • A fan-shaped or funnel-shaped pattern in a plot of the residuals against the fitted values indicates that the variance of the residuals is not constant across all levels of the independent variables.
  • A pattern of correlation in plots of residuals against lagged residuals may indicate autocorrelation.

Histogram of residuals (identifies heteroscedasticity)
  • If the histogram of the residuals is skewed or has unequal spread, it could suggest heteroscedasticity or a non-normal distribution.

Durbin-Watson statistic (identifies autocorrelation)
  • Tests for first-order autocorrelation in the residuals. The statistic ranges from 0 to 4, with values around 2 indicating no autocorrelation.

Breusch-Pagan test (identifies heteroscedasticity)
  • Tests the null hypothesis of homoscedasticity against the alternative of heteroscedasticity.

Breusch-Godfrey test (identifies autocorrelation)
  • Extends the Durbin-Watson test to higher-order autocorrelation by assessing whether longer lags of the residuals and the independent variables help explain the current residuals.

White test (identifies heteroscedasticity)
  • Similar to the Breusch-Pagan test; tests the null hypothesis of homoscedasticity.
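
Several of these checks are also easy to run by hand. For instance, here is a minimal sketch of Koenker's studentized Breusch-Pagan test in GAUSS, where the LM statistic is the number of observations times the R-squared from an auxiliary regression of the squared residuals on the regressors. The names resid and x are hypothetical placeholders for the OLS residuals and the regressor matrix (including an intercept column):

// Squared residuals for the auxiliary regression
u2 = resid .* resid;

// Auxiliary OLS regression of u2 on the regressors
b_aux = u2 / x;
e_aux = u2 - x * b_aux;

// R-squared of the auxiliary regression
d = u2 - meanc(u2);
r2 = 1 - (e_aux'e_aux) / (d'd);

// LM statistic and chi-square p-value
lm = rows(x) * r2;
pval = cdfChic(lm, cols(x) - 1);

print "Breusch-Pagan LM statistic:" lm;
print "p-value:" pval;

A small p-value rejects the null hypothesis of homoscedasticity.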

Example One: US Consumer Price Index (CPI)

Let's get a better feel for FGLS using real-world data. In this application, we will:

  • Find OLS estimates and examine the results for signs of heteroscedasticity and autocorrelation.
  • Compute FGLS estimates and discuss results.

Data

For this example, we will use publicly available FRED time series data:

  • Consumer Price Index for All Urban Consumers: All Items in U.S. City Average (CPIAUCSL), seasonally adjusted.
  • Compensation of employees, paid (COE), seasonally adjusted.

Both variables are quarterly, continuously compounded rates of change spanning from 1947Q2 to 2023Q3.

// Load data 
fred_fgls = loadd("fred_fgls.gdat");

// Preview data
head(fred_fgls);
tail(fred_fgls);
            date              COE         CPIAUCSL
      1947-04-01      0.013842900      0.014184600
      1947-07-01      0.015131100      0.021573900
      1947-10-01      0.030381600      0.027915600
      1948-01-01      0.025448400      0.020966300
      1948-04-01      0.011788800      0.015823300

            date              COE         CPIAUCSL
      2022-07-01      0.023345300      0.013491800
      2022-10-01     0.0048207000      0.010199500
      2023-01-01      0.021001700     0.0093545000
      2023-04-01      0.013436000     0.0066823000
      2023-07-01      0.013337800     0.0088013000

OLS Estimation

Let's start by using OLS to examine the relationship between COE and CPI returns. We'll be sure to have GAUSS save our residuals so we can use them to evaluate OLS performance.

// Declare 'ols_ctl' to be an olsmtControl structure
// and fill with default settings
struct olsmtControl ols_ctl;
ols_ctl = olsmtControlCreate();

// Set the 'res' member of the olsmtControl structure
// so that 'olsmt' will compute residuals and the
// Durbin-Watson statistic
ols_ctl.res = 1;

// Declare 'ols_out' to be an olsmtOut structure
// to hold the results of the computations
struct olsmtOut ols_out;

// Perform estimation, using settings in the 'ols_ctl'
// control structure and store the results in 'ols_out'
ols_out = olsmt(fred_fgls, "CPIAUCSL ~ COE", ols_ctl);
Valid cases:                   306      Dependent variable:            CPIAUCSL
Missing cases:                   0      Deletion method:                   None
Total SS:                    0.019      Degrees of freedom:                 304
R-squared:                   0.197      Rbar-squared:                     0.195
Residual SS:                 0.016      Std error of est:                 0.007
F(1,304):                   74.673      Probability of F:                 0.000
Durbin-Watson:               0.773

                         Standard                 Prob   Standardized  Cor with
Variable     Estimate      Error      t-value     >|t|     Estimate    Dep Var
-------------------------------------------------------------------------------
CONSTANT   0.00397578 0.000678566     5.85909     0.000       ---         ---
COE          0.303476   0.0351191     8.64133     0.000    0.444067    0.444067

Evaluating the OLS Results

Taken at face value, these results look good. The standard errors on both estimates are small and both variables are statistically significant. We may be tempted to stop there. However, let's look more closely using some of the tools mentioned earlier.

Checking For Heteroscedasticity

First, let's create some plots using our residuals to check for heteroscedasticity. We will look at:

  • A histogram of the residuals.
  • The residuals versus the independent variable.

/*
** Plot a histogram of the residuals 
** Check for skewed distribution
*/
plotHist(ols_out.resid, 50);

Our histogram indicates that the residuals from our OLS regression are slightly skewed. While the results aren't dramatic, they warrant further exploration to check for heteroscedasticity.

/*
** Plot residuals against COE
** Check for increasing or decreasing variance 
** as the independent variable changes.
*/
plotScatter(fred_fgls[., "COE"], ols_out.resid);

It's hard to determine whether these results are indicative of heteroscedasticity. Let's add random normal observations to our scatter plot and see how they compare.

// Add random normal observations to our scatter plot
// divide by 100 to put them on the same scale as the residuals
rndseed 897680;
plotAddScatter(fred_fgls[., "COE"], rndn(rows(ols_out.resid), 1)/100);

Our residual plot doesn't vary substantially from the random normal observations, and there isn't strong visual evidence of heteroscedasticity.

If we did have heteroscedasticity, our residuals would exhibit a fan-like shape, indicating a change in the spread between residuals as our observed data changes. For example, consider this plot of hypothetical residuals against COE:

Example of heteroscedasticity observed in a residual plot.

Checking For Autocorrelation

Comparison of error terms with autocorrelation and without.

As you may have noticed, we don't have to look further than our OLS results for signs of autocorrelation. The olsmt procedure reports the Durbin-Watson statistic as part of the printed output. For this regression, the Durbin-Watson statistic is 0.773, well below 2, suggesting positive autocorrelation.
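
As a sanity check, the statistic is also simple to recompute from the saved residuals using the standard Durbin-Watson formula:

// Recompute the Durbin-Watson statistic by hand
e = ols_out.resid;

// First differences of the residuals, e[t] - e[t-1]
d = trimr(e, 1, 0) - trimr(e, 0, 1);

// Ratio of squared differences to squared residuals
dw = (d'd) / (e'e);
print "Durbin-Watson:" dw;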

We can find further support for this conclusion by inspecting our residual plots, starting with a plot of the residuals against time.

/*
** Plot the residuals over time and 
** look for cycles or trends to 
** check for autocorrelation.
*/
plotXY(fred_fgls[., "date"], ols_out.resid);

Our time plot of residuals:

  • Has extended periods of large residuals (roughly 1970-1977, 1979-1985, and 2020-2022).
  • Suggests positive autocorrelation.

Now let's examine the plot of our residuals against lagged residuals:

/*
** Plot residuals against lagged residuals and
** look for relationships and trends
*/
// Lag residuals; lagn fills the first
// observation with a missing value
lagged_res = lagn(ols_out.resid, 1);

// Trim the first observation to remove the missing
// value, then plot residuals against lagged residuals
plotScatter(trimr(lagged_res, 1, 0), trimr(ols_out.resid, 1, 0));

This plot gives an even clearer picture of our autocorrelation issue, demonstrating:

  • A clear linear relationship between the residuals and their lags.
  • Larger residuals in the previous period lead to larger residuals in the current period.
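
We can put a rough number on this relationship by regressing the residuals on their lags without an intercept; under an AR(1) error process, the slope estimates the autocorrelation coefficient. A quick sketch:

// Estimate the first-order autocorrelation coefficient
// by regressing the residuals on their lags.
// packr drops the row with the missing lagged value.
e_pair = packr(lagged_res ~ ols_out.resid);
rho = e_pair[., 2] / e_pair[., 1];
print "Estimated AR(1) coefficient:" rho;

For reference, this coefficient is approximately 1 - DW/2, so our Durbin-Watson statistic of 0.773 implies a value of roughly 0.6.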

FGLS Estimation

After examining the OLS results more closely, we have clear support for using FGLS. We can do this using the fgls procedure, introduced in GAUSS 24.

The GAUSS fgls Procedure

The fgls procedure allows for model specification in one of two styles. The first style requires a dataframe input and a formula string:

// Calling fgls using a dataframe and formula string
out = fgls(data, formula);

The second option requires an input matrix or dataframe containing the dependent variable and an input matrix or dataframe containing the independent variables:

// Calling fgls using dependent variable
// and independent variable inputs
out = fgls(depvar, indvars);

Both options also allow for:

  • An optional input specifying the computation method for the weighting matrix. GAUSS includes seven pre-programmed options and also accepts a user-specified weighting matrix.
  • An optional fglsControl structure input for advanced estimation settings.

out = fgls(data, formula [, method, ctl])
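
For example, a call selecting one of the pre-programmed weighting matrices might look like this (the method strings are demonstrated in the examples below):

// Calling fgls with the "HC0" weighting matrix
out = fgls(data, formula, "HC0");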

The results from the FGLS estimation are stored in a fglsOut structure containing the following members:

Member            Description
---------------   ----------------------------------------------------------
out.beta_fgls     The feasible generalized least squares estimates of the parameters.
out.sigma_fgls    Covariance matrix of the estimated parameters.
out.se_fgls       Standard errors of the estimated parameters.
out.ci            Confidence intervals of the estimated parameters.
out.t_stats       The t-statistics of the estimated parameters.
out.pvts          The p-values of the t-statistics of the estimated parameters.
out.resid         The estimated residuals.
out.df            Degrees of freedom.
out.sse           Sum of squared errors.
out.sst           Total sum of squares.
out.std_est       Standard deviation of the residuals.
out.fstat         Model F-statistic.
out.pvf           P-value of the model F-statistic.
out.rsq           R-squared.
out.dw            Durbin-Watson statistic.

Running FGLS

Let's use FGLS and see if it helps with autocorrelation. We'll start with the default weighting matrix, which is an AR(1) structure.
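
For reference, an AR(1) error process, $\varepsilon_t = \rho\varepsilon_{t-1} + u_t$, implies an error covariance matrix of the form

$$\Omega = \frac{\sigma_u^2}{1-\rho^2}\begin{pmatrix} 1 & \rho & \cdots & \rho^{T-1} \\ \rho & 1 & \cdots & \rho^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \cdots & 1 \end{pmatrix}$$

so FGLS only needs an estimate of $\rho$ from the residuals to build the full weighting matrix.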

// Estimate FGLS parameters using
// default setting
struct fglsOut fOut;
fOut = fgls(fred_fgls, "CPIAUCSL ~ COE");
Valid cases:                    306          Dependent variable:        CPIAUCSL
Total SS:                     0.019          Degrees of freedom:             304
R-squared:                    0.140          Rbar-squared:                 0.137
Residual SS:                  0.017          Std error of est:             0.007
F(1,304)                     49.511          Probability of F:             0.000
Durbin-Watson                 0.614

--------------------------------------------------------------------------------
                        Standard                    Prob
Variable   Estimates       Error     t-value        >|t|  [95% Conf.   Interval]
--------------------------------------------------------------------------------

Constant     0.00652    0.000908        7.19       0.000     0.00474      0.0083
COE             0.14      0.0286         4.9       0.000       0.084       0.196

The FGLS estimates using the AR(1) weighting matrix differ from our OLS estimates in both the coefficients and the standard errors.

Let's look at a plot of our residuals:

// Plot FGLS residuals against their lags,
// trimming the missing first observation
lagged_resid = lagn(fOut.resid, 1);
plotScatter(trimr(lagged_resid, 1, 0), trimr(fOut.resid, 1, 0));

Our residuals suggest that FGLS hasn't fully addressed our autocorrelation. What should we take from this?

This likely means that we need to consider higher-order autocorrelation. We may want to extend this analysis by:

  • Testing for higher-order autocorrelation, for example with the Breusch-Godfrey test.
  • Re-estimating FGLS with a higher-order autoregressive weighting matrix.

Ready to give FGLS a try? Get started with a GAUSS demo today!

Example Two: American Community Survey

Now let's consider a second example using a subset of data from the 2019 American Community Survey (ACS).

Data

The 2019 ACS data subset was cleaned and provided by the Social Science Computing Cooperative at the University of Wisconsin-Madison.

The survey data subset contains 5000 observations of the following variables:

Variable          Census Codebook Name   Description
---------------   --------------------   ----------------------------------------------
household         SERIALNO               Housing unit identifier.
person            SPORDER                Person number.
state             ST                     State.
age               AGEP                   Age in years.
other_languages   LANX                   Another language is spoken at home.
english           ENG                    Self-rated ability to speak English, if another language is spoken.
commute_time      JWMNP                  Travel time to work in minutes, top-coded at 200.
marital_status    MAR                    Marital status.
education         SCHL                   Educational attainment, collapsed into categories.
sex               SEX                    Sex (male or female).
hours_worked      WKHP                   Usual hours worked per week in the past 12 months.
weeks_worked      WKHN                   Weeks worked per year in the past 12 months.
race              RAC1P                  Race.
income            PINCP                  Total income in current dollars, rounded.

Let's run a naive model of income against two independent variables, age and hours_worked.

Our first step is loading our data:

/*
** Step One: Data Loading 
** Using the 2019 ACS 
*/
// Load data 
acs_fgls = loadd("acs2019sample.dta", "income + age + hours_worked");

// Review the summary statistics
dstatmt(acs_fgls);
---------------------------------------------------------------------------------------------
Variable             Mean     Std Dev      Variance     Minimum     Maximum    Valid Missing
---------------------------------------------------------------------------------------------

income          4.062e+04   5.133e+04     2.634e+09       -8800   6.887e+05     4205    795
age                 43.38       24.17           584           0          94     5000      0
hours_worked        38.09       13.91         193.5           1          99     2761   2239

Based on our descriptive statistics, there are a few data cleaning steps that will help our model:

  • Remove missing values using the packr procedure.
  • Transform income to thousands of dollars to improve data scaling.
  • Remove cases with negative incomes.

// Remove missing values
acs_fgls = packr(acs_fgls);

// Transform income
acs_fgls[., "income"] = acs_fgls[., "income"]/1000;

// Filter out cases with negative incomes
acs_fgls = delif(acs_fgls, acs_fgls[., "income"] .< 0);
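
After cleaning, it's worth re-running the summary statistics to confirm that the steps worked as intended (no missing values, income in thousands, and a non-negative minimum income):

// Confirm the cleaning steps
dstatmt(acs_fgls);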

OLS Estimation

Now we're ready to run a preliminary OLS estimation.

// Declare 'ols_ctl' to be an olsmtControl structure
// and fill with default settings
struct olsmtControl ols_ctl;
ols_ctl = olsmtControlCreate();

// Set the 'res' member of the olsmtControl structure
// so that 'olsmt' will compute residuals and the Durbin-Watson statistic
ols_ctl.res = 1;

// Declare 'ols_out' to be an olsmtOut structure
// to hold the results of the computations
struct olsmtOut ols_out;

// Perform estimation, using settings in the 'ols_ctl'
// control structure and store the results in 'ols_out'
ols_out = olsmt(acs_fgls, "income ~ age + hours_worked", ols_ctl);
Valid cases:                  2758      Dependent variable:              income
Missing cases:                   0      Deletion method:                   None
Total SS:              8771535.780      Degrees of freedom:                2755
R-squared:                   0.147      Rbar-squared:                     0.146
Residual SS:           7481437.527      Std error of est:                52.111
F(2,2755):                 237.536      Probability of F:                 0.000
Durbin-Watson:               1.932

                             Standard                 Prob   Standardized  Cor with
Variable         Estimate      Error      t-value     >|t|     Estimate    Dep Var
-----------------------------------------------------------------------------------
CONSTANT         -31.0341     3.91814    -7.92062     0.000       ---         ---
age              0.762573   0.0620066     12.2983     0.000    0.216528    0.227563
hours_worked      1.25521   0.0715453     17.5443     0.000    0.308893    0.316628

Our results make intuitive sense and suggest that:

  • Both age and hours_worked are statistically significant.
  • Increases in age lead to increases in income.
  • Increases in hours_worked lead to increases in income.

Evaluating the OLS Results

As we know from our previous example, we need to look beyond the estimated coefficients and standard errors when evaluating our model results. Let's start with the histogram of our residuals:

/*
** Plot a histogram of the residuals 
** Check for skewed distribution
*/
plotHist(ols_out.resid, 50);

The histogram of our residuals is right-skewed, with a long tail of large positive residuals.

However, because our initial data is truncated, residual scatter plots will be more useful for checking for heteroscedasticity.

/*
** Plot residuals against independent variables
** Check for increasing or decreasing variance 
** as the independent variable changes.
*/
plotScatter(acs_fgls[., "age"], ols_out.resid);

// Open second plot window
plotOpenWindow();
plotScatter(acs_fgls[., "hours_worked"], ols_out.resid);

Both plots show signs of heteroscedasticity:

  • The age scatter plot demonstrates the tell-tale fan-shaped relationship with residuals. This indicates that variance in residuals increases as age increases.
  • The hours_worked scatter plot is less obvious but does seem to indicate higher variance in the residuals at the middle ranges (40-60) than at the lower and higher ends.

FGLS Estimation

To address the issues of heteroscedasticity, let's use FGLS. This time we'll use the "HC0" weighting matrix (White, 1980).
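
Assuming the usual convention for this estimator, the HC0 option builds the weighting matrix from the squared OLS residuals,

$$\hat{\Omega}_{HC0} = \operatorname{diag}\left(\hat{u}_1^2, \ldots, \hat{u}_n^2\right)$$

which down-weights observations with larger estimated error variances in the FGLS computation.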

// Estimate FGLS parameters 
// using the HC0 weighting matrix
struct fglsOut fOut;
fOut = fgls(acs_fgls, "income ~ age + hours_worked", "HC0");
Valid cases:                   2758              Dependent variable:          income
Total SS:               8771535.780              Degrees of freedom:            2755
R-squared:                    0.147              Rbar-squared:                 0.146
Residual SS:            7481440.027              Std error of est:            52.111
F(2,2755)                   237.535              Probability of F:             0.000
Durbin-Watson                 1.932

-------------------------------------------------------------------------------------
                             Standard                    Prob
     Variable   Estimates       Error     t-value        >|t|  [95% Conf.   Interval]
-------------------------------------------------------------------------------------
    Constant       -30.9      0.0743        -416       0.000       -31.1       -30.8
         age       0.762    0.000475    1.61e+03       0.000       0.761       0.763
hours_worked        1.25     0.00181         694       0.000        1.25        1.26

While using FGLS results in slightly different coefficient estimates, it has a big impact on the estimated standard errors. In this case, these changes don't alter our inferences -- all of our regressors are still statistically significant.
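
Since the table above reports the FGLS standard errors, we can also pull them straight from the output structure for a side-by-side comparison with the OLS run:

// Print the FGLS standard errors for comparison
// with the OLS output above
print "FGLS standard errors:";
print fOut.se_fgls;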

Conclusion

Today we've seen how FGLS offers a potential solution for data that doesn't meet the restrictive IID assumption of OLS.

After today, you should have a better understanding of how to:

  • Identify heteroscedasticity and autocorrelation.
  • Compute OLS and FGLS estimates using GAUSS.

Further Reading

  1. Introduction to the Fundamentals of Time Series Data and Analysis.
  2. Finding the PACF and ACF.
  3. Generating and Visualizing Regression Residuals.
  4. OLS diagnostics: Heteroscedasticity.
