Introduction to quantile regression

Goals

This tutorial introduces the use of quantileFit to estimate quantile regressions. After this tutorial, you should be able to estimate a basic model using either a dataset name and formula string or matrices as your data inputs.

Basic usage with a formula string

The quantileFit procedure accepts dataset names and formula strings as direct inputs. This allows you to tell quantileFit which data to load, saving the extra step of manually loading your data into matrices. For example, consider using data from the dataset regsmpl.dta to fit the model:

$$\ln(\text{wage}) = \alpha + \beta_1 \, \text{age} + \beta_2 \, \text{age}^2 + \beta_3 \, \text{tenure}$$

// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";

// Estimate the model
call quantileFit(dataset, "ln_wage ~ age + age:age + tenure");

This code will produce the following output:

Total observations:                           28101
Number of variables:                              3

VAR. / tau (in %)          5%        50%        95%
---------------------------------------------------
CONSTANT              -0.7630     0.5112     0.0006
age                    0.1103     0.0656     0.1271
age:age               -0.0017    -0.0010    -0.0016
tenure                 0.0356     0.0466     0.0196

In the results above, we get estimates for the default quantile levels, 5%, 50%, and 95%. In addition, note that because we use formula strings, GAUSS automatically includes variable names in the output table.
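For reference, each column in the table is a separate fit at one quantile level $\tau$. In the standard textbook formulation of linear quantile regression (stated here as background, not taken from the quantileFit output), the model above describes the conditional $\tau$-quantile of ln(wage):

$$Q_\tau(\ln(\text{wage}) \mid \text{age}, \text{tenure}) = \alpha(\tau) + \beta_1(\tau) \, \text{age} + \beta_2(\tau) \, \text{age}^2 + \beta_3(\tau) \, \text{tenure}$$

and the coefficients at each $\tau$ minimize the check (pinball) loss:

$$\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau(y_i - x_i'\beta), \qquad \rho_\tau(u) = u \, (\tau - \mathbf{1}\{u < 0\})$$

Here $y_i$ is the dependent variable and $x_i$ collects the constant and the regressors for observation $i$; this notation is introduced only for illustration. At $\tau = 0.5$ the loss reduces to $0.5\,|u|$, so the 50% column is the median regression fit. How quantileFit solves this problem numerically is not covered here.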

Basic usage with matrix inputs

The quantileFit procedure can also accept matrix inputs. We demonstrate using the same model as before. However, this time we:

  1. Create the data matrices y and x from regsmpl.dta by loading the appropriate variables.
  2. Use the matrices y and x as the dependent and independent variable inputs, respectively, in the quantileFit function call.

// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";

// Load dependent variable
y = loadd(dataset, "ln_wage");

// Load the independent variables
x = loadd(dataset, "age + age:age + tenure");

// Estimate the model with matrix inputs
call quantileFit(y, x);

This code produces the same estimates as above, but the output uses generic variable names because GAUSS matrices do not store variable names:

Total observations:                           28101
Number of variables:                              3

VAR. / tau (in %)          5%        50%        95%
---------------------------------------------------
CONSTANT              -0.7630     0.5112     0.0006
X01                    0.1103     0.0656     0.1271
X02                   -0.0017    -0.0010    -0.0016
X03                    0.0356     0.0466     0.0196
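Both examples above use call, which runs the procedure and discards its return value. The following is a minimal sketch of keeping the results for later use. It assumes, based on the GAUSS command reference rather than anything shown in this tutorial, that quantileFit returns a qfitOut structure whose beta member holds the coefficient estimates; check the reference for your GAUSS version before relying on these names.

```gauss
// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";

// Load the dependent and independent variables
y = loadd(dataset, "ln_wage");
x = loadd(dataset, "age + age:age + tenure");

// Sanity check the dimensions of the inputs before estimating
print "Rows in y:" rows(y);
print "Columns in x:" cols(x);

// Assumption: qfitOut is the output structure type for quantileFit
struct qfitOut qOut;
qOut = quantileFit(y, x);

// Assumption: the 'beta' member stores the estimated coefficients
print qOut.beta;
```

Storing the output in a structure makes the estimates available to later code, for example for plotting or comparing coefficients across quantile levels, rather than only printing the results table.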

Conclusion

This tutorial showed you how to estimate the parameters of a simple linear quantile regression using a dataset name and formula string or matrix inputs.

In the next tutorial we will learn how to specify quantile levels for the quantileFit regression.
