Goals
This tutorial introduces the use of quantileFit to estimate quantile regressions. After this tutorial, you should be able to estimate a basic model using either:
- A dataset name and a formula string
- Data matrices
as your data inputs.
Basic usage with a formula string
The quantileFit procedure accepts dataset names and formula strings as direct inputs. This allows you to tell quantileFit which data to load, saving the extra step of manually loading your data into matrices. For example, consider using data from the dataset regsmpl.dta to fit the model:
$$\ln(wage) = \alpha + \beta_1 \cdot age + \beta_2 \cdot age^2 + \beta_3 \cdot tenure$$
// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";
// Estimate the model
call quantileFit(dataset, "ln_wage ~ age + age:age + tenure");
This code will produce the following output:
Total observations: 28101
Number of variables: 3
VAR. / tau (in %)         5%        50%        95%
---------------------------------------------------
CONSTANT             -0.7630     0.5112     0.0006
age                   0.1103     0.0656     0.1271
age:age              -0.0017    -0.0010    -0.0016
tenure                0.0356     0.0466     0.0196
In the results above, we get estimates for the default quantile levels, 5%, 50%, and 95%. In addition, note that because we use formula strings, GAUSS automatically includes variable names in the output table.
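For context, each column of coefficients is conventionally defined as the minimizer of the check (pinball) loss at that quantile level. The following is the standard textbook definition, not a statement about how quantileFit computes the estimates internally. The symbols $\tau$ (quantile level), $\rho_\tau$ (check function), $x_i$ (regressor vector including the constant), and $n$ (number of observations) are notation introduced here for illustration:

$$\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i'\beta\right), \qquad \rho_\tau(u) = u\left(\tau - \mathbb{1}\{u < 0\}\right)$$

With $\tau = 0.5$ the check function reduces to $0.5\,|u|$, so the 50% column corresponds to median regression.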
Basic usage with matrix inputs
The quantileFit procedure can also accept matrix inputs. We demonstrate using the same model as before. However, this time we:
- Create the data matrices y and x from the regsmpl.dta dataset by loading the appropriate variables.
- Use the matrices y and x as the dependent and independent variable inputs, respectively, in the quantileFit function call.
// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";
// Load dependent variable
y = loadd(dataset, "ln_wage");
// Load the independent variables
x = loadd(dataset, "age + age:age + tenure");
// Estimate the model with matrix inputs
call quantileFit(y, x);
This code gives the same estimates as above, but the output uses generic variable names (X01, X02, X03) because GAUSS matrices do not store variable names:
Total observations: 28101
Number of variables: 3
VAR. / tau (in %)         5%        50%        95%
---------------------------------------------------
CONSTANT             -0.7630     0.5112     0.0006
X01                   0.1103     0.0656     0.1271
X02                  -0.0017    -0.0010    -0.0016
X03                   0.0356     0.0466     0.0196
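Because the column order of x determines which generic name each coefficient receives, it can help to inspect x before estimating. The following is a minimal sketch, assuming that loadd returns the columns in the order they appear in the formula string (age, age:age, tenure), which this tutorial does not state explicitly:

```gauss
// Create string with full path to dataset
dataset = getGAUSSHome() $+ "examples/regsmpl.dta";

// Load the independent variables, as in the example above
x = loadd(dataset, "age + age:age + tenure");

// Number of columns; three are expected, one per right-hand-side term
print "Columns in x: " cols(x);

// First five rows; under the ordering assumption above, column 2
// should be the square of column 1
print x[1:5, .];
```

If the ordering assumption holds, X01, X02, and X03 in the output table map to age, age:age, and tenure, which matches the coefficient values in the formula string example.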
Conclusion
This tutorial showed you how to estimate the parameters of a simple linear quantile regression using a dataset name and formula string or matrix inputs.
In the next tutorial, we will learn how to specify quantile levels for the quantileFit regression.