Example introduction
Suppose we have a dependent variable that is observed in several ordered categories. We might estimate coefficients of a regression on this variable using the ordered probit model
where
Assuming
we have
where represents the normal cumulative distribution function. The log-likelihood for this model is
Step 1: Load the CMLMT library
// Make cmlmt library functions available library cmlmt;
Step 2: Load data
Our data source is a CSV file, containing four variables. The first column is the dependent variable and the remaining three will be the independent variables.
// Load all four variables into one matrix data = loadd(getGAUSSHome() $+ "pkgs/cmlmt/examples/cmlmt_ordered_probit.csv"); // Create separate matrices for dependent // and independent variables y = data[., 1]; X = data[., 2:4];
Step 3: Starting parameters
Our model has two characteristics which make it less convenient to store our parameters as a single vector.
- Our log-likelihood function has two vectors which we will want to reference separately.
- We do not want to estimate the first and last elements of .
In order to handle this, we will place our parameters inside of a PV structure. The PV structure allows us to 'pack' our vectors into the structure and retrieve them by name. It also allows us to apply a mask so that some of the parameters will not be estimated.
// Declare 'p' to be a PV structure // and initialize it struct PV p; p = pvCreate(); // Pack the starting vectors for 'beta' // into 'p' so that we can reference it by name beta_strt = { 0.5, 0.5, 0.5 }; p = pvPack(p, beta_strt, "beta"); // Pack the starting values for 'tau' // with a mask, telling CMLMT not to // estimate the first and last elements of 'tau' // Note the use of 'pvPackM' instead of 'pvPack' mask = { 0, 1, 1, 1, 0 }; tau_strt = { -30, -1, 0, 1, 30 }; p = pvPackM(p, tau_strt, "tau", mask);
Step 4: The log-likelihood procedure
Now that we have set up our parameters, it is time to look at how we will use them in the log-likelihood procedure. The log-likelihood procedure has three sets of inputs
- The first input is the parameter vector or PV structure containing the model parameters.
- Next comes any data your function needs other than the parameters.
- The final input is an indicator vector which CMLMT uses to tell your function whether to compute the gradient or Hessian. For this example, we will ignore the indicator vector.
// The log-likelihood procedure proc (1) = orderedProbit(p, y, X, ind); local mu, tau_, beta_, emu, eml; // Declare 'mm' to be a modelResults // structure to hold the function value struct modelResults mm; // Extract the current 'tau' // and 'beta' vectors tau_ = pvUnpack(p, "tau"); beta_ = pvUnpack(p, "beta"); mu = X * beta_; eml = tau_[y] - mu; emu = tau_[y+1] - mu; // Assign the log-likelihood value to the // 'function' member of the modelResults structure mm.function = ln(cdfn(emu) - cdfn(eml)); // Return the model results structure retp(mm); endp;
Step 5: Apply the inequality constraints
In order to enforce the inequality
we will set this inequality constraint equation
// Declare 'ctl' to be a cmlmtControl structure // and fill with default values struct cmlmtControl ctl; ctl = cmlmtControlCreate(); /* ** The dot product of each row of ctl.C ** with the parameter vector must be ** greater than or equal to the corresponding ** row of ctl.D ** ** Note that the first 3 elements of our parameter ** vector are 'beta' and the final 3 are the middle ** 3 elements of 'tau', because the mask keeps ** the first and last elements of 'tau' from ** the estimation. */ ctl.C = { 0 0 0 -1 1 0, 0 0 0 0 -1 1 }; ctl.D = { 0, 0 };
Step 6: Run the estimation and print results
The cmlmt
procedure will take five inputs
- A pointer to the log-likelihood procedure.
- The PV structure containing the model parameters.
3-4. The
y
vector andX
matrix. - The
cmlmtControl
structure, containing the inequality constraints.
// Declare 'out' to be a cmlmtResults structure // to hold the results of the estimation struct cmlmtResults out; // Perform estimation and print results out = cmlmt(&orderedProbit, p, y, X, ctl); call cmlmtPrt(out);
Results
After running all of the above code, the following report will be printed
=============================================================================== CMLMT Version 3.0.0
=============================================================================== return code = 0 normal convergence Log-likelihood -14.1623 Number of cases 200 Covariance of the parameters computed by the following method: ML covariance matrix Parameters Estimates Std. err. Est./s.e. Prob. Gradient --------------------------------------------------------------------- beta[1,1] 4.2996 0.2816 15.268 0.0000 0.0000 beta[2,1] 5.1704 0.3275 15.787 0.0000 0.0000 beta[3,1] 6.1077 0.2931 20.838 0.0000 0.0000 tau[2,1] -10.3687 0.6380 -16.251 0.0000 0.0000 tau[3,1] 0.2012 0.2833 0.710 0.4776 0.0000 tau[4,1] 10.2323 0.6439 15.890 0.0000 0.0000 Correlation matrix of the parameters 1 0.5337 0.37587 -0.56933 0.12117 0.55204 0.53365 1 0.56732 -0.71255 -0.029152 0.66281 0.37587 0.56732 1 -0.62297 -0.010140 0.61754 -0.56933 -0.71255 -0.62297 1 -0.013120 -0.59049 0.12117 -0.029152 -0.010140 -0.013120 1 0.016093 0.55204 0.66281 0.61754 -0.59049 0.016093 1 Wald Confidence Limits 0.95 confidence limits Parameters Estimates Lower Limit Upper Limit Gradient ---------------------------------------------------------------------- beta[1,1] 4.2996 3.7442 4.8551 0.0000 beta[2,1] 5.1704 4.5245 5.8164 0.0000 beta[3,1] 6.1077 5.5296 6.6858 0.0000 tau[2,1] -10.3687 -11.6271 -9.1104 0.0000 tau[3,1] 0.2012 -0.3575 0.7599 0.0000 tau[4,1] 10.2323 8.9623 11.5023 0.0000