Logistic Regression Example
The first step in using logisticRegression is ensuring that the data is in the proper format and scaled appropriately. The y-matrix input of the logisticRegression procedure must hold a discrete, dichotomous dependent variable with values {0,1}. Independent attributes used in logisticRegression may be continuous or discrete. However, all data must be numeric, and any categorical string data should be recoded before it is passed to the logisticRegression procedure.
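As an illustration of the scaling step, the following is a minimal sketch that standardizes each column of an attribute matrix to mean 0 and standard deviation 1. The matrix x and its values are hypothetical, and standardization is only one common scaling choice, not necessarily the one the procedure expects.

```gauss
// Hypothetical 4x2 matrix of continuous attributes (illustrative values only)
x = { 1.0 200,
      2.0 180,
      3.0 240,
      4.0 220 };

// meanc and stdc return column vectors of per-column means and
// standard deviations; transposing them lets the element-wise
// operators apply each statistic to its own column of x
x_scaled = (x - meanc(x)') ./ stdc(x)';

print x_scaled;
```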
Discrete Choice Analysis Tools v2.0 includes the reclassify procedure, which recodes binary data to {0,1} data, converts categorical string data to categorical numeric data, and creates dummy variables. As an example, consider the vector y, containing string data coded as Yes or No. To reclassify it as {0,1} data:
y_new = reclassify(y,0);
The output from reclassify, y_new, will be a vector of binary data coded as {0,1}. In addition, reclassify prints a recoding key to the input/output screen:
Category Name    Assigned Category
No               0
Yes              1
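For comparison, the same Yes/No recoding can be sketched by hand with GAUSS's element-wise string comparison operator. This is not how reclassify is implemented; the sample vector and the name y_manual are hypothetical and serve only to show the mapping in the key above.

```gauss
// Hypothetical string array of responses (illustrative values only)
y = "Yes" $| "No" $| "Yes" $| "No";

// The element-wise string comparison returns 1 where the entry
// equals "Yes" and 0 otherwise, matching the key: No -> 0, Yes -> 1
y_manual = (y .$== "Yes");

print y_manual;
```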
After data setup, the next step is to declare and initialize the lrControl structure, which controls the model parameters:
struct lrControl lctl;
lctl = lrGetDefaults();
Next, set the solver type and turn on both post-estimation prediction and prediction plotting:
lctl.solver_type = 3;
lctl.predict = 1;
lctl.predictPlot = 1;
Finally, call the logisticRegression procedure to estimate the attribute weights and, if specified, perform cross-validation:
struct lrOut outLR;
outLR = logisticRegress(lctl, y, x);
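For reference, the standard logistic regression model behind the estimated attribute weights can be written as follows. The symbols w (weight vector), b (intercept), and x_i (attribute vector of observation i) are notation introduced here, not names from the package, and the exact objective that is minimized (for example, whether a regularization term is included) may depend on the selected solver type.

```latex
P(y_i = 1 \mid x_i) = \frac{1}{1 + \exp\left(-(w^{\top} x_i + b)\right)},
\qquad
P(y_i = 0 \mid x_i) = 1 - P(y_i = 1 \mid x_i)
```

Under this model, an observation is typically classified as 1 when the predicted probability exceeds a threshold such as 0.5.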
After classifying the data, the procedure produces the following plots: