Negative Predicted Probabilities in gologit2
Here is a simple example of how gologit2 can sometimes produce negative predicted probabilities.
. sysuse auto
(1978 Automobile Data)
. gologit2 rep78 foreign length mpg, log
Iteration 0: log likelihood = -93.692061
Iteration 1: log likelihood = -75.312137
Iteration 2: log likelihood = -73.625868
Iteration 3: log likelihood = -73.511329
Iteration 4: log likelihood = -73.487266
Iteration 5: log likelihood = -73.481532
Iteration 6: log likelihood = -73.480231
Iteration 7: log likelihood = -73.479948
Iteration 8: log likelihood = -73.479901
Iteration 9: log likelihood = -73.479895 (not concave)
Generalized Ordered Logit Estimates               Number of obs   =         69
                                                  LR chi2(12)     =      40.42
                                                  Prob > chi2     =     0.0001
Log likelihood = -73.479895                       Pseudo R2       =     0.2157

------------------------------------------------------------------------------
       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
     foreign |  (dropped)
      length |   .1070418    .154429     0.69   0.488    -.1956334    .4097171
         mpg |   .1952588   .4860697     0.40   0.688    -.7574202    1.147938
       _cons |  -21.49491   38.85059    -0.55   0.580    -97.64066    54.65084
-------------+----------------------------------------------------------------
2            |
     foreign |   15.29959   1044.196     0.01   0.988    -2031.287    2061.886
      length |  -.0005172   .0410406    -0.01   0.990    -.0809554     .079921
         mpg |  -.0250512   .1696942    -0.15   0.883    -.3576457    .3075432
       _cons |   1.908498   11.06988     0.17   0.863    -19.78807    23.60507
-------------+----------------------------------------------------------------
3            |
     foreign |   3.839048   1.020579     3.76   0.000      1.83875    5.839346
      length |   .0618394   .0322828     1.92   0.055    -.0014336    .1251125
         mpg |   .2485872   .1474317     1.69   0.092    -.0403735    .5375479
       _cons |  -18.36718   9.021422    -2.04   0.042    -36.04884   -.6855182
-------------+----------------------------------------------------------------
4            |
     foreign |   3.029612   1.064825     2.85   0.004     .9425925    5.116631
      length |   .0464896   .0329677     1.41   0.158    -.0181259    .1111052
         mpg |   .2451507   .1194876     2.05   0.040     .0109592    .4793421
       _cons |  -17.31256   8.470451    -2.04   0.041    -33.91434   -.7107851
------------------------------------------------------------------------------
WARNING! 25 in-sample cases have an outcome with a predicted probability that is
less than 0. See the gologit2 help section on Warning Messages for more information.
. predict p1 p2 p3 p4 p5
(option p assumed; predicted probabilities)
. sum p1-p5
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          p1 |        74    .1251925       .1639   .0021189   .7717534
          p2 |        74    .0244799    .2264678  -.7717533   .2173719
          p3 |        74    .4389264    .2131038   .0052249   .7701156
          p4 |        74    .2623029    .1576892   .0249331   .5947708
          p5 |        74    .1490983    .2158554    .006046   .9513912
. tab rep78
     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00
As the output above shows, some of the predicted probabilities are negative: p2 falls as low as -.77, and the warning reports 25 affected in-sample cases. However, there are also many other problems with this analysis. Only 2 cases fall into category 1 and only 8 cases fall into category 2. The variable foreign was dropped from the first equation and has a huge coefficient (15.3) and an even bigger standard error (1044.2) in the second. We are spreading the data way too thin here. Combining categories may help.
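To see where a negative value comes from, it may help to work one case by hand. gologit2 models the cumulative probabilities, Pr(Y > j) = invlogit(constant_j + X*b_j), and obtains each category's probability by differencing adjacent ones, so Pr(Y = 2) = Pr(Y > 1) - Pr(Y > 2). Because each equation has its own slopes, nothing forces Pr(Y > 1) to be at least as large as Pr(Y > 2). The sketch below plugs the coefficients reported above into these formulas for a hypothetical foreign car with length 170 and mpg 25; those two values are assumed purely for illustration and are not taken from a particular row of the data.

```stata
* Illustrative arithmetic only: length = 170 and mpg = 25 are assumed values
* for a hypothetical foreign car (foreign = 1), not an actual observation.
* Coefficients are copied from the five-category gologit2 output above.

* Equation 1: foreign was dropped, so it contributes nothing here
scalar xb1 = -21.49491 + .1070418*170 + .1952588*25

* Equation 2: the huge foreign coefficient pushes this index very high
scalar xb2 = 1.908498 + 15.29959*1 - .0005172*170 - .0250512*25

display "Pr(rep78 > 1) = " invlogit(xb1)
display "Pr(rep78 > 2) = " invlogit(xb2)
display "Pr(rep78 = 2) = " invlogit(xb1) - invlogit(xb2)
```

With these inputs Pr(rep78 > 2) is essentially 1 while Pr(rep78 > 1) is about .83, so the implied Pr(rep78 = 2) is roughly -.17. This is consistent with the summary above, where the minimum of p2 is almost exactly the negative of the maximum of p1.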
. recode rep78 (1/3=3), gen(rep78b)
(10 differences between rep78 and rep78b)
. gologit2 rep78b foreign length mpg, log
Iteration 0: log likelihood = -66.194631
Iteration 1: log likelihood = -48.776863
Iteration 2: log likelihood = -47.631556
Iteration 3: log likelihood = -47.599469
Iteration 4: log likelihood = -47.599382
Iteration 5: log likelihood = -47.599382
Generalized Ordered Logit Estimates               Number of obs   =         69
                                                  LR chi2(6)      =      37.19
                                                  Prob > chi2     =     0.0000
Log likelihood = -47.599382                       Pseudo R2       =     0.2809

------------------------------------------------------------------------------
      rep78b |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
3            |
     foreign |   3.844035   1.027301     3.74   0.000     1.830561    5.857509
      length |   .0617889    .032248     1.92   0.055     -.001416    .1249937
         mpg |   .2468554   .1485422     1.66   0.097     -.044282    .5379928
       _cons |  -18.32185   9.008774    -2.03   0.042    -35.97872   -.6649788
-------------+----------------------------------------------------------------
4            |
     foreign |   3.032383   1.067163     2.84   0.004     .9407823    5.123985
      length |   .0466275   .0330607     1.41   0.158    -.0181702    .1114252
         mpg |   .2451139   .1194367     2.05   0.040     .0110223    .4792054
       _cons |  -17.33818   8.484411    -2.04   0.041    -33.96732   -.7090361
------------------------------------------------------------------------------
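As you can see, combining categories solved the problem: the "not concave" message is gone from the iteration log, and gologit2 no longer warns about negative predicted probabilities.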
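The absence of the warning can also be checked directly. This is a minimal sketch, assuming the collapsed model above is still the active estimation result. The names q3, q4, q5 and anyneg are made up for this example; new names are needed because p1-p5 already exist from the first model.

```stata
* Predicted probabilities for the three remaining categories (3, 4, 5).
* q3, q4, q5 and anyneg are hypothetical names chosen for this sketch.
predict q3 q4 q5
summarize q3 q4 q5

* Flag any case whose predicted probability is below zero for some outcome
generate byte anyneg = (q3 < 0) | (q4 < 0) | (q5 < 0)
count if anyneg & e(sample)
```

If combining the categories worked as intended, every minimum reported by summarize should be non-negative and the count should be zero.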
In other examples I have seen, there are more cases, but there are often a huge number of variables, some of which are themselves problematic (e.g., a 0-1 dichotomy with 985 0's and 15 1's). It may just be that gologit models aren't appropriate for your data, but following the rest of the advice in the troubleshooting FAQ may help you come up with a workable model. (Indeed, your model may be problematic with any method; negative predicted probabilities in gologit2 may just make the problems more obvious.)
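One way to spot such problems before fitting anything is to tabulate the outcome against each categorical predictor and look for empty or nearly empty cells. The sketch below does this for the example used here, assuming the auto data are still in memory; with other data, substitute your own outcome and predictors.

```stata
* Look for sparse or empty cells before estimating the model
* (assumes the auto data from the example above are still in memory)
tabulate rep78 foreign

* A one-way table shows how lopsided a single dichotomy is
tabulate foreign
```

An empty cell in the two-way table suggests that some equation cannot separate the groups, which tends to show up as a dropped variable or as an enormous coefficient and standard error, as happened with foreign in the five-category model.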