Negative Predicted Probabilities in gologit2

Here is a simple example of how gologit2 can sometimes produce negative predicted probabilities.   
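
It may help to see first where a negative value can come from. The block below is a sketch of the generalized ordered logit model in my own notation (the symbols alpha_j, beta_j and M are labels I am assuming here, with alpha_j corresponding to _cons in the output): each equation models the probability of being above category j, and the category probabilities are then obtained by differencing.

```latex
\begin{align*}
P(Y_i > j) &= \frac{\exp(\alpha_j + X_i\beta_j)}{1 + \exp(\alpha_j + X_i\beta_j)},
  \qquad j = 1, \dots, M-1 \\
P(Y_i = 1) &= 1 - P(Y_i > 1) \\
P(Y_i = j) &= P(Y_i > j-1) - P(Y_i > j), \qquad j = 2, \dots, M-1 \\
P(Y_i = M) &= P(Y_i > M-1)
\end{align*}
```

Because each equation has its own coefficients, nothing forces P(Y > j) to be smaller than P(Y > j-1) for every case. Whenever the fitted values cross, the difference for a middle category is negative.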

. sysuse auto
(1978 Automobile Data)
 
. gologit2 rep78 foreign length mpg, log
 
Iteration 0:   log likelihood = -93.692061  
Iteration 1:   log likelihood = -75.312137  
Iteration 2:   log likelihood = -73.625868  
Iteration 3:   log likelihood = -73.511329  
Iteration 4:   log likelihood = -73.487266  
Iteration 5:   log likelihood = -73.481532  
Iteration 6:   log likelihood = -73.480231  
Iteration 7:   log likelihood = -73.479948  
Iteration 8:   log likelihood = -73.479901  
Iteration 9:   log likelihood = -73.479895  (not concave)
 
Generalized Ordered Logit Estimates               Number of obs   =         69
                                                  LR chi2(12)     =      40.42
                                                  Prob > chi2     =     0.0001
Log likelihood = -73.479895                       Pseudo R2       =     0.2157
------------------------------------------------------------------------------
       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |
     foreign |  (dropped)
      length |   .1070418    .154429     0.69   0.488    -.1956334    .4097171
         mpg |   .1952588   .4860697     0.40   0.688    -.7574202    1.147938
       _cons |  -21.49491   38.85059    -0.55   0.580    -97.64066    54.65084
-------------+----------------------------------------------------------------
2            |
     foreign |   15.29959   1044.196     0.01   0.988    -2031.287    2061.886
      length |  -.0005172   .0410406    -0.01   0.990    -.0809554     .079921
         mpg |  -.0250512   .1696942    -0.15   0.883    -.3576457    .3075432
       _cons |   1.908498   11.06988     0.17   0.863    -19.78807    23.60507
-------------+----------------------------------------------------------------
3            |
     foreign |   3.839048   1.020579     3.76   0.000      1.83875    5.839346
      length |   .0618394   .0322828     1.92   0.055    -.0014336    .1251125
         mpg |   .2485872   .1474317     1.69   0.092    -.0403735    .5375479
       _cons |  -18.36718   9.021422    -2.04   0.042    -36.04884   -.6855182
-------------+----------------------------------------------------------------
4            |
     foreign |   3.029612   1.064825     2.85   0.004     .9425925    5.116631
      length |   .0464896   .0329677     1.41   0.158    -.0181259    .1111052
         mpg |   .2451507   .1194876     2.05   0.040     .0109592    .4793421
       _cons |  -17.31256   8.470451    -2.04   0.041    -33.91434   -.7107851
------------------------------------------------------------------------------
 
WARNING! 25 in-sample cases have an outcome with a predicted probability that is
less than 0. See the gologit2 help section on Warning Messages for more information.
 
. predict p1 p2 p3 p4 p5
(option p assumed; predicted probabilities)
 
. sum p1-p5
 
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          p1 |        74    .1251925       .1639   .0021189   .7717534
          p2 |        74    .0244799    .2264678  -.7717533   .2173719
          p3 |        74    .4389264    .2131038   .0052249   .7701156
          p4 |        74    .2623029    .1576892   .0249331   .5947708
          p5 |        74    .1490983    .2158554    .006046   .9513912
 
. tab rep78
 
     Repair |
Record 1978 |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |          2        2.90        2.90
          2 |          8       11.59       14.49
          3 |         30       43.48       57.97
          4 |         18       26.09       84.06
          5 |         11       15.94      100.00
------------+-----------------------------------
      Total |         69      100.00
 

As we can see from the above, several of the predicted probabilities are negative.  However, there are also many problems with this analysis.  Only 2 cases fall into category 1 and only 8 cases fall into category 2.  The variable foreign was dropped from the first equation, and in the second it has a huge coefficient and an even bigger standard error.  We are spreading the data far too thin here.  Combining categories may help.
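
Before recoding, it can be useful to see which cases are affected. The following is only a sketch: it assumes p1-p5 from the predict command above are still in memory and that the five-category model is still the active estimation result, and minp is a hypothetical variable name introduced for illustration.

```stata
* minp is a hypothetical name: smallest of the five predicted probabilities
egen minp = rowmin(p1-p5)

* how many in-sample cases have at least one negative probability
count if minp < 0 & e(sample)

* where do those cases fall on the outcome and on the problem variable?
tabulate rep78 foreign if minp < 0 & e(sample)

* inspect the offending cases directly
list make foreign rep78 p1-p5 if minp < 0 & e(sample)
```

If the flagged cases cluster in one value of a predictor or near a sparse outcome category, that points to the part of the model that is spread too thin.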

. recode rep78 (1/3=3), gen(rep78b)
(10 differences between rep78 and rep78b)
 
. gologit2 rep78b foreign length mpg, log
 
Iteration 0:   log likelihood = -66.194631  
Iteration 1:   log likelihood = -48.776863  
Iteration 2:   log likelihood = -47.631556  
Iteration 3:   log likelihood = -47.599469  
Iteration 4:   log likelihood = -47.599382  
Iteration 5:   log likelihood = -47.599382 
 
Generalized Ordered Logit Estimates               Number of obs   =         69
                                                  LR chi2(6)      =      37.19
                                                  Prob > chi2     =     0.0000
Log likelihood = -47.599382                       Pseudo R2       =     0.2809
------------------------------------------------------------------------------
      rep78b |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
3            |
     foreign |   3.844035   1.027301     3.74   0.000     1.830561    5.857509
      length |   .0617889    .032248     1.92   0.055     -.001416    .1249937
         mpg |   .2468554   .1485422     1.66   0.097     -.044282    .5379928
       _cons |  -18.32185   9.008774    -2.03   0.042    -35.97872   -.6649788
-------------+----------------------------------------------------------------
4            |
     foreign |   3.032383   1.067163     2.84   0.004     .9407823    5.123985
      length |   .0466275   .0330607     1.41   0.158    -.0181702    .1114252
         mpg |   .2451139   .1194367     2.05   0.040     .0110223    .4792054
       _cons |  -17.33818   8.484411    -2.04   0.041    -33.96732   -.7090361
------------------------------------------------------------------------------

As you can see, combining categories solved the problem: gologit2 no longer issues the warning about negative predicted probabilities.
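
To check the recoded model the same way as the first one, the predict and summarize steps can be repeated. This is a sketch, assuming the three-category model above is the active estimation result; q3, q4 and q5 are hypothetical names chosen so they do not clash with p1-p5.

```stata
* q3, q4, q5 are hypothetical names for the three outcome probabilities
predict q3 q4 q5

* the Min column shows whether any predicted probability is below zero
summarize q3-q5
```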

In other examples I have seen, there are more cases, but there are often a huge number of variables, some of which are themselves problematic (e.g. a 0-1 dichotomy with 985 0's and 15 1's).  It may just be that gologit models aren't appropriate for your data, but following the rest of the advice in the trouble-shooting FAQ may help you to come up with a workable model.  (Indeed, your model may be problematic with any method, but negative predicted probabilities in gologit2 may just make the problems more obvious.)