Richard Williams, Notre Dame Sociology

gologit2/oglm Troubleshooting

Richard Williams, University of Notre Dame

Here are some of the main issues that I get asked about with gologit2. (Some of these points also apply to oglm.)  Feel free to email me if you have other problems, suggestions or recommendations, or to let me know what recommendations worked best for you.  Click here if you want the main gologit2 support page. Click here for the main oglm page.

Universal Recommendation.  Make sure you have the most current version of the program (and also the most up-to-date version of the Stata software you are using).  If you are lucky, the problem you are encountering may have already been fixed.  From within Stata, type

ssc install gologit2, replace
ssc install oglm, replace
update all

If you have Stata 9 or higher you can also use the adoupdate command.  Also, the most up-to-date version of the documentation is gologit2.pdf.
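To confirm which copy of each program Stata is actually running, before and after updating, something like the following may help. This is only a sketch; it assumes both packages were installed from SSC.

```stata
* Show the location and version line of the copy of each program Stata will run
which gologit2
which oglm

* Stata 9 or higher: update just these user-written packages
adoupdate gologit2 oglm, update
```

If which reports an older version than you expect, an outdated copy earlier in the ado-path may be shadowing the new one.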

For security or other reasons, my computer can't access the Internet.  How can I install your programs?  This must be a real nuisance for you!  You may want to talk to your computing people to see if they can't find a way to make your life easier.  Some possible solutions are:

* If you have another computer that has Stata and can access the Internet, install the programs on it.  Then, copy c:\ado (or whatever the appropriate directory is on your machine) from one computer to the other.  Be sure you understand what you are doing, because you don't want to accidentally overwrite files that are needed on the non-Internet machine.

* Following are zipped versions of my programs and their support files.  Unzip the files and store them in c:\ado\personal or some other location where Stata can find them.  Again, if you don't understand how to do this, find somebody on your support staff who can help you out.

gologit2 version 2.1.6
oglm version 2.0.0
mfx2 version 1.1.0

* Sometimes people have read/write access to some drives but not others.  If so, you may be able to modify these suggestions for Notre Dame users.  Basically, the "trick" is to get Stata to look for programs in a folder that you have control over.
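A minimal sketch of that "trick" follows. The folder U:\myado is purely hypothetical; substitute any folder you can write to and into which you have unzipped the files.

```stata
* Show the system directories and the search path Stata currently uses
sysdir
adopath

* Add a folder you control to the end of the search path
* (U:\myado is a made-up example path)
adopath + "U:\myado"

* Check that Stata can now find the program
which gologit2
```

The adopath change lasts only for the current session, so you may want to put that line in your profile.do or at the top of your do-files.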

The output is hard to read and understand.  What are some good ways to interpret and present the results?  For interpreting the results -- see my Stata Journal article, especially section 3.1.  Also look at this presentation and handout. A few additional points, and some comparisons with oglm, are made in this presentation and handout.

For presenting the results -- the default gologit2 output is indeed a little hard to read, because you keep seeing the same numbers over and over for those variables that meet the parallel lines constraint.  A more parsimonious layout is achieved with the gamma option; see sections 3.2 and 3.8 of my Stata Journal article.  I also like the way Thomas Craemer formatted this table (version1, version2); he only presents multiple coefficients for a variable when necessary.  The citation for his complete paper (which also includes a nice discussion of the gologit model) is Craemer, Thomas.  2009.  Psychological ‘self–other overlap’ and support for slavery reparations. Social Science Research 38: 668–680.

I don't have Stata.  Is there any other way I can estimate gologit models?  I've never used it myself, but I understand that Don Hedeker's mixor program can do many of the same things that gologit2 can.  Somebody who is familiar with both programs said that "Hedeker's software does gologit2, but with random effects. I would assume that if you don't specify a random effect you get the same results. His program doesn't do a lot of the cool things that yours does, but if you have specific non-proportionality hypotheses in mind, it will test them and produce the non-proportional results."  Hedeker's web page also includes programs or code for DOS, SPSS and SAS. There is also a commercial version of the program called SuperMix.

Can I do a random effects or multilevel model with gologit2?  No.  Instead, check out Stefan Boes's regoprob program, which uses code adapted from gologit2 and reoprob.  Also, I've never tried it myself, but gllamm (also downloadable from SSC) has been used by some people to estimate gologit2-type models - see these Statalist posts from Mirko Moro and Richard Williams. Or, Don Hedeker's mixor program or SuperMix may do what you want.

How do I change the base category in gologit2?  You can't (although you could reverse the coding of your dependent variable if you liked that better). Suppose your dependent variable has four categories. Although the gologit2 output looks a lot like mlogit output, it doesn't make any sense to think of there being a single "base" category. Rather, the gologit results are like a series of logistic regressions. In the first panel, it is like a logistic regression where category 1 = 0 and categories 2, 3, 4 = 1. In the 2nd panel, it is 1 & 2 = 0 and 3, 4 = 1. 3rd panel, 1, 2, 3 = 0, 4 = 1. There is no 4th panel because if there were it would be like 1, 2, 3, 4 = 0, nothing equals 1. If the assumptions of the ordered logit model are met, the coefficients should all be the same in each panel (except for the intercepts). For more, see section 3.1 of my Stata Journal article.
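To see the "series of logistic regressions" idea concretely, you could dichotomize the outcome by hand. The sketch below assumes a hypothetical four-category outcome y coded 1 to 4 and hypothetical predictors x1, x2 and x3. Because gologit2 estimates all the panels simultaneously, the separate logit results should be similar to, but generally not identical to, the corresponding gologit2 panels.

```stata
* Hypothetical variables: y coded 1-4, predictors x1 x2 x3
* Panel 1: category 1 = 0 versus categories 2, 3, 4 = 1
generate byte y_gt1 = y > 1 if !missing(y)
* Panel 2: categories 1, 2 = 0 versus 3, 4 = 1
generate byte y_gt2 = y > 2 if !missing(y)
* Panel 3: categories 1, 2, 3 = 0 versus 4 = 1
generate byte y_gt3 = y > 3 if !missing(y)

* One binary logit per cut point
logit y_gt1 x1 x2 x3
logit y_gt2 x1 x2 x3
logit y_gt3 x1 x2 x3

* Compare with the three panels of the unconstrained gologit model
gologit2 y x1 x2 x3
```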

gologit2 does not work with Long & Freese's spost routines.  This is covered in the help file but many people miss the advice.  Add the v1 option to gologit2, e.g.

gologit2 y x1 x2 x3, v1

Some (but not all) of Long and Freese's spost routines currently work with the original gologit but not gologit2.  The v1 option saves the internally-stored results in the same format that was used by gologit.  However, you can still use gologit2's other unique options, such as autofit or pl.  Note that post-estimation commands written specifically for gologit2 (including the pr option of predict) may not work correctly if you use the v1 option.  In that case just rerun the model without it.  Also, the v1 option only works with the default logit link, since that is all the original gologit supported. Long has indicated that future versions of spost will support gologit2.

suest (or some other post-estimation command) produces an error when I run it after gologit2; or, I get other weird errors when running gologit2.  Try running gologit2 with the nolabel option.  This will cause the equations to be labeled eq1, eq2, etc.  The printout may not be as aesthetically appealing but this may reduce the likelihood of having problems with gologit2 itself or with other commands that have trouble with your labels (e.g. value labels that start with a number sometimes cause problems).  Changing your value labels may also solve the problem.  In many cases though, a program must be officially "blessed" in order to work with a post-estimation command (i.e. the program must be on a hard-coded approved list) so you may just be out of luck if you try to use gologit2 with it.
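As a quick sketch, with hypothetical variables y, x1, x2 and x3, you might first look at the value labels attached to the dependent variable and then refit with nolabel:

```stata
* See which value label (if any) is attached to y, and what the labels look like
describe y
label list

* Refit with equations named eq1, eq2, ... instead of using the value labels
gologit2 y x1 x2 x3, nolabel
```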

How do I estimate marginal effects with gologit2 & oglm?  Stata's mfx command will work, but make sure you have an up-to-date version of gologit2 (earlier versions had a bug that kept mfx from working correctly).  However, it is generally better to use my mfx2 program, which can be downloaded and installed from SSC (ssc install mfx2). mfx2 makes it easier to compute marginal effects after multiple-outcome commands like oglm, gologit2, ologit, oprobit, mlogit, mprobit and slogit.  In addition, the results are formatted in a way that makes them compatible with post-estimation table formatting commands like outreg2 and estout.  An often superior and faster program is Tamas Bartus's margeff (also available from SSC).  Make sure you have the latest version of margeff, since earlier versions did not support gologit2.  margeff now works with gologit2 but not oglm.
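A bare-bones sketch of the workflow, using hypothetical variables y, x1, x2 and x3 and only the default options of each command:

```stata
* Install or refresh the user-written programs from SSC
ssc install mfx2, replace
ssc install margeff, replace

* Fit the model, then compute marginal effects for every outcome with mfx2
gologit2 y x1 x2 x3, autofit
mfx2

* Alternative: refit so the gologit2 results are the active estimates,
* then use margeff (gologit2 only, not oglm)
gologit2 y x1 x2 x3, autofit
margeff
```

See the help files for mfx2 and margeff for the options that control where the effects are evaluated and how the results are stored.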

The predict command comes up with negative predicted probabilities.  Believe it or not, negative predicted probabilities are possible.  McCullagh & Nelder discuss this in Generalized Linear Models, 2nd edition, 1989, p. 155:

The usefulness of non-parallel regression models is limited to some extent by the fact that the lines must eventually intersect.  Negative fitted values are then unavoidable for some values of x, though perhaps not in the observed range.  If such intersections occur in a sufficiently remote region of the x-space, this flaw in the model need not be serious.

So yes, it can happen, and a couple of people have written me about this.  But, they've also mentioned things like extremely high standard errors or other problems, so I suspect that in most cases a solution lies somewhere in the next couple of points. 
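To see why this can happen, it may help to write the model out. The notation here is introduced only for illustration: M is the number of categories of Y, g is the logistic function, and alpha_j and beta_j are the intercept and coefficients in panel j.

```latex
% Each panel j = 1, ..., M-1 models the probability of being above category j
P(Y_i > j) = g(\alpha_j + X_i\beta_j)
           = \frac{\exp(\alpha_j + X_i\beta_j)}{1 + \exp(\alpha_j + X_i\beta_j)}

% Category probabilities are differences of adjacent cumulative probabilities
P(Y_i = 1) = 1 - g(\alpha_1 + X_i\beta_1)
P(Y_i = j) = g(\alpha_{j-1} + X_i\beta_{j-1}) - g(\alpha_j + X_i\beta_j),
             \qquad j = 2, \dots, M-1
P(Y_i = M) = g(\alpha_{M-1} + X_i\beta_{M-1})

% A middle-category probability is negative exactly when the lines have crossed:
P(Y_i = j) < 0 \iff \alpha_j + X_i\beta_j > \alpha_{j-1} + X_i\beta_{j-1}
```

When the betas are the same in every panel (the parallel lines case) and the intercepts are properly ordered, the lines never cross, so this problem cannot arise.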

I do recommend computing the predicted probabilities under your models; if they seem implausible, then you may wish to modify your model or use a different statistical technique altogether. (One person wrote me that 2 cases out of 27,068 had negative predicted probabilities; I probably wouldn't worry too much in a case like that, but I would get worried if a non-trivial number of cases had negative predicted values.)  Sometimes combining categories of the response variable (especially if the Ns for some categories are small) and/or simplifying the model helps.  The imposition of parallel lines constraints (either via autofit or the pl or npl options) may also help because it reduces the likelihood of non-parallel lines intersecting. 
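One way to check for negative predicted probabilities is sketched below. It assumes a hypothetical four-category outcome y and predictors x1, x2 and x3; the variable names p1 to p4 and pmin are made up for the example.

```stata
* Fit the model and get one predicted probability per outcome category
gologit2 y x1 x2 x3, autofit
predict p1 p2 p3 p4 if e(sample)

* Smallest predicted probability for each case
egen double pmin = rowmin(p1 p2 p3 p4)

* How many cases have a negative predicted probability, and which ones?
count if pmin < 0
list y x1 x2 x3 p1 p2 p3 p4 if pmin < 0
```

If only a handful of cases are affected, looking at their x values may show whether they lie in a remote region of the data.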

Click here for an example of the problem and a solution.

The standard errors are extremely high.  You may have high multicollinearity in your variables.  User-written routines like collin can check for this.  But, routines like ologit and gologit2 can also have problems when an X variable has little or no variability within a category of Y, e.g. when Y = 2, X always equals 0.  In ologit, you might get a warning message like this:

Note: 40 observations completely determined.  Standard errors questionable.

In gologit2, alas, such a warning is still on the "wish list" of things I'd like to add.  But, the high standard errors will still be a clue.  Possible diagnostic devices:

If lack of x variability or extreme multicollinearity within a category of y is the problem - you'll have to decide what to do.  You may want (or be forced) to drop the problematic variable.  Maybe y has too many categories with small Ns, and some will need to be combined.  When logit encounters such a problem it not only drops the variable, it drops the cases that were completely determined.
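As a rough sketch of how you might look for these problems (hypothetical variables y, x1, x2 and x3; collin is user-written and must be installed before it will run):

```stata
* Cross-tabulate y against a categorical x; empty or near-empty cells are a warning sign
tabulate y x1

* For each category of y, check whether each x actually varies
* (sd = 0, or min equal to max, means no variability in that category)
tabstat x1 x2 x3, by(y) statistics(n mean sd min max)

* Overall multicollinearity check with the user-written collin routine
collin x1 x2 x3
```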

If none of this seems to address the problem - then consider the next FAQ:

gologit2 is very slow and/or does not converge and/or produces implausible estimates.  A couple of people have written to me with problems like this.  Often they have a large number of variables and/or cases.  Since I don't have their data it is hard for me to tell if there actually is a problem with the program or whether they need to be more patient or whether their model is problematic.  Here are several things you can try.

gologit2 y x1 x2 x3, pl(x1) difficult

gologit2 y x1 x2 x3, pl(x1) technique(nr bhhh dfp bfgs)