How the Large N Could Complement the Small in Democratization Research

Abstract: Large-N quantitative analysis should be considered a possible complement to small-N comparisons (SNC). Quantitative analysis has its weaknesses, to be sure, but they could be counterbalanced by some real strengths in SNC. And quantitative analysis does have certain methodological advantages that help compensate for some of the weaknesses of SNC. On the one hand, SNC tends to develop "thick" (complex, multidimensional, contextualized, or rich) concepts and theories that are well-suited for description and for making inferences about simple causation on a small scale or in a few cases; but thick concepts and theories are unwieldy when it comes to generalization or rigorous testing of complex hypotheses. On the other hand, quantitative analysis is justifiably criticized for its "thin" (reductionist or simplistic) concepts and theories, but it is the best method available for testing generalizations, especially generalizations about complex causal relationships. In principle, thick concepts can be translated into the thin format of quantitative data, and the nuanced, conditional, complex, and contextualized hypotheses of SNC can be translated into quantitative models. To make this potential a reality, however, we have to collect different data and more data and do it more systematically and rigorously. There is an especially pressing need to develop data and models that bridge levels of analysis. The paper also surveys the extent to which the proposed translation of thick into thin is taking place already.

Michael Coppedge
Kellogg Institute
Hesburgh Center
University of Notre Dame
Notre Dame, IN 46556
tel 219/631-7036
fax 219/631-6717

How the Large N Could Complement the Small in Democratization Research

General Observations drawn from Particulars, are the Jewels of
knowledge, comprehending great Store in a little Room; but they
are therefore to be made with the greater Care and Caution.

John Locke, "Of the Conduct of the Understanding" (1706)

Democracy and regime change have been favorite objects of study among Latin Americanists for decades. Over the years, these scholars and other comparativists observing Latin American cases have proposed many interesting hypotheses about these phenomena. Among the prominent hypotheses are those holding that democratization is affected (for good or ill) by: the rise of a middle class (Johnson 1958), dependency (Cardoso and Faletto 1971), military professionalization (Stepan 1971), the mode of incorporation of the working class (Collier and Collier 1991), the alignment of the state with class interests (Rueschemeyer, Stephens, and Stephens 1992), economic development in combination with all of the above (O'Donnell 1973), corporatist political culture (Wiarda 1981), U.S. intervention (Blasier 1985, Lowenthal 1991), presidentialism (Linz and Valenzuela 1994) and multipartism (Mainwaring 1993), economic performance (Gasiorowski 1995, Remmer 1996, Haggard and Kaufman 1997), or elite strategies in response to crisis (Linz and Stepan 1978, O'Donnell and Schmitter 1986). This kind of theorizing has been criticized recently for being unsystematic, not cumulative, untestable, or even atheoretical (Bates 1996, Geddes 1997). However, people disagree about the preferred alternative. The critics in question are advocates of rational choice, which aspires to the logical deduction of universalistic theory. But small-N comparison has also been criticized from the opposite, more anthropological, pole, which advocates closer attention to local context, nuance, and meaning (Levine 1994).

The purpose of this paper is to insert into this controversy another alternative--large-N quantitative analysis--not as a superior method, but as a possible complement to small-N comparison (SNC). Quantitative analysis has its weaknesses, to be sure, but they could be counterbalanced by some real strengths in SNC. And quantitative analysis does have certain methodological advantages that help compensate for some of the weaknesses of SNC. On the one hand, SNC tends to develop "thick" (complex, multidimensional, contextualized, or rich) concepts and theories that are well-suited for description and for making inferences about simple causation on a small scale or in a few cases; but thick concepts and theories are unwieldy when it comes to generalization or rigorous testing of complex hypotheses. On the other hand, quantitative analysis is justifiably criticized for its "thin" (reductionist or simplistic) concepts and theories, but it is the best method available for testing generalizations, especially generalizations about complex causal relationships.

So far quantitative analysis has hardly begun to exploit its full potential for assimilating complex concepts and testing complex theories, largely due to data limitations; but the potential is still there. In order to realize this potential, scholars need to answer two key questions that arise at the intersection of SNC and quantitative analysis: Can thick concepts be translated into the thin format of quantitative data? And can the nuanced, conditional, complex, and contextualized hypotheses of SNC be translated into quantitative models? In this paper I argue that the answer to both questions is "yes" in principle, but that in order to make these approaches complementary in practice, we have to collect different data and more data and do it more systematically and rigorously. In the process I will also survey the extent to which the proposed translation of thick into thin is taking place already.

A Perspective on Methods

In debates about the merits of one approach vs. another, it is always healthy to bear in mind that all contain gaping methodological holes. We social scientists never prove anything, not even with our most sophisticated methods. Popper argued that the goal of science is not to prove a theory, but to disconfirm alternative hypotheses (Popper 1968).(1) In a strict sense, our goal is to disconfirm all the alternative hypotheses. But no serious social scientist requires proof that, for example, space aliens have not been destabilizing democracies by poisoning their water supplies. In practice, therefore, we are content to disconfirm only the alternative hypotheses that are conventionally considered plausible by other social scientists. (Of course, if implausible hypotheses become plausible later, we are obliged to try to disconfirm them as well.) This convention lightens our burden tremendously because the vast majority of the hypotheses an imaginative person could dream up are implausible. But it leaves room for a still-overwhelming number of alternatives, for two reasons. First, different people find different things plausible. Some people are convinced by intimate personal knowledge of a case; others by sophisticated statistical tests; and still others by the force of logical deduction. (A corollary is that most of us tend to overlook the weaknesses of research that satisfies our pet criteria.) Second, as Lakatos argued, disconfirmation is no simple yes-or-no exercise. Every hypothesis is embedded in a web of theories, not the least of which is the "interpretive" theory used to gather evidence for the test (Lakatos 1978). The common--and legitimate--practice of revising the supporting theories to explain away an apparent disconfirmation further increases the number of plausible alternatives. (For this reason, in this article I define "disconfirmation" rather loosely, as "inconsistency with the web of theories conventionally treated as the facts.")

This multiplication of plausible alternative hypotheses is especially problematic for those who would like to explain the complex macro-phenomena of politics, such as democratization, because the number of plausible alternatives is almost hopelessly large. There is no room here to list all the hypotheses about democratization that anyone finds plausible, but one can intuitively grasp the magnitude of the challenge by surveying all the "orders of complexity" involved.

Every theoretical model in social science has five parameters. First, every model pertains to a certain level of analysis--individual, group, national, world-systemic, or some intermediate gradation between these. Second, it has one or more dependent variables. Third, it has one or more explanatory variables. Fourth, it applies to a certain relevant universe of cases. And fifth, it applies to events or processes that take place during a certain period of time. We can refer to the definitions of each of these five parameters as possessing "zero-order complexity" because no relationships among parameters are involved. In the study of democratization, however, even at the zero order there is great leeway for defining what democracy is, how to measure it and any explanatory factors, which sample of countries is relevant for testing any given set of explanations, and the period of time to which such explanations apply. And this is just at the national level of analysis; with smaller or larger units of analysis, one would use completely different variables, cases, and time frames.

"First-order complexity" involves any causal relationship between any of these parameters and itself. These relationships include:

1. causation bridging levels of analysis, or (dis)aggregation;
2. causal relationships among dependent variables, or endogeneity;
3. interactions among independent variables;
4. impacts of one time period on another, called lagged effects or temporal autocorrelation; and
5. the impact of one case on another, called diffusion or spatial autocorrelation.

Plausible examples of all of these can be found in the democratization literature:(2)

1. Aggregation: democratization at the national level as the outcome of strategic maneuvering among elites at the group or individual level (O'Donnell and Schmitter 1986);
2. Endogeneity: political liberalization as a prerequisite for transition (O'Donnell and Schmitter 1986);
3. Interactions: collinearity among two or more independent variables, such as modernization indicators (Hadenius 1992);
4. Lagged effects: democratization as a process of incremental change from a country's previous level of freedom (Burkhart and Lewis-Beck 1994, Przeworski et al. 1996); and
5. Diffusion: waves of democracy (Li and Thompson 1975, Huntington 1991, Diamond 1996, Starr 1991).
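Several of these first-order complications map onto familiar quantitative machinery. As a minimal sketch of complication 4, a lagged dependent variable captures temporal autocorrelation; the data, coefficients, and persistence level below are all invented for illustration, not estimates from the literature.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented data: a country's yearly democracy score is modeled as mostly
# last year's score (the lagged effect) plus a nudge from an explanatory
# variable and noise. All coefficients are illustrative.
T = 40
x = rng.normal(size=T)          # stand-in explanatory variable
dem = np.zeros(T)
for t in range(1, T):
    dem[t] = 0.9 * dem[t - 1] + 0.3 * x[t] + 0.1 * rng.normal()

# High persistence shows up as strong first-order autocorrelation,
# which a model omitting the lagged term would wrongly attribute to x.
lag1_autocorr = np.corrcoef(dem[:-1], dem[1:])[0, 1]
```

A model of diffusion (complication 5) would have the same structure, except that the lag would run across neighboring cases rather than across years.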

Second-order complexity involves causal relationships between two different parameters. All hypotheses about an independent variable causing democracy (or democracy causing something else) are of this order; but so are various complications that could be introduced into a model. If the meaning of democracy varies over time or the best way to operationalize an independent variable depends on the world region, then one is dealing with this degree of complexity. Third-order complexity comes into play when there are plausible hypotheses relating three parameters. Most common among these are hypotheses that the relationship between the dependent variable and an independent variable is partly a function of time or place. A good example is the hypothesis that the impact of economic development on democratization depends on a country's world-system position (O'Donnell 1973, Bollen 1983, Burkhart and Lewis-Beck 1994, Hadenius 1992). With fourth-order complexity, a causal relationship could be a function of both time and place (or level of analysis). This may sound far-fetched, but in small-N comparison such relationships are fairly commonly asserted--for example, the notion that increasing wealth has not favored democracy in the Arab oil-producing states since the Second World War (Karl 1997); or the claim that the U.S. has become more sincerely interested in promoting democracy in the Caribbean Basin since the end of the Cold War (Huntington 1991).

Orders of complexity can increase only so far. Eventually, one arrives at the extremely inelegant "saturated" model that explains each outcome perfectly by providing different and unique explanations for each case. Laypersons who have not been socialized into social science know that the saturated model is the truth: every country is unique, history never repeats itself exactly, and every event is the product of a long and densely tangled chain of causation stretching back to the beginning of time. We political scientists know on some level that a true and complete explanation for the things that fascinate us would be impossibly complex. But we willfully ignore this disturbing fact and persist in our research. We are a community of eccentrics who share the delusion that politics is simpler than it appears. This is why our relatives roll their eyes when we get excited about our theories. Although I would be as delighted as any other political scientist to discover simple, elegant, and powerful explanations, I think the common sense of the layperson is correct: we must presume that politics is extremely complex, and the burden of proof rests on those who claim that it is not.

From this admittedly perfectionist perspective, all approaches yield only a partial and conditional glimpse of the truth. Believing otherwise is an invitation to error. Nevertheless, all approaches have some value because, as Karl Deutsch said, the truth lies at the confluence of independent streams of evidence. Any method that helps us identify some of the many possible plausible hypotheses is useful, as is any method that combines theory and evidence to help us judge how plausible these hypotheses are. But this perspective also suggests a practical and realistic standard for evaluating the utility of competing methodologies. For methods that are primarily concerned with empirical assessments,(3) it is not enough for a method to document isolated empirical associations or regularities; and it is asking too much to expect incontrovertible proof of anything. The question that should be asked is, rather, what are the strengths and weaknesses of each approach in helping us render certain kinds of alternative hypotheses more plausible or less? My discussion of these basic strengths and weaknesses is organized into three sections dealing with three desiderata for a theory of democratization: thick concepts, thick theory, and bridges between levels of analysis.

Thick Concepts

In the empiricist's ideal world, theoretical concepts would be simple, clear, and objective. We would theorize exclusively about relatively straightforward things like voter turnout, union membership, or legislative turnover. But the reality is that much of the political theory we find interesting concerns some of the messiest concepts around--power, participation, legitimacy, identity, development, accountability, and of course democracy. Such concepts are often controversial because they mean different things to different people: they are complex and multifaceted, and their meanings often have subtle variations depending on the time, the context, or the person using them. People disagree about whether this is a good thing or not; but for as long as interesting theories are couched in terms of messy concepts, those who wish to test such theories have no choice but to translate those concepts into indicators of one sort or another, whether the results are categorical definitions or continuous numerical variables.

SNC excels at the kind of conceptual fussiness that is required to develop valid measures of thick concepts. Researchers using this approach usually define their concepts carefully; they take pains to explain how what they mean by a concept differs from what their colleagues have meant; they spend a great deal of time justifying what the functional equivalent of a concept is in the case they are analyzing; they are sensitive to how the meaning of the concept may have changed over a long period of time; and it is not unusual for small-N comparativists to debate publicly what is or should be meant by the word that represents a concept. By far the best demonstration of these tendencies is Collier and Levitsky's recent survey of qualifiers for "democracy": they encountered hundreds in the published literature (Collier and Levitsky 1997)! This attention to nuance comes at a price, however, for it impedes generalization and cumulation. The more elaborately a concept is defined, the narrower it becomes. The more baggage it has to carry, the less widely it can travel. A concept that is perfectly tailored for analyzing politics in the United States--say, roll-call voting--is not very useful for analyzing legislative behavior in Britain, or even in the U.S. at the turn of the century. This difficulty in generalizing also means that as general theory cumulates, uncertainty cumulates along with it. If my explanation of Y1 differs from your explanation of Y2 or her explanation of Y3, it may be because we are explaining slightly different things. Every researcher who defines a dependent variable anew automatically lends plausibility to this alternative hypothesis, which remains plausible until it is ruled out by additional research.

Quantitative research has the opposite strengths and weaknesses. Its variables tend to be defined more narrowly, which makes it more feasible to gather data from a large number of cases and therefore to support generalizations. Also, the same few variables--a handful of democracy indicators, per capita GNP, etc.--tend to be used repeatedly. This habit, which is reinforced by the cost of collecting new data for a large sample, reduces (but does not eliminate) the plausibility of the hypothesis that different researchers are in fact explaining different things in different ways, and therefore favors (but does not guarantee) cumulation. The weakness is that the "thin" concepts implied by the construction of some of the variables often introduce uncertainty about the validity of these measures. Quantitative researchers in effect use the bait-and-switch tactic of announcing that they are testing hypotheses about the impact of, for example, "economic development," and then substituting by sleight of hand an indicator of per capita energy consumption and asserting that it measures development well enough. The problem with such substitutions is not necessarily that they do not measure the concept of interest at all (although this does occur at times); in fact, per capita energy consumption correlates extremely strongly (.90+) with per capita GNP. Rather, the problem is that a single narrow indicator cannot capture all the relevant aspects of a thick concept. A fully valid indicator of economic development would have to measure not only energy consumption or wealth, but also industrialization, technology production, productivity, and perhaps other facets. If we operationalize development as per capita GNP, we are not really testing a hypothesis about economic development, but about the value of production by the average person.
If many researchers use this same indicator, they may be cumulating general knowledge, but it is general knowledge about a somewhat different hypothesis that does not provide a completely satisfying answer to our theoretical questions.

Is there any way to assemble large datasets with valid indicators? It would be easier if concepts were not thick. We cannot expect everyone to lose all interest in certain theoretical concepts simply because they are difficult to measure, to eschew the thick and theorize about the thin. However, this has occasionally happened. For example, most comparativists in the mid-1960s considered "instability" an interesting and important phenomenon, but few do today. The reason for the change is that political scientists soon realized that "instability" meant several different and incomparable things--regime instability, government instability, cabinet instability, and disturbances to public order, which could be further subdivided into riots, strikes, crime, terrorism, and internal war. These thinner concepts seem more useful and interesting to us today, and when students begin to talk about instability in general we are quick to set them straight. Again, it would facilitate testing if a similar consensus would evolve about the subdivision of other thick concepts, such as "governance" or "democratic consolidation."

But some thick concepts are too old and too central to our thinking to be reduced in this way. In such cases the alternative is to explore empirically how to measure them validly and reliably, which requires recognition of their complexity. The basic procedure for measuring any complex concept is well known and has four steps. First, the analyst breaks the "mother" concept up into as many simple and relatively objective components as possible. Second, each of these components is measured separately. Third, the analyst examines the strength of association among the components to discover how many dimensions are represented among them and in the mother concept. Fourth, components that are very strongly associated with one another are treated as unidimensional, i.e., as all measuring the same underlying dimension, and may be combined. Any other components or clusters of components are treated as indicators of different dimensions. If the mother concept turns out to be multidimensional, the analyst then has two or more unidimensional indicators that together can capture its complexity. If the mother concept turns out to be unidimensional, then the analyst has several closely associated component indicators that may be combined into a single indicator that captures all the aspects of that dimension better than any one component would.(4) This is the kind of analysis that makes it possible to construct indicators for complex concepts that can be used in large-N quantitative analysis.
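Steps three and four can be sketched with invented data: a hypothetical "mother" concept broken into four components, two of which are built to share one underlying dimension and two another. Everything here (case counts, noise levels, the eigenvalue rule of thumb) is illustrative rather than a recommended research design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented scores for 50 cases on four components of a "mother" concept.
# By construction, components a and b reflect one underlying dimension
# and c and d another, plus measurement noise.
dim1 = rng.normal(size=50)
dim2 = rng.normal(size=50)
a = dim1 + 0.2 * rng.normal(size=50)
b = dim1 + 0.2 * rng.normal(size=50)
c = dim2 + 0.2 * rng.normal(size=50)
d = dim2 + 0.2 * rng.normal(size=50)
components = np.column_stack([a, b, c, d])

# Step 3: examine the strength of association among the components.
corr = np.corrcoef(components, rowvar=False)

# Count dimensions via the eigenvalues of the correlation matrix
# (one rough rule of thumb: eigenvalues greater than 1).
eigenvalues = np.linalg.eigvalsh(corr)
n_dims = int((eigenvalues > 1).sum())

# Step 4: strongly associated components are treated as unidimensional
# and combined into a single indicator each.
indicator_1 = components[:, :2].mean(axis=1)  # combines a and b
indicator_2 = components[:, 2:].mean(axis=1)  # combines c and d
```

With data like these, the eigenvalue count recovers the two built-in dimensions, so the mother concept would be represented by two unidimensional indicators rather than forced into one.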

Some of this sort of empirical exploration has been done already for the concept of democracy. Democratic theorists over the decades first simplified the task by progressively narrowing the concept, purging it of impractical components such as the appointment of administrators by lottery, and adapting it to the context of the large nation-state by accepting the idea of representation (Dahl 1989). But from the French Revolution through Alexis de Tocqueville's Democracy in America, "democracy" was still so multifaceted that it was not even clearly distinct from social equality. The "elite theorists" during and after the Second World War then promoted an even narrower concept of democracy that was limited to political, rather than social or economic, components and did not require direct participation in policymaking, only in the selection of policymakers (Schumpeter 1942, Dahl and Lindblom 1953, Dahl 1971, Sartori 1973). By the time political scientists began trying to measure democracy, the concept had therefore been reduced to selected national political institutions and practices and some of their characteristics.

The first indicators of democracy had a few problems that required refinements. The early democracy indicators often confounded democracy with regime stability. In his classic 1959 article, for example, Lipset used the ordinal classifications "stable democracies/unstable democracies/dictatorships" (for European and English-speaking countries) and "democracies/unstable dictatorships/stable dictatorships" (for Latin American countries) (Lipset 1959). Phillips Cutright's index of "national political development" was the sum of a country's democracy scores over a 21-year period, which made the number of years of democracy matter as much as the degree of democracy in each year (Cutright 1963). As Kenneth Bollen has observed, this mistake has been repeated several times, even as late as 1988 (Bollen 1991, 10-12). This is not to say that it is illegitimate to be interested in stable democracy. However, measuring stable democracy with anything more sensitive than an either-or category requires at least two dimensions, as regime stability and democracy vary independently: there are stable democracies, unstable democracies, stable nondemocracies, and unstable nondemocracies.

Other attempts to measure democracy excluded stability, sometimes by reporting a score for one time-point, sometimes by reporting an annual series of scores. But some of them compromised validity by including components that had little or no theoretical justification. For example, Vanhanen (1990) included the percentage of the vote won by the governing party in his index of democracy, even though nothing in democratic theory suggests that extremely fragmented party systems are more democratic than two-party or moderate multiparty systems. Another example is the Freedom House survey. Its checklists take into consideration the autonomy of elected representatives from military control, a country's right of self-determination, citizens' freedom from domination by economic oligarchies, the autonomy of religious and ethnic minorities, gender equality, property rights, the freedom to choose family size, freedom from dependency on union leaders and bureaucrats, and freedom from gross government corruption, among other requirements (Freedom House 1991, 49-51). Some of these components probably should not be included in a measure of democracy; others could be if the definition of democracy were fairly rich but should not be lumped together in the same index because they are likely to be multidimensional. Freedom House appears to combine its components in a flexible way that somehow avoids the worst potential biases, but it has not reported systematically how the components are related, so it is difficult for outside observers to confirm their validity or reliability.

Despite these measurement problems and another not yet mentioned, we know that even the relatively thin versions of democracy consist of at least two dimensions. For one of those dimensions we already have several indicators that are adequate for various large-N comparisons. One of Dahl's major contributions in Polyarchy (1971) was to argue convincingly that polyarchy has two dimensions--contestation and inclusiveness. He defined contestation as having five components, or institutional requirements--elected officials, free and fair elections, freedom of expression, associational autonomy, and the existence of alternative sources of information. Inclusiveness was defined solely in terms of the suffrage and widespread eligibility to run for public office. Coppedge and Reinicke (1990) later confirmed that the components of contestation are indeed unidimensional and may be legitimately combined into a single indicator, while the extent of the suffrage lies on a different dimension and should not be included as a component of contestation. Many of the existing quantitative indicators of "democracy" are actually indicators of contestation. They are the Bollen Index of Political Democracy (Bollen 1980), the Polity III data on democracy and autocracy (Jaggers and Gurr 1995), the Freedom House ratings of Political Rights and Civil Liberties, the Polyarchy Scale (Coppedge and Reinicke 1990), Hadenius' Index of Democracy (Hadenius 1992), and Bollen's Index of Liberal Democracy (Bollen 1993).(5) It has been demonstrated repeatedly that these indicators measure the same underlying dimension. Their intercorrelations, for example, usually exceed .83 (Inkeles 1990, 5-6). Table 1 provides additional information about these indicators.
They are by no means perfect: Bollen has demonstrated, for example, that Freedom House ratings for 1979-1981 (at least) tended to underrate Eastern European countries and overrate Latin American countries by a small but statistically significant amount (Bollen 1993).(6) His index for 1980, which corrects for these biases as well as anyone can at this point, is probably the most valid indicator available today. But Bollen's index is a point measure; only the Freedom House ratings and the Polity III data are time-series. If one needs time-series data, there is little reason to avoid using the Freedom House data. According to Bollen's estimates, the Freedom House Political Rights ratings for 1979-1981 were 93 percent valid despite the regional bias. They also correlate at .938 with the Polyarchy Scale. These results suggest that one can expect very similar results from an analysis regardless of which of these indicators is used (Inkeles 1990, 5; Hopple and Husbands 1991, 11-12).
Table 1: Comparison of Available Indicators of Contestation

Name                          Source                       Nature                  Years          No. Countries
Index of Political Democracy  Bollen 1980                  0-100 index             1960 and 1965  123 and 113
Polity III Democracy data     Jaggers and Gurr 1995        10-point scale          1800-1994      161
Polity III Autocracy data     Jaggers and Gurr 1995        10-point scale          1800-1994      161
Democracy - Autocracy         Jaggers and Gurr 1995        20-point scale          1800-1994      161
Political Rights              Freedom House                7-point scale           1972-1996      ave. 151
Civil Liberties               Freedom House                7-point scale           1972-1996      ave. 151
Combination of both           Freedom House                13-point scale          1972-1996      ave. 151
Polyarchy Scale               Coppedge and Reinicke 1990   11-point Guttman scale  1985           170
Index of Democracy            Hadenius 1992                0-10 index              1988           132
Index of Liberal Democracy    Bollen 1993                  0-100 index             1980           153

It is important to remember, however, that contestation is just one narrow dimension of what has historically been meant by democracy. Partly for this reason controversy has always surrounded the use of these indicators. One common objection concerns the measurement of democracy as a continuum; another concerns the exclusion of various theoretically important components.

Categorical vs. Continuous Indicators

There are two basic objections to continuous measures of democracy. One, most recently and forcefully argued by Adam Przeworski but also championed by Giovanni Sartori, holds that the theoretical concept of democracy is categorical, not continuous, and that attempts to measure a categorical phenomenon with a continuous instrument produce either measurement error or nonsense (Przeworski et al. 1996, Sartori 1987). The second objection is that our theories about democratization usually concern regimes rather than degrees of democracy; continuous indicators are therefore inappropriate for testing the leading theories (Munck 1996). Or, as Juan Linz argued in 1964:

We prefer for purposes of analysis to reject the idea of a continuum from democracy to totalitarianism and to stress the distinctive nature of authoritarian regimes. Unless we examine the features unique to them, the conditions under which they emerge, the conceptions of power held by those who shape them, regimes which are not clearly either democratic or totalitarian will be treated merely as deviations from these ideal types and will not be studied systematically and comparatively (Linz 1970, 253).

These objections tend to imply that numerical indicators are inherently unsuitable for some purposes. But the problem with the quantitative indicators that we have now is not that they are quantitative; it is that they are qualitatively different from the categorical definitions of democracy that they attempt to measure.

If both continuous and categorical indicators measured exactly the same concept, then we would prefer the continuous one on the grounds that it is more informative, more flexible, and better suited for the sophisticated testing that can rule out more of the plausible alternative hypotheses. If one wanted a categorical measure, it could always be derived from the continuous one by identifying one or more thresholds that correspond to the categories desired. An indicator dichotomized in this way would sort cases and interact with other variables the same way a directly categorical measure would--again, assuming that they measured exactly the same concept. In other words, the continuous indicator contains more information, which we could choose to ignore, but the reverse is not true: one cannot derive a continuous measure from a categorical one without adding new information.
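The asymmetry is easy to see in a toy example. The country names, scores, and threshold below are all invented for illustration.

```python
# Invented 0-10 contestation scores for four hypothetical countries.
scores = {"A": 9.1, "B": 6.7, "C": 4.2, "D": 1.5}

# A categorical measure can always be derived from the continuous one
# by choosing a threshold (5.0 here is purely illustrative).
regime = {country: ("democracy" if s >= 5.0 else "nondemocracy")
          for country, s in scores.items()}

# The reverse is impossible: the labels alone cannot recover the scores,
# because dichotomizing discards the information that B sits near the
# threshold while A sits far above it.
```

Moving the threshold, or adding a second one for an intermediate category, requires no new data; recovering the scores from the labels would.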

Some may still object that the additional information in the continuous measure is not meaningful or useful because translating neat and satisfying categories into slippery matters of degree deprives us of analytic footholds. According to this argument, our minds seek out categories because we need definite, firm, satisfying, categorical ideas to guide us. This, I think, is just an illusion created by attempts to translate precise mathematical language into imprecise verbal language. Suppose a simple bivariate regression estimate of the relationship between polyarchy and per capita GNP is that Democracy = 2.0 + .125*log(GNPPC). If one had to explain this finding without using any numbers, one could say little more than, "There is a minimal level of democracy below which no country falls, but the wealthier the average person is, the more democratic the country is. However, the benefits of wealth diminish steadily as wealth increases." Such a summary does not allow one to say anything useful about how democratic we should expect any particular country to be. It is only a faint hint of what the estimate really says, because the most useful information--the numbers--has been removed. Restoring the numbers recreates a compact, elegant formula that can generate quite definite predictions of how democratic any country should be. And properly understood, it is not a false precision, because the standard errors and other parameters of the estimate can be used to calculate a confidence interval for any of its predictions.
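To make this concrete, the hypothetical estimate can be turned back into a small calculator. The intercept and slope below come from the made-up example in the text; the standard error of prediction is likewise invented, only to show how a confidence interval would be attached:

```python
import math

def predicted_democracy(gnp_per_capita):
    # Hypothetical estimate from the text: Democracy = 2.0 + .125*log(GNPPC).
    return 2.0 + 0.125 * math.log(gnp_per_capita)

def prediction_interval(gnp_per_capita, se=0.5, z=1.96):
    # Illustrative 95% interval around a prediction, assuming an
    # invented standard error of prediction of 0.5.
    p = predicted_democracy(gnp_per_capita)
    return (p - z * se, p + z * se)

# Diminishing returns to wealth: every doubling of income adds the same
# fixed increment, 0.125 * ln(2), to the predicted democracy score.
gain_per_doubling = predicted_democracy(2000) - predicted_democracy(1000)
```

The verbal paraphrase can say only that benefits diminish; the formula says exactly how fast, and for any income level it yields both a point prediction and a band of uncertainty around it.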

It is of course natural to feel uncertain about what a prediction of, say, "5.4" means because the number itself has no inherent meaning. But the same could be said about other numbers in our everyday lives--temperature readings, ages, incomes, and Olympic diving scores. All of these numbers, and the equally arbitrary words that make up a language, acquire meaning only through the familiarity that comes from using them to describe, compare, and communicate. Moreover, numbers have the additional advantage of being translatable into graphics, which often speak more eloquently than words. Finally, if none of this is reassuring, there is one indicator that does transparently explain what all of its scores mean--the Polyarchy Scale. Because it is a Guttman scale, every score corresponds to a well-defined set of institutions and practices--a thumbnail description of the relevant characteristics of each political system (Coppedge and Reinicke 1990).

But this entire defense of the superiority of continuous indicators rests on the premise that the hypothetical continuous and categorical indicators measure exactly the same concept. In theory they could, but in practice they do not. This is the real problem with continuous indicators: they measure only thin, reductionist versions of the thicker concepts that interest the non-quantitative scholars. Such is the case with "regimes," which come with rich and sometimes quite elaborate definitions. Compare, for example, Juan Linz's definition of an authoritarian regime with the Polyarchy Scale's criteria for scale score 5, which is the threshold that corresponds most closely to authoritarianism. (These definitions are reproduced in Table 2). The first two components of each definition are nearly interchangeable even though the Polyarchy Scale is more explicit here about what "limited pluralism" means in practice. (Obviously, Linz's legendary 237-page essay is much more explicit than the brief definition quoted in Table 2.) Linz's definition, however, goes on to list three additional components that are entirely missing from scale score 5 or any other score of the Polyarchy Scale--the nature of the leaders' belief systems, the absence of active political mobilization by the regime, and some degree of institutionalization.
Table 2: Definitions of Authoritarian Regime and a Low Degree of Polyarchy Contrasted

Authoritarian Regime (Linz 1975, 264):

1. [Political systems without] free competition between leaders to validate at regular intervals by nonviolent means their claim to rule. . .
2. . . . political systems with limited, not responsible, political pluralism
3. without elaborate and guiding ideology, but with distinctive mentalities
4. without extensive nor intensive political mobilization, except at some points in their development
5. and in which a leader or occasionally a small group exercises power within formally ill-defined limits but actually quite predictable ones.

Polyarchy Scale Score 5 (Coppedge and Reinicke 1990, 53-54):

1. (10) [There are] no meaningful elections: elections without choice of candidates or parties, or no elections at all.
2. Some political parties are banned and trade unions or interest groups are harassed or banned, but membership in some alternatives to official organizations is permitted. Dissent is discouraged, whether by informal pressure or by systematic censorship, but control is incomplete. The extent of control may range from selective punishment of dissidents on a limited number of issues to a situation in which only determined critics manage to make themselves heard. There is some freedom of private discussion. Alternative sources of information are widely available but government versions are presented in preferential fashion. This may be the result of partiality in and greater availability of government-controlled media; selective closure, punishment, harassment, or censorship of dissident reporters, publishers, or broadcasters; or mild self-censorship resulting from any of these.

Although this comparison demonstrates that the two concepts are not fully comparable, it also illustrates how the richness of categorical definitions can be combined with the advantages of numerical indicators. Every element of a categorical definition can be reconceptualized as a threshold on a continuous dimension; these components can be measured separately, and then recombined to the extent that they are shown to be unidimensional. For example, if the Polyarchy Scale included all the components from Linz's definition of authoritarianism, then it would be a valid indicator of his concept, and it would have the additional advantage of defining and measuring greater and lesser degrees of authoritarianism.(7) No information would be lost, and some would be added.

Theorists could certainly refuse to recognize the higher and lower ranges of such an indicator as valid, on the grounds that they were never contemplated in the original theory. But the fact is that scholars who define those higher and lower ranges are breaking new conceptual ground. As long as one threshold of their continuous concept is faithful to all the facets of the original categorical concept, the only additional requirement for validity is that the extended ranges be useful for analysis. If they are, there is no reason not to use thick continuous measures. It bears repeating, however, that thick continuous measures that are fully equivalent to regime definitions remain to be developed.(8) In the meantime, users of continuous indicators should remind themselves that they are working in a realm of thin conceptualization.

Thickening Thin Concepts

The second objection to quantitative indicators of democracy directly addresses their thinness. A by-product of the third wave of democratization is that as more and more developing countries now satisfy the rather minimalist existing requirements for democracy, it is difficult not to notice that some of these political systems have disturbing characteristics that seem intuitively inconsistent with democracy. Some scholars therefore remind us of components of democracy that have been dropped or taken for granted in the past 50 years and quite understandably call for them to be restored or made explicit. Thus Schmitter and Karl (1991, 76-80) include institutionalization and a viable civil society ("cooperation and deliberation via autonomous group activity") among their criteria for "what democracy is." Similarly, others stress the centrality of the rule of law (Hartlyn and Valenzuela 1994, O'Donnell 1994) and an independent judiciary (Diamond 1996). O'Donnell and others also argue that democracy requires elected officials to enjoy autonomy from unelected "veto groups," whether they are economic conglomerates, international powers, or the military; and impartial respect for basic citizenship rights (O'Donnell 1993).

Once again empirical analysis could be a great help in deciding whether and how to restore these components to the concept of democracy. The crucial task is to ascertain which of these components lie on the same dimensions as contestation and inclusiveness. Unidimensional components can be incorporated into definitions and indicators without provoking much controversy (although not without considerable hard work). If these additional components turn out to lie on different dimensions, then we face a choice: we could incorporate any of these components as a third or fourth dimension of a more complex concept of democracy; or we could decide to keep them as separate concepts whose relationship with a still-narrow concept of democracy is empirical rather than definitional.

When this analysis is done, I suspect that we will settle on a thicker, 3-dimensional concept of democracy. The three dimensions I have in mind are inclusiveness, empowerment, and the scope of democratic authority. In other words, democracy is about a large proportion of the citizens having an equal chance to participate in making final decisions on a wide range of issues. Inclusiveness needs little explanation: it is simply the proportion of the adult citizens who have effective opportunities to participate equally in the opportunities for decisionmaking that the political system allows. However, the definitions of polyarchy and many other thin concepts of democracy do not consider inclusiveness in any opportunities besides voting for representatives and running for office. In reality there are, or could be, many other opportunities for citizens to participate equally in decisionmaking: in judicial proceedings, at public hearings, in referendums and plebiscites, and in speaking through the media to place issues on the public agenda, for example. Most civil liberties fit into this dimension as well, as they involve individuals' equal right to decide their own beliefs and many other aspects of their personal lives. There is also a hierarchy of increasingly responsible opportunities for participation that are available to much smaller groups of citizens: drafting legislation, voting on legislation, ratifying appointments, reconciling legislation, and so on. These opportunities, ranked according to their proximity to a binding final decision, constitute the dimension of empowerment. The criterion of inclusiveness is relevant to all of these opportunities, not just to the suffrage. If the judicial system does not provide equal protection under the law, for example, the political system would have to be rated as less inclusive. 
Also, a political system that allows citizens to vote directly on some important legislation is more empowered than one in which all citizens choose only their representatives, other things being equal. Because this is a two-dimensional concept so far, other things may not be equal. For instance, a political system in which all citizens are allowed to vote in a rigged plebiscite would score high on inclusiveness but low on empowerment.

The third dimension--the scope of democratic authority--reflects the agenda of issues that the democratic government may decide without consulting unelected actors. This dimension reflects any constraints on governmental authority imposed by the military, business groups, religious authorities, foreign powers, or international organizations regarding issues of importance to them. The fewer the issues that are in practice "off limits" to final decisionmaking by relatively inclusive bodies, the broader the scope of democratic authority. These three dimensions taken together make it possible to incorporate several of the additional components mentioned above into a thicker concept of democracy. Autonomy from veto groups is reflected in the scope of democratic authority, and the rule of law and respect for citizenship are equivalent to inclusiveness in opportunities to influence the judicial process. This three-dimensional concept would therefore help us make meaningful distinctions among countries that satisfy the current minimal requirements for democracy.

Thick Theory

A second strength of small-N comparison is the development of "thick theory": richly specified, complex models that are sensitive to variations by time and place. As argued above, such complex models are desirable because many of the complex alternative hypotheses are plausible and we have to try to disconfirm them in order to make any progress. In the study of democratization and many other complex macro-phenomena, the virtues of parsimony are overrated. Small-N comparative work does a good job of suggesting what these complex relationships might be. In the Latin American literature, the conventional wisdom presumes that each wave of democratization is different, that each country has derived different lessons from its distinct political and economic history; that corporate actors vary greatly in power and tactics from country to country, and that both individual politicians and international actors can have a decisive impact on the outcome. This is the stuff of thick theory, and comparative politics as a whole benefits when a regional specialization generates such rich possibilities.

But can such complex hypotheses be tested with small-N comparisons? On first thought, one might say no because of the "many variables, small N" dilemma. The more complex the hypothesis, the more variables are involved; therefore a case study or paired comparison seems to provide too few degrees of freedom to mount a respectable test. This cynicism is not fair, however, because in a case study or small-N comparison the units of analysis are not necessarily whole countries. Hypotheses about democratization do not have to be tested by examining associations between structural causes and macro-outcomes. In King, Keohane, and Verba's terminology, we increase confidence in our tests by maximizing the number of observable implications of the hypothesis: we brainstorm about things that must be true if our hypothesis is true, and systematically confirm or disconfirm them (King, Keohane, and Verba 1994, 24). The rich variety of information available to comparativists with an area specialization makes this strategy ideal for them. In fact, it is what they do best. For example, a scholar who suspects that Allende was overthrown in large part because he was a socialist can gather evidence to show that Allende claimed to be a socialist; that he proposed socialist policies; that these policies became law; that these laws adversely affected the economic interests of certain powerful actors; that some of these actors moved into opposition immediately after certain quintessentially socialist policies were announced or enacted; that Allende's rhetoric disturbed other actors; that these actors issued explicit public and private complaints about the socialist government and its policies; that representatives of some of these actors conspired together to overthrow the government; that actors who shared the president's socialist orientation did not participate in the conspiracy; that the opponents publicly and privately cheered the defeat of socialism after the overthrow; and so on. 
Much of this evidence could also disconfirm alternative hypotheses, such as the idea that Allende was overthrown because of U.S. pressure despite strong domestic support. If it turns out that all of these observable implications are true, then the scholar could be quite confident of the hypothesis. In fact, she would be justified in remaining confident of the hypothesis even if a macro-comparison showed that most elected socialist governments have not been overthrown, because she has already gathered superior evidence that failed to disconfirm the hypothesis in this case.

The longitudinal case study is simply the best research design available for testing hypotheses about the causes of specific events. In addition to maximizing opportunities to disconfirm observable implications, it does the best job of documenting the sequence of events, which is crucial for establishing the direction of causal influence. Moreover, it is unsurpassed in providing quasi-experimental control, because conditions that do not change from time 1 to time 2 are held constant, and every case is always far more similar to itself at a different time than it is to any other case. A longitudinal case study is the ultimate "most similar systems" design. The closer together the time periods are, the tighter the control. In a study of a single case that examines change from month to month, week to week, or day to day, almost everything is held constant and scholars can often have great confidence in inferring causation between the small number of conditions that do change around the same time. Of course, any method can be applied poorly or well, so this method is no guarantee of a solid result. But competent small-N comparativists have every reason to be skeptical of conclusions from macro-comparisons that are inconsistent with their more solid understanding of a case.

This approach has two severe limitations, however. First, it is extremely difficult to use it to generalize to other cases. Every additional case requires a repetition of the same meticulous process-tracing and data collection. To complicate matters further, the researcher usually becomes aware of other conditions that were taken for granted in the first case and now must be examined systematically in it and all additional cases. Generalization therefore introduces new complexity and increases the data demands almost exponentially, making comparative case studies unwieldy. A good demonstration of this tendency is the Colliers' Shaping the Political Arena: in order to apply detailed case-study methods systematically to 8 cases, they had to write an 800-page book. The second limitation of the case study is that it does not provide the leverage necessary to test hypotheses of the third order of complexity and beyond. Such hypotheses usually involve hypotheticals, for which a single case can supply little data (beyond interviews in which actors speculate about what they would have done under other conditions). For example, would the Chilean military have intervened if Allende had been elected in 1993 rather than 1970? If a different Socialist leader had been president? If he were in Thailand rather than Chile? If Chile had a parliamentary system? Such hypotheses cannot be tested without some variation in these added explanatory factors, variation that one case often cannot provide.

Generalization and complex relationships are better supported by large-N comparisons, which provide the degrees of freedom necessary to handle many variables and complex relationships. These comparisons need not be quantitative, as the qualitative Boolean analysis recommended by Charles Ragin has many of the same strengths (Ragin 1987, Berg-Schlosser and De Meur 1994). However, Boolean analysis forces one to dichotomize all the variables, which sacrifices useful information and introduces arbitrary placement of classification cut points that can influence the conclusions. It also dispenses with probability and tests of statistical significance, which are very useful for ruling out weak hypotheses. Quantitative analysis has these additional advantages over Boolean analysis. Moreover, quantitative methods are available that can easily handle categorical or ordinal data alongside continuous variables, and complex interactions as well, so there would be little reason to prefer qualitative methods if quantitative data were available and sound.
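The sensitivity to cut points is easy to demonstrate. The "wealth" figures and the two candidate cut points below are invented for illustration:

```python
# Crisp-set (Boolean) analysis requires every condition to be coded 0/1.
# Hypothetical per capita incomes for three cases:
wealth = {"A": 5200, "B": 4800, "C": 9000}

def crisp(values, cut):
    """Dichotomize a continuous condition at an arbitrary cut point."""
    return {case: v >= cut for case, v in values.items()}

at_5000 = crisp(wealth, 5000)  # B classified as "not wealthy"
at_4500 = crisp(wealth, 4500)  # B classified as "wealthy"
# Case B's row in the truth table -- and any Boolean minimization built
# on it -- flips with this choice, whereas a continuous measure would
# simply record that B sits near the boundary.
```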

The fact that little high-quality quantitative data is available for large samples is the main reason why the potential for large-N comparisons to explain democratization has not been realized more fully. However, large-N analyses have made use of relevant indicators as they have become available, even if they were not fully valid. There has been quite a bit of exploration of thin versions of a variety of hypotheses. The central hypothesis in the 1960s was that democracy is a product of "modernization" (proposed in a case study of Turkey--Lerner 1958), which was measured by a long, familiar, and occasionally lampooned set of indicators--per capita energy consumption, literacy, school enrollments, urbanization, life expectancy, infant mortality, size of industrial workforce, newspaper circulation, and radio and television ownership. The principal conclusion of these analyses was that democracy is consistently associated with per capita energy consumption or (in later studies) per capita GNP or GDP, although the reasons for this association remain open for discussion (Jackman 1973, Rueschemeyer 1991, Diamond 1992). Quantitative studies have also explored associations between democracy and:

*income inequality (Bollen and Jackman 1985, Muller 1988, Bollen and Jackman 1995, Przeworski et al. 1996)
*religion and language (Hannan and Carroll 1981, Lipset, Seong, and Torres 1993, Muller 1995)
*region or world-system position (Bollen 1983, Gonick and Rosh 1988, Burkhart and Lewis-Beck 1994, Muller 1995, Coppedge 1997)
*state size (Brunk, Caldeira, and Lewis-Beck 1987)
*presidentialism, parliamentarism, and party systems (Stepan and Skach 1993, Mainwaring 1993, Przeworski et al. 1996, Power and Gasiorowski 1997) and
*economic performance (Remmer 1996, Londregan and Poole 1996).

Certain other explanatory factors that have been suggested in the Latin American literature have not yet been tested in large samples. Among them are:

*U.S. support for democracy or authoritarian governments (Blasier 1985, Lowenthal 1991)
*relations between the party in power and elite interests (Rueschemeyer, Stephens, and Stephens 1992)
*the mode of incorporation of the working class (Collier and Collier 1991)
*interactions with different historical periods, and
*U.S. military training (Stepan 1971, Loveman 1994).

Incorporation of these last hypotheses into large-N models will require much more thought and data collection, especially if samples extend beyond Latin America. Ideally, data collection would also be done more rigorously for these independent variables so that they would be adequately valid indicators of the concepts of theoretical interest. This is not always possible, but where it is not--as with data collected worldwide for other purposes--greater use should be made of statistical techniques for incorporating latent variables into models. These techniques, such as LISREL, can help compensate and correct for poor measurement if several related indicators are available and their theoretical relationship to the concept of interest is known (Jaccard and Wan 1996).
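The intuition behind these latent-variable techniques can be sketched crudely on simulated data. This is not LISREL itself, only the averaging-out-of-error logic that makes a measurement model work; all numbers and the indicator labels are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
latent = rng.normal(0.0, 1.0, n)  # unobserved concept, e.g. "modernization"

# Four imperfect indicators, each equal to the latent trait plus its own
# idiosyncratic measurement error (as if literacy, urbanization, etc.).
indicators = np.column_stack(
    [latent + rng.normal(0.0, 0.5, n) for _ in range(4)]
)

# Standardize and average: a simple reflective composite in which the
# unrelated errors of the separate indicators partly cancel out.
z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
composite = z.mean(axis=1)

# The composite tracks the latent trait better than any single indicator.
r_composite = np.corrcoef(composite, latent)[0, 1]
r_single = np.corrcoef(indicators[:, 0], latent)[0, 1]
```

A full structural-equation model does this more rigorously, weighting each indicator by its estimated reliability, but the payoff is the same: several poor measures, suitably related to the concept of interest, can stand in for one good one.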

In spite of the improvements still needed in measurement, quantitative research has steadily forged ahead into higher orders of complexity. The first studies consisted of cross-tabulations, correlations, and bivariate regressions, taking one independent variable at a time. The first multivariate analysis was Cutright's in 1963, but nearly a decade passed before it became the norm to estimate the partial impact of several independent variables using multiple regression. In the early 1980s some researchers began exploring interactions between independent variables and fixed effects such as world-system position (Bollen 1983, Gonick and Rosh 1988), a third-order hypothesis. However, these models were simpler than those being entertained by Latin Americanists of the time. O'Donnell's model of bureaucratic authoritarianism, for example, was nonlinear, sensitive to cross-national variations and the historical-structural moment, and defined the nature of links between the national and international levels of analysis (O'Donnell 1973, Collier 1979). One major advance came in 1985, when Edward Muller made a distinction between factors that cause transitions to democracy and factors that help already-democratic regimes survive. This distinction was shared by the discussions of the Wilson Center group that led to Transitions from Authoritarian Rule in 1986 and it has been reinforced by the ambitious project of Przeworski et al. (1996).

Until very recently almost all quantitative research on democratization was cross-sectional, due to the lack of a time-series indicator of democracy. Since the late 1980s, however, Freedom House and Polity II data have been available and are increasingly used to incorporate lagged dependent variables into quantitative models (Burkhart and Lewis-Beck 1994, Londregan and Poole 1996). These lagged effects represent a great step forward in control, because they make it possible to hold constant, even if crudely, all the unmeasured conditions in each country that do not change from one year to the next. They also give one more confidence in inferences about causality because the independent variables can in effect explain changes in the level of democracy over relatively short spans of time.
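A minimal sketch shows what the lagged dependent variable buys. The series below is simulated for a single hypothetical country; the coefficients and noise level are invented:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
income = rng.normal(8.0, 1.0, n)  # stand-in for log per capita income

# Simulate: democracy[t] = 0.5 + 0.7*democracy[t-1] + 0.2*income[t] + noise.
dem = np.zeros(n)
for t in range(1, n):
    dem[t] = 0.5 + 0.7 * dem[t - 1] + 0.2 * income[t] + rng.normal(0.0, 0.1)

# OLS with the lagged level on the right-hand side: the lag soaks up
# persistent, unmeasured conditions carried over from the previous
# period, so the income coefficient reflects short-run changes in
# democracy rather than cross-sectional differences in levels.
X = np.column_stack([np.ones(n - 1), dem[:-1], income[1:]])
beta, *_ = np.linalg.lstsq(X, dem[1:], rcond=None)
persistence, income_effect = beta[1], beta[2]
```

Omitting the lag would force the income coefficient to account for everything that makes this country persistently more or less democratic, which is exactly the confounding the lagged term is meant to absorb.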

A few scattered studies have employed other techniques to probe possible complexities in democratization. Starr (1991) explored evidence for causal influence across cases, i.e., the diffusion of democracy. There is little other published quantitative work on democratic diffusion, but O'Loughlin and Ward at the University of Colorado--Boulder are undertaking an ambitious study of diffusion (O'Loughlin and Ward 1995). Bratton and Van de Walle's 1996 study of recent African transitions is also innovative for disaggregating the democratic outcome into a series of dependent variables--political protests, political liberalization, and democratization, each of which is a stepping-stone to the next. Given the virtual consensus on the idea of stages of democratization--liberalization, transition, and consolidation or survival--it would seem to be a good idea either to model these stages separately, as Przeworski et al. have done for survival rates or democratic "life expectancy"; or better yet, to combine these or other stages as endogenous variables in a unified model. Finally, some of my own research suggests that the relationship between modernization and democracy depends on which threshold of democracy is being explained (Coppedge 1997). Unfortunately for parsimony, all of these studies make such complexities more plausible rather than less: they provide evidence that should lead us to presume that diffusion, endogeneity, and threshold effects are real. The same may be true of even more troublesome complexities that remain unexplored. Perhaps the most difficult one is the challenge of bridging levels of analysis.

Bridging Levels of Analysis

Aside from the few studies of diffusion or interactions with world-system position, all of the quantitative research mentioned above is cast at the national level of analysis. The widest gulf that divides large-N studies from small-N comparisons results from the fact that most of the latter are either cast at the subnational (group or individual) level or move easily between all levels, from individual to international. Small-N comparison is more flexible in this respect, and this is one of its methodological advantages. As long as small-N comparativists have information that is plausibly relevant for explaining democratization, there is nothing to stop them from inserting it into the explanation regardless of its level of analysis. Thus, case studies routinely mix together national structural factors such as industrialization, growth rates, or constitutional provisions; group factors such as party or union characteristics; individual factors such as particular leaders' personalities or decisions; and international factors such as international commodity prices and U.S. or IMF influence. Quantitative researchers are caught flat-footed when faced with shifting levels of analysis because they go to great pains to build a dataset at one level, and shifting to a different level requires building a completely different dataset from scratch. The units of analysis they have are countries and years, at best. To test hypotheses from the O'Donnell-Schmitter-Whitehead project, they would have to collect data about strategic actors rather than countries and resample at intervals of weeks or months rather than years.(9)

In view of the difficulty of bridging levels of analysis, it is tempting to conclude that the effort is not necessary: that the choice of a level of analysis is a matter of taste, that those working at the individual and national levels may eat at Almond's separate tables and need never reconcile their theories. But from the perspective of methodological perfection outlined in this paper, the level of analysis is not a matter of taste, because no level of analysis by itself can yield a complete picture of all the causal relationships that lead to an outcome like democracy. All levels of analysis are, by themselves, incomplete. Rational-choice advocates are right to insist that any political theory tell us what is going on at the individual level. This does not mean, however, that claims about associations at a macro-level do not qualify as theory until one can tell a story about the causal mechanisms at the individual level. A theory of structural causation is theory, but an incomplete theory, just as theory at the individual level is incomplete until it tells us what process determined the identities and number of players, why these players value the ends they pursue rationally and which variety of rationality guides their choices, how the institutional arena for the game evolved, what process priced the payoffs in the game, why the rules sometimes change in mid-game, and how the power distribution among actors determines the macro-outcome. And both microtheories and macrotheories are incomplete until we understand them in their slowly but constantly evolving historical-structural context.

This insistence on bridging levels of analysis is not mere methodological prudery. Empirical questions of great theoretical, even paradigmatic, import depend on it: questions such as, "Do individuals affect democratization at all?" Rational choice assumes that they do; Linz and Stepan (1978) and O'Donnell, Schmitter, and Whitehead (1986) asserted that they do. Yet despite all the eloquent theorizing that led to "tentative conclusions about uncertain transitions," all the cases covered by Transitions from Authoritarian Rule underwent successful transitions that have lasted remarkably long. There are many possible explanations for this genuinely surprising outcome, but one that is plausible enough to require disconfirmation is the idea that these transitions were driven by structural conditions. Even if it is the case that elites and groups had choices and made consequential decisions at key moments, their goals, perceptions, and choices may have been decisively shaped by the context in which they were acting. If so, they may have had a lot of proximate influence but very little independent influence after controlling for the context. I do not mean to assert this interpretation as fact, but merely to suggest that it has some plausibility and theoretical importance and to point out that we will never know how seriously to take it until we bridge these levels of analysis with methods that permit testing of complex multivariate hypotheses. This effort would require collecting a lot of new data on smaller units of analysis at shorter time intervals.


Small-N comparison and quantitative large-N analysis need each other. SNC needs to be able to generalize and to test its complex theories; quantitative studies need to be based on richer concepts and a greater variety of explanatory factors. The two have the potential to be quite complementary. The chief obstacle to blending them, however, is the lack of appropriate data. The priorities I would suggest for future democratization research are:

1. Conceptual work to identify aspects of democracy that should be restored to its definition.
2. Empirical work to measure all of these components of democracy and identify their dimensions.
3. Operationalization of hypotheses about democratization cast at the group and individual levels.
4. Theoretical work to specify the most plausible functional forms and interactions among variables.
5. Integration of explanatory models at different levels of analysis.
6. Testing of the resulting complex models with many cases and long time-series.
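Item 4 in particular is where thick, conditional hypotheses become testable quantitative models. As a minimal sketch (hypothetical variables and effect sizes of my own invention, using numpy), a conditional claim such as "development promotes democracy only in the absence of resource rents" translates directly into an interaction term:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Hypothetical conditional hypothesis: economic development raises the
# chances of democracy only where the regime is not sustained by
# resource rents (a stylized stand-in for a contextual condition).
development = rng.normal(size=n)
resource_rents = rng.integers(0, 2, size=n)  # 0/1 indicator

# Simulated outcome: development matters only when rents == 0.
democracy = (1.0 * development * (1 - resource_rents)
             + rng.normal(scale=0.5, size=n))

# The interaction term encodes "the effect of development depends on rents".
X = np.column_stack([np.ones(n), development, resource_rents,
                     development * resource_rents])
beta, *_ = np.linalg.lstsq(X, democracy, rcond=None)

# beta[1]: effect of development when rents == 0
# beta[1] + beta[3]: effect of development when rents == 1
print(beta[1], beta[1] + beta[3])
```

Each additional interaction of this kind multiplies the data requirements, which is why items 4 and 6 on the list depend on the data-collection effort urged throughout this paper.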


References

Bates, Robert H. 1996. "Letter from the President: Area Studies and the Discipline." APSA-CP: Newsletter of the APSA Organized Section in Comparative Politics 7:1 (Winter): 1-2.

Berg-Schlosser, Dietrich and Gisèle De Meur. 1994. "Conditions of Democracy in Interwar Europe: A Boolean Test of Major Hypotheses," Comparative Politics 26:3 (April): 253-80.

Blasier, Cole. 1985. The Hovering Giant: U.S. Responses to Revolutionary Change in Latin America, 1910-1985. Rev. ed. Pittsburgh: U Pittsburgh P.

Bollen, Kenneth. 1983. "World System Position, Dependency, and Democracy: The Cross-National Evidence," American Sociological Review 48: 468-79.

-----. 1991. "Political Democracy: Conceptual and Measurement Traps," in Alex Inkeles, ed., On Measuring Democracy: Its Consequences and Concomitants, pp. 3-20. New Brunswick: Transaction.

-----. 1993. "Liberal Democracy: Validity and Sources Biases in Cross-National Measures," American Journal of Political Science 37: 1207-30.

Bollen, Kenneth and Robert Jackman. 1985. "Political Democracy and the Size Distribution of Income," American Sociological Review 50: 438-57.

Bratton, Michael, and Nicolas van de Walle. 1996. "Democratic Experiments in Africa: Testing Competing Explanations of Regime Transitions." Paper prepared for delivery at the 1996 Annual Meeting of the American Political Science Association, the San Francisco Hilton and Towers, August 29-September 1.

Brunk, Gregory C., Gregory A. Caldeira, and Michael S. Lewis-Beck. 1987. "Capitalism, Socialism, and Democracy: An Empirical Inquiry," European Journal of Political Research 15: 459-70.

Burkhart, Ross E. and Michael Lewis-Beck. 1994. "Comparative Democracy: The Economic Development Thesis," American Political Science Review 88:4 (December): 903-910.

Cardoso, Fernando Henrique, and Enzo Faletto. 1971. Dependencia y desarrollo en América Latina. México: Siglo Veintiuno.

Collier, David. 1979. "Overview of the Bureaucratic-Authoritarian Model." In David Collier, ed., The New Authoritarianism in Latin America, pp. 19-32. Princeton: Princeton UP.

Collier, David, and Steven Levitsky. 1997. "Democracy with Adjectives: Conceptual Innovation in Comparative Research." World Politics 49 (April): 430-51.

Collier, Ruth Berins, and David Collier. 1991. Shaping the Political Arena. Princeton: Princeton UP.

Coppedge, Michael. 1997. "Modernization and Thresholds of Democracy: Evidence for a Common Path and Process," in Manus Midlarsky, ed., Inequality and Democracy. New York: Cambridge UP.

Coppedge, Michael and Wolfgang Reinicke. 1990. "A Scale of Polyarchy," Studies in Comparative and International Development 25:1 (Spring): 51-72.

Cutright, Phillips. 1963. "National Political Development: Measurement and Analysis," American Sociological Review 28: 253-64.

Dahl, Robert. 1971. Polyarchy: Participation and Opposition. New Haven: Yale UP.

-----. 1989. Democracy and Its Critics. New Haven: Yale UP.

Dahl, Robert A., and Charles E. Lindblom. 1953. Politics, Economics, and Welfare: Planning and Politico-Economic Systems Resolved into Basic Social Processes. New York: Harper.

Diamond, Larry. 1992. "Economic Development and Democracy Reconsidered," in Gary Marks and Larry Diamond, eds., Reexamining Democracy, pp. 93-139. Newbury Park: SAGE.

-----. 1996. "Is the Third Wave of Democracy Over?" Unpublished excerpts from Developing Democracy: Toward Consolidation. Baltimore: Johns Hopkins UP.

Freedom House. 1991. "Survey Methodology." In Freedom in the World 1990-91, pp. 49-52. New York: Freedom House.

Gasiorowski, Mark J. 1995. "Economic Crisis and Political Regime Change: An Event History Analysis." American Political Science Review 89:4 (December): 882-97.

-----. 1997. "An Overview of the Political Regime Change Dataset." Comparative Political Studies 29:4 (August): 469-83.

Geddes, Barbara. 1997. "Paradigms and Sandcastles: Research Design in Comparative Politics." APSA-CP: Newsletter of the APSA Organized Section in Comparative Politics 8:1 (Winter): 18-20.

Gleditsch, Kristian S., and Michael D. Ward. 1997. "Double Take: A Reexamination of Democracy and Autocracy in Modern Polities." Journal of Conflict Resolution 41:3 (June).

Gonick, Lev S. and Robert M. Rosh. 1988. "The Structural Constraints of the World-Economy on National Political Development," Comparative Political Studies 21: 171-99.

Hadenius, Axel. 1992. Democracy and Development. Cambridge: Cambridge UP.

Haggard, Stephan, and Robert Kaufman. 1997. "The Political Economy of Democratic Transitions." Comparative Politics 29:3 (April): 263-83.

Hannan, Michael T. and Glenn R. Carroll. 1981. "Dynamics of Formal Political Structure: An Event-History Analysis," American Sociological Review 46: 19-35.

Hartlyn, Jonathan, and Arturo Valenzuela. 1994. "Democracy in Latin America Since 1930." In Leslie Bethell, ed., The Cambridge History of Latin America, v. VI: Latin America Since 1930: Economy, Society, and Politics. Cambridge: Cambridge UP.

Hopple, Gerald W., and Jo L. Husbands, eds. 1991. "Assessing Progress Toward Democracy: Summary Report of a Workshop." Panel on Issues in Democratization, Commission on Behavioral and Social Sciences and Education, National Research Council. Washington, DC: National Academy Press.

Huntington, Samuel. 1991. The Third Wave: Democratization in the Late Twentieth Century. Norman: U Oklahoma P.

Inkeles, Alex. 1990. "Introduction: On Measuring Democracy." Studies in Comparative International Development 25:1 (Spring): 3-6.

Jaccard, James, and Choi K. Wan. 1996. LISREL Approaches to Interaction Effects in Multiple Regression. Sage University Paper series on Quantitative Applications in the Social Sciences, series no. 07-114. Beverly Hills and London: Sage.

Jackman, Robert W. 1973. "On the Relation of Economic Development and Democratic Performance," American Journal of Political Science 17: 611-21.

Jaggers, Keith, and Ted Robert Gurr. 1995. "Tracking Democracy's Third Wave with the Polity III Data." Journal of Peace Research 32 (November): 469-82.

Johnson, John J. 1958. Political Change in Latin America: The Emergence of the Middle Sectors. Stanford: Stanford UP.

Karl, Terry Lynn. 1997. The Paradox of Plenty: Oil Booms and Petro-States. Studies in International Political Economy, No. 26. U California P, forthcoming.

King, Gary, Robert O. Keohane, and Sidney Verba. 1994. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton UP.

Lakatos, Imre. 1978. "Falsification and the Methodology of Scientific Research Programmes." In John Worrall and Gregory Currie, eds., The Methodology of Scientific Research Programmes, pp. 8-101. Cambridge: Cambridge UP.

Lerner, Daniel. 1958. The Passing of Traditional Society. New York: Free Press.

Levine, Daniel H. 1994. "Goodbye to Venezuelan Exceptionalism." Journal of Inter-American Studies and World Affairs 36:4 (Winter): 145-82.

Li, R.P.Y. and W.R. Thompson. 1975. "The 'Coup Contagion' Hypothesis," Journal of Conflict Resolution 19: 63-88.

Linz, Juan J. 1970 (republication of a 1964 article). "An Authoritarian Regime: Spain." In Erik Allardt and Stein Rokkan, eds., Mass Politics, pp. 251-83. New York: Free Press.

-----. 1975. "Totalitarian and Authoritarian Regimes." In Fred I. Greenstein and Nelson W. Polsby, eds., Handbook of Political Science, v. 3: Macropolitical Theory, pp. 175-411. Reading, Mass.: Addison-Wesley.

-----. 1994. "Presidential or Parliamentary Democracy: Does It Make a Difference?" in Juan J. Linz and Arturo Valenzuela, eds., The Failure of Presidential Democracy, pp. 3-87. Baltimore: Johns Hopkins UP.

Linz, Juan J. and Alfred Stepan, eds. 1978. The Breakdown of Democratic Regimes. Baltimore: Johns Hopkins UP.

Lipset, Seymour Martin. 1959. "Some Social Requisites of Democracy: Economic Development and Political Legitimacy," American Political Science Review 53 (March): 69-105.

Lipset, Seymour Martin, Kyoung-Ryung Seong, and John Charles Torres. 1993. "A Comparative Analysis of the Social Requisites of Democracy," International Social Science Journal 136 (May): 155-75.

Londregan, John B. and Keith T. Poole. 1996. "Does High Income Promote Democracy?" World Politics 49:1 (October): 1-30.

Long, J. Scott. 1983. Confirmatory Factor Analysis. Sage University Paper series on Quantitative Applications in the Social Sciences, series no. 07-033. Beverly Hills and London: Sage.

Loveman, Brian. 1994. "'Protected Democracies' and Military Guardianship: Political Transitions in Latin America, 1978-1993." Journal of Inter-American Studies and World Affairs 36 (Summer): 105-189.

Lowenthal, Abraham. 1991. "The U.S. and Latin American Democracy: Learning from History." In Lowenthal, ed., Exporting Democracy: The United States and Latin America, pp. 261-83. Baltimore: Johns Hopkins UP.

Mainwaring, Scott. 1993. "Presidentialism, Multipartism, and Democracy: The Difficult Combination," Comparative Political Studies 26 (July): 198-228.

Muller, Edward N. 1988. "Democracy, Economic Development, and Income Inequality," American Sociological Review 53:2 (February): 50-68.

-----. 1995. "Economic Determinants of Democracy," American Sociological Review 60:6 (December): 966-82, and debate with Bollen and Jackman following on pp. 983-96.

Munck, Gerardo L. 1996. "Disaggregating Political Regime: Conceptual Issues in the Study of Democratization." Working Paper No. 228. Notre Dame, Ind.: Kellogg Institute.

O'Donnell, Guillermo A. 1973. Modernization and Bureaucratic-Authoritarianism: Studies in South American Politics. Berkeley: Institute of International Studies, U California, Berkeley.

-----. 1993. "On the State, Democratization, and Some Conceptual Problems: A Latin American View with Glances at Some Post-Communist Countries." World Development 21: 1355-69.

-----. 1994. "Delegative Democracy." Journal of Democracy 5 (April): 57-74.

O'Donnell, Guillermo and Philippe Schmitter. 1986. Transitions from Authoritarian Rule: Tentative Conclusions about Uncertain Transitions. Baltimore: Johns Hopkins UP.

O'Donnell, Guillermo, Philippe Schmitter, and Laurence Whitehead, eds. 1986. Transitions from Authoritarian Rule. Baltimore: Johns Hopkins UP.

O'Loughlin, John, and Michael D. Ward. 1995. "The Spatial and Temporal Diffusion of Democracy, 1815-1994." Abridged proposal submitted to the National Science Foundation.

Popper, Karl R. 1968. The Logic of Scientific Discovery. New York: Harper and Row.

Power, Timothy J., and Mark J. Gasiorowski. 1997. "Institutional Design and Democratic Consolidation in the Third World." Comparative Political Studies 30:2 (April): 123-55.

Przeworski, Adam, Michael Alvarez, José Antonio Cheibub, and Fernando Limongi. 1996. "What Makes Democracies Endure?" Journal of Democracy 7:1 (January): 39-55.

Ragin, Charles. 1987. The Comparative Method: Moving Beyond Qualitative and Quantitative Strategies. Berkeley: U California P.

Remmer, Karen L. 1996. "The Sustainability of Political Democracy: Lessons from South America." Comparative Political Studies 29:6 (December): 611-34.

Rueschemeyer, Dietrich. 1991. "Different Methods, Contradictory Results? Research on Development and Democracy," International Journal of Comparative Sociology 32:1-2: 9-38.

Rueschemeyer, Dietrich, John D. Stephens, and Evelyne Huber Stephens. 1992. Capitalist Development and Democracy. U Chicago P.

Rustow, Dankwart. 1970. "Transitions to Democracy," Comparative Politics 2: 337-63.

Sartori, Giovanni. 1973. Democratic Theory. Westport, Conn.: Greenwood Press.

-----. 1987. The Theory of Democracy Revisited. Chatham, N.J.: Chatham House.

Schmitter, Philippe C., and Terry Lynn Karl. 1991. "What Democracy Is . . . and Is Not." Journal of Democracy 2 (Summer): 75-88.

Schumpeter, Joseph A. 1942. Capitalism, Socialism, and Democracy. New York and London: Harper and Brothers.

Starr, Harvey. 1991. "Democratic Dominoes: Diffusion Approaches to the Spread of Democracy in the International System," Journal of Conflict Resolution 35:2 (June): 356-381.

Stepan, Alfred. 1971. The Military in Politics: Changing Patterns in Brazil. Princeton: Princeton UP.

Stepan, Alfred and Cindy Skach. 1993. "Constitutional Frameworks and Democratic Consolidation: Parliamentarism and Presidentialism," World Politics 46 (October): 1-22.

Vanhanen, Tatu. 1990. The Process of Democratization: A Comparative Study of 147 States, 1980-88. New York: Crane Russak.

Wiarda, Howard J. 1981. Corporatism and National Development in Latin America. Boulder, Colo.: Westview.