Regression Analysis and the Philosophy of Social Sciences --
a Critical Realist View
University of Minnesota
December 20, 1999
* email@example.com. I would like to thank Professor James Farr and
Professor John Freeman for their helpful advice, comments and criticism.
An earlier version of this paper was presented at the Third International
Conference of the Center for Critical Realism, Örebro University, August
This paper challenges the connection conventionally made between regression analysis
and the empiricist philosophy of science and offers an alternative explication for the way
regression analysis is being practiced. The alternative explication is based on critical realism, a
competing approach to empiricism in the field of philosophy of science. The paper argues that
critical realism can better explicate the way in which scientists ‘play’ with the data as part of the
process of inquiry. The practice of regression analysis is understood by the critical realist
explication as a post hoc attempt to identify a restricted closed system. The gist of successful
regression analysis is not being able to offer a law-like statement but to bring forth evidence of
an otherwise hidden mechanism. Through the study of methodological debates regarding regression
analysis, it is argued that critical realism can offer conceptual tools for better understanding the
core issues that are at stake in these debates.
The procedure of regression analysis is conventionally considered as an exemplar of the
positivist empiricist approach to research in political science. Those who use the procedure are
forced to defend empiricism with the entire philosophical burden that it carries; and those who
attack the empiricist philosophy of science condemn the prevalence of this statistical procedure
in the field. Very few, however, challenge the connection that was established between the
procedure and the empiricist philosophy.
By the term regression analysis, I refer to various mathematical methods that aggregate
observations into a form in which a dependent variable is a mathematical function of independent
variables (y = f(x1,x2,…xn)), often in a way that allows a statistical inference regarding the
parameters of the function outside the specific sample. Thus, my argument covers not only the
methods that are based on least-squares estimation, but also maximum likelihood estimation and
Bayesian inference. The question, then, is how this mathematical function is part of the overall
project of increasing scientific knowledge. The answer to such a question requires not only a
technical understanding of the mathematical procedures, but also an explication of the meaning
of scientific laws.
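To make the definition above concrete, the following is a minimal sketch of its simplest special case: a single independent variable, a linear function, and least-squares estimation. The linear form, the function name `ols_fit`, and the four observations are assumptions made for illustration only; they do not come from any of the studies discussed in this paper.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals
    for the linear special case y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations, invented for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

intercept, slope = ols_fit(xs, ys)
print("intercept:", intercept, "slope:", slope)
```

The procedure itself only aggregates the observations into the form y = f(x). What the estimated function is taken to mean, a general law or evidence of a mechanism, is the question of explication that the rest of the paper addresses.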
I will use the term “empiricism” to describe the interpretation of the law-like relation as
a Hempelian general law.1 According to this interpretation, regression analysis is a useful tool
for establishing such laws in the social sciences. However, the conventional wisdom continues,
social life is complex and for various reasons it is never possible to satisfy the demanding
conditions of the mathematical theorems, and hence our ability to identify laws is always
problematic and questionable. This view appears in textbooks as the only possible interpretation
of the procedure. Thus, the practitioner of regression analysis who follows the textbook might
unnecessarily subscribe herself to the problems of empiricism: the demand for operational
definitions, the assumption of a world ruled by laws, insufficient place for agency, lack of
sensitivity to the problems of interpretation, and a limited place for emancipatory practice.
1 I use the term empiricism following Bhaskar (1975). It is used here in a different sense from
the way it is used in the rationalist-empiricist debate on the possible sources of knowledge. I
prefer the term empiricism to the more commonly used term positivism, since the definition of
empiricism used here is broader than what is conventionally included under the term positivism.
In this paper, I argue for an alternative philosophical framework to interpret more
adequately the way regression analysis is actually used by political scientists. This alternative is
based on Critical Realism, a school in the philosophy of science that is becoming increasingly
influential, particularly in Anglo-American social theory (Isaac, 1990: 2). In particular, I develop
a version of Critical Realism first articulated by Roy Bhaskar (1975, 1979). According to this
view, laws should not be understood as descriptions of constant conjunctions of events but as
tendencies of "powerful particulars" (Harré & Madden, 1975: 5). Thus, when a scientist cites a
law, she is describing a property of a thing, and not trying to predict in a specific circumstance if
the thing will behave in a specific way.
The empiricist view holds, among other things, that it is very difficult to conduct
experiments in the social sciences. On this view, procedures like regression analysis can therefore be used to
identify laws based on statistical analysis of passive observations. Through regression analysis,
the scientist can control for all the effects that govern real-life phenomena and identify the
best way to describe the relation between the observations. I claim, to the contrary, that in
using the procedure of regression analysis, scientists are trying to identify situations in which it is
possible to observe the activity of a mechanism. When a scientist offers a regression equation,
she does not necessarily mean that the whole model or part of it approximates a universal general
law. Instead, she argues, at least implicitly, that she was able to demonstrate the activity of a
mechanism that could not be observed from the data alone. The gist of successful regression
analysis is not being able to offer a law-like statement but to bring forth evidence of an otherwise
hidden mechanism.
The account of regression analysis that I offer here is a matter of "explication."
Explication entails the analysis of a given notion, used intuitively by scientists, in order to
provide it with coherent philosophical foundations. The debate over the meaning of the notion of
scientific explanation is itself a famous example of explication (Salmon, 1990). In this essay,
therefore, I am offering an alternative explication of the notion of regression analysis. My
strategy is not to proceed by immanent critique, that is, to point out internal contradictions in the
empiricist interpretation. Instead, I offer an alternative explication, which better represents
regression analysis as actually practiced by social and political scientists. Thus, my explication
is directed to the scientists who practice regression and not only the philosophers who examine
the adequacy of arguments.
Explication is never only descriptive; it is always normative as well. That is, from the
moment that the philosopher establishes a coherent explication, it can become
a standard for evaluating scientific works. Hempel, for example, uses his explication
of the notion 'scientific explanation' to determine that some notions of explanation (such as
functional explanations) are not truly explanatory. In this paper, I focus only on establishing a
convincing alternative to the empiricist interpretation. Thus, instead of criticizing
works that use regression for being empiricist, I offer a realist reconstruction of these works.2
The structure of my argument is as follows. In the next section, I offer a brief
description of critical realism as an alternative to the dominant tradition of empiricism in the
philosophy of science. It is far beyond the scope of this essay to offer a comprehensive
discussion of critical realism in general or Bhaskar's version in particular.3 Instead, I focus
briefly only on those aspects of critical realism which are relevant for the argument presented in
this paper. The main point to be taken from this section is that we can and should think about
scientific laws in a way that is not related to relations between observations. Put differently,
goodness of fit is not a defining characteristic of a scientific law. It is important to emphasize that
critical realism does not call for a new type of scientific law. Instead, it argues that the notion,
as used by scientists and properly explicated, refers to tendencies, not constant conjunctions of
events.
2 For an argument on the reconstructive potential of realism see Isaac, 1990. Collier (1994,
chapter 7) offers examples of such reconstruction.
This point is elaborated in the second section, in which I present and contrast an
empiricist and a realist explication of regression analysis. It is important to understand the nature
of the argument I present. My starting point is regression analysis as practiced by political
scientists. I do not criticize the practice itself or its appropriateness for social sciences. Instead, I
assume that this practice is appropriate, and then ask what philosophical foundations can best
explicate it. Next, I look at the explication offered by some important textbooks and
methodological works on regression analysis. I argue that they implicitly rely on an empiricist
philosophy of science.4
My argument must be further clarified at the outset. Even though some econometrics
texts, by adopting the terms "true models" and "true betas," present regression analysis as a
method to reveal the general laws that govern social interactions, I do not think that most
political scientists understand regression analysis in this way. Nevertheless, they do accept the
empiricist definition of a scientific law as a deductive claim that allows knowledge of the
dependent variable from the independent variables. Thus, they think that compared to the natural
sciences we can make only more circumscribed and qualitative formulations of laws, but
nonetheless in the same formal representation, as a function of the form y = f(xn).
3 For such a discussion see Collier, 1994; Outhwaite, 1987; Isaac, 1990; and Archer et al., 1998.
4 Here I discuss these works in general; I give particular examples and references in the second
section.
To see how this is achieved it is necessary to understand the roots of the empiricist
philosophy of science. Empiricism belongs to the analytic tradition in the philosophy of science
(Gunnell, 1969). This tradition tries to establish a set of logical criteria that both describe
scientific activity and justify its logical validity. It does so, in short, by describing scientists in
terms that are borrowed from logic. As a footnote, it is necessary to mention that empiricism
has so far not been able to establish such a criterion. It only claims that its goal is to establish
one, which will be identical for all sciences. The empiricist explication of regression analysis
rests on the claim that goodness of fit, together with additional criteria, which I discuss in detail
later, can offer a formal criterion for deciding whether a specific regression equation qualifies as a
The immediate response to the above claim is that without goodness of fit as a formal
criterion we lose the scientific nature of the inquiry. If we do not look at the relations between
our models and the data, why collect data in the first place? My argument in the second part of
the section is that a realist perspective can defend the scientific character of regression analysis
without relying on goodness of fit as a formal criterion for establishing laws. In this view,
goodness of fit is required only to ensure that the specification being used indeed makes the
activity of the mechanism in question observable.
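The realist reading of goodness of fit can be illustrated with a toy simulation. It is only a sketch under stated assumptions: the data are generated artificially, the mechanism is given a hypothetical tendency of 2.0, a second hypothetical mechanism interferes in half of the cases, and the helper names (`ols_fit`, `r_squared`, `fit_and_score`) are invented for this example. None of these numbers comes from the studies discussed in this paper.

```python
import random


def ols_fit(xs, ys):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance of y accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)   # fixed seed so the sketch is reproducible
TENDENCY = 2.0           # hypothetical tendency of the mechanism of interest
COUNTER = 4.0            # hypothetical strength of an interfering mechanism

data = []
for i in range(200):
    x = rng.uniform(0.0, 10.0)
    interfering = (i % 2 == 1)               # second mechanism active in half the cases
    y = TENDENCY * x + rng.gauss(0.0, 0.5)   # the mechanism of interest is always at work
    if interfering:
        y -= COUNTER * x                     # outcome co-determined by the other mechanism
    data.append((x, y, interfering))


def fit_and_score(rows):
    """Fit the line on the given rows and return (slope, R^2)."""
    xs = [x for x, _, _ in rows]
    ys = [y for _, y, _ in rows]
    intercept, slope = ols_fit(xs, ys)
    return slope, r_squared(xs, ys, intercept, slope)


# Open system: all cases pooled. Restricted system: interference absent.
slope_open, r2_open = fit_and_score(data)
slope_closed, r2_closed = fit_and_score([row for row in data if not row[2]])

print("open system:       slope =", round(slope_open, 2), " R^2 =", round(r2_open, 3))
print("restricted system: slope =", round(slope_closed, 2), " R^2 =", round(r2_closed, 3))
```

In this sketch the mechanism operates in every case, yet the pooled regression fits poorly because outcomes are co-determined by the second mechanism. The good fit in the restricted specification does not establish a general law. It indicates only that the specification isolates conditions under which the activity of the mechanism can be observed.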
5 There is a debate between methodologists regarding the proper measure of goodness of fit, but
underlying this debate is an agreement that a formal criterion of goodness of fit is necessary to
establish explanation (see King, 1986, 1991; Luskin, 1991).
In the third section, I focus on one theoretical debate and show how the realist
interpretation can be used to reformulate the issues that are at stake in the debate. I focus on the
'Norwegian Exceptionalism' debate in the field of comparative political economy not only
because of its theoretical importance but also because of the methodological issues that it
consciously raises (Lange & Garrett, 1985, 1986, 1987; Jackman, 1987, 1989). I argue that the
debate is not properly about which model produces the best goodness of fit or which model better
explains the dependent variable (in this case economic growth). Rather, the question is whether
the specification used by Lange and Garrett is sufficient to demonstrate the mechanism for which
they argue. I conclude with the implications of my argument for the use of regression analysis and
for broader questions in the philosophy of science.
Philosophy of science cannot and does not pretend to replace the practice of scientists.
Realism claims that in their practice scientists attempt to discover the real mechanisms of social
structures. This, however, does not mean that every social structure offered by scientists is real,
or even that we can have secure grounds for believing that any particular structure is real. The
work of scientists is best understood as an attempt to demonstrate the reality of otherwise
hypothetical structures. Realism does not make the work of scientists easier but offers coherent
philosophical grounds for their activity.
2. Two Philosophies of Science:
This section presents critical realism as an alternative to the dominant empiricist
philosophy of science. Empiricism is the philosophical tradition that understands scientific laws
as a description of relations between sense observations expressed as constant conjunctions of
events. This understanding of science is shared not only by those advocating a naturalist
approach to social sciences but also by those who oppose it. Many scholars who see themselves
as anti-naturalists and emphasize the difference between the study of nature and the study of
society in fact share the empiricist view of laws in natural science but reject the possibility of
applying scientific laws to the realm of society (Bhaskar, 1979: 2-3). Thus, despite the
disagreements between the naturalist and the anti-naturalist traditions, they both rest upon the
same definition of scientific laws. Critical realism challenges the empiricist view of science and
laws, and offers a different understanding of scientific activity.
My interest in this paper is the applicability of realism and not any possible
inconsistencies of empiricism. For this reason, I will not go into detail about possible
philosophical inconsistencies in the empiricist or realist philosophy of science. Instead, I argue
that even if empiricism can offer a systematic model of scientific activity, lacking any logical
contradiction, it fails to explicate the way political scientists use regression analysis. Thus, in
the discussion I linger less on the logical structure of both philosophies of science and more on
the way that they explicate the activities of scientists.
According to Critical Realism, “the world consists of things, not events” and thus,
science is "concerned essentially with what kinds of things they are and with what they tend to
do; it is only derivatively concerned with predicting what is actually going to happen" (Bhaskar,
1975: 51). Scientific laws describe tendencies that things have in virtue of their internal
structure. The predicate of scientific laws is not observations but real tendencies of things.
"Thus," according to Bhaskar, "in citing a law one is referring to the transfactual activity of
mechanisms, that is to their activity as such, not making a claim about the actual outcome (which
in general will be co-determined by the activity of other mechanisms too)" (Bhaskar, 1979: 12).
Does it make any difference which definition of scientific laws we choose? To answer
this question it is necessary to discuss the definition of open and closed systems. A closed
system is a system where a constant conjunction of events obtains. In a closed system, as a matter
of definition, the formula 'whenever this, then that' applies (Bhaskar, 1975: 69).6 Empiricism
assumes that a "universal closure" obtains. This means that all events in the world can be
described under the formula 'whenever this, then that,' or at least that science can be applied only
to that part of the world which is a closed system. This is a metaphysical assumption that cannot
be corroborated empirically. One of the main challenges of the empiricist philosophy of science
is to offer an analytical distinction between law-like statements and real scientific laws, because
the two are similar in form.
Distinguishing between law-like statements and laws is not an easy task for
philosophers. Not surprisingly, it is not an easy one for scientists either. Law-like statements,
even those that are considered as corroborated by science, are often found inadequate in real life
situations. In such situations empiricists usually adopt one of the following strategies. First, it
can be claimed that the law-like statement is only an accidental generalization.7 For example, if
third parties began to appear in simple-majority simple-ballot systems, it would be
possible to argue that "Duverger's law" no longer holds.8 Second, the empiricist can use
the ceteris paribus clause and claim that the law holds only if certain conditions apply or that it is
only a probabilistic law. Third, it can be claimed that the law-like statement is not complete.
Either a relevant variable was omitted from the formulation, or a complex unit has to be
separated into more basic ones. Duverger's law, according to such a strategy, holds in all cases
6 More precisely, "For every event y there is an event x or set of events x1...xn such that x or
x1...xn and y are regularly conjoined under some set of descriptions" (Bhaskar, 1975: 69).
7 It can also be argued that the law-like statement was once a law but is no longer so (as in
historically bounded laws; see Ball, 1972, for a critical discussion of the concept).
8 According to Duverger's law, plurality election rules bring about and maintain two-party
competition (see Riker, 1976).