College of Business Working Paper Series
Exploration and Exploitation Revisited:
Extending March’s Model of Mutual Learning
Simon Rodan
Department of Organization and Management
College of Business, San José State University
Pre-print of an article in the Scandinavian Journal of Management, 2005, Vol 22: 407-28
Keywords: Organizational Learning; Simulation
OM-05-003
I am immensely indebted to James G. March for his comments and encouragement on this and earlier versions of the paper, and
for so generously sharing with me the code for his original simulation. Thanks are due also to Charles Galunic, Anne Lawrence,
Woody Powell, Peter Moran and the anonymous reviewers for the Academy of Management and the Scandinavian Journal of
Management for their comments on earlier drafts and to Nancy Adler for her help with my writing. And as always, to Judith for
her devotion and support. All remaining errors are mine.
Exploration and Exploitation Revisited
Abstract
A system of actors, appropriately organized, is able to learn even in situations where individuals
in isolation cannot. This was one of the most important, though seldom emphasized, insights of
March’s paper, “Exploration and Exploitation in Organizational Learning” (1991). The present
paper builds on March’s original simulation and incorporates a number of different real-world
organizational features. The results suggest that unconstrained experimentation is of great benefit
to organizational learning, although it should not be carried to excess. Low levels of turnover in
personnel are beneficial and mitigate the problem of high socialization March noted in 1991.
Inclusion in the policy-making elite should be predicated on performance rather than seniority
and on shorter rather than longer individual performance histories, particularly when
environments are changing rapidly. Finally, erring on the side of stringency in selecting members
of the organization for the policy-making elite is better than erring towards laxity.
Introduction
An important, albeit seldom emphasized, insight of March’s paper, “Exploration and Exploitation
in Organizational Learning” (1991), is that individuals participating in an appropriately
organized system are able to learn even where they could not do so in isolation. The present paper takes
as its starting-point the general principles of March’s conceptualization of a collective learning
system and links them with work from the domains of human resources and strategic
management, to ask the question: How do certain organizational policies affect organizations
conceptualized as mutual learning systems? Since the learning system March described in 1991
is in essence evolutionary, the organizational policies selected for investigation here were those
likely to impact learning through their role in variation (exploration), and in selection and
retention (exploitation).
I define learning here, rather simply, as ‘the acquisition of useful knowledge’ and vicarious
learning as the acquisition of useful knowledge from others rather than through direct
experience. Organizations provide a context in which vicarious learning is facilitated and
encouraged. Indeed, it has been suggested that it is their knowledge-sharing properties that
account for their existence (e.g., Conner & Prahalad, 1996; Grant, 1996). One way in which
organizations disseminate knowledge among their members is through routines and standard
operating procedures (March & Simon, 1958; Cyert & March, 1963; Levitt & March, 1988). As
Levitt and March note: “The experiential lessons of history are captured by routines in a way that
makes the lessons, but not the history, available to organizational members who have not
themselves experienced the history” (1988, p 320). There is a relatively long tradition of
considering organizations as learning systems, and as repositories and conduits of knowledge.
While Barnard (1938) notes the organization’s utility in achieving ends that require cooperation,
he also suggests, as Galbraith (1974) and Egelhoff (1982) did later, that organizational structure
arises from the need to pass information efficiently. Operations management has a rich literature
dealing with learning in organizations (e.g., Epple, Argote & Devadas, 1991; Argote, 1999;
Argote & Darr, 2000; Argote, Ingram, Levine & Moreland, 2000; Argote, McEvily & Reagans,
2003). The role of routines as a means of holding and disseminating knowledge throughout the
organization has been examined by March & Simon (1958) and Cyert & March (1963). Many
studies have followed in this vein, dealing specifically with organizational routines (e.g.,
Nelson & Winter, 1973; Levinthal & March, 1981;
Lounamaa & March, 1987; Nelson, 1987; Winter, 1987; Levitt & March, 1988; March, 1988;
Levinthal & March, 1993; Miner, 1994; March, Schulz & Zhao, 2000). During the 1990s,
knowledge and its role in the firm was the focus of much activity (Kogut & Zander, 1992; Kogut
& Zander, 1993; Nonaka & Takeuchi, 1995; Grant, 1996; Grant, 1996; Galunic & Rodan, 1998),
and there was renewed interest in the topic of organizations as learning systems (Cohen &
Levinthal, 1990; Cohen, 1991; Levinthal, 1991; March, 1991; March, Sproull & Tamuz, 1991;
Simon, 1991; Lant & Mezias, 1992; Levinthal & March, 1993; Bruderer & Singh, 1996).
Individual experiential learning relies on the temporal or spatial proximity of stimuli that are
potentially causally related (Bullock, Gelman & Baillargeon, 1982; Fiske & Taylor, 1991).
Experiential learning involves considering the outcomes of many trials over time and selecting
the action that yields an outcome closest to a desired goal (March & Simon, 1958). However, in
many situations, clear correlations between cause and effect are hard to detect. When
environments are complex and much is changing simultaneously, the links between actions and
outcomes are often ambiguous (Lounamaa & March, 1987; Levitt & March, 1988; Levinthal,
1991; Levinthal & March, 1993). Yet March (1991) showed that learning is possible, even where
considerable causal ambiguity exists, if individuals are part of an organized system. Although
March (1991) has been criticized for presenting an overly narrow and stylized view of
organizations, the strength of his original conception lies in its general insight about collective
learning in ambiguous settings, regardless of the specifics of its implementation.
March’s model is a useful starting-point for further theorizing about organizational learning
because it presents a mechanism whereby collectives can learn in situations where individuals on
their own cannot. Building on this model, the present paper speculates about a variety of
individual- and organizational-level processes that affect variation (exploration) and selection
(exploitation) of beliefs – processes that may therefore have an impact on organizational
learning. Two classes of variance-inducing mechanisms are considered. The first centers on the
propensity of individuals to experiment and the influence of two different forms of restraint on
experimenting, one organizational and the other individual. The second is turnover in
organizational membership. Alternative selection mechanisms considered here include the use of
tenure rather than performance as the criterion for promotion to the organization’s policy-making
elite, the stringency of the entry requirements to that group, and the extent to which a person’s
cumulative performance or ‘track record’ rather than their most recent performance is used as the
yardstick for promotion decisions. The paper is organized as follows: the next section presents
the theory and sets out some propositions. The methods section then explains how the simulation
was implemented. After a presentation of the results, there follows a discussion and some
conclusions.
Theory
The basic mechanism of the model of mutual learning created by March in 1991 is evolutionary,
depending on variation in beliefs about the environment across actors and time (exploration), and
selecting and retaining the most accurate knowledge (exploitation). Drawing on the experiences
of others throughout the organization means that individuals have a larger set of trials on which
to draw and need to rely less on their own personal experience. To illustrate this, March
constructed a model in which individuals learned only from others and never directly from their
own experience (1991). The organizational processes considered below are some that are likely
to influence either variation (exploration) or selection/retention (exploitation).
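
The mechanism just described can be made concrete with a minimal sketch of one period of mutual learning. It is an illustration only, not March's original code or the implementation used in this paper. It assumes the general outline of March (1991): reality and beliefs are vectors over {-1, 0, 1}, individuals are socialized toward an organizational code, and the code learns from members who currently know more than it does. The names `score`, `step` and `run`, the parameters `p1` (socialization rate) and `p2` (code learning rate), and the simplified majority rule for updating the code are all assumptions introduced here.

```python
import random

def score(beliefs, reality):
    # Number of dimensions on which a belief vector matches reality.
    return sum(b == r for b, r in zip(beliefs, reality))

def step(people, code, reality, p1, p2, rng):
    """One period of mutual learning (simplified; update rules are assumptions)."""
    # Exploitation by individuals: where the code holds a belief (non-zero)
    # that differs from the individual's, adopt it with probability p1.
    new_people = [
        [c if c != 0 and c != b and rng.random() < p1 else b
         for b, c in zip(person, code)]
        for person in people
    ]
    # Exploitation by the organization: the code learns only from members
    # whose beliefs currently match reality better than the code does.
    code_score = score(code, reality)
    superior = [p for p in people if score(p, reality) > code_score]
    new_code = list(code)
    for d in range(len(reality)):
        net = sum(p[d] for p in superior)      # net vote on dimension d
        dominant = (net > 0) - (net < 0)       # sign of the majority view
        k = abs(net)                           # size of the net majority
        if dominant != 0 and dominant != code[d]:
            # A larger majority makes adoption more likely.
            if rng.random() < 1 - (1 - p2) ** k:
                new_code[d] = dominant
    return new_people, new_code

def run(n=50, m=30, p1=0.5, p2=0.5, periods=40, seed=0):
    """Return the average share of reality known by individuals at the end."""
    rng = random.Random(seed)
    reality = [rng.choice((-1, 1)) for _ in range(m)]
    people = [[rng.choice((-1, 0, 1)) for _ in range(m)] for _ in range(n)]
    code = [0] * m                             # the code starts with no beliefs
    for _ in range(periods):
        people, code = step(people, code, reality, p1, p2, rng)
    return sum(score(p, reality) for p in people) / (n * m)
```

Note that individuals in this sketch never consult reality directly; `score` is used only by the organization to identify its better-informed members, which mirrors the idea that individuals learn from others rather than from their own experience.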
Exploitation is about making best use of what we already know. If we can avoid the mistakes
others have made in the past, we can achieve our ends faster and at less cost. Exploitation of
current knowledge includes best-practice transfer and vicarious learning from those who seem to
have more knowledge than we do. Exploitation of an organization’s knowledge leads to a
convergence in beliefs: excessive exploitation can result in premature consensus (Levitt &
March, 1988). Exploration mitigates this by reintroducing variation into the system. Variation is
essential to any evolutionary process, but the key—as March noted—is striking a balance
between exploitation and exploration.
Variation-producing processes
Experimentation
Experimentation typically involves making choices when outcomes are unpredictable. This
implies that choices, however well intentioned and considered, are ultimately indistinguishable
from a random selection. Random choices constitute a source of variation. Such choices may be
thought of as representing stochastic alterations in individuals’ underlying beliefs. If collective
learning involves exploiting the knowledge of the most knowledgeable members of the
organization, beliefs will converge on those of the organization’s single most knowledgeable
member. Since this individual has no one whose knowledge he or she can exploit, absent any
random alterations in his or her beliefs, the final level of knowledge attainable across the
organization can be no higher than that of its most knowledgeable individual at inception. To
learn more, there must be some means for members to ‘leapfrog’ the current most
knowledgeable individual. Experimentation provides a way in which this can happen.
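
The ceiling argument above can be illustrated with a small sketch. It is a deliberately stylized setting, not the simulation used in this paper: every member copies the current best-informed member wholesale and then experiments. The redraw rule in `experiment` (each belief is replaced by a uniform draw from {-1, 0, 1} with probability `rate`) and the function names are assumptions made for illustration.

```python
import random

def score(beliefs, reality):
    # Number of dimensions on which a belief vector matches reality.
    return sum(b == r for b, r in zip(beliefs, reality))

def experiment(beliefs, rate, rng):
    # Assumed rule: each belief is redrawn at random with probability `rate`.
    return [rng.choice((-1, 0, 1)) if rng.random() < rate else b
            for b in beliefs]

def best_score_history(people, reality, rate, periods, rng):
    """Everyone copies the current best member, then experiments.

    Returns the best score in the population at the start of each period.
    """
    history = []
    for _ in range(periods):
        best = max(people, key=lambda p: score(p, reality))
        history.append(score(best, reality))
        people = [experiment(best, rate, rng) for _ in people]
    return history
```

With `rate=0` the returned history stays flat at the initial best score, since there is no one left to learn from. With a positive rate the best score can rise above that ceiling, and it can also fall when accurate beliefs are discarded.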
Experimentation captures such things as risk-taking (March, 1991), guessing and foolishness
(March, 1988). It allows the most knowledgeable person at a particular moment to be displaced
by someone who, perhaps by chance or trial and error, has happened on a more accurate set of
beliefs about the environment. While exploitation increases efficiency, it makes organizations
vulnerable to environmental change by driving out variation which might serve as the basis for
successful adaptation (Hannan & Freeman, 1977). Experimentation ensures that an
organization’s learning is not limited to the knowledge of the best-informed individual at its
inception. Constant experimentation, on the other hand, may be as bad as no experimentation,
since it makes no use of prior knowledge. Thus, questions that emerge are: How much
experimentation should organizations undertake? Should experimentation be controlled and if so,
how can this best be done?
Some organizations discourage experimentation by inadequately rewarding successful attempts
at innovation and routinely punishing failures. In contrast, others—Microsoft being one
example—actively encourage experimentation (Theilen, 1999, p 52). Moreover, not all people
are alike in their willingness to experiment. Some people feel comfortable making decisions with
relatively little information while others may prefer to proceed more cautiously, waiting for the
organization to which they belong to offer some guidance. Altering both individuals’
predisposition to experiment and the constraints imposed on their experimental endeavors will
influence the rate at which variation is introduced to the system, and will thus have an effect on
organizational learning.
In the next three subsections I consider three different aspects of experimentation. I look first at
the effect of varying the rate of experimentation. Next, I consider how organizational policy
might influence experimentation by indicating the domains in which it is sanctioned. Lastly, I
explore an approach to limiting excessive experimentation that depends on individual judgment.
Unconstrained experimentation or ‘Foolishness’
March has described experimentation as action that is “‘irrational,’ ‘out of character’ or
‘foolish’” (March 1994, p 263). Weick notes that “An ambivalent stance towards past wisdom
makes adaptive sense”; without testing our surroundings, we become prisoners to our
assumptions (1979, p 7). Absent experimentation, there is insufficient variation and inadequate
testing of the environment. The only variation in beliefs will be that present at the beginning of
the organization’s life, before socialization causes members’ views to converge. In a static
environment this may not matter, as the organization will learn by selecting from the early trials
of the initially heterogeneous population. However, in a changing world, insufficient
experimentation will have severe consequences. As views converge and heterogeneity disappears
the organization will soon find itself with an insufficient variety of beliefs from which to
generate useful adjustments to its store of knowledge. Raising the rate of experimentation will
increase the variation in beliefs among individuals in the system. The more diverse the beliefs in
the organization, the higher the likelihood that, in a given period, a few individuals will have
beliefs that more accurately reflect the state of the environment than do the beliefs of the
organization as a whole. Their knowledge is exploited as it is passed on to the rest of the
organization through the organization’s rules and standard operating procedures. However, in
excess, unconstrained experimentation will have a negative effect on learning; the more
frequently individuals experiment, the more often they will discard accurate beliefs. One should
therefore expect an inverted U-shaped relationship between experimentation and learning, and an
optimal level of experimentation that provides sufficient variation without discarding existing
knowledge.
Proposition 1:
Organizational learning will exhibit an ‘inverted U-shaped’ relationship with
the amount of unconstrained individual experimentation.
Organizationally constrained experimentation
Experimentation is often directed and constrained by the organization. It may be steered in
particular directions by means of rules, incentives and organizational values. For example,
Bower (1970) describes how firms create a strategic context through resource allocation
decisions that serve as a guide to behavior, encouraging individuals’ efforts in certain directions
and prodding them away from others. The organization imposes a ‘global rationality’ on
experimentation—it being ‘rational’ to experiment only in domains where knowledge may not be
regarded as reliable and mature. This maximizes the exploitation of existing organizational
knowledge. In contrast to unconstrained experimentation, experimentation guided by the
organization’s strategic context will not be so excessive that the organization discards
knowledge it has already acquired and so fails to learn.
If experimentation is only sanctioned in areas where the organization has not reached a collective
consensus, most experimentation will likely occur at an early stage in the organization’s life. As
the organization develops a set of recommendations in its routines and standard operating
procedures that enable its growing body of knowledge to be exploited, the scope for
experimentation will decline and may ultimately disappear altogether. Moreover, even if some
experimentation does occur, because it will be limited to organizationally sanctioned domains,
all members will be exploring the same aspects of the environment and neglecting the same
issues—those where a consensus has been reached—leaving certain parts of the environment
untested. Such constraint limits the usefulness of experimental activity and there may never be a
complete testing of the environment.
Proposition 2:
Constrained experimentation will exhibit a monotonically increasing
relationship with organizational learning but at no level will it be as effective
as the best level of unconstrained experimentation.
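
One way to express organizationally constrained experimentation in code is sketched below. It is a hedged illustration rather than the paper's implementation. It assumes that the organization's recommendations are held in a vector `code` whose entries are -1, 0 or 1, and that an entry of 0 means no consensus has yet been reached on that dimension; the function names and the uniform redraw rule are likewise assumptions.

```python
import random

def constrained_experiment(beliefs, code, rate, rng):
    # Experiment only on dimensions where the code holds no belief (0),
    # taken here to mean that no organizational consensus exists yet.
    return [
        rng.choice((-1, 0, 1)) if c == 0 and rng.random() < rate else b
        for b, c in zip(beliefs, code)
    ]

def sanctioned_share(code):
    # Share of dimensions on which experimentation is still permitted.
    return sum(c == 0 for c in code) / len(code)
```

As the code fills in, `sanctioned_share` falls toward zero and `constrained_experiment` leaves beliefs untouched whatever the value of `rate`, which corresponds to the declining scope for experimentation described above.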
Rational self-restrained experimentation
In contrast to Bower’s view of organizational activity, Burgelman (1988) described how
individuals often step outside the bounds of the strategic context; ‘autonomous strategic
behavior’ results in many small deviations from the firm’s strategic direction. Some of these may
prove fruitless. Others, such as Intel’s initially un-strategic foray into microprocessors, turn out
to be so important as to cause a sea-change in the company’s strategy.
Self-restrained experimentation lies somewhere in between foolishness and organizationally
constrained experimentation. Individuals exhibit self-control in their choice of domains in which
to experiment while ignoring the guidance of the organization. Individuals with such ‘local
rationality’ experiment only on issues about which they, rather than the organization, consider