Applying Cognitive Theory to Statistics Instruction
Marsha C. LOVETT and Joel B. GREENHOUSE

Marsha C. Lovett is Research Scientist, Center for Innovation in Learning, Carnegie Mellon University, Pittsburgh, PA 15213 (E-mail: lovett@cmu.edu). Joel B. Greenhouse is Professor of Statistics and Associate Dean in the College of Humanities and Social Sciences, Carnegie Mellon University, Pittsburgh, PA 15213 (E-mail: firstname.lastname@example.org). This work was partially supported by grants 97-20354 and 98-19950 from the National Science Foundation and by grant MH-30915 from the National Institutes of Health. The authors thank the associate editor, two anonymous reviewers, and David Moore for helpful suggestions and comments on this article.

© 2000 American Statistical Association
The American Statistician, August 2000, Vol. 54, No. 3

This article presents five principles of learning, derived from cognitive theory and supported by empirical results in cognitive psychology. To bridge the gap between theory and practice, each of these principles is transformed into a practical guideline and exemplified in a real teaching context. It is argued that this approach of putting cognitive theory into practice can offer several benefits to statistics education: a means for explaining and understanding why reform efforts work; a set of guidelines that can help instructors make well-informed design decisions when implementing these reforms; and a framework for generating new and effective instructional innovations.

KEY WORDS: Instructional technique; Pedagogy; Sta-

1. INTRODUCTION

At the heart of the reform movements in mathematics and science education lies the question: How can students’ learning be improved? During the last 20 years, reform movements in math and science education have made new strides to address this question. For example, in elementary and secondary mathematics instruction, there has been a shift toward teaching mathematics in the context of real-world problems so that students can see its usefulness in concrete and familiar situations (NCTM 1989). In calculus instruction, there is a new (and still controversial) interest in off-loading the burden of calculation to technology so that students can focus more on learning the conceptual issues at hand (e.g., Tucker and Leitzel 1995). Similarly, in physics education, qualitative understanding of fundamental principles has risen in importance relative to quantitative equation solving (e.g., McDermott 1984). A common goal of these educational reform movements is to promulgate new instructional techniques that will be effective in the classroom.

During the same time period, research in cognitive psychology has also addressed the question of how learning can be improved. Here, learning is defined in similar terms, as a change in knowledge and skills that enables new and different kinds of performance, but the question of how to improve it is addressed from a different perspective. In cognitive psychology, the focus is on understanding the basic phenomena related to learning (for example, memory, skill acquisition, and problem solving). The goal is to acquire a body of empirical results that characterizes these phenomena and to develop precise theories that explain and predict observed data. In particular, one line of inquiry involves developing and testing unified theories of cognition that are implemented as computer models (see Newell 1990). These theories specify a fixed set of general mechanisms designed to explain learning and performance across a broad range of situations. Moreover, they specify their mechanisms quantitatively so that predictions can be systematically derived, even for complex situations where the mechanisms may interact in complicated ways. Together, these features have enabled cognitive theories to explain and/or predict patterns in human learning and performance across a wide variety of tasks (Anderson and Lebiere 1998).

Unfortunately, these two fields (the “educational” and the “psychological”) are not often brought together so that one can inform the other. The former deals primarily with issues of practice and the latter with issues of theory. A reasonable characterization is that the fields are making progress on parallel paths. For instance, many of the instructional techniques that have blossomed in recent reform movements (e.g., collaborative learning) are consistent with current cognitive psychological research but tend not to be developed directly from those results. Perhaps more importantly, the degree of separation between the fields has made it even more difficult for instructors to reap the benefits of the combined research. Educational research tends to emphasize what instructional methods work in a particular situation but not how instructors can generalize from that to make effective instructional decisions in their own classes. Psychological research tends to emphasize how learning proceeds in the abstract but not how specific instructional methods influence learning.

This article tries to bring the two fields together within the area of statistics instruction in a way that is of practical use to instructors. Specifically, we present five principles of learning, derived from cognitive theory and applied to education. These principles, stated briefly, are:

1. Students learn best what they practice and perform on their own.
2. Knowledge tends to be specific to the context in which it is learned.
3. Learning is more efficient when students receive real-time feedback on errors.
4. Learning involves integrating new knowledge with existing knowledge.
5. Learning becomes less efficient as the mental load students must carry increases.

These principles likely seem reasonable and intuitive. Nevertheless, as given, they are only abstract statements that do not offer a clear picture of how best to design educational activities that will promote deep learning. To combat this, for each principle we document its empirical support in the psychological literature and, based on that research, generate a corresponding practical guideline. Moreover, we demonstrate how each of these guidelines can be applied to statistics instruction by giving examples from an introductory statistics course designed by one of the authors. Hence, the goal of this article is two-fold: (1) to present a general, theoretical framework for understanding and predicting the effects of various instructional methods on student learning, and (2) to demonstrate how that framework can be put to practical use in statistics instruction. The remainder of the article includes the following: a description of the introductory statistics course at Carnegie Mellon to which the above principles have been applied; a sketch of current cognitive theory and an elaboration of our five principles of learning; and a discussion of how our approach complements the achievements of the statistics education reform movement.

2. CASE STUDY: THE STATISTICAL REASONING COURSE AT CARNEGIE MELLON

In this section, we describe an introductory, one-semester statistical reasoning course for students in the humanities and social sciences at Carnegie Mellon University, designed by one of us in 1991. The design of the course was motivated by the specification of four fundamental course goals: that students learn to (1) apply the techniques of exploratory data analysis for data reduction and summary; (2) understand the concept of sampling variability; (3) understand and critically evaluate the effectiveness of different approaches for producing data; and (4) understand the use and interpretation of inferential statistical procedures. The curriculum focuses on the use and interpretation of data-analysis techniques without teaching all the probabilistic or mathematical underpinnings. The idea is that before learning the quantitative aspects of statistical reasoning, students can build useful skills and intuitions and become interested in solving statistical problems. The course gives students the opportunity to engage in authentic statistical reasoning activities and, we hope, experience the excitement of scientific discovery (Cobb 1992; Moore 1992).

The above course goals were operationalized by considering the kind of data-analysis problems that students should be able to solve when they complete the course. To solve such problems, students must learn many different “pieces” of knowledge and integrate them into a unified whole. In the course, students learn and practice the component skills of data analysis, at first individual skills on simplified problems and then, later, combinations of skills on realistic problems of larger scope. The format of the course was designed to teach these skills using several reform-based instructional techniques: collaborative learning, active learning, and the use of computers to aid students in the practice of statistics. During weekly lab sessions, students work in pairs at a computer, using a commercially available statistics package (e.g., Data Desk 1992 or Minitab 1994) to complete assigned exercises. These exercises are presented to the students in a lab handout that describes a dataset, provides detailed instructions that guide the students through the analysis, and asks students to interpret the results of their analysis. Students work with real datasets designed both to engage them in the analysis and to exemplify the general applicability of statistical methods. Students are rewarded for trying to learn how to solve new problems, and they are encouraged to learn from each other. In addition to solving data-analysis problems in the computer labs, students solve related problems on their own for their homework assignments.

This description of the course is consistent with several reform-movement ideas such as active learning, collaborative learning, and the effective use of computers in instruction, but it still leaves out many potentially important features of the overall course design: How are topics sequenced? How are the computer lab sessions integrated into the course, and how are they run? How are problems chosen for lab exercises, homework assignments, and examinations? These “detail” questions are not always addressed in descriptions of innovative course designs. And yet, they are critical to instructors because the success of an instructional innovation can be directly influenced by exactly how such questions are answered (see Gage 1991; and Sections 3.1 and 3.5). Indeed, we believe that the success of the Carnegie Mellon course stems in large part from the details of its implementation. In the next section, then, we present our five learning principles in the context of this course to demonstrate how a theory-driven approach can help in answering the above kinds of questions. More generally, these principles offer a general framework that instructors can apply to a variety of courses when they need to make instructional decisions.

3. PRINCIPLES OF LEARNING

As mentioned in the introduction, current cognitive theories aim to explain and predict human learning and performance data by positing a fixed set of representations of knowledge and a fixed set of mechanisms for acquiring and using those knowledge representations (e.g., Anderson and Lebiere 1998; Holland, Holyoak, Nisbett, and Thagard 1986; Newell 1990). Although these theories are not often applied to classroom learning (Anderson, Conrad, and Corbett 1989 is one exception), their success at capturing data collected in a variety of laboratory tasks (Anderson and Matessa 1997; Lovett and Anderson 1996; Singley and Anderson 1989) suggests that they can offer important insights for understanding and predicting students’ learning outcomes.

Indeed, these theories share several claims about learning and information processing. The first such claim is that cognitive abilities can be decomposed into separate pieces of knowledge, each of which can be categorized as either
procedural (representing skills and procedures) or declarative (representing facts and ideas). For example, to answer the question “What is 13 + 48?” one must access and use several pieces of knowledge. Assuming one used the strategy of long addition, these would include both procedural rules for adding the numbers in the ones column, writing the sum mod 10, handling the carry, moving to the next column, and so on, and declarative facts for the various sums required by this problem. The second theoretical claim is that these knowledge pieces are acquired and strengthened according to their use. Although the different theories specify somewhat different mechanisms for knowledge acquisition and strengthening, they tend to share the notion that using a given piece of knowledge will make it stronger and hence easier to access in the future. Note that this “rich get richer” effect does not generalize beyond the pieces of knowledge actually used, and it applies equally to uses of knowledge that turn out to be “correct” or “incorrect.” The third theoretical claim is that goals set the context for learning. That is, depending on the learner’s current goal, different pieces of knowledge will be used, acquired, and strengthened. This dependence arises in part because procedural knowledge is represented in the form “IF my goal is <x> and <other conditions hold>, then take action <y>.” In addition, in some theories, the learner’s goal serves as a link among the various pieces of knowledge relevant to the task at hand. This linkage promotes associations among related pieces of knowledge so that when one piece is currently being used, other related pieces are made more available. The fourth theoretical claim states that an individual’s cognitive capacity is limited. Different theories represent this limitation in different ways, but the basic idea is the same: there is only so much information that a person can process at once. This limitation, in turn, influences how well people can learn and perform complex tasks. These shared theoretical claims form the foundation of current cognitive theories. They also lead to our five principles of learning, which will be elaborated in the following subsections.

3.1 Principle 1: Students Learn Best What They Practice and Perform on Their Own

At first blush, this principle seems merely to mimic the aphorism “practice makes perfect.” The first two themes of cognitive theory, as described above, namely that (1) cognitive abilities can be decomposed into separate pieces of knowledge and (2) these pieces of knowledge are strengthened based on their use, suggest a more specific relationship between practice and learning. They imply that the more learners engage in processing that requires them to access certain pieces of knowledge, the more they will learn those pieces of knowledge and not other pieces of unaccessed knowledge.

Research on learning has supported this aspect of cognitive theory by showing that the benefits of practice are actually quite circumscribed. For example, take the skills involved in (1) writing new computer programs and (2) evaluating existing computer code. These two sets of skills seem quite closely related, so much so, in fact, that it is reasonable to expect that practice on one will benefit both greatly. To the contrary, Kessler (1988) found that students who spent their time creating new computer programs did not improve much in their ability to evaluate programs and vice versa. The same asymmetry in learning was found when students practiced either translating calculus word problems into equations or solving the equations themselves; when students practiced one skill and yet were pre/posttested on both, they showed great improvement on the practiced skill but no improvement on the other, related skill (Singley and Anderson 1989). Instructors may find such results familiar: Students perform well on homework problems but then poorly on test problems that seem (to the instructor) quite closely related. One possible explanation of such poor test performance is that the test problems actually require certain subskills that the students did not get to practice while solving the homework problems; that is, students’ lack of practice on these subskills may be impairing their overall performance. For example, in homework problems that involve computing and interpreting inferential statistics, students may either be explicitly told which statistics to compute or they may implicitly be cued by virtue of the current week’s topic. In this situation, the students will not have had opportunities to practice the skill of selecting the appropriate test statistics and therefore may have difficulty on an exam that requires this skill as part of a larger problem. Students improve greatly on the particular skills and subskills that they actually practice but improve only slightly (or not at all) on related skills that they do not practice. That is, the content of practice can greatly affect the degree to which the practice helps students achieve their specified learning goals.

All of this suggests that instructors need to pay special attention to exactly what concepts and skills students are practicing when they complete various assignments. Hence, we generate the following practical guideline corresponding to Principle 1: identify the skills and subskills students are supposed to learn, and then give students opportunities to perform and practice all of those skills. Note that this guideline implies that instructors should both identify a set of relevant skills and knowledge for their students to learn and then design instructional opportunities that allow students to practice this set. Garfield (1995) similarly argued for a consistency in the learning activities students perform as part of coursework and for a specification of the learning outcomes that constitute the goals of the course.

In the Carnegie Mellon statistical reasoning course, students’ opportunities for practice are carefully chosen for content, and practice is repeated throughout the semester. As mentioned earlier, in designing the curriculum for this course, a specific set of target skills was identified. Different activities were then designed to give students practice at those component skills. Specifically, during each week of the course, students practice applying a few new skills: first under supervision in the computer labs and then without supervision on related homework assignments. This offers multiple opportunities to practice the same concepts or skills. For example, a commonly “missed” skill (or set of skills) is selecting the appropriate display or analysis. It is likely missed because this step is so automatic for experts;
it is little practiced because students are usually learning about only one analysis type at a time, so their choice of analysis can be made on trivial (nonconceptual) grounds. For these reasons, there are “synthesis” labs at regular intervals during the course where students work on a problem that requires several types of analyses drawn from the preceding weeks’ material. Another advantage of these particular labs is that they give students practice at combining skills in different ways; this is important, too, because cognitive theory suggests that the skills required for synthesis (e.g., comparing different approaches and managing long solutions) will not be learned unless they too are practiced.

3.2 Principle 2: Knowledge Tends to be Specific to the Context in Which it is Learned

The third theme of cognitive theory as described earlier is that the student’s current goal provides important contextual information that influences what is learned. For example, when a declarative fact is retrieved in the context of a particular goal, not only is that fact strengthened but the links between that fact and other facts that describe the current goal are strengthened. This link strengthening makes the former fact subsequently easier to retrieve under similar, future goals. The context of the current goal is also incorporated into newly learned procedural knowledge, enabling the new rule to be used only in future situations where the current goal is similar to the goal that was current during learning. All of this implies that knowledge will be more easily accessed in contexts that are similar to the student’s learning context. Here, learning context typically refers to the type of problem the learner is trying to solve (i.e., learning tends to be tied to particular problems or problem types), but it may also refer to the physical

This principle has received empirical support via the finding that students do not naturally generalize what they have learned to new situations. This oft-cited finding is called a “failure to transfer.” It applies both when students fail to transfer what they have learned to new problems (e.g., problems that appear different from what the students have already encountered) and to new situations (e.g., out-of-the-class situations where students may not even consider applying their relevant skills). Many studies in the laboratory and in the classroom have shown that this difficulty of transferring what students have learned on one set of problems to another set of problems is great, even when the second set of problems is very similar to the first. For example, Reed, Dempster, and Ettinger (1985) studied college students in an algebra class. All students initially saw solutions to several different kinds of algebra word problems. Then students were asked to solve new problems that were either equivalent (identical except for different numbers) or similar (requiring slight adaptation of a previous solution). Four experiments showed that students had extreme difficulty solving even the equivalent problems (except when the previous solutions were made available during problem solving) and that students almost never solved the similar problems. These results are likely reminiscent of instructors’ experiences with students who manage to solve a homework problem correctly but fail to apply the same skills to solve a closely related test problem. Earlier we suggested that such poor test performance might be attributed to the fact that students did not actually practice all the skills they need for the test problem. The current discussion suggests that even when students have practiced all the relevant skills, there may still be the question of whether students have learned the necessary skills at a sufficient level of generality to be able to apply them appropriately.

Several laboratory experiments have tried to gain insight into this learning problem by exploring what conditions actually do facilitate transfer. The basic approach aims to encourage students to learn new knowledge and skills in a general way such that they can apply the knowledge appropriately in a variety of situations. One intervention that has worked in several different domains involves giving students multiple problems that have related solution structures but that appear different. For example, Paas and Van Merrienboer (1991) gave students sets of geometry problems (with solutions) that were either similar or varied in appearance. (The fact that solutions accompanied the problems in this study relates to the issue of feedback, which we discuss in the next subsection.) Note that both types of problem sets exercised the same set of skills in order to control for the content of practice. After studying these problems, the students were tested on a new set of problems that were unfamiliar to all. The new problems were considered “transfer problems” because they forced students to use the target skills in new or more complicated ways. The students who had practiced with the “varied” problem set performed better on this transfer test than did students who had practiced with the “similar” problem set, even though they did not take significantly longer during the study phase. These and other related results suggest that students tend to learn more generally applicable knowledge and skills when the problems they encounter appear at least somewhat varied (Elio and Anderson 1984; Paas and Van Merrienboer 1991; Ranzijn 1991).

It is also worth noting that students who have been asked to compare and contrast problems that appear different because of their different cover stories also show more generalized learning. In a series of laboratory experiments, Cummins (1992) found that simply instructing algebra students to reflect on the similarities and differences between pairs of problems led them to transfer their algebra skills better than students without these instructions. Cummins’s theory is that, by comparing problems, students end up reflecting on the deep (rather than superficial) relationships between problems and develop a more generalized “schema” for how problems are solved.

Putting this principle into practice leads to the following guideline: give students problems that vary in appearance so their practice will involve applying knowledge and skills in a variety of ways. This not only provides more opportunities for practice but more opportunities of the kind that encourage students to generalize their understanding. This guideline is implemented in our introductory statistics course by virtue of the fact that students get to work on real-world
problems that cover a variety of contexts, from the changes in infant mortality rate in nineteenth century Sweden to the efficacy of pharmacological interventions for the prevention of the recurrence of depression. That is, students get to apply the same statistical ideas to problems that are superficially very different, and the instructor explicitly discusses these commonalities with students. This technique can be used in other courses as well. For example, in a probability course for engineers, a colleague regularly gives students a homework assignment composed of ten problems with very different superficial features (e.g., a problem about solar flares, a problem about highway driving speeds). Unbeknownst to students, these problems all have the same solution structure. The assignment is to solve any three of the ten problems and then comment on the purpose of the assignment. Note that this problem-solving assignment (1) gives students multiple problems to solve; (2) makes those problems appear different even though they are similar in solution structure; and (3) encourages students to reflect on the problems’ relationships.

3.3 Principle 3: Learning is More Efficient When Students Receive Real-Time Feedback on Errors

Most of the learning mechanisms posited by cognitive theories have the common feature that some kind of learning occurs regardless of whether the learner succeeded or failed in achieving the current goal. This suggests that, in many situations, learners will be strengthening incorrect knowledge, acquiring invalid procedures, or strengthening inappropriate connections; in essence, “practicing bad habits.” Thus, it is very important for learners to avoid errors or, if they cannot avoid errors, to compensate for the strengthening of incorrect knowledge with opportunities to practice the corresponding correct knowledge.

Experiments directed at this issue have manipulated the immediacy of feedback given to students as they practice solving problems. For example, in a study reported in Anderson, Conrad, and Corbett (1989), students learning to program in LISP were either (1) given feedback just after each mistake they made or (2) given an opportunity to request feedback at the end of each problem. This experiment and others like it have shown that immediate feedback, relative to delayed feedback, leads to significant reductions in the time taken for students to achieve a desired level of performance. Similarly, the sizable learning gains exhibited by students learning from human tutors are largely attributed to the rich feedback tutors can give (Bloom 1984). One-on-one tutoring, however, is not always possible. Instructors need other ways of giving students real-time feedback during problem solving and learning. This leads to the following somewhat modest guideline: try to “close the loop” as tightly as possible between students’ thinking and the

This guideline is instantiated in our introductory statistics course’s computer laboratories, where students are working on data-analysis exercises in pairs (so they can potentially provide feedback to each other) with teaching assistants circulating throughout the room to check on their work. In fact, each lab assignment includes several “checkpoints,” where the students must contact a teaching assistant and demonstrate their understanding up to that point in the exercise. This process was motivated by the goal of giving students more feedback during these supervised practice sessions. Students are now guaranteed to be “on track” at certain key points in the problem. Another approach that offers immediate feedback to students on their understanding is the “peer instruction” technique (Mazur 1997). In peer instruction, the instructor poses a question to the class, students discuss their answers in pairs, and then the instructor continues (e.g., by discussing common misconceptions and/or processes for generating a good answer). This peer instruction technique has been used effectively in classes with large numbers of students.

3.4 Principle 4: Learning Involves Integrating New Knowledge With Existing Knowledge

One of the important learning mechanisms posited by cognitive theory involves the strengthening of links in the network of declarative knowledge. Here the theory claims that links between pairs of nodes in the network are strengthened based on how often the learner accesses the corresponding pair of facts in the same context. These links are important for learning because the stronger a link between two facts, the more easily one fact can be retrieved in the context of the other; that is, the more easily a student can make appropriate and useful associations between concepts and/or ideas. This implies that the entire network of associations held by a learner must be considered in making predictions about learning and performance: It is not just important for individual facts to be strengthened, but for the appropriate connections between them to be strengthened as well. Note that the importance of making proper links between pieces of knowledge applies in two related situations: (1) integrating what students will learn in a course with what they already know and (2) integrating material that students will learn later in the course with material they learn early on in a course.

Students do not enter the classroom as blank slates; their prior knowledge can have an impact on their learning. Research on classroom learning shows that students often interpret technical terms loosely based on the way those terms are used in daily life. For example, the terms “speed” and “acceleration” are used quite loosely in daily life but require special interpretations in introductory physics class, a difference that can create an obstacle to learning (Reif and Allen 1992). More relevant to statistics instruction are students’ everyday interpretations of statistical terms such as “chance,” “probability,” “hypothesis,” and “variability.” Garfield and her colleagues have shown that students’ misconceptions in these areas can make learning certain concepts much more difficult (Garfield and delMas 1991). For example, if students can make use of their pre-existing (but incorrect) knowledge about probabilities in statistics class, they may not see the need to acquire new knowledge. Even if they do eventually learn an appropriate definition for the term “probability,” this new knowledge will tend to be
strongly linked to their old, inappropriate definition, making it difficult for them to consistently interpret probability questions correctly. Therefore, it can be helpful for instructors to know about students’ prior knowledge and conceptions. Then, they can build on the strengths of reasoning that students have in order to shore up their weaknesses. Consider the example of conditional probability problems. Although people’s intuitive reasoning on these problems often leads to errors when the problems are stated in terms of probabilities (Tversky and Kahneman 1982), controlled laboratory experiments have shown that people reason quite well when the same problems are stated in terms of frequencies (e.g., 850 cases out of 1,000 instead of 85% probability, and so on; Gigerenzer and Hoffrage 1995). This suggests that, by using the frequency format as a starting point, students could link their new knowledge about probabilities to pre-existing correct intuitions about frequency.

The idea of helping students create appropriate links between pieces of knowledge also applies to the problem of presenting new material throughout a course. For example, students often view what they are learning as a set of isolated facts (Hammer 1994; Schoenfeld 1988) when, from the instructor’s point of view, there is a clear structure to the material being taught. Students may not see this structure unless it is explicitly presented. Moreover, controlled laboratory experiments have shown that students learn new verbal material better when it is presented in an organized structure (e.g., hierarchically as compared to merely in a list). Cognitive theory would explain this by positing that, in the hierarchical case, students acquire not only the individual words to be learned but also a set of links associating related words, which makes it easier to retrieve one in the context of the other. This same idea has been applied successfully in classroom situations, both when students are learning a new set of facts and when they are learning new problem-solving procedures. For example, Eylon and Reif (1984) presented different groups of students the same physics material on gravitational acceleration, but the information was organized in either a hierarchical fashion (with higher levels representing information most important to the task and lower levels representing the details) or in a linear fashion (an unorganized list of ideas). When students were asked to recall the material or to use the material to solve new problems, the hierarchical group outperformed the linear group (even when the linear group received more time to study the materials).

Putting these ideas into practice leads to the following guideline: study students’ relevant initial conceptions and misconceptions and then organize instruction so that students will build appropriate knowledge structures. This guideline is applied in the Carnegie Mellon course at several levels. At a global curriculum level, the Carnegie Mellon introductory sequence was revised to first teach students about describing and analyzing data, attempting to build upon their intuitions about describing data before teaching the underlying probability theory. At a more local level, the organization of material within the introductory course was very closely prescribed: students first learn about a new concept or procedure with a few examples; in these examples, the new concept is motivated in terms of other, closely related ideas that have been presented earlier in the course; then after some initial practice, students work on larger, more complex problems that require linking the new idea to more distantly related knowledge. In this way, students are encouraged to use related ideas on a common problem, which should help them strengthen appropriate links in their knowledge structures.

3.5 Principle 5: Learning Becomes Less Efficient as the Mental Load Students Must Carry Increases

To account for the fact that people have limitations on the amount of information they can attend to at once, cognitive theories posit a constraint on people’s cognitive capacity. This can be specified in the following terms: the more complex the current goal (i.e., the more information simultaneously needed to solve it), the more difficult it will be to access that needed information. Here, “mental load” can be interpreted in terms of the amount of information simultaneously needed for solving the current goal. Note that when students are learning to perform a new task (e.g., solve addition problems), their current goal will tend to be fairly complicated: It must include a certain amount of information to represent the details of that new task (e.g., that the current problem is 34 + 81 and the ones column has been summed) as well as additional information regarding the students’ parallel goal of learning something about that task (e.g., that “carries” require a special procedure that should be remembered for the future). This suggests that learning should be viewed as a mentally demanding task; it will proceed more effectively when the complexity of activities to be performed during learning is reduced.

Several researchers have studied how learning is affected by the combined mental load of performing an assigned activity while learning. For example, Sweller and his colleagues have found that students who are just beginning to learn to solve problems in a particular area can benefit from worked example problems (e.g., Sweller and Cooper 1985; Sweller 1988). The idea here is that seeing worked examples before solving new problems makes the subsequent problem solving an easier task. Note that this “examples-then-problems” activity also reduces the errors students make when solving problems on their own, so students’ learning may also benefit from some of the issues related to immediate feedback (Principle 3, p. 5).

Beyond the general advantage of initially giving students some worked examples to study, Ward and Sweller (1990) revealed that different ways of formatting the worked examples can lead to more or less learning. Again, this difference in learning relates to differences in the mental effort students must expend in the different situations. In particular, Ward and Sweller found that the more information students must integrate on their own while processing the worked examples, the lower their learning outcomes, and the more the example’s format places relevant information where it will be needed, the better the learning outcomes. For example, in one study introductory physics students either received worked examples with the solution equations incorporated into relevant parts of the problem statement
Table 1. Summary of Principles and Guidelines for Instructional Design

Principle of learning: Students learn best what they practice and perform on their own.
Guideline for instructional design: Identify the skills and subskills students are supposed to learn, and then give students opportunities to perform and practice all of those skills. Give students repeated practice at applying certain concepts or skills and time this practice so that it is spread out in time.

Principle of learning: Knowledge tends to be specific to the context in which it is learned.
Guideline for instructional design: Give students problems with different contexts so they exercise what they have learned in a variety of ways.

Principle of learning: Learning is more efficient when students receive real-time feedback as they solve problems.
Guideline for instructional design: Try to “close the loop” as tightly as possible between students’ thinking and the instructor’s feedback.

Principle of learning: Learning involves integrating new knowledge with existing knowledge.
Guideline for instructional design: Study students’ relevant initial conceptions and misconceptions and then sequence instruction to build on what students already know.

Principle of learning: Students’ learning becomes less efficient as the mental load they must carry increases.
Guideline for instructional design: Make the necessary information readily available to students during learning and offload extraneous processing during problem solving so that students can focus their attention on learning the material at hand.
or worked examples with the solution equations formatted separately from the problem description. Students in the former group (integrated problem text and equations) solved more test problems overall and performed better on transfer test items than did students in the latter group (separated problem text and equations). These results have been replicated with several different kinds of materials.

Putting this principle into practice leads to the following guideline: make the necessary information readily available to students during learning and offload extraneous processing during problem solving so that students can focus their attention on learning the material at hand. This guideline is implemented in our introductory statistics course in two ways. First, students are taught to use the computer to generate summary statistics and graphical displays when they are working on data-analysis exercises. This frees them from having to do detailed calculations (i.e., makes the task of data analysis easier) and hence allows students to focus their efforts on learning the larger task at hand. Second, students are given ample practice at applying new subskills in simpler contexts before they have to solve more complex problems. This way, by the time they encounter more complex and difficult problems, they are already comfortable at applying most of the subskills involved. They no longer need to labor over the basic steps in the problem solution, but rather can focus on the larger problem of learning how to put those steps together.

4. BRINGING IT ALL TOGETHER

In the preceding subsections, we have described each of the five principles of learning in relative isolation (see Table 1 for a summary). However, in designing or redesigning a course, much more must be considered. First, we note that it is important to apply the five principles of learning jointly, so that they can mutually guide the design process. Second, we acknowledge that there is more to instructional design than developing and implementing effective instructional techniques, the focus of this article. As mentioned in the description of our course design (Section 2), several steps preceded the decisions regarding instructional technique (identifying course goals, establishing performance criteria for those goals, and decomposing the goals into learnable “pieces” of knowledge) and other steps followed (assessment and re-design) [see Dick (1997) for a review of the steps of instructional design]. The reform movement in statistics education has confronted these larger issues while still directing considerable effort towards improving instructional techniques. In the following subsections, then, we provide a brief overview of the statistics education reform movement to place our approach in a more general context and to argue that our approach can complement the existing work in this area. Note that because of space limitations the following only represents a small sample of the work and progress in statistics education.

4.1 Statistics Education Reform: Changes in

One of the major shifts produced by the statistics education reform movement involves the content of statistics courses, especially introductory level courses. Here, the trend is toward emphasizing students’ practical use of statistical reasoning relative to their memorization of statistical formulas and procedures. This is similar to a recent trend in mathematics education more generally, where the content of instruction has come to focus more on problem solving and the usefulness of mathematics in many familiar contexts.

In the case of statistics education, the emphasis on the “practice of statistics” can be seen through a number of different changes to course curricula. Course goals no longer refer to students’ ability to derive particular statistical formulas or to compute certain statistics by hand, but rather
Table 2. Annotated Overview of Educational Reforms

Examples of Use

• Students work together to solve problems (e.g., Borresen 1990; Dietz 1993) or discuss concepts, sharing ideas and understanding (Garfield 1993).
• Students are engaged in data collection (e.g., Rossman 1996; Scheaffer et al. 1996; Spurrier et al. 1995), reflection on and exploration of statistical concepts (Lan et al. 1993), and solving problems on their own (cf. Use of technology).
• Instruction is designed so that students will be confronted with their misconceptions and then have the opportunity to reflect and derive a more coherent conceptual understanding (Garfield and delMas 1991).

Use of technology
• Several textbooks (e.g., Rossman 1996) and multi-media resources (e.g., Velleman 1996) are designed to coordinate the presentation of new material with the use of
• Simulation programs allow students to explore statistical concepts in discovery-world environments (e.g., Lang, Coyne, and Wackerly 1993; Loomis and Boynton 1992;
they refer to “statistical literacy” and students’ ability to reason statistically about real-world problems. For example, the course called “Chance” (Snell 1996) builds its entire curriculum around statistical problems that arise as current issues in the media. In addition, course curricula have been affected more generally by an array of new textbooks that are dedicated to the analysis and interpretation of real datasets and to the qualitative (instead of quantitative) understanding of statistical concepts (e.g., Cobb 1987; Glenberg 1996; Moore 1997c; Moore and McCabe 1993; Rossman 1996; Scheaffer, Gnanadesikan, Watkins, and Witmer 1996; Utts 1996). In courses using these texts, problems typically require students to work with a statistical software package in order to answer questions regarding a given dataset (e.g., performing exploratory or inferential data analyses and interpreting the results). As an aid to instructors wanting to assign such problems, several groups have made available databases of real datasets (e.g., DASL, JSE, EESEE) which are indexed by statistical concept and application area. In some cases, students are even responsible for collecting their own data (e.g., considering issues of experimental design and data collection).

4.2 Statistics Education Reform: Changes in

Beyond the shift in the content of statistics courses, a major focus of the statistics education reform movement has been on improving the instructional techniques used in these courses. Table 2 provides a list of some reform-based techniques and brief descriptions of how they are used. This list is not meant to be comprehensive but rather to demonstrate the variety of innovative instructional techniques that are being employed in a variety of statistics classes today.

These methods tend to lead to improvements in students’ interest in statistics, their learning outcomes, or both (e.g., Borresen 1990; Lan, Bradley, and Parr 1993; Stedman 1993; Cohen et al. 1996; Garfield 1996; Giraud 1997; Gnanadesikan, Scheaffer, Watkins, and Witmer 1997; Grabowski and Harkness 1996; Keeler and Steinhorst 1995; Magel 1998; Smith 1998). To take a few examples, Lan et al. (1993) found that students who were encouraged to reflect on their learning (by recording time spent working on different concepts and estimating their own efficacy at solving problems using those concepts) scored higher on in-class examinations than did two different groups of “control” students. Other approaches to getting students actively engaged in learning statistics have also been used (e.g., Gnanadesikan et al. 1997; Smith 1998). For example, Smith (1998) found that incorporating a sequence of projects in a semester-long introductory course led to positive responses from students and improved exam scores relative to the previous semester’s students.

On the issue of collaborative learning, Borresen (1990) compared two groups of students in an introductory statistics class: those who worked on their assignments in small groups during class and those who worked individually on the same assignments for homework. In terms of a measure based on total points received in the course, Borresen found that the “small-group” students significantly outperformed the “individual-learning” students. Similar results have been shown by other researchers as well (e.g., Dietz 1993; Garfield 1993; Giraud 1997; Keeler and Steinhorst 1995; Magel 1998).

Technology has also played a major role in instructional innovations. Some software packages offer students simulation systems for exploring statistical concepts (e.g., Alper and Raymond 1998; delMas, Garfield, and Chance in press; Finch and Cumming 1998; Finzer and Erickson 1998; Loomis and Boynton 1992; Velleman 1996) while others provide educationally enhanced statistical software packages (e.g., Cohen, Tsai, and Checile 1995; Cohen et al. 1996; Cohen and Checile 1997; Schuyten and Dekeyser 1998; Shaughnessy 1998). These technological aids to learning generally lead to improvements in students’ understanding
and problem-solving skills, most likely because the technology gives students more opportunities to consider conceptual implications and work through problems on their own. For example, delMas et al. (in press) demonstrate that students showed better statistical reasoning after working with their simulation world, especially when students explored the system fully.

As the above results suggest, assessments of the outcomes of instructional reforms often show encouraging results. However, there are also caveats. For example, Cohen and his colleagues (e.g., Cohen, Tsai, and Checile 1995; Cohen et al. 1996; Cohen and Checile 1997) have completed several in-depth assessments of a hands-on curriculum in which students use an instructional software package to learn statistics. Although these students exhibited greater learning gains (post-test minus pre-test) than did “control” students, Cohen and Checile (1997, p. 110) remarked that “even those students with adequate basic mathematical skills [who had used the hands-on instructional software] still scored only an average of 57% [correct] on the [post-]test of conceptual understanding.” While this is a significant improvement relative to that group’s average score of 42% correct at pretest, it shows that students still have a lot to learn.

Garfield and delMas (1991) similarly found that, although innovative instructional interventions lead to gains in pre- to post-test performance, certain misconceptions held by students do not disappear, leaving absolute levels of performance after instruction unsatisfactory. In our own assessment of our introductory statistical reasoning course at Carnegie Mellon, we found significantly greater learning gains from pre-test to post-test among students who took the course relative to a group of control students. Nevertheless, these gains were attributable to large improvements on 75% of the test items and little or no improvement on the other 25% of the items (see Lovett, Greenhouse, Johnson, and Gluck in press for more details). Assessment results such as these, derived from comprehensive study designs, attest to the inherent difficulty of statistical concepts and to the potential need for additional guidance in improving instruction.

4.3 How Can Statistics Education Benefit from a Cognitive Psychology Perspective?

The reform movement in statistics education has made substantial progress in changing the nature of instruction. It has increased instructors’ awareness of and dialogue about new instructional ideas. For example, terms such as “active learning” and “hands-on practice” are becoming part of the statistics educator’s standard vocabulary. These ideas have helped instructors begin to analyze the relative merits of different types of instruction and have led to the development of many innovative, reform-based courses. In sum, there is a vast amount of research on improving statistics instruction, much of it producing very encouraging results. There is still a call for more improvement, however, especially in helping students to grasp the broader issues of statistical reasoning (e.g., Moore 1997a,b). In response to this call, some have argued that the next era of reform needs to focus on using technology more fully and on getting more instructors to embrace the techniques of the current reform movement (e.g., Garfield 1997; Hawkins 1997; Hoerl et al. 1997; Moore 1997a,b). It is important to note, however, that both of these approaches require individual instructors to know a good deal about how technology should best be incorporated and how reform-based instructional techniques should best be implemented. Making all the design decisions required to answer these “how?” questions is far from trivial, even for an instructor familiar with current reform-based techniques.

This is where we believe a cognitive psychology perspective can be helpful. First, the principles of learning presented in this article offer guidance in specifying these design decisions in a way that will likely lead to effective instruction. Second, new instructors could greatly benefit from a theoretical framework that offers general guidance in making instructional decisions and helps structure their knowledge about teaching as they acquire their own database of experiences, read the statistics education literature, and gradually gain expertise. Taking into account these issues as well as the strengths of individual instructors and the statistics education reform movement more generally, we advocate a combined approach in which instructors both (1) draw on their own experiences and other documented cases of good instructional designs and (2) apply the principles of learning discussed above to guide their instructional design decisions. In this way, our framework can complement both instructors’ expertise and the existing research on statistics education, extending the progress that has been made already.

Instructors working in the field have a difficult job of implementing new instructional techniques. They often are responsible for all aspects of instructional design, development, and implementation. Most of the resources available to them, however, fail to emphasize the process of how to design a course. In contrast, the principles we laid out in this article describe the processes of learning that apply to students in general and lead to a set of practical guidelines that can guide the process of instructional design for a wide range of courses.

Applying the principles of learning to course and curriculum design is akin to applying basic physical principles to various engineering designs. Creating designs with these principles in mind can increase the likelihood of developing a course that works. These principles also point to ways for analyzing what features of a course are more or less effective and for predicting how various new learning activities will impact students’ learning. The essential idea is that students’ processing (e.g., how students learn, what processes they engage while performing various learning activities) is an important part of what makes instruction effective. Focusing on student learning, therefore, can help us understand how best to improve it.
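The frequency-format idea discussed under Principle 4 (restating a conditional probability problem as counts of cases, following Gigerenzer and Hoffrage 1995) can be made concrete with a short Python sketch. The numbers used here (a 1% base rate, an 80% hit rate, a 10% false-alarm rate, and a population of 1,000 cases) are hypothetical values chosen for easy arithmetic and are not data from any study cited in this article; the function and variable names are likewise illustrative only.

```python
from fractions import Fraction


def posterior_probability_format(base_rate, hit_rate, false_alarm_rate):
    """Bayes' rule stated with probabilities: P(condition | positive result)."""
    true_pos = base_rate * hit_rate
    false_pos = (1 - base_rate) * false_alarm_rate
    return true_pos / (true_pos + false_pos)


def posterior_frequency_format(population, base_rate, hit_rate, false_alarm_rate):
    """The same question restated as counts of cases in a concrete population."""
    with_condition = population * base_rate          # e.g., 10 of 1,000 cases
    without_condition = population - with_condition  # e.g., 990 of 1,000 cases
    true_pos = with_condition * hit_rate             # positives among the affected
    false_pos = without_condition * false_alarm_rate # positives among the unaffected
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


# Hypothetical illustration values (assumptions, not taken from the cited studies).
BASE_RATE = Fraction(1, 100)
HIT_RATE = Fraction(8, 10)
FALSE_ALARM_RATE = Fraction(1, 10)

p = posterior_probability_format(BASE_RATE, HIT_RATE, FALSE_ALARM_RATE)
tp, fp, f = posterior_frequency_format(1000, BASE_RATE, HIT_RATE, FALSE_ALARM_RATE)

# Both formats give the same answer; the frequency version exposes the counts.
print(f"Probability format: {p}")
print(f"Frequency format: {tp} of the {tp + fp} positive cases have the condition ({f})")
```

Exact fractions are used so that the two formats can be compared without rounding. The point of the sketch is only that the two statements of the problem are arithmetically equivalent, while the frequency version shows the intermediate counts that students can reason about directly.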
[Received January 1999. Revised November 1999.]
Alper, P., and Raymond, R. (1998), “Some Experience Comparing Simulation and Conventional Methods in an Elementary Statistics Course,” in Proceedings of the Fifth International Conference on the Teaching of Statistics, Singapore: ISIASE/ISI.
Anderson, J. R., Conrad, F. G., and Corbett, A. T. (1989), “Skill Acquisition and the LISP Tutor,” Cognitive Science, 13, 467–505.
Anderson, J. R., and Lebiere, C. (1998), Atomic Components of Thought, Mahwah, NJ: Erlbaum.
Anderson, J. R., and Matessa, M. P. (1997), “A Production System Theory of Serial Memory,” Psychological Review, 104, 728–748.
Bloom, B. S. (1984), “The 2 Sigma Problem: The Search for Methods of Group Instruction as Effective as One-to-One Tutoring,” Educational Researcher, 13, 4–16.
Borresen, C. R. (1990), “Success in Introductory Statistics With Small Groups,” College Teaching, 38, 26–28.
Cobb, G. W. (1987), “Introductory Textbooks: A Framework for Evaluation,” Journal of the American Statistical Association, 82, 321–339.
Cobb, G. W. (1992), “Teaching Statistics,” in Heeding the Call for Change: Suggestions for Curricular Action, MAA Notes Number 22, ed. L. Steen, pp.
Cohen, S., and Checile, R. A. (1997), “Overview of ConStats and the ConStats Assessment,” in Research on the Role of Technology in Teaching and Learning Statistics, eds. J. B. Garfield and G. Burrill, Voorburg, The Netherlands: International Statistical Institute.
Cohen, S., Smith, G., Checile, R. A., Burns, G., and Tsai, F. (1996), “Identifying Impediments to Learning Probability and Statistics From an Assessment of Instructional Software,” Journal of Educational and Behavioral Statistics, 21, 35–54.
Cohen, S., Tsai, F., and Checile, R. (1995), “A Model for Assessing Student Interaction With Educational Software,” Behavior Research Methods, Instruments, and Computers, 27, 251–256.
Cummins, D. D. (1992), “Role of Analogical Reasoning in Induction of Problem Categories,” Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1103–1138.
Data Desk (1993), Statistical Software (Student Version 4), Ithaca, NY:
DelMas, R., Garfield, J., and Chance, B. (in press), “Assessing the Effects of a Computer Microworld on Statistical Reasoning,” to appear in Journal of Statistics Education, http://www.amstat.org/publications/jse.
Dick, W. (1997), “A Model for the Systematic Design of Instruction,” in Instructional Design: International Perspectives: Theory, Research, and Models (vol. 1), eds. R. D. Tennyson, F. Schott, N. Seel, and S. Dijkstra, Mahwah, NJ: Erlbaum, pp. 361–369.
Dietz, E. J. (1993), “A Cooperative Learning Activity on Methods of Selecting a Sample,” The American Statistician, 47, 104–108.
a Chance Course,” Communications in Statistics—Theory and Method,
(1997), Discussion of “New Pedagogy and New Content: The Case of Statistics,” by D. Moore, International Statistical Review, 65, 137–
Garfield, J., and delMas, R. (1991), “Students’ Conceptions of Probability,” in Proceedings of the Third International Conference on Teaching Statistics (vol. 1), ed. D. Vere-Jones, Voorburg, The Netherlands: International Statistical Institute, pp. 340–349.
Gigerenzer, G., and Hoffrage, U. (1995), “How to Improve Bayesian Reasoning Without Instructions: Frequency Formats,” Psychological Review, 102, 684–704.
Giraud, G. (1997), “Cooperative Learning and Statistics Instruction,” Journal of Statistics Education, 5, http://www.amstat.org/publications/jse.
Glenberg, A. M. (1996), Learning From Data: An Introduction to Statistical Reasoning (2nd ed.), Mahwah, NJ: Erlbaum.
Gnanadesikan, M., Scheaffer, R. L., Watkins, A. E., and Witmer, J. A. (1997), “An Activity-Based Statistics Course,” Journal of Statistics Education, 5, http://www.amstat.org/publications/jse.
Grabowski, B. L., and Harkness, W. L. (1996), “Enhancing Statistics Education with Expert Systems: More than an Advisory System,” Journal of Statistics Education, 4, 3, http://www.amstat.org/publications/jse.
Hammer, D. (1994), “Epistemological Beliefs in Introductory Physics,” Cognition and Instruction, 12, 151–183.
Hawkins, A. (1997), Discussion of “New Pedagogy and New Content: The Case of Statistics,” by D. Moore, International Statistical Review, 65,
Hoerl, R., Hahn, G., and Doganaksoy, N. (1997), Discussion of “New Pedagogy and New Content: The Case of Statistics,” by D. Moore, International Statistical Review, 65, 147–153.
Holland, J. H., Holyoak, K. J., Nisbett, R. E., and Thagard, P. R. (1987), Induction: Processes of Inference, Learning, and Discovery, Cambridge, MA: MIT Press.
Keeler, C. M., and Steinhorst, R. K. (1995), “Using Small Groups to Promote Active Learning in the Introductory Statistics Course: A Report From the Field,” Journal of Statistics Education, 3,
Kessler, C. (1988), “Transfer of Programming Skills in Novice LISP Learners,” unpublished doctoral dissertation, Carnegie Mellon.
Lan, W. Y., Bradley, L., and Parr, G. (1993), “The Effects of a Self-Monitoring Process on College Students’ Learning in an Introductory Statistics Course,” Journal of Experimental Education, 62, 26–40.
Lang, J., Coyne, G., and Wackerly, D. (1993), ExplorStat, Gainesville, FL: University of Florida.
Loomis, J., and Boynton, G. (1992), Stat Visuals, Santa Barbara, CA: University of California.
Lovett, M. C., and Anderson, J. R. (1996), “History of Success and Current Context in Problem Solving: Combined Influences on Operator Selection,” Cognitive Psychology, 31, 168–217.
Elio, R., and Anderson, J. R. (1984), “The Effects of Information Order
Lovett, M. C., Greenhouse, J. B., Johnson, M., and Gluck, K. (in press)
and Learning Mode on Schema Abstraction,” Memory and Cognition,
“Assessing an Introductory Statistics Course Using the Rasch