High-Stakes Testing and Curricular Control:
A Qualitative Metasynthesis
by Wayne Au
Educational Researcher, Vol. 36, No. 5, pp. 258–267
© 2007 AERA. http://er.aera.net

Using the method of qualitative metasynthesis, this study analyzes 49 qualitative studies to interrogate how high-stakes testing affects curriculum, defined here as embodying content, knowledge form, and pedagogy. The findings from this study complicate the understanding of the relationship between high-stakes testing and classroom practice by identifying contradictory trends. The primary effect of high-stakes testing is that curricular content is narrowed to tested subjects, subject area knowledge is fragmented into test-related pieces, and teachers increase the use of teacher-centered pedagogies. However, this study also finds that, in a significant minority of cases, certain types of high-stakes tests have led to curricular content expansion, the integration of knowledge, and more student-centered, cooperative pedagogies. Thus the findings of the study suggest that the nature of high-stakes-test-induced curricular control is highly dependent on the structures of the tests themselves.

Keywords: curriculum theory; high-stakes testing; qualitative metasynthesis; template analysis.

With the advent of federally mandated high-stakes testing since the No Child Left Behind Act of 2001, many important questions have been raised regarding the implementation of this policy tool at the classroom level. In this article, I focus on one such question: What, if any, is the effect of high-stakes testing on curriculum? To answer this question, I begin by exploring the meanings of two key terms, “curriculum” and “high-stakes testing,” and by offering a brief review of some of the literature regarding the relationship between the two. Then, using the method of qualitative metasynthesis, I undertake a comparative study of 49 qualitative studies of high-stakes testing to better understand testing’s impact on curriculum.

There exists a wide range of definitions of the term “curriculum” (Beauchamp, 1982; Jackson, 1996; Kliebard, 1989). Historically, the word has its roots in the Latin word currere, which means a course to be run (Eisner, 1994), and was first used at the University of Glasgow in the 17th century to describe “a formal course of study that the students completed” (Harden, 2001, p. 335). This definition is perhaps the simplest and easiest for most to recognize because it is evident in the way schools are generally organized around a course of predetermined, required subject matter classes that students must pass to graduate. Thus most scholars and educators would at least recognize that curriculum encompasses a body of content knowledge to be learned in some way, shape, or form.

However, to stop at the level of content obscures other crucial aspects of curriculum because subject matter content within schools implies not only selection but also transmission of knowledge. As McEwan and Bull (1991) state,

    Subject matter is always an expression of a desire to communicate ideas to others. . . . Differences within the form and content of various expressions of subject matter reflect an understanding of differences in the backgrounds of potential audiences and the circumstances of the subject matter’s formulation. (p. 331)

Indeed, all content is pedagogical. It implies the communication of ideas to an audience and does so through the structuring of knowledge (Segall, 2004a, 2004b). The concept of curriculum, therefore, also implicates the structure of knowledge embedded in curricular form, that is, the form of how knowledge is organized and presented within a curriculum (Apple, 1995), as well as pedagogy, the intended form of communication of selected content. Thus the trilogy of (a) subject matter content knowledge, (b) structure or form of curricular knowledge, and (c) pedagogy are three defining aspects of “curriculum.” This basic conception of curriculum is what I use for the present analysis.

A test is high-stakes when its results are used to make important decisions that affect students, teachers, administrators, communities, schools, and districts (Madaus, 1988). In very specific terms, high-stakes tests are a part of a policy design (Schneider & Ingram, 1997) that “links the score on one set of standardized tests to grade promotion, high school graduation and, in some cases, teacher and principal salaries and tenure decisions” (Orfield & Wald, 2000, p. 38). As part of the accountability movement, stakes are also deemed high because the results of tests, as well as the ranking and categorization of schools, teachers, and children that extend from those results, are reported to the public (McNeil, 2000).

The Research Debate

The question of whether high-stakes testing affects curriculum has been highly contested in the field of educational research. For instance, at a time when high-stakes testing policies were inconsistently implemented across individual states, Airasian (1987)
and Madaus (1988) offered some of the earliest assertions that the tests would control classroom practice. M. L. Smith (1991) followed with one of the few early empirical studies, finding that high-stakes tests promote “multiple choice teaching.” More recent research on high-stakes testing is more conflicted. Some research finds that high-stakes tests merely represent one limited factor, among others, influencing classroom practice (see, e.g., Cimbricz, 2002; Firestone, Mayrowetz, & Fairman, 1998; Grant, 2003), have little to no influence on what teachers do in the classroom (see, e.g., Gradwell, 2006; van Hover, 2006), or lead to improved learning experiences and positive educational outcomes (see, e.g., Braun, 2004; Williamson, Bondy, Langley, & Mayne, 2005). Other research challenges these claims, however, finding that high-stakes testing undermines education because it narrows curriculum, limits the ability of teachers to meet the sociocultural needs of their students, and corrupts systems of educational measurement (see, e.g., Amrein & Berliner, 2002a, 2002b; Lipman, 2004; McNeil, 2000; McNeil & Valenzuela, 2001; Nichols & Berliner, 2005, 2007; Watanabe, 2007). Given the wide range of research evidence, and given the ubiquity of high-stakes testing in education in the United States, the purpose of this study is to develop a broader, more complex understanding of the ways that these tests influence curriculum at the classroom level.

For the purposes of this study I have chosen to analyze examples of qualitative research because of their focus on human interaction and attention to the day-to-day functioning of schools and classrooms (Valenzuela, Prieto, & Hamilton, 2007). To review the body of evidence reported in qualitative studies, I draw on the methodology of qualitative metasynthesis (DeWitt-Brinks & Rhodes, 1992; Noblit & Hare, 1988; Sandelowski, Docherty, & Emden, 1997; Thorne, Jensen, Kearney, Noblit, & Sandelowski, 2004), also referred to as qualitative meta-analysis (McCormick, Rodney, & Varcoe, 2003). Qualitative metasynthesis is part of a tradition of metaresearch that involves synthesizing the results of qualitative studies to gain a better understanding of the general nature of a given phenomenon (DeWitt-Brinks & Rhodes, 1992; Thorne et al., 2004).

In this study I make use of a specific form of qualitative metasynthesis known as template analysis (Crabtree & Miller, 1999; King, 1998, 2006). In this form of thematic meta-analysis, textual data are coded using a template of codes designed by the researcher. These codes are often hierarchical in nature, starting with broad themes and moving toward more narrow or specific ones. In this case the textual data used are from the collection of qualitative studies gathered by the researcher. In template analysis the coding template is developed in two stages based on themes that arise from the body of textual data. In the first stage the researcher begins by developing an initial template based on a combination of a priori codes and an initial reading and coding of a subset of the textual data. In the second stage, the initial template is then applied to the whole data set, and codes are added to the template as new themes arise. This leads to the creation of the final template. The final template is then used to interpret the textual data set as a whole, and the findings are presented in some form (King, 1998, 2006).

The data set consists of 49 qualitative studies. These studies were gathered from a search completed in June of 2006 using the Educational Resources Information Center (ERIC), Academic Search, and Education Full Text databases, as well as the library book database at the University of Wisconsin, Madison. Initially, the search terms “high-stakes testing” and “state-mandated testing” were used to identify potential studies for use in my qualitative metasynthesis. This rather large initial pool was then narrowed to studies (a) based on original, scholarly research, (b) using qualitative methods, (c) taking place in the United States, and (d) specifically addressing the relationship between high-stakes tests and either curriculum or instruction, or both. Because this study focuses on the relationship between high-stakes testing and curriculum at the K–12 classroom level, the sample excludes studies that examine the relationship between high-stakes testing and retention, studies that focus on the role of high-stakes testing and access to teacher education programs (e.g., Praxis II), studies that focus on the tests themselves (e.g., discourse analyses of the actual test content), and policy studies that use qualitative methods to compare pressures between states. In addition, because of their ambiguous and complicated positions in school hierarchies, studies that focus on student teachers are also excluded.

Based on the self-identification of the researchers, the data gathered and analyzed from the 49 studies used in the qualitative metasynthesis performed here include at least 740 “teachers” identified as participants; 845 “educators” or “teachers and administrators” (not broken out into “teachers” alone) identified as participants; 96 schools identified as the focus of study; 38 districts identified as the level of focus of study; and cover at least 19 states (Arizona, Colorado, Florida, Illinois, Kansas, Kentucky, Maine, Maryland, Massachusetts, Michigan, Minnesota, New York, North Carolina, Ohio, Oregon, Texas, Vermont, Virginia, and Washington). In addition, of the 49 qualitative studies used in this metasynthesis, 15 focus on elementary education, 23 focus on secondary education, and 11 are K–12 analyses. Alternatively, while several of the included studies (23) are more general in focus, 14 are history/social studies–specific (3 elementary and 11 secondary), 9 are English/language arts–specific (1 elementary and 8 secondary), and 3 are math/science–specific. (See Table 1 for a complete listing of the studies analyzed here.)

For this study I tracked the citation information, research sites, scope, and methods of inquiry of the 49 qualitative studies, including the dominant themes in each study’s findings. I then coded dominant themes using the above definition of curriculum as the framework for my initial template of analysis. Thus my thematic coding began with three broad categories: Subject Matter Content, Pedagogy, and Structure of Knowledge. Consistent with the template analysis methodological framework, the full elaboration of my coding template evolved during the course of the research. For instance, it has been widely asserted over the past 20-plus years that high-stakes tests cause a narrowing or contraction of nontested subject areas. I was aware of research substantiating this assertion prior to beginning the template analysis and thus assumed that I would need to code the studies that reported the theme of contraction of subject matter content. Based on my previous understandings and
on an initial analysis of qualitative studies, I produced an initial template of codes. However, after I undertook the template analysis, for instance, in addition to finding the theme of narrowing/contraction of curriculum to align with high-stakes tests, I also encountered the theme of subject matter content expansion. This finding required the addition of a new thematic code. As I read and reread the 49 qualitative studies, I added thematic codes as the patterns emerged and used them to develop the final template of codes for metasynthesis.

Table 1. Qualitative Metasynthesis Studies and Codes
SAC, PCT, KCF
SAC, SAE, PCT, KCF
SAC, PCT, KCF
Clarke et al., 2003
SAC, SAE, PCT, PCS, KCI
Debray, Parson, & Avila, 2003
SAE, PCT, KCI
Firestone, Mayrowetz, &
SAC, PCT, KCF
Gerwin & Visone, 2006
SAC, PCT, KCF
Grant et al., 2002
SAC, PCT, KCF
SAC, PCT, KCF
SAE, PCS, KCI
SAC, PCT, KCF
Lomax et al., 1995
SAC, PCT, KCF
Luna & Turner, 2001
SAC, SAE, PCT, KCF
SAC, PCT, KCF
McNeil & Valenzuela, 2001
SAC, PCT, KCF
Murillo & Flores, 2002
SAC, PCT, KCF
Renter et al., 2006
Rex & Nelson, 2004
SAE, PCS, KCI
SAE, SAC, PCT, KCF, KCI
SAC, PCT, KCF
Smagorinsky, Lakly, &
SAC, PCT, KCF
Smith, A. M., 2006
SAE, SAC, PCT, KCF
Taylor et al., 2001
SAC, PCT, KCF
SAC, PCT, KCF
van Hover, 2006
van Hover & Heinecke, 2005
Williamson et al., 2005
Wolf & Wolf, 2002
SAE, PCS, KCI
SAE, PCS, KCI
Wright & Choi, 2005
SAE, PCS, KCI
SAC, PCT, KCF
a. See Table 2 and the text discussion of it for explanations of the codes.
b. These two studies reported no curricular changes in response to high-

Table 2. Qualitative Metasynthesis Code Template
SAC: Subject matter content alignment,
SAE: Subject matter content alignment,
KCF: Form of knowledge changed,
KCI: Form of knowledge changed,
PCT: Pedagogic change to teacher-centered
PCS: Pedagogic change to student-centered

The thematic codes in Table 2 can be explained as follows. The first set of thematic codes seeks to track whether teachers, as individual actors at the classroom level, aligned their classroom content to the high-stakes tests. If they did, the thematic codes then mark the nature of this alignment: either subject matter content expansion or subject matter content contraction. In looking for subject matter contraction, I studied the research findings for occurrences of teachers and schools reducing the amount of instructional time and course offerings in either tested or nontested subject areas. An example of findings being coded for subject matter contraction can be found in the research of Renter and colleagues (Renter et al., 2006), who found that schools were reducing the amount of instruction in science and social studies because those subjects were not a focus of the high-stakes tests. Conversely, in looking for subject matter expansion, I analyzed the data for reports of teachers and schools increasing the teaching of either tested or nontested subjects in response to high-stakes tests. Vogler (2003) is an example of a study that was coded for test-related content expansion because he found that social studies teachers in his study added language arts/literacy instruction to their social studies curriculum in response to high-stakes tests, which tested for writing but not for social studies content knowledge.

The second set of thematic codes tracked whether the high-stakes tests affected curricular knowledge forms. This theme was perhaps the most elusive of the three because it required that I follow how teachers organized the knowledge in their classrooms in relation to high-stakes testing. If a study reported that there was a shift in how teachers structured the knowledge they taught, I then coded for whether classroom knowledge forms became more fragmented and isolated into discrete, test-driven bits or became more expansive and inclusive, organized into integrated wholes. As an instance of a study being coded for knowledge fragmentation, one study in this metasynthesis found that math and science were increasingly being taught as a collection of procedures and facts, as opposed to being taught as conceptual, thematic, and higher-order mathematic and scientific thinking (Lomax, West, Harmon, Viator, & Madaus, 1995). Such test-influenced instruction thus essentially fragmented the content knowledge into individuated and isolated procedures and facts for use on the high-stakes test. Other examples can be
found where researchers reported that subjects such as social studies were broken up into collections of historical data (see, e.g., Grant et al., 2002) or subjects such as writing were reduced to the production of formulaic and procedural five-paragraph essays (see, e.g., Hillocks, 2002). Conversely, more integrated knowledge forms were coded in studies that found, for instance, some teachers focusing on more conceptual, higher-order thinking that sought to develop more holistic understanding of mathematics (see, e.g., Firestone et al., 1998) or studies that found language arts teachers focusing more conceptually on the process of writing as opposed to step-by-step procedural essay writing (see, e.g., Hillocks, 2002).

Third, I looked at the theme of teachers’ pedagogy in response to high-stakes tests. If a study reported that teachers changed their instructional practice because of the testing, then I coded for the theme of teacher-centered instructional strategies or the theme of student-centered instructional strategies. In tracking these themes, I analyzed the studies’ findings for evidence of teachers’ increasing their use of direct instruction or increasing their use of more interactive pedagogies in response to the tests. For instance, in their research into high-stakes-testing-related social studies instruction, Gerwin and Visone (2006) found that teachers in their study showed dramatic increases in the amount of teacher-centered, fact-driven instruction in subjects included in state-mandated tests. Studies such as this were coded as demonstrating increased teacher-centered pedagogy. Studies reporting teachers’ increasing the amount of student-centered, constructivist instruction in response to high-stakes tests, for example, some studies of language arts classrooms where teachers increased their use of interactional and student-led activities (see Wollman-Bonilla, 2004), were coded accordingly.

Once coding was completed, I analyzed the codes for patterns and anomalies on three levels. First, looking at the data as a whole collection, I tracked the predominant themes in terms of individual codes, essentially asking, What do these studies tell us about the overall effects of high-stakes testing on curriculum in terms of content, form, and pedagogy? Within this first level of analysis, I then sought to find relationships between the trends at the level of the single codes and other contextual variables found within the research, looking for overlaps between grade levels and subject areas and the trends found among individual themes.

At the second level I analyzed theme pairings. This involved tracking the number of times that particular codes appeared in corresponding pairs to determine if any relationships existed between changes in content, knowledge structures, and pedagogy. At this level of analysis, I also tracked whether the pairings corresponded to particular grade levels or subject areas.

Finally, at the third level, I analyzed theme triplets, seeking any potential connections between all three areas of content, pedagogy, and knowledge form in relation to the effects of high-stakes testing on classroom practice.

In addition to these three levels of analysis, I looked at the anomalies or weaker thematic relationships. Some studies simply came up with singular findings that did not match or support the trends and patterns of the larger metasynthesis; some groups of studies (such as are found within the social studies) were more conflicted in their findings.

Reliability is a known issue within template analysis (King, 1998, 2006; Pawson, Greenhalgh, Harvey, & Walshe, 2005), and I have used two strategies to ensure the reliability of the findings of this study. First, to empirically determine the interrater reliability of my own coding, two colleagues independently coded findings of a sample subset of 10 studies. The findings of these coders were then checked against my own, resulting in the following interrater reliability percentages: subject matter content contraction, 86.7%; subject matter content expansion, 83.3%; knowledge fragmentation, 93.3%; knowledge integration, 96.7%; teacher-centered pedagogy, 90%; student-centered pedagogy, 86.7%. The overall interrater reliability for this study was 89.4%.

Second, reliability in template analysis is also improved when researchers are explicitly reflexive about both the process of their research and their positioning in relation to their study (King, 1998, 2006; Pawson et al., 2005). Thus it is important to explain my research orientation. I approach this study from within the critical realist tradition, which holds that a real world exists objectively outside human perception, that this world is to varying extents knowable through human cognition, and that this world is in fact changeable relative to our knowledge of it. Furthermore, critical realism recognizes human subjectivity in the understanding of the externally existing world, and as such views knowledge as a social process and as fallible. In these ways, critical realism simultaneously rejects both positivist objectivist and relativist subjectivist theories of knowledge in favor of an epistemology that in essence synthesizes aspects of both: an objectively existing world and a socially mediated understanding of that world (Benton & Craib, 2001; Bhaskar, 1989). Consequently, my use of template analysis combined with critical realism makes this study a form of realist review (Pawson et al., 2005).

My critical realist positioning also influences this study in that the use of the word “critical” points to a particular set of political commitments on the part of the researcher. Critical realists seek to understand the world to change it for the better, and seek to reflexively understand social mechanisms to promote social equality (Benton & Craib, 2001; Bhaskar, 1989). A similar political commitment underlies the impetus for this study, because I, as a social justice educator, scholar, and activist, have sought to understand the relationship between education and power (see, e.g., Au, 2005, 2006; Au & Apple, 2004). As such, I am interested in the relationship between high-stakes testing and inequalities associated with race and socioeconomic status (see, e.g., Hunter & Bartee, 2003; Kim & Sunderman, 2005; Sirin, 2005). However, although ultimately inseparable from my overall research agenda, for the purposes of this study I have attempted to put my political commitments aside in favor of a focused empirical analysis of how high-stakes testing affects curriculum. Thus, although these effects may have implications for educational equality and social justice, I have made a conscious choice here to bracket those implications as beyond the scope of this specific study and analysis.

Before presenting the findings, it is important to recognize that this study has a specific focus and is therefore limited in at least two particular ways. First, in this metasynthesis I inquire into the frequency
and types of curricular change induced by high-stakes testing. Consequently, my inquiry excludes instances where high-stakes testing does not affect the curriculum. As this study’s findings will show, the body of research analyzed here focuses predominantly on test-related events, as opposed to test-related nonevents. In this regard, even though a handful of studies included here specifically focus on a lack of test-related instructional changes (see, e.g., Bolgatz, 2006; Gradwell, 2006; Grant, 2003), the findings of this qualitative metasynthesis are inherently skewed toward what the researchers in the majority of these studies chose to focus on in their research: classroom-level changes due to high-stakes tests.

A second way in which the findings of this qualitative metasynthesis are limited relates to the time periods reported on. The studies analyzed here report inconsistently on how curriculum changes in response to high-stakes testing relative to time. Thus some studies focus on periods of curricular change in the months, weeks, or days leading up to high-stakes tests, and others focus on test-related curricular change more generally. Consequently, it was difficult to ascertain whether high-stakes testing was affecting the curriculum all year or simply in time periods immediately preceding the tests. I would argue, however, that these two limits do not take away from the power of the findings presented here. Rather, the limits simply refine the focus of this qualitative metasynthesis, which provides a snapshot and general depiction of the types and frequency of changes made to curricula in high-stakes testing environments.

As Table 3 indicates, the findings of this study suggest that there is a significant relationship between the implementation of high-stakes testing and changes in the content of a curriculum, the structure of knowledge contained within the content, and the types of pedagogy associated with communication of that content. These changes represent three types of control that high-stakes tests exert on curriculum: content control, formal control, and pedagogic control.

Table 3. Summary Findings: Effects of High-Stakes Testing on Curriculum
Studies, N = 49
Exemplar of Dominant Theme
A Colorado teacher: “Our district has told us to focus on reading, writing, and mathematics. Therefore, science and social studies . . . don’t get taught.” Taylor et al., 2001, p. 30
A Massachusetts teacher: “You know, we’re not really teaching them how to write. We’re teaching them how to follow a format. . . . It’s like . . . they’re doing paint-by-numbers.” Luna & Turner, 2001, p. 83
A Kansas teacher: “ . . . I don’t get to do as many fun activities, like cooperative learning activities or projects. . . . [T]his year I’ve done a lot more direct teaching than being able to do student-led learning. . . .” Clarke et al., 2003, p. 50
Note. Individual code totals do not necessarily equal the total for any one category because some studies exhibit multiple, even contradictory, codes; for example, subject alignment contraction and subject alignment expansion may appear in the same study.

The dominant theme found in the qualitative research regarding high-stakes testing and curriculum is that of content alignment. More than 80% of the studies contained the theme of curricular content change, whether by contraction or expansion. Furthermore, as Table 3 shows, in an overwhelming number of the qualitative studies, participants reported instances of the narrowing of curriculum, or curricular contraction to tested subjects. This phenomenon was the most prominent way in which “teaching to the test” manifested in curricula, as nontested subjects were increasingly excluded from curricular content. A more detailed analysis finds that the narrowing of curricular content was strongest among participants in the studies that focused on secondary education, with the most narrowing found in studies of social studies and language arts classrooms. In addition, in another expression of curricular alignment, a significant minority of studies reported some form of content expansion as a result of high-stakes testing, with most of these coming from studies focusing on secondary education and social studies classrooms. As the above evidence suggests, whether in the form of content contraction or content expansion, high-stakes testing leverages a significant amount of content control over curriculum.

Table 3 also indicates that, in a significant number of the qualitative studies, participants reported changes to the form that curricular knowledge took in response to high-stakes testing. The dominant theme in this category suggests that there is a relationship between high-stakes testing and teachers’ increasing the fragmentation of knowledge. Such fragmentation manifested in the teaching of content in small, individuated, and isolated test-size pieces, as well as teaching in direct relation to the tests rather than in relation to other subject matter knowledge. However, it is important to note that, as shown in Table 3, a minority of studies found that high-stakes testing had led to the increased integration of knowledge in the classroom. Thus, within the body of qualitative research, a dominant theme is that, whether leading to fragmentation or integration of knowledge, high-stakes testing affects curricular form, that is, it leverages formal control over the curriculum.

A third dominant theme that appears in the qualitative research is pedagogic change. As shown in Table 3, a significant number of participants in qualitative studies reported that their pedagogy changed in response to high-stakes tests and that a significant majority of the changes included an increase in teacher-centered instruction associated with lecturing and the direct transmission of test-related facts. In addition, as Table 3 indicates, a small but important number of studies exhibited the theme of increased student-centered instruction as an effect of high-stakes testing. Further analysis shows that, in this metasynthesis, a cluster of test-related, teacher-centered pedagogy exists surrounding instruction in both language arts and social studies classrooms. Whether in the form of increased teacher-centered instruction or increased student-centered instruction, the evidence suggests that high-stakes testing exerts significant pedagogic control over curriculum.

Table 4. Summary of Selected Theme Pairings

An analysis of theme pairings generally mirrors the above findings but also provides a more nuanced outline of potentially significant relationships between dominant themes.
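
The pairing tally behind this level of analysis, counting how often two codes appear in the same study, can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the study names and code assignments are hypothetical placeholders rather than the Table 1 data, and the function name `count_pairings` is invented for this example. It is not a description of the procedure actually used in the study.

```python
from collections import Counter
from itertools import combinations

# Hypothetical code assignments, for illustration only. These are NOT the
# article's Table 1 data; study names and codes are placeholders.
coded_studies = {
    "study_a": {"SAC", "PCT", "KCF"},
    "study_b": {"SAC", "SAE", "PCT", "KCF"},
    "study_c": {"SAE", "PCS", "KCI"},
    "study_d": {"SAC", "PCT"},
}


def count_pairings(studies):
    """Count how many studies contain each unordered pair of codes."""
    pair_counts = Counter()
    for codes in studies.values():
        # Sorting gives each pair a single canonical ordering.
        for pair in combinations(sorted(codes), 2):
            pair_counts[pair] += 1
    return pair_counts


pairings = count_pairings(coded_studies)

# List pairings from most to least frequent.
for pair, n in pairings.most_common():
    print(pair, n)
```

The same tally extends to theme triplets by passing 3 instead of 2 to `combinations`.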
As Table 4 indicates, the most prominent theme pairing suggests that there is a relationship between the narrowing of curriculum and an increase in teacher-centered instruction as teachers respond to pressures created by high-stakes testing environments. The next highest occurrence of theme pairing suggests that increased teacher-centered pedagogy and increased fragmentation of knowledge forms are likely to coincide in response to high-stakes testing. The third most frequent theme pairing suggests a relationship between curricular content narrowing and the fragmentation of knowledge forms, which are likely to occur together in response to high-stakes testing.

The findings further suggest that there are weaker but significant relationships between the expansion of subject matter and an increase of a more integrated structure of knowledge in response to high-stakes testing, as well as a contraction or narrowing of curricular content and a simultaneous content expansion. Three other significant theme pairings appear in the study, two of which are seemingly contradictory to the dominant trends outlined above. As Table 4 shows, theme pairing of curricular expansion and an increase in teacher-centered pedagogy in response to high-stakes testing was also found. Other findings showed increases in student-centered pedagogy paired with an increase in the integration of knowledge in response to high-stakes testing.

A total of 28 studies in this qualitative metasynthesis produced codes within each area of curriculum identified here. I now turn to the final level of analysis, examining these theme triplets to determine if there are any potential relationships between all three thematic areas. Overwhelmingly, the prevalent theme triplet in the qualitative research was the combination of contracting curricular content, fragmentation of the structure of knowledge, and increasing teacher-centered pedagogy in response to high-stakes testing. This theme triplet appears 21 times (75%) among the 28 studies that produced themes in all three areas, suggesting a relationship between the themes in response to high-stakes testing. The second most frequently occurring theme triplet, that of curricular content expansion, increasing integration of knowledge, and increasing student-centered instruction, appears 6 times (21.4%) in the study. This triplet is indeed the exact opposite of the dominant triplet.

Despite some researchers’ claims to the contrary, the findings of this study suggest that high-stakes tests encourage curricular alignment to the tests themselves. This alignment tends to take the form of a curricular content narrowing to tested subjects, to the detriment or exclusion of nontested subjects. The findings of this study further suggest that the structure of the knowledge itself is also changed to meet the test-based norms: Content is increasingly taught in isolated pieces and often learned only within the context of the tests themselves. Finally, in tandem with both content contraction and the fragmentation of knowledge, pedagogy is also implicated, as teachers increasingly turn to teacher-centered instruction to cover the breadth of test-required information and procedures. Thus I have identified three different, interrelated types of curricular control associated with high-stakes testing: content, formal, and pedagogic. The control over knowledge content and the form the knowledge takes are related to and associated with control of pedagogy as well.

As I noted in Tables 3 and 4, however, several less frequently occurring themes seemed to contradict the predominant findings of this study. The data suggest that in a small number of cases, high-stakes testing was associated with an increase in student-centered instruction, content integration, and subject matter expansion. For instance, there are seven simultaneous occurrences of the themes of content contraction and content expansion related to high-stakes tests, most of which come from secondary social studies and language arts (see, e.g., Anagnostopolous, 2003b; Luna & Turner, 2001; Segall, 2003; A. M. Smith, 2006; Vogler, 2003). In these cases, teachers are both adding some content to meet the demands of the tests and contracting content in other areas. In addition, because the stakes of state-mandated social studies testing vary greatly from state to state (Grant & Horn, 2006), the findings indicate that high-stakes-test-induced curricular expansion has taken place in social studies classrooms as teachers integrate reading-test-related literacy skills into their own social studies curricula (see, e.g., Vogler, 2003). Indeed, this phenomenon of expanding curricular content due to the integration of test-required literacy skills or test-specific content accounts for the majority of the instances of curricular expansion (see, e.g., Barton, 2005; Clarke et al., 2003; Libresco, 2005; Rex & Nelson, 2004; Wolf & Wolf, 2002; Wollman-Bonilla, 2004; Yeh, 2005).

There appears to be a similar relationship regarding the small numbers of increases in student-centered pedagogies relative to high-stakes testing. Almost all occurrences of the theme of increases in student-centered pedagogy occur with instances of subject matter expansion. These cases revolve around teachers whose test-based instruction involves the development of critical literacy skills (see, e.g., Clarke et al., 2003; Libresco, 2005; Rex & Nelson, 2004; Wolf & Wolf, 2002; Wollman-Bonilla, 2004; Yeh, 2005). For instance, New York State’s history exam involves a mix of multiple-choice questions and a document-based essay question (DBQ; Grant, 2003). Social studies teachers, in preparing students for DBQs, have the charge of teaching a specific critical literacy skill set instead of being forced to focus solely on a rigidly imposed collection of historical facts (see, e.g., Bolgatz, 2006; Clarke et al., 2003; Grant, 2003; Libresco, 2005). It is likely that teachers in these studies thus find the potential for increased flexibility in the content and pedagogy they use to teach social studies in their respective high-stakes environments.

Furthermore, because social studies instruction figures prominently in the above contradictory findings, and because the only two studies to argue that testing does not influence any aspect of curriculum also focus on this subject area (Gradwell, 2006; Grant, 2003), it is also possible that social studies represents a special case in relation to high-stakes testing and curricular control (Au, in press).

The above discussion indicates a likely relationship between the construction of the high-stakes tests themselves and the curricular changes induced by the tests. Research supports the existence of such a relationship. As Yeh (2005) finds, teachers in Minnesota report that their pedagogy is not negatively affected by high-stakes tests because they feel the tests there are well designed and do not promote drill and rote memorization. Another example comes from Hillocks (2002), who analyzes the teaching of writing in relation to the writing examinations delivered in Texas, Illinois, New York, Oregon, and Kentucky. One of Hillocks’s main findings is that states with poorly designed systems of writing assessment promote a technical, mechanical, five-paragraph essay form, and that teachers’ pedagogy adapts to that form in those states. The findings of these studies suggest that test construction matters in terms of teachers’ curricular responses to high-stakes tests (see also Clarke et al., 2003).

In this study, using a form of qualitative metasynthesis called template analysis, I have reviewed the findings of 49 qualitative studies addressing the impact of high-stakes testing on curriculum. As Tables 3 and 4 indicate, the evidence presented here strongly suggests that as teachers negotiate high-stakes testing educational environments, the tests have the predominant effect of narrowing curricular content to those subjects included in the tests, resulting in the increased fragmentation of knowledge forms into bits and pieces learned for the sake of the tests themselves, and compelling teachers to use more lecture-based, teacher-centered pedagogies.

Another significant finding of this study is that, in a minority of cases, high-stakes tests have led to increases in student-centered pedagogy and increases in content knowledge integration. Combined, these findings indicate that high-stakes testing exerts significant amounts of control over the content, knowledge forms, and pedagogies at the classroom level.

The curricular control found in this study further suggests that high-stakes testing represents the tightening of the loose coupling between policymakers’ intentions and the institutional environments created by their policies (Burch, 2007). This conclusion should not be surprising to educational researchers and practitioners because systems of educational accountability built on high-stakes, standardized tests are in fact intended to increase external control over what happens in schools and classrooms. As Moe (2003) explains, the rationale behind systems of high-stakes accountability is quite clear:

The movement for school accountability is essentially a movement for more effective top-down control of the schools. The idea is that, if public authorities want to promote student achievement, they need to adopt organizational control mechanisms—tests, school report cards, rewards and sanctions, and the like—designed to get district officials, principals, teachers, and students to change their behavior. . . . Virtually all organizations need to engage in top-down control, because the people at the top have goals they want the people at the bottom to pursue, and something has to be done to bring about the desired behaviors.
The public school system is just like other organizations in this respect. (p. 81)

The intentions of promoters of high-stakes test-based educational reforms are thus apparent in the policy designs, which are purposefully constructed to negate “asymmetries” between classroom practice and the policy goals of those with political and bureaucratic power (Wößmann, 2003).

Given the central findings of this study, however, a crucial question is raised: Are test-driven curriculum and teacher-centered instruction good or bad for teachers, students, schools, communities, and education in general? Considering the body of research connecting high-stakes testing with increased drop-out rates and lower achievement for working-class students and students of color (see, e.g., Amrein & Berliner, 2002b; Groves, 2002; Madaus & Clarke, 2001; Marchant & Paulson, 2005; Nichols, Glass, & Berliner, 2005), the findings of this study point to the need for further analysis of how curricular control may or may not contribute to educational inequality.

I would like to thank Diana Hess, Simone Schweber, Keita Takayama, Ross Collin, Eduardo Cavieres, Quentin Wheeler-Bell, the three anonymous ER reviewers, and ER editor Gregory Camilli for their invaluable feedback on this article.

References marked with an asterisk indicate studies included in the

*Agee, J. (2004). Negotiating a teaching identity: An African American teacher’s struggle to teach in test-driven contexts. Teachers College Record, 106(4), 747–774.
Airasian, P. W. (1987). State mandated testing and educational reform: Context and consequences. American Journal of Education, 95(3),
Amrein, A. L., & Berliner, D. C. (2002a, December). An analysis of some unintended and negative consequences of high-stakes testing. Tempe, AZ: Educational Policy Studies Laboratory, Arizona State University. Retrieved February 12, 2006, from http://www.asu.edu/educ/epsl/
Amrein, A. L., & Berliner, D. C. (2002b). High-stakes testing, uncertainty, and student learning. Education Policy Analysis Archives, 10(18). Retrieved September 27, 2005, from http://epaa.asu.edu/epaa/v10n18
*Anagnostopolous, D. (2003a). The new accountability, student failure, and teachers’ work in urban high schools. Educational Policy, 17(3), 291–316.
*Anagnostopolous, D. (2003b). Testing and student engagement with literature in urban classrooms: A multi-layered perspective. Research in the Teaching of English, 38(2), 177–212.
Apple, M. W. (1995). Education and power (2nd ed.). New York: Routledge.
Au, W. (2005). Power, identity, and the third rail. In P. C. Miller (Ed.), Narratives from the classroom: An introduction to teaching (pp. 65–85). Thousand Oaks, CA: Sage.
Au, W. (2006, November). Against economic determinism: Revisiting the roots of neo-Marxism in critical educational theory. Journal for Critical Education Policy Studies, 4(2). Retrieved December 12, 2006,
Au, W. (in press). Social studies, social justice: W(h)ither the social studies in high-stakes testing? Teacher Education Quarterly.
Au, W., & Apple, M. W. (2004). Interrupting globalization as an educational practice. Educational Policy, 18(5), 784–793.
*Barton, K. C. (2005). “I’m not saying these are going to be easy”: Wise practice in an urban elementary school. In E. A. Yeager & O. L. Davis Jr. (Eds.), Wise social studies teaching in an age of high-stakes testing (pp. 11–31). Greenwich, CT: Information Age Publishing.
Beauchamp, G. A. (1982). Curriculum theory: Meaning, development, and use. Theory Into Practice, 21(1), 23–27.
Benton, T., & Craib, I. (2001). Philosophy of social science: The philosophical foundations of social thought. New York: Palgrave.
Bhaskar, R. (1989). Reclaiming reality: A critical introduction to contemporary philosophy (2nd ed.). New York: Verso.
*Bol, L. (2004). Teachers’ assessment practices in a high-stakes testing environment. Teacher Education and Practice, 17(2), 162–181.
*Bolgatz, J. (2006). Using primary documents with fourth-grade students: Talking about racism while preparing for state-level tests. In S. G. Grant (Ed.), Measuring history: Cases of state-level testing across the United States (pp. 133–156). Greenwich, CT: Information Age Publishing.
*Booher-Jennings, J. (2005). Below the bubble: “Educational triage” and the Texas accountability system. American Educational Research Journal, 42(2), 231–268.
Braun, H. (2004). Reconsidering the impact of high-stakes testing. Education Policy Analysis Archives, 12(1). Retrieved March 29, 2006, from
*Brimijoin, K. (2005). Differentiation and high-stakes testing: An oxymoron? Theory Into Practice, 44(3), 254–261.
Burch, P. E. (2007). Educational policy and practice from the perspective of institutional theory: Crafting a wider lens. Educational Researcher,
Cimbricz, S. (2002). State testing and teachers’ thinking and practice. Education Policy Analysis Archives, 10(2). Retrieved September 4, 2006, from http://epaa.asu.edu/epaa/v10n2.html
*Clarke, M., Shore, A., Rhoades, K., Abrams, L. M., Miao, J., & Li, J. (2003, January). Perceived effects of state-mandated testing programs on teaching and learning: Findings from interviews with educators in low-, medium-, and high-stakes states. Boston: National Board on Educational Testing and Public Policy, Lynch School of Education, Boston College. Retrieved March 20, 2006, from http://www.bc.edu/
*Costigan, A. T., III. (2002). Teaching the culture of high stakes testing: Listening to new teachers. Action in Teacher Education, 23(4),
Crabtree, B. F., & Miller, W. L. (1999). Using codes and code manuals: A template organizing style of interpretation. In B. F. Crabtree & W. L. Miller (Eds.), Doing qualitative research (2nd ed., pp. 163–178). Thousand Oaks, CA: Sage.
*Debray, E., Parson, G., & Avila, S. (2003). Internal alignment and external pressure. In M. Carnoy, R. Elmore, & L. S. Siskin (Eds.), The new accountability: High schools and high-stakes testing (pp. 55–85). New York: RoutledgeFalmer.
DeWitt-Brinks, D., & Rhodes, S. C. (1992, May 20–25). Listening instruction: A qualitative meta-analysis of twenty-four selected studies. Paper presented at the annual meeting of the International Communication Association, Miami, FL.
Eisner, E. W. (1994). The educational imagination: On the design and evaluation of school programs (3rd ed.). New York: Macmillan.
*Fickel, L. H. (2006). Paradox of practice: Expanding and contracting curriculum in a high-stakes climate. In S. G. Grant (Ed.), Measuring history: Cases of state-level testing across the United States (pp. 75–103). Greenwich, CT: Information Age Publishing.
*Firestone, W. A., Mayrowetz, D., & Fairman, J. (1998). Performance-based assessment and instructional change: The effects of testing in Maine and Maryland. Educational Evaluation and Policy Analysis,
*Gerwin, D., & Visone, F. (2006). The freedom to teach: Contrasting history teaching in elective and state-tested course. Social Education,
*Gradwell, J. M. (2006). Teaching in spite of, rather than because of, the test: A case of ambitious history teaching in New York State. In S. G. Grant (Ed.), Measuring history: Cases of state-level testing across the United States (pp. 157–176). Greenwich, CT: Information Age Publishing.
*Grant, S. G. (2003). History lessons: Teaching, learning, and testing in U.S. high school classrooms. Mahwah, NJ: Lawrence Erlbaum.
*Grant, S. G., Gradwell, J. M., Lauricella, A. M., Derme-Insinna, A., Pullano, L., & Tzetzo, K. (2002). When increasing stakes need not mean increasing standards: The case of the New York state global history and geography exam. Theory and Research in Social Education,
Grant, S. G., & Horn, C. L. (2006). The state of state-level history tests. In S. G. Grant (Ed.), Measuring history: Cases of state-level testing across the United States (pp. 9–27). Greenwich, CT: Information Age Publishing.
*Groves, P. (2002). “Doesn’t it feel morbid here?” High-stakes testing and the widening of the equity gap. Educational Foundations, 16(2),
Harden, R. M. (2001). The learning environment and the curriculum. Medical Teacher, 23(4), 335–336.
*Hillocks, G., Jr. (2002). The testing trap: How state writing assessments control learning. New York: Teachers College Press.
Hunter, R. C., & Bartee, R. (2003). The achievement gap: Issues of competition, class, and race. Education and Urban Society, 35(2), 151–160.
Jackson, P. W. (1996). Conceptions of curriculum and curriculum specialists. In P. W. Jackson (Ed.), Handbook of research on curriculum: A project of the American Educational Research Association (pp. 3–40). New York: Simon & Schuster Macmillan.
Kim, J. S., & Sunderman, G. L. (2005). Measuring academic proficiency under the No Child Left Behind Act: Implications for educational equity. Educational Researcher, 34(8), 3–13.
King, N. (1998). Template analysis. In G. Symon & C. Cassell (Eds.), Qualitative methods and analysis in organizational research: A practical guide (pp. 118–134). London: Sage.
King, N. (2006, October 23). What is template analysis? University of Huddersfield School of Human and Health Sciences. Retrieved April 27, 2007, from http://www.hud.ac.uk/hhs/research/template_analysis/
Kliebard, H. M. (1989). Problems of definition in curriculum. Journal of Curriculum and Supervision, 5(1), 1–5.
*Landman, J. (2000, January 7). A state-mandated curriculum, a high-stakes test: One Massachusetts high school history department’s response to a very new policy context. Doctoral qualifying paper, Harvard Graduate School of Education, Cambridge, MA. (ERIC Document Reproduction Service No. ED440915)
*Libresco, A. S. (2005). How she stopped worrying and learned to love the test . . . sort of. In E. A. Yeager & O. L. Davis Jr. (Eds.), Wise social studies teaching in an age of high-stakes testing (pp. 33–49). Greenwich, CT: Information Age Publishing.
*Lipman, P. (2002). Making the global city, making inequality: The political economy and cultural politics of Chicago school policy. American Educational Research Journal, 39(2), 379–419.
Lipman, P. (2004). High stakes education: Inequality, globalization, and urban school reform. New York: RoutledgeFalmer.
*Lomax, R. G., West, M. M., Harmon, M. C., Viator, K. A., & Madaus, G. F. (1995). The impact of mandated standardized testing on minority students. Journal of Negro Education, 64(2), 171–185.
*Luna, C., & Turner, C. L. (2001). The impact of the MCAS: Teachers talk about high-stakes testing. English Journal, 91(1), 79–87.
Madaus, G. F. (1988). The influence of testing on the curriculum. In L. N. Tanner (Ed.), Critical issues in curriculum: Eighty-seventh yearbook of the national society for the study of education (pp. 83–121). Chicago: University of Chicago Press.
Madaus, G. F., & Clarke, M. (2001). The adverse impact of high-stakes testing on minority students: Evidence from one hundred years of test data. In G. Orfield & M. L. Kornhaber (Eds.), Raising standards or raising barriers? Inequality and high-stakes testing in public education (pp. 85–106). New York: Century Foundation Press.
Marchant, G. J., & Paulson, S. E. (2005, January 21). The relationship of high school graduation exams to graduation rates and SAT scores. Education Policy Analysis Archives, 13(6). Retrieved February 8, 2006,
McCormick, J., Rodney, P., & Varcoe, C. (2003). Reinterpretations across studies: An approach to meta-analysis. Qualitative Health Research,
McEwan, H., & Bull, B. (1991). The pedagogic nature of subject matter knowledge. American Educational Research Journal, 28(2), 316–334.
*McNeil, L. M. (2000). Contradictions of school reform: Educational costs of standardized testing. New York: Routledge.
*McNeil, L. M., & Valenzuela, A. (2001). The harmful impact of the TAAS system of testing in Texas: Beneath the accountability rhetoric. In G. Orfield & M. L. Kornhaber (Eds.), Raising standards or raising barriers? Inequality and high-stakes testing in public education (pp. 127–150). New York: Century Foundation Press.
Moe, T. M. (2003). Politics, control, and the future of school accountability. In P. E. Peterson & M. R. West (Eds.), No child left behind? The politics and practice of school accountability (pp. 80–106). Washington, DC: Brookings Institution Press.
*Murillo, E. G., Jr., & Flores, S. Y. (2002). Reform by shame: Managing the stigma of labels in high-stakes testing. Educational Foundations,
Arizona State University. Retrieved September 27, 2005, from
Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing corrupts America’s schools. Cambridge, MA: Harvard
Nichols, S. L., Glass, G. V., & Berliner, D. C. (2005, September). High-stakes testing and student achievement: Problems for the No Child Left Behind Act (No. EPSL-0509-105-EPRU). Tempe, AZ: Education Policy Studies Laboratory, Arizona State University. Retrieved September 27, 2005, from http://www.asu.edu/educ/epsl/EPRU/
Noblit, G. W., & Hare, R. D. (1988). Meta-ethnography: Synthesizing qualitative studies (Vol. 11). Newbury Park, CA: Sage.
Orfield, G., & Wald, J. (2000). Testing, testing: The high-stakes testing mania hurts poor and minority students the most. Nation, 270(22),
*Passman, R. (2001). Experiences with student-centered teaching and learning in high-stakes assessment environments. Education, 122(1),
Pawson, R., Greenhalgh, T., Harvey, G., & Walshe, K. (2005). Realist review: A new method of systematic review designed for complex policy interventions. Journal of Health Services Research & Policy, 10(1),
*Perreault, G. (2000). The classroom impact of high-stress testing. Education, 120(4), 705–710.
*Renter, D. S., Scott, C., Kober, N., Chudowsky, N., Joftus, S., & Zabala, D. (2006, March 28). From the capital to the classroom: Year 4 of the No Child Left Behind Act. Washington, DC: Center on Education Policy. Retrieved March 28, 2006, from http://www.cep-dc.org
*Rex, L. A. (2003). Loss of the creature: The obscuring of inclusivity in classroom discourse. Communication Education, 52(1), 30–46.
*Rex, L. A., & Nelson, M. C. (2004). How teachers’ professional identities position high-stakes preparation in their classrooms. Teachers College Record, 106(6), 1288–1331.
*Salinas, C. (2006). Teaching in a high-stakes testing setting: What becomes of teacher knowledge. In S. G. Grant (Ed.), Measuring history: Cases of state-level testing across the United States (pp. 177–193). Greenwich, CT: Information Age Publishing.
Sandelowski, M., Docherty, S., & Emden, C. (1997). Qualitative metasynthesis: Issues and techniques. Research in Nursing & Health, 20(4),
Schneider, A. L., & Ingram, H. (1997). Policy design for democracy. Lawrence: University of Kansas.
*Segall, A. (2003). Teachers’ perceptions of the impact of state-mandated standardized testing: The Michigan Educational Assessment Program (MEAP) as a case study of consequences. Theory and Research in Social Education, 31(3), 287–325.
Segall, A. (2004a). Blurring the lines between content and pedagogy. Social Education, 68(7), 479–482.
Segall, A. (2004b). Revisiting pedagogical content knowledge: The pedagogy of content/the content of pedagogy. Teaching and Teacher Education, 20, 489–504.
Sirin, S. R. (2005). Socioeconomic status and student achievement: A meta-analytic review of research. Review of Educational Research,
*Siskin, L. S. (2003). Outside the core: Accountability in tested and untested subjects. In M. Carnoy, R. Elmore, & L. S. Siskin (Eds.), The new accountability: High schools and high-stakes testing (pp. 87–98). New York: RoutledgeFalmer.
Nichols, S. L., & Berliner, D. C. (2005, March). The inevitable corrup-
*Sloan, K. (2005). Playing to the logic of the Texas accountability sys-
tion of indic