Judgment and Decision Making, Vol. 4, No. 7, December 2009, pp. 554–566
Are Within-subjects designs transparent?
Charles Lambdin* and Victoria A. Shaffer
Department of Psychology, Wichita State University

Abstract
Researchers frequently argue that within-subjects designs should be avoided because they result in research hypotheses that are transparent to the subjects in the study. This conjecture was empirically tested by replicating several classic between-subjects experiments as within-subjects designs. In two additional experiments, psychology students were given the within-subjects versions of these studies and asked to guess what the researcher was hoping to find (i.e. the research hypothesis), and members of the Society for Judgment and Decision Making (SJDM) were asked to predict how well students would perform this task. On the whole, students were unable to identify the research hypothesis when provided with the within-subjects version of the experiments. Furthermore, SJDM members were largely inaccurate in their predictions of the transparency of a within-subjects design.
Keywords: methodology, research design.
*These experiments are based on a dissertation submitted by the first author in partial fulfillment of the requirements for a Ph.D. degree at Wichita State University. Address: Victoria A. Shaffer, Department of Psychology, Wichita State University, 1845 Fairmount St., Wichita, KS, 67260–0034. E-mail: victoria.shaffer@wichita.edu

1 Introduction

In the field of psychology, there is a long-standing controversy over the appropriate use of between- and within-subjects designs. Within-subjects designs have greater power and less variability, but many researchers eschew their use for two reasons. First, it has been argued that within-subjects designs render our research hypotheses transparent (e.g., Tversky & Kahneman, 1983). That is, in a within-subjects design, subjects will be aware of the purposes of our experiment and may behave accordingly, thus posing a threat to the internal validity of the experiment. Second, some have argued that life is more similar to a between-subjects design (Fischhoff, Slovic & Lichtenstein, 1979; Kahneman, Slovic & Tversky, 1982). Therefore, between-subjects designs increase the generalizability of the experimental findings. However, others have argued that between-subjects designs pose their own risks. Parducci (1965) and Birnbaum (1999) contend that, particularly with subjective judgments, the between-subjects design should be abandoned because it results in the confounding of context and stimulus.

The most famous demonstration of this principle was provided by Birnbaum (1999), who showed that in a between-subjects design subjects rated the number 9 as being significantly larger than 221. Theoretically, in a between-subjects design, the two conditions are identical except for the manipulation of a single stimulus. In this case, the stimulus to be manipulated was the number being rated, 9 or 221. However, Birnbaum argues that subjective judgments cannot be made in isolation; they require a context. In a within-subjects design, the context is specified; the two (or more) conditions are compared to each other. In a between-subjects design, the subjects are left to construct their own context to evaluate the stimulus. When this is the case, it is very likely that different contexts will be invoked for different stimuli. In this example, 9 is likely to bring to mind other single-digit numbers for comparison, thus leaving the impression that 9 is a relatively large number. In contrast, 221 is likely to bring to mind other triple-digit numbers for comparison, leaving the impression that 221 is a relatively small number. Thus, in a between-subjects design both the stimuli and the context vary between conditions, confounding the results.

Although the relative merits may be theoretically debated, what is more troubling is that hypotheses tested in these two designs often do not result in the same conclusions (Grice, 1966). For example, in between-subjects comparisons, manipulations of base rates do not affect judgments of probability, leading to the conclusion that base rates are ignored (e.g., Kahneman & Tversky, 1973). However, in within-subjects comparisons, base rates have large, significant effects on judgments of probability, leading to the conclusion that base rates are not ignored (Birnbaum & Mellers, 1983).

Despite the concerns about using between-subjects designs to evaluate subjective judgments raised by Birnbaum and others, it appears that the within-subjects design has fallen out of favor in many areas of psychology. In particular, within judgment and decision making there exist many findings supported almost exclusively by between-subjects data. Examples include the hindsight bias (Fischhoff, 1975), research on reason-based choice (Shafir, 1993), availability (Schwarz, Bless, Strack, Klumpp, Rittenauer-Schatka & Simons, 1991), support theory (Tversky & Koehler, 1994) and many classic demonstrations of heuristics and biases (e.g., Kahneman et al., 1982). Researchers in these areas have frequently argued that between-subjects designs are more appropriate. One of the main reasons cited for the superiority of between-subjects designs is the belief that within-subjects designs are transparent (Bastardi & Shafir, 1998; Fischhoff, Slovic & Lichtenstein, 1979; Kahneman & Frederick, 2005; Tversky & Kahneman, 1983). Although this appears to be a popular reason for rejecting the within-subjects design, this assertion has never been empirically tested.

Thus, this gap in the literature inspired the two specific aims of this research. The first aim was to determine if classic examples of between-subjects designs could be replicated using within-subjects designs. The second was to empirically test the hypothesis that within-subjects designs are transparent.

2 Experiment 1

In Experiment 1, we attempted to replicate three classic between-subjects designs (Shafir, 1993; Tversky & Kahneman, 1981; Tversky & Kahneman, 1986) in a within-subjects format. In Shafir (1993), subjects were asked to pretend they were serving on a jury for a custody trial. They were presented with two generic parents, Parent A and Parent B. Parent A had a number of average attributes, whereas Parent B had both very positive and negative attributes. Shafir hypothesized that, because of the theory of reason-based choice, when asked to which parent custody should be awarded, subjects would selectively search for reasons to award custody. Since Parent B has more extreme positive attributes than A, most subjects would award custody to Parent B. Similarly, when asked to which parent custody should be denied, subjects would selectively search for reasons to deny custody. Since Parent B has more extreme negative attributes than A, most subjects would also deny custody to B.

In the Asian disease problem, Tversky and Kahneman (1981) presented subjects with two “programs” which were proposed solutions to a hypothetical outbreak of an “Asian disease”, from which 600 people were expected to die. One group of subjects was given a choice between Programs A and B and another group was given a choice between Programs C and D. Programs A and B were logically equivalent to Programs C and D. However, Programs A and B were presented in terms of lives saved (the “survival” frame) and Programs C and D were presented in terms of lives lost (the “mortality” frame). For instance, Program A states that, if adopted, 200 people will be saved. Program C states that, if adopted, 400 people will certainly die. Since the sample space is 600 people, these two statements should be seen as imparting the same information. Tversky and Kahneman (1981) hypothesized that a preference reversal would occur between the two frames, a framing effect. Subjects would choose Program A in the first pair and Program D in the second pair. Framing effects occur when different descriptions of functionally equivalent information cause people’s preferences to differ.

In Tversky and Kahneman’s (1986) marbles lotteries experiment, subjects were asked to choose one of two lotteries that they would like to play; these “lotteries” involved drawing a colored marble from a jar. One group of subjects was given Options A and B, and another group of subjects was shown Options C and D, both given below.

Option A
  90% white   6% red     1% green   1% blue    2% yellow
  $0          win $45    win $30    lose $15   lose $15

Option B
  90% white   6% red     1% green   1% blue    2% yellow
  $0          win $45    win $45    lose $10   lose $15

Option C
  90% white   6% red     1% green   3% yellow
  $0          win $45    win $30    lose $15

Option D
  90% white   7% red     1% green   2% yellow
  $0          win $45    lose $10   lose $15

Between Options A and B, B is the dominating lottery. The dominating lottery is the one with the better odds of winning. Between Options C and D, D is the dominating lottery. Options A and B are the same lotteries as Options C and D. The 6% red and 1% green to win $45 in Option B have simply been combined into Option D’s 7% red to win $45. Similarly, Option A’s 1% blue and 2% yellow to lose $15 have been combined into Option C’s 3% yellow to lose $15. Because of this, Option D now has two losing outcomes and only one winning, whereas Option C now has two winning outcomes and only one losing. It was hypothesized that subjects would choose Option C over D, even though Option D is the same as B and dominates C.
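
The relationships among the four options can be checked mechanically. The following Python sketch is an illustration added here and is not part of the original study materials. It encodes each option as (percent, payoff) pairs, merges outcomes that share a payoff, and tests dominance using first-order stochastic dominance, which is an assumed formalization of the phrase "better odds of winning". The helper names collapse and dominates are hypothetical.

```python
from collections import Counter

# Each lottery is a list of (percent chance, payoff in dollars),
# copied from the four options listed above.
OPTION_A = [(90, 0), (6, 45), (1, 30), (1, -15), (2, -15)]
OPTION_B = [(90, 0), (6, 45), (1, 45), (1, -10), (2, -15)]
OPTION_C = [(90, 0), (6, 45), (1, 30), (3, -15)]
OPTION_D = [(90, 0), (7, 45), (1, -10), (2, -15)]


def collapse(lottery):
    """Merge outcomes with the same payoff into a payoff -> percent map."""
    dist = Counter()
    for percent, payoff in lottery:
        dist[payoff] += percent
    return dict(dist)


def dominates(x, y):
    """Return True if x first-order stochastically dominates y.

    Assumption of this sketch: "dominating" is read as first-order
    stochastic dominance, i.e. at every payoff level x has no more
    probability of ending at or below that level than y, and the two
    collapsed distributions are not identical.
    """
    dx, dy = collapse(x), collapse(y)
    if dx == dy:
        return False
    cum_x = cum_y = 0
    for payoff in sorted(set(dx) | set(dy)):
        cum_x += dx.get(payoff, 0)
        cum_y += dy.get(payoff, 0)
        if cum_x > cum_y:
            return False
    return True
```

Under these assumptions, collapse(OPTION_A) == collapse(OPTION_C) and collapse(OPTION_B) == collapse(OPTION_D) both hold, and dominates(OPTION_B, OPTION_A) and dominates(OPTION_D, OPTION_C) both return True, which matches the description above: the second pair is the first pair with equal-payoff outcomes merged, and the dominance relation is unchanged by the merge.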

2.1 Method

Eighty-nine undergraduate students at Wichita State University volunteered to participate in a brief survey during a psychology class. All students received extra credit for their participation. The survey contained a within-subjects adaptation of three between-subjects experiments: the child custody case (Shafir, 1993), the Asian disease problem (Tversky & Kahneman, 1981) and the marbles lotteries (Tversky & Kahneman, 1986). See Appendix A for the full text of all three experiments. The order in which the two conditions were presented was randomized for each design. Because the order of presentation is randomized, the use of a within-subjects design allows for both between- and within-subjects comparisons of the same data. In each of the three studies there were two conditions (A and B). Half of the subjects responded to A then B, and the other half responded to B then A. If only the first response is analyzed, then between-subjects comparisons can be employed, albeit with only half the sample size.

Additionally, although we are discussing the two designs as completely different methods, they are really best thought of as two endpoints on a single continuum. A within-subjects design, in its most pure form, would provide all of the conditions to the subjects simultaneously. Alternatively, the different conditions could be placed on different pages, or administered on different days, weeks or months. As illustrated, these designs successively move further away from a pure within-subjects design toward the between-subjects endpoint. To test whether the place on this continuum matters, half of the subjects saw the two conditions presented on the same page and half saw the two conditions on two separate pages.

2.2 Results and discussion

2.2.1 Testing for differences along the “within-subjects continuum”

The purpose of the replications was to see whether the overall patterns of data would be similar between designs. The point, therefore, of testing for differences along the within-subjects continuum is not to altogether rule out their presence, but rather to ensure that, if present, they are not of a magnitude that would create a qualitative change in the overall pattern of data. With this study, we had adequate power (i.e., 80%) to detect differences of 20 percentage points between conditions presented on the same page and conditions presented on separate pages. We found no significant differences between the results of these three within-subjects studies presented on the same vs. different pages. Therefore, the results reported below are collapsed across this condition.

The child custody case: The data presented in Table 1 show the percentage of subjects choosing Parent A in the award and deny conditions.

Table 1: Percent (N) of subjects choosing Parent A.
                    Award       Deny
Within-subjects     74% (66)    24% (21)
Between-subjects    73% (33)    23% (10)

The within- and between-subjects analyses yielded identical conclusions. In both the “award” and “deny” conditions, the majority of subjects indicated that Parent A should have custody. These data did not replicate Shafir’s (1993) original findings that people will award and deny custody to the same parent due to reason-based choice.

The Asian disease problem: The data presented in Table 2 show the percentage of subjects choosing the risk-averse option in the survival and mortality frames.

Table 2: Percent (N) of subjects choosing the risk-averse option.
                    Survival frame   Mortality frame
Within-subjects     62% (55)         34% (30)
Between-subjects    76% (34)         27% (12)

In both the within- and between-subjects analyses of these data, the majority of subjects chose the risk-averse option in the survival frame and the risk-seeking option in the mortality frame. This replicates Tversky and Kahneman’s (1981) original results.

Marbles lotteries: The data presented in Table 3 show the percentage of subjects choosing the dominating option in Pair 1 (Option A vs. Option B) and Pair 2 (Option C vs. Option D). In Pair 1, Option B is the dominant option, hence the rational choice. In Pair 2, Option D is the dominant option. The within-subjects analyses used data from both pairs for all 89 subjects. The between-subjects analyses used data from only the first pair shown. Thus, half of the subjects (N = 45) saw Pair 1 first and half of the subjects (N = 44) saw Pair 2 first.

Table 3: Percent (N) of subjects choosing the dominant lottery.
                    Pair 1      Pair 2
Within-subjects     96% (85)    33% (29)
Between-subjects    96% (43)    39% (17)

For both the within- and between-subjects analyses, the majority of subjects chose the dominating lottery in Pair 1 but not in Pair 2. This replicates Tversky and Kahneman’s (1986) original findings.

In Experiment 1, three famous between-subjects experiments were replicated using a within-subjects design to determine the impact of design type on the findings. We were able to replicate the original findings in two of the three experiments. In addition, the between- and within-subjects analyses of the same data led to the same conclusions in all cases. Experiment 1 supports the conclusion that between- and within-subjects designs may be more interchangeable than many researchers think. There may also be no need to dismiss the within-subjects design a priori on the grounds that it will decrease internal validity.

3 Experiment 2

In this experiment, we directly tested the assertion that within-subjects designs are more transparent than between-subjects designs (e.g., Bastardi & Shafir, 1998; Fischhoff, Slovic & Lichtenstein, 1979; Kahneman & Frederick, 2005). To do so, we presented psychology students with the within-subjects versions of three between-subjects designs (the same three studies used in Experiment 1): the child custody case (Shafir, 1993), the Asian disease problem (Tversky & Kahneman, 1981) and the marbles lotteries (Tversky & Kahneman, 1986). Students were asked to figure out what the experimenter was looking for (i.e., identify the research hypothesis). These three studies were chosen because we predicted that they would vary in their transparency to undergraduate students. Specifically, we predicted that the child custody case would be most transparent and the marbles lotteries would be the least transparent. Thus, we believed that most psychology students would be able to guess Shafir’s research hypothesis in the child custody case. However, we believed that Tversky and Kahneman’s hypothesis in the marbles lotteries would be opaque to the students.

3.1 Method

Eighty undergraduate students at Wichita State University volunteered to participate for extra credit. Eight subjects were dropped because they did not answer all of the questions, resulting in a sample size of 72. Before beginning the experiment, students were given a brief tutorial describing the concept of a hypothesis and illustrating how researchers design experiments to test their hypotheses. In addition, they were asked to identify the research hypothesis of a fictitious experiment, which was designed to be extremely transparent. This fictitious experiment served as a manipulation check to make sure that all subjects understood their task and could identify a hypothesis in a very simple research design. See Appendix B for the full text of the instructions and manipulation check. Subjects were then presented with the within-subjects adaptations of the child custody case, the Asian disease problem and the marbles lotteries (the same materials presented in Experiment 1). Half of the subjects saw the two within-subjects conditions on the same page and half saw the two conditions on different pages. After completing each study, subjects were asked to describe the experimenter’s hypothesis, rate their confidence in their response (on a 7-point Likert scale) and rate the transparency of the design as completely transparent, somewhat transparent or not transparent at all.

Subjects’ descriptions of the research hypotheses were judged to be correct or incorrect. The criteria upon which the responses were judged were very lenient. For example, in the child custody case, subjects merely needed to mention that changing the phrasing or wording would in some way impact the results. Thus, exhibiting some minor amount of insight was considered a successful completion of the task.

3.2 Results and discussion

As in Experiment 1, there were no significant differences between the results of the three studies when the conditions were presented on the same page vs. different pages. Thus, the results reported have been collapsed across this condition.

The manipulation check indicated that the vast majority of subjects understood the instructions and were able to recognize and articulate the hypothesis of a simple research design. However, nine of the 72 subjects did not pass the manipulation check. Analyses were done both with and without these nine subjects. The conclusions remained the same; therefore, the results below include all 72 subjects.

Although most of the subjects were able to identify the research hypothesis in the manipulation check, they were generally unable to do so with any of the three published experiments. Only 7% of subjects were able to correctly articulate some portion of the research hypothesis in the child custody case, the study deemed to be the most transparent a priori (95% CI: .01 to .13). Thirty-two percent of subjects correctly identified the research hypothesis in the Asian disease problem (95% CI: .21 to .43), while only 3% of subjects correctly identified the hypothesis in the marbles lotteries (95% CI: .00 to .07).

Subjects, therefore, were most accurate with the Asian disease problem, though it should perhaps be reiterated

that these figures were arrived at by being very lenient with the criteria for success. Examples of subjects’ guesses that were counted as correctly identifying the research hypothesis for the Asian disease problem include: “If rewording the question makes a difference in the choice” and “Changing the wording of the questionnaire from life to death expectancies will influence responses.” Examples of guesses judged incorrect include: “The value of other people’s lives,” “Maybe how we let figures impress us,” “What is considered a ‘loss’ of population,” “It is better to take a risk than leave hundreds helpless” and “How many people will die based on the program chosen.” This last quote is from a subject who was very confident he was correct and who rated the scenario as being “completely transparent.”

Although this appeared to be a particularly difficult task for most subjects, they reported being somewhat confident in their responses. Table 4 lists the means and standard deviations for the confidence ratings for the manipulation check and the three studies (1 = “extremely confident” and 7 = “extremely unconfident”).

Table 4: Confidence in ability to guess research hypothesis.
                         M (SD)         95% CI
Manipulation check       2.32 (1.12)    2.06 to 2.58
Child custody case       3.32 (1.46)    2.98 to 3.66
Asian disease problem    3.01 (1.48)    2.67 to 3.35
Marbles lotteries        3.72 (1.65)    3.34 to 4.10

Subjects were most confident in their responses to the manipulation check, followed by the Asian disease problem, the child custody case and the marbles lotteries. The ordinal relationship between the confidence estimates matches subjects’ accuracy; subjects were most accurate (and were most confident) with the manipulation check and least accurate (and least confident) with the marbles lotteries. In contrast, subjects’ categorization of transparency did not appear to be very sensitive with respect to their accuracy.

Given that 88% of subjects accurately described the research hypothesis in the manipulation check, this study should have been characterized as completely transparent by more than 36% of the subjects. Furthermore, a quarter of the subjects categorized the child custody case and the Asian disease problem as completely transparent. Given that very few people accurately identified the research hypotheses in these cases, this categorization seems largely optimistic. See Table 5 for transparency ratings across all four studies.

Table 5: Ratings of transparency by undergraduate students, % (N).
                         Completely     Somewhat       Not
                         transparent    transparent    transparent
Manipulation check       36% (26)       60% (43)       4% (3)
Marbles lotteries        17% (12)       53% (38)       31% (22)
Child custody case       25% (18)       51% (37)       24% (17)
Asian disease problem    25% (18)       60% (43)       15% (11)

Experiment 2 directly tested the claim that within-subjects designs make research hypotheses transparent to subjects, thus potentially increasing demand characteristics and reducing internal validity. Taken together, both Experiments 1 and 2 provide evidence that within-subjects designs are largely opaque. The vast majority of psychology students were unable to identify the research hypothesis from within-subjects adaptations of three famous between-subjects experiments. Additionally, the psychology students were essentially unaware of their inability to do so. Subjects tended to be at least somewhat confident in their responses and only a small minority of subjects categorized these experiments as “not transparent”.

4 Experiment 3

This final experiment was designed to determine if researchers are accurately able to predict the transparency of within-subjects research designs. To do so, we asked members of the Society for Judgment and Decision Making, who should be very familiar with all three studies, to categorize these studies as “completely transparent”, “somewhat transparent” or “not transparent at all”.

4.1 Method

Participation was solicited from the members of the Society for Judgment and Decision Making via the society’s mailing list. Forty-eight members began the online survey; two subjects were removed from the data because they answered only a couple of questions.

4.2 Results and discussion

Members of the Society for Judgment and Decision Making (SJDM) shared our intuition that the child custody case would be the most transparent and the marbles lotteries would be the least transparent; see Table 6 for the transparency ratings of the three studies.

Table 6: Ratings of transparency by members of SJDM, % (N).
                         Completely     Somewhat       Not
                         transparent    transparent    transparent
Child custody case       54% (25)       33% (15)       13% (6)
Asian disease problem    17% (8)        67% (31)       15% (7)
Marbles lotteries        4% (2)         37% (17)       59% (27)

Although there was considerable agreement in the predicted transparency of the studies, these predictions appear to be largely inaccurate. Recall from Experiment 2 that undergraduate students performed poorly on this task across all three studies. Subjects were essentially unable to identify the research hypothesis in any of the three studies, the only exception being a sizeable minority of students (32%) who were able to identify what was being manipulated in the Asian disease problem. Thus, although SJDM members predicted variability in the transparency of these studies, undergraduate students had very little success identifying the research hypothesis in any of the studies.

In addition, the transparency ratings provided by the SJDM members can be compared with the transparency ratings provided by the undergraduate students. The two groups differed in their assessment of the transparency of the marbles lotteries, χ²(2, N = 118) = 10.45, p < .05. The majority of SJDM members (59%) thought that the marbles lotteries would not be transparent. The majority of undergraduates (70%), however, rated the marbles lotteries as being either “somewhat” or “completely transparent”. The two groups also differed in their transparency ratings of the child custody case, χ²(2, N = 118) = 10.11, p < .05. A majority of SJDM members (54%) thought the child custody case experiment would be “completely transparent”, whereas a majority of undergraduates (51%) rated it as being “somewhat transparent”. However, the two groups did not differ in their transparency ratings of the Asian disease problem, χ²(2, N = 118) = 1.00, p > .05. The majority of both groups rated this design as “somewhat transparent”.

Although ratings of transparency differed between undergraduate students and SJDM members (who are very familiar with these three studies), the SJDM members were no more accurate than the undergraduate students at assessing the transparency of within-subjects research designs. SJDM members erroneously believed that subjects would find some within-subjects research designs to be transparent when, in reality, very few undergraduate students were accurately able to describe the research hypothesis when presented with all conditions of a research design.

5 General Discussion

Within-subjects designs have a number of well-documented benefits; most importantly, they increase the power of an experiment (Kirk, 1995; Zimmerman, 1997) and avoid stimulus and context confounds (Birnbaum, 1999). Researchers, however, often abandon within-subjects designs amid a priori concerns that increased task transparency will alter experimental outcomes. In this paper, we tried to assuage these concerns by replicating three famous between-subjects studies in a within-subjects format and empirically testing the transparency of the within-subjects designs. In Experiment 1, we were able to replicate the findings in two of the three studies (the Asian disease problem and the marbles lotteries). Although our conclusions did not match those of Shafir (1993), both the between- and within-subjects analyses yielded identical results. Thus, we argue that this finding may not easily replicate with either design.

In Experiments 2 and 3, undergraduate students and members of the Society for Judgment and Decision Making categorized these within-subjects adaptations on their transparency: “completely transparent”, “somewhat transparent” or “not transparent at all”. Although these two groups differed in their assessment of the transparency of within-subjects designs, neither assessment proved to be accurate. Both experienced researchers and undergraduate students overestimated the transparency of these within-subjects designs. Very few undergraduate students demonstrated any ability to identify the hypotheses driving these original experiments. Even the research hypothesis that most agreed would be readily apparent was not obvious to a vast majority of undergraduate students. A limitation of these studies is the small sample sizes, both of subjects and items. However, we had adequate power to detect moderate effect sizes.

The main contribution of the paper is to demonstrate that within-subjects designs do not necessarily make research hypotheses transparent. In fact, research hypotheses are probably more opaque than we would imagine. However, there are still some research hypotheses that will be transparent to most subjects in within-subjects designs. For example, 88% of subjects were able to figure out the research hypothesis in our manipulation check. Note that this is still less than the 100% we had imagined when creating this sample experiment. Given that the risk of task transparency appears to be very low, we

Judgment and Decision Making, Vol. 4, No. 7, December 2009
Are within subjects designs transparent?
560
do not think this should be a cause to abandon the within-subjects design. Furthermore, although there may be some risk to internal validity if the task is transparent, we argue that this risk does not outweigh the stimulus-context confound for subjective judgments in between-subjects designs. Instead, we suggest that researchers ask subjects to guess the hypothesis tested in the experiment after the study has been completed to ensure the task was not transparent. If, on the other hand, it is important that subjects understand what is being manipulated in the experiment, asking them to describe the research hypothesis can provide a type of manipulation check. The key point here is that our data indicate that neither experienced researchers nor subjects have accurate intuitions about which research hypotheses will be transparent. Therefore, we argue that this is an empirical question that can easily be tested in each experiment.

Additionally, within- and between-subjects analyses of our data produced the same conclusions. Thus, for situations in which context and stimulus are not confounded in between-subjects designs and research hypotheses are not transparent in within-subjects designs, the two types of designs may produce the same conclusions. However, because we are often unable to predict a priori whether stimulus-context confounds exist or research hypotheses are transparent, we argue that it is important to collect data using a within-subjects design, randomizing the presentation of conditions so that both between- and within-subjects analyses can be conducted. This will provide information about the effect of design type on conclusions, which will provide a richer understanding of the effect (e.g., Frisch, 1993). For example, the use of a within-subjects design enables the researcher to ascertain subjects’ perception of the normative model (e.g., Baron & Hershey, 1988). In addition, there are predictable cases in which between- and within-subjects designs result in different conclusions; see the literature on joint vs. separate evaluations for several demonstrations of this phenomenon (e.g., Hsee, Loewenstein, Blount & Bazerman, 1999).

6 Conclusions

Data from three experiments demonstrate that within-subjects designs do not regularly render the experimental task transparent. Therefore, this popular reason for rejecting the within-subjects design in favor of the between-subjects design has little empirical merit. Thus, we argue, as others have done in the past (e.g., Birnbaum, 1999), that the within-subjects design, with counterbalancing, should be the default design when measuring subjective judgments. In addition, routinely asking subjects to guess the research hypothesis upon completion of the study can allow the researcher to test for task transparency in cases where it is important to guard against demand characteristics or to provide evidence that all conditions were understood.

References

Baron, J. & Hershey, J. C. (1988). Outcome bias in decision evaluation. Journal of Personality and Social Psychology, 54, 569–579.
Bastardi, A. & Shafir, E. (1998). On the pursuit and misuse of useless information. Journal of Personality and Social Psychology, 75, 19–32.
Birnbaum, M. H. (1999). How to show that 9 > 221: Collect judgments in a between-subjects design. Psychological Methods, 4, 243–249.
Birnbaum, M. H. & Mellers, B. A. (1983). Bayesian inference: Combining base rates with opinions of sources who vary in credibility. Journal of Personality and Social Psychology, 45, 792–804.
Fischhoff, B. (1975). Hindsight is not equal to foresight: The effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288–299.
Fischhoff, B., Slovic, P. & Lichtenstein, S. (1979). Subjective sensitivity analysis. Organizational Behavior and Human Performance, 23, 339–359.
Frisch, D. (1993). Reasons for framing effects. Organizational Behavior and Human Decision Processes, 54, 399–429.
Grice, G. R. (1966). Dependence of empirical laws upon the source of experimental variation. Psychological Bulletin, 66, 488–498.
Hsee, C. K., Loewenstein, G. F., Blount, S., & Bazerman, M. H. (1999). Preference reversals between joint and separate evaluations of options: A review and theoretical analysis. Psychological Bulletin, 12, 576–590.
Kahneman, D. & Frederick, S. (2005). A model of heuristic judgment. The Cambridge Handbook of Thinking and Reasoning, New York: Cambridge University Press.
Kahneman, D., Slovic, P. & Tversky, A. (1982). Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251.
Kirk, R. E. (1995). Experimental Design: Procedures for the Behavioral Sciences (3rd ed.). Pacific Grove, CA: Brooks/Cole.
Parducci, A. (1965). Category judgment: A range-frequency model. Psychological Review, 72, 407–418.

Shafir, E. (1993). Choosing versus rejecting: Why some options are both better and worse than others. Memory & Cognition, 21, 546–556.
Schwarz, N., Bless, H., Strack, F., Klumpp, G., Rittenauer-Schatka, H. & Simons, A. (1991). Ease of retrieval as information: Another look at the availability heuristic. Journal of Personality and Social Psychology, 61, 195–202.
Tversky, A. & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453–458.
Tversky, A. & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293–315.
Tversky, A. & Kahneman, D. (1986). Rational choices and the framing of decisions. Journal of Business, 59, 251–278.
Tversky, A. & Koehler, D. (1994). Support theory: A nonextensional representation of subjective probability. Psychological Review, 101, 547–567.
Zimmerman, D. E. (1997). A note on interpretation of the paired-samples t test. Journal of Educational and Behavioural Statistics, 22, 349–360.

Appendix A: Instructions and questions used in Experiment 1

Instructions

What follows is a brief questionnaire in which you will be asked to imagine different hypothetical scenarios and then answer a few questions about them. There are only nine questions and you should be completed in about 10 minutes. Detailed instructions are provided with each scenario. Please answer each question by drawing a checkmark in the space provided next to the option you prefer. Thank you.

Scenario 1

Imagine you serve on the jury of an only-child sole-custody case following a relatively messy divorce. The facts of the case are complicated by ambiguous economic, social, and emotional considerations, and you decide to base your decision entirely on the following few observations. To which parent would you AWARD sole custody of the child?

____Parent A                           ____Parent B
Average income                         Above-average income
Average health                         Very close relationship with the child
Average working hours                  Extremely active social life
Reasonable rapport with the child      Lots of work-related travel
Relatively stable social life          Minor health problems

Again, imagine you serve on the jury of an only-child sole-custody case following a relatively messy divorce. The facts of the case are complicated by ambiguous economic, social, and emotional considerations, and you decide to base your decision entirely on the following few observations. Which parent would you DENY sole custody of the child?

____Parent A                           ____Parent B
Average income                         Above-average income
Average health                         Very close relationship with the child
Average working hours                  Extremely active social life
Reasonable rapport with the child      Lots of work-related travel
Relatively stable social life          Minor health problems

Scenario 2

Imagine that the United States is preparing for the outbreak of an unusual Asian disease that is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the program are as follows:

If Program A is adopted, 200 people will be saved.
If Program B is adopted, there is a one-third probability that 600 people will be saved and a two-thirds probability that no people will be saved.

Which program do you prefer?
____Program A
____Program B

Again, imagine that the United States is preparing for the outbreak of an unusual Asian disease that is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the program are as follows:

If Program C is adopted, 400 people will certainly die.
If Program D is adopted, there is a one-third probability that no one will die and a two-thirds probability that 600 people will die.

Which program do you prefer?
____Program C
____Program D

Scenario 3

Consider the following two lotteries, described by the percentage of marbles of different colors in each box and the amount of money you win or lose depending on the color of a randomly drawn marble. Which lottery do you prefer?

Option A
90% white    6% red     1% green    1% blue     2% yellow
$0           win $45    win $30     lose $15    lose $15

Option B
90% white    6% red     1% green    1% blue     2% yellow
$0           win $45    win $45     lose $10    lose $15

Again, consider the following two lotteries, described by the percentage of marbles of different colors in each box and the amount of money you win or lose depending on the color of a randomly drawn marble. Which lottery do you prefer?

Option C
90% white    6% red     1% green    3% yellow
$0           win $45    win $30     lose $15

Option D
90% white    7% red     1% green    2% yellow
$0           win $45    lose $10    lose $15

Appendix B: Instructions and questions used in Experiment 2

Instructions

To begin, please read the following blurb:

In all experiments, the researcher has a hypothesis, which is what he is putting to the test. The research hypothesis is the experimenter’s prediction, and the experiment is set up in a way so that the data that results will tell the experimenter whether that prediction is met or not. If the prediction is confirmed then the experimenter can say that his hypothesis is supported by the data. Let’s have an example: Say your teacher decides to conduct an in-class experiment. Following a lecture on cognitive psychology she has everyone in the class take the same exam, an exam testing your knowledge of cognitive psychology. She also, however, has everyone in the class wear headphones while taking the exam. Half of the students listen to classical music, and the other half listens to top-20 favorites. In this example, the score is provided by the test everyone takes; the manipulation is whether students listen to classical or top-20 music; and the hypothesis, or prediction, is that students who listen to music with lyrics (top-20 music) will be more distracted and therefore do worse on the test than those who listen to classical music (music without lyrics).

In what follows there are five separate scenarios, each with their own instructions and brief set of questions. After each scenario you will be asked what exactly you think it is the experimenter is trying to learn, or in other words, what the experimenter’s hypothesis or prediction is. You will then be asked to rate how confident you are that you are correct. And finally, you will be asked to rate each scenario as being either: 1) “Completely transparent”, 2) “Somewhat transparent/somewhat opaque” or 3) “Not transparent at all/completely opaque.” A scenario is “transparent” if it is relatively easy to guess what the research hypothesis or prediction is. Conversely, a scenario is “opaque” if it is very difficult to figure out what is being predicted. Please answer each question by drawing a checkmark in the space provided next to the option you prefer. Thank you.

Scenario 1

Please imagine that an experimenter sits you down in a waiting room and tells you that you are to wait there until he comes and gets you. While you are waiting, a man in the waiting room says that he is trying to help his son sell raffle tickets for school. He asks you if you will purchase a $10 raffle ticket. Do you choose to:
____Purchase the $10 raffle ticket
____Decline the request

The experimenter then returns and tells you that the experiment is completed.

Again, please imagine that an experimenter sits you down in a waiting room and tells you that you are to wait there until he comes and gets you. While you are waiting, a man in the waiting room gets up and goes over to a vending machine. He comes over to you and says, “Hey excuse me. I just went to buy a pop and the vending machine accidentally gave me two. Do you want this one?” You gladly accept the can of pop. The man then says that he is trying to help his son sell raffle tickets for school.

He asks you if you will purchase a $10 raffle ticket. Do you choose to:
____Purchase the $10 raffle ticket
____Decline the request

The experimenter then returns and tells you that the experiment is completed.

We would now like you to answer the following questions about Scenario 1.

1. What do you think the researcher is trying to find out? In other words, what do think the researcher’s hypothesis or prediction is for this scenario? Please write your answer in the space provided.

2. How confident are you that you have accurately guessed what the experimenter is trying to find out, what his prediction is?
____ 3: Extremely Confident
____ 2: Very Confident
____ 1: Somewhat Confident
____ 0: Neither Confident nor Unconfident
____ −1: Somewhat Unconfident
____ −2: Very Unconfident
____ −3: Extremely Unconfident

3. How would you rate this within-subjects experimental scenario?
__ 1) Completely transparent
__ 2) Somewhat transparent/somewhat opaque
__ 3) Not transparent at all/completely opaque

Scenario 2

Imagine you serve on the jury of an only-child sole-custody case following a relatively messy divorce. The facts of the case are complicated by ambiguous economic, social, and emotional considerations, and you decide to base your decision entirely on the following few observations. To which parent would you AWARD sole custody of the child?

____Parent A                           ____Parent B
Average income                         Above-average income
Average health                         Very close relationship with the child
Average working hours                  Extremely active social life
Reasonable rapport with the child      Lots of work-related travel
Relatively stable social life          Minor health problems

Again, imagine you serve on the jury of an only-child sole-custody case following a relatively messy divorce. The facts of the case are complicated by ambiguous economic, social, and emotional considerations, and you decide to base your decision entirely on the following few observations. To which parent would you DENY sole custody of the child?

____Parent A                           ____Parent B
Average income                         Above-average income
Average health                         Very close relationship with the child
Average working hours                  Extremely active social life
Reasonable rapport with the child      Lots of work-related travel
Relatively stable social life          Minor health problems

We would now like you to answer the following questions about Scenario 2.

1. What do you think the researcher is trying to find out? In other words, what do think the researcher’s hypothesis or prediction is for this scenario? Please write your answer in the space provided.

2. How confident are you that you have accurately guessed what the experimenter is trying to find out, what his prediction is?
____ 3: Extremely Confident
____ 2: Very Confident
____ 1: Somewhat Confident
____ 0: Neither Confident nor Unconfident
____ −1: Somewhat Unconfident
____ −2: Very Unconfident
____ −3: Extremely Unconfident

3. How would you rate this within-subjects experimental scenario?
__ 1) Completely transparent
__ 2) Somewhat transparent/somewhat opaque
__ 3) Not transparent at all/completely opaque

Scenario 3

Imagine that the United States is preparing for the outbreak of an unusual Asian disease that is expected to kill 600 people. Two alternative programs to combat the disease have been proposed. Assume that the exact scientific estimates of the consequences of the program are as follows. Which of the two programs do you favor?

____ If program A is adopted, 200 people will be saved
____ If program B is adopted, there is a one-third probability that 600 people will be saved and a two-thirds probability that no people will be saved.
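
As an aside for readers checking the arithmetic behind the Asian disease programs, the following minimal sketch computes the expected number of people saved under each program, using only the figures stated in the scenarios (600 people at risk). Programs C and D are taken from the loss-framed wording in Appendix A. The function and variable names are illustrative assumptions, not part of the original materials, and the conversion of deaths to lives saved assumes that everyone who does not die counts as saved.

```python
from fractions import Fraction

AT_RISK = 600  # people expected to die if nothing is done (from the scenario)


def expected_saved(outcomes):
    """Expected number of people saved, given (probability, people_saved) pairs."""
    return sum(p * saved for p, saved in outcomes)


# Gain frame (Programs A and B): outcomes are stated as lives saved.
program_a = [(Fraction(1), 200)]
program_b = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]

# Loss frame (Programs C and D): outcomes are stated as deaths,
# converted here to lives saved (assumption: saved = AT_RISK - deaths).
program_c = [(Fraction(1), AT_RISK - 400)]
program_d = [(Fraction(1, 3), AT_RISK - 0), (Fraction(2, 3), AT_RISK - 600)]

for name, program in [("A", program_a), ("B", program_b),
                      ("C", program_c), ("D", program_d)]:
    print(f"Program {name}: expected people saved = {expected_saved(program)}")
```

Under these assumptions all four programs have the same expected outcome (200 people saved), so the pairs differ only in certainty versus risk and in whether outcomes are worded as lives saved or as deaths.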
