Development of reasoning about random events.
Caney, Annaliese
Abstract
The development of school students' understanding of random
events is explored in three related studies based on tasks well known in
the research literature. In Study 1, 99 students in Grades 3 to 9 were
interviewed on three tasks and surveyed on two tasks about luck and
random behaviors in chance settings. Four levels of response are
identified across the five tasks, reflecting increasing structural
complexity and statistical appropriateness. In Study 2, 23 of these
students were interviewed on the same interview protocol tasks, three or
four years later to monitor developmental change. In Study 3, a
different group of 15 students was interviewed with two of the tasks and
prompted with conflicting responses of other students on video. The aim
of Study 3 was to monitor the influence of cognitive conflict in
improving student levels of response. Implications for teachers,
educational planners, and researchers are discussed in the light of
other researchers' findings.
Introduction
Many research studies over the years have explored people's
understanding of random processes. The original studies mainly dealt
with college students and their misunderstandings, describing how
people reason in uncertain conditions, a
psychological approach. Later studies were conducted by mathematics
educators on data sets composed of school students, some showing little
change in understanding across grades. The desire to improve
students' statistical understanding and reasoning motivated this
research, an educational approach. These two different research
perspectives and the contributions they make to the literature on
probability and statistics have been considered in some detail by
Shaughnessy (1983, 1992). After a review of the place of the random
concept in the school curriculum and of previous research on the
concept, this study uses familiar tasks to extend previous research in
three directions. First, a development model is proposed that displays
increasingly complex understandings of the concept of random as involved
in the tasks presented. Second, longitudinal interview data are used to
monitor the developmental change in understanding over three or four
years. Third, cognitive conflict from other students is introduced as a
means of testing the tenacity of beliefs and their susceptibility to
change. Background for these three avenues of research is also provided.
Random in the School Curriculum
As a key probability concept, random is notoriously difficult for
students to grasp. Because the word is used as an adjective, more
attention is often given to the associated noun, as in "random sample,"
to give meaning to a complex idea (Batanero, Green, & Serrano, 1998). The
word is often used colloquially in non-mathematical contexts to convey a
meaning of haphazard. This is reflected to some extent in the nebulous
nature of some dictionary definitions, for example, "Made, done
without method or conscious choice" (Waite, 1998, p. 530). Indeed,
like many concepts that are difficult to define, it seems easier to
define random by considering "antonyms to randomness" and
exploring what is not random, for example, "order,"
"organization," and "predictability" (Falk, 1991).
Curriculum documents are generally guilty of discussing random events,
and random numbers, without specifically defining the term
"random" (e.g., Australian Education Council [AEC], 1991;
Department for Education, 1995; Ministry of Education, 1992; National
Council of Teachers of Mathematics [NCTM], 2000). An exception is the
Mathematics Guidelines K-8: Overview to Chance and Data of the
Department of Education and the Arts [DEA] in the Australian state of
Tasmania, which follows closely the model provided by Moore (1990).
The focus on chance in this Strand is to develop in students the ability
to describe randomness and to measure (quantify) uncertainty. Phenomena
or events which may individually have uncertain outcomes (for example,
tossing a coin), but that have a regular pattern of outcomes over the
long term, are referred to as being random. (For example, we would
expect that in tossing a coin, over the long term, approximately 50%
will be heads). Teachers should provide a range of activities for
students to help them to start to develop, in mathematical terms, an
understanding of the difference between random and what we might
describe in every day language as haphazard (DEA, 1993, p.6).
The implications of the concept of random for the curriculum are
well stated, however, by the NCTM (2000). If an event is random and if
it is repeated many, many times, then the distribution of outcomes forms
a pattern. The idea that individual events are not predictable in such a
situation but that a pattern of outcomes can be predicted is an
important concept that serves as a foundation for the study of
inferential statistics (p. 51).
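The long-run regularity described in these curriculum statements can be illustrated with a short simulation. The following is a minimal sketch in Python, not part of the curriculum documents or of the studies reported here; the function name, the seed, and the numbers of tosses are illustrative assumptions. It tracks the proportion of heads as a simulated fair coin is tossed repeatedly: individual tosses remain unpredictable, but the running proportion settles near 50%.

```python
import random


def running_proportion_of_heads(n_tosses, seed=0):
    """Return the proportion of heads after each toss of a simulated fair coin."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    heads = 0
    proportions = []
    for toss in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as 1, False as 0
        proportions.append(heads / toss)
    return proportions


if __name__ == "__main__":
    props = running_proportion_of_heads(100_000)
    # Print the running proportion at a few checkpoints.
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, round(props[n - 1], 4))
```

Early proportions can sit far from one half, which is the contrast between an uncertain individual outcome and a regular long-term pattern that the documents describe.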
Luck, as a word often associated colloquially with chance, attracts very
little attention in curriculum documents. In its earliest band of
experiences for chance, the AEC (1991) suggests, "Use with clarity,
everyday language associated with chance events" (p. 166) and as a
possible activity, "Clarify and use common expressions such as 'being
lucky'" (p. 166). As is shown by a dictionary definition, however,
colloquial connotations can make interpreting comments and reasoning by
students difficult: "Good or bad fortune; circumstances brought by this;
success due to chance" (Waite, 1998, p. 378).
Many applaud the fact that in recent years school curricula around
the world have introduced random phenomena to students, but there is
also concern about providing for appropriate statistical understanding
(Batanero et al., 1998). Research over the years has shown that people
do not appear to develop these intuitions naturally.
Previous Research on Random Phenomena
The research on random concepts began with the work of
psychologists, using students at the college level and focusing on
misconceptions (Kahneman & Tversky, 1972, 1973, 1982). Relevant to
the current study was their account of the representativeness heuristic,
by which likelihood is judged according to how well the outcome of an
event (or sample) reflects the
parent population. Misconceptions related to representativeness are
often based on putting too much confidence in small samples, for
example, in expecting the same exact proportion in samples as in the
population, in expecting frequent oscillation in random outcomes, and in
expecting current outcomes to balance those obtained previously. It was
a few years before research began to focus on school students'
appreciation of the nature of random behavior. Fischbein and Gazit
(1984) considered middle school students' intuitions from the same
perspective as the earlier psychologists, whereas Green (1983) took a
more mathematical approach, also considering problems of probability
based on proportional reasoning.
In terms of the tasks employed by the current study, the one
reported most frequently in the literature is based on the sequential birth order of six babies in a hospital, assuming boys (B) and girls (G)
are equally likely to be born and the process is random. Kahneman and
Tversky (1972) found with college students that 82% judged the sequence
BGBBBB to be less likely to occur than GBGBBG. Likewise, BBBGGG was
deemed significantly less likely to occur than the sequence GBBGBG.
Shaughnessy (1977) reported that 62.5% of his college sample responded
in a similar fashion, even when the option of "same chance"
was included. Garfield and delMas (1991) employed the same two pairs of
sequences for comparison, again including the opportunity for students
to choose "equally likely." In exploring both the "exact
proportion" belief (BGGBGB versus BBBBGB) and the "random
order" belief (BGGBGB versus BBBGGG), 47% of their pre-test college
sample held both misconceptions with a further 32% choosing correctly
for one pair, and 21% choosing the correct response for both sets.
Konold, Pollatsek, Well, Lohmeier, and Lipson (1993) adapted the
question for a coin tossing scenario and also asked their college sample
which of a group of outcomes was least, as well as most, likely to
occur. Initially their findings were contrary to many earlier results in
that overall, 72% correctly chose the option "equally likely"
when asked to indicate which sequence was most likely to occur;
reasoning based on the representativeness heuristic was limited. When asked
which sequence was least likely to occur, however, only 38% chose the
"equally likely" option. Reasoning was not consistent and
beliefs were not stable, with similar differences and inconsistencies
found in follow-up interviews with the same question.
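The equal-likelihood result behind these tasks can be checked directly. The following is a minimal sketch in Python, assuming (as the task itself does) that boys and girls are equally likely and that births are independent; the function names are illustrative and do not come from the studies cited. It shows that any specific birth order of six babies has probability 1/64, whereas the number of boys in six births is not uniformly distributed, which may be part of what makes "mixed" sequences look more likely.

```python
from fractions import Fraction
from math import comb


def sequence_probability(sequence, p_boy=Fraction(1, 2)):
    """Probability of one specific birth order, e.g. 'BGBBBB'."""
    prob = Fraction(1)
    for child in sequence:
        prob *= p_boy if child == "B" else 1 - p_boy
    return prob


def boys_count_probability(n_boys, n_births=6):
    """Probability of exactly n_boys boys in n_births births, in any order."""
    return Fraction(comb(n_births, n_boys), 2 ** n_births)


if __name__ == "__main__":
    for seq in ("BGBBBB", "GBGBBG", "BBBGGG", "GBBGBG"):
        print(seq, sequence_probability(seq))
    print("exactly 3 boys:", boys_count_probability(3))
    print("exactly 5 boys:", boys_count_probability(5))
```

Each of the four sequences is equally likely, while "three boys in some order" covers 20 of the 64 possible sequences and "five boys in some order" covers only 6.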
Fischbein and Gazit (1984) exploited the lottery scenario to
explore school students' understanding of random processes. In a
questionnaire given to students in Grades 5, 6, and 7 they considered
the idea of winning numbers being "lucky," within the context
of the lottery. Few students agreed to the idea of luck influencing the
outcome and an increasing number of students over the grades correctly
rejected luck. There were, however, high response levels for the belief
that "the same number cannot win anymore," a common
misconception. In the same study they used a task comparing the
likelihood of a consecutive sequence of numbers versus a
"random" sequence of numbers in winning a lottery. The authors
reported the persistence of the belief that it is more probable that
random numbers would win in a lottery draw, with the percentage of
students' responses in Grades 6 and 7 averaging 45%. In a follow-up
study with students from Grade 5 to college, Fischbein and Schnarch
(1997) found performance improved with increasing grade on the
consecutive-number lottery question with reasoning moving away from a
representative belief. Watson, Collis, and Moritz (1995) also observed
improved performance over Grades 3, 6, and 9 for the task exploring the
belief in lucky numbers.
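The consecutive-number lottery question rests on a similar distinction. The following is a minimal sketch in Python under a loudly stated assumption: a lottery in which 6 numbers are drawn without replacement from 45. That format is hypothetical here, since the lottery format is not specified in the studies cited, and the function names are illustrative. The sketch contrasts the chance of one specific ticket winning with the chance that a draw consists of six consecutive numbers.

```python
from fractions import Fraction
from math import comb


def specific_ticket_probability(pool=45, picks=6):
    """Chance that one particular set of numbers is drawn (assumed format)."""
    return Fraction(1, comb(pool, picks))


def all_consecutive_probability(pool=45, picks=6):
    """Chance that the drawn numbers form one consecutive run."""
    runs = pool - picks + 1  # runs starting at 1, 2, ..., pool - picks + 1
    return Fraction(runs, comb(pool, picks))


if __name__ == "__main__":
    print("any one ticket:", specific_ticket_probability())
    print("some consecutive run:", all_consecutive_probability())
```

Under these assumptions a ticket of consecutive numbers and a ticket of "spread out" numbers have the same chance of winning. Draws that look spread out are observed far more often only because there are far more such combinations.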
Very little has been reported in the literature about
students' descriptions of the word "random" or random
processes. Moritz, Watson, and Pereira-Mendoza (1996) asked students,
"What things happen in a random way?" and found increasing
sophistication in responses with grade. A range of phenomena related to
humanly constructed processes such as breath tests, to natural events
such as earthquakes, and to games such as a lottery, was suggested as
things that happen in a random way. Some responses included
characteristics of a random process such as "things that happen in
no pattern or order" or "you choose jelly beans from a packet
and don't know which one you'll get--nothing influences your
choice." Non-response also declined, in that 78% of Grade 3 did not
respond to the question but only 16% of Grade 9 did not respond. Watson,
Kelly, Callingham, and Shaughnessy (2003) augmented the above question
by asking what the word "random" means. Achieving the highest
level response was one of the most difficult tasks in a survey assessing
understanding of chance, data, and variation.
Developmental Research into Students' Understanding of Random
Events
Although there has been more research into students'
probabilistic understanding than into understanding of other parts of
the chance and data curriculum (e.g., Green, 1983; Fischbein &
Gazit, 1984; Fischbein & Schnarch, 1997), much of this has focused
on identifying misconceptions rather than documenting the development of
understanding with respect to the increased complexity and
appropriateness of responses. Watson et al. (1995) and Watson, Collis,
and Moritz (1997) considered students' responses to tasks related to luck
and to chance measurement using a structural model from cognitive
psychology (Biggs & Collis, 1982, 1991). They found it possible to
identify cycles of responses reflecting the number of elements of a task
used in a response and how these elements were combined. At the
prestructural or ikonic level (IK), reasoning was imaginative, not
reflecting the relevant elements of the tasks. At the unistructural
level (U), responses employed a single relevant element of the task and
if conflict occurred for elements, this was not recognized. At the
multistructural level (M) responses used more than one element, usually
in sequence, and if conflict arose it was recognized but not completely
resolved. At the relational level (R), responses tied together relevant
elements of the task for closure, resolving conflict if it occurred. For
one item involving proportional reasoning, Watson et al. (1997)
identified a second UMR cycle associated with correct reasoning to
support an answer.
This identification of increasingly complex levels of development
of ideas was similar to that employed in analyzing responses to tasks
related to average (Watson & Moritz, 1999a), to sampling (Watson
& Moritz, 2000a), to comparing data sets (Watson & Moritz,
1999b), and to creating pictographs (Watson & Moritz, 2001b).
Longitudinal Research on Student Understanding
Until recently there has been very little longitudinal research in
relation to the chance and data school curriculum. In 1988, Garfield and
Ahlgren called for "longitudinal studies of how individuals
actually develop in stochastic sophistication" (p. 58) and this was
reinforced periodically thereafter (Green, 1993; Shaughnessy, 1997).
Green (1991) conducted a follow-up study of 305 students, four years
after they had been surveyed on questions related to
"randomness" and "comparison of odds" (Green, 1988).
Over the four years there was very little change in the prediction of
random outcomes in a two-dimensional grid but the development in
"comparison of odds" was quite marked over the time period.
Green considered that a lack of curriculum exposure to the behavior of
random generators possibly contributed to the difference in performance
on the two questions.
In follow-up to initial developmental analyses of chance
measurement tasks for school students (Watson et al., 1997), Watson and
Moritz (1998) considered longitudinal data collected after two and four
years. Overall they observed an average increase of one developmental
level over four years. In a related area of the chance curriculum,
Watson and Moritz (2002) found that performance on conditional and
conjunction probability tasks did not change markedly over two or four
years. This may have been due to a lack of emphasis on these ideas in
the curriculum. In extending research methodologies to follow the change
of students' statistical concepts over time, a number of studies
that are part of the larger project which includes the current study,
have employed the use of interviews, including work on average (Watson
& Moritz, 2000b), representing, interpreting, and predicting with
pictographs (Watson & Moritz, 2001b), inferential reasoning and
observation of variation in graphical presentations (Watson, 2001),
sampling (Watson, 2004), and fairness of dice (Watson & Moritz,
2003).
Research Involving Cognitive Conflict
For educational researchers, exploring the irregularities and
conflicts that arise in the understanding of students enriches the
process of facilitating conceptual development. In learning, cognitive
conflict arises during the process of conceptual change whereby a
student becomes aware that an existing concept or belief does not
adequately explain a new experience or idea (Strike & Posner, 1992).
Within a classroom, students may naturally experience a sense of
"dissatisfaction" through their own reading and investigation,
or be exposed to new ideas through the intervention of their teachers
and other students. This conflict may cause students to reevaluate and
adjust a concept or, in reconsidering an idea, reject a conflicting idea
and strengthen an existing concept. Macbeth (2000) outlines four types
of "teaching moves" that facilitate conceptual change: (a)
providing opportunities for pupils to make their own conceptions about a
particular topic area explicit so that they are available for
inspection, (b) presenting empirical counterexamples, (c) presenting and
reviewing alternative conceptions, and (d) providing opportunities to
use conceptions. The challenge for researchers is to simulate appropriate environments and integrate these teaching concepts into
their investigations. This is not easy and can be confounded in a number
of ways, particularly with research involving collaborative group-work
(Chick & Watson, 2001).
Cognitive conflict is often suggested as a strategy in relation to the
development of appropriate curriculum guidelines and teaching approaches
for science education (Macbeth, 2000; Posner, Strike, Hewson, & Gertzog,
1982). Such a move is largely influenced by the desire for scientific
theories to be aligned with everyday intuitions. In a sense, the
mathematics curriculum experiences the same dilemma as science,
particularly in the area of probability, in that "the gap between
potential everyday applicability and formal understanding is at its
greatest" (Pratt, 1998, p. 2).
In the larger project of which this study was a part, cognitive
conflict was utilized in an individual interview environment, where
videotaped responses from students interviewed earlier were used to
present cognitive conflict to those currently being interviewed. For a
task on chance measurement, 33% of students responded at higher levels
in the presence of cognitive conflict (Watson & Moritz, 2001a). For
tasks on sampling the improvement rate was 22% (Watson, 2002a). For
easier tasks on comparing two data sets of equal size (Watson, 2002b)
and creating pictographs (Watson & Moritz, 2001b), cognitive
conflict assisted about 60% of students, whereas for more difficult
tasks with unequal sized groups or predicting from pictographs,
improvement rates dropped to about 30%.
Research Questions
In three closely related studies the following research questions
are addressed for students' understanding of random processes.
1. What are the observed cognitive levels of performance for random
tasks? Do these vary with grade? What is the degree of association of
levels of response among tasks?
2. How does students' reasoning develop over time? Do levels
of response improve over 3 or 4 years?
3. Can other students' ideas be used to prompt cognitive
conflict and improve response levels?
Methodology
Participants
A total of 99 Australian school students were included in this
research as part of a wider research project. Students were from seven
Tasmanian government schools, including three primary, two district, and
two secondary, and a private school in South Australia. Table 1 shows
the demographic data for participants in each of the studies described
below.
In Study 1, 84 Tasmanian students were surveyed and interviewed:
twenty-four students from Grade 3 (aged 8-9), thirty students from Grade
6 (aged 11-12), and thirty students from Grade 9 (aged 14-15). Fifteen
South Australian students were also interviewed; three students from
Grade 3 (aged 8-9), seven students from Grades 5 and 7 (aged 10-11 and
12-13 respectively), and four students from Grade 9 (aged 14-15). The
researchers selected students for interview based on their responses to
survey items. Students were judged either as being representative of
their grade level or as providing unusual answers to some questions,
although not necessarily the questions discussed here. These students
were also regarded by their teachers as articulate and prepared to
discuss their ideas with an interviewer. Not all participants answered
all five items and hence reduced sample sizes will be reported as
appropriate.
In Study 2, follow-up interviews were conducted three or four years later with
23 students from Study 1. Five Grade 3 students (aged 8-9), fourteen
Grade 6 students (aged 11-12), and four Grade 9 students (aged 14-15)
completed a second, longitudinal, interview.
In Study 3, a different group of 15 Tasmanian students from
government schools participated in interviews involving cognitive
conflict. The students were a Grade 3 student (aged 8-9), seven Grade 6
students (aged 11-12), and seven Grade 9 students (aged 14-15). Their
initial interview responses, before being presented with cognitive
conflict, were included with the data in Study 1.
Tasks
The Random Protocol used in this study consisted of two survey
questions and three interview questions (Figure 1). These items were
parts of two instruments developed to assess student understanding of
statistical concepts. The items were among a larger set chosen to meet
five practical criteria: 1) to reflect the current national curriculum
guidelines, 2) to take into account the existing research from other
countries, 3) to allow for different levels and modes of cognitive
functioning, 4) to be motivating and elicit optimal responses, and 5) to
be practical to administer and interpret (Watson, 1994).
Both survey questions were adapted from Fischbein and Gazit (1984).
Question QS1 explored random processes and was related to the
"chance language" aspect of the curriculum. Question QS2
investigated students' understanding of luck, particularly in
relation to the curriculum reference related to "interpreting
events."
Question QI1 on the interview protocol provided an opportunity for
students to show their understanding of what "random" means
and to show if they appreciated random elements embedded in a context.
Question QI2 was adapted from Fischbein and Gazit (1984) to study the
appreciation of random as a chance process within the familiar social
context of the lottery. Question QI3, based on an item of Garfield and
delMas (1991), explored students' conceptions of order and
equal-likelihood in a context where representativeness is also an issue
in an observed sequential sample. The interview protocol in Study 2 was
identical to that used in Study 1.
The cognitive conflict interview protocol that was used in Study 3
extended QI2 lottery and QI3 birth order tasks (see Figure 1) to include
the responses of other students. After their initial responses, students
were prompted with video footage of responses from other students in
earlier interviews. Prompts were selected by the interviewer as
different from the participant's response in order to create
cognitive conflict. Prompts available for use with QI2 are shown in
Figure 2 and those for QI3 in Figure 3. After being asked to comment on
the prompt, students were asked to decide the better or best response.
It was anticipated that this procedure, based on input from other
students rather than teachers or researchers, would simulate a
comparable learning environment in the classroom.
[FIGURE 1 OMITTED]
Procedure
Surveys were administered during school class time with 45 minutes
allocated to complete the whole survey. In the secondary schools they
were distributed during a mathematics class. For primary classes in
particular, teachers assisted students in reading the questions, but did
not provide explanations or answers. It was explained to the younger
students that they might find some questions easy and some harder as the
questions were for older students as well.
[FIGURE 2 OMITTED]
The interviews were also conducted during class time with all
sessions videotaped. Students participated in a 45-minute individual
interview in the course of which several protocols related to the chance
and data part of the mathematics curricula were administered. The Random
Protocol formed one part of the interview; other topics included
interpreting bar charts, comparing graphs, average, sampling, fairness
of dice, and probability. The Random Protocol was presented toward the
end of the interview session and hence was sometimes cut short or
omitted due to time constraints.
Coding and Analysis
Both authors coded the two survey questions and the three interview
questions. A spreadsheet system that linked the digitized interview
videos, the transcripts, and survey responses facilitated this process.
For Study 1, all students' responses for each item were coded in
two stages; initially, similar types of responses were grouped together
in a clustering technique and assigned a code (Miles & Huberman,
1994); these codes were then assigned to one of four levels--Ikonic
(IK), Unistructural (U), Multistructural (M), Relational (R)--within the
cognitive framework of Biggs and Collis (1982, 1991). Each level from
the ikonic to relational represents an increase in the complexity of
structure and appropriateness of responses. The coding framework allowed
for continuity across the questions in that a similar structural
complexity and statistical appropriateness related to the concept of
random could be identified for each item. The longitudinal responses in
Study 2 were also coded using this framework. Changes in students'
response levels hence could be tracked for each item. For Study 3, codes
were applied to the prompts; final responses coded within the above
framework could then be associated with the level of the prompt.
[FIGURE 3 OMITTED]
Missing data for some students meant that comparison of total
scores across tasks for students was not a meaningful process. Response
levels are presented for different grades for questions where
appropriate, as are indicative mean values. The association of levels of
response among the questions was considered using two-way tables. The
Pearson product-moment correlation coefficient was used as an indicator
of the strength of association. Change over time and after the
presentation of cognitive conflict is considered descriptively. The
results are presented in the order of the research questions.
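The association analysis described above can be sketched as follows. This is a minimal illustration in Python, not the authors' actual procedure or software: it assumes the two-way table is a grid of counts with rows giving the level on one question and columns the level on the other, and the counts in the usage example are hypothetical, not data from this study. The table is expanded into paired level codes and the Pearson product-moment correlation coefficient is computed from its definition.

```python
from math import sqrt


def expand_two_way_table(table):
    """Turn a two-way table of counts into (row_level, column_level) pairs.

    table[i][j] is the number of students at level i on the first question
    and level j on the second.
    """
    pairs = []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            pairs.extend([(i, j)] * count)
    return pairs


def pearson_r(pairs):
    """Pearson product-moment correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    example_table = [
        [3, 4, 1],
        [1, 6, 5],
        [0, 2, 8],
    ]
    print(round(pearson_r(expand_two_way_table(example_table)), 3))
```

Treating ordinal level codes as numeric scores is itself an assumption of this kind of analysis, so the coefficient is best read as an indicator of association, as the text describes it.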
Results
Research Question 1: Development of Understanding
Although the levels of response for coding the five questions were
consistent, the distribution of levels of response varied greatly among
the questions. For this reason descriptive accounts will be given for
the five questions separately and the association of levels of response
between pairs of items will be considered. Table 2 contains an overall
description of the levels of response, with highlights of features
observed on the individual questions, combining survey and interview
questions on "the meaning of random." The codes reflect the
observed levels of development of Biggs and Collis (1982, 1991) as
detailed earlier.
Random survey question QS1. The survey question "What things
happen in a random way?" was answered by 71 students and its
reference to "things" encouraged the presentation of examples
rather than responses of a definitional nature. Some students, however,
did provide a broad description, in what might be called an optimal
rather than functional response. For this question all Level 0 responses
were in fact non-responses on the survey sheet. At Level 1 responses
reflected intuitive ideas or personal experiences that were likely to be
based in idiosyncratic contexts, for example, "work sheets,"
"puzzling things," "a car will crash," or
"trees grow." Level 2 responses were generally single
recognizable examples, such as "the weather," "Tattslotto numbers," and "when you get picked out of a number of
people," or a single characteristic of a random process, such as
"by chance" and "any order, higgledy, piggledy." At
Level 3, responses went further in describing the process combining a
description and an example, or providing examples of different types of
random processes.
S1: The rolling of a dice, when rain comes from the sky, what the
weather is like. [Grade 9]
S2: Something that isn't organized, there is no particular
pattern to what is happening. [Grade 9]
S3: The Tattslotto numbers come up randomly; each number has an
equally small chance of getting chosen. [Grade 9]
No responses reached Level 4 sophistication on the survey question.
Table 3 shows the distribution of levels of response across the
grades for question QS1. Performance improved overall with grade: the
average level was 0.68 for Grade 3, 1.41 for Grade 6, and 1.95 for
Grade 9.
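An average level of this kind is a weighted mean of the level codes, with the number of responses at each level as the weights. The following is a minimal sketch in Python; the function name and the counts in the usage example are hypothetical and are not the Table 3 data.

```python
def mean_level(counts):
    """Mean level code, given a mapping of level -> number of responses."""
    total = sum(counts.values())
    return sum(level * n for level, n in counts.items()) / total


if __name__ == "__main__":
    # Hypothetical distribution for illustration only.
    print(mean_level({0: 10, 1: 6, 2: 4, 3: 0}))
```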
Random interview question, QI1. The interview question on the
meaning of random, QI1, offered more support than the survey question,
principally by providing a context, a television announcement about a
lottery, for eliciting students' ideas. Coding was assigned for
Part (a) and Part (b) of this question; these codes were then combined
to a single code. At Level 0, only one response was recorded where the
student did not respond to either part. Many of the Level 1 responses,
however, had no initial suggestion for the first part but within a
context could provide an intuitive response or elaborate on the process
taking place in the drawing of lottery numbers in the second part.
S4: [a] No response. [b] They just come out, all kinds. [Grade 3]
S5: [a] Just grab a few people, random people just go around the
school ... they just grab them. [b] The machine just picks out the
numbers and they come up and you get more numbers come up than others I
have seen. [Grade 9]
Level 2 responses again reflected single ideas or examples, which
could be given for either or both parts of the question.
S6: [a] It means you sort of guess. [b] They are not picking them
themselves, they are just letting them come up. [Grade 6]
S7: [a] You might say random as in randomly pick. It is just a
lucky one, a lucky pick. [b] They haven't done anything with the
camera or anything to make it show a certain number. They would just
pick them out and that is their lucky pick. [Grade 6]
Level 3 responses combined multiple ideas and examples.
S8: [a] Choosing at random like in raffles, and they say "pick
a name out of a hat," and then it could be any. [b] They just pick
some out and they could be any of them. No really definite numbers and
so it could be any so you have just as much chance of them being yours.
[Grade 6]
S9: [a] When you pick names or numbers out of a hat, everyone has
got an equal chance. [Grade 9]
At Level 4 responses included comments about avoiding bias in the
selection process.
S10: [a] Well random is used a lot ... like in elections when they
do popularity polls ... or they take random phone numbers out of the
book. It is a way of selecting a sample, without trying, not to be
biased so it can fairly fair. Like feed the names into the computer and
they select a few names out of whatever. [b] It means there's any
possibility of any numbers coming up, they are saying that to make it
sound like they don't cheat or rig it. [Grade 9]
Table 4 shows the distribution of levels of response across the
grades for question QI1 for the 99 students who were interviewed. Again
there was an association of performance with grade indicated by
increasing average codes from 1.69 for Grade 3, to 2.32 for Grade 6, and
2.94 for Grade 9. These averages are approximately one level higher than
obtained for the same grades on the survey question QS1. The value of
the interview in reducing the number of non-responses is evident. The
association between the levels of response on the two questions
is fairly strong, as shown in Table 5 (r = .592, p < .0005). Only
two students responded at a lower level in the interview than on the
survey, with 93% of non-responses and 68% of Level 1 and Level 2
responses improving from the survey to the interview.
Lucky numbers survey question, QS2. The survey question, QS2, about
Claire's belief in lucky numbers, was answered by 84 students.
Level 0 responses were non-responses, or "yes" or
"no" responses with no accompanying explanation, whereas Level
1 responses reflected a belief in "luck."
S11: The same numbers always win. [Grade 3]
S12: She CHOSE lucky numbers. [Grade 3]
Level 2 responses stated disagreement with Claire's view
either implicitly or explicitly, or expressed an "anything can
happen" view of the context.
S13: I wouldn't do it. [Grade 3]
S14: They wont [sic] win it for her again. [Grade 3]
S15: It's a superstition. [Grade 6]
S16: In Tattslotto not often the numbers are the same so I think
it's impossible. [Grade 6]
S17: It does not matter what numbers you have you might win. [Grade
3]
Some responses went on to express a qualitative chance statement in
support of the view disagreeing with Claire at Level 3.
S18: No number has more chance of being drawn than any other.
[Grade 9]
S19: Because Tattslotto is random it makes no difference. It may
come out SOMETIMES but randomly. But if you have the same numbers at
least you wouldn't get frustrated because you didn't put in
the same numbers as last week. [Grade 6]
No responses were observed at a level higher than Level 3.
As can be seen in Table 6, the predominant response for QS2 for all
grades was Level 2 disagreement with Claire's belief but with
little in the way of justification. There was again a slight trend for
improvement in levels of response with grade, with more Level 3
responses provided at Grade 9. Average code levels for the three grades
were 1.46, 2.03, and 2.33, respectively.
Lottery interview question, QI2. Interview question, QI2, on
choosing sequential or "spread out" numbers for a lottery was
answered by 98 students. Only two responses provided idiosyncratic
reasoning at Level 0.
S11: [Writes down 13, 22, 39, 7, 15, 43. Chooses Jenny] My
Tattslotto number's 7 and it comes up every night so I think the
first one. [Grade 3]
S7: [Chooses Jenny] I wouldn't send that one [Ruth] because it
is cheating. [Grade 6]
At Level 1, however, many responses expressed straightforward
disagreement with Ruth's numbers and favored Jenny's numbers,
with justification based on the observation that numbers are generally
more "spread out" and "mixed up."
S20: [Writes down 10, 35, 7, 29, 40, 39. Chooses Jenny] It is not
very usual for all the numbers that are picked are in a row, because
they are not usually in a row, they are spread out. [Grade 6]
S21: [Writes down 42, 3, 45, 21, 10, 9. Chooses Jenny] The balls
never really go 1, 2, 3, 4, 5, 6. There's more chance of random
numbers. [Grade 6]
S22: [Writes down 7, 23, 11, 39, 41, 9. Chooses Jenny] It has a
better range of numbers in it. [Grade 3]
At Level 2 responses were likely to express contradictory ideas
without realization.
S23: [Writes down 10, 11, 12, 13, 14, 15. Chooses Equally Likely]
Equally but I don't think, you never really have small numbers,
mostly they're big numbers. [Grade 6]
S24: [Writes down 24, 43, 3, 17, 39, 33. Chooses Jenny] Have the
same chance but it is unlikely, it has never happened 1, 2, 3, 4, 5, 6,
they always come up randomly. [Grade 6]
At Level 3 students chose "equally likely" for the chance
of Ruth's and Jenny's numbers being drawn in the lottery, with
qualitative chance justifications.
S25: [Chooses Equally Likely] 50/50 chance. If they are both in,
there is an equal chance of both of them winning, the machine could pick
any of them. [Grade 6]
S26: [Writes down 7, 9, 32, 40, 22, 14. Chooses Equally Likely]
It's unlikely that you would get consecutive numbers but either has
just as much chance because they are just picking any numbers randomly
out of the thing. [Grade 9]
Responses at Level 4 focused explicitly on each number having the
same chance of being chosen and/or included mention of bias.
S27: [Writes down 7, 13, 26, 36, 39, 17. Chooses Equally Likely] I
think they have the same chance because they all have the same chance,
like a 1 in 45 chance of coming up. [Grade 9]
S28: [Writes down 1, 5, 36, 12, 10, 18. Chooses Equally Likely]
Either one has got an equal chance. The Tattslotto thing isn't
biased, there's the same chance of getting every number. So just as
likely to get consecutive numbers than different numbers. [Grade 9]
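The equal-likelihood reasoning behind the Level 3 and Level 4 responses can be illustrated with a short calculation. The sketch below is not part of the study; it assumes a draw of 6 numbers from 45 without replacement, with order ignored, and uses a consecutive ticket and the numbers written down by S27 purely as illustrations. The function name ticket_probability is hypothetical.

```python
from math import comb

# Assumption: 6 numbers are drawn from 1..45 without replacement, order ignored.
N_BALLS = 45
N_DRAWN = 6

def ticket_probability(ticket, n_balls=N_BALLS, n_drawn=N_DRAWN):
    """Probability that one specific set of numbers is exactly the set drawn."""
    numbers = set(ticket)
    if len(numbers) != n_drawn or not all(1 <= n <= n_balls for n in numbers):
        raise ValueError("ticket must hold distinct numbers within range")
    # Every possible set of n_drawn numbers is equally likely, so the result
    # does not depend on which numbers are on the ticket.
    return 1 / comb(n_balls, n_drawn)

consecutive = [1, 2, 3, 4, 5, 6]        # illustrative sequential ticket
spread_out = [7, 13, 26, 36, 39, 17]    # the numbers S27 wrote down

total_sets = comb(N_BALLS, N_DRAWN)
p_consecutive = ticket_probability(consecutive)
p_spread = ticket_probability(spread_out)

# Number of tickets made up of six consecutive numbers (1-6, 2-7, ..., 40-45).
consecutive_runs = N_BALLS - N_DRAWN + 1

print(total_sets, p_consecutive == p_spread, consecutive_runs)
```

Under these assumptions any single ticket has the same chance of winning. What is rare is the class of fully consecutive tickets, which is very small compared with the class of tickets that look "spread out"; this may be the distinction that responses such as S26's are reaching for.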
Table 7 shows that responses reflecting the view that the choice of
numbers with spread was more likely to win were the most common at all
grade levels. There was, however, a trend for higher average codes with
grade from 1.12 in Grade 3, to 1.74 in Grade 6, and 2.32 in Grade 9.
Although there was a predominance of Level 1 responses to this question
and Level 2 responses to the survey question on lucky lottery numbers,
it is interesting to observe the association of the two sets of
responses for the 84 students who answered both (r = .438, p < .0005).
This is shown in Table 8 where 46% of all students responded with
disagreement on the existence of lucky numbers but agreement with a
"spread out" choice of numbers as more likely than a sequence.
The association of levels of responses for the two interview
questions on the meaning of random, QI1, and the lottery question, QI2,
is shown in Table 9. For the 91 students with data for both questions,
there is a tendency for an association for those with codes of 2 or
higher on the lottery question; for those at Level 1 on QI2, however,
there is a wide spread and a fairly even representation of responses at
Level 1 and Level 3 to the random definition question QI1 (r = .481, p
< .0005). It would appear that belief in the necessity for lottery
numbers to be "spread out" to increase the chance of winning
is held by students with a wide range of ability to explain the meaning
of random more generally.
Birth order interview question, QI3. The interview question, QI3,
on the birth order for six babies was answered by 83 students. The
levels of response were determined both by choices between the possible
birth sequences and by the reasoning associated with the pair of
choices. Level 0 responses made choices but gave no supporting reasons
or explanations. At Level 1, most responses chose BGGBGB consistently
and displayed both beliefs, that there should not be an imbalance of
gender and that there should not be straight runs of boys followed by
girls.
S4: [Chooses BGGBGB Part (a)] There's 3 of each. [Chooses
BGGBGB Part (b)] You wouldn't have 3 boys and then 3 girls born.
[Grade 3]
One response, however, selected BBBGGG for the second pair.
S29: [Chooses BGGBGB Part (a)] You've got a variety of
children being born and they're not always girls and not always
boys. [Chooses BBBGGG Part (b)] It's a different type but there
might be all boys coming and then all girls. [Grade 6]
At Level 2 responses chose "equally likely" for each part
but gave no supporting reasons besides an "anything can
happen" justification.
S30: [Chooses Equally Likely for both parts] Because you don't
know which is going to come out a boy or a girl. [Grade 3]
S23: [Chooses Equally Likely for both parts] Because it just
depends which baby they have, it could be any really. [Grade 6]
At Level 3, responses recognized that one of the apparent
imbalances (number of boys and girls or order of birth) did not affect
the overall chances but were susceptible to the other; that is in one
part a sequence of births was specified as being "more likely"
whereas the other part was answered "equally likely" with
appropriate reasoning.
S31: [Chooses BGGBGB Part (a)] Boys and girls are evenly matched.
[Chooses Equally Likely Part (b)] They have both got 3 again. [Grade 9]
S32: [Chooses Equally Likely Part (a)] It can be that many boys
with 1 girl or it could be equally the same. [Chooses BGGBGB Part (b)]
More likely because it's a mixture. [Grade 9]
For the highest level of response, students were not susceptible to
either type of apparent imbalance and gave appropriate justifications
for their choices in both cases.
S8: [Chooses Equally Likely for both parts] Because it has as much
chance as being a boy and a girl, so it could be in any order at all.
[Grade 6]
Table 10 shows the distribution of levels of response across grades
for the birth order problem, QI3. More improvement occurs between Grades
3 and 6 in average level of performance (1.55 to 2.14) than between
Grades 6 and 9 (2.14 to 2.38). In terms of association of level of
response with the other two interview questions, data are given in
Tables 11 and 12. There is no relationship between the levels of
response for the birth order and random tasks (r = .126, p = .265), and for
the birth order and lottery tasks it is still not strong (r = .301, p = .006). As
noted earlier there are many Level 3 responses to the random definition
and Level 1 responses to the lottery question.
In considering the performance of students who answered all five
items, no student scored a maximum score of 19. The highest total score
was a 17 achieved by a Grade 9 student. This student was S2, who
provided a Level 3 response to the survey question on Random, QS1. For
the survey item on Luck, QS2, S2 responded at Level 3 in the following
fashion including reference to the probability of numbers occurring.
S2: I doubt she'd be able to win Tattslotto twice by using the
same numbers because if each set of numbers has a certain chance (e.g.
1/1000000) once one set of numbers has one (won), it is unlikely that
they'd come again until (1000000) goes later.
For the interview task QI1, S2 mentioned lack of bias in random
selection (Level 4).
S2: It means that the children are just, they are not chosen
because they have some special quality or something, they are just
picked out any old way it doesn't really matter.... It means that
they just pick the numbers out they are not biased towards one number or
something. Or they haven't specially chosen the one number to come
out, it's just as it happens.
For the interview question on the lottery, QI2, S2 had difficulty
resolving the conflict of equal likelihood of outcomes and their
perceived spread. This was given a Level 3 code for the final part of
the response.
S2: [Writes down 2, 4, 27, 35, 43, 40] I just guessed them. I just
sat down and wrote out numbers between 1 and 45. [Well, which ticket do
you think is more likely to get all 6 numbers right? Would it be this
ticket (Jenny's) or would it be this ticket (Ruth's)?] ...
Probably Jenny's because she has a wider range of numbers but I
don't know that it would make that much difference. [Do you think
they might have the same chance?] I think that they might have about the
same chance, it would lean either way.
S2, however, appeared to handle equal likelihood more easily in the
birth order task, QI3, with the following Level 4 response.
S2: [Chooses Equally Likely Part (a)] Well, if there is not really
any other information saying that it is specifically a hospital mainly
for boys or mainly for girls. So you would think that out of 6 babies
they would be equally likely. [Chooses Equally Likely Part (b)] Because
there is the same amount of boys and girls in both of them.
S8 was a Grade 6 student whose total score summed to 16 and who
responded at Level 3 for the random interview task, QI1, and at Level 4
for the birth order question, QI3. For survey random question, QS1, S8
responded by giving two examples, "Tattslotto" and
"raffle draw," which was a Level 2 response. For the luck
question, QS2, the Level 3 response was brief but to the point, "It
is as likely to be them as any other numbers." Finally for the
lottery interview question, QI2, S8 was confident about equal
likelihood, offering the following Level 4 response.
S8: [Writes down 3, 12, 26, 37, 43, 32] I just took any that were
any where, I didn't make any definite numbers. [Chooses equally
likely] Equally because the thing that picks them out is just as likely
to pick 1, 2, 3, 4, 5, 6 as if it is going to pick any other number.
These two students displayed overall the highest levels of
performance for tasks associated with random processes.
A summary of the correlations between the pairs of tasks is given
in Table 13 to indicate the degree of association displayed in the
responses of students. The birth order task, QI3, had the least
association with the other tasks employed. Although seven of the
correlations are significant at the .01 level, the percent of variation
explained varies from only 9% for the lottery and birth order interview
tasks to 35% for the two questions on random.
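The percent of variation explained is the square of the correlation coefficient. As a minimal sketch, the values of r quoted in the text (not the full contents of Table 13, which may list further pairs) can be converted as follows; the dictionary labels and the function name percent_explained are illustrative only.

```python
# Correlations as quoted in the text; labels are illustrative shorthand.
correlations = {
    "QS1 and QI1 (two random questions)": 0.592,
    "QS2 and QI2 (luck and lottery)": 0.438,
    "QI1 and QI2 (random and lottery)": 0.481,
    "QI1 and QI3 (random and birth order)": 0.126,
    "QI2 and QI3 (lottery and birth order)": 0.301,
}

def percent_explained(r):
    """Percent of variation in one set of codes explained by the other (100 * r^2)."""
    return round(100 * r * r)

explained = {pair: percent_explained(r) for pair, r in correlations.items()}

for pair, pct in explained.items():
    print(f"{pair}: r = {correlations[pair]:.3f}, about {pct}% explained")
```

On these figures, even the strongest association (between the two questions on random) leaves roughly two thirds of the variation unexplained.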
Research Question 2: Longitudinal Change
A subset of students who participated in Study 1 was interviewed
for a second time three or four years later. All 23 of the students were
asked to discuss the meaning of random, QI1. Of these, 12 responded at
the same level the second time: 8 were at Level 3 each time and only one
was at Level 4 each time. Nine students improved their level of
response, whereas two Grade 6 students regressed from Level 3 to Level
2. Of those who improved, three were initially in Grade 3 and these
students all improved from Level 1 to Level 3, four were in Grade 6, and
two in Grade 9. The two Grade 9 students both moved from Level 3 to
Level 4.
S33: [First interview] Pick something without a formula, just any
old thing or person. [Second interview] A random sample, choose a sample
that is not biased so that it doesn't represent one part more than
the other. [Grade 9 in first interview]
Twenty-one students answered the lottery question, QI2, in each
interview and of these, 12 responded at the same level each time, four
at the highest possible and eight at Level 1. The following responses
are consistently at Level 1.
S34: [First interview. Chooses Jenny] It is nearly impossible to
get it all in a row. [Second interview. Chooses Jenny] Ruth
wouldn't have a chance because there would be a slim chance of all
those numbers coming up in a row. [Grade 6 in first interview]
Seven students improved levels, four by two or more levels from
Level 1. One student improved from Level 1 to Level 4.
S6: [First interview. Chooses Jenny] Because the numbers don't
usually come up consecutively. [Second interview. Chooses Equally
Likely] Because all of these numbers have the same chance of coming up.
[Grade 6 in first interview]
Two students, one initially in Grade 6 and one in Grade 9, regressed by
one level.
Only 13 students completed the birth order question, QI3, in the
longitudinal interview. The two Grade 3 students regressed from Level 2,
whereas the two Grade 9 students improved to Level 4. Of these two Grade
9 students, one moved from Level 3 to Level 4 and the other student
improved from Level 1 to Level 4, indicating that they were no longer
susceptible to probabilistic or order imbalance.
S35: [First interview. Chooses BGGBGB Part (a)] Because it has got
an even number ... but I suppose more days, equal amounts of each are
born. [Chooses BGGBGB Part (b)] Because it is very likely, it isn't
often that 3 boys and 3 girls come out like that. [Second interview.
Chooses Equally Likely for both parts]. I think for it to be in that
sequence it is the same [goes on to explain the actual chance of the
sequence occurring] 1 in 64. [Grade 9 in first interview]
Of the nine Grade 6 students, three remained at the same level, one
at Level 4, whereas one regressed from Level 3 to Level 2, and five
improved, two to Level 4.
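The "1 in 64" figure given by S35 can be checked by enumerating every possible order of six births. The sketch below is an illustration rather than part of the study; it assumes that each birth is independently a boy or a girl with probability 1/2, and the helper names p_sequence and p_boys are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumption: each birth is independently B or G with probability 1/2.
sequences = ["".join(s) for s in product("BG", repeat=6)]  # all 64 birth orders

def p_sequence(seq):
    """Probability of one exact birth order."""
    return Fraction(sequences.count(seq), len(sequences))

def p_boys(k):
    """Probability of exactly k boys among six births, in any order."""
    return Fraction(sum(s.count("B") == k for s in sequences), len(sequences))

# The three sequences compared in the task are equally likely.
p_bggbgb = p_sequence("BGGBGB")
p_bbbbgb = p_sequence("BBBBGB")
p_bbbggg = p_sequence("BBBGGG")

# Counts of boys, by contrast, are not equally likely.
p_three_boys = p_boys(3)
p_five_boys = p_boys(5)

print(p_bggbgb, p_bbbbgb, p_bbbggg, p_three_boys, p_five_boys)
```

Under these assumptions each exact sequence has probability 1/64, whereas "three boys in some order" is more probable than "five boys in some order". The Level 1 and Level 3 responses appear to apply the second, valid comparison of counts to the first question about exact sequences.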
Overall for all three interview questions there was a trend for
improvement over three or four years, with little indication of
regression. This is shown in detail in Table 14 for the 57 pairs of
responses for the three interview questions. Responses to the questions
are denoted by a letter for the item (R for Random, L for Lottery, and B
for Birth Order) followed by the initial grade levels for all responses
to that item in a given cell. L366, for example, indicates that one
Grade 3 and two Grade 6 students responded to the lottery question (QI2)
at Level 1 in the initial interview and at Level 2 in the longitudinal
interview. Although the numbers are small for a given grade and item, it
can be seen that no student initially at Level 4 gave a lower level
response later and 45% of responses overall that could improve over the
time between interviews, did so. Also, the students originally in Grade
6 gave longitudinal responses whose distribution was not inconsistent with that of the Grade 9 students in the initial interview.
Research Question 3: Cognitive Conflict
Study 3 involved the use of video prompts of earlier students'
responses (Study 1) to create cognitive conflict for the new students
being interviewed. Fifteen students were prompted with at least one
student response that differed from their initial response on the
lottery interview question, QI2, and five students were prompted on the
birth order question, QI3. In some instances a second prompt was also
shown to students as a means for determining the strength of their
initial decision.
Lottery interview question, QI2. Of the 15 students presented with
a prompt for the lottery interview question, initially 5 gave Level 1
responses, 2 at Level 2, 4 at Level 3, and 4 at Level 4 as described in
Table 2. The five students who initially responded at Level 1 were given
Level 4 prompts. Four of these students rejected the higher level prompt
and were still influenced by the "look" of the numbers as in
the order or spread.
S36: [First response. Writes down 14, 23, 28, 37, 44, 7. Chooses
Jenny] Umm Jenny's I would say because Tattslotto doesn't
usually end up being 1, 2, 3, 4, 5, 6 so Jenny's is more logical.
[Video Prompt Tony, Figure 2] It's more unlikely to get 1, 2, 3, 4,
5, 6 than it is to get the random numbers. It should be a bit spread out
more evenly. [Grade 9]
S21: [First response. Writes down 42, 3, 45, 21, 10, 9. Chooses
Jenny] The balls never really go 1, 2, 3, 4, 5, 6 and sometimes they go
like 31, 32, 33 and they will go like 42 and then back again.
There's more chance of random numbers I think. [Video Prompt Tony,
Figure 2] No. Apart from the fact if he did go like 1, 2, 3, 4, 5, 6
someone might think it might have been rigged. [Grade 6]
One Level 1 student who initially chose Jenny's numbers
because "It comes out random, you don't see 1, 2, 3, 4, 5, 6
come out that often," attempted to integrate a Level 4 idea into
her answer but did not successfully resolve the conflicting ideas.
S37: [Video Prompt Tony, Figure 2] They have the same chance but it
is more likely that they will come out in random order rather than
consecutive order. The two tickets have the same chance but if you had
to back one, you would back the one with random ones because it
doesn't usually come out in consecutive numbers. [Grade 9]
For the two students who responded at Level 2, the Level 4 prompts
were rejected and the students did not appear to resolve the
contradictions apparent in their initial explanations. Of the four
students at Level 3, one rejected a lower level response. The other
three students passively agreed with both lower and higher level
responses. These students did not seem to have the same strength of
belief as observed in those students who gave Level 1 or Level 4
responses.
The four students who initially responded at Level 4 were all
prompted with lower Level 1 responses and three of the students rejected
the lower level responses, preferring their own earlier responses.
S38: [First response. Chooses Equally Likely] It doesn't
matter at all you have still got the same chance. That's one set of
numbers [Ruth] and that would be another set of numbers [Jenny] and like
you if there's a million combinations then you've still got a
1 in a million chance. [Video Prompt Tracey, Figure 2] No I don't
think so because that's still, you still have got a 1 in a million
chance of getting that number ... and those numbers. [Grade 9]
The other student who initially responded at Level 4 changed his or
her argument and accepted a Level 1 response. When given a second prompt
at Level 4 this student still favored the Level 1 response.
S9: [First response. Chooses Equally Likely] Same chances because
each number has got a same chance of getting drawn, so therefore they
both have an equal chance. [Video Prompt Terry, Figure 2] I've changed my mind his idea is better. [Video Prompt Tony, Figure 2] I
still go with the first kid. [Interviewer ... you are more likely to get
a range of numbers rather than just do all of them?] Yes. [Grade 9]
Birth order interview question, QI3. Of the five students (1 in
Grade 6 and 4 in Grade 9) presented with a prompt for the birth order
interview question, QI3, initially two gave responses at Level 1, one at
Level 2, one at Level 3, and one at Level 4. Of the two students who
responded at Level 1, one did not accept a Level 4 response and also
rejected a Level 2 response, whereas the other started to accept a
higher level response but did not make a definite change in response
level. When given a Level 4 prompt the student who initially gave a
Level 3 response recognized the higher level "equally likely"
idea in that both sequences "could happen" but was still
influenced by the balance of gender, in Part (a). The student did,
however, reject a lower level response.
S26: [Video Prompt Tracey, Figure 3] Well, I sort of said that, I
said the BBBBGB could happen but that BGGBGB is more likely because of
the numbers, they are even. [Video Prompt Tony, Figure 3] Well that
could be a random order like boy, boy, girl, girl, boy, girl or
something, that's pretty random. [Grade 9]
The student who initially responded at Level 4 agreed that BGGBGB
"looks like" a random order but maintained the sequences were
both "equally likely" and had the "same chance."
Overall for the cognitive conflict interviews, students who gave
responses at Level 1 generally rejected higher level responses,
preferring their own initial responses. Those students who gave
responses at Level 4 generally rejected lower level responses,
reiterating their initial Level 4 arguments and reasoning. Level 2
students seemed unable to resolve the conflicts or contradictions within
their explanations. A characteristic of Level 3 responses was passive
agreement with conflicting ideas as presented by the other students.
Discussion
The Discussion focuses on six aspects of this study: the
hierarchical aspects of the responses to the five tasks, the association
of levels of response across the tasks, longitudinal change, change
after cognitive conflict, implications for future research, and
implications for teaching.
Understanding for the Five Tasks
Performance on item QS1, the survey task on random, improved with
grade, particularly between Grades 3 and 6. These results show a
concentration of responses from Levels 0 to 2 with few higher level
responses. Given that the nature of the question encouraged students to
offer examples this is not surprising. In considering students'
understanding of statistical language Moritz et al. (1996), although
using a slight variation in coding, reported a similar increase in level
of response sophistication across grades for the larger number of
students surveyed in the group from which 60 of the students in Study 1
were selected.
Task QI1, the interview question on random, extended the survey
question by asking for a meaning, rather than an example, and providing
a context for further description. In asking students about random the
three elements of these two items (meaning, example, and context)
appeared to work well together in that there was a moderately high
relationship between tasks QS1 and QI1. Watson et al. (2003) used a
survey item that was structured in a fashion that combined the aspects
of the survey item, QS1, and Part (a) of the interview task. Using a
hierarchical coding scheme similar to that employed in the present
study, they found overall levels of response (Grades 3 to 9) with a
distribution more similar to that of task QI1 than QS1. Perhaps
reflecting the survey environment, 40% of students responded at the
equivalent of Level 1 in the present study, with 20% at Level 2, 37% at
Level 3, and 3% at Level 4. This provides some support for the view that
the more general questions used in the interview setting will elicit
higher level responses, even in a survey setting.
Ideas about luck in a lottery setting have received little
attention in the literature that considers students' understanding
of random concepts. The question used in this study as a survey item,
QS2, was adapted from a questionnaire developed by Fischbein and Gazit
(1984). Like Fischbein and Gazit, the results of the current study
indicate that only a very small proportion of students attribute random
lottery outcomes to luck. The majority of responses, however, were at
Level 2 and this revealed a belief related to negative recency, that is,
that winning numbers cannot come up again. This misconception is discussed by
Kahneman and Tversky (1972) and Shaughnessy (1992) in relation to the
representativeness heuristic.
For task QI2, the lottery interview question, the majority of
responses occurred at Level 1, indicating that students were influenced
by the "look" of the numbers and expected a random sequence of
numbers to be varied and spread. Fischbein and Schnarch (1997) reported
that the use of the representativeness heuristic for this item declined as
grade level increased. Similarly in this study, by Grade 9 responses
were more spread over Levels 2 and 3, with approximately a third of
responses at Level 4. Responses at Level 1 prompted the interviewer to
ask an additional question, "What numbers would you choose next
week?" In Grades 3 and 6 most students indicated that they would
choose "different numbers" and, as with the luck question,
reasoned that winning numbers would not come up again.
The birth order questions, interview task QI3, have been
administered in many forms over the years. In exploring the
representativeness heuristic, they have been adapted for several
contexts, including birth order and coin tossing. Earlier research
(Shaughnessy, 1977; Garfield & delMas, 1991; Kahneman & Tversky,
1972) reported that students selected a sequence like GBGBBG as being
more likely than the sequences BGBBBB and BBBGGG. Those studies that
asked students to explain their answers report that responses were
influenced by the representativeness belief in that alternative
sequences a) did not reflect the expected even numbers of boys and girls
being born and b) did not reflect the perceived spread of boys and
girls. The combined Level 1 responses across grades reported in this
study show only 24% of students reasoned with both beliefs based on
representativeness which is lower than results reported by Garfield
& delMas (1991). A further 21.7% of students, however, indicated
that they were susceptible to one or other of the beliefs with only
13.3% reasoning correctly on both sequences. The presence of the largest
number of responses (33.7%) at Level 2, giving technically correct
multiple choice responses but with simplified reasoning not reflecting
the complex nature of the independent events, provides a warning for
teachers to explore students' reasoning beyond a choice of
alternative. Although response levels improved substantially from Grade
3 in this study, the difference between Grades 6 and 9 was not as great,
perhaps indicating that the intuitions supporting responses are not
being addressed in the middle school years.
The Association of Performance across Tasks
Although from the point of view of cognitive theory there would be
an expectation that students should respond at similar structural levels
across different tasks, such was not the case in this study of random
processes. The different contexts in which the questions were
presented--in survey and interview--contributed to this difference. The
advantage of the interview setting over a survey seems clear from two
perspectives. First, it is difficult for students not to provide at
least some type of response, which the interviewer can then probe
further, greatly reducing the number of non-responses. As well, the
interview task on random in this study provided a second context, the
television setting for a lottery, which gave students another angle from
which to draw upon their memories about random processes. Both of these
factors contributed to higher levels of response to the interview item
than to the survey item concerning the meaning of random.
The other three tasks tapped into widely differing aspects of
equally likely random processes, exploring intuitions about luck, the
idea of random as spread out, and the classic representativeness dilemmas
associated with a sample having the exact population proportion or an
uneven distribution. Whereas most students are comfortable disregarding luck as a factor in winning a lottery, not many can go on to give higher
level justifications in terms of equal likelihood of outcomes. This
difficulty is also evident in the task to distinguish which of
consecutive or spread out numbers are likely to win the lottery. The
fact that 46% of students were in both of these categories points to a
deficit in the appreciation of the implications of equal likelihood.
Lack of understanding did not stop students achieving a Level 2
technically correct response to QS2 but produced the Level 1 incorrect
response to QI2. The modal response levels point to the greater
difficulty of the context of QI2 for students. Again comparing the modal
response level of QI1 on the meaning of random (Level 3) with the two
lottery items suggests that the added contextual support in QI1 may have
aided students in responding at higher levels.
Although the distribution of responses across levels was the most
even for the birth order task, QI3, the spread of data over the cells in
Tables 11 and 12 shows that the association of levels of response is not
high between QI3 and the other two interview tasks. There was in fact no
association between the birth order task and the two random items. The
nature of the birth order questions, relying as they do on an
appreciation of independent compound events may partially explain the
lack of association. This distribution of responses to QI3 may represent
exposure to classroom experiences such as coin tossing for some
students, whereas discussion of random processes and luck may not have
taken place.
Differing levels of response on tasks featuring different aspects
of the same probability construct were also observed by Garfield and
delMas (1991) and Burgess (2000). Both suggested that context was a
significant feature and the current research supports this view.
Longitudinal Change
Over the three- or four-year period, few students performed at lower
levels on the three interview tasks, only two for each of the random and
lottery questions and three for the birth order task. For the random
task, of the 22 who could improve their performance, 41% did so. For the
lottery question 37% of the 17 who could improve did so, and of the 12 who
could improve on the birth order question, 58% did so. Overall this represents an
improvement rate of 45% for the three tasks. The observed improvement
over time suggests a positive development in understanding reflected in
nearly half of the responses over a three- or four-year period. This and
the similarity of pattern of responses for the Grade 6 students in
longitudinal interviews to the earlier Grade 9 students support the
suggestion of developmental change also being displayed by the original
cohorts in this study. The small number of longitudinal interviews means
that this hypothesis must remain tentative but the issue deserves
further research. This is particularly the case since some studies have
suggested a lack of change across the school years in understanding
random phenomena.
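The improvement rates can be recomputed from the counts reported for Research Question 2. The sketch below takes the number of students who improved (9, 7, and 7) and the number who could improve (22, 17, and 12) as read from the text; the variable names are illustrative. On these counts the random, birth order, and overall rates match the figures quoted (41%, 58%, and 45%), whereas the lottery rate comes out nearer 41% than the 37% quoted; the source of that difference cannot be determined from the text.

```python
# (improved, could_improve) per task, as read from the text.
counts = {
    "random": (9, 22),
    "lottery": (7, 17),
    "birth order": (7, 12),
}

# Improvement rate per task, as a rounded percentage.
rates = {task: round(100 * improved / eligible)
         for task, (improved, eligible) in counts.items()}

total_improved = sum(improved for improved, _ in counts.values())
total_eligible = sum(eligible for _, eligible in counts.values())
overall_rate = round(100 * total_improved / total_eligible)

for task, pct in rates.items():
    print(f"{task}: {pct}% of those who could improve did so")
print(f"overall: {total_improved} of {total_eligible} = {overall_rate}%")
```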
It is interesting to compare the rates of improvement over time for
these tasks with the rates for tasks associated with other topics in the
chance and data curriculum. Improvement here was similar to that for
beliefs about fair dice (47%) and strategies for determining the
fairness of dice (39%) (Watson & Moritz, 2003), topics closely
related to ideas about random behavior. Rates were also similar to
improvement on representation of data with pictographs (38%) and
interpreting pictographs (58%), but much less than the improvement in
predicting from pictographs (90%) (Watson & Moritz, 2001b). Greater
improvement rates were also observed for the concepts related to average
(79%) (Watson & Moritz, 2000b), sampling (74%) (Watson, 2004), and
beginning inference (65%) (Watson, 2001). Two reasons are suggested for
the difference in improvement rates. One is that a topic such as
average, which contributes to many of the concepts covered in the tasks
including prediction, is included in the middle school curriculum
completed by many of the students during the three or four years between
interviews. The other is that ideas related to chance and probability
are notoriously difficult to change (Fischbein & Gazit, 1984) and
the improvement rates here may reflect that difficulty.
Cognitive Conflict
The sample size for students exposed to cognitive conflict for two
of the interview questions was small and hence it is not possible to
suggest generalizations. It is of interest, however, that presenting
students with higher level responses did not have the desired effect. No
student accepted with confidence a higher level argument than that
originally offered. The likely explanation for this is that beliefs
based on intuitions about random behavior cannot be changed without
considerable first-hand empirical experience. An opinion expressed in a
few seconds, especially by another student and not an authority figure,
is easy to dismiss. If the students being interviewed realize that their
responses are based on intuition, then so may the beliefs expressed on
the video clip be based on the intuitions of the other person. There was
no opportunity during the interview to test beliefs with trials. In
comparison with the longitudinal interviews in this report, the
intervening period of three or four years obviously had a more positive
effect in producing higher level responses. In comparison with other
protocols that employed cognitive conflict, this one on random processes
was the least successful (Watson, 2002a, 2002b; Watson & Moritz,
2001a, 2001b). The one student in this study (S9) who was persuaded by a
lower level response was the only example of such an influence over the
five topics analyzed, including chance measurement, sampling,
pictographs, and comparing two data sets.
The outcomes with respect to cognitive conflict are perhaps not
surprising in the light of the observations of responses to random tasks
by Fischbein and Gazit (1984) and Fischbein and Schnarch (1997). If it
is unusual to observe improved performance across grades and if over
three or four years fewer than half of the students are likely to improve,
then to expect improvement in a few minutes is probably unrealistic.
Implications for Future Research
Several suggestions for future research arise from the outcomes of
these related studies. First, wherever possible it is advisable to
interview students on topics related to random behavior rather than rely
on written survey responses. Students appear more willing in an interview
setting to suggest a tentative response even if they are unsure. In a
survey they are likely to leave the question blank. Providing a context
within which to discuss the concept of random is also helpful to
students.
In an interview setting, if cognitive conflict is to be employed,
it would appear that time must also be provided to allow for
experimentation with random generators. For the birth order task it
would be possible to experiment by tossing six coins in order many
times and recording the outcomes. Using a computer simulation would be more
efficient but in this case care must be taken to ensure that students
understand how the simulation works and what outcomes are being
recorded, especially if this is done by the computer and not the
student.
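The coin-tossing experiment described above can be sketched in code. The following is a minimal illustration only and is not part of the study. It assumes a fair coin, codes each toss as B or G, and compares the sequence BGGBGB from the birth order task with BBBGGG, a comparison sequence chosen here purely for illustration. The function name simulate_birth_orders and the number of simulated families are likewise assumptions.

```python
import random
from collections import Counter

N_CHILDREN = 6  # six tosses per trial, as in the birth order task


def simulate_birth_orders(n_families, seed=None):
    """Toss a fair coin six times per family and tally each ordered sequence."""
    rng = random.Random(seed)  # seeded generator so a class can repeat a run
    counts = Counter()
    for _ in range(n_families):
        sequence = "".join(rng.choice("BG") for _ in range(N_CHILDREN))
        counts[sequence] += 1
    return counts


n_families = 64_000  # assumed size; about 1,000 expected per sequence
counts = simulate_birth_orders(n_families, seed=1)

# BGGBGB appears in the task; BBBGGG is a hypothetical comparison sequence.
for sequence in ("BGGBGB", "BBBGGG"):
    share = counts[sequence] / n_families
    print(f"{sequence}: {counts[sequence]} families ({share:.4f})")
```

Printing the tallies for individual ordered sequences keeps visible exactly what the simulation records, which bears on the concern that outcomes recorded by the computer may otherwise be opaque to students.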
Following the outcomes of this study and those of Fischbein and
Gazit (1984), it would appear that a research design employing a strong
instructional intervention, perhaps over a period of a few weeks, would
be more likely to achieve positive outcomes in a post-instruction
interview. Computer simulation for the birth order task would appear
promising, along with a thorough discussion of independent events. For
tasks related to lotteries, studying distributions of outcomes from
simulations may be helpful.
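One way such a lottery simulation might look is sketched below. The lottery format is an assumption made for illustration (six numbers drawn without replacement from 1 to 40), as are the names simulate_lottery, POOL_SIZE, and PICK; none of these come from the study. The sketch records how often each number is drawn and the share of draws that contain at least one pair of adjacent numbers, two distributions that bear on beliefs about equal chance and consecutive order.

```python
import random
from collections import Counter

POOL_SIZE = 40  # assumed: numbers 1 to 40 (hypothetical lottery format)
PICK = 6        # assumed: six numbers drawn without replacement


def simulate_lottery(n_draws, seed=None):
    """Return per-number counts and the share of draws with an adjacent pair."""
    rng = random.Random(seed)
    number_counts = Counter()
    draws_with_adjacent_pair = 0
    for _ in range(n_draws):
        draw = sorted(rng.sample(range(1, POOL_SIZE + 1), PICK))
        number_counts.update(draw)
        # An adjacent pair means two drawn numbers that differ by exactly 1.
        if any(b - a == 1 for a, b in zip(draw, draw[1:])):
            draws_with_adjacent_pair += 1
    return number_counts, draws_with_adjacent_pair / n_draws


n_draws = 20_000
number_counts, adjacent_share = simulate_lottery(n_draws, seed=1)
least, most = min(number_counts.values()), max(number_counts.values())
print(f"Each number drawn between {least} and {most} times")
print(f"Share of draws with at least one adjacent pair: {adjacent_share:.3f}")
```

Students could compare the spread of the per-number counts with their expectations, and discuss whether draws containing adjacent numbers are as unusual as they anticipated.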
Implications for Teaching
Falk and Konold (1994) contend that presenting ideas about random
to students involves "undertaking the challenge of bringing up the
doubts and difficulties that students will predictably have" (p. 2).
The colloquial ideas associated with random in everyday life
need to be discussed in the classroom and compared and contrasted with
statistically appropriate descriptions. Perhaps the most useful from the
middle school level is that of David Moore (1990).
Phenomena having uncertain individual outcomes but a regular pattern of
outcomes in many repetitions are called random. 'Random' is not a
synonym for 'haphazard,' but a description of a kind of order different
from the deterministic one that is popularly associated with science and
mathematics. Probability is the branch of mathematics that describes
randomness. (p. 98)
Much discussion of expectation and variation should follow exposure
to this description. As Metz (1998) points out for her study,
The construct of randomness addressed in this study involves an
integration of the uncertainty and unpredictability of a given event,
with patterns manifested over many repetitions of the event. These data
indicate that children and adults frequently fail to integrate the
uncertainty with the patterns, instead either interpreting the patterns
in deterministic form or recognizing the uncertainty apart from the
associated patterns. Although the first error overestimates the
information given, the second error underestimates it. (p. 349)
Overall the suggestions made in the previous section about teaching
intervention as part of research apply to classroom instruction
generally. If a goal of the chance part of the data and chance
curriculum is for students to have an adequate understanding of random
processes, then appropriate, explicit teaching activities must be
incorporated. These must be accompanied by extensive discussion and
comparison of in-school and out-of-school observation of random
phenomena.
Acknowledgments
This research was funded by Australian Research Council Grants No.
AC9231385, W0009108, A79800950, and DP02088607. Jonathan Moritz
conducted some interviews in Study 1, and all of the interviews in Study
2 and Study 3; he also set up the spreadsheet environment for the
analysis.
References
Australian Education Council. (1991). A national statement on
mathematics for Australian schools. Carlton, VIC: Author.
Batanero, C., Green, D. R., & Serrano, L. R. (1998).
Randomness, its meanings and educational implications. International
Journal of Mathematical Education in Science and Technology, 29(1), 113-123.
Biggs, J. B., & Collis, K. F. (1991). Multimodal learning and the
quality of intelligent behavior. In H. A. H. Rowe (Ed.), Intelligence:
Reconceptualisation and Measurement (pp. 57-76). Hillsdale, NJ: Lawrence Erlbaum.
Burgess, T. (2000). Are teachers' probability concepts more
sophisticated than those of students? In J. Bana & A. Chapman (Eds.), Mathematics education beyond 2000 (Proceedings of the 23rd
Annual Conference of the Mathematics Education Research Group of
Australasia, Vol. 1, pp. 126-133). Perth, WA: MERGA.
Chick, H. L., & Watson, J. M. (2001). Data representation and
interpretation by primary school students working in groups. Mathematics
Education Research Journal, 13, 91-111.
Department for Education (England and Wales). (1995). Mathematics
in the national curriculum. London: Author.
Department of Education and the Arts. (1993). Mathematics
Guidelines K-8 (Chance and Data). Tasmania: Curriculum Services Branch.
Falk, R. (1991). Randomness--an ill-defined but much needed
concept. Journal of Behavioral Decision Making, 4, 215-226.
Falk, R., & Konold, C. (1994). Random means hard to digest.
Focus on Learning Problems in Mathematics, 16, 2-12.
Fischbein, E., & Gazit, A. (1984). Does the teaching of
probability improve probabilistic intuitions? An exploratory research study. Educational Studies in Mathematics, 15, 1-24.
Fischbein, E., & Schnarch, D. (1997). The evolution with age of
probabilistic, intuitively based misconceptions. Journal for Research in
Mathematics Education, 28, 96-105.
Garfield, J., & Ahlgren, A. (1988). Difficulties in learning
basic concepts in probability and statistics: Implications for research.
Journal for Research in Mathematics Education, 19, 44-63.
Garfield, J., & delMas, R. (1991). Students' conceptions
of probability. In D. Vere-Jones (Ed.), Proceedings of the Third
International Conference on Teaching Statistics: Vol. 1. School and
general issues (pp. 340-349). Voorburg, The Netherlands: International
Statistical Institute.
Green, D. (1983). From thumbtacks to inference. School Science and
Mathematics, 83, 541-551.
Green, D. (1988). Children's understanding of randomness:
Report of a survey of 1600 children aged 7-11 years. In R. Davidson & J. Swift (Eds.), The Proceedings of the Second International
Conference on Teaching Statistics (pp. 287-291). Victoria, B.C:
University of Victoria.
Green, D. (1991). Longitudinal study of pupils' probability
concepts. In D. Vere-Jones (Ed.), Proceedings of the Third International
Conference on Teaching Statistics: Vol. 1. School and general issues
(pp. 320-328). Voorburg, The Netherlands: International Statistical
Institute.
Green, D. (1993). Data analysis: What research do we need? In L.
Pereira-Mendoza (Ed.), Introducing data analysis in the schools: Who
should teach it? (pp. 219-239). Voorburg, The Netherlands: International
Statistical Institute.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A
judgment of representativeness. Cognitive Psychology, 3, 430-454.
Kahneman, D., & Tversky, A. (1973). On the psychology of
prediction. Psychological Review, 80, 237-251.
Kahneman, D., & Tversky, A. (1982). Judgment under uncertainty:
Heuristics and biases. Cambridge: Cambridge University Press.
Konold, C., Pollatsek, A., Well, A., Lohmeier, J., & Lipson, A.
(1993). Inconsistencies in students' reasoning about probability.
Journal for Research in Mathematics Education, 24, 392-414.
Macbeth, D. (2000). On an apparatus for conceptual change. Science
Education, 84, 228-264.
Metz, K. E. (1998). Emergent understanding and attribution of
randomness: Comparative analysis of the reasoning of primary grade
children and undergraduates. Cognition and Instruction, 16, 285-365.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data
analysis: An expanded sourcebook (2nd ed.). Thousand Oaks, CA: Sage
Publications.
Ministry of Education. (1992). Mathematics in the New Zealand curriculum. Wellington: Author.
Moore, D. S. (1990). Uncertainty. In L. S. Steen (Ed.), On the
shoulders of giants: New approaches to numeracy (pp. 95-137).
Washington, DC: National Academy Press.
Moritz, J. B., Watson, J. M., & Pereira-Mendoza, L. (1996,
November). The language of statistical understanding: An investigation
in two countries. Paper presented at the annual conference of the
Australian Association for Research in Education, Singapore.
National Council of Teachers of Mathematics. (2000). Principles and
standards for school mathematics. Reston, VA: Author.
Posner, G. J., Strike, K. A., Hewson, P. W., & Gertzog, W. A.
(1982). Accommodation of a scientific conception: Toward a theory of
conceptual change. Science Education, 66, 211-227.
Pratt, D. (1998). The coordination of meaning for randomness. For
the Learning of Mathematics, 18, 2-11.
Shaughnessy, J. M. (1977). Misconceptions of probability: An
experiment with a small-group activity-based, model building approach to
introductory probability at the college level. Educational Studies in
Mathematics, 8, 295-316.
Shaughnessy, J. M. (1983). The psychology of inference and the
teaching of probability and statistics. In R. W. Scholz (Ed.), Decision
making under uncertainty (pp. 325-350). Amsterdam: North-Holland.
Shaughnessy, J. M. (1992). Research in probability and statistics:
Reflections and directions. In D. A. Grouws (Ed.), Handbook of research
on mathematics teaching and learning (pp. 465-494). New York: NCTM &
MacMillan.
Shaughnessy, J. M. (1997). Missed opportunities in research on the
teaching and learning of data and chance. In F. Biddulph & K. Carr (Eds.), People in mathematics education (Proceedings of the Twentieth
Annual Conference for the Mathematics Education Research Group of
Australasia, pp. 6-12). Waikato, NZ: MERGA.
Strike, K. A., & Posner, G. J. (1992). A revisionist theory of
conceptual change. In R. A. Duschl & R. J. Hamilton (Eds.),
Philosophy of science, cognitive psychology, and educational theory and
practice (pp. 147-176). Albany, NY: State University of New York Press.
Waite, M. (Ed.). (1998). The Little Oxford Dictionary (Rev. 7th
ed.). Oxford: University Press.
Watson, J. M. (1994). Instruments to assess statistical concepts in
the school curriculum. In National Organizing Committee (Ed.),
Proceedings of the Fourth International Conference on Teaching
Statistics: Vol. 1 (pp. 73-80). Rabat, Morocco: National Institute of
Statistics and Applied Economics.
Watson, J. M. (2001). Longitudinal development of inferential
reasoning by school students. Educational Studies in Mathematics, 47,
337-372.
Watson, J. M. (2002a). Creating cognitive conflict in a controlled
research setting: Sampling. In B. Phillips (Ed.), Proceedings of the
Sixth International Conference on Teaching Statistics: Developing a
statistically literate society, Cape Town, South Africa. Voorburg, The
Netherlands: International Statistical Institute.
Watson, J. M. (2002b). Inferential reasoning and the influence of
cognitive conflict. Educational Studies in Mathematics, 51, 225-256.
Watson, J. M. (2004). Developing reasoning about samples. In D.
Ben-Zvi & J. Garfield (Eds.), The challenge of developing
statistical literacy, reasoning, and thinking (pp. 277-294). Dordrecht:
Kluwer.
Watson, J. M., Collis, K. F., & Moritz, J. B. (1995).
Children's understanding of luck. In B. Atweh & S. Flavel
(Eds.), Galtha (Proceedings of the 18th Annual Conference of the
Mathematics Education Research Group of Australasia, pp. 550-556).
Darwin, NT: MERGA.
Watson, J. M., Collis, K. F., & Moritz, J. B. (1997). The
development of chance measurement. Mathematics Education Research
Journal, 9, 60-82.
Watson, J. M., Kelly, B. A., Callingham, R. A., & Shaughnessy,
J. M. (2003). The measurement of school students' understanding of
statistical variation. International Journal of Mathematical Education
in Science and Technology, 34, 1-29.
Watson, J. M., & Moritz, J. B. (1998). Longitudinal development
of chance measurement. Mathematics Education Research Journal, 10(2),
103-127.
Watson, J. M., & Moritz, J. B. (1999a). The development of the
concept of average. Focus on Learning Problems in Mathematics, 21(4),
15-39.
Watson, J. M., & Moritz, J. B. (1999b). The beginning of
statistical inference: Comparing two data sets. Educational Studies in
Mathematics, 37, 145-168.
Watson, J. M., & Moritz, J. B. (2000a). Development of
understanding of sampling for statistical literacy. Journal of
Mathematical Behavior, 19, 109-136.
Watson, J. M., & Moritz, J. B. (2000b). The longitudinal
development of understanding of average. Mathematical Thinking and
Learning, 2(1&2), 11-50.
Watson, J. M., & Moritz, J. B. (2001a). The role of cognitive
conflict in developing students' understanding of chance
measurement. In J. Bobis, B. Perry, & M. Mitchelmore (Eds.),
Numeracy and beyond (Proceedings of the 24th Annual Conference of the
Mathematics Education Research Group of Australasia, Vol. 2, pp.
523-530). Sydney, NSW: MERGA.
Watson, J. M., & Moritz, J. B. (2001b). Development of
reasoning associated with pictographs: Representing, interpreting, and
predicting. Educational Studies in Mathematics, 48(1), 47-81.
Watson, J. M., & Moritz, J. B. (2002). School students'
reasoning about conjunction and conditional events. International
Journal of Mathematical Education in Science and Technology, 33, 59-84.
Watson, J. M., & Moritz, J. B. (2003). Fairness of dice: A
longitudinal study of students' beliefs and strategies for making
judgments. Journal for Research in Mathematics Education, 34, 270-304.
Jane M. Watson and Annaliese Caney
University of Tasmania
Table 1 Number of Participants in Each Grade and Study
             Study 1 Initial              Study 2 Longitudinal         Study 3 Cognitive Conflict
Grade        Tasmania   South Australia   Tasmania   South Australia   Tasmania
3               24             3              3             2               1
6*              30             8             11             3               7
9               30             4              4             0               7
Sub total       84            15             18             5              15
Total           99                           23 (subset of Study 1)        15 (subset of Study 1)
*Data collected in South Australia were from students in Grades 5 and 7.
These grades have been combined with Grade 6 for analysis.
Table 2 Global Descriptions of Response Levels for Survey and Interview
Items.

Code 4, Response Level: Relational
  Global Description: Integrated understanding as required by the
    question. Appreciation of random/chance outcomes, not influenced by
    order or balance.
  Random (Interview & Survey): Multiple qualification of random
    integrated with good example(s) and mention of "bias" or "fairness."
  Lottery (Interview): Not influenced by consecutive order of numbers and
    ideas of variation/spread with explicit mention of all numbers having
    an equal chance. May include ideas on sets of numbers and chance.
  Birth Order (Interview): Not susceptible to probabilistic and order
    imbalance. Ideas may extend to include sample size. Choose equally
    likely with appropriate reasoning.
  Luck (Survey): Not observed.

Code 3, Response Level: Multistructural
  Global Description: Recognize regularity of chance. Inconsistent
    cognition of balance or order as issues. Sequential ideas (multiple)
    in definitions.
  Random (Interview & Survey): A qualified idea about random integrated
    with an example.
  Lottery (Interview): Express a generalized idea about chance with
    qualitative chance statements.
  Birth Order (Interview): Susceptible to probabilistic or order
    imbalance. Recognizes conflict in random behavior in one but not all
    contexts. Choose a combination of sequences BGGBGB and equally likely.
  Luck (Survey): Disagreement with luck accompanied by a qualitative
    chance statement or an idea about random.

Code 2, Response Level: Unistructural
  Global Description: Single ideas about random or luck or broad chance
    reasoning that "anything could happen."
  Random (Interview & Survey): A single idea about random or an example.
  Lottery (Interview): Express contradictory ideas and show no
    recognition of contradiction or broad chance reasoning with no
    qualified chance statement.
  Birth Order (Interview): Broad chance reasoning that "anything could
    happen" with no qualified ideas. Choose Equal/Equal.
  Luck (Survey): Broad chance reasoning that "anything could happen" or
    disagreement with luck either implicitly or explicitly.

Code 1, Response Level: Ikonic
  Global Description: Intuitive ideas about random. Susceptible to both
    balance and order.
  Random (Interview & Survey): An intuitive response with no clear
    definition.
  Lottery (Interview): Influenced by consecutive order of numbers and
    expected variation/spread of numbers. Choose Jenny's numbers.
  Birth Order (Interview): Susceptible to both probabilistic and order
    imbalance. Strict views as reflected in small samples. Choose BGGBGB
    pattern consistently.
  Luck (Survey): An intuitive response indicating belief in luck.

Code 0, Response Level: No response
  Global Description: No response or no explanation. Any reasoning that
    is inappropriate or idiosyncratic.
  Random (Interview & Survey): No response.
  Lottery (Interview): Idiosyncratic responses and no reasoning.
  Birth Order (Interview): No response or no explanation. Any reasoning
    that is inappropriate or idiosyncratic.
  Luck (Survey): Idiosyncratic.
Table 3 Response Levels Across Grades for the Random Survey Question, QS1
           Response levels
Grade      0     1     2     3   Total
3         11     8     2     1      22
6          5     8    12     2      27
9          1     5    10     6      22
Total     17    21    24     9      71
Table 4 Levels of Response Across Grades for Meaning of Random Interview
Question, QI1
           Response levels
Grade      0     1     2     3     4   Total
3          0    15     4     7     0      26
6          1     5    11    16     1      34
9          0     1     5    21     5      32
Total      1    21    20    44     6      92
Table 5 Levels of Response for Meaning of Random in Survey and Interview
Questions
Interview            Survey response levels
response levels      0     1     2     3   Total
0                    1     0     0     0       1
1                    8     7     2     0      17
2                    4     5     6     0      15
3                    2    10    16     7      35
4                    0     0     1     2       3
Total               15    22    25     9      71
Table 6 Response Levels Across Grades for the Luck Survey Question, QS2
           Response levels
Grade      0     1     2     3   Total
3          5     4    17     0      26
6          0     2    26     3      31
9          0     2    14    11      27
Total      5     8    57    14      84
Table 7 Response Levels Across Grades for the Lottery Interview
Question, QI2
           Response levels
Grade      0     1     2     3     4   Total
3          1    22     2     1     0      26
6          1    20     9     4     4      38
9          0    14     5     5    10      34
Total      2    56    16    10    14      98
Table 8 Levels of Response for Luck Survey Question and Lottery
Interview Question
Lottery response     Luck response levels
levels               0     1     2     3   Total
0                    0     1     1     0       2
1                    5     5    39     2      51
2                    0     1     9     4      14
3                    0     1     4     1       6
4                    0     0     3     7      10
Total                5     8    56    14      83
Table 9 Levels of Response for Meaning of Random and Lottery Interview
Questions
Lottery response     Random response levels
levels               0     1     2     3     4   Total
0                    0     1     1     0     0       2
1                    1    18    11    21     1      52
2                    0     1     5     8     0      14
3                    0     0     2     5     2       9
4                    0     0     1    10     3      14
Total                1    20    20    44     6      91
Table 10 Response Levels Across Grades for the Birth Order Interview
Question, QI3
           Response levels
Grade      0     1     2     3     4   Total
3          4     3    11     2     0      20
6          1     8    10     6     4      29
9          1     9     7    10     7      34
Total      6    20    28    18    11      83
Table 11 Levels of Response for Meaning of Random and Birth Order
Interview Questions
Random response      Birth Order response levels
levels               0     1     2     3     4   Total
0                    0     0     0     0     1       1
1                    4     2     6     2     0      14
2                    0     2     7     5     3      17
3                    2    13    14     9     4      42
4                    0     1     1     2     2       6
Total                6    18    28    18    10      80
Table 12 Levels of Response for Birth Order and Lottery Interview
Questions
Lottery              Birth Order response levels
response levels      0     1     2     3     4   Total
1                    5    15    16     8     4      48
2                    0     1    10     2     1      14
3                    1     0     1     2     3       7
4                    0     4     1     6     3      14
Total                6    20    28    18    11      83
Table 13 Correlation between Levels of Response on the Five Tasks
                         Random    Luck      Random      Lottery
                         Survey    Survey    Interview   Interview
Luck Survey              .314**
Random Interview         .592**    .359**
Lottery Interview        .347**    .483**    .481**
Birth Order Interview    -.162     .284*     .126        .301**
*p ≤ .05, **p ≤ .01
Table 14 Longitudinal Change in Level of Response for the Random (R),
Lottery (L), and Birth Order (B) Questions for Each Student by Initial
Grade (3, 6, or 9).
Level at Initial
Interview 1 2 3 4 Total
Level at
Longitudinal
Interview
0 [1]
B3 1 B
R3 1 R
1 L33666666 L6 9 L [12]
B6 B3 2 B
R6 R66 R66 5 R
2 L366 L9 4 L [10]
B6 1 B
R333 R6 R36666699 12 R
3 L3 L6 2 L [17]
B6 B66 3 B
R6 R699 R6 5 R
4 L69 L6669 6 L [17]
B669 B69 B6 6 B
5 R 4 R 13 R 1 R
Total 14 L [24] 2 L [10] 1 L [17] 4 L [6] [57]
5 B 4 B 3 B 1 B