Developing box plots while navigating the maze of data representations.
Duncan, Bruce ; Fitzallen, Noleine
Activity sequence Description Teaching opportunities
1. Posing the problem and identifying the question to be answered.
Description: "Do people complete mazes faster the second time around?" Medium Mazes Set 5: Run-of-the-Mill (www.printablemazes.net)
Teaching opportunities: The process of data collection and representation is shown to have an authentic purpose.
2. The event.
Description: Every student receives a copy of the maze, face down, and is instructed to turn the paper over and attempt the maze when the teacher says, "Go." The teacher starts a stopwatch on a data projector that all students can see and they attempt to complete the maze by drawing a path from start to finish without crossing any lines. When students finish they record the time on the stopwatch as the duration of their attempts.
Teaching opportunities: The teacher may need to establish an upper limit for the duration of this task, by which time some students may not have finished. Stopwatch (www.online-stopwatch.com/large-stopwatch/)
3. Raw data.
Description: The time taken for each student to complete the maze is collected on a board at the front of the classroom. Initially, these data are collected in a random order to produce a list.
Teaching opportunities: The need to organise data can be made clear by first collecting data from students in a random order, such as "around the room."
4. Ordering data.
Description: Students are asked to consider, "How can we make these data easier to read?" and "How can we describe this set of results?"
Teaching opportunities: The advantage in ordering data can be made clear to students by scaffolding discussion about organising the data.
5. Grouping.
Description: [...] be grouped and then group the data according to a strategy selected by the class, which becomes the stem of the stem-and-leaf display.
Teaching opportunities: [...] from a possibly continuous range of measurements and that it therefore makes sense to speak of the frequency of outcomes within specified intervals (grouped data) rather than the frequency of occurrence of particular measurements.
6. Stem-and-leaf displays.
Description: An appropriate scale is determined by discussion and drawn on the board and the data are recorded.
Teaching opportunities: Now the purpose of organising the data can be made clear through discussions that attempt to describe the data set by asking questions such as "What can we say about the data?" The data are analysed, organised, and represented in different ways to identify the range, any skewed distribution, and central tendency. The focus now shifts from students identifying their individual information to looking more broadly at the data from the whole group.
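For teachers who want to check a board display before the lesson, the grouping in steps 5 and 6 can be sketched in Python. This is a minimal sketch and not part of the original activity. It assumes times recorded in whole seconds and uses the tens digit as the stem. The function name `stem_and_leaf` and the times in `trial_1` are hypothetical illustrations, not the article's data.

```python
from collections import defaultdict


def stem_and_leaf(times, stem_width=10):
    """Group whole-number times into stems (tens) and leaves (units)."""
    groups = defaultdict(list)
    for t in sorted(times):
        groups[t // stem_width].append(t % stem_width)
    # Keep empty stems so that gaps in the data stay visible on the scale.
    lowest, highest = min(groups), max(groups)
    return {stem: groups.get(stem, []) for stem in range(lowest, highest + 1)}


# Hypothetical completion times in seconds (assumed for illustration only).
trial_1 = [47, 52, 38, 61, 55, 49, 73, 58, 44, 66, 51]

for stem, leaves in stem_and_leaf(trial_1).items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Changing `stem_width` corresponds to the class choosing a different grouping strategy, for example intervals of 5 seconds instead of 10.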
7. The second event.
Description: The maze activity (step 2) is repeated with the same maze and times recorded.
Teaching opportunities: Discussion should elicit the expectation that durations to complete the maze the second time around may become shorter. This comparison can be discussed informally after the data have been collected but before the data are organised so that the data are seen to confirm an explanation.
9. Organising the second set of data.
Description: Students organise data from Trial 2 into a back-to-back stem-and-leaf display with the data from Trial 1.
Teaching opportunities: This process is a repetition of the process undertaken on the first data set. The opportunity exists, therefore, to allow students to carry out this process with greater independence from the teacher. In the example the data show a very dramatic improvement in times, one that would be obvious from the raw data. A more challenging maze or a younger group of students may produce data that are less markedly different.
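The back-to-back display can be sketched in the same way. The following is an illustration under stated assumptions, not the authors' procedure: the function name `back_to_back`, the choice to place Trial 1 on the left, and both lists of times are hypothetical.

```python
def back_to_back(left, right, stem_width=10):
    """Return the rows of a back-to-back stem-and-leaf display as strings."""
    combined = left + right
    stems = range(min(combined) // stem_width, max(combined) // stem_width + 1)
    rows = []
    for stem in stems:
        # Left-hand leaves are read outward from the stem, so list them in descending order.
        left_leaves = sorted((t % stem_width for t in left if t // stem_width == stem), reverse=True)
        right_leaves = sorted(t % stem_width for t in right if t // stem_width == stem)
        rows.append((" ".join(map(str, left_leaves)), stem, " ".join(map(str, right_leaves))))
    width = max(len(row[0]) for row in rows)
    return [f"{l:>{width}} | {stem} | {r}".rstrip() for l, stem, r in rows]


# Hypothetical completion times in seconds (assumed for illustration only).
trial_1 = [47, 52, 38, 61, 55, 49, 73, 58, 44, 66, 51]
trial_2 = [25, 31, 22, 36, 28, 27, 41, 33, 24, 38, 27]

for row in back_to_back(trial_1, trial_2):
    print(row)
```

Because both trials share one column of stems, the shift of the Trial 2 leaves toward the lower stems is visible without any calculation.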
10. Comparing data sets: Representations with a shared scale in a back-to-back stem-and-leaf display.
Description: [...] does this representation help us answer our question? Are the second times faster? Why do you say that?"
Teaching opportunities: [...] comparing data sets on a common scale. Once again the discussion should be guided by the purpose so a good, guiding question here is, "How can we compare your maze completion time from Trial 1 with the completion time in Trial 2?" Discussion includes the comparison of the characteristics of each data set: range, skew, central tendency.
11. Medians and quartiles.
Description: Students discuss "What is the middle score?" or "What score divides this group in half?"
Teaching opportunities: Establishment of these features pre-empts the box plots but the discussion must focus students' understanding on these terms as characteristics of the population, not the range. Once students understand that the median is determined by considering the number of scores in order, rather than the value of each score, the concept of quartiles, dividing the population into four equal sized groups, follows as a natural progression.
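The idea that the median and quartiles depend on the position of scores in the ordered list, rather than on their values, can be made concrete with a short sketch. It assumes one common school convention: each quartile is the median of the lower or upper half of the ordered data, with the middle score left out of both halves when the number of scores is odd. The article does not state which convention it uses, and the function names and times are hypothetical.

```python
def median(ordered):
    """Middle score of an already ordered list, found by counting positions."""
    n = len(ordered)
    mid = n // 2
    if n % 2:  # odd count: a single middle score
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair


def quartiles(times):
    """Return (first quartile, median, third quartile) using the halves convention."""
    ordered = sorted(times)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return median(lower), median(ordered), median(upper)


# Hypothetical completion times in seconds (assumed for illustration only).
trial_1 = [47, 52, 38, 61, 55, 49, 73, 58, 44, 66, 51]
print(quartiles(trial_1))
```

Other conventions interpolate between scores and can give slightly different quartiles for small classes, which is itself a useful point for discussion.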
12. Box plots.
Description: Students identify the five points on the stem-and-leaf display (minimum, first quartile, median, third quartile, maximum) and mark against the same scale to create the box plot.
Teaching opportunities: Box plots can be seen as simplified stem-and-leaf displays. Although the detail of each datum is lost, the simplification of this representation allows the data set to occupy less space and, therefore, makes box plots appropriate for the purpose of comparison.
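A rough sketch of how the five points reduce each data set to a compact shape on a shared scale is given below. It is an illustration only: the quartile convention (medians of the lower and upper halves), the function names, the text rendering, the scale limits of 20 and 80 seconds, and the times are all assumptions and do not come from the article.

```python
from statistics import median


def five_number_summary(times):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    ordered = sorted(times)
    half = len(ordered) // 2
    return (ordered[0], median(ordered[:half]), median(ordered),
            median(ordered[-half:]), ordered[-1])


def text_box_plot(summary, lo, hi, width=61):
    """Draw one box plot as text: '-' whiskers, '=' box, '|' median."""
    def column(value):
        # Map a time onto a character position along the shared scale.
        return round((value - lo) / (hi - lo) * (width - 1))

    minimum, q1, med, q3, maximum = (column(v) for v in summary)
    line = [" "] * width
    for i in range(minimum, maximum + 1):
        line[i] = "-"
    for i in range(q1, q3 + 1):
        line[i] = "="
    line[med] = "|"
    return "".join(line)


# Hypothetical completion times in seconds (assumed for illustration only).
trial_1 = [47, 52, 38, 61, 55, 49, 73, 58, 44, 66, 51]
trial_2 = [25, 31, 22, 36, 28, 27, 41, 33, 24, 38, 27]

for label, times in (("Trial 1", trial_1), ("Trial 2", trial_2)):
    summary = five_number_summary(times)
    print(f"{label} {text_box_plot(summary, lo=20, hi=80)} {summary}")
```

Each trial now occupies a single line, which illustrates the trade-off described above: individual times are no longer visible, but the two sets are easy to compare.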
13. Answering the question.
Description: "Do people complete mazes faster the second time around?" Attention can then be given to thinking about the informal inferences that can be made from the data, asking "Do you think another group of students would get the same result? Can we claim that students always complete Trial 2 quicker than Trial 1?"
Teaching opportunities: Comparison of the two box plots shows that the interquartile ranges do not overlap; therefore, the claim can be made that the people in the group were faster the second time round. Note that the first quartile and the median in Trial 2 fall at the same point on the vertical scale. That results in an unconventional looking box (interquartile range). Anomalies such as this arise when using real life data and present the opportunity to discuss why the representation looks different to what was expected.
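Both observations in step 13 can be checked with a small sketch. It is offered as an illustration under assumptions, not as the article's analysis: the quartile convention (medians of the lower and upper halves), the function names, and the times are hypothetical. The Trial 2 list was deliberately constructed with several tied times so that the first quartile and the median coincide.

```python
from statistics import median


def quartile_box(times):
    """Return (first quartile, median, third quartile) for one trial."""
    ordered = sorted(times)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered), median(ordered[-half:])


def boxes_overlap(box_a, box_b):
    """True if the two interquartile ranges share any part of the scale."""
    q1_a, _, q3_a = box_a
    q1_b, _, q3_b = box_b
    return q1_a <= q3_b and q1_b <= q3_a


# Hypothetical completion times in seconds (assumed for illustration only).
trial_1 = [47, 52, 38, 61, 55, 49, 73, 58, 44, 66, 51]
trial_2 = [20, 22, 25, 25, 25, 25, 25, 28, 30, 34, 40]  # many tied times

box_1, box_2 = quartile_box(trial_1), quartile_box(trial_2)
print("Interquartile ranges overlap:", boxes_overlap(box_1, box_2))
print("Trial 2 first quartile equals median:", box_2[0] == box_2[1])
```

Non-overlapping boxes support an informal claim about this group only. Whether another group would produce the same result remains a question for discussion, not something the check settles.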