Designing more effective teaching of comprehension in culturally and linguistically diverse classrooms in New Zealand.
McNaughton, Stuart; Lai, Mei; MacDonald, Shelley, et al.
A central goal in New Zealand's National Literacy Strategy
(Ministry of Education, 2002) is to understand and reduce disparities in
achievement in the first four years of schooling. The focus has been on
schools serving the communities with the lowest employment rates and
income levels, with the highest proportions of Maori (indigenous) and
Pasifika (from Pacific Islands communities) students (termed
'decile 1' schools). There have been a number of national and
local initiatives since 1998 and there is evidence for increased
effectiveness of early literacy instruction. The National Education
Monitoring Project's (Flockton & Crooks, 2001) second cycle of
assessments of reading showed reduced numbers of children in the lowest
bands at year four (8 year olds) in reading accuracy. A recent renorming
of standardised assessments after one year at school (6.0 years)
suggests knowledge of letters and sounds is higher (Clay, 2002).
Similarly, research-based interventions in low decile schools have
demonstrated teachers can raise rates of progress significantly across a
broad band of early literacy measures, including accurate decoding of
text (Phillips, McNaughton & MacDonald, 2004).
But a major challenge has been created by these advances. In part
this is an issue of sustainability, a concern to build further progress
in literacy, adding to the more effective instruction in the early
years. This inevitably means considering the quality of the teaching and
learning of comprehension (Sweet & Snow, 2003). The issue in the
'decile 1' schools is that the subsequent instructional
conditions set channels for further development, and if the channels are
constructed for relatively 'low' gradients of progress, this
creates a need for further intervention. Unfortunately, the available
evidence shows that despite the gains in decoding, there are still wide
and possibly increasing disparities in achievement on comprehension
tasks for Maori and Pasifika children, and particularly in low decile
schools (Flockton & Crooks, 2001; Hattie, 2002; Lai, McNaughton,
MacDonald & Farry, 2004).
The instructional need is for high quality teaching practices with
known relationships with the learning of linguistically and culturally
diverse students (Alton-Lee, 2003). But although effective practices
may be identifiable, there is a further concern: sustaining high
quality intervention (Coburn, 2003), which, it now seems, is dependent
on the degree to which a professional learning community is able to
develop. Such communities can sustain deep and meaningful change in
schools (Toole & Seashore, 2002), and can effectively change teacher
beliefs and practice (Annan, Lai & Robinson, 2003; Hawley &
Valli, 1999; Timperley & Robinson, 2001).
There are several critical features of a collaboration between
teachers and researchers which would contribute to such a community
developing (see Coburn, 2003; Toole & Seashore, 2002; Robinson &
Lai, in press). One is the need for the community's shared ideas,
beliefs and goals to be theoretically rich. This shared knowledge is
about the target domain (in the present case of comprehension), but it
also entails detailed understanding of the nature of teaching and
learning related to that domain (Coburn, 2003). Yet a further area of
belief that has emerged as very significant in the achievement of
linguistically and culturally diverse students in general, but for
indigenous and minority children in particular, is the expectations that
teachers have about children and their learning (Bishop, 2004; Delpit,
2003; Timperley, 2003).
Being theoretically rich requires not just consideration of
researchers' theories, but also consideration of
practitioners' theories and adjudication between them. Robinson
& Lai (in press) provide the framework by which different theories
can be negotiated using four standards of theory evaluation. These are
accuracy (empirical claims about practice are well founded in evidence),
effectiveness (theories meet the goals and values of those who hold
them), coherence (competing theories from outside perspectives are
considered) and improvability (theories and solutions can be adapted to
meet changing needs or incorporate new goals, values and contextual
constraints).
This means that a second feature of an effective learning community
is that their goals and practices for an intervention are based on
evidence. That evidence should draw on close descriptions of
children's learning, as well as descriptions of patterns of
teaching, and systematic data on both learning and teaching would need
to be collected and analysed together. That assessment data would need
to be broad-based in order to know about the children's patterns of
strengths and weaknesses, to provide a basis for informed decisions
about teaching, and to clarify and test hypotheses about how to develop
effective and sustainable practices (McNaughton, Phillips &
MacDonald, 2003).
However what is also crucial is the validity of the inferences
drawn, or claims made about the evidence (Robinson & Lai, in press).
The case reported in Buly and Valencia (2002), for example, shows how
inappropriate inferences drawn from the data can result in interventions
that are mismatched to students' learning needs. Robinson and Lai
(in press) suggest that all inferences be treated as competing theories
and evaluated.
So this requires a further feature, which is an analytic stance on
the collection and use of evidence. One part of this is that a research
framework needs to be designed to show whether and how planned
interventions do impact on teaching and learning, enabling the community
to know how effective interventions are in meeting its goals. The
research framework adopted by the community needs therefore to be staged
so that the effect of interventions can be determined. The design part
of this is by no means simple, especially when considered in the context
of recent debates about what counts as appropriate research evidence
(McCall & Green, 2004; McNaughton & MacDonald, 2004).
Another part of the analytic stance is the critical reflection on
practice, rather than a comfortable collaboration in which ideas are
simply shared (Annan, Lai & Robinson, 2003; Ball & Cohen, 1999;
Toole & Seashore, 2002). Recent New Zealand research indicates that
collaborations which incorporate critical reflection have been linked to
improved student achievement (Phillips et al., 2004; Timperley, 2003)
and changed teacher perceptions (Timperley & Robinson, 2001).
A final feature is that the researchers' and teachers'
ideas and practices need to be culturally located. We mean by this that
the ideas and practices that are developed and tested need to entail an
understanding of children's language and literacy practices, as
these reflect children's local and global cultural identities.
Importantly this means knowing how these practices relate (or do not
relate) to classroom practices (New London Group, 1996).
The present report describes the results of these processes in
action, as researchers and practitioners developed a community to meet
the challenge of building more effective instruction for reading
comprehension in linguistically and culturally diverse urban schools. In
the following report we describe the results of the first stage of a
three-year research and development partnership between schools and
researchers. In this stage evidence was collected and analysed by
literacy leaders, classroom teachers and researchers, and hypotheses
developed about the teaching and learning needs in the schools. The
second phase, currently in progress, involves a more direct intervention
through professional development.
The research design we have employed enables us to systematically
analyse the effects of the initial feedback and analysis process. In a
previous effective intervention for beginning literacy instruction with
these schools the data collection and feedback stage was part of a
professional development process (Phillips et al., 2004). But current
research on learning communities suggests that the critical discussion
and analysis of data may have an independent impact on effective
practice (Ball & Cohen, 1999; Timperley, 2003; Toole & Seashore,
2002). Theories about the needs for teaching and learning are developed
through critical discussion, which is predicted to strengthen shared
understanding and to inform current practices. Given the predicted
significance of this stage we wanted to examine the effects of this
process prior to the planned professional development stage.
Methods
Participants
The overall partnership involves teachers and leaders in seven
'decile 1' schools in a New Zealand Ministry of Education
school improvement initiative, researchers from the University of
Auckland, and New Zealand Ministry of Education representatives.
Baseline data (in February 2003) were collected from 1216 students in
six (one school who joined the partnership was unable to participate in
the first round of data collection) of the schools at the following
levels: Year 4 (mean age 8 years, n = 205), Year 5 (mean age 9 years, n
= 208), Year 6 (mean age 10 years, n = 265), Year 7 (mean age 11 years,
n = 267) and Year 8 (mean age 12 years, n = 271). The total group
consisted of equal proportions of males and females (50% and 50%
respectively) from 14 ethnic groups, the major groups being Samoan
(33%), Maori (20%), Tongan (19%) and Cook Island (15%). Approximately
half the children have a home language other than English.
Design
Repeated measures of children's achievement were collected in
February 2003 (Time 1), November 2003 (Time 2) and February 2004 (Time
3), as part of a quasi-experimental design (McNaughton & MacDonald,
2004). The design uses single-case logic within a developmental
framework of cross-sectional and longitudinal data. The measures at Time
1 generated a cross section of achievement across year levels (Years
4-5-6-7-8), which provides a baseline forecast of what the expected
trajectory of development would be if planned interventions had not
occurred (Risley & Wolf, 1973). Successive stages of an intervention
can then be compared with the baseline forecast. In the present case the
first of these planned interventions has been the analysis and feedback.
This design provides a high degree of both internal and external
validity, overcoming the inadequacies of experimental and randomised
control-group designs in the open and messy applied systems that schools are.
The cross sectional baseline was established at Time 1. Students
from that initial cohort were re-tested at Time 2, providing
longitudinal data over a school year. Where possible the same students
were tested again at Time 3 over a full 12 months, when the students
were now in the next level class. Two of the schools, known as
'contributing' schools, had year levels only to Year 6. Their
students left at the end of the year (after Time 2) and attended
'intermediate' schools not in the cluster of schools. The
other four schools had levels up to Year 8. At the end of the year these
students went to 'secondary' schools. Hence in the analyses
reported below we first examine achievement over the school year from
Time 1 to Time 2. These are essentially pre-post measures. But because
they can be corrected for age through transformation into
stanine scores (Elley, 2001), they provide an indicator of the impact of
the analysis and feedback stage against national distributions at
similar times of the school year. However, a more robust analysis of
relationships with achievement is provided by the Time 1 and Time 3
data, when they are used within the quasi-experimental design format.
They show change over a repeated interval, established as 12 months by
the cross sectional baseline. Numbers of students are smaller from Time
1 to Time 3 because of the 'loss' of students moving out of
two of the schools and other movement away from schools.
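The age correction in this design rests on stanines, a standard nine-point normalised scale (mean 5, SD 2) defined by fixed percentile bands of the norming sample. As an illustrative sketch only, the general percentile-to-stanine mapping can be written as follows; the actual STAR conversion uses Elley's (2001) published norm tables for each year level and time of year, which are not reproduced here.

```python
# Map a percentile rank (0-100) to a stanine (1-9).
# Stanines partition the distribution into nine bands holding
# 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent of scores respectively.
# Illustrative sketch: real conversions use published norm tables.

from itertools import accumulate

STANINE_BANDS = [4, 7, 12, 17, 20, 17, 12, 7, 4]  # percent in each band
CUTOFFS = list(accumulate(STANINE_BANDS))         # cumulative upper bounds

def stanine(percentile: float) -> int:
    """Return the stanine (1-9) whose band contains this percentile rank."""
    for band, upper in enumerate(CUTOFFS, start=1):
        if percentile <= upper:
            return band
    return 9

# A student at the 50th percentile falls in stanine 5; the cohorts
# described here averaged around stanine 3, i.e. roughly the
# 11th-23rd percentile nationally.
```

This makes concrete why a flat stanine profile across year levels implies "a year's gain for a year at school": students hold their position relative to the national distribution without closing the gap.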
Measures
We report here outcome data on reading comprehension across the
three time points using the Supplementary Tests of Achievement in
Reading (STAR, Elley, 2001). These tests were designed for repeated
measurement within and across years, are used by schools and provide a
recognised, standardised measure of reading comprehension which can be
reliably compared across schools. The schools used other reading
measures for both diagnostic and summative purposes, and the baseline
results for these are reported elsewhere (Lai et al., 2004).
STAR was designed to supplement the assessments that the teachers
make about students' 'close' reading ability in Years 4
to 9 in New Zealand (Elley, 2001). In Years 4 to 6, the test consists of
four sub-tests measuring word recognition (decoding of familiar words
through identifying a word from a set of words that describe a familiar
picture), sentence comprehension (complete sentences by selecting
appropriate words), paragraph comprehension (replace words which have
been deleted from the text in a 'Cloze' format) and vocabulary
range (find a simile for an underlined word). In Years 7 and 8, students
complete two more sub-tests, involving understanding the language of
advertising (identify emotive words from a series of sentences) and
reading different genres or styles of writing (select phrases in
paragraphs of different genres which best fit the purpose and style of
the writer).
Procedure
At the beginning of the year, the school leaders and researchers
developed an intra-school standardised process of administering the test
and moderating the accuracy of teacher scoring. This involved
standardising the time of testing and creating a system of randomly
checking a sample of teachers' marking for accuracy of scoring.
Accuracy of scoring was further checked by the research team during data
entry and analysis. The STAR was administered as part of schools'
normal assessment cycle at the beginning of the school year.
Classroom Observations. Systematic classroom observations were
carried out by the research team. These were designed to provide a
sample of how features of teaching and learning might map on to the
achievement data. Observations using diary and audio recording
procedures were carried out in 16 classrooms in seven schools from Year
4 through to Year 8 (including one bilingual Samoan classroom).
Classroom instruction was observed for 40-60 minutes during the usually
scheduled core reading session, within which the teaching and learning
of reading comprehension occurred. Class sizes generally ranged from
21-26 students. Discussions with teachers also provided an important
level of professional reflection. Specific details from these
observations are reported further in the paper when the student
achievement profiles are discussed.
Observations revealed that the general reading programme was
similar in most classes. A whole class activity typically was followed
by group work, with the (2-5) groups organised by ability level.
Typically the whole class activity involved the teacher introducing and
sharing a text, over about 20 minutes. For the following (up to) 40
minutes, small group activities occurred, including text study and
analysis (such as study of plot or character in narrative texts, and
extracting and using information in informational texts), specific group
or paired forms of instructional / guided reading (such as
'reciprocal teaching'), and individual or group project work
(such as developing taxonomies of categories in science topics).
Typically, the teacher worked with two groups over this time period and
held conferences on the run with other groups.
The general programme appeared to work well in terms of levels of
engagement, which were high, with routines well established and frequent
instances of teacher-student and student-student interactions. The
general organisation meant whole class activities occurred three to five
days per week, and small group work often occurred daily. Each group had
at least one session and up to three sessions with direct teacher
guidance each week, although the variation in frequency of contact with
each group was quite marked between schools. Across all classrooms some
form of direct strategy instruction was observed during whole class or
small group instruction. When not with the teacher, groups engaged in a
range of activities, including peer guidance in strategy instruction, or
completing worksheets containing questions about a text with sentence,
word, or sub word studies.
Analysis and feedback process
Student and classroom data were analysed in several steps involving
each of the schools. The analysis and feedback process involved
two key steps: firstly, a close examination of students' strengths
and weaknesses, and of current instruction, to understand learning and
teaching needs; and secondly, raising competing theories of the
'problem' and evaluating the evidence for these competing
theories. This meant using standards of accuracy, coherence and
improvability (Robinson & Lai, in press). This process further
ensured that the collaboration was a critical examination of practice
and that valid inferences were drawn from the information. The feedback
procedures with examples are described fully in Robinson and Lai (in
press).
Using this framework for analysis, area-wide data were analysed by
the school leaders and researchers in two meetings, then analysed by
senior managers and senior teachers within each school using the specific
school data. Additional sessions were conducted with support from the
second author.
As a result of the analysis and feedback process several theories
about effective instruction in these schools were developed (described
further in Lai et al., 2004). The data ruled out decoding problems
(low accuracy and lack of fluency) as a general
explanation (Pressley, 2002). Similarly the data did not support a
prediction that because of their community literacy practices
(McNaughton, 1995) many of the Pasifika students would necessarily be
stronger at answering recall questions than inferential questions.
This raised the question of whether strategy instruction and strategy
use was a major problem. However, classroom observations suggested that
the direct teaching of comprehension strategies was a strength
(Pressley, 2002).
However, more analysis generated a specific hypothesis about
strategy instruction. The most difficult subtest across all ages and
types of students was a cloze subtest. The primary pattern of responding
in this subtest involved inserting a word that made sense up to the
cloze gap, but tended not to make semantic or syntactic sense in the
post-cloze context. This pattern has been
found elsewhere. For example, Dewitz and Dewitz (2003) described a small
group of 5th grade readers who were fast, efficient decoders, but had
low comprehension scores. Error analysis revealed high rates of errors
termed 'excessive elaborations' (i.e., guessing).
Classroom observation showed that there were only 9 instances in 16
hours of observation where there was overt checking of predictions and
shared attempts to clarify meanings of words or passages by checking
with texts or dictionaries or thesauruses. And yet there were many
instances of rich discussion involving prediction and clarification in
which students could identify and explain the strategies of prediction
and clarifying. This suggested a specific teaching and learning need in
using textual sources for evidence to detect, check and solve threats to
generating accurate text-based meaning.
The achievement patterns and classroom observations suggested that
vocabulary levels may have been constraining comprehension (Biemiller,
1999), and other researchers have found that children in the middle
grades for whom English is a second language may have difficulties with
word meanings on English comprehension tests (Buly & Valencia,
2002). The observations also suggest that the overall density of
instruction needs to be increased (McNaughton, 2002; Stanovich, West,
Cunningham, Cipielewski, & Siddiqui, 1996). That means rates of
instructional components such as teacher feedback and elaboration of
language needed to be increased per child, and that access to varied
high quality texts also needed to be increased.
Finally, the observations and the achievement data revealed that
the identification and use of cultural and linguistic resources could be
increased. Specifically, there was a need to understand better
children's complex thinking in familiar everyday activities and to
find ways to bridge from these capabilities into the less familiar
classroom texts and activities (Lee, Spencer & Harpalani, 2003).
Results
The cross-sectional baseline, which provided a projected course of
development across year levels, is shown in Figure 1. The baseline had
two general features. The first is that the average achievement of the
cohorts across years in the 6 schools was two stanines below national
averages (mean = 3.15), representing about a two year disparity in
average achievement on these school literacy measures. The second
feature was the relatively flat line in stanines across year levels.
Together they indicated that under initial instructional conditions
children made about a year's gain for a year at school, remaining
at two stanines below national average across years.
[FIGURE 1 OMITTED]
A first step in analysing patterns over time is to compare
achievement in February 2003 with achievement at the end of the year in
November 2003. When stanines are used, obtained levels have been
adjusted for the change in age (Elley, 2001), and hence this comparison
provides an initial indication of the impact on student achievement of
the cluster-wide and school-based analysis of evidence and feedback. At
each age level, t-test comparisons showed there was a significant gain in
stanine, with effect sizes (using stanines) ranging from 0.42 to 1.37
(see Table 1).
There was an overall gain in achievement across year levels of 0.52 of
a stanine. This meant that, relative to the baseline pattern in which
students gained about one year for a year at school, students gained
about 18 months.
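The article does not spell out the formula behind the effect sizes in Tables 1 and 2; repeated-measures studies often divide the mean gain by the standard deviation of individual gain scores, which cannot be recovered from the tabled summary statistics. As a generic illustration only, a pooled-SD Cohen's d can be computed from the tabled means and SDs, with the caveat that it will not reproduce the published values.

```python
import math

def cohens_d(mean1: float, sd1: float, mean2: float, sd2: float) -> float:
    """Cohen's d using the pooled standard deviation of the two score sets.

    Illustrative assumption: the article's own effect sizes likely use a
    different (repeated-measures) formula, so this will not match them.
    """
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean2 - mean1) / pooled_sd

# Using the Time 1 / Time 2 totals from Table 1 (M 3.13, SD 1.45 vs
# M 3.66, SD 1.64) gives d of about 0.34 by this pooled-SD formula.
```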
In design terms this initial analysis simply uses pre and post
measures with no comparison against control groups or equivalents.
Therefore, gains cannot be systematically attributed to what
schools did. To better test this attribution, gains over a full
chronological year were analysed in parallel comparisons with the
projected means for each year established by the cross-sectional
baseline. The design as shown in Figure 1 demonstrates that the analysis
and feedback phase was systematically associated with increases in
stanines. In three out of four year levels there was a significant
difference in mean achievement between each cohort after a year compared
with the projected level for that cohort established at baseline,
varying between 0.38 and 1.03 stanines. The effect sizes using stanines
ranged from 0.26 to 0.68 (see Table 2).
Further replication is built into the design when these results are
examined across schools. The schools' results, collapsed across
year levels, are shown in Figure 2. There was a similar order of change
across the schools (mean gain = 0.74 stanine; range 0.44 to 0.96
stanine), which in each case was statistically significant (p < .01).
There were no obvious ceiling or floor effects. That is, schools with
higher initial achievement averages did not gain more and schools with
lower initial achievement averages did not gain less than other schools
(or vice versa).
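The check for ceiling and floor effects described above can be sketched as a correlation between each school's initial mean and its gain: a strong positive r would indicate a Matthew effect, a strong negative r floor-driven catch-up, and a value near zero the robustness reported here. The per-school values below are hypothetical, since the article reports only the range of gains (0.44 to 0.96 stanines), not individual school figures.

```python
from math import sqrt

def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient (plain implementation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical Time 1 school means (stanines) and gains, for
# illustration only; the article does not publish per-school data.
time1_means = [2.8, 3.0, 3.1, 3.3, 3.4, 3.6]
gains       = [0.7, 0.9, 0.5, 0.8, 0.6, 0.9]

r = pearson_r(time1_means, gains)
# A small |r| would match the reported absence of ceiling/floor effects.
```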
[FIGURE 2 OMITTED]
Discussion
The significance of research-based evidence to inform educational
policy and practice has been a major theme in recent commentaries on
improving outcomes for children (McCardle & Chhabra, 2004), and
especially in the case of children with cultural and linguistic
identities associated with 'minority' status in poorer schools
(Alton-Lee, 2003). While searching for an evidence base for effective
reading instruction is important, it is also important to demonstrate
that the use of that evidence can make a difference, and to understand
the mediation processes in the use of that evidence.
We have explored the significance of the mediation process in the
first phase of a three year project based on a research-practice
collaboration. Data on levels of achievement and students'
comprehension were collected across age levels and across a cluster of
seven schools. In addition, observations of classroom practice provided
details of current patterns of instruction. These two sources of
evidence were fed back to school leaders and classroom teachers who,
with the research team, then systematically analysed and developed
hypotheses about teaching and learning needs. This process of critical
discussion and analysis of data within the school cluster was based on
previous research suggesting that the critical examination of practice
in collaborative groups can be effective in creating meaningful and
sustainable changes in practice (e.g., Ball & Cohen, 1999;
Timperley, 2003; Toole & Seashore, 2002).
The results indicate that the feedback and analysis process had a
detectable impact on children's achievement in comprehension that
was statistically significant as well as being educationally
significant. It is important to see national benchmarks as not the only,
nor indeed necessarily the most important, means of judging educational
outcomes. However, in the context of a long-term pattern of continued
disparities in achievement despite more effective early instruction,
these results are important indicators.
These outcomes show that gathering systematic profiles of
children's achievement (McNaughton et al., 2003) and classroom
instruction provides one important mechanism for problem solving.
However, the current analysis does not permit us to separate out the
relative significance of achievement data and classroom instruction
data for the formation and testing of theories. But clearly the
instruction data add importantly to the value of the achievement
data. Patterns in the
children's data can be married with patterns in the classroom
instruction data. For example, without the classroom observation data
the patterns of errors in the cloze tests might have suggested the need
for more explicit teaching of comprehension strategies (Pressley, 2002).
However, the observations revealed that explicit teaching was generally
present and occupied significant amounts of teaching time. Rather, the
issue was more directly a problem in the purpose of using strategies,
that is, constructing meaning from and enjoyment of texts, and
specifically the need to use evidence within texts to support those
purposes. There are some anecdotal references to this potentially being
a problem in strategy instruction (Dewitz & Dewitz, 2003; Moats,
2004), but specific observations for this context were needed.
An interesting feature of the school analysis is that there were no
differences in gains associated with overall initial achievement levels
in schools. It might be expected that schools with initially higher
achievement levels would benefit more from the analysis and feedback
process, analogous to Matthew effects for individuals and groups
(Bandura, 1995; Stanovich, 1986). Conversely, it might be expected that
schools with lower achievement levels would make more gains because of
more 'headroom', meaning it would be easier to achieve some
shifts where initial achievement was very low. The absence of these effects
suggests that the processes of analysis and feedback were quite robust
across schools.
We are currently engaged in a second phase of the project, which
entails intensive professional development with all the teachers. This
second phase will provide important information about the value of
professional development to add to the initial feedback and analysis.
References
Alton-Lee, A. (2003). Quality Teaching for Diverse Students in
Schooling: Best Evidence Synthesis. Report to the Ministry of Education.
Wellington: Ministry of Education.
Annan, B., Lai, M.K. & Robinson, V.M.J (2003). Teacher talk to
improve teaching practices. SET, 1, 31-35.
Ball, D.L. & Cohen. D.K. (1999). Developing practice,
developing practitioners: Toward a practice-based theory of professional
education. In L. Darling-Hammond & G. Sykes (Eds) Teaching as the
Learning Profession. San Francisco: Jossey-Bass, 3-32.
Bandura, A. (1995). Exercise of personal and collective efficacy in
changing societies. In A. Bandura (Ed.) Self Efficacy in Changing
Societies. Cambridge, MA: Cambridge University Press.
Biemiller, A. (1999). Language and Reading Success. Cambridge, MA:
Brookline Books.
Bishop, R. (2004). Te kotahitanga,
www.minedu.govt.nz/goto/tekotahitanga
Buly, M.R. & Valencia, S.W. (2002). Below the bar: Profiles of
students who fail state reading assessments. Educational Evaluation and
Policy Analysis, 24(3), 219-239.
Clay, M.M. (2002). An Observation Survey of Early Literacy
Development. Auckland: Heinemann Educational.
Coburn, C.E. (2003). Rethinking scale: Moving beyond numbers to
deep and lasting change. Educational Researcher, 32(6), 3-12.
Delpit, L. (2003). Educators as 'Seed people' growing a
new future. Educational Researcher, 32(7), 14-21.
Dewitz, P. & Dewitz, P.K. (2003). They can read the words but
they can't understand: Refining comprehension assessment. The
Reading Teacher, 56(5), 422-435.
Elley, W.B. (2001). Supplementary Tests of Achievement in Reading.
Wellington: New Zealand Council of Educational Research.
Flockton, L.T. & Crooks, T. (2001). Reading and Speaking:
Assessment Results 2000. Dunedin: Educational Assessment Research Unit.
Hattie, J. (2002). What are the attributes of excellent teachers?
In Teachers make a difference: What is the research evidence? Conference
Proceedings October 2002. Wellington: NZCER.
Hawley, W.D. & Valli, L. (1999). The essentials of effective
professional development. In L. Darling-Hammond & G. Sykes (Eds)
Teaching as the Learning Profession: Handbook of Policy and Practice.
San Francisco: Jossey-Bass, 127-150.
Lai, M.K., McNaughton, S., MacDonald, S. & Farry, S. (2004).
Profiling Reading Comprehension in Mangere Schools: A Research and
Development Collaboration. Manuscript submitted for publication.
Lee, C.D., Spencer, M.B. & Harpalani, V. (2003). Every shut eye
ain't sleep: Studying how people live culturally. Educational
Researcher, 32(5), 6-13.
McCall, R.B. & Green, B.L. (2004). Beyond the methodological
gold standards of behavioural research: Considerations for practice and
policy. Social Policy Report. Giving Child and Youth Development
Knowledge Away, Vol. XVIII, No. II. Society for Research in Child
Development.
McCardle, P. & Chhabra, V. (Eds) (2004). The Voice of Evidence
in Reading Research. Baltimore: Brookes.
McNaughton, S. (1995). Patterns of Emergent Literacy: Processes of
Development and Transition. Auckland: Oxford University Press.
McNaughton, S. (2002). Meeting of Minds. Wellington: Learning
Media.
McNaughton, S. & MacDonald, S. (2004). A Quasi-experimental
Design with Cross-sectional and Longitudinal Features for Research-based
Interventions in Educational settings. Manuscript submitted for
publication.
McNaughton, S., Phillips, G.E. & MacDonald, S. (2003).
Profiling teaching and learning needs in beginning literacy instruction:
The case of children in 'low decile' schools in New Zealand.
Journal of Literacy Research, 35(2), 703-730.
Ministry of Education (2002). Curriculum Update. Issue 50, July.
Moats, L.C. (2004). Science, language, and imagination in the
professional development of reading teachers. In P. McCardle & V.
Chhabra (Eds) The Voice of Evidence in Reading Research. Baltimore:
Brookes, 269-287.
New London Group (1996). A pedagogy of multiliteracies: Designing
social futures. Harvard Educational Review, 66, 60-92.
Phillips, G.E., McNaughton, S. & MacDonald, S. (2001). Picking
up the pace: Effective literacy for accelerated progress over the
transition into decile 1 schools. Report to the Ministry of Education.
Wellington: Ministry of Education.
Phillips, G., McNaughton, S. & MacDonald, S. (2004). Managing
the mismatch: Enhancing early literacy progress for children with
diverse language and cultural identities in mainstream urban schools in
New Zealand. Journal of Educational Psychology, 96(2), 309-323.
Pressley, M. (2002). Comprehension strategies instruction: A turn
of the century status report. In C.C. Block & M. Pressley (Eds)
Comprehension Instruction: Research-based Best Practice. New York: The
Guilford Press, 11-27.
Risley, T.R. & Wolf, M.M. (1973). Strategies for analysing
behavioral change over time. In J.R. Nesselroade & H.W. Reese (Eds)
Life-span Developmental Psychology: Methodological Issues. New York:
Academic Press, 175-183.
Robinson, V.M.J. & Lai, M.K. (in press). Practitioners as
Researchers: Making it Core Business. Corwin Press.
Stanovich, K.E. (1986). Matthew effects in reading: Some
consequences of individual differences in the acquisition of literacy.
Reading Research Quarterly, 21(4), 360-401.
Stanovich, K.E., West, R.F., Cunningham, A.E., Cipielewski, J.
& Siddiqui, S. (1996). The role of inadequate print exposure as a
determinant of reading comprehension problems. In C. Cornoldi & J.
Oakhill (Eds) Reading Comprehension Difficulties: Processes and
Intervention. Mahwah, NJ: Lawrence Erlbaum, 15-32.
Sweet, A.P. & Snow, C.E. (Eds) (2003). Rethinking Reading
Comprehension. New York: Guilford Press.
Timperley, H. (2003). Shifting the Focus: Achievement Information
for Professional Learning. New Zealand: Ministry of Education.
Timperley, H.S. & Robinson, V.M.J. (2001). Achieving school
improvement through challenging and changing teachers' schema.
Journal of Educational Change, 2, 281-300.
Toole, J.C. & Seashore Louis, K. (2002). The role of
professional learning communities in international education. In K.
Leithwood & P. Hallinger (Eds) Second International Handbook of
Educational Leadership and Administration. Dordrecht, The Netherlands:
Kluwer Academic, 245-279.
Stuart McNaughton, Mei Lai, Shelley MacDonald and Sasha Farry
UNIVERSITY OF AUCKLAND
Table 1. Mean student achievement in comprehension
(in stanines) across year levels at Time 1 and Time 2
Year level            Time 1 (Feb 03)     Time 2 (Nov 03)      Effect size
Year 4 (n = 205)      M 3.27 (SD 1.32)    M 3.59 ** (SD 1.41)  0.42
Year 5 (n = 208)      M 3.52 (SD 1.52)    M 4.10 ** (SD 1.55)  0.82
Year 6 (n = 265)      M 3.16 (SD 1.56)    M 3.54 ** (SD 1.58)  0.53
Year 7 (n = 267)      M 2.84 (SD 1.31)    M 3.44 ** (SD 1.64)  1.07
Year 8 (n = 271)      M 2.99 (SD 1.46)    M 3.73 ** (SD 1.84)  1.37
Total (n = 1216)      M 3.13 (SD 1.45)    M 3.66 ** (SD 1.64)  0.81
* p < .01 ** p < .001
Table 2. Mean student achievement in comprehension (in stanines)
across year levels at Time 1 and Time 3
Year level            Time 1 (Feb 03)     Time 3 (Feb 04) (a)  Effect size
Year 4 (n = 174)      M 3.28 (SD 1.31)    M 3.82 * (SD 1.41)   0.26
Year 5 (n = 193)      M 3.44 (SD 1.55)    M 4.02 ** (SD 1.68)  0.68
Year 6 (n = 119)      M 2.99 (SD 1.53)    M 3.11 (SD 1.34)     0.19
Year 7 (n = 238)      M 2.89 (SD 1.29)    M 3.92 ** (SD 3.09)  0.51
Year 8 (n = 306) (b)  M 2.94 (SD 1.44)    NA                   NA
* p < .01
** p < .001
(a) Diagonal lines show the direction of comparison between
the obtained level after a year (Time 3) and the projected
level established at Time 1.
(b) The baseline cohort at Year 8 involved all students
present at Time 1.