What works?
Roberts, Helen
Abstract
"What works?" is a fundamental question for policy
makers, practitioners and service users. As well as asking what works,
and how to implement it, we also need to think about what does not work,
and how to stop it. If, as many of us believe (and as there is good
evidence to demonstrate), the most effective time to make a difference
to outcomes is in childhood, then it is likely that this is also a time
when we need to be careful of doing damage. Intervening in
children's lives is not just a research, policy and practice issue
for those of us at the supply end. It is also a rights issue for
children and young people. Those of us in the evidence-based arena who
have been pushing policy makers and practitioners to adopt
evidence-based practice need to be careful that we do not sell the
"What works?" agenda as a simple way to solve problems. Social
and educational interventions are complex and are capable of doing as
much or even more harm than medical ones. Drawing on examples that
relate to children and young people, this paper will suggest that the
public and NGO sectors need to invest more heavily in the "D"
part of R&D, and that we need to stimulate demand for research-based
interventions. In the world of childbearing and HIV/AIDS, services in
the United Kingdom have been transformed by powerful alliances between
the evidence lobby and the "user" lobbies. What would the world look
like for young people if those having difficulties with their education,
in trouble with the law, or in the care of the state, were to ask what
the evidence is behind the services they are offered?
INTRODUCTION
It is important not only to know what we know, but that we know
what we do not know. (Lao-Tze, Chinese philosopher)
As we know, there are known knowns. There are things we know we
know. We also know there are known unknowns. That is to say we
know there are some things we do not know. But there are also
unknown unknowns, the ones we don't know we don't know. (Donald
Rumsfeld, United States Defence Secretary (2))
As this conference, the one that preceded it and the considerable
work done by public and social policy, practice and academic colleagues
in New Zealand make clear, "What works?" is a fundamental
question for policy makers, practitioners and service users. Indeed, it
is a fundamental question for all of us. "What works?" and
"Does this work better than that?" or "Is it better to do
something, or do nothing at all?" are questions we can relate to in
making small decisions as well as big ones, both in our everyday lives
and in our professional activities.
New Zealand has been a leader in this field, and has addressed some
important questions. The best evidence syntheses on the Ministry of
Education website (3) provide good pragmatic examples of evidence use,
matching research design to the particular questions asked. SPEAR itself
produces helpful practice guidelines on the policy cycle. (4) In 2002,
the New Zealand Treasury asked what we know about the effectiveness of
early years services (Annesley et al. 2002). New Zealand academics have
asked about the effect of incomes on outcomes (Connor et al. 1999). Your
country questioned the evidence for going to war in Iraq, the evidence
for which has been the subject of so much discussion in the United
Kingdom since (Butler 2004).
At present, practitioners, parents and children and young people
looking for good research evidence on common problems will often find
the evidence cupboard disappointingly bare. Where it is not bare, the
outcomes measured will not always be those that seem most important to
these players. It is refreshing that in New Zealand, the learning
outcomes for schooling derived from best evidence syntheses include not
just achievement and skills, but wellbeing and whanau spirit, as well as
respect for others (rangimarie), tolerance, non-racist behaviour, caring
or compassion (aroha), diligence and hospitality or generosity
(manaakitanga) (Ministry of Education 1993:17).
The role of formal research evidence in the formation of social and
public policy, at least in the United Kingdom, has a relatively short
history. There has probably been a role for "intelligence" for
as long as countries have had foreign policies, and this is likely to
have been a mixture of sound research, know-how, and something a bloke
said in a bar (or possibly at a cocktail party if we are talking about
the diplomatic corps). Much of this will have involved matters such as
trade, military competition and conflict. The relative distance of the
majority of the expatriate British from local populations, so well
described by E.M. Forster, George Orwell and Doris Lessing, suggests that
the learning that could have been had from different countries' or
indigenous peoples' ways of doing things was not a core part of the
task of those who spent time overseas.
In the 19th century there were certainly a number of
"investigations" which bear the hallmarks of good qualitative
research. One example of "listening to children" can be seen
in the report of Andrew Doyle to the Local Government Board in Whitehall
on the emigration of pauper children to Canada. He spoke to many
children, some of whom had had happy experiences, some less so. Of
course if the United Kingdom's policy of sending orphan children to
New Zealand, Canada, Australia and Rhodesia (as it then was) had been
subjected to a traditional evaluation, the report may well have said
that the children looked well cared for, the food, transport and
child-to-adult ratios were adequate, and the youngsters, asked about
their experiences, might well have said: "It's OK here"
(or whatever the 19th century equivalent was). Doyle managed to collect
the other side of the story, often untold even in current policy
evaluations. Of the sea voyage, he was told: "we sicked all over
each other," and a child patiently explained the meaning of
adoption: "'Doption sir, is when folks get a girl to work
without wages" (Doyle 1875).
There were other investigations which we would certainly view as
research evidence now. Charles Booth's work in the United Kingdom
in the late 19th century charted inequalities in health and welfare.
Between 1886 and 1903 Charles Booth (and Mary Booth, about whom we hear
rather less) worked on the life and labour of the people of London--17
volumes in all--working with, among others, Beatrice Webb. He lodged
with working-class families, and observed both the inequalities and the
resilience and coping strategies of families living with disadvantage
that we continue to learn from today. Booth wrote:
The children in class E ["... above the line of poverty"], and still
more in class D ["... the poor"], have when young less chance of
surviving than those of the rich, but I certainly think their lives
are happier, ... always provided they have decent parents. It is the
constant occupation, which makes the children's lives so happy. They
have their regular school hours. They have for playground the back
yard, [or] the even greater delights of the street. Let it not be
supposed, however, that on this I propose to base any argument
against the desire of this class to better its position. Very far
from it. [...] the uncertainty of their lot, whether or not felt as
an anxiety, is ever present as a danger. (Booth 1902: 159-60)
Booth was not only a keen social observer and essayist. He produced
maps descriptive of London poverty which would be the pride of a 21st
century social geographer, with investigators collecting data while
accompanying policemen on their rounds. The Fabian Society, of which
Booth was a member, was founded in 1884, and committed to gradual social
reform. Also active in the Fabians were Beatrice and Sidney Webb, who
saw research very specifically as informing social policy in a linear
way which we might now consider unrealistic. The London School of
Economics, now part of the University of London, was founded by the Webbs
in 1895 to inform social and public policy. Its motto, "rerum
cognoscere causas"--to know the causes of things--remains one key
question at the heart of evidence-informed policy today. Another is what
to do when we do know the cause of things, and a further one is to learn
the consequences of things.
While there were huge scientific advances in all fields in the 20th
century, some benign or better, some less so, the project of uniting
research, policy and practice became less central as a key intellectual
interest. Indeed, even today there are those who recoil at the implied
positivism of research-informed policy and practice. Although the social
sciences and agriculture had been in the forefront of the development
and use of research evidence, and in particular experimental methods, it
was the establishment of the worldwide Cochrane Collaboration in health
care in 1993 that gave fresh impetus to the use of sound research
evidence to inform practice. In the late 1990s, and to some extent
prompted by both social scientists and clinicians involved in Cochrane,
the Economic and Social Research Council (ESRC) in the United Kingdom
made available funding for a centre, and a number of "nodes"
to work on research evidence. One of these ESRC-funded "nodes"
is the Research Unit in Research Utilisation, whose co-director, Sandra
Nutley, spoke at the 2003 Social Research and Evaluation Conference
(Nutley 2003). Another is What Works for Children?, a
collaborative project between City University, the University of York and Barnardo's.
In this paper, I am drawing on work we have done as part of the
What Works for Children enterprise and on joint work with Mark Petticrew
from the University of Glasgow on ways of answering different kinds of
research question, and of carrying out systematic reviews (Petticrew and
Roberts 2003, Petticrew and Roberts 2005). In order to illustrate some
of the problems (and steps towards solutions) in the world of
evidence-based policy and practice, I am going to:
* briefly describe some of the work on the What Works for Children
project
* describe some of the ways in which systematic reviewing can help
move us towards making sense of large volumes of research
* discuss the gap between the "R" and the "D"
of R&D in much academic work
* speculate on the emancipatory potential of the fuller involvement
in the evidence agenda of end-point users of services or policies
* sound a note of caution on promising too much from the evidence
agenda without sufficient investment, and the difficulties that can
arise when the research evidence suggests that something does not work
WHAT WORKS FOR CHILDREN?
Academic life tends to be about the generation of new knowledge.
Scholarship and primary studies are at the core of university activity,
and rightly so. But for those of us working on the boundaries between
research, policy and practice there is another agenda, also with a
scholarly component, which involves dissemination and implementation.
What Works for Children focuses on sharing and implementing research
evidence rather than generating "new" evidence. With a
population group, children, rather than a discipline or a single
profession as the core unifying factor, our work covers a range of
disciplinary and professional interests, emphasising interventions to
address children's wellbeing in its broadest sense.
Successful research impact has been identified as depending on more
than dissemination (Barnardo's R&D 2000, Nutley et al. 2002),
and facilitation may play a key role in the implementation of evidence
in practice. This is by no means always the case, however, and a single
primary study or review can have massive impact if the findings say the
right things to the right people at the right time. In Morton
Hunt's influential work on the way in which science "takes
stock", for instance, he describes how Professor Eric Hanushek wrote an article at the start of the Reagan administration in 1981
maintaining that empirical studies showed that increased spending on
education did not increase student achievement (Hanushek 1981). Hunt
writes:
This startlingly counter-intuitive finding was catnip to
conservatives [and] Hanushek soon became an expert witness for the
defence at hearings ... against school boards ... accused of miserly
budgeting. (Hunt 1997:54)
Since we did not have a dramatic piece of work likely to attract
this kind of attention, we proceeded with the evidence-based notion that
facilitation makes a difference, and on the basis of this, our What
Works for Children project incorporates a post of development or
implementation officer, putting the "D" into R&D. She
supports practitioners and service planners, and both she and the
university researchers are exploring the levers and barriers to the use
of research in a real world setting. Alongside this, the research team
at City University has produced a range of tools for practitioners: an
Evidence Guide, Evidence Nuggets (which are summaries of evidence,
including costings and examples), and research overviews. These and
other resources are on the What Works for Children website
(www.whatworksforchildren.org.uk).
Rather than describing in detail a project that is fully documented
on our website, and well described by my colleague Kristin Liabo in this
journal (2005), and elsewhere in published papers (Roberts et al. 2004,
Liabo et al. 2003, Lucas et al. 2003, Stevens et al. 2005), I want to
draw on our project for examples of the complexity of finding,
appraising and, where appropriate, using research evidence--or, put more
simply, the evidence of what works--which relate to children and young
people. I want to argue in this paper that despite some excellent
examples of cross-professional and cross-disciplinary working, such as
the work of Davies et al. (2002) and Wood and Kunze (2004) in New
Zealand, for instance, the public sector needs to invest more heavily in
the "D" part of R&D. Moreover we need to stimulate greater
demand for, as well as a steady supply of research-based interventions.
In the world of childbearing and HIV/AIDS, services in the United
Kingdom have been transformed by powerful alliances between the evidence
lobby and the "user" lobbies. What would the world look like
for children and young people if those in trouble with the law, or in
the care of the state, were to ask what the evidence is behind the
services they are offered?
FINDING OUT WHAT WORKS
* Is water fluoridation effective in reducing dental caries in
children?
* Do children learn better in small classes?
* Can young offenders be "scared straight" through tough
penal measures?
* Can the steep social class gradient in fire-related child deaths
be reduced by installing smoke alarms?
As anyone attending this conference will be well aware, finding out
"what works" is no mean feat. Many evaluations do not have the
methods or the tools to demonstrate whether something has had the
intended outcomes, though they can often give us really good information
on the processes that underpin a successful service. And if we cannot
get people in through the door to use services (and keep them there),
then no service is going to have a chance of proving its success.
To give a practical example of problems that may arise with
evaluation: suppose someone planning services reads a high-quality study
showing that for children with a poor start in life, having parents who
take an interest in their children's education seems to lead to
better outcomes in school and later on. In the light of this, a
school-based discussion group to encourage parental support for
children's education is set up. Suppose that the programme runs for
eight weekly sessions, with before and after questionnaires. It is clear
from the "after" questionnaires that it is considered by both
workers and parents to have been a success (though there are no data on
those who dropped out, or who did not attend in the first place). Their
judgement is based on the self-reports of the parents, the workers'
observations of parents' increasing confidence in the school
setting, and improvements in the apparent attention span and behaviour
of the children (who have been cared for in an after-school group during
the parents' discussions). The work is carefully carried out and
gets good local publicity. The researchers present the work well, and it
is picked up for a national early morning news programme. Before
asserting that "discussion groups work" we need to be as
certain as possible that in this particular case:
* improvements have taken place
* they have been brought about as a result of the discussion group.
It is difficult to do this if we do not have mechanisms for ruling
out competing explanations, such as the following.
* The parents might have become more involved in the school in any
case with the passage of time. There is evidence from other fields that
many problems improve spontaneously over time in two-thirds of cases
(Rachman and Wilson 1980). This provides a reason for having a "no
treatment control", so that we can compare the effects of our
intervention with the passage of time.
* The children might have become more settled through spending time with skilled after-school club workers.
* Other external factors might be responsible for changes, such as
improved income support or additional help from social services.
* The perceived improvement in the parents might be due to their
having learned the "right" things to say in the course of the
intervention, having been asked the same kind of questions at the
beginning and end of the programme, and having become familiar with the
expectations of the workers.
* The parents who stayed the course might have been highly
motivated and would have improved anyway. Alternatively, parents who
dropped out might have done just as well as those in the programme. We
simply don't know (adapted from Macdonald and Roberts 1995).
What is more, we also need to know what other studies have shown,
and how good those studies are. Maybe this study showed it
"worked" because of the play of chance, and not because the
intervention really was generally effective. Maybe other studies show
that it does not work, or that it only works in some types of setting,
or for certain types of parent or child. After all, we are well used to
seeing research studies reported in the media and elsewhere that show
that something works one year, only to be contradicted by a different
study (or a different researcher) the next. What happens if we take the
results of all these studies together? Will we still conclude that
discussion groups "work"? Or, having seen all the relevant
evidence, will we conclude the opposite? Or remain uncertain? These
problems demonstrate the difficulties that arise whenever we plan to do
something that we claim produces clear outcomes, whether good or bad.
This example also illustrates the not particularly earth-shattering
observation that some research methods are better than others for
answering different kinds of question. In effect, we need the right
horse for each particular course. A particular cause of upset,
adversarial sabre-rattling and misrepresentation of the views of those
one disagrees with has been the hierarchy of evidence, initially
developed by the Canadian Task Force on the Periodic Health Examination,
and subsequently adopted by the United States Preventive Services Task
Force to help decide on priorities when searching for studies to answer
clinical questions (see Table 1).
With growing interest in the effectiveness of social interventions,
a single hierarchy of methods has become unhelpful as well as divisive,
and at present certainly misrepresents the interplay between the
question being asked and the type of research most suited to answering
it. For this reason, a matrix, or a typology, may be a useful construct.
As Table 2 illustrates, different research methods are, after all, more
or less good at answering different kinds of research question. A
randomised controlled trial, well conducted, can tell us which kind of
smoke alarm is most likely to be functioning 18 months after
installation, but it cannot tell us the best way to work effectively
with housing managers to make sure smoke alarms are installed
effectively and cost effectively, and that the households of the most
vulnerable tenants are included. The obstacles and levers for the uptake of research findings are likely to be understood through methods
different from those usually found at the top of the evidence hierarchy.
It may therefore be more useful to think of how one can best use the
wide range of evidence available--and particularly to consider what
types of study are most suitable for answering particular types of
question.
A related problem lies in the stark use of the term
"evidence". It is not uncommon for discussion papers to use
the terms "evidence", "evidence based" and
"hierarchies of evidence," while avoiding any discussion of
what sort of evidence they are advocating (or rejecting). For
epidemiological questions relating to "real world" risk
factors which are not amenable to randomisation (e.g., does passive
smoking in the home cause cancer in later life in children exposed to
cigarette smoke?), a particular sort of data is required, with
prospective cohort studies at the top of the hierarchy. Qualitative
studies, expert opinion and surveys, on the other hand, are likely to
have crucial lessons for those wanting to understand the process of
implementing an intervention, what can go wrong, and what the unexpected
adverse effects might be when an implementation is rolled out to a
larger population. A different sort of hierarchy is again implied.
Overall, information on both outcomes and processes is of value.
Knowing that an intervention works is no guarantee that it will be used,
no matter how obvious or simple it is to implement. For example, it is
around 150 years since Semmelweis showed that handwashing reduces
infection, yet healthcare workers' compliance with handwashing
remains poor. Even the most simple, cost-effective and logical
intervention fails if people will not carry it out. The British
government is currently exercised by the problems of MRSA (methicillin-resistant Staphylococcus aureus), and a number of solutions
are being explored. While it may have strong common sense value for a
patient to say to a doctor or nurse, "Have you washed your
hands?" it is potentially deeply offensive to cast doubt on the
personal cleanliness habits of others. To do so from the vulnerable
position of a hospital bed may be particularly difficult, and possibly
unwise.
An example of an area where policy and research have been evolving
in tandem is in the growing use of child public health interventions,
which can be effective in both the immediate and the longer term in
improving outcomes for children (Roberts 1997, Glass 1999, Glass 2001).
HighScope, Head Start, parenting education, home visiting and mentoring
provide examples of well-designed programmes that have been the subject
of robust evaluations, some of them using complex randomised controlled
trials (Schweinhart and Weikart 1993, Barlow 1999, Webster-Stratton et
al. 1989, Olds et al. 2002, Tierney et al. 2000, Grossman and Tierney
1998, DuBois et al. 2002). But "parent education", "home
visiting" and mentoring, as many of their proponents and evaluators
would agree, largely remain black boxes with a great many unanswered
questions about what exactly the intervention involves, and how it
works.
Meanwhile, a climate has been created in a number of countries,
including my own, where it is widely held that these interventions
"work" and national programmes have been established. The
questions of who delivers the service, the kind of young people who
might benefit, and the content of services likely to be effective can be
lost in the drive to get the show on the road. These programmes can gain
momentum because they have strong face validity. They look like the sort
of things that should work, our "gut" feelings tell us that
they will work, and we want them to work. Not only may this result in
premature roll-out on the basis of insufficient evidence or a simplistic interpretation of the evidence, but it may then be difficult to stop or
change direction after programmes have been launched.
Take an example of a big social problem--anti-social behaviour in
young people (Scott 1998)--and an intervention--mentoring--where a sound
meta-analysis (DuBois et al. 2002) has demonstrated benefit. Anti-social
behaviour in young people is a problem for families, for the young
people themselves, for the police, for communities, and for politicians.
This makes finding a solution a political as well as a therapeutic
imperative--a potent driver to "do something". Mentoring is
non-invasive and medication-free. It is easy to see why it might work,
and why it is attractive to politicians and policy makers. In February
2003, Lord Filkin, then a minister in the Home Office, announced
£850,000 of funding for mentoring schemes in England:
Mentors can make a real difference to ... some of the most
vulnerable people ... and help to make our society more inclusive.
There are ... excellent examples of schemes which really work. (5)
There was equal enthusiasm on the other side of the pond. In his
State of the Union address in January 2003, President Bush announced
plans for a $450 million initiative to expand the availability of
mentoring programmes for young people. This included $300 million for
mentoring at-risk pupils and $150 million to provide mentors to children
of prisoners:
I ask Congress and the American people to focus the spirit of
service and the resources of government on the needs of some of our
most vulnerable citizens--boys and girls trying to grow up without
guidance and attention and children who must walk through a prison
gate to be hugged by their mom or dad. (6)
WHAT DOES THE RESEARCH SHOW?
A problem with interventions that become politically attractive,
and to which large amounts of cash are attached, is that research may be
used for support rather than illumination. There is indeed robust
research that indicates benefits from mentoring for some young people,
for some programmes, in some circumstances, in relation to some
outcomes. There are also good descriptive evaluations that suggest that
those young people who stay on in programmes are inclined to report
favourably on the experience (St James-Roberts and Singh 2001, Tarling
et al. 2001).
As part of our What Works for Children project, we reviewed the
evidence on whether one-to-one, non-directive mentoring programmes
targeted towards young people at risk of, or already involved in,
offending can improve behaviour. We then contacted David DuBois in
Chicago who, with colleagues, has authored the most complete
meta-analysis on youth mentoring. He reviewed our work, pointed us to
further evidence and with us co-authored a publication on mentoring in
the British Medical Journal, suggesting that it is less than 100%
effective for all young people at all times. In effect, on the basis of
existing reviews we concluded that research on mentoring programmes does
not provide evidence of measurable gains in outcomes for mentees who
enter programmes as a result of, or to solve, problems of offending,
truanting or involvement in other anti-social behaviours.
In fact it looks from the evidence as if mentoring programmes for
vulnerable young people may have a negative impact, and adverse effects
associated with mentor--mentee relationship breakdowns have been
reported (Grossman and Rhodes 2002). Worryingly, a 10-year follow-up
study of one well-designed scheme found that a sub-group of mentored
young people, some of whom had previously been arrested for minor
offences, were unexpectedly found to be more likely to be arrested after
the project than those not mentored (O'Donnell et al. 1979). On the
basis of findings such as these, we concluded rather cautiously that
non-directive mentoring programmes delivered by volunteers cannot be
recommended as an effective intervention for young people at risk for,
or already involved in, anti-social behaviour or criminal activities.
We were not suggesting that mentoring cannot work. There are many
different kinds of mentoring and some show better evidence of effect
than others. Our current state of knowledge on the effectiveness of
mentoring is similar to that of a new drug that shows promise, but
remains in need of further research and development. There is no
equivalent of the National Institute for Clinical Excellence (NICE) or
the Food and Drug Administration (FDA) for social interventions. If
there were, no more than a handful of programmes might have realistic
hopes of qualifying. And even then, it would have to be acknowledged
that a full understanding of the safeguards needed to ensure that young
people are not harmed by participation is lacking. This observation was
picked up by Mark Fulop, the Director of the National Mentoring Center
in the United States, as part of a mentoring exchange digest
(www.nwrel.org/mentoring). Fulop has a rather different perspective:
Mentoring is not a "drug" or a "treatment" that needs FDA approval.
Mentoring is a process of community-building and community
organization. If we have to measure with a microscope to see if
youth mentoring is making a difference then we are not making enough
difference in the first place. Every mentoring program is a living
lab that is telling a story each and every day but the story is not
whether Johnny is skipping [a] few classes or that Suzie is now doing
her homework. The outcomes of mentoring need to be measured at the
community level. Collectively, is our community holding our children
closer to our hearts? (Mentor exchange digest, 5.3.2004).
Fulop makes the important point that social value judgements are
involved in the outcomes we choose to measure, and clearly, the amount
of investment a community makes in its children could be a useful and
important outcome. While holding children close to our hearts might be
harder to measure, the point is a sound one. We need to make sure that
we are thinking about children as whole people, in the context of their
families and communities, rather than as trainee adults. Meanwhile, for
some of the most vulnerable young people, mentoring programmes as
currently implemented may become one more intervention that fails to
deliver on its promises.
Showing that something works (or not) is one thing, and difficult
enough. But what happens next in real-life settings?
In our case, we summarised the research findings and presented them
to practitioners and planners tasked with implementing mentoring and
allocated the funds to do so. Unsurprisingly, their response was not,
"Let's send back the funds and go home". What they did do
was to roll up their sleeves and ask "How can we make it
work?" They asked for evidence on what seemed to have a more
positive effect for the group of children with whom they were working,
and decided to include a directive element in their approach, drawing on
the evidence of the apparent effectiveness of cognitive behavioural therapeutic approaches in attaining some of the outcomes sought,
including delivery via a mentoring-like component (Davidson et al. 1987,
Cavell and Hughes 2000). They also drew on practices that seem to be
correlated with stronger benefits for young people, such as ongoing
training for volunteer mentors and involvement of parents (DuBois et al.
2002). Of course, without implementing these innovations in a trial
setting, we will never know whether these approaches are better, worse
or much the same as doing nothing, or implementing the standard local
means of delivering mentoring.
MOVING FORWARD
"Under no child left behind, schools are being set up to fail"
"Good intentions, bad results"
"No child left behind produces unintended negative results"
"Youth programs in Queens are educational, fun and often free"
"Why mentoring programmes and relationships fail"
(headlines from newspapers in the United States)
The research (if any) behind the kinds of headlines above is
generally given much greater credence than it merits. With a clear
message and a good communicator, they can take off in much the same way
as Hanushek's work on spending on schools. But there are few
studies that are so methodologically sound, whose results are so
generalisable and that leave us so certain that the results represent a
good approximation of the "truth" that we should accept their
findings outright.
A problem with interpreting and using research is that it is often
so far removed from real-life settings that it may be difficult for
policy makers or the public to know whether the results are to be taken
seriously, or whether they represent no more than the latest unreliable
dispatch from the world of science. The more sceptical research-informed
policy maker may simply wait patiently, on the grounds that another
researcher will soon publish a paper saying the opposite. If one study
appears claiming that what delinquents need is a short sharp shock,
another is sure to follow suggesting that what they actually need is a
teambuilding adventure holiday.
But if one problem is faced by those sceptical of single studies,
quite another is faced by the researcher, policy maker or practitioner
who tries to range more broadly in his or her reading and thinking. With
new journals launched yearly, and thousands of research papers
published, it is impossible for even the most energetic policy maker or
researcher to keep up-to-date with the most recent research evidence,
unless they are interested in a very narrow field indeed. The increasing
amount of research information, which varies in quality and relevance,
can make it difficult to respond to these pressures, and can make the
integration of evidence into practice difficult. An example of
information overload is provided in Box 2.
Box 2--Stopping bullying: information overload (adapted from Petticrew
and Roberts 2005)
Teachers, parents and pupils interested in preventing bullying, or
stopping it when it does happen, will have no shortage of information.
There are over a quarter of a million sites which refer to school
bullying on the web. Among the approaches to this problem described by
one government organisation, the Department for Education and Skills in
the United Kingdom (http://www.parentcentre.gov.uk) are:
* co-operative group work
* Circle Time
* Circle of Friends
* befriending
* Schoolwatch
* the support group approach
* mediation by adults
* mediation by peers
* active listening/counselling based approaches
* quality circles
* assertiveness training groups.
How can those using the web work out which sites to trust, and which
interventions might actually work? Some sites suggest that certain
interventions such as using sanctions against bullies can be
ineffective, or even harmful--that is, actually increase bullying
(e.g., http://www.education.unisa.edu.au/bullying). Other sites suggest
that such approaches may work (http://www.educationworld.com/a_issues/
issues103.shtml). The same intervention may appear to work for some
children, but not for others--younger children, for example--and some
types of bullying, such as physical bullying, may be more readily
reduced than others, such as verbal bullying.
For bullying, as for other types of social problem, one can quickly
become swamped with well-meaning advice. Navigating one's way through
the swamp is tricky, but systematic reviews provide stepping
stones--differentiating between the boggy areas (the morass of
irrelevant information) and the higher ground (the pockets of reliable
research information on what works and for whom, and where and when).
Systematic reviews can provide a means of synthesising information on
bullying, or aspects of bullying, and give a reliable overview of what
the research literature can tell us about what works. For example, a
systematic review of school-based violence prevention programmes
identified 44 trials in all, and concluded that while more high-quality
trials are needed, three kinds of programmes may reduce aggressive and
violent behaviours in children who already exhibit such behaviour
(Mytton et al. 2002).
Systematic literature reviews are a method of making sense of large
bodies of information, and a means of contributing to the answers to
questions about what works and what does not. They are a method of
mapping out areas of uncertainty, and identifying where little or no
relevant research has been done but where new studies are needed.
Systematic reviews are a method of critically appraising,
summarising and attempting to reconcile the evidence in order to inform
policy and practice, and they provide a synthesis of robust studies in a
particular field of work which no policy maker or practitioner, however
diligent, could possibly hope to read themselves. Systematic reviews are
thus unlike "reviews of the studies I could find",
"reviews of the authors I admire," "reviews which leave
out inconveniently inconclusive findings or findings I don't
like," and "reviews which support the policy or intervention I
intend to introduce". Not only do they tell us about the current
state of knowledge in an area, and any inconsistencies within it, but
they also clarify what we still need to know.
The systematic review adopts a particular methodology in an
endeavour to limit bias, with the overall aim of producing a scientific
summary of the evidence in any area. In this respect, systematic reviews
are simply another research method, and in many respects they are very
similar to a survey--though in this case a survey of the literature,
not of people. A systematic review is less a discussion of the
literature and more a scientific tool; but it can also do more than
this, and can be used to summarise, appraise and communicate the results and
implications of otherwise unmanageable quantities of research. It is
widely agreed, however, that at least one of these
elements--communication--needs to be greatly improved if systematic
reviews are to be really useful.
WHEN TO DO A SYSTEMATIC REVIEW
It can help to do a systematic review:
* when there is uncertainty (for example, about the effectiveness
of a policy or a service, and where there has been research on the
issue)
* in the early stages of development of a policy, when evidence of
the likely effects of an intervention is required
* when it is known that there is a wide range of research on a
subject but where key questions remain unanswered--such as questions
about treatment, prevention, diagnosis, or causation, or questions about
people's experiences of being on the receiving end of an
intervention
* when a general overall picture of the evidence in a topic area is
needed to direct future research efforts.
PUTTING THE "D" INTO R&D
If, as many of us believe, and as there is good evidence to
demonstrate, the most effective time to make a difference to outcomes is
in childhood, and the earlier the better, then it is likely that this is
a time when we need to be particularly careful of doing damage. In other
words, as well as asking what works and how to implement it, we also
need to think about what does not work and how to stop it.
It is not part of the initial training of most academics to work on
policy or practice development, and it tends not to be part of research
budgets to provide cash for the "D" component, where
"D" means development, although increasingly, funders provide
time and funds for dissemination, which can be a first step to
development.
There are clearly important training needs here, with an eye to
those who really do know about the "D" of R&D, such as
pharmaceutical companies, and exchanges and secondments between academic
life, policy and practice. But none of these will work well if the
interventions being proposed are not fit for purpose, are not meaningful
to those who are intended to receive them, or are culturally
inappropriate.
For this reason, the inclusion of end-point users at every point in
the R&D process is not just good democratic practice--it is likely
to result in better work and more effective interventions, to say
nothing of more fun in the working day.
CONCLUSION
Social interventions are complex and are capable of doing as much
harm as medical interventions, or even more. They need to be subjected
to at least as much evaluation, if not more, before and after implementation.
There are no simple solutions to complex problems, and therein lies
a problem for policy makers and politicians. To find a quick-acting
solution to an important social problem is a great prize, particularly
if the solution is one that will have a result before the next general
election. When that solution seems to have common sense behind it, and
is not too expensive, it becomes even more compelling.
Those of us who have been pushing policy makers and practitioners
to adopt evidence-based policy need to be careful that we do not sell it
as a simple way to solve problems. We need a lot more work on how to
collaborate effectively with policy makers dealing with complex
interventions and evidence. It is probably even clearer to
practitioners, policy makers and front-line users of services than it is
to researchers that there are massive evidence gaps, sometimes because
the right questions are not being asked. For many social interventions,
there will be little evidence to review--few primary studies, even fewer
that are sufficiently robust to affect policy. But we must be careful
not to confuse absence of evidence with evidence of absence.
The R&D agenda in health and social care needs huge investment
if we are to develop adequate social interventions for big problems. At
present, practitioners, parents and children, and young people
themselves looking for good research evidence on common problems will
find the evidence cupboard disappointingly bare.
Intervening in children's lives is not just a research policy
and practice issue for those of us at the supply end. It is also a
rights issue for children and young people. Young people have the right
to evidence-based interventions. We know from the past that many
well-meaning attempts to do good resulted in harm, but we now have the
means through systematic review, trials, sound evaluations and good
qualitative work, to do better.
Box 1--Key Messages
* Mentoring children and young people at risk for, or already involved
in, anti-social behaviours has become popular, but research evidence
to support the most commonly used programmes is lacking.
* There is evidence that failed mentoring relationships may have a
detrimental effect on a sub-group of children and young people.
* A commitment to research-based practice needs to focus on what works
in implementation as well as evidence of effect. Mentoring practices
vary widely.
* In order to know more, we need further trials, with end-point users
and practitioners involved from the outset in study design.
Source: adapted from Roberts et al. 2004
Table 1 An Example of the "Hierarchy of Evidence"
Type of Evidence
* Systematic reviews and meta-analyses
* Randomised controlled trials with definitive results
* Randomised controlled trials with non-definitive results
* Cohort studies
* Case-control studies
* Cross-sectional surveys
* Case reports
Table 2 An Example of a Typology of Evidence
(for Social Interventions in Children)
(plus signs indicate the relative suitability of each research method
for each kind of question)

Research question                 Qualitative Survey  Case-    Cohort   RCTs  Quasi-  Non-exp.    Systematic
                                  research            control  studies        exptl.  evaluations reviews
Effectiveness
Does this work? Does doing this
work better than doing that?                                   +        ++    +                   +++
Effectiveness of service delivery
How does it work?                 ++          +                                       +           +++
Salience
Does it matter?                   ++          ++                                                  +++
Safety
Will it do more good than harm?   +                   +        +        ++    +       +           +++
Acceptability
Will children/parents be willing
to or want to take up the
service offered?                  ++          +                          +     +      +           +++
Cost effectiveness
Is it worth buying this service?                                         ++                       +++
Appropriateness
Is this the right service for
these children?                   ++          ++                                                  ++
Quality
How good is the service?          ++          ++      +        +                                  +

Source: Petticrew and Roberts 2003 (adapted from Muir Gray 1997)
(1) Acknowledgements
I am grateful to my colleagues in the ESRC-funded What Works for
Children initiative (www.whatworksforchildren.org.uk), whose work has
informed mine, and to other colleagues in the Child Policy and Research
Unit of City University. My observations on Charles Booth and the
Fabians were drawn largely from the Charles Booth Archive at the London
School of Economics (LSE), and conversations with Rodney Barker,
Professor of Government at LSE. My colleague Mark Petticrew allowed me
to draw on our forthcoming book on systematic reviewing (Petticrew and
Roberts 2005) and to reproduce a diagram from our paper "Horses for
courses" (Petticrew and Roberts 2003). I am particularly grateful
to the Ministry of Education and to Martin Connelly, Senior Manager of
Education Management Policy in the Ministry, for inviting me to the
conference.
(2) Feb 12 2002, Department of Defense News Briefing,
http://www.defenselink.mil/transcripts/2002/t02122002_t212sdv2.html
(Petticrew and Roberts 2005).
(3) www.minedu.govt.nz
(4) www.spear.govt.nz/SPEAR/documents/best-practice/
background-paper-series-1.doc
(5) www.homeoffice.gov.uk/docs/capital_mentoring_grants.htm
[accessed 12 November 2004]
(6) www.cnn.com/2003/ALLPOLITICS/01/28/sotu.transcript/ [accessed
12 November 2004]
REFERENCES
Annesley, B., P. Christoffel, R. Crawford, V. Jacobsen, G. Johnston
and N. Mays (2002) Investing in Children's Well-Being from a Life
Course Perspective: A Preliminary Analytical Framework and Overview of
the Literature, New Zealand Treasury, Wellington.
Barlow, J. (1999) Systematic Review of the Effectiveness of
Parent-Training Programmes in Improving Behaviour Problems in Children
Aged 3-10 Years (second edition), Health Services Research Unit,
University of Oxford.
Barnardo's R&D (2002) What Works? Making Connections
Linking? Research and Practice,
Barnardo's R&D team, Barnardo's, Barkingside.
Booth, Charles (1902) Life and Labour of the People in London, vol.
1, Macmillan, London.
Butler (2004) Review of Intelligence on Weapons of Mass
Destruction, Return to an address of the Honourable the House of
Commons, dated July 14th, 2004, Report of a Committee of Privy Counsellors, Chairman The Rt Hon the Lord Butler of Brockwell, KG, GCB,
CVO, http://www.butlerreview.org.uk/report/index.asp
Cavell, T.A. and J.N. Hughes (2000) "Secondary prevention as
context for assessing change processes in aggressive children"
Journal of School Psychology, 38:199-235.
Connor, J., A. Rodgers and P. Priest (1999) "Randomised studies
of income supplementation: A lost opportunity to assess health
outcomes" Journal of Epidemiology and Community Health, 53:725-730.
Davidson, W.S., R. Redner, C.H. Blakely, C.M. Mitchell and J.G.
Emshoff. (1987) "Diversion of juvenile offenders: An experimental
comparison" Journal of Consulting and Clinical Psychology,
55(1):68-75.
Davies, E., B. Wood and R. Stephens (2002) "From rhetoric to
action: A case for a comprehensive community-based initiative to improve
developmental outcomes for disadvantaged children" Social Policy
Journal of New Zealand, 19:28-47.
Doyle, A. (1875) Pauper Children (Canada), return to an order of
the Honourable the House of Commons, dated 8 February 1875.
DuBois, D.L., B.E. Holloway, J.C. Valentine and H. Cooper (2002)
"Effectiveness of mentoring programs for youth: a meta-analytic
review" American Journal of Community Psychology, 30:157-197.
Glass, N. (1999) "Sure Start: the development of an early
intervention programme for young children in the UK" Children and
Society, 13(4):257-264.
Glass, N. (2001) "What works for children: The political
issues" Children and Society, 15(1):14-20
Grossman, J.B. and J.E. Rhodes (2002) "The test of time:
Predictors and effects of duration in youth mentoring programs"
American Journal of Community Psychology, 30:199-206.
Grossman, J.B. and J.P. Tierney (1998) "Does mentoring work?
An impact study of the Big Brothers Big Sisters program" Evaluation
Review, 22:403-426.
Hanushek, Eric A. (1981) "Throwing money at schools"
Journal of Policy Analysis and Management, 1:19-41.
Hunt, M. (1997) How Science Takes Stock: The Story of Meta
Analysis, Russell Sage Foundation, New York.
Liabo, K. (2002) "What works for children? An evidence-based
information source for children's social care" Learning and
Skills Research 5(3):50-51.
Liabo, K. (2005) "What works for children and what works in
research implementation? Experiences from a research and development
project in the united kingdom" Social Policy Journal of New
Zealand, 24:.
Liabo, K., P. Lucas and H. Roberts (2003) "Can traffic calming measures achieve the Children's Fund objective of reducing
inequalities in child health?" Archives of Disease in Childhood,
88(3):235-36.
Lucas, P., K. Liabo and H. Roberts (2003) "Do behavioural
treatments for sleep disorders in children with Down's syndrome
work?" Archives of Disease in Childhood, 87(5):413-414.
Macdonald, G. and H. Roberts (1995) What Works in the Early Years?
Barnardo's, Barkingside.
Ministry of Education (1993) The New Zealand Curriculum Framework,
Ministry of Education, Wellington.
Muir Gray, J.A. (1997) Evidence Based Healthcare, Churchill
Livingstone, Edinburgh.
Mytton, J., C. DiGuiseppi, D. Gough, R. Taylor and S. Logan (2002)
"School-based violence prevention programming: Systematic review of
secondary prevention trials" Archives of Pediatrics and Adolescent
Medicine, 156(8):752-762.
Nutley, S., H. Davies and I. Walter (2003) "Evidence-based
policy and practice: Cross-sector lessons from the United Kingdom"
Social Policy Journal of New Zealand, 20:29-48.
Nutley, S., I. Walter and H. Davies (2002) From Knowing to Doing: A
Framework for Understanding the Evidence into Practice Agenda, Research
Unit for Research Utilisation, Department of Management, University of
St Andrews, www.standrews.ac.uk/~ruru/RURU/%20publications%20list.htm.
O'Donnell, C.R., T. Lydgate and W.S.O. Fo (1979) "The
Buddy System: Review and follow-up" Child Behavior Therapy,
1:161-169.
Olds, D.L., J. Robinson, R. O'Brien, D.W. Luckey, L.M.
Pettitt, C.R. Henderson Jr., R.K. Ng, K.L. Sheff, J. Korfmacher, S.
Hiatt and A. Talmi (2002) "Home visiting by paraprofessionals and
by nurses: A randomized, controlled trial" Pediatrics,
110(3):486-496.
Petticrew, M. and H. Roberts (2003) "Evidence, hierarchies and
typologies: Horses for courses" Journal of Epidemiology and
Community Health, 57:527-529.
Petticrew, M. and H. Roberts (2005) Systematic Reviews in the
Social Sciences: A Practical Guide, Blackwell, Oxford.
Rachman, S. and G.T. Wilson (1980) The Effects of Psychological
Therapy, Pergamon, London.
Roberts, H. (1997) "Socio-economic determinants of health:
Children, inequalities and health" British Medical Journal,
314(7087):1122-1125.
Roberts, H., K. Liabo, P. Lucas, D. DuBois and T.A. Sheldon (2004)
"Mentoring to reduce antisocial behaviour in childhood"
British Medical Journal, 328(7438):512-514.
Schweinhart, L. and D. Weikart (1993) A Summary of Significant
Benefits: The High-Scope Perry Pre-school Study Through Age 27, High
Scope, Ypsilanti, Michigan, and the United Kingdom.
Scott, S. (1998) "Fortnightly review: Aggressive behaviour in
childhood" British Medical Journal, 316:202-206.
St James-Roberts, I. and C. Singh (2001) Can Mentors Help Primary
School Children with Behaviour Problems? Final report of the Thomas
Coram Research Unit between March 1997 and 2000, Home Office Research
Study 233, Home Office Research, Development and Statistics
Directorate, London.
Stevens, M., K. Liabo, S. Frost and H. Roberts (2005, in press)
"Using research in practice: A research information service for
social care practitioners" Child and Family Social Work,
10(1):67-75.
Tarling, R., J. Burrows and A. Clarke (2001) Dalston Youth Project
Part II (11-14): An Evaluation, Home Office Research Study 232, Home
Office Research, Development and Statistics Directorate, London.
Tierney, J.P., J.B. Grossman and N.L. Resch (2000) Making a
Difference: An Impact Study of Big Brothers Big Sisters, Public/Private
Ventures, Philadelphia.
Webster-Stratton, C., T. Hollinsworth and M. Kolpacoff (1989)
"The long-term effectiveness and clinical significance of three
cost-effective training programs for families with conduct-problem
children" Journal of Consulting and Clinical Psychiatry,
57(4):550-553.
Wood, E. and K. Kunze (2004) Making New Zealand Fit For Children:
Promoting a National Plan of Action for New Zealand Children (Violence,
Exploitation and Abuse Section), UNICEF New Zealand, Wellington.
Helen Roberts (1)
Professor of Child Health
City University, London