Vocational qualifications in Britain and Europe: theory and practice (1)
Prais, S.J.
This Note considers three questions bearing on the reform of
vocational qualifications in Britain, against the background of changes
being introduced by the National Council for Vocational Qualifications.
First, in what important respects did Britain need a reformed and
centrally-standardised system of vocational qualifications? Secondly,
what are the proper criteria for choosing between alternative methods of
awarding qualifications? Much that is at issue hinges on the relative
importance of externally-marked written tests as compared with practical
tasks assessed by an instructor; the discussion and conclusions reached
here in relation to vocational testing apply in large measure also to
current debates in other contexts, such as the proper role of
teacher-assessed coursework in school examinations at 16+ (GCSE) and the
official teacher-assessment of pupils at age 7 (SATs) currently being
administered in British schools for the first time. Our third question
is: in what significant ways do Continental systems of awarding
qualifications differ from those now proposed for Britain? (2)
Need for a standardised system
It is now accepted on all sides that Britain needs more of its
workforce to be vocationally trained to intermediate levels; that is to
say, to craft or technician standards as represented, for example, by
City and Guilds examinations (at part 2) or BTEC National Certificates
and Diplomas. In engineering, building and related trades there has for
long been a system for the award of qualifications that has worked more
or less satisfactorily; indeed, the City and Guilds system established
at the end of the last century was in many ways an internationally
admired pioneer, and its syllabuses and examinations were followed, and
are still followed, in many parts of the world. In other occupations,
such as office work or retailing, a variety of qualifying bodies grew up
in Britain - such as the Royal Society of Arts, the London Chamber of
Commerce and Industry, Pitman's, the Institute of Drapers - which
developed (what has been called) a 'jungle' of qualifications at a
variety of unco-ordinated levels. In many other occupations in Britain
there was no system of qualifications at all.
On the other hand, in Germany - but also, for example, in France,
Austria, Switzerland and the Netherlands - vocational qualifications and
associated part-time or full-time courses were developed which covered
virtually the whole range of occupations in the economy. The
qualifications awarded usually at ages 18-20 at the end of these
vocational courses - the Berufsschulabschluss in Germany and the
Certificat d'aptitude professionnelle in France - are as widely
understood as, say, O-level passes were until recently in Britain (the
narrower and clearer range of attainments encompassed by an O-level pass
makes it a more appropriate standard of comparison than the new GCSE,
with the very wide range of attainments spanned by its awards).
What was essentially wrong in Britain with engineering and building
qualifications was that too few people took them - but I believe there
was nothing fundamentally wrong with the qualification-procedure itself.
For the rest of the economy there was a serious need (a) to make the
system coherent, so that equivalent levels could more easily be
recognised; and (b) to expand the occupational coverage. These two
objectives - greater recognisability and expansion of coverage - are of
course to some extent linked. Greater recognisability should lead to
greater marketability, reduced transaction costs in the labour market,
and to greater demand for qualifications and skills both by employers
and by trainees. The benefits to be expected are similar to those
ensuing from 'hallmarking'. There are also economies of scale in
organising training programmes, and in specifying standards and
certification-procedures for a limited number of defined
training-occupations at defined levels. Something is of course lost in
standardising and restricting the number of training-occupations and
levels: just as something is lost in not having a suit made to measure;
but, it hardly needs saying, manufacturing to standardised sizes enables
many more to buy a decent suit.
Criteria for vocational qualifications
There has always been debate on the relative roles of theory and
practice in general education; that debate has been at least as vigorous
in relation to vocational education and the award of vocational
qualifications. The unsatisfactory extremes of relying solely on
'time-serving' or solely on 'pencil-and-paper' tests have
often been contrasted as the basis for the award of vocational
qualifications.
In general, it is clear that all procedures for the award of
qualifications can provide no more than imperfect indicators of future
capability. Before describing how qualifications are awarded in
practice, let us for a moment consider the issues in an entirely
theoretical way, with the aid of some basic statistical mathematics.
Suppose we wish to estimate the capability of a person, not simply in
relation to what he has done so far, but in relation to what he is
likely to be able to do in the future under similar, but not identical,
circumstances to those encountered in the past; for example, the quality
of materials may alter, designs may alter, the type of person under whom
(or with whom) he will be working may alter. Let $\xi$ denote his true,
but as yet unobserved, capability in the future; and let $x$ denote his
performance as measured by some test-procedure based on his past
performance in specimen tasks. Without affecting the argument, these can
both be considered as multi-dimensional - relating, for example, to
speed of work, accuracy of work, cleanliness, etc. In choosing amongst
alternative test-procedures we have to accept, as said, that none will
be wholly accurate; and we have also to accept that testing is an
expensive process, and only limited resources can be devoted to it.
The expected total discrepancy between test and actual performance
can be divided into two components. In statisticians' terms they
correspond to bias and variance; in educationists' terms they
correspond, respectively, to Validity and Reliability (3). In detail -
when choosing between alternative estimators, or between alternative
test-procedures, we wish to minimise:-
(a) the bias: that is, in a sufficiently large number of repeated
applications we would like the expected value to correspond to the true
value; that is, we wish to minimise
$$E\{x\} - \xi;$$
and
(b) the variance: as between alternative test-procedures which were
equally satisfactory from the point of view of their bias, we choose the
one that has the minimum variability in repeated applications (whether
by different examiners, or on different samples of questions or tasks);
that is, we choose the alternative which yields the minimum value of
$$E\{(x - E\{x\})^2\}.$$
These two components contribute to the total discrepancy between
test and actual performance as follows:
$$E\{(x - \xi)^2\} = E\{(x - E\{x\})^2\} + [E\{x\} - \xi]^2,$$
i.e. Total mean-square-error = Variance + (Bias)$^2$
= Reliability + (Validity)$^2$.
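The decomposition can be verified in one line by adding and subtracting $E\{x\}$ inside the square (a standard argument, supplied here for completeness, not part of the original text):

```latex
\begin{aligned}
E\{(x-\xi)^2\} &= E\{[(x - E\{x\}) + (E\{x\} - \xi)]^2\} \\
               &= E\{(x - E\{x\})^2\}
                  + 2\,(E\{x\} - \xi)\,E\{x - E\{x\}\}
                  + (E\{x\} - \xi)^2 \\
               &= \underbrace{E\{(x - E\{x\})^2\}}_{\text{variance}}
                  + \underbrace{(E\{x\} - \xi)^2}_{(\text{bias})^2},
\end{aligned}
```

since the cross-term vanishes, $E\{x - E\{x\}\} = 0$, the true value $\xi$ being treated as fixed.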
In contrasting written and practical testing of vocational
capability, it is widely agreed that written tests have greater
Reliability in the sense that different external examiners would give
much the same marks if they independently examined a group of
candidates. On the other hand, it is argued by those of 'modern'
views, written tests have a lower Validity since they are applied under
'artificial' examination conditions, and do not test what the
candidate actually does in the course of his work; on that view, the
greatest Validity attaches to the assessment of practical tasks carried
out by the candidate in a workplace environment, preferably in the
course of his normal work and assessed by his normal workplace
supervisor. Any lower Reliability of such procedures resulting from the
supervisor knowing his own trainee or for any other reason, it has
sometimes incautiously been suggested, is of no consequence (4).
The view in favour of giving great weight to written testing can
perhaps be summarised as follows. First, any argument that bases itself
on the notion that Validity (ie lack of bias) is all that matters, is
essentially wrong. We need to be concerned with the total expected error
associated with a qualification-procedure (ie Validity plus
Reliability); we are likely to be misled if we focus on only one
component. Secondly, if in reality there was a relation such that
procedures of high Reliability had low Validity, and vice versa, then
that relation needs careful empirical research. The relation is likely
to vary from one occupation to another; for example, it is likely to
depend on the relative importance in each occupation of applied craft
tasks and of planning tasks. Thirdly, we have to take into account the
costs of testing. One simple rule seems to hold very widely, namely,
that pencil-and-paper tests are quicker and cheaper to administer than
assessing practical tasks; consequently, the written-test method can
examine a much wider range of activities per unit of resources devoted
to certification.
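The first point, that total expected error rather than bias alone is what matters, can be illustrated with a small simulation (a hypothetical sketch; the capability level, biases and standard deviations below are invented purely for illustration):

```python
import random

random.seed(0)

xi = 70.0  # the candidate's true (unobservable) capability, in marks

def workplace_assessment():
    # Unbiased ("high Validity") but highly variable ("low Reliability"):
    # hypothetical standard deviation of 15 marks.
    return random.gauss(xi, 15.0)

def written_test():
    # Slightly biased (assumed to miss some practical aspects, so 5 marks
    # low on average) but precise: standard deviation of 3 marks.
    return random.gauss(xi - 5.0, 3.0)

def mse(procedure, trials=100_000):
    # Total mean-square-error = variance + (bias)^2.
    return sum((procedure() - xi) ** 2 for _ in range(trials)) / trials

print(round(mse(workplace_assessment)))  # ~225 (= 15^2 + 0^2)
print(round(mse(written_test)))          # ~34  (= 3^2 + 5^2)
```

On these invented numbers the biased but precise written test has much the smaller total error, which is the substance of the argument above.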
A simple example may not be out of place. A carpenter or mechanical
fitter needs to know which type of metal screw to choose for each job;
screws come in a myriad of different lengths, diameters, threads, heads
(flat, round, Phillips, etc.); and in different metals (brass, steel,
chrome, etc.). If a final assessment of capabilities had to wait till
the candidate had used each type in the course of his normal work in
front of his supervisor, and had done so properly on a sufficient
proportion of repeated occasions, it would take a very, very long time
for him to be judged as qualified. On the other hand, a few specimen
written questions, such as: -
Which kind of screw would you use for fitting a
mirror to a bathroom wall, and why?
would only take a few minutes. By testing that the candidate knows
why he is doing what he is doing, and not merely observing that he is
doing it correctly, we attain greater confidence that he can operate
under the variety of different circumstances that arise in practice.
Notice also that he needs to be tested not only on which is the right
type of screw, but, if that type is not readily available, he needs to
know which of the available alternatives are acceptable, even if less
than ideal; and he also needs to know which are not acceptable, even if
the customer would not immediately twig. The case for testing knowledge,
and not merely observing practice, is thus a strong one even for the
simplest of tasks.
But let us return to the costs of testing. Inevitably only a sample
of relevant knowledge and skills can be tested. Otherwise not only would
the direct costs of examination become excessive, but so would the
indirect costs; as HMI recently noted, the new NVQ assessment procedures
have already 'encroached on the time available for teaching and
learning'. This was said in relation to engineering qualifications;
but HMI were probably also influenced by the example they quoted of the
hairdressing NVQ which involves a '1000 task checklist'.(5)
Clearly, the greater the number of test-items, the greater can be
our confidence in the final verdict - but also the greater is the cost.
By the familiar statistical rule, a doubling in the required precision
requires a quadrupling in the number of test-items. This rule applies if
the observations are independent; if they are correlated and, so to
speak, to some extent they test the same capability in another way, then
more than a quadrupling will be necessary (it may not even be possible
to double the precision after a certain point). In other words, the way
the first ten questions or ten tasks are dealt with by the candidate
tells us a great deal about him; if we aim to double our confidence in
our judgment, we are likely to require many more than forty questions or
tasks. Further, because practical tests are so very much more expensive
than written tests, both in administration and marking, it is efficient
when working within a limited budget to allocate more questions to
written tests than to practical tests. A complex balancing exercise is
thus involved in the economic design of test-procedures; it is not
surprising that in reality they are developed only slowly over the
years, often with step-by-step experimentation, and require an intimate
knowledge of the details of each occupation. To illustrate the
complexity of what is involved, the mathematical principles governing
the optimal mix of theoretical and practical tests are developed in the
Appendix below; to derive orders of magnitude, some examples are worked
out there under plausible assumptions regarding relative costs,
precision, and intra-correlations. It appears, for example, that even in
craft-occupations in which practical aspects might account for, say,
three-quarters of the total required skills (and theoretical aspects
account for one quarter), it may be efficient to allocate only one
quarter of the total budget to practical tests and three-quarters to
written theoretical tests. Other assumptions may yield different mixes;
but it is clear that the cost factor provides strong rational ground for
carrying out so much vocational testing in a written form.
European practice
We are now ready to outline how vocational qualifications are awarded
in Continental Europe, concentrating on those aspects which differ from
Britain under the principles promoted by NCVQ. The number of accredited training-occupations, and the number of associated vocational
qualifications, is limited to just under 400 in Germany, France and the
Netherlands; in Britain, under NCVQ arrangements which give priority to
the views of employers, much larger numbers of approved qualifications
are likely to emerge - perhaps running into very many thousands. The
Continental approach yields a smaller total because breadth is demanded
on all approved training courses in the interests of nationwide
standards and the transferability of skills; the British approach
emphasises the tailoring of qualifications to meet as far as possible
the varying needs of employers. For example, to gain an NVQ in
engineering (at level 3) it is proposed that the candidate should be
able to choose any six so-called Segments out of an available menu of
some 250. If there were no restriction on choice, this would yield a
theoretical maximum number of combinations in excess of a thousand
billion ($10^{12}$)! In practice a limited number of favoured
combinations will no doubt be settled upon, though larger than under the
previous arrangements developed by the Engineering Industry Training
Board (6). While closely-tailored specialisation may encourage British
employers to provide training facilities, and encourage them to
contribute to the finance of training, it is not obvious that carrying
the process to such an extreme promotes an adaptable economy, nor is it
in the long-term interest of employees who need to be able to change
their type of work.
Let us now consider the length of course and qualification
procedures; for the sake of brevity we focus on the German system. The
main vocational qualification in Germany is usually awarded following
three years of apprenticeship, combined with day-release at
college; that is, at about age 18 or 19. The qualification is not based
on length of college attendance (this has to be emphasised, as there has
been some misunderstanding), but on success in final examinations; the
length of college attendance is reduced for those who have higher
initial qualifications, or extended for those who have failed their
examinations (there is a limit to the number of times that an
examination can be repeated - usually only once or twice.) Examinations
are both theoretical (written) and practical; they are taken at the end
of the apprenticeship period, and also at an interim stage. The written
examinations cover vocational and general subjects (mathematics,
language, social studies, etc.); they are externally set and externally
marked. About half a dozen papers are usually involved; the inclusion of
general subjects indicates clearly that vocational education is intended
to be a form of continuing education (comparable to the philosophy of
the previous 'Continuation Colleges' in England). The practical
examination may extend for more than a whole day (for example in
building work) and usually includes an oral test; it is also externally
marked, usually by three examiners, none of whom is permitted to know
the examinee.
An independent jury (the French term for a board of examiners)
is also regarded as an essential feature of the French vocational
qualification system, to preserve the 'authenticity of the
procedures' and to avoid suspicions that diplomas are handed out
for improper reasons - 'pour faire plaisir aux gens' ('to please
people'), as pointed out in a recent article on the French system (7).
In addition to attaining passes in both written and practical
examinations, the German trainee has to produce a satisfactory record of
completion of the centrally-specified list of tasks to be carried out as
part of his apprenticeship.
It is this last requirement which has become the over-riding
element in the NVQ approach. A vast, highly expensive and (in my view)
largely unnecessary re-specification of existing qualifications has been
demanded by NCVQ in order to re-express them in terms of basic 'can
do' tasks, to be carried out as far as possible in front of a
supervisor at work; this has affected established and highly experienced
organisations, such as City and Guilds, BTEC, EITB, etc. I have no doubt
that the satisfactory execution of a sample of practical tasks is an
important part of a sensible qualification process. But it is, I hope,
evident from what has been said here that the German and French
requirements and safeguards, as listed above, are sensible and
efficient; and that they are particularly important in ensuring
reliability of the qualification - so promoting a more efficient market
mechanism for the allocation of scarce skills, and thereby promoting the
acquisition of yet higher skill-levels.
What has actually been put into effect by those responsible for
vocational qualifications in other successful economies, as well as in
successful training sectors here, should - in my view - provide a better
guide to what is generally required in this country today than reliance
on newly-formulated theoretical principles - at least in the present
state of knowledge, and until much detailed empirical research has been
carried out on the relevant parameters (of the type defined in the
Appendix). Perhaps in the light of experience, the authorities in this
country will yet re-consider what are the right principles governing a
national system of vocational qualifications; and that European
experience will be judged to provide important and relevant lessons.
Mathematical Appendix
ON THE OPTIMUM MIX OF PRACTICAL AND THEORETICAL TESTS
This appendix is concerned with the principles governing the optimum
mix of different types of tests in the award of a qualification. The
problem arises because, for example, written ('theoretical') tests
and practical tests can be administered and marked at typically very
different unit-costs, and marked with different degrees of precision.
For a given required degree of precision in the combined final mark, and
for a given total of resources devoted to testing, it is consequently
advantageous to over-represent those types of test that have a lower
unit-cost, after allowing for differences in their relative precision;
the marks on the constituent tests have of course to be weighted to
reflect the a priori required relative importance of the different types
of skill represented by the different types of test (8).
For simplicity we need consider just two types of test-question,
theoretical and practical, denoted by subscripts $t$ and $p$
respectively. Suppose there are $n_t$ theoretical questions on which a
candidate obtains an average mark of $M_t$; and that the mark $x_{it}$
for each theoretical question has an equal measure of uncertainty
attached to it (because of errors of measurement, etc.), denoted by its
sampling error $\sigma_t$. The sampling error of the average $M_t$ is
then, in the usual way, $\sigma_t/\sqrt{n_t}$. Corresponding symbols
($n_p$, $x_{ip}$, $M_p$, ...) relate to the practical questions.
The average mark $M$ awarded on the combined tests is a weighted
average
$$M = w_t M_t + w_p M_p, \qquad (1)$$
where $M_t = \sum_i x_{it}/n_t$, $M_p = \sum_i x_{ip}/n_p$, and
$w_t + w_p = 1$. Practical skills are clearly more important in craft
occupations and, roughly speaking, we may suppose that $w_p$ might be
3/4 and $w_t$ might be 1/4; for technician occupations those weights
might be reversed. Let us next assume that a practical question costs
$k$ times as much to administer and mark as a theoretical question; it
may not be unreasonable to suppose that $k$ lies in the range 10-100,
the higher ratio applying if the comparison is between practical and
written multiple-choice questions (perhaps higher still if the
multiple-choice questions are marked mechanically). The total budget
available for testing, measured in terms of unit-costs for theoretical
questions, is thus
$$B = n_t + k n_p. \qquad (2)$$
Let us also allow for the possibility that the degree of
uncertainty attached to the marking of a practical question differs from
that attached to a theoretical question by a factor $s$, that is
$$\sigma_p = s\,\sigma_t; \qquad (3)$$
because of the element of judgment in assessing practical tasks, $s$ is
probably greater than 1, but perhaps not greater than 2.
The variance of the total mark $M$ can then be derived in the usual
way, allowing for (3), to give
$$V\{M\} = \frac{w_t^2 \sigma_t^2}{n_t} + \frac{w_p^2 s^2 \sigma_t^2}{n_p}. \qquad (4)$$
Our problem is to choose $n_t$ and $n_p$ so as to minimise
this variance subject to the budget constraint (2). Following the method
of Lagrange multipliers, we minimise
$$V\{M\} - \lambda[B - n_t - k n_p] \qquad (5)$$
by partial differentiation with respect to $n_t$ and $n_p$;
this yields
$$-\frac{w_t^2 \sigma_t^2}{n_t^2} + \lambda = 0$$
and
$$-\frac{w_p^2 s^2 \sigma_t^2}{n_p^2} + \lambda k = 0.$$
Combining those last two equations and eliminating $\lambda$, we
find
$$\frac{n_t}{n_p} = \frac{w_t}{w_p} \cdot \frac{\sqrt{k}}{s}; \qquad (6)$$
that is, the ratio of theoretical to practical test-items should
reflect their relative importance in the final marking criteria, but
modified by the factor $\sqrt{k}/s$, which reflects
the relatively greater cost of practical tests and their different
reliability.
By way of example we take the specimen values for the parameters
mentioned above. For craft-type occupations, we assume that practical
skills form three-quarters of the total ($w_p = 0.75$), that a
question on a practical test costs ten times as much to administer and
mark as a theoretical question ($k = 10$), and that it is marked with
equal precision ($s = 1$); the optimum combination is then approximately
equal numbers of practical and theoretical questions (rather than 3:1,
as might be suggested by the occupational skill-mix). If practical
questions cost 100 times as much to mark as theoretical questions, and
were two-thirds as precise in their marking ($k = 100$, $s = 1.5$), then
practical questions should optimally form only 31 per cent of the total.
For technician-type occupations, we assume the converse proportions (ie,
$w_p = 0.25$), and practical questions should account for only 10 or
5 per cent of the total number on the alternative assumptions mentioned.
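The arithmetic of these examples can be checked with a few lines of code (a sketch assuming only the parameter values quoted above and the optimal ratio of equation (6)):

```python
import math

def practical_share(w_p, k, s):
    """Share of practical questions in the optimal total number of
    test items, from equation (6): n_t/n_p = (w_t/w_p) * sqrt(k)/s."""
    w_t = 1.0 - w_p
    ratio_t_to_p = (w_t / w_p) * math.sqrt(k) / s  # n_t / n_p
    return 1.0 / (1.0 + ratio_t_to_p)              # n_p / (n_t + n_p)

# Craft-type occupations (w_p = 0.75)
print(round(practical_share(0.75, k=10, s=1.0), 2))   # ~0.49: roughly equal numbers
print(round(practical_share(0.75, k=100, s=1.5), 2))  # 0.31
# Technician-type occupations (w_p = 0.25)
print(round(practical_share(0.25, k=10, s=1.0), 2))   # 0.1
print(round(practical_share(0.25, k=100, s=1.5), 2))  # 0.05
```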
Lack of independence in marking
We now move to more complex matters. So far we have treated each of a
series of test-questions as providing independent information on a
candidate's capabilities. In practice a positive correlation is to
be expected amongst the 'errors of measurement' in the marks for
different questions awarded to any candidate. It arises, for example,
because a candidate is examined on a particular but perhaps
unrepresentative day; or because he is marked by a particular examiner,
with particular views, or particular personal sympathies or antipathies
to that candidate. (There is also a formally analogous problem that
arises if we are interested in estimating the average mark for a whole
class of pupils; in that case the correlation amongst the marks - rather
than amongst the errors in the marks - is relevant.) This correlation
has consequences for the extent to which an increase in the length of a
test (adding to the number of questions) improves the precision of our
knowledge of a candidate's capabilities; and for the optimum mix of
theoretical and practical tests.
Because of the complexity of the issues, it is best to begin again
with the simpler case of one type of test, and to set out assumptions
more explicitly. A candidate has to answer $n$ questions, on each of
which he is awarded a mark $x_i$ ($i = 1, 2, \ldots, n$). His final mark
for the test is the average of those marks,
$$M = \sum_i x_i / n. \qquad (7)$$
We suppose that a great many such questions are available (say, from
a computerised data-bank of questions), that any number can be chosen at
random, and that the marks are subject to errors which are not
independent. We are interested in the rate at which the precision of the
average mark rises as we increase the number of questions, that is, the
rate at which the variance of $M$ falls as $n$ increases.
We suppose for analytical simplicity that the uncertainty attached
to the mark for each question is the same, so that
$$V\{x_1\} = V\{x_2\} = \ldots = \sigma^2; \qquad (8)$$
we also assume that the correlation (9) between the errors attached
to the marks on any two questions is the same, $\rho$, so that the
covariance between any two marks is
$$C\{x_i, x_j\} = \rho \sigma^2. \qquad (9)$$
To derive the variance of $M$ we square (7),
$$M^2 = \Big(\sum_i x_i\Big)^2 / n^2 = \Big[\sum_i x_i^2 + \sum_{i \neq j} x_i x_j\Big] / n^2,$$
where the first summation extends over $n$ squared terms, and the
second (double) summation relates to $n(n-1)$ cross-products. Taking
expectations yields
$$V\{M\} = [n\sigma^2 + n(n-1)\rho\sigma^2]/n^2 = \sigma^2[(1/n) + \rho(1 - 1/n)]; \qquad (10)$$
the standard error of the average mark is thus
$$\sigma\sqrt{(1/n) + \rho(1 - 1/n)}. \qquad (11)$$
There are two familiar extreme cases. If the observations are truly
independent, that is if $\rho = 0$, we derive the usual formula for the
sampling error of an average, $\sigma/\sqrt{n}$; in this
case an increase in the number of questions leads ultimately to complete
precision. The other extreme is given by $\rho = 1$, for example if each
candidate obtains the same score on every question (but not the same
score as other candidates). In such a situation more questions will not
change the average mark for any candidate; equation (11) thus shows
correctly that the standard error of the average mark does not vary with
$n$. The notable feature of the general case (where $\rho$ lies between 0
and 1) is that an indefinite increase in the number of questions does
not lead to ultimate complete precision, but only to a finite asymptotic
level $\sigma\sqrt{\rho}$. The adjacent table shows that a
correlation as low as 0.1 implies negligible improvement in precision
after, say, the first forty questions. The reason may be put intuitively
as follows. Each additional question may have only a low correlation
with any single previous question; but, considered in relation to the
whole succession of previous questions - say, all marked by the same
examiner - it adds little fresh knowledge. To improve precision
radically, it may be better to add independent examiners rather than
additional questions.
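The plateau can be computed directly from equation (11); a minimal sketch:

```python
import math

def se_factor(n, rho):
    """Standard error of the average mark, in units of sigma,
    from equation (11): sqrt(1/n + rho*(1 - 1/n))."""
    return math.sqrt(1.0 / n + rho * (1.0 - 1.0 / n))

for n in (1, 10, 40, 100, 1000):
    print(n, round(se_factor(n, rho=0.1), 3))
# With rho = 0.1 the standard error falls from 1.0 (at n = 1) to 0.35
# by n = 40, but can never fall below sqrt(0.1), about 0.316, however
# many further questions are added.
```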
We are now ready to return to our central problem, namely, how to
allocate a limited budget between practical and theoretical questions,
taking account of the possibility of correlation between questions. We
must allow for three types of correlation: between one theoretical
question and another theoretical question; between one practical
question and another practical question; and between a theoretical and a
practical question. These are denoted by $\rho_t$, $\rho_p$
and $\rho_{pt}$. Proceeding as previously, applying the result of
(10) to (4) and including the covariance term (10), we derive the
variance of the combined mark on both theoretical and practical
questions as
$$V\{M\} = w_t^2 \sigma_t^2 \Big[\frac{1}{n_t} + \rho_t\Big(1 - \frac{1}{n_t}\Big)\Big] + w_p^2 s^2 \sigma_t^2 \Big[\frac{1}{n_p} + \rho_p\Big(1 - \frac{1}{n_p}\Big)\Big] + 2 w_t w_p \rho_{pt} s \sigma_t^2. \qquad (12)$$
Taking the budget constraint into account as in (5), we eventually
derive a modified optimal condition
$$\frac{n_t}{n_p} = \frac{w_t}{w_p} \cdot \frac{\sqrt{k}}{s} \cdot \sqrt{\frac{1 - \rho_t}{1 - \rho_p}}; \qquad (13)$$
this differs from (6) by the factor $\sqrt{(1 - \rho_t)/(1 - \rho_p)}$.
Notice that $\rho_{pt}$ does not affect the optimal mix.
To illustrate the impact of this factor, let us suppose that
theoretical questions can be formulated and marked in a moderately
independent way, so that their correlation is as low as $\rho_t = 0.1$;
but that practical questions have a considerably greater
correlation - say, because of the inevitably closer contact between
examiner and candidate - so that $\rho_p = 0.5$. Taking the
assumptions for craft-type occupations made previously ($w_p = 0.75$,
$k = 100$, $s = 1.5$), this leads to an optimal requirement for
practical questions to form only 25 per cent of the total number of
questions, instead of the 31 per cent if the intraclass
correlations are zero, or the 75 per cent that practical requirements
have in the posited total mix of skills.
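This figure too is easily reproduced (a sketch using equation (13) with the parameter values just quoted):

```python
import math

def practical_share_corr(w_p, k, s, rho_t, rho_p):
    """Share of practical questions in the optimal total number of test
    items when marking errors are correlated, from equation (13):
    n_t/n_p = (w_t/w_p) * (sqrt(k)/s) * sqrt((1-rho_t)/(1-rho_p))."""
    w_t = 1.0 - w_p
    ratio = (w_t / w_p) * (math.sqrt(k) / s) * math.sqrt((1.0 - rho_t) / (1.0 - rho_p))
    return 1.0 / (1.0 + ratio)

share = practical_share_corr(0.75, k=100, s=1.5, rho_t=0.1, rho_p=0.5)
print(round(share, 2))  # 0.25
```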
There is space here to do no more than mention two further issues
that arise in testing and which might benefit from analysis on the above
approach. First, the optimum bunching of questions around specified
levels in 'criterion-referenced' tests (for example, can the
candidate drive a car satisfactorily?). Secondly, the appropriate margin
of safety in balancing the risks of failing those who ought to be passed
(because the assessment was carried out on a 'bad day'), against
the risks attached to passing someone who ought to fail (he was assessed
on a 'good day', but he suffers from an unusually large number of
'bad days').
NOTES

(1) Originally presented at a seminar at the University of Warwick on 19 March 1991; revised with the benefit of discussion there, and subsequently with my colleagues at the National Institute. It develops ideas previously put forward in a Note in this Review, August 1989. The underlying research forms part of a wider programme of international comparisons of training, education and productivity supported by the Economic and Social Research Council and the Gatsby Foundation, to which bodies my thanks are due. Responsibility for errors remains my own.

(2) Strictly speaking NCVQ applies only to England and Wales, and Scotland comes under a separate body (Scotvec); the issues are however much the same, and nothing of substance is sacrificed if, for convenience of exposition, we refer throughout simply to 'Britain'.

(3) Capitals are attached to these words to indicate their technical connotation here.

(4) For an extreme statement ('we should just forget altogether') by NCVQ's Director of Research, see G. Jessup, Outcomes: NVQs and the Emerging Model of Education and Training (Falmer, 1991), p. 191. Similar views are to be detected in earlier publications from the Training Agency of the Department of Employment in their Guidance Notes for the Development of Assessable Standards for National Certification (Sheffield, 1989); see, for example, the remark on the merits (sic) of oral questioning: 'it does not require candidates to be able to read or write' (Guidance Note 5, p. 7). Of course someone may be considered a capable carpenter for many purposes without being able to write; the Continental view would be that an employer who wished to employ him as a carpenter is permitted to do so, but he should not be awarded a Vocational Qualification. On the other hand, NCVQ would be prepared to award a Qualification. One of the dangers of the latter approach is that vocational qualifications will acquire a cumulatively lower status in Britain ('suitable for illiterates'), whereas on the Continent great pains have been taken to enhance their esteem.

(5) HMI, National Vocational Qualifications in Further Education 1989-1990, DES, 1991, pp. 6, 8.

(6) Progress seems to require ever finer grinding. Previously, two Modules were the requirement for a craft qualification in engineering; a Module was then divided into three Segments. On the latest development each Segment is to be divided into an average of four Elements (yielding a total of about a thousand Elements). At the time of writing it seems that extensive negotiations are in progress with NCVQ in several occupational areas. 'Conditional Accreditation' has been granted by NCVQ for some existing qualifications, so that government training subsidies may immediately be received by the industry concerned pending agreement on the longer-term re-structuring of their qualification-procedures in accordance with NCVQ's principles.

(7) E. Kirsch, Formation Emploi, Oct-Dec 1990, p. 13.

(8) From a formal point of view the mathematical development that follows in this Appendix is, in essence, no more than an application of the standard theory of stratified and clustered sampling; but I am not aware that it has previously been applied in this context (the algebra here concentrates on the essentials required in the present application and is, I hope, simpler to follow than that provided in general texts on sampling theory).

(9) This corresponds to the intraclass correlation which arises 'mainly in biological studies' (G.U. Yule and M.G. Kendall, An Introduction to the Theory of Statistics, Griffin, London, 14th edition, 1950, p. 272; the charming application to variations in the length of cuckoos' eggs according to nest of foster parent - Robin, Wren or Hedge Sparrow - will bring joy to many a scientific heart: ibid., p. 280, based on a study in Biometrika, 1905). For its application in cluster sampling see, for example, M.H. Hansen, W.N. Hurwitz and W.G. Madow, Sample Survey Methods and Theory (Wiley, 1953), vol. II, ch. 6.

(10) There are $2 n_t n_p$ covariance terms of the type $x_{it} x_{jp}$ which, on taking expected values, reduce to the simple final term in (12).