Consistency in diagnoses for a sample of adolescents at a private psychiatric hospital.
Hickin, Nancy; Slate, John R.; Saarnio, David A.
An estimated 18 to 20% of children and adolescents are affected by
emotional and behavioral disorders (Canning, Hanser, Shade, & Boyce,
1992). Although psychiatric diagnoses are infrequently assigned to
children, such diagnoses become more frequent for adolescents (Smeeton,
Wilkinson, Skuse, & Fry, 1992). Moreover, the frequency of adult
diagnoses is quite high given a diagnosis in adolescence. For example,
Smeeton et al. (1992) found that of the adolescents they sampled who had
received a psychiatric diagnosis (approximately one in 19), 38% received
a psychiatric diagnosis as young adults. This percentage is higher than
that reported by others (e.g., Graham and Rutter, 1985, found that about 17% of
adolescents with emotional problems do not show improvement as they
enter adulthood), but both figures indicate that many adolescents will have
problems into adulthood. In light of these findings, a clear need exists
for the proper identification and diagnosis of adolescents with problems
in order for treatment to be as effective as possible before the
problems become long term.
The Diagnostic and Statistical Manual of Mental Disorders, Third
Edition, Revised (DSM-IIIR; American Psychiatric Association, 1987) is
widely used for diagnoses and is considered to be a reliable classification
system for mental health disorders (Skre, Onstad, Torgersen, & Kringlen, 1991).
Although several studies have found interrater agreement of the DSM-IIIR
to be high, especially with regard to Axis I diagnoses (Skre et al.,
1991), the available studies have examined broad categories of mental
health disorders (e.g., mood disorders) rather than specific diagnoses
(e.g., depression). As a result, the interrater agreement reported in
studies may not accurately represent the interrater agreement for
specific diagnostic categories (Skre et al., 1991) or, for that matter,
the interrater agreement for combinations of DSM-IIIR diagnoses.
Whereas the DSM-IIIR itself has been found to be a reliable
classification system, much less information is available regarding
clinicians' precision in using it. In fact, several studies (e.g.,
First et al., 1993; Spitzer, Forman, & Nee, 1979; Webb, Gold, Johnstone,
& Diclemente, 1981) suggest that the diagnostic criteria delineated in
the DSM-IIIR are not used correctly.
This can create a problem if adolescents are misdiagnosed, especially if
medications are involved.
In this study, admission psychiatric, psychological, and discharge
Axis I diagnoses were examined in order to determine the relationships
between admission and discharge diagnosis. Also investigated was whether
interrater agreement varied as a result of diagnostic label and as a
function of the particular evaluation. The specific research questions
in this study were: (1) To what extent are the primary and secondary
AXIS I diagnoses on the initial psychiatric evaluation, the
psychological evaluation (when conducted), and the discharge evaluation
related? (2) To what extent are sex differences present in the
interrater agreement of the primary and secondary AXIS I discharge
evaluations? (3) To what extent are differences present in interrater
reliability as a result of the initial primary AXIS I diagnosis, for the
initial psychiatric, psychological, and discharge evaluations? and (4)
Does interrater agreement differ across evaluations for primary vs.
secondary diagnoses?
METHOD
Data were collected from 291 medical records of adolescent inpatients
at a private psychiatric and substance abuse hospital in the Mid-South.
A random sample of every third medical record of adolescent inpatients
was examined for the following information: (a) AXIS I primary and
secondary diagnoses on the initial psychiatric evaluation, (b) AXIS I
primary and secondary diagnoses on the psychological evaluation
conducted during the adolescent's stay at the facility, and (c)
AXIS I primary and secondary diagnoses on the discharge summary. For all
adolescents admitted to this particular facility, admission and discharge
evaluations are conducted. The psychological evaluations were conducted
only when the physician specifically requested them in order to obtain
additional information. Demographic information regarding the
adolescent's age, sex, and ethnicity was also recorded.
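As a rough illustration of the sampling arithmetic, the short Python sketch below selects every third item from a list of 291 placeholder record numbers. The identifiers and the starting position are assumptions made for illustration; the hospital's actual record numbers and the starting record are not reported here.

```python
def every_kth(records, k=3, start=0):
    """Return every k-th record, beginning at index `start`."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return records[start::k]


# Hypothetical record identifiers (1..291); the real chart numbers are unknown.
record_ids = list(range(1, 292))

# Assumed starting point: the first record in the list.
sample = every_kth(record_ids, k=3, start=0)

print(len(sample))  # number of records selected
```

Taking every third record from a pool of 291 yields 97 records whichever of the first three records serves as the starting point, which equals the sample size of 97 adolescents reported for this study.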
The sample consisted of 97 adolescents (53 males, 44 females); 85
were Caucasians, 11 African-Americans, and 1 Native-American. Their ages
ranged from 12 to 18 years, with a mean age of 14.7 years (SD = 1.6). Of
the AXIS I primary diagnoses on the initial psychiatric evaluation, 10
were polysubstance abuse, 9 were conduct disorder, and 8 were adjustment
disorder. For the remaining 27 adolescents, no other diagnosis occurred
more than 4 times. Of the AXIS I secondary diagnoses on the initial
psychiatric evaluation, 11 were polysubstance abuse, 6 were alcohol
abuse/dependence, and 4 were conduct disorder. For the remaining 76
adolescents, no other secondary diagnosis occurred more than 4 times.
An examination of AXIS I primary diagnoses for males revealed that 13
(25.5%) were depression, 8 (15.7%) were polysubstance abuse, 6
(11.8%) were conduct disorder, and 6 (11.8%) were oppositional-defiant
disorder. No other diagnosis was assigned more than 4 times. An analysis
of AXIS I primary diagnoses for females indicated that 21 (47.7%) were
depression. No other diagnostic label was assigned more than 4
times. Unless otherwise specified, statistical data analyses utilized
all diagnostic labels.
RESULTS
The first research question focused on interrater agreement across
evaluations. Perfect agreement between the primary AXIS I diagnosis on
the initial psychiatric evaluation and on the psychological evaluation
was obtained 71.4% of the time. Partial agreement (i.e., the primary on
the initial psychiatric evaluation was either the secondary or was
listed as a diagnosis on the psychological evaluation) was obtained 5.7%
of the time. No agreement occurred 22.9% of the time. Therefore, viewed
liberally, percent of interrater agreement from the initial psychiatric
evaluation primary diagnosis to the psychological evaluation primary
diagnosis was 77.1%. The interrater agreement for the AXIS I secondary
diagnosis from the initial psychiatric to the psychological evaluation
was exact 65.2% of the time; partial agreement occurred 13.0% of the
time, and no agreement occurred 21.7% of the time. Thus, a liberal view is that the percent
agreement for the secondary diagnosis was 78.2%.
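The three agreement categories described above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the list-based representation of an evaluation (primary diagnosis first), and the four example cases are hypothetical and are not taken from the study records.

```python
from collections import Counter


def classify(first, second):
    """Classify agreement on the primary diagnosis between two evaluations.

    Each evaluation is a non-empty list of diagnosis labels, primary first.
    """
    if first[0] == second[0]:
        return "perfect"   # same primary diagnosis on both evaluations
    if first[0] in second[1:]:
        return "partial"   # primary reappears as a secondary/other diagnosis
    return "none"


def agreement_rates(pairs):
    """Return percent perfect, partial, none, and liberal (perfect + partial)."""
    if not pairs:
        raise ValueError("at least one pair of evaluations is required")
    counts = Counter(classify(a, b) for a, b in pairs)
    n = len(pairs)
    rates = {k: 100 * counts[k] / n for k in ("perfect", "partial", "none")}
    rates["liberal"] = rates["perfect"] + rates["partial"]
    return rates


# Hypothetical cases: (initial psychiatric evaluation, psychological evaluation)
pairs = [
    (["depression", "conduct disorder"], ["depression"]),
    (["conduct disorder"], ["depression", "conduct disorder"]),
    (["adjustment disorder"], ["polysubstance abuse"]),
    (["depression"], ["depression", "alcohol abuse"]),
]

rates = agreement_rates(pairs)
print(rates)
```

The "liberal" figure is simply the sum of the perfect and partial percentages, which is how 71.4% and 5.7% combine to give the 77.1% reported for the primary diagnosis.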
Perfect agreement between the primary AXIS I diagnosis on the initial
psychiatric evaluation and on the discharge evaluation was obtained
61.7% of the time. Partial agreement was obtained 9.5% of the time. No
agreement occurred 28.7% of the time. Therefore,
percent of interrater agreement from the initial psychiatric evaluation
primary diagnosis to the discharge evaluation primary diagnosis was, at
best, 71.2%. For the AXIS I secondary diagnosis from the initial
psychiatric to the discharge evaluation, perfect agreement was obtained
45.8% of the time, partial agreement occurred 20.8% of the time, and no
agreement occurred 33.3% of the time. Thus, the percent agreement for the
secondary diagnosis was 66.6%.
Between the psychological evaluation, when conducted, and the
discharge evaluation, perfect agreement in the primary AXIS I diagnosis
was obtained 81.6% of the time. Partial agreement was obtained 7.9% of
the time. No agreement occurred 10.5% of the time. Therefore, percent of
interrater agreement from the psychological evaluation primary diagnosis
to the discharge evaluation primary diagnosis was 89.5%. An analysis of
the interrater agreement for the AXIS I secondary diagnosis from the
psychological to the discharge evaluation revealed that perfect
agreement was obtained 52% of the time and partial agreement occurred
28% of the time. No agreement occurred 20% of the time. Thus, the percent agreement for the
secondary diagnosis was 80%.
The rates of agreement clearly differ across different pairs of
evaluations. The greatest agreement occurred between psychological and
discharge evaluations, and the least agreement was between the initial
psychiatric and the discharge evaluations. Consistency between the
psychiatric and psychological evaluation was midway between the other
pairs.
For the second question, chi-squares were conducted to determine
whether sex differences were present in the interrater agreement of the
primary and secondary AXIS I diagnoses for the initial psychiatric,
psychological, and discharge evaluations. For these analyses, perfect
and partial agreements were combined and compared to no agreement. Only
one of the 6 chi-squares resulted in a finding that approached
statistical significance, that of the primary diagnosis from the initial
psychiatric to the psychological evaluation, $\chi^2(1) = 2.86$,
$p < .09$. The other chi-squares indicated no differences by sex
in interrater agreement, suggesting that raters' diagnoses were
equally consistent across the three settings for males and for females.
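For readers who wish to see how a comparison of this kind is computed, the following Python sketch calculates a Pearson chi-square for a 2 x 2 table (sex by agreement versus no agreement) with one degree of freedom. The cell counts are hypothetical, because the cell frequencies are not reported here, and the sketch assumes that no continuity correction was applied, which the text does not specify.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p value for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 df, chi-square is a squared standard normal deviate, so the
    # upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts (NOT from the study):
#            agreement   no agreement
# males          20           10
# females        25            5
chi2, p = chi_square_2x2(20, 10, 25, 5)
print(round(chi2, 2), round(p, 3))
```

Combining perfect and partial agreement into a single "agreement" column is what reduces each comparison to a 2 x 2 table with one degree of freedom.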
To address the third question, chi-squares were again conducted to determine whether differences
were present in interrater reliability as a result of the initial
primary AXIS I diagnosis on the initial psychiatric, psychological, and
discharge evaluations. For these analyses, perfect and partial
agreements were again combined and compared to no agreement. Moreover,
because of the frequency with which a diagnosis of depression was made
versus all other diagnoses, the diagnoses were grouped into two
categories: depression and nondepression (i.e., all other diagnoses
combined). In this instance, one chi-square resulted in a statistically
significant finding, that of the primary diagnosis from the initial
psychiatric to the discharge evaluation, $\chi^2(1) = 13.57$,
$p < .001$. This analysis revealed strong consistency in
interrater agreement from the psychiatric to the discharge evaluation
for depression, 71.3%. Interrater agreement for the nondepression
diagnoses, however, was highly inconsistent, 28.7%. The other
chi-squares indicated no differences in interrater agreement for
depression versus nondepression diagnoses.
To address the fourth question, chi-squares were conducted on exact
agreements to ascertain whether differences in interrater agreement were
present between primary and secondary diagnoses assigned to our
adolescent inpatient sample. It might be expected that primary diagnoses
would show greater agreement than secondary diagnoses, and that was, in
fact, found. Interrater agreement was significantly higher for the
primary than the secondary diagnosis for the psychiatric to the
psychological evaluation, $\chi^2(1) = 9.08$, for the psychiatric
to the discharge evaluation, $\chi^2(1) = 9.28$, and for the
psychological to the discharge evaluation, $\chi^2(1) = 5.67$, all
$p$s $< .01$.
DISCUSSION
The findings indicate that interrater reliability ranged from
approximately 71 to 90% on the primary diagnoses and 67 to 80% on the
secondary diagnoses. These figures indicate that the interrater
consistency of diagnoses over time was high, at least from a liberal
viewpoint. A more conservative viewpoint, using only exact agreement
across primary diagnoses and secondary diagnoses, would yield more
worrisome ranges of 62 to 82% and 46 to 65%, respectively.
The consistency was higher between the psychological and discharge
evaluations (89.5% primary diagnoses and 80% secondary diagnoses) than
between admission and psychological diagnoses (77.1% primary and 78.2%
secondary). One possible reason for these findings is that admission
diagnoses are usually tentative until the client's condition can be
further examined in more detail (Libb, Murray, Thurstin, & Alarcon,
1992). Psychological evaluations, requested by the assessment
specialist, may assist the diagnostic process by "shedding
light" on the patient's condition and also by allowing time
for the clinician to gain a more accurate understanding of the
patient's condition. With even more information being available,
discharge diagnoses may be even more accurate, or at least more
consistent with previous diagnostic labels (Libb et al., 1992). This may
also be part of the reason why, though certainly desirable, 100%
agreement between raters is probably not realistic in a real world
setting. Another way of viewing the apparent difference across
evaluation occasions is in the amount of time between diagnoses; more
time may yield less agreement. In fact, consistent with that hypothesis,
adjacent evaluations (psychiatric to psychological and psychological to
discharge) revealed stronger interrater consistency than was present for
the nonadjacent evaluations (psychiatric to discharge). These
explanations, individually and in combination, would lead to
expectations of changes in diagnoses over time, and would indicate that
multiple diagnostic occasions may actually inform and benefit both the
professional and the patient because greater accuracy may be obtained in
diagnoses.
Consistent with the literature (e.g., Jorm, 1987), the sample of
female adolescents was diagnosed much more often with depressive
disorders than was the sample of male adolescents. Other sex differences
were not apparent, but type of disorder was related to diagnostic
consistency. The findings of this study revealed strong consistency in
interrater agreement from the psychiatric to the discharge evaluation
for depression; however, interrater agreement for the nondepression
diagnoses was highly inconsistent. This finding may be due to the fact
that depression was treated as a single diagnosis, whereas nondepression
was a mixture of all other diagnoses.
Nevertheless, diagnostic agreement should not vary as much as it did
here even when diagnoses are mixed rather than singular.
The data from this study also indicate that more attention or care
may be taken with primary diagnoses than with secondary diagnoses. At
least, there is much greater consistency in primary diagnoses. This
would be expected, in that on most occasions admittance to a psychiatric
hospital will be based on a primary problem. However, the secondary
diagnoses are also important because they can inform the clinician about
the primary problem and methods for treatment. The lack of interrater
agreement for secondary diagnoses is a finding that needs additional
attention.
Considering the fact that a substantial number of children and
adolescents are affected by emotional and behavioral disorders, it is
essential that they receive the most appropriate and beneficial
treatment possible. Although researchers have shown the DSM-IIIR to have
more specific criteria for diagnostic categories than its predecessor,
documenting the reliability of clinicians' use of the DSM-IIIR has
been more difficult, and researchers generally state that clinicians
often misuse this diagnostic tool. A serious problem can arise if a
child/adolescent is misdiagnosed. Not only can it result in
stigmatization and labeling for life, but the child/adolescent may also
receive inappropriate treatment. This matter becomes even more serious
when medications are involved. Both clinicians and treatment facilities
are encouraged to evaluate the consistency of their diagnostic decisions
and to take measures to improve them.
REFERENCES
American Psychiatric Association. (1987). Diagnostic and Statistical
Manual of Mental Disorders (3rd ed. revised). Washington, DC: American
Psychiatric Association.
Canning, E., Hanser, S., Shade, K., & Boyce, W. (1992). Mental
disorders in chronically ill children: Parent-child discrepancy and
physician identification. Pediatrics, 90, 692-698.
First, M., Opler, L., Hamilton, R., Linder, J., Linfield, L., Silver,
J., Toshav, N., Kahn, D., Williams, J., & Spitzer, R. (1993).
Evaluation in an inpatient setting of DTREE, a computer-assisted
diagnostic assessment procedure. Comprehensive Psychiatry, 34, 171-175.
Graham, P., & Rutter, M. (1985). Adolescent disorders. In M.
Rutter & L. Hersov (Eds.), Child and adolescent psychiatry: Modern
approaches (2nd ed., pp. 351-367). Oxford: Blackwell Scientific.
Jorm, A. (1987). Sex and age differences in depression: A
quantitative synthesis of published research. Australian and New
Zealand Journal of Psychiatry, 21, 45-53.
Libb, J., Murray, J., Thurstin, J., & Alarcon, R. (1992).
Concordance of the MCMI-II, the MMPI, and Axis I discharge diagnosis in
psychiatric inpatients. Journal of Personality Assessment, 58(3),
580-590.
Skre, I., Onstad, S., Torgersen, S., & Kringlen, E. (1991). High
interrater reliability for the Structured Clinical Interview for
DSM-III-R Axis I (SCID-I). Acta Psychiatrica Scandinavica, 84, 167-173.
Smeeton, N., Wilkinson, G., Skuse, D., & Fry, J. (1992). A
longitudinal study of general practitioner consultations for psychiatric
disorders in adolescence. Psychological Medicine, 22, 709-715.
Spitzer, R., Forman, J., & Nee, J. (1979). DSM-III field trials.
I. Initial interrater diagnostic reliability. American Journal of
Psychiatry, 136, 815-817.
Webb, L., Gold, R., Johnstone, E., & Diclemente, C. (1981).
Accuracy of DSM-III diagnoses following a training program. American
Journal of Psychiatry, 138, 376-378.
A version of this manuscript was presented at the Mid-South
Educational Research Association, November 10, 1994.
Nancy Hickin, Master's in Rehabilitation Counseling, Public
Relations Director, VISTA Volunteer, Northeast Arkansas Council on
Family Violence.
David A. Saarnio, Ph.D., Associate Professor, Department of Counselor
Education and Psychology, Arkansas State University.
Reprint requests to John R. Slate, Ph.D., Professor, Department of
Educational Leadership, Valdosta State University, Valdosta, GA 31698.