An analysis of difference score measures of latent business constructs.
Vernon, Thomas Alexander; Inman, R. Anthony; Brown, Gene; et al.
ABSTRACT
This study empirically investigates the reliability and construct
validity of difference score measures of latent business constructs in
comparison to that of non-difference score measures of latent business
constructs. Results provide evidence that questions the conventional
wisdom that the use of difference scores should be avoided whenever
possible. Also, the thorough empirical examination of reliability and
construct validity provides a framework for practitioners who wish to
properly evaluate different measurement techniques.
INTRODUCTION
Difference scores are created when one measure is subtracted from
another to create a measure of a distinct construct (Peter, Churchill &
Brown, 1993). Several researchers (Peter, Churchill & Brown,
1993; Johns, 1981; Cronbach & Furby, 1970; Nunnally, 1959; and
Mosier, 1951) have cited potential problems with difference scores, such
as problems with reliability, discriminant validity, spurious
correlation and variance restriction, which can cause them to perform
poorly as measures of latent constructs. There is clearly a need for
empirical study of the possible problems encountered when using
difference scores to measure latent business constructs. This paper
assesses the reliability and validity of difference score measures of
latent business constructs in comparison to the reliability and validity
of non-difference score alternatives purported to measure the same
latent business constructs.
Difference Scores
A significant portion of the literature on difference score
measures is in the area of testing and measurement (Cronbach & Furby,
1970; Lord, 1958; and Mosier, 1951). In the business area, difference
scores have been used mostly in behavior-oriented fields and are
generally concerned with the measurement of latent constructs, although
there has been application in economics and finance (Ogden, 1990).
The most common methods of expressing difference scores are simple
absolute differences, differences between profiles and signed
(algebraic) differences (Johns, 1981). Simple absolute differences are
considered "simple" in that their components consist of
single-item scores or summary scores derived from a scale of items
(Johns, 1981). The term absolute implies that the direction of the
difference is not important, only the magnitude.
Profiles are graphic summaries of multiple-item measures which
retain the identity of each item until all are combined into a summary
difference measure (Mosier, 1951). Use of profile differences requires
the assumption that the direction of the difference is not important.
However, unlike simple absolute differences, profiles have components
that consist of more than a single variable (Johns, 1981).
Johns (1981) related that differences between profiles are commonly
expressed as:
1 Sum of the absolute differences between parallel profile points
to obtain an index of dissimilarity (Bernardin & Alvares, 1975;
Green & Organ, 1973).
2 Sum of the squares of the absolute differences between parallel
profile points to obtain the index of profile dissimilarity, or $D^2$
(Cronbach & Gleser, 1953).
3 Square root of the $D^2$ index (Frank & Hackman, 1975; Senger,
1971).
Algebraic, or signed, differences are formed when the direction
of the difference is maintained, allowing researchers to consider both
magnitude and direction of differences.
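The difference score forms described above can be illustrated with a short sketch. The function name, the dictionary keys and the example profiles are hypothetical and are not taken from any of the cited studies; the signed difference is computed here as the sum of the item-level signed differences, which is an assumption about how the summary score is formed.

```python
import math


def profile_differences(profile_a, profile_b):
    """Return common difference indexes for two parallel profiles.

    Both arguments are equal-length sequences of item scores
    (hypothetical data; any numeric scale will do).
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of points")
    diffs = [a - b for a, b in zip(profile_a, profile_b)]
    d_squared = sum(d * d for d in diffs)
    return {
        "sum_abs": sum(abs(d) for d in diffs),  # index of dissimilarity
        "d_squared": d_squared,                 # D^2 index
        "d": math.sqrt(d_squared),              # square root of D^2
        "signed": sum(diffs),                   # algebraic (signed) difference
    }


# Hypothetical three-point profiles.
example = profile_differences([5, 3, 4], [2, 3, 6])
print(example)
```

Note that the first three indexes discard direction, whereas the signed difference lets positive and negative item differences offset one another.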
While researchers often employ various weighting strategies in the
calculation of difference scores, these are primarily study-specific and
will not be addressed in this paper. However, the method used to combine
component parts into difference score measures must be considered
carefully prior to any statistical analysis employed using difference
scores as inputs. Peter, Churchill and Brown (1993) describe four areas
in which problems may occur with the use of difference scores. These
include:
Reliability--Problems with reliability of difference scores are
primarily focused in:
1 The inherently low reliability of difference scores. Difference
score reliability is usually lower than the reliabilities of their
component variables (Peter, Churchill, & Brown, 1993; Prakash &
Lounsbury, 1983; Johns, 1981; and Mosier, 1951).
2 The failure of researchers to report the reliabilities of
difference scores, or to calculate them correctly. In their analysis of
difference score applications in consumer research, Peter, Churchill
and Brown (1993) found that only four of 13 studies attempted to assess
the reliability of the difference scores, and all four did so
incorrectly.
3 The effect of intercomponent correlation. The reliability of a
difference score will equal the average of the reliabilities of its
components only when the intercomponent correlation is zero.
Discriminant Validity--Discriminant validity is impacted from two
sources:
1 Reliability effects. For various reasons, difference scores often
have low reliability. It is possible that the correlations between a
difference score measure and other measures may give the impression that
discriminant validity standards are met simply because of low
reliability.
2 Combination effects. Measures formed as linear combinations of
scale scores (as difference scores are) may not be distinguishable from
their components, thereby failing to demonstrate discriminant
validity.
Spurious Correlation--An additional problem is the tendency of
difference scores to be correlated with other measures. This is
primarily due to the tendency of difference scores to be correlated with
their component parts (Peter, Churchill & Brown, 1993). In addition,
spurious correlation is hard to separate from legitimate correlation.
Variance Restriction--Restriction of the variance of a difference
score measure may occur when one of the components used to calculate the
difference score is consistently higher than the other (Peter, Churchill
& Brown, 1993).
Alternatives to Difference Scores
Given the potential problems associated with the use of difference
scores to measure latent constructs, several alternatives have been
suggested by researchers (Peter, Churchill & Brown, 1993; Cronbach
& Furby, 1970). Suggestions include:
1 Designing single statement measures that require the respondent to subjectively compare the two constructs whose measures were used to
create the difference score measure of the latent construct (subjective
difference method);
2 Reframing research questions to avoid any comparison of the
constructs whose measures were used to create the difference score
measure of the latent construct (single statement method), and
3 Using a multiple regression approach to determine the appropriate
weight to assign to the measures of the constructs used to create the
difference score measure of the latent construct, rather than
"forcing" the measures into a predetermined form as is the
case with the construction of difference score measures of latent
constructs.
RESEARCH HYPOTHESES
To thoroughly assess the reliability and validity of difference
scores, it was necessary to compare difference score and non-difference
score measures of the same latent business constructs in a single study.
Two latent business constructs, net perceived return and disconfirmation
of expectations, were selected from the business literature because they
could be used to develop a single research instrument focused around a
common theme. Specifically, the net perceived return and disconfirmation
of expectations associated with the hypothetical purchase of a small
4-door economy car were measured by a difference score method and two
non-difference score alternative methods.
Research hypotheses are offered to facilitate the empirical
analysis of difference score measures of latent business constructs. The
following hypothesis is offered to provide a framework for the detailed
empirical investigation of the reliability of difference score measures
and their alternatives.
H1: There is no difference in the reliability of difference score
and non-difference score measures of latent business constructs.
In addition to the need for a thorough examination of the
reliability of a measure, researchers should investigate the
nomological, convergent and discriminant validity of difference score
measures and their alternatives (Peter, 1981). Hence:
H2: There is no difference in the nomological validity of
difference score and non-difference score measures of latent business
constructs.
H3: There is no difference in the convergent validity of difference
score and non-difference score measures of latent business constructs.
H4: There is no difference in the discriminant validity of
difference score and non-difference score measures of latent business
constructs.
The evaluation of a measure's nomological validity requires an
empirical analysis of the measure of the construct of interest and
measures of other constructs theoretically linked to the construct of
interest (Peter, 1981). Therefore, the nomological validity of the
difference score measures and their alternatives was investigated by
examining their relationships with measures of other theoretically
related constructs.
The research instrument (Appendix A) included items measuring the
respondent's overall impression and expected resale value of the
4-door economy car used in the study. Overall impression and expected
resale value should be significantly correlated with the net perceived
return and disconfirmation of expectations associated with the
hypothetical purchase of the small 4-door economy car. Investigating
the relationship of the measures of net perceived return and
disconfirmation of expectations with the variables measuring overall
impression and expected resale value requires the introduction of four
secondary hypotheses (Hypotheses 2a-2d). These hypotheses provide the
framework necessary to compare the nomological validity of difference
score and non-difference score measures, and thus to investigate
Hypothesis 2, in a consistent manner.
H2a: Difference score measures of net perceived return and
non-difference score measures of net perceived return are not different
with respect to their correlation with the overall impression of 4-door
economy car x.
H2b: Difference score measures of disconfirmation of expectations
and non-difference score measures of disconfirmation of expectations are
not different with respect to their correlation with the overall
impression of 4-door economy car x.
H2c: Difference score measures of net perceived return and
non-difference score measures of net perceived return are not different
with respect to their correlation with the expected resale value of
4-door economy car x.
H2d: Difference score measures of disconfirmation of expectations
and non-difference score measures of disconfirmation of expectations are
not different with respect to their correlation with the expected resale
value of 4-door economy car x.
RESEARCH METHOD
To investigate the research hypotheses it was first necessary to
determine an appropriate research methodology. Multitrait-multimethod
analysis was selected as the primary method of validity assessment due
to its support in the psychometric literature. The decision to use
multitrait-multimethod analysis necessitated the research design
employed in this study.
Multitrait-multimethod analysis requires that two or more traits,
or constructs, be measured by two or more methods with a single research
instrument. The two latent business constructs, net perceived return and
disconfirmation of expectations, meet the conditions required for
multitrait-multimethod validity analysis.
Each latent trait, or construct, was measured by three methods: one
difference score method and two non-difference score alternative
methods. The first non-difference score alternative, the
"subjective difference method," required respondents to
subjectively compare the two constructs whose measures were used to
create the difference score measure of the latent construct. The second
non-difference score alternative, the "single statement
method," employed research questions that avoided any comparison of
the two constructs used to create the difference score measure of the
latent construct. A detailed discussion of the latent business
constructs examined in the study, along with the scales used to
operationalize the constructs, follows.
Net Perceived Return
The net perceived return construct was introduced by Peter and
Tarpey (1975) as an alternative theoretical formulation of how consumers
evaluate the risks and returns associated with purchase decisions. They
(Peter & Tarpey, 1975) stated that, in the context of risk-return
typology, there would appear to be three distinct strategies in terms of
how consumers make decisions:
1 Select the brand that minimizes expected loss (perceived risk),
2 Select the brand that maximizes expected gain (perceived return),
and
3 Select the brand that maximizes net expected gain (net perceived
return).
Peter and Tarpey (1975) investigated each alternative and concluded
that the net perceived return alternative explained more variance in
automobile brand preference than the other two.
Difference Score Method. Using Lewin's (1943) vector
hypothesis of consumer behavior as a theoretical basis, net perceived
return can be defined as the difference between overall perceived return
and overall perceived risk as shown (Peter & Tarpey, 1975: 30):
$$\mathrm{NPRe}_j = f(\mathrm{OPRe}_j - \mathrm{OPR}_j) = f \sum_{i=1}^{n} \left[ (\mathrm{PG}_{ij} \times \mathrm{IG}_{ij}) - (\mathrm{PL}_{ij} \times \mathrm{IL}_{ij}) \right]$$
where:
$\mathrm{NPRe}_j$ = net perceived return for brand j
$\mathrm{OPRe}_j$ = overall perceived return for brand j
$\mathrm{OPR}_j$ = overall perceived risk for brand j
$\mathrm{PG}_{ij}$ = probability of gain i from purchase of brand j
$\mathrm{IG}_{ij}$ = importance of gain i from purchase of brand j
$\mathrm{PL}_{ij}$ = probability of loss i from purchase of brand j
$\mathrm{IL}_{ij}$ = importance of loss i from purchase of brand j
n = number of utility facets
Six utility facets were used in the Peter and Tarpey (1975) study.
These same six are incorporated into this study and include: financial
risk-return, performance risk-return, psychological risk-return,
physical risk-return, social risk-return, and time risk-return. The net
perceived return scale developed by Peter and Tarpey (1975) appears in
Appendix A.
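As a minimal sketch of how the difference score model for net perceived return could be scored for one respondent, the function below sums the facet-level terms. It assumes f is the identity function, which the model itself leaves unspecified; the function name and the ratings for the six utility facets are hypothetical.

```python
def net_perceived_return(p_gain, i_gain, p_loss, i_loss):
    """Score the difference model for one respondent and one brand.

    Each argument holds one rating per utility facet. The unspecified
    function f in the model is assumed to be the identity here.
    """
    if not (len(p_gain) == len(i_gain) == len(p_loss) == len(i_loss)):
        raise ValueError("all inputs need one rating per utility facet")
    overall_return = sum(pg * ig for pg, ig in zip(p_gain, i_gain))
    overall_risk = sum(pl * il for pl, il in zip(p_loss, i_loss))
    return {
        "overall_perceived_return": overall_return,
        "overall_perceived_risk": overall_risk,
        "net_perceived_return": overall_return - overall_risk,
    }


# Hypothetical ratings for the six facets (financial, performance,
# psychological, physical, social, time).
npr = net_perceived_return(
    p_gain=[5, 6, 4, 5, 3, 4],
    i_gain=[6, 7, 3, 5, 2, 4],
    p_loss=[3, 2, 2, 1, 2, 3],
    i_loss=[6, 6, 3, 7, 2, 4],
)
print(npr)
```

The overall perceived risk component is returned separately because it also serves as the single statement measure in this study.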
Subjective Difference Method. The subjective difference method of
recasting a difference score into a single statement was used in the
context of net perceived return to ensure that the measurement methods
were consistent for the disconfirmation of expectations and net
perceived return constructs. Statements were developed that required the
respondents to subjectively compare the potential loss and gain
associated with the hypothetical purchase of a small 4-door economy
car along each of the six utility facets. These statements are referred
to as "subjective loss-gain comparisons" ($\mathrm{SLG}_{ij}$). Since
the difference score measurement model of the net perceived return
construct ($\mathrm{NPRe}_j$) employs utility facet importance weights, it
was necessary to create a utility facet weighting scheme for the
subjective net perceived return ($\mathrm{SNPRe}_j$) model. The importance
of a gain ($\mathrm{IG}_{ij}$) and the importance of a loss
($\mathrm{IL}_{ij}$) were averaged to provide a weighting factor
($I_{ij}$) for each utility facet. The subjective difference model used
to measure net perceived return ($\mathrm{SNPRe}_j$) is:
$$\mathrm{SNPRe}_j = f \sum_{i=1}^{n} \left( I_{ij} \times \mathrm{SLG}_{ij} \right)$$
where:
$\mathrm{SNPRe}_j$ = net perceived return for brand j (subjective
measure)
$\mathrm{SLG}_{ij}$ = subjective loss-gain comparison i from purchase of
brand j
$I_{ij}$ = $(\mathrm{IG}_{ij} + \mathrm{IL}_{ij})/2$, importance of utility facet i
$\mathrm{IG}_{ij}$ = importance of gain i from purchase of brand j
$\mathrm{IL}_{ij}$ = importance of loss i from purchase of brand j
n = number of utility facets
The scale used to measure the subjective loss-gain comparisons
($\mathrm{SLG}_{ij}$) necessary to construct the subjective difference
measure of the net perceived return construct is shown in Appendix A.
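The weighting scheme of the subjective difference model can be sketched as follows. The function name and all ratings are hypothetical, f is again assumed to be the identity, and the loss-gain comparisons are assumed to be coded on a signed scale (negative when loss outweighs gain), which the text does not specify.

```python
def subjective_net_perceived_return(i_gain, i_loss, slg):
    """Weight each subjective loss-gain comparison by facet importance.

    The facet weight is the mean of the gain and loss importance
    ratings; f is assumed to be the identity function.
    """
    if not (len(i_gain) == len(i_loss) == len(slg)):
        raise ValueError("all inputs need one rating per utility facet")
    weights = [(ig + il) / 2 for ig, il in zip(i_gain, i_loss)]
    return sum(w * s for w, s in zip(weights, slg))


# Hypothetical ratings for six utility facets.
snpr = subjective_net_perceived_return(
    i_gain=[6, 7, 3, 5, 2, 4],
    i_loss=[6, 6, 3, 7, 2, 4],
    slg=[1, 2, 0, -1, 1, 0],
)
print(snpr)
```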
Single Statement Method. Consistent with the research of Peter and
Tarpey (1975), overall perceived risk ($\mathrm{OPR}_j$) was used as a
non-difference score alternative formulation of how consumers evaluate
risks and returns associated with the hypothetical purchase of a small
4-door economy car. Of interest is the fact that Peter and Tarpey (1975)
found that overall perceived risk ($\mathrm{OPR}_j$) explained more
variation in brand preference than net perceived return for one of the
automobiles evaluated in their study.
Disconfirmation of Expectations
The disconfirmation of expectations construct has been widely used
in the study of consumer satisfaction (Prakash & Lounsbury, 1983)
and occupies a central position as a crucial intervening variable (Churchill & Surprenant, 1982). Since disconfirmation arises from
discrepancies between prior expectations and actual performance, it is
presumably the magnitude of the disconfirmation effect that generates
satisfaction and dissatisfaction (Churchill & Surprenant, 1982).
According to Oliver and Swan (1989), satisfaction is a result of these
steps: "Prior to an exchange, consumers hold attribute norms or
form attribute performance expectations. As the product is used or
service rendered, the consumer compares performance perceptions to these
prior comparison standards. Performance above the standard has been
termed positive disconfirmation, while performance below is referred to
as negative disconfirmation. The degree of incremental (dis)satisfaction
is a direct function of positive (negative) disconfirmation."
Historically, researchers have used both difference score and
non-difference score alternative measures to operationalize
disconfirmation of expectations (Tse & Wilton, 1988; Prakash &
Lounsbury, 1983; and Oliver, 1980). These include the difference score
method, the subjective difference method and the single statement
method; each is discussed with regard to disconfirmation of expectations.
Difference Score Method. The difference score measurement model
used to measure the disconfirmation of expectations associated with the
hypothetical purchase of a small 4-door economy car is:
$$\mathrm{DISC}_j = \mathrm{EXP}_j - \mathrm{PER}_j = \sum_{i=1}^{n} E_{ij} - \sum_{i=1}^{n} P_{ij}$$
where:
$\mathrm{DISC}_j$ = disconfirmation of expectations for brand j
$\mathrm{EXP}_j$ = overall expectation of brand j
$\mathrm{PER}_j$ = overall perception of brand j
$E_{ij}$ = expectation of facet i of brand j
$P_{ij}$ = perception of facet i of brand j
n = number of utility facets
The expectations and perceptions components needed to create a
difference score measure consistent with Peter and Tarpey's (1975)
net perceived return were developed and are shown in Appendix A. This
scale is consistent with those employed by others who have used
difference scores to measure disconfirmation of expectations (Tse &
Wilton, 1988; La Tour & Peat, 1979).
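A minimal sketch of scoring the difference model for disconfirmation of expectations is given below. It follows the sign convention of the model as stated in this paper (expectation minus perception); the function name and the ratings are hypothetical.

```python
def disconfirmation(expectations, perceptions):
    """Difference score: overall expectation minus overall perception.

    Each argument holds one rating per utility facet for one respondent.
    """
    if len(expectations) != len(perceptions):
        raise ValueError("need one expectation and one perception per facet")
    overall_expectation = sum(expectations)
    overall_perception = sum(perceptions)
    return overall_expectation - overall_perception


# Hypothetical ratings for six utility facets.
disc = disconfirmation([5, 6, 4, 5, 3, 4], [4, 6, 5, 3, 3, 5])
print(disc)
```

Under this convention a positive score means that expectations exceeded perceived performance.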
Subjective Difference Method. The subjective difference method for
measuring disconfirmation of expectations requires that the respondent
record a summary judgment on a "better than expected--worse than
expected" scale. The subjective difference model is:
$$\mathrm{SDISC}_j = \sum_{i=1}^{n} \mathrm{SD}_{ij}$$
where:
$\mathrm{SDISC}_j$ = disconfirmation of expectations for brand j
(subjective measure)
$\mathrm{SD}_{ij}$ = subjective disconfirmation of expectations of utility
facet i for brand j
Oliver's (1980) three-item subjective disconfirmation approach,
which measured customers' subjective disconfirmation with an
automobile dealer's service department, was used as the basis for the
development of a scale consistent with the six utility facet dimensions
used in Peter and Tarpey's (1975) net perceived return model. This
scale, measuring disconfirmation in the context of the hypothetical
purchase of a small 4-door economy car, is shown in Appendix A.
Single Statement Method. Overall perception ($\mathrm{PER}_j$) was used
as an additional measure of the disconfirmation of expectations
construct. While overall perception is not a measure of disconfirmation,
it is necessary to treat it as such in order to have the same three
methods measuring all latent traits included in the research instrument.
Otherwise, it is not possible to perform multitrait-multimethod validity
analyses.
Sample
The sampling frame consisted of undergraduate students enrolled in
business classes at a small Midwestern college. Three hundred ten
questionnaires (Appendix A) were completed.
RESULTS
The evaluation of the research hypotheses is presented in two
parts. First, we present the results and analysis of the hypothesis
concerning the reliability of difference scores. In the second part, we
present the results and analysis of the validity hypotheses.
Reliability Assessment
Coefficient Alpha Reliabilities. The reliabilities of all
non-difference score measures and the components necessary to construct
the difference score measures of the latent business constructs were
first assessed using coefficient alpha (Cronbach, 1951). The alphas
calculated for the components necessary to construct the difference
score measures of the latent business constructs along with their
variances and intercomponent correlations were then used to determine
the reliability of the difference score measures of the latent business
constructs measured in the study. The resulting alphas ranged from .72
to .83. Tables presenting all pertinent information are available upon
request from the authors. Available tables are listed in Appendix B.
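For readers who wish to reproduce this step on their own data, coefficient alpha can be computed as in the following sketch. The function name and the item scores are hypothetical; the study's own data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a scale.

    `items` is a list of k lists; each inner list holds one item's
    scores, with respondents in the same order across items.
    """
    k = len(items)
    if k < 2:
        raise ValueError("coefficient alpha needs at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # summed scale score
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical three-item scale answered by five respondents.
alpha_example = cronbach_alpha([
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
])
print(round(alpha_example, 3))
```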
Difference Score Reliability Assessment. Table 1 contains the
reliabilities, variances and intercomponent correlations used to
calculate the reliability of the difference score measure of the
disconfirmation of expectations construct along with the difference
score reliability calculated for this measure. Table 2 presents the same
information for the net perceived return construct.
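The calculation described above combines component reliabilities, variances and the intercomponent correlation. The formula is not printed in this paper; the sketch below assumes the standard expression for the reliability of a difference between two component scores, and the function name and input values are hypothetical rather than taken from Tables 1 and 2.

```python
import math


def difference_score_reliability(rel1, rel2, var1, var2, r12):
    """Reliability of the difference (component 1 minus component 2).

    rel1, rel2: component reliabilities; var1, var2: component
    variances; r12: correlation between the two components.
    Assumes the standard formula
    (var1*rel1 + var2*rel2 - 2*cov) / (var1 + var2 - 2*cov).
    """
    covariance = r12 * math.sqrt(var1 * var2)
    difference_variance = var1 + var2 - 2 * covariance
    if difference_variance <= 0:
        raise ValueError("the difference score has no variance")
    true_variance = var1 * rel1 + var2 * rel2 - 2 * covariance
    return true_variance / difference_variance


# Hypothetical inputs: equal variances, component reliabilities of 0.80.
uncorrelated = difference_score_reliability(0.80, 0.80, 1.0, 1.0, 0.0)
correlated = difference_score_reliability(0.80, 0.80, 1.0, 1.0, 0.5)
print(uncorrelated, correlated)
```

With these inputs the reliability falls from 0.80 to 0.60 as the intercomponent correlation rises from 0 to 0.5, which illustrates why positively correlated components depress difference score reliability.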
A summary of the reliabilities of the three methods used to measure
the disconfirmation of expectations construct is shown in Table 3. A
summary of the same reliability information for the net perceived return
construct is presented in Table 4.
Reliability Hypothesis Tests. Hypothesis 1 was evaluated using the
methodology proposed by Feldt, Woodruff and Salih (1987) to test the
hypothesis of equality of coefficients alpha ($H_0\colon \alpha_1 = \alpha_2$)
when the alpha estimates are based on the same sample. Note that the
difference score
reliabilities are not coefficient alpha reliabilities, but were treated
as such in order to employ the method proposed by Feldt, Woodruff and
Salih (1987).
The difference score reliabilities were compared in a pairwise
fashion with the reliabilities of each of the two non-difference score
alternative measures for each latent business construct. The following
test statistic was used (Feldt, Woodruff & Salih, 1987: 99):
$$t = \frac{(\alpha_1 - \alpha_2)\,(N - 2)^{1/2}}{\left[ 4 (1 - \alpha_1)(1 - \alpha_2)(1 - \rho^2) \right]^{1/2}} \qquad (\text{degrees of freedom} = N - 2)$$
where N = sample size, $\alpha_1$ and $\alpha_2$ are
sample coefficients alpha, and $\rho$ is the correlation between the two
summative measures developed from the sample data.
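The test statistic can be evaluated as in the sketch below. It assumes the form of the statistic whose denominator includes the factor (1 - rho squared), consistent with the definition of the correlation between the two summative measures given above; the function name and the input values are hypothetical.

```python
import math


def dependent_alpha_t(alpha1, alpha2, rho, n):
    """t statistic and degrees of freedom for H0: alpha1 == alpha2.

    alpha1, alpha2: sample coefficients alpha from the same respondents;
    rho: correlation between the two summative measures; n: sample size.
    Assumes the denominator includes the factor (1 - rho**2).
    """
    if n <= 2:
        raise ValueError("sample size must exceed 2")
    numerator = (alpha1 - alpha2) * math.sqrt(n - 2)
    denominator = math.sqrt(
        4 * (1 - alpha1) * (1 - alpha2) * (1 - rho ** 2)
    )
    return numerator / denominator, n - 2


# Hypothetical reliabilities and correlation with the study's sample size.
t_stat, dof = dependent_alpha_t(0.80, 0.70, 0.50, 310)
print(round(t_stat, 3), dof)
```

The resulting value would then be referred to a t distribution with N - 2 degrees of freedom.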
Table 5 contains a summary of the hypothesis tests of reliability
difference for the disconfirmation of expectations construct. Table 6
presents the same information for the net perceived return construct.
Examination of the four comparisons of the difference score and
non-difference score reliabilities shown in Tables 5 and 6 indicates a
significant difference in reliabilities in three of the four cases
investigated. Only the comparison of the reliabilities of the difference
score and single statement measures of the net perceived return
construct yielded no difference (p-value = 0.11). Because the other
three cases investigated show a significant difference (p-values <
0.05) in difference score and non-difference score measures, it is
maintained that there is sufficient evidence to reject Hypothesis 1.
Therefore, we conclude that there is a difference in the reliabilities
of the difference score and non-difference score measures of the latent
business constructs measured in this study.
Validity Assessment
Nomological Validity Assessment. Lower bound reliability estimates
and descriptive statistics for the items measuring overall impression
and expected resale value of the 4-door economy car are presented in
Table 7.
The investigation of Hypotheses 2a through 2d was complicated by
the fact that the correlations to be tested are not from independent
samples. This necessitated the use of a jackknife procedure to provide
unbiased estimates of the correlations (Balloun & Oumlil, 1986). A
FORTRAN program written by Balloun and Oumlil (1986) provided n
estimates (n = sample size) of the unbiased correlations, which were
used to calculate
sample means and standard deviations for each of the correlations of
interest.
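The original FORTRAN program is not reproduced in this paper. As a rough illustration of the idea, the sketch below computes the usual leave-one-out jackknife pseudo-values for a Pearson correlation, which is an assumption about the procedure rather than a description of the Balloun and Oumlil (1986) program. The function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def jackknife_pseudovalues(x, y):
    """Return n pseudo-values n*r - (n-1)*r_(-i), one per respondent."""
    n = len(x)
    full = pearson(x, y)
    pseudo = []
    for i in range(n):
        x_rest = x[:i] + x[i + 1:]  # drop respondent i
        y_rest = y[:i] + y[i + 1:]
        pseudo.append(n * full - (n - 1) * pearson(x_rest, y_rest))
    return pseudo


# Hypothetical scores for six respondents.
measure = [3, 5, 2, 6, 4, 7]
impression = [2, 5, 3, 5, 4, 6]
values = jackknife_pseudovalues(measure, impression)
print(round(mean(values), 3), round(stdev(values), 3))
```

The mean and standard deviation of the pseudo-values play the role of the sample means and standard deviations described above.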
Hypotheses 2a through 2d were evaluated with single factor analysis
of variance (ANOVA) to determine if there was any difference among the
difference score and non-difference score methods with respect to
nomological validity.
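A single-factor ANOVA of this kind can be sketched as follows, with one group of correlation estimates per measurement method. The function name and the three small groups of values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical correlation estimates for three measurement methods.
f_ratio, df_b, df_w = one_way_anova([
    [0.50, 0.55, 0.60],  # difference score method
    [0.45, 0.50, 0.55],  # subjective difference method
    [0.40, 0.50, 0.60],  # single statement method
])
print(round(f_ratio, 3), df_b, df_w)
```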
Examination of Table 8 indicates that there is no significant
difference (p-value = 0.34) among the difference score, subjective
difference and single statement measures of the net perceived return
construct with respect to their correlation with the measure of overall
impression of the 4-door economy car. Hypothesis 2a is not rejected at a
reasonable level of significance.
Table 9 reveals no significant difference (p-value = 0.08) among
the difference score, subjective difference and single statement
measures of the disconfirmation of expectations construct with respect
to their correlation with the measure of overall impression of the
4-door economy car at the 0.05 level of significance. Therefore,
Hypothesis 2b is not rejected.
Table 10 does not indicate a significant difference (p-value =
0.47) among the difference score, subjective difference and single
statement measures of the net perceived return construct with respect to
their correlation with the measure of expected resale value of the
4-door economy car. Hypothesis 2c is not rejected.
Table 11 indicates that there is no significant difference (p-value =
0.42) among the difference score, subjective difference and single
statement
measures of the disconfirmation of expectations construct with respect
to their correlation with the measure of expected resale value of the
4-door economy car. Therefore, it is concluded that Hypothesis 2d should
not be rejected.
Since Hypotheses 2a through 2d were not rejected at the 0.05 level
of significance, Hypothesis 2 was not rejected. It is concluded that
there is no difference in the nomological validity of the difference
score and non-difference score measures of the latent business
constructs measured in this study.
Convergent and Discriminant Validity Assessment. The convergent and
discriminant validity of the difference score and non-difference score
measures were evaluated using multitrait-multimethod matrix analysis
(Campbell & Fiske, 1959). Specifically, the analysis of variance
methodology suggested by Kavanagh, MacKinney and Wolins (1971) was
employed to provide structure to the multitrait-multimethod analysis of
convergent and discriminant validity and to evaluate Hypotheses 3 and 4.
The evaluation of Hypotheses 3 and 4 required that the convergent
and discriminant validity of the difference score measures of the
latent business constructs be compared with the convergent and
discriminant validity of the non-difference score measures. The
multitrait-multimethod matrix containing the difference score,
subjective difference and single statement measures of the net
perceived return and disconfirmation of expectations constructs
provided an overall view of the traits (constructs) and methods used in
the study, but it was not in a form amenable to the comparison of the
convergent and discriminant validity of the difference and
non-difference score measures necessary to evaluate Hypotheses 3 and 4.
These comparisons necessitated the construction of three additional
multitrait-multimethod matrices for the net perceived return and
disconfirmation of expectations constructs: the first containing the
difference score and subjective difference measures, the second
containing the difference score and single statement measures, and the
third containing the subjective difference and single statement
measures. Tables
containing these matrices are available upon request from the authors.
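Because the matrices themselves are not printed, the sketch below shows how a multitrait-multimethod matrix of this kind could be assembled and how its entries are classified in the Campbell and Fiske (1959) terminology. The function names, the trait and method labels, and the scores are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def classify_pair(a, b):
    """Label a pair of (trait, method) measures."""
    if a[0] == b[0]:
        return "monotrait-heteromethod"  # convergent validity entries
    if a[1] == b[1]:
        return "heterotrait-monomethod"
    return "heterotrait-heteromethod"


def mtmm_matrix(scores):
    """Map each pair of measures to (correlation, classification)."""
    return {
        (a, b): (pearson(scores[a], scores[b]), classify_pair(a, b))
        for a, b in combinations(sorted(scores), 2)
    }


# Hypothetical scores for five respondents: two traits by three methods.
scores = {
    ("NPR", "difference"): [1, 2, 3, 4, 5],
    ("NPR", "subjective"): [2, 1, 4, 3, 5],
    ("NPR", "single"): [1, 3, 2, 5, 4],
    ("DISC", "difference"): [5, 3, 4, 1, 2],
    ("DISC", "subjective"): [4, 5, 2, 3, 1],
    ("DISC", "single"): [3, 4, 5, 2, 1],
}
matrix = mtmm_matrix(scores)
kinds = [kind for _, kind in matrix.values()]
print(len(matrix), kinds.count("monotrait-heteromethod"))
```

Restricting `scores` to two of the three methods yields the pairwise matrices described above.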
Consistent with the procedure employed by Kavanagh, MacKinney and
Wolins (1971), the following three-way classification model was
hypothesized to describe the data:
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + \varepsilon_{ijk} \tag{6}$$
where:
$Y_{ijk}$ = ratings of respondents for the traits by methods
$\alpha_i$ = effect of respondent i, i = 1, 2, ..., 310
$\beta_j$ = effect of trait j, j = 1, 2
$\gamma_k$ = effect of method k, k = 1, 2, 3
$\varepsilon_{ijk}$ = NID$(0, \sigma_{\varepsilon}^{2})$
Using the methodology provided, analysis of variance tables with
variance estimates and variance indexes were produced for each of the
three multitrait-multimethod matrices mentioned above and appear in
Tables 12, 13 and 14.
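The sums of squares behind such tables can be sketched directly from the three-way model with one observation per cell, where the three-way interaction serves as the error term. This is an illustrative decomposition under that assumption, not the authors' own computation, and it omits the variance indexes; the function name and data are hypothetical.

```python
def mtmm_anova(y):
    """Sums of squares, df and mean squares for respondent effects.

    y[i][j][k] is the score of respondent i on trait j by method k.
    The residual (three-way interaction) is treated as error.
    """
    n, t, m = len(y), len(y[0]), len(y[0][0])
    R, T, M = range(n), range(t), range(m)
    grand = sum(y[i][j][k] for i in R for j in T for k in M) / (n * t * m)
    r = [sum(y[i][j][k] for j in T for k in M) / (t * m) for i in R]
    tr = [sum(y[i][j][k] for i in R for k in M) / (n * m) for j in T]
    me = [sum(y[i][j][k] for i in R for j in T) / (n * t) for k in M]
    rt = [[sum(y[i][j]) / m for j in T] for i in R]
    rm = [[sum(y[i][j][k] for j in T) / t for k in M] for i in R]
    tm = [[sum(y[i][j][k] for i in R) / n for k in M] for j in T]
    ss = {
        "respondent": t * m * sum((r[i] - grand) ** 2 for i in R),
        "respondent_x_trait": m * sum(
            (rt[i][j] - r[i] - tr[j] + grand) ** 2 for i in R for j in T),
        "respondent_x_method": t * sum(
            (rm[i][k] - r[i] - me[k] + grand) ** 2 for i in R for k in M),
        "error": sum(
            (y[i][j][k] - rt[i][j] - rm[i][k] - tm[j][k]
             + r[i] + tr[j] + me[k] - grand) ** 2
            for i in R for j in T for k in M),
    }
    df = {
        "respondent": n - 1,
        "respondent_x_trait": (n - 1) * (t - 1),
        "respondent_x_method": (n - 1) * (m - 1),
        "error": (n - 1) * (t - 1) * (m - 1),
    }
    ms = {name: ss[name] / df[name] for name in ss}
    return ss, df, ms


# Hypothetical purely additive data: 3 respondents, 2 traits, 3 methods.
a, b, c = [1, 2, 4], [0, 1], [0, 2, 3]
additive = [[[a[i] + b[j] + c[k] for k in range(3)] for j in range(2)]
            for i in range(3)]
ss, df, ms = mtmm_anova(additive)
print(ss["respondent"], df)
```

Each F statistic is the ratio of the relevant mean square to the error mean square, so it is defined only when the error mean square is greater than zero, which is not the case for the purely additive example data.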
According to Kavanagh, MacKinney, and Wolins (1971), respondent
variance indicates the overall amount of agreement, or convergence,
among the measurement methods. Examination of the F statistics
associated with the respondent variance in the ANOVA tables indicates
significant convergence in each of the three cases at the 0.01 level of
significance.
However, evaluation of Hypothesis 3 requires comparison of
difference score convergent validity with that of the non-difference
score methods. Unfortunately, this comparison cannot be made with a test
statistic. As suggested by Kavanagh, MacKinney and Wolins (1971),
variance indexes were used for this comparison. Tables 12 and 13 contain
the variance indexes for the difference score method's convergent
validity with the subjective difference and single statement measures,
respectively. These indexes are 0.37 and 0.53. Table 14 contains a
variance index of 0.43 for the convergent validity of the two
non-difference score methods. The average of the two difference score
indexes, (0.37 + 0.53)/2 = 0.45, is very close to the variance index of
the non-difference score measures, 0.43. Logically, this would lead to the
conclusion that there is no difference between the convergent validity
of the difference score measures and the convergent validity of the
non-difference score measures. Hence, Hypothesis 3 is not rejected.
Kavanagh, MacKinney and Wolins (1971) maintain that discriminant
validity is demonstrated by significant respondent by trait (construct)
variance. The F statistics associated with the respondent by trait
variance shown in the three ANOVA tables indicate significant
discriminant validity in only the two cases involving difference scores.
The multitrait-multimethod analysis of variance evaluation of the two
non-difference score methods does not show significant (F = 0.89)
discriminant validity. Therefore, it appears that the difference score
discriminates itself from the non-difference score methods better than
the two non-difference score methods discriminate between themselves.
This is taken as evidence supporting the rejection of Hypothesis 4, and
it is concluded that there is a difference in the discriminant validity
of the difference score and non-difference score measures of the latent
business constructs measured in this study.
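As an illustrative check on the discriminant validity comparison, the respondent by trait F ratios can be recomputed from the sums of squares in Tables 12 through 14. The sketch below uses only the printed (rounded) sums of squares and the shared 309 degrees of freedom, so results can differ from the tabled F values in the second decimal place. The names are hypothetical.

```python
DF = 309  # degrees of freedom reported for every source in Tables 12-14

# (respondent-by-trait sum of squares, error sum of squares) per table
sums_of_squares = {
    "difference score vs. subjective difference": (259, 176),
    "difference score vs. single statement": (276, 125),
    "subjective difference vs. single statement": (124, 140),
}

def f_ratio(ss_effect, ss_error, df_effect=DF, df_error=DF):
    """F ratio as the effect mean square divided by the error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)

f_ratios = {
    pair: f_ratio(ss_rt, ss_e) for pair, (ss_rt, ss_e) in sums_of_squares.items()
}

for pair, f in f_ratios.items():
    flag = " (below 1: interaction smaller than error)" if f < 1 else ""
    print(f"{pair}: F = {f:.2f}{flag}")
```

An F ratio below 1 means the respondent by trait mean square is smaller than the error mean square, which is why the comparison of the two non-difference score methods cannot show significant discriminant validity.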
DISCUSSION
This research was focused on reliability and validity issues
concerning the use of difference scores to measure latent business
constructs. Four separate comparisons of difference score and
non-difference score reliabilities were made. Only one of the four
comparisons, that of the difference score and the single statement
measures of the net perceived return construct (p-value = 0.11),
indicates that there is not a statistically significant difference in
the reliability of difference score and non-difference score measures.
This was considered to be sufficient evidence to conclude that
difference score and non-difference score measures were different with
respect to reliability. This finding lends support to the idea that
difference score measures have lower reliabilities than their
non-difference score alternatives (Peter, Churchill & Brown, 1993;
Johns, 1981). However, it should be noted that only the reliability
calculated for the difference score measure of the disconfirmation of
expectations construct would be considered low by most researchers. All
other measures used in the study had reliabilities above Nunnally's
(1978) suggested standard of 0.70. Moreover, the reliability of 0.80
calculated for the difference score measure of the net perceived return
construct is certainly high enough for practical research applications.
The nomological validity of the difference score and non-difference
score measures of the latent business constructs was evaluated by
testing for homogeneity of correlation with two theoretically related
constructs: overall evaluation of the 4-door economy car and expected
resale value of the 4-door economy car. All hypothesis tests revealed no
difference among the measurement methods with respect to correlation
with the theoretically related constructs, supporting the conclusion that
the difference score and non-difference score measures used in this study
are not different in nomological validity.
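The homogeneity tests summarized in Tables 8 through 10 rest on a one-way ANOVA across the three measurement methods. The following sketch recomputes the degrees of freedom and F ratios from the printed sums of squares. It assumes three methods with 310 respondents each, and it takes the Method sum of squares for Table 10 as 2.08, the value implied by the printed mean square of 1.04 on 2 degrees of freedom. All names are illustrative.

```python
N_METHODS = 3       # single statement, difference score, subjective difference
N_PER_METHOD = 310  # sample size reported for each method

df_method = N_METHODS - 1                  # between-method degrees of freedom
df_error = N_METHODS * (N_PER_METHOD - 1)  # within-method degrees of freedom

# (Method sum of squares, Error sum of squares) from each ANOVA table
anova_inputs = {
    "Table 8": (3.08, 1338.55),
    "Table 9": (8.79, 1584.14),
    "Table 10": (2.08, 1292.10),  # 2.08 is an assumption implied by MS = 1.04
}

def one_way_f(ss_method, ss_error):
    """F ratio for a one-way ANOVA given the two sums of squares."""
    return (ss_method / df_method) / (ss_error / df_error)

f_values = {table: one_way_f(*ss) for table, ss in anova_inputs.items()}

for table, f in f_values.items():
    print(f"{table}: F({df_method}, {df_error}) = {f:.2f}")
```

The recomputed ratios are small in every case, which is in line with the reported failure to reject the null hypothesis of equal correlations across methods.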
Multitrait-multimethod ANOVA analyses indicate that all measures
exhibited significant convergent validity at the 0.01 level of
significance. Comparison of variance indexes did not provide the evidence
necessary to reject the hypothesis that difference score and
non-difference score measures do not differ with respect to convergent
validity.
The results of this study indicate that the difference score and
non-difference score measures investigated are different with respect to
discriminant validity. In fact, multitrait-multimethod ANOVA analysis
indicates that the difference score measures exhibited better
discriminant validity than the non-difference score measures. These
findings conflict with literature maintaining that difference score
measures have lower discriminant validity than do non-difference score
measures (Peter, Churchill & Brown, 1993; Johns, 1981). The overall
implication is that difference score measures should be considered as
viable measurement alternatives, and must be given careful consideration
when their use is warranted on theoretical grounds.
LIMITATIONS OF THE STUDY
This study is limited by two major factors: (1) the generalizability
of the sample to the population of difference score measures; and (2)
the error present in the analyses. The net perceived return and
disconfirmation of expectations constructs measured are but two of many
latent business constructs that have been measured using difference
score measures. The results from the evaluation of measures of two
latent business constructs cannot be used to make inferences about the
entire population of difference score measures of latent business
constructs. The proper measurement technique should be determined for
each unique situation. Difference score methods should not be discounted
as viable alternatives for the comparison of two constructs or traits,
when warranted, without thorough empirical investigation.
The variance estimates developed in the multitrait-multimethod
ANOVA analyses indicate that the amount of error variance is large in
all three of the multitrait-multimethod matrix analyses. This means that
the responses are very much dependent upon unknown sources of variation,
and that any interpretation of results must be made with caution.
SUGGESTIONS FOR FUTURE RESEARCH
Given that the results of this study are not consistent with the
conventional wisdom concerning the use of difference scores to measure
latent constructs, it would seem prudent to replicate the study. First,
an identical study could be undertaken to provide further validation of
the results. Additional studies could also be performed to determine
situations in which difference scores should or should not be used.
Additionally, an effort should be made to develop research projects
that allow for the use of one of the more complex multitrait-multimethod
analyses proposed by Bagozzi and Yi (1993, 1991) for the analysis of
convergent and discriminant validity of difference score measures and
non-difference score alternatives. It should be noted, however, that the
proposals by Bagozzi and Yi (1993, 1991) require that more than three
traits be measured by more than three methods, which would make such a
study very difficult to design and execute.
APPENDIX A
Scales used in this study
NET PERCEIVED RETURN
Improbable Probable
1. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a financial loss for me because of such things as its
poor warranty, high maintenance costs, and/or high monthly payments.
2. As far as I'm concerned, if this financial loss happened to me it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
3. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a social loss for me because my friends and relatives
would think less highly of me.
4. As far as I'm concerned if this social loss happened to me, it would
be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
5. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a performance loss for me because it would run extremely
poorly.
6. As far as I'm concerned, if this performance loss happened to me, it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
7. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a psychological loss for me because it would not fit well
with my self-image or self-concept (i.e., the way I think about myself).
8. As far as I'm concerned if this psychological loss happened to me,
it would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
9. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a physical loss for me because it would not be very
safe or would become unsafe.
10. As far as I'm concerned, if this physical loss happened to me, it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
11. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a loss of convenience for me because I would have to
waste a lot of time and effort getting it adjusted and repaired.
12. As far as I'm concerned, if this loss of convenience happened to
me, it would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
13. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a financial gain for me because of such things as its
fine warranty, low maintenance costs, and/or reasonable monthly
payments.
14. As far as I'm concerned, if this financial gain happened to me it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
15. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a social gain for me because my friends and relatives
would think more highly of me.
16. As far as I'm concerned if this social gain happened to me, it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
17. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a performance gain for me because it would run extremely
well.
18. As far as I'm concerned, if this performance gain happened to me,
it would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
19. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a psychological gain for me because it would fit in well
with my self-image or self-concept (i.e., the way I think about
myself).
20. As far as I'm concerned if this psychological gain happened to me,
it would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
21. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a physical gain for me because it would be very safe and
would remain safe.
22. As far as I'm concerned, if this physical gain happened to me, it
would be
1 2 3 4 5 6 7
Unimportant Important
Improbable Probable
23. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a gain in convenience for me because I would not have to
waste much time and effort getting it adjusted and repaired.
24. As far as I'm concerned, if this gain in convenience happened to
me, it would be
1 2 3 4 5 6 7
Unimportant Important
NET PERCEIVED RETURN--SUBJECTIVE COMPARISON
1. The purchase of car x is 1 2 3 4 5 6 7 to result in financial gain
for me than it is a financial loss for me due to such things as its
good warranty, low maintenance costs, and/or low monthly payments.
2. The purchase of car x is 1 2 3 4 5 6 7 to result in a social gain
for me than it is a social loss for me because my friends and relatives
will think more highly of me.
3. The purchase of car x is 1 2 3 4 5 6 7 to result in a performance
gain for me than it is a performance loss for me because the vehicle
would run extremely well.
4. The purchase of car x is 1 2 3 4 5 6 7 to result in a psychological
gain for me than it is a psychological loss for me because the vehicle
would fit in well with my self-image or self-concept (i.e., the way I
think about myself).
5. The purchase of car x is 1 2 3 4 5 6 7 to result in a physical gain
for me than it is a physical loss for me because it would be very safe
and would remain safe.
6. The purchase of car x is 1 2 3 4 5 6 7 to result in a gain in
convenience for me than it is a loss of convenience for me because I
would not have to waste much time and effort getting it adjusted and
repaired.
Measured on a scale of 1 to 7, where: 1 = Less Likely, 7 = More Likely
SUBJECTIVE DISCONFIRMATION OF EXPECTATIONS
1. Car x would be a 1 2 3 4 5 6 7 purchase than the typical 4-door
economy car because of such things as its good warranty, low
maintenance costs, and/or low monthly payments.
2. My friends and relatives would think 1 2 3 4 5 6 7 highly of me if
I purchased car x rather than a typical 4-door economy car.
3. Car x's performance (i.e., the way it runs) would be 1 2 3 4 5 6 7
than the typical 4-door economy car.
4. Car x would fit my self-image or self-concept (i.e., the way I
think about myself) 1 2 3 4 5 6 7 than the typical 4-door economy car.
5. Car x would be 1 2 3 4 5 6 7 safe than the typical 4-door economy
car.
6. Car x would inconvenience me 1 2 3 4 5 6 7 than the typical 4-door
economy car because of the time and effort necessary to get it adjusted
and repaired. (negatively scored)
Item 1 scored on a scale of 1 to 7, where: 1 = Much Poorer, 7 = Much Better
Items 3 and 4 scored on a scale of 1 to 7, where: 1 = Much Worse, 7 = Much Better
Items 2, 5 and 6 scored on a scale of 1 to 7, where: 1 = Much Less, 7 = Much More
DISCONFIRMATION OF EXPECTATIONS
Expectations
1. A typical small 4-door economy car would be a poor choice because of
such things as their poor warranties, high maintenance costs, and/or
high monthly payments.
2. I think that the purchase of a typical small 4-door economy car
would cause my friends and relatives to think less highly of me.
3. I think that the purchase of a typical small 4-door economy car
would cause a performance loss for me because it would run extremely
poorly.
4. I think that the purchase of a typical small 4-door economy car
would not fit well with my self-image or self-concept (i.e., the way I
think about myself).
5. I think that a typical small 4-door sedan would not be very safe or
would become unsafe.
6. I think that the purchase of a typical small 4-door economy car would
inconvenience me because I would have to waste a lot of time and effort
getting it adjusted and repaired.
Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 = Strongly
Agree
Perceptions
1. The purchase of car x would be a poor choice because of such things
as their poor warranties, high maintenance costs, and/or high monthly
payments.
2. Purchase of a car x would cause my friends and relatives to think
less highly of me.
3. Purchase of a car x would cause a performance loss for me because it
would run extremely poorly.
4. Purchase of a car x would not fit well with my self-image or
self-concept (i.e., the way I think about myself).
5. Car x would not be very safe or would become unsafe.
6. Purchase of a car x would inconvenience me because I would have to
waste a lot of time and effort getting it adjusted and repaired.
Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 = Strongly
Agree
ADDITIONAL ITEMS
1. If I were to purchase a small 4-door economy car, car x would be an
excellent choice.
2. I expect car x to have a very high resale value.
Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 = Strongly
Agree
APPENDIX B
TABLES AVAILABLE FROM THE AUTHORS
Regarding the disconfirmation of expectations, the following tables are
available: the tables presenting descriptive statistics, inter-item
correlations and coefficient alphas for the expectation ([E.sub.i]) and
perception ([P.sub.j]) items summed to develop the expectation (EXP) and
perception (PER) components of the difference score measure of the
disconfirmation of expectations (DISC) construct; and the table
presenting the descriptive statistics, inter-item correlations and
coefficient alpha for the items ([SD.sub.i]) that were summed to form the
subjective difference measure of the disconfirmation of expectations
construct (SDISC).
Regarding net perceived return, the following tables are available: the
table presenting descriptive statistics, inter-item correlations and
alphas for the [RET.sub.i] measures that were summed to form the overall
perceived return (OPRe) component used in the construction of the
difference score measure of the net perceived return (NPRe) construct;
and the tables presenting the descriptive statistics, inter-item
correlations and alphas for the probability of gain ([PG.sub.i]) and
importance of gain ([IG.sub.i]) measures that were used to calculate the
[RET.sub.i] items. Note that the probability of gain ([PG.sub.i]) and
importance of gain ([IG.sub.i]) items are not used as summative scales in
the study; their coefficient alphas merely provide consistent
information concerning all measures used in the study. Also available
are the table presenting the descriptive statistics, inter-item
correlations and alphas for the [RSK.sub.i] measures that were summed to
create the overall perceived risk (OPR) component necessary to construct
the difference score measure of the net perceived return (NPRe)
construct; and the tables presenting the descriptive statistics,
inter-item correlations and alphas for the probability of loss
([PL.sub.i]) and importance of loss ([IL.sub.i]) measures that were used
to calculate the [RSK.sub.i] items. Again, note that the probability of
loss ([PL.sub.i]) and importance of loss ([IL.sub.i]) items are not used
as summative scales in the study; their coefficient alphas merely provide
consistent information concerning these measures. Finally, the table
presenting the descriptive statistics, inter-item correlations and alpha
for the [ISLG.sub.ij] measures that were summed to form the subjective
difference measure of the net perceived return construct
([SNPRe.sub.j]) is available, as are the tables presenting the
descriptive statistics, inter-item correlations and alpha for the
[I.sub.ij] and [SLG.sub.ij] items used to construct the [ISLG.sub.ij]
measures.
REFERENCES
Bagozzi, R.P. (1990). Structural equation models in marketing
research. In W.D. Neal (ed.), Proceedings of the First Annual Advanced
Research Techniques Forum. Chicago: American Marketing Association.
Bagozzi, R.P. & Y. Yi (1991). Multitrait-multimethod matrices
in consumer research. Journal of Consumer Research, 17(4), 426-439.
Balloun, J.L. & A.B. Oumlil (1986). Jackknife: A general
purpose program for multivariate jackknife analyses. Behavior Research
Methods, Instruments and Computers, 18(1), 47-49.
Bernadin, H.J. & K.M. Alvares (1975). The effects of
organizational level on perception of role conflict resolution strategy.
Organizational Behavior and Human Performance, 14(1), 1-9.
Campbell, D.T. & D.W. Fiske (1959). Convergent and discriminant
validity by the multitrait-multimethod matrix. Psychological Bulletin,
56, 81-105.
Churchill, G.A. & C. Surprenant (1982). An investigation into
the determinants of customer satisfaction. Journal of Marketing
Research, 19(4), 491-504.
Cronbach, L.J. (1951). Coefficient alpha and the internal structure
of tests. Psychometrika, 16, 297-334.
Cronbach, L.J. & G.C. Gleser (1953). Assessing similarity between
profiles. Psychological Bulletin, 50, 456-473.
Cronbach, L.J. & L. Furby (1970). How should we measure
change--or should we? Psychological Bulletin, 74(1), 68-80.
Feldt, L.S., D.J. Woodruff & F.A. Salih (1991).
Statistical inference for coefficient alpha. Applied Psychological
Measurement, 11(1), 93-103.
Frank, L.L. & J.R. Hackman (1975). Effects of
interviewer-interviewee similarity on objectivity in college admissions.
Journal of Applied Psychology, 60(3), 356-360.
Green, C.N. & D.W. Organ (1973). An evaluation of causal models
linking the received role with job satisfaction. Administrative Science
Quarterly, 18(1), 95-103.
Johns, Gary (1981). Difference score measures of organizational
behavior variables: A critique. Organizational Behavior and Human
Decision Processes, 27(3), 443-463.
Kavanagh, M.J., A.C. MacKinney & L. Wolins (1971). Issues in
managerial performance: Multitrait-multimethod analyses of ratings.
Psychological Bulletin, 75(1), 34-49.
LaTour, S.A. & N.C. Peat (1979). Conceptual and methodological
issues in satisfaction research. Advances in Consumer Research, 6(1),
431-437.
Lord, F.M. (1958). The utilization of unreliable difference scores.
Journal of Educational Psychology, 49(3), 150-152.
Mosier, C.I. (1951). Batteries and profiles. In E.F. Lindquist (ed.),
Educational Measurement. Washington, D.C.: American Council on
Education.
Nunnally, J.C. (1959). Tests and Measurement. New York: McGraw-Hill.
Nunnally, J.C. (1978). Psychometric Theory. New York: McGraw-Hill.
Ogden, J.P. (1990). Turn-of-month evaluations of liquid profits and
stock returns: A common explanation of the monthly and January effects.
Journal of Finance, 45(4), 1259-1272.
Oliver, R.L. & J.E. Swan (1989). Equity and disconfirmation
perceptions as influences on merchant and product satisfaction. Journal
of Consumer Research, 16(3), 372-383.
Peter, J.P., G.A. Churchill, Jr. & T.J. Brown (1993). Caution
in the use of difference scores in consumer research. Journal of
Consumer Research, 19(4), 655-662.
Peter, J.P. (1981). Construct validity: A review of basic issues
and marketing practices. Journal of Marketing Research, 18(2), 133-145.
Peter, J. P. & L.X. Tarpey, Sr. (1975). A comparative analysis
of three consumer decision strategies. Journal of Consumer Research,
2(1), 29-37.
Prakash, V. & J.W. Lounsbury (1983). A reliability problem in
the measurement of disconfirmation of expectations. Advances in Consumer
Research, 10(1), 244-249.
Senger, J. (1971). Managers' perceptions of subordinates'
competence as a function of personal value orientations. Academy of
Management Journal, 14(4), 415-423.
Tse, D.K. & P.C. Wilton (1988). Models of consumer satisfaction
formation: An extension. Journal of Marketing Research, 25(2), 204-212.
Thomas Alexander Vernon, Missouri Southern State University
R. Anthony Inman, Louisiana Tech University
Gene Brown, University of Missouri-Kansas City
Table 1
Inputs Necessary to Calculate Reliability of Difference Score Measure
of Disconfirmation of Expectations Construct
Component Reliability Variance
Expectations 0.79 45.04
Perceptions 0.81 42.29
Intercomponent Correlation = 0.58
Difference Score Reliability = 0.52
Table 2
Inputs Necessary to Calculate Reliability of Difference Score
Measure of Net Perceived Return Construct
Component Reliability Variance
Overall Perceived Return 0.80 2474.82
Overall Perceived Risk 0.77 1969.33
Intercomponent Correlation = -0.07
Difference Score Reliability = 0.80
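The difference score reliabilities in Tables 1 and 2 can be reproduced from the component reliabilities, component variances and intercomponent correlation. The sketch below assumes the standard formula for the reliability of a difference between two components, as discussed in sources such as Peter, Churchill and Brown (1993); that the authors used exactly this formula is an assumption, although it does reproduce both tabled values. Function and argument names are illustrative.

```python
import math

def difference_score_reliability(rel_1, var_1, rel_2, var_2, r_12):
    """Reliability of the difference between two component scores.

    Assumed formula:
        (var_1*rel_1 + var_2*rel_2 - 2*r_12*sd_1*sd_2)
        / (var_1 + var_2 - 2*r_12*sd_1*sd_2)
    """
    covariance_term = 2 * r_12 * math.sqrt(var_1 * var_2)
    true_variance = var_1 * rel_1 + var_2 * rel_2 - covariance_term
    total_variance = var_1 + var_2 - covariance_term
    return true_variance / total_variance

# Table 1: disconfirmation of expectations (expectations, perceptions)
disc = difference_score_reliability(0.79, 45.04, 0.81, 42.29, 0.58)

# Table 2: net perceived return (overall perceived return, overall perceived risk)
npr = difference_score_reliability(0.80, 2474.82, 0.77, 1969.33, -0.07)

print(f"Disconfirmation of expectations: {disc:.2f}")
print(f"Net perceived return: {npr:.2f}")
```

The contrast between the two results illustrates the role of the intercomponent correlation: with reliable components, a strongly positive correlation (0.58) pulls the difference score reliability down, whereas a correlation near zero (-0.07) leaves it close to the component reliabilities.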
Table 3
Summary of Reliabilities for the Three Methods Used to Measure the
Disconfirmation of Expectations Construct
Method Reliability
Difference Score Method 0.52
Subjective Difference Method 0.72
Single Statement Method 0.81
Table 4
Summary of Reliabilities for the Three Methods Used to Measure
the Net Perceived Return Construct
Method Reliability
Difference Score Method 0.80
Subjective Difference Method 0.83
Single Statement Method 0.77
Table 5
Hypothesis Test of Reliability Difference For Disconfirmation of
Expectations Construct
Difference Score (#1) vs. Subjective Difference (#2)
[H.sub.0]: [A.sub.1] = [A.sub.2]
[H.sub.1]: [A.sub.1] [not equal to] [A.sub.2]
Inputs for t calculation: [[alpha].sub.1] = 0.52
                          [[alpha].sub.2] = 0.72
                          [rho] = -0.16
                          N = 310
Results: t = -4.85
         DOF = 308
         p-value = 0.000
Difference Score (#1) vs. Single Statement (#3)
[H.sub.0]: [A.sub.1] = [A.sub.3]
[H.sub.1]: [A.sub.1] [not equal to] [A.sub.3]
Inputs for t calculation: [[alpha].sub.1] = 0.52
                          [[alpha].sub.3] = 0.81
                          [rho] = 0.43
                          N = 310
Results: t = -9.36
         DOF = 308
         p-value = 0.000
Note: Numbers have been rounded to 2 places for illustrative purposes.
Greater than 2 place accuracy was used in all calculations.
Table 6
Hypothesis Test of Reliability Difference For Net Perceived
Return Construct
Difference Score (#1) vs. Subjective Difference (#2)
[H.sub.0]: [A.sub.1] = [A.sub.2]
[H.sub.1]: [A.sub.1] [not equal to] [A.sub.2]
Inputs for t calculation: [[alpha].sub.1] = 0.80
                          [[alpha].sub.2] = 0.83
                          [rho] = -0.58
                          N = 310
Results: t = -2.03
         DOF = 308
         p-value = 0.04
Difference Score (#1) vs. Single Statement (#3)
[H.sub.0]: [A.sub.1] = [A.sub.3]
[H.sub.1]: [A.sub.1] [not equal to] [A.sub.3]
Inputs for t calculation: [[alpha].sub.1] = 0.80
                          [[alpha].sub.3] = 0.77
                          [rho] = -0.69
                          N = 310
Results: t = 1.62
         DOF = 308
         p-value = 0.11
Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all calculations.
Table 7
Nomological Validity Investigation, Descriptive Statistics and Lower
Bound of Reliability Estimates
                                      Overall      Expected
                                      Impression   Resale Value
Mean                                  4.21         3.68
Standard Deviation                    1.51         1.52
Skewness                              -0.21        0.21
Kurtosis                              -0.27        -1.24
Lower Bound of Reliability Estimate   0.27         0.14
Note: Lower bounds of the reliability estimates are the coefficients of
multiple determination ([R.sup.2]) resulting from the regression of the
overall impression and expected resale value variables on all
measures used in the study.
Table 8: Nomological Validity Investigation--Hypothesis 5
[H.sub.0]: [p.sub.sin] = [p.sub.dif] = [p.sub.sub]
[H.sub.1]: not all [p.sub.i] are the same
ANOVA TABLE
Source   Sum of Squares   DF    Mean-Square   F      p-value
Method   3.08             2     1.54          1.07   0.34
Error    1338.55          927   1.44
Results: Fail to reject [H.sub.0]
                     [p.sub.sin]   [p.sub.dif]   [p.sub.sub]
Mean                 0.28          0.32          0.18
Standard Deviation   1.21          1.23          1.16
Sample Size          310           310           310
Note: Numbers have been rounded to 2 places for illustrative purposes.
Greater than 2 place accuracy was used in all calculations.
sin = single statement method, dif = difference score method,
sub = subjective difference method
Table 9: Nomological Validity Investigation--Hypothesis 6
[H.sub.0]: [p.sub.sin] = [p.sub.dif] = [p.sub.sub]
[H.sub.1]: not all [p.sub.i] are the same
ANOVA TABLE
Source   Sum of Squares   DF    Mean-Square   F      p-value
Method   8.79             2     4.39          2.57   0.08
Error    1584.14          927   1.71
Results: Fail to reject [H.sub.0]
                     [p.sub.sin]   [p.sub.dif]   [p.sub.sub]
Mean                 0.39          0.25          0.49
Standard Deviation   1.22          1.06          1.59
Sample Size          310           310           310
Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.
sin = single statement method, dif = difference score method,
sub = subjective difference method
Table 10: Nomological Validity Investigation--Hypothesis 7
[H.sub.0]: [p.sub.sin] = [p.sub.dif] = [p.sub.sub]
[H.sub.1]: not all [p.sub.i] are the same
ANOVA TABLE
Source   Sum of Squares   DF    Mean-Square   F      p-value
Method   2.08             2     1.04          0.75   0.47
Error    1292.10          927   1.39
Results: Fail to reject [H.sub.0]
                     [p.sub.sin]   [p.sub.dif]   [p.sub.sub]
Mean                 0.20          0.30          0.21
Standard Deviation   1.21          1.15          1.23
Sample Size          310           310           310
Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.
sin = single statement method, dif = difference score method,
sub = subjective difference method
Table 11: Nomological Validity Investigation--Hypothesis 8
[H.sub.0]: [p.sub.sin] = [p.sub.dif] = [p.sub.sub]
[H.sub.1]: not all [p.sub.i] are the same
ANOVA TABLE
Source   Sum of Squares   DF    Mean-Square   F      p-value
Method   3.08             2     1.54          1.07   0.34
Error    1338.55          927   1.44
Results: Fail to reject [H.sub.0]
                     [p.sub.sin]   [p.sub.dif]   [p.sub.sub]
Mean                 2.42          0.20          0.32
Standard Deviation   1.16          1.08          1.30
Sample Size          310           310           310
Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.
sin = single statement method, dif = difference score method,
sub = subjective difference method
Table 12
Multitrait-Multimethod Matrix Analysis of Variance Table for
Difference Score and Subjective Difference Measures
Source            DF    SS    MS     F      Variance   Index
R (respondents)   309   586   1.90   3.32   0.33       .37
R X T (traits)    309   259   0.84   1.47   0.13       .19
R X M (methods)   309   219   0.71   1.24   0.07       .11
E (error)         309   176   0.57          0.57
N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2
Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).
Table 13
Multitrait-Multimethod Matrix Analysis of Variance Table for
Difference Score and Single Statement Measures
Source            DF    SS    MS     F      Variance   Index
R (respondents)   309   693   2.24   5.53   0.46       .53
R X T (traits)    309   276   0.89   2.20   0.24       .38
R X M (methods)   309   145   0.47   1.16   0.03       .07
E (error)         309   125   0.41          0.41
N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2
Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).
Table 14
Multitrait-Multimethod Matrix Analysis of Variance Table for
Subjective Difference and Single Statement Measures
Source            DF    SS    MS     F      Variance   Index
R (respondents)   309   569   1.82   4.08   0.35       .43
R X T (traits)    309   124   0.40   0.89   0.00       .00
R X M (methods)   309   407   1.32   2.91   0.43       .49
E (error)         309   140   0.45          0.45
N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2
Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).