Article information

Title: An analysis of difference score measures of latent business constructs.
Authors: Vernon, Thomas Alexander; Inman, R. Anthony; Brown, Gene; et al.
Journal: Academy of Information and Management Sciences Journal
Print ISSN: 1524-7252
Year of publication: 2004
Issue: July
Language: English
Publisher: The DreamCatchers Group, LLC
Abstract: This study empirically investigates the reliability and construct validity of difference score measures of latent business constructs in comparison to that of non-difference score measures of the same constructs. The results provide evidence that questions the conventional wisdom that the use of difference scores should be avoided whenever possible. In addition, the thorough empirical examination of reliability and construct validity provides a framework for practitioners who wish to properly evaluate different measurement techniques.

INTRODUCTION

Difference scores are created when one measure is subtracted from another to create a measure of a distinct construct (Peter, Churchill & Brown, 1993). Several researchers (Peter, Churchill & Brown, 1993; Johns, 1981; Cronbach & Furby, 1970; Nunnally, 1959; and Mosier, 1951) have cited potential problems with difference scores, such as problems with reliability, discriminant validity, spurious correlation and variance restriction, which can cause them to perform poorly as measures of latent constructs. There is clearly a need for empirical study of the possible problems encountered when using difference scores to measure latent business constructs. This paper assesses the reliability and validity of difference score measures of latent business constructs in comparison to the reliability and validity of non-difference score alternatives purported to measure the same latent business constructs.

Difference Scores

A significant portion of the literature on difference score measures is in the area of testing and measurement (Cronbach & Furby, 1970; Lord, 1958; and Mosier, 1951). In the business area, difference scores have been used mostly in behavior-oriented fields and are generally concerned with the measurement of latent constructs, although there have been applications in economics and finance (Ogden, 1990).

The most common methods of expressing difference scores are simple absolute differences, differences between profiles and signed (algebraic) differences (Johns, 1981). Simple absolute differences are considered "simple" in that their components consist of single-item scores or summary scores derived from a scale of items (Johns, 1981). The term absolute implies that the direction of the difference is not important, only the magnitude.

Profiles are graphic summaries of multiple-item measures which retain the identity of each item until all are combined into a summary difference measure (Mosier, 1951). Use of profile differences requires the assumption that the direction of the difference is not important. However, unlike simple absolute differences, profiles have components that consist of more than a single variable (Johns, 1981).

Johns (1981) related that differences between profiles are commonly expressed as:

1 Sum of the absolute differences between parallel profile points to obtain an index of dissimilarity (Bernardin & Alvares, 1975; Green & Organ, 1973).

2 Sum of the squares of the absolute differences between parallel profile points to obtain the index of profile dissimilarity, or $D^2$ (Cronbach & Gleser, 1953).

3 Square root of the $D^2$ index (Frank & Hackman, 1975; Senger, 1971).

Algebraic, or signed, differences are formed when the direction of the difference is maintained, allowing researchers to consider both magnitude and direction of differences.
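The three profile dissimilarity indexes listed above, together with a signed difference, can be sketched in a few lines of Python. The function names and the four-point ratings are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def profile_indexes(profile_a, profile_b):
    """Return (sum of absolute differences, D^2, D) for two parallel profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of points")
    diffs = [a - b for a, b in zip(profile_a, profile_b)]
    sum_abs = sum(abs(d) for d in diffs)   # index of dissimilarity
    d_squared = sum(d * d for d in diffs)  # D^2
    return sum_abs, d_squared, math.sqrt(d_squared)


def signed_difference(score_a, score_b):
    """Algebraic difference: keeps direction as well as magnitude."""
    return score_a - score_b


# Hypothetical ratings on four profile points.
ratings_a = [5, 3, 4, 2]
ratings_b = [2, 3, 8, 2]
print(profile_indexes(ratings_a, ratings_b))  # (7, 25, 5.0)
print(signed_difference(2, 5))                # -3
```

Note that the three profile indexes discard the sign of each pointwise difference, whereas the signed difference keeps it.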

While researchers often employ various weighting strategies in the calculation of difference scores, these are primarily study-specific and will not be addressed in this paper. However, the method used to combine component parts into difference score measures must be considered carefully prior to any statistical analysis employed using difference scores as inputs. Peter, Churchill and Brown (1993) describe four areas in which problems may occur with the use of difference scores. These include:

Reliability--Problems with reliability of difference scores are primarily focused in:

1 The inherently low reliability of difference scores. Difference score reliability is usually lower than the reliabilities of their component variables (Peter, Churchill, & Brown, 1993; Prakash & Lounsbury, 1983; Johns, 1981; and Mosier, 1951).

2 The failure of researchers to report the reliabilities of difference scores, or to calculate them correctly. In their analysis of difference score applications used in consumer research, Peter, Churchill and Brown (1993) found that only four of 13 studies attempted to assess reliability, and all four did so incorrectly.

3 The effect of intercomponent correlation. The reliability of a difference score will equal the average of the reliabilities of its components only when the intercomponent correlation is zero.
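The effect of intercomponent correlation can be made concrete with a short sketch. It assumes the standard classical test theory formula for the reliability of a difference between two components, computed from the component reliabilities, variances and intercomponent correlation; the function name and the input values are hypothetical.

```python
def difference_score_reliability(rel_1, rel_2, var_1, var_2, r_12):
    """Reliability of D = X1 - X2 under the classical test theory formula (assumed here).

    rel_1, rel_2: reliabilities of the two components
    var_1, var_2: variances of the two components
    r_12:         correlation between the two components
    """
    covariance_term = 2 * r_12 * (var_1 ** 0.5) * (var_2 ** 0.5)
    denominator = var_1 + var_2 - covariance_term  # variance of the difference score
    if denominator <= 0:
        raise ValueError("the difference score has no variance")
    return (var_1 * rel_1 + var_2 * rel_2 - covariance_term) / denominator


# Hypothetical components with equal variances.
uncorrelated = difference_score_reliability(0.80, 0.70, 1.0, 1.0, 0.0)
correlated = difference_score_reliability(0.80, 0.70, 1.0, 1.0, 0.5)
print(round(uncorrelated, 3), round(correlated, 3))
```

With equal variances and zero intercomponent correlation the result is the simple average of the two component reliabilities; as the correlation rises, the reliability of the difference falls.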

Discriminant Validity--Discriminant validity is impacted from two sources:

1 Reliability effects. For various reasons, difference scores often have low reliability. It is possible that the correlations between a difference score measure and other measures may give the impression that discriminant validity standards are met simply because of low reliability.

2 Combination effects. Measures formed as linear combinations of scale scores (as difference scores are) may be indistinguishable from their components, thereby failing to demonstrate discriminant validity.

Spurious Correlation--An additional problem is the tendency of difference scores to be correlated with other measures. This is primarily due to the tendency of difference scores to be correlated with their component parts (Peter, Churchill & Brown, 1993). In addition, spurious correlation is hard to separate from legitimate correlation.
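The built-in correlation between a difference score and its own components follows directly from the definition of a correlation. As a minimal sketch, the correlation between D = X - Y and the component X can be computed from the component standard deviations and their intercorrelation; the function name and the input values are hypothetical.

```python
import math


def corr_difference_with_component(sd_x, sd_y, r_xy):
    """Correlation between D = X - Y and its first component X.

    Cov(D, X) = sd_x^2 - r_xy * sd_x * sd_y, and
    Var(D)    = sd_x^2 + sd_y^2 - 2 * r_xy * sd_x * sd_y.
    """
    var_d = sd_x ** 2 + sd_y ** 2 - 2 * r_xy * sd_x * sd_y
    if var_d <= 0:
        raise ValueError("the difference score has no variance")
    return (sd_x - r_xy * sd_y) / math.sqrt(var_d)


# Hypothetical case: two unrelated components with equal spread.
print(round(corr_difference_with_component(1.0, 1.0, 0.0), 3))
```

Even when the two components are uncorrelated with each other, the difference score correlates about 0.71 with the first component, so any variable related to that component will tend to correlate with the difference score as well.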

Variance Restriction--Restriction of the variance of a difference score measure may occur when one of the components used to calculate the difference score is consistently higher than the other (Peter, Churchill & Brown, 1993).

Alternatives to Difference Scores

Given the potential problems associated with the use of difference scores to measure latent constructs, several alternatives have been suggested by researchers (Peter, Churchill & Brown, 1993; Cronbach & Furby, 1970). Suggestions include:

1 Designing single statement measures that require the respondent to subjectively compare the two constructs whose measures were used to create the difference score measure of the latent construct (subjective difference method);

2 Reframing research questions to avoid any comparison of the constructs whose measures were used to create the difference score measure of the latent construct (single statement method), and

3 Using a multiple regression approach to determine the appropriate weight to assign to the measures of the constructs used to create the difference score measure of the latent construct, rather than "forcing" the measures into a predetermined form as is the case with the construction of difference score measures of latent constructs.

RESEARCH HYPOTHESES

To thoroughly assess the reliability and validity of difference scores, it was necessary to compare difference score and non-difference score measures of the same latent business constructs in a single study. Two latent business constructs, net perceived return and disconfirmation of expectations, were selected from the business literature because they could be used to develop a single research instrument focused around a common theme. Specifically, the net perceived return and disconfirmation of expectations associated with the hypothetical purchase of a small 4-door economy car were measured by a difference score method and two non-difference score alternative methods.

Research hypotheses are offered to facilitate the empirical analysis of difference score measures of latent business constructs. The following hypothesis is offered to provide a framework for the detailed empirical investigation of the reliability of difference score measures and their alternatives.

H1: There is no difference in the reliability of difference score and non-difference score measures of latent business constructs.

In addition to a thorough examination of the reliability of a measure, researchers should investigate the measure's nomological, convergent and discriminant validity (Peter, 1981). Hence, for difference score measures and their alternatives:

H2: There is no difference in the nomological validity of difference score and non-difference score measures of latent business constructs.

H3: There is no difference in the convergent validity of difference score and non-difference score measures of latent business constructs.

H4: There is no difference in the discriminant validity of difference score and non-difference score measures of latent business constructs.

The evaluation of a measure's nomological validity requires an empirical analysis of the measure of the construct of interest and measures of other constructs theoretically linked to the construct of interest (Peter, 1981). Therefore, the nomological validity of the difference score measures and their alternatives was investigated by examining their relationships with measures of other theoretically related constructs.

The research instrument (Appendix A) included items measuring the respondent's overall impression and expected resale value of the 4-door economy car used in the study. Overall impression and expected resale value should be significantly correlated with the net perceived return and disconfirmation of expectations associated with the hypothetical purchase of the small 4-door economy car. Investigating the relationship of the measures of net perceived return and disconfirmation of expectations with the variables measuring overall impression and expected resale value requires four secondary hypotheses (Hypotheses 2a-2d). These hypotheses provide the framework necessary to compare the nomological validity of the difference score and non-difference score measures, and thus to investigate Hypothesis 2, in a consistent manner.

H2a: Difference score measures of net perceived return and non-difference score measures of net perceived return are not different with respect to their correlation with the overall impression of 4-door economy car x.

H2b: Difference score measures of disconfirmation of expectations and non-difference score measures of disconfirmation of expectations are not different with respect to their correlation with the overall impression of 4-door economy car x.

H2c: Difference score measures of net perceived return and non-difference score measures of net perceived return are not different with respect to their correlation with the expected resale value of 4-door economy car x.

H2d: Difference score measures of disconfirmation of expectations and non-difference score measures of disconfirmation of expectations are not different with respect to their correlation with the expected resale value of 4-door economy car x.

RESEARCH METHOD

To investigate the research hypotheses it was first necessary to determine an appropriate research methodology. Multitrait-multimethod analysis was selected as the primary method of validity assessment due to its support in the psychometric literature. The decision to use multitrait-multimethod analysis necessitated the research design employed in this study.

Multitrait-multimethod analysis requires that two or more traits, or constructs, be measured by two or more methods with a single research instrument. The two latent business constructs, net perceived return and disconfirmation of expectations, meet the conditions required for multitrait-multimethod validity analysis.

Each latent trait, or construct, was measured by three methods: one difference score method and two non-difference score alternative methods. The first non-difference score alternative, the "subjective difference method," required respondents to subjectively compare the two constructs whose measures were used to create the difference score measure of the latent construct. The second non-difference score alternative, the "single statement method," employed research questions that avoided any comparison of the two constructs used to create the difference score measure of the latent construct. A detailed discussion of the latent business constructs examined in the study, along with the scales used to operationalize the constructs, follows.

Net Perceived Return

The net perceived return construct was introduced by Peter and Tarpey (1975) as an alternative theoretical formulation of how consumers evaluate the risks and returns associated with purchase decisions. They stated that, in the context of a risk-return typology, there would appear to be three distinct strategies in terms of how consumers make decisions:

 Select the brand that minimizes expected loss (perceived risk),
 Select the brand that maximizes expected gain (perceived return), and
 Select the brand that maximizes net expected gain (net perceived return).

Peter and Tarpey (1975) investigated each alternative and concluded that the net perceived return alternative explained more variance in automobile brand preference than the other two.

Difference Score Method. Using Lewin's (1943) vector hypothesis of consumer behavior as a theoretical basis, net perceived return can be defined as the difference between overall perceived return and overall perceived risk as shown (Peter & Tarpey, 1975: 30):

$NPRe_j = f(OPRe_j - OPR_j) = f\left(\sum_{i=1}^{n} \left[(PG_{i} \times IG_{ij}) - (PL_{ij} \times IL_{ij})\right]\right)$

where:

$NPRe_j$ = net perceived return for brand j

$OPRe_j$ = overall perceived return for brand j

$OPR_j$ = overall perceived risk for brand j

$PG_{i}$ = probability of gain i from purchase of brand j

$IG_{ij}$ = importance of gain i from purchase of brand j

$PL_{ij}$ = probability of loss i from purchase of brand j

$IL_{ij}$ = importance of loss i from purchase of brand j

n = number of utility facets

Six utility facets were used in the Peter and Tarpey (1975) study. These same six are incorporated into this study and include: financial risk-return, performance risk-return, psychological risk-return, physical risk-return, social risk-return, and time risk-return. The net perceived return scale developed by Peter and Tarpey (1975) appears in Appendix A.
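As a minimal sketch, the difference score model of net perceived return can be computed for one respondent and one brand as follows. The function f is not specified in the text, so the identity function is assumed here; the function name and the facet ratings are hypothetical.

```python
def net_perceived_return(p_gain, i_gain, p_loss, i_loss):
    """Sum over utility facets of (P_gain x I_gain) - (P_loss x I_loss).

    Assumes f is the identity function; each argument holds one rating per facet.
    """
    facet_counts = {len(p_gain), len(i_gain), len(p_loss), len(i_loss)}
    if len(facet_counts) != 1:
        raise ValueError("all four rating lists must cover the same facets")
    return sum(pg * ig - pl * il
               for pg, ig, pl, il in zip(p_gain, i_gain, p_loss, i_loss))


# Hypothetical ratings for the six facets: financial, performance,
# psychological, physical, social and time.
p_gain = [5, 4, 3, 6, 2, 4]
i_gain = [6, 5, 4, 7, 3, 5]
p_loss = [2, 3, 2, 1, 4, 3]
i_loss = [5, 6, 3, 7, 2, 4]
print(net_perceived_return(p_gain, i_gain, p_loss, i_loss))  # 69
```

The weighted gains sum to 130 and the weighted losses to 61, giving a net perceived return of 69 for this hypothetical respondent.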

Subjective Difference Method. The subjective difference method of recasting a difference score into a single statement was used in the context of net perceived return to ensure that the measurement methods were consistent for the disconfirmation of expectations and net perceived return constructs. Statements were developed that required the respondents to subjectively compare the potential loss and gain associated with the hypothetical purchase of a small 4-door economy car along each of the six utility facets. These statements are referred to as "subjective loss-gain comparisons" ($SLG_{ij}$). Since the difference score measurement model of the net perceived return construct ($NPRe_j$) employs utility facet importance weights, it was necessary to create a utility facet weighting scheme for the subjective net perceived return ($SNPRe_j$) model. The importance of a gain ($IG_{ij}$) and the importance of a loss ($IL_{ij}$) were averaged to provide a weighting factor ($I_{ij}$) for each utility facet. The subjective difference model used to measure the net perceived return ($SNPRe_j$) is:

$SNPRe_j = f\left(\sum_{i=1}^{n} (I_{ij} \times SLG_{ij})\right)$

where:

$SNPRe_j$ = net perceived return for brand j (subjective measure)

$SLG_{ij}$ = subjective loss-gain comparison i from purchase of brand j

$I_{ij}$ = $(IG_{ij} + IL_{ij})/2$, importance of utility facet i

$IG_{ij}$ = importance of gain i from purchase of brand j

$IL_{ij}$ = importance of loss i from purchase of brand j

n = number of utility facets

The scale used to measure the subjective loss-gain comparisons ($SLG_{ij}$) necessary to construct the subjective difference measure of the net perceived return construct is shown in Appendix A.
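The subjective difference model can be sketched in the same way. The facet weight is the average of the gain and loss importance ratings, as described above; f is again assumed to be the identity function, and the function name and ratings are hypothetical.

```python
def subjective_net_perceived_return(i_gain, i_loss, slg):
    """Sum over facets of I x SLG, where I = (I_gain + I_loss) / 2.

    Assumes f is the identity function; each list holds one rating per facet.
    """
    if not (len(i_gain) == len(i_loss) == len(slg)):
        raise ValueError("all three rating lists must cover the same facets")
    weights = [(ig + il) / 2 for ig, il in zip(i_gain, i_loss)]
    return sum(w * s for w, s in zip(weights, slg))


# Hypothetical importance ratings and subjective loss-gain comparisons
# (negative = loss outweighs gain) for six utility facets.
i_gain = [6, 5, 4, 7, 3, 5]
i_loss = [5, 6, 3, 7, 2, 4]
slg = [1, -1, 0, 2, -2, 1]
print(subjective_net_perceived_return(i_gain, i_loss, slg))  # 13.5
```

Unlike the difference score model, no subtraction of separately measured components takes place here; the respondent makes the comparison directly and only the weighting is computed.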

Single Statement Method. Consistent with the research of Peter and Tarpey (1975), overall perceived risk ($OPR_j$) was used as a non-difference score alternative formulation of how consumers evaluate risks and returns associated with the hypothetical purchase of a small 4-door economy car. Of interest is the fact that Peter and Tarpey (1975) found that overall perceived risk ($OPR_j$) explained more variation in brand preference than net perceived return for one of the automobiles evaluated in their study.

Disconfirmation of Expectations

The disconfirmation of expectations construct has been widely used in the study of consumer satisfaction (Prakash & Lounsbury, 1983) and occupies a central position as a crucial intervening variable (Churchill & Surprenant, 1982). Since disconfirmation arises from discrepancies between prior expectations and actual performance, it is presumably the magnitude of the disconfirmation effect that generates satisfaction and dissatisfaction (Churchill & Surprenant, 1982). According to Oliver and Swan (1989), satisfaction is a result of these steps: "Prior to an exchange, consumers hold attribute norms or form attribute performance expectations. As the product is used or service rendered, the consumer compares performance perceptions to these prior comparison standards. Performance above the standard has been termed positive disconfirmation, while performance below is referred to as negative disconfirmation. The degree of incremental (dis)satisfaction is a direct function of positive (negative) disconfirmation."

Historically, researchers have used both difference score and non-difference score alternative measures to operationalize disconfirmation of expectations (Tse & Wilton, 1988; Prakash & Lounsbury, 1983; and Oliver, 1980). These include the difference score method, the subjective difference method and the single statement method. Each is discussed below with regard to disconfirmation of expectations.

Difference Score Method. The difference score measurement model used to measure the disconfirmation of expectations associated with the hypothetical purchase of a small 4-door economy car is:

$DISC_j = EXP_j - PER_j = \sum_{i=1}^{n} E_{ij} - \sum_{i=1}^{n} P_{ij}$

where:

$DISC_j$ = disconfirmation of expectations for brand j

$EXP_j$ = overall expectation of brand j

$PER_j$ = overall perception of brand j

$E_{ij}$ = expectation of facet i of brand j

$P_{ij}$ = perception of facet i of brand j

n = number of utility facets

The expectations and perceptions components needed to create a difference score measure consistent with Peter and Tarpey's (1975) net perceived return were developed and are shown in Appendix A. This scale is consistent with those employed by others who have used difference scores to measure disconfirmation of expectations (Tse & Wilton, 1988; La Tour & Peat, 1979).

Subjective Difference Method. The subjective difference method for measuring disconfirmation of expectations requires that the respondent record a summary judgment on a "better than expected--worse than expected" scale. The subjective difference model is:

$SDISC_j = \sum_{i=1}^{n} SD_{ij}$

where:

$SDISC_j$ = disconfirmation of expectations for brand j (subjective measure)

$SD_{ij}$ = subjective disconfirmation of expectations of utility facet i for brand j

Oliver's (1980) three-item subjective disconfirmation approach, which measured customers' subjective disconfirmation with an automobile dealer's service department, was used as the basis for the development of a scale consistent with the six utility facet dimensions used in Peter and Tarpey's (1975) net perceived return model. This scale, measuring disconfirmation in the context of the hypothetical purchase of a small 4-door economy car, is shown in Appendix A.
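The two disconfirmation models above reduce to simple sums. The following sketch shows both for one respondent; the function names and the six facet ratings are hypothetical.

```python
def disconfirmation(expectations, perceptions):
    """Difference score model: summed expectations minus summed perceptions."""
    if len(expectations) != len(perceptions):
        raise ValueError("expectations and perceptions must cover the same facets")
    return sum(expectations) - sum(perceptions)


def subjective_disconfirmation(subjective_ratings):
    """Subjective difference model: sum of the facet-level summary judgments."""
    return sum(subjective_ratings)


# Hypothetical ratings on the six utility facets.
expectations = [5, 6, 4, 5, 3, 4]
perceptions = [4, 6, 5, 3, 3, 2]
subjective = [1, 0, -1, 2, 0, 2]
print(disconfirmation(expectations, perceptions))  # 4
print(subjective_disconfirmation(subjective))      # 4
```

Because the difference score is defined as expectation minus perception, a positive value in the first function means that summed expectations exceeded summed perceptions.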

Single Statement Method. Overall perception ($PER_j$) was used as an additional measure of the disconfirmation of expectations construct. While overall perception is not a measure of disconfirmation, it is necessary to treat it as such in order to have the same three methods measuring all latent traits included in the research instrument. Otherwise, it is not possible to perform multitrait-multimethod validity analyses.

Sample

The sampling frame consisted of undergraduate students enrolled in business classes at a small Midwestern college. Three hundred ten questionnaires (Appendix A) were completed.

RESULTS

The evaluation of the research hypotheses is presented in two parts. First, we present the results and analysis of the hypothesis concerning the reliability of difference scores. Second, we present the results and analyses of the validity hypotheses.

Reliability Assessment

Coefficient Alpha Reliabilities. The reliabilities of all non-difference score measures and the components necessary to construct the difference score measures of the latent business constructs were first assessed using coefficient alpha (Cronbach, 1951). The alphas calculated for the components necessary to construct the difference score measures of the latent business constructs along with their variances and intercomponent correlations were then used to determine the reliability of the difference score measures of the latent business constructs measured in the study. The resulting alphas ranged from .72 to .83. Tables presenting all pertinent information are available upon request from the authors. Available tables are listed in Appendix B.
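For readers who wish to reproduce this kind of assessment, coefficient alpha can be computed from raw item scores with the standard library alone. This is a minimal sketch of Cronbach's (1951) formula; the function name and the response data are hypothetical and are not taken from the study.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("summed scores have no variance")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(round(coefficient_alpha(responses), 2))
```

The alphas obtained this way for the two components, together with their variances and intercorrelation, are the inputs needed for a difference score reliability calculation.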

Difference Score Reliability Assessment. Table 1 contains the reliabilities, variances and intercomponent correlations used to calculate the reliability of the difference score measure of the disconfirmation of expectations construct along with the difference score reliability calculated for this measure. Table 2 presents the same information for the net perceived return construct.

A summary of the reliabilities of the three methods used to measure the disconfirmation of expectations construct is shown in Table 3. A summary of the same reliability information for the net perceived return construct is presented in Table 4.

Reliability Hypothesis Tests. Hypothesis 1 was evaluated using the methodology proposed by Feldt, Woodruff and Salih (1987) to test the hypothesis of equality of coefficients alpha ($H_0: \alpha_1 = \alpha_2$) when the alpha estimates are based on the same sample. Note that the difference score reliabilities are not coefficient alpha reliabilities, but were treated as such in order to employ the method proposed by Feldt, Woodruff and Salih (1987).

The difference score reliabilities were compared in a pairwise fashion with the reliabilities of each of the two non-difference score alternative measures for each latent business construct. The following test statistic was used (Feldt, Woodruff & Salih, 1987: 99):

$t = \dfrac{(\alpha_1 - \alpha_2)\,(N - 2)^{1/2}}{\left[4(1 - \alpha_1)(1 - \alpha_2)(1 - \rho^2)\right]^{1/2}}$, with N - 2 degrees of freedom

where N = sample size, $\alpha_1$ and $\alpha_2$ are sample coefficients alpha, and $\rho$ is the correlation between the two summative measures developed from the sample data.
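The test statistic can be sketched as a small function. The sketch assumes that the denominator contains the term (1 - rho^2), as implied by the definition of the correlation between the two summative measures; the function name and the input values are hypothetical and are not the study's results.

```python
import math


def feldt_dependent_t(alpha_1, alpha_2, rho, n):
    """t statistic and degrees of freedom for H0: alpha_1 = alpha_2 (same sample).

    Assumes the denominator sqrt(4 (1 - alpha_1)(1 - alpha_2)(1 - rho^2)).
    """
    if n <= 2:
        raise ValueError("sample size must exceed 2")
    denominator = math.sqrt(4 * (1 - alpha_1) * (1 - alpha_2) * (1 - rho ** 2))
    if denominator == 0:
        raise ValueError("statistic is undefined when alpha or |rho| equals 1")
    t = (alpha_1 - alpha_2) * math.sqrt(n - 2) / denominator
    return t, n - 2


# Hypothetical reliabilities and intercorrelation for a sample of 310.
t_value, dof = feldt_dependent_t(0.80, 0.70, 0.50, 310)
print(round(t_value, 2), dof)
```

The resulting t value would then be compared with a t distribution on N - 2 degrees of freedom to obtain a p-value.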

Table 5 contains a summary of the hypothesis tests of reliability difference for the disconfirmation of expectations construct. Table 6 presents the same information for the net perceived return construct.

Examination of the four comparisons of the difference score and non-difference score reliabilities shown in Tables 5 and 6 indicates a significant difference in reliabilities in three of the four cases investigated. Only the comparison of the reliabilities of the difference score and single statement measures of the net perceived return construct yielded no difference (p-value = 0.11). Because the other three cases investigated show a significant difference (p-values < 0.05) in difference score and non-difference score measures, it is maintained that there is sufficient evidence to reject Hypothesis 1. Therefore, we conclude that there is a difference in the reliabilities of the difference score and non-difference score measures of the latent business constructs measured in this study.

Validity Assessment

Nomological Validity Assessment. Lower bound reliability estimates and descriptive statistics for the items measuring overall impression and expected resale value of the 4-door economy car are presented in Table 7.

The investigation of Hypotheses 2a through 2d was complicated by the fact that the correlations to be tested are not from independent samples. This necessitated the use of a jackknife procedure to provide unbiased estimates of the correlations (Balloun & Oumlil, 1986). A FORTRAN program written by Balloun and Oumlil (1986) provided n estimates (n = sample size) of unbiased correlations, which were used to calculate sample means and standard deviations for each of the correlations of interest.
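The original FORTRAN program is not reproduced in the text. As a rough sketch of how a jackknife yields n correlation estimates, the code below uses the common leave-one-out pseudo-value form, which is an assumption and may differ from the procedure Balloun and Oumlil implemented; the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def jackknife_correlations(x, y):
    """Return n pseudo-values: n * r_full - (n - 1) * r_without_case_i."""
    n = len(x)
    r_full = pearson_r(x, y)
    pseudo_values = []
    for i in range(n):
        r_without_i = pearson_r(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        pseudo_values.append(n * r_full - (n - 1) * r_without_i)
    return pseudo_values


# Hypothetical scores on a construct measure and a related variable.
measure = [1, 2, 3, 4, 5, 6]
related = [2, 1, 4, 3, 6, 5]
estimates = jackknife_correlations(measure, related)
print(len(estimates), round(mean(estimates), 3), round(stdev(estimates), 3))
```

The mean and standard deviation of the n estimates are the summary statistics that feed the subsequent comparison across measurement methods.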

Hypotheses 2a through 2d were evaluated with single factor analysis of variance (ANOVA) to determine if there was any difference among the difference score and non-difference score methods with respect to nomological validity.
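A single factor ANOVA of this kind compares the between-group and within-group mean squares. In the sketch below each group would hold the correlation estimates obtained for one measurement method; the function name and the numbers are hypothetical, and the p-value lookup against the F distribution is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((value - m) ** 2
                    for g, m in zip(groups, group_means) for value in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical values for three methods.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))  # (3.0, 2, 6)
```

A large F relative to the F distribution on the stated degrees of freedom would indicate that the methods differ in their correlation with the criterion variable.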

Examination of Table 8 indicates that there is no significant difference (p-value = 0.34) among the difference score, subjective difference and single statement measures of the net perceived return construct with respect to their correlation with the measure of overall impression of the 4-door economy car. Hypothesis 2a is not rejected at a reasonable level of significance.

Table 9 reveals no significant difference (p-value = 0.08) among the difference score, subjective difference and single statement measures of the disconfirmation of expectations construct with respect to their correlation with the measure of overall impression of the 4-door economy car at the 0.05 level of significance. Therefore, Hypothesis 2b is not rejected.

Table 10 does not indicate a significant difference (p-value = 0.47) among the difference score, subjective difference and single statement measures of the net perceived return construct with respect to their correlation with the measure of expected resale value of the 4-door economy car. Hypothesis 2c is not rejected.

Table 11 indicates that there is no difference (p-value = 0.42) among the difference, subjective difference and single statement measures of the disconfirmation of expectations construct with respect to their correlation with the measure of expected resale value of the 4-door economy car. Therefore, it is concluded that Hypothesis 2d should not be rejected.

Since Hypotheses 2a through 2d were not rejected at the 0.05 level of significance, Hypothesis 2 was not rejected. It is concluded that there is no difference in the nomological validity of the difference score and non-difference score measures of the latent business constructs measured in this study.

Convergent and Discriminant Validity Assessment. The convergent and discriminant validity of the difference score and non-difference score measures were evaluated using multitrait-multimethod matrix analysis (Campbell & Fiske, 1959). Specifically, the analysis of variance methodology suggested by Kavanagh, MacKinney and Wolins (1971) was employed to provide structure to the multitrait-multimethod analysis of convergent and discriminant validity and to evaluate Hypotheses 3 and 4.

The evaluation of Hypotheses 3 and 4 required that the convergent and discriminant validity of the difference score measures of the latent business constructs be compared with the convergent and discriminant validity of the non-difference score measures of the same constructs. The multitrait-multimethod matrix containing the difference score, subjective difference and single statement measures of the net perceived return and disconfirmation of expectations constructs provided an overall view of the traits (constructs) and methods used in the study, but it was not in a form amenable to the comparisons needed to evaluate Hypotheses 3 and 4. These comparisons necessitated the construction of three additional multitrait-multimethod matrices: the first containing the difference score and subjective difference measures of the net perceived return and disconfirmation of expectations constructs; the second containing the difference score and single statement measures of the same constructs; and the third containing the subjective difference and single statement measures of the same constructs. Tables containing these matrices are available upon request from the authors.

Consistent with the procedure employed by Kavanagh, MacKinney and Wolins (1971), the following three-way classification model was hypothesized to describe the data:

$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + \epsilon_{ijk}$ (6)

where:

$Y_{ijk}$ = ratings of respondents for the traits by methods

$\alpha_i$ = effect of respondent i = 1, 2, ..., 310

$\beta_j$ = effect of trait j = 1, 2

$\gamma_k$ = effect of method k = 1, 2, 3

$\epsilon_{ijk} \sim \mathrm{NID}(0, \sigma_{\epsilon})$

Using the methodology provided, analysis of variance tables with variance estimates and variance indexes were produced for each of the three multitrait-multimethod matrices mentioned above and appear in Tables 12, 13 and 14.
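The sums of squares behind such an analysis of variance table can be sketched for a respondent by trait by method layout with one score per cell. The sketch assumes that the residual (three-way) term serves as the error term for the F ratios; it does not compute the variance estimates or variance indexes, and the function names and data layout are hypothetical.

```python
from itertools import product


def three_way_sums_of_squares(y):
    """y[i][j][k] = score of respondent i on trait j by method k. Returns (ss, df)."""
    n_r, n_t, n_m = len(y), len(y[0]), len(y[0][0])
    cells = list(product(range(n_r), range(n_t), range(n_m)))
    grand = sum(y[i][j][k] for i, j, k in cells) / len(cells)

    def mean_over(keep):
        """Cell means after averaging out every index not in keep (0=R, 1=T, 2=M)."""
        sums, counts = {}, {}
        for idx in cells:
            key = tuple(idx[d] for d in keep)
            sums[key] = sums.get(key, 0.0) + y[idx[0]][idx[1]][idx[2]]
            counts[key] = counts.get(key, 0) + 1
        return {key: sums[key] / counts[key] for key in sums}

    m_r, m_t, m_m = mean_over((0,)), mean_over((1,)), mean_over((2,))
    m_rt, m_rm, m_tm = mean_over((0, 1)), mean_over((0, 2)), mean_over((1, 2))
    ss = {
        "respondent": n_t * n_m * sum((m - grand) ** 2 for m in m_r.values()),
        "trait": n_r * n_m * sum((m - grand) ** 2 for m in m_t.values()),
        "method": n_r * n_t * sum((m - grand) ** 2 for m in m_m.values()),
        "respondent_x_trait": n_m * sum(
            (m_rt[(i, j)] - m_r[(i,)] - m_t[(j,)] + grand) ** 2 for i, j in m_rt),
        "respondent_x_method": n_t * sum(
            (m_rm[(i, k)] - m_r[(i,)] - m_m[(k,)] + grand) ** 2 for i, k in m_rm),
        "trait_x_method": n_r * sum(
            (m_tm[(j, k)] - m_t[(j,)] - m_m[(k,)] + grand) ** 2 for j, k in m_tm),
    }
    total = sum((y[i][j][k] - grand) ** 2 for i, j, k in cells)
    ss["error"] = total - sum(ss.values())  # residual three-way term
    df = {
        "respondent": n_r - 1, "trait": n_t - 1, "method": n_m - 1,
        "respondent_x_trait": (n_r - 1) * (n_t - 1),
        "respondent_x_method": (n_r - 1) * (n_m - 1),
        "trait_x_method": (n_t - 1) * (n_m - 1),
        "error": (n_r - 1) * (n_t - 1) * (n_m - 1),
    }
    return ss, df


def f_ratio(ss, df, effect):
    """Mean square of an effect divided by the residual mean square."""
    return (ss[effect] / df[effect]) / (ss["error"] / df["error"])
```

Under this layout, `f_ratio(ss, df, "respondent")` corresponds to the test of convergence and `f_ratio(ss, df, "respondent_x_trait")` to the test of discriminant validity discussed next.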

According to Kavanagh, MacKinney, and Wolins (1971), respondent variance indicates the overall amount of agreement, or convergence, among the measurement methods. Examination of the F statistics associated with the respondent variance in the ANOVA tables indicates significant convergence in each of the three cases at the 0.01 level of significance.

However, evaluation of Hypothesis 3 requires comparison of difference score convergent validity with that of the non-difference score methods. Unfortunately, this comparison cannot be made with a test statistic. As suggested by Kavanagh, MacKinney and Wolins (1971), variance indexes were used for this comparison. Tables 12 and 13 contain the variance indexes for the difference score method's convergent validity with the subjective difference and single statement measures, respectively. These indexes are 0.37 and 0.53. Table 14 contains a variance index of 0.43 for the convergent validity of the two non-difference score methods. The average of the two difference score indexes, 0.45, is very close to the variance index of the non-difference score measures, 0.43. This leads to the conclusion that there is no difference between the convergent validity of the difference score measures and the convergent validity of the non-difference score measures. Hence, Hypothesis 3 is not rejected.

Kavanagh, MacKinney and Wolins (1971) maintain that discriminant validity is demonstrated by significant respondent by trait (construct) variance. The F statistics associated with the respondent by trait variance shown in the three ANOVA tables indicate significant discriminant validity in only the two cases involving difference scores. The multitrait-multimethod analysis of variance evaluation of the two non-difference score methods does not show significant (F = 0.89) discriminant validity. Therefore, it appears that the difference score discriminates itself from the non-difference score methods better than the two non-difference score methods discriminate between themselves. This is taken as evidence supporting the rejection of Hypothesis 4, and it is concluded that there is a difference in the discriminant validity of the difference score and non-difference score measures of the latent business constructs measured in this study.

DISCUSSION

This research focused on reliability and validity issues concerning the use of difference scores to measure latent business constructs. Four separate comparisons of difference score and non-difference score reliabilities were made. Only one of the four comparisons, that of the difference score and the single statement measures of the net perceived return construct (p-value = 0.11), failed to show a statistically significant difference in the reliability of difference score and non-difference score measures. This was considered to be sufficient evidence to conclude that difference score and non-difference score measures were different with respect to reliability. This finding lends support to the idea that difference score measures have lower reliabilities than their non-difference score alternatives (Peter, Churchill & Brown, 1993; Johns, 1981). However, it should be noted that only the reliability calculated for the difference score measure of the disconfirmation of expectations construct would be considered low by most researchers. All other measures used in the study had reliabilities above Nunnally's (1978) suggested standard of 0.70. Moreover, the reliability of 0.80 calculated for the difference score measure of the net perceived return construct is certainly high enough for practical research applications.
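The reliability comparisons summarized above can be illustrated with a short calculation. The sketch below assumes the dependent-sample t statistic for two coefficient alphas obtained from the same respondents, of the kind described by Feldt, Woodruff and Salih; the exact formula, the sign convention and the function name are assumptions made here, chosen because they reproduce the first comparison in Table 5. The inputs are the two alphas, the correlation between the two measures, and the sample size.

```python
import math


def dependent_alpha_t(alpha_1, alpha_2, rho, n):
    """t statistic and degrees of freedom for H0: alpha_1 == alpha_2.

    Assumed form: t = (W - 1) * sqrt(n - 2) / sqrt(4 * W * (1 - rho^2)),
    with W = (1 - alpha_2) / (1 - alpha_1) and n - 2 degrees of freedom.
    """
    w = (1.0 - alpha_2) / (1.0 - alpha_1)
    t = (w - 1.0) * math.sqrt(n - 2) / math.sqrt(4.0 * w * (1.0 - rho ** 2))
    return t, n - 2


# Table 5, difference score vs. subjective difference (rounded inputs).
t, dof = dependent_alpha_t(alpha_1=0.52, alpha_2=0.72, rho=-0.16, n=310)
print(f"t = {t:.2f}, DOF = {dof}")  # t = -4.85, DOF = 308
```

Because the tables report inputs rounded to two places, the other three comparisons are reproduced only approximately by this sketch.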

The nomological validity of the difference score and non-difference score measures of the latent business constructs was evaluated by testing for homogeneity of correlation with two theoretically related constructs: overall evaluation of the 4-door economy car and expected resale value of the 4-door economy car. All hypothesis tests revealed no difference among the measurement methods with respect to correlation with the theoretically related constructs, supporting the conclusion that the difference score and non-difference score measures used in this study are not different in nomological validity.
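The F statistics behind these nomological validity tests can be recomputed from the sums of squares and degrees of freedom reported in Tables 8 through 11. The following is a minimal sketch of the one-way ANOVA arithmetic only; the function name is hypothetical, and the step that produces the per-respondent values being compared is not shown because it is not described in this section.

```python
def one_way_f(ss_method, df_method, ss_error, df_error):
    """Return (mean square for method, mean square for error, F ratio)."""
    ms_method = ss_method / df_method
    ms_error = ss_error / df_error
    return ms_method, ms_error, ms_method / ms_error


# Three methods with 310 respondents each: DF = 3 - 1 = 2 and 3 * 310 - 3 = 927.
for label, ss_method, ss_error in [("Table 8", 3.08, 1338.55), ("Table 9", 8.79, 1584.14)]:
    ms_method, ms_error, f_ratio = one_way_f(ss_method, 2, ss_error, 927)
    print(f"{label}: MS method = {ms_method:.2f}, MS error = {ms_error:.2f}, F = {f_ratio:.2f}")
```

For the Table 8 and Table 9 inputs, the recomputed F ratios round to 1.07 and 2.57, in agreement with the printed values.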

Multitrait-multimethod ANOVA analyses indicate that all measures exhibited significant convergent validity at the 0.01 level of significance. Comparison of variance indexes did not provide the evidence necessary to reject the hypothesis that difference score and non-difference score measures do not differ with respect to convergent validity.

The results of this study indicate that the difference score and non-difference score measures investigated are different with respect to discriminant validity. In fact, multitrait-multimethod ANOVA analysis indicates that the difference score measures exhibited better discriminant validity than the non-difference score measures. These findings conflict with literature maintaining that difference score measures have lower discriminant validity than do non-difference score measures (Peter, Churchill & Brown, 1993; Johns, 1981). The overall implication is that difference score measures should be considered as viable measurement alternatives, and must be given careful consideration when their use is warranted on theoretical grounds.

LIMITATIONS OF THE STUDY

This study is limited by two major factors: (1) the generalizability of the sample to the population of difference score measures; and (2) the error present in the analyses. The net perceived return and disconfirmation of expectations constructs measured are but two of many latent business constructs that have been measured using difference score measures. The results from the evaluation of measures of two latent business constructs cannot be used to make inferences about the entire population of difference score measures of latent business constructs. The proper measurement technique should be determined for each unique situation. Difference score methods should not be discounted, without thorough empirical investigation, as viable alternatives when warranted for the comparison of two constructs or traits.

The variance estimates developed in the multitrait-multimethod ANOVA analyses indicate that the amount of error variance is large in all three of the multitrait-multimethod matrix analyses. This means that the responses are very much dependent upon unknown sources of variation, and that any interpretation of results must be made with caution.

SUGGESTIONS FOR FUTURE RESEARCH

Given that the results of this study are not consistent with the conventional wisdom concerning the use of difference scores to measure latent constructs, it would seem prudent to replicate the study. First, an identical study could be undertaken to provide further validation of the results. Additional studies could also be performed to determine situations in which difference scores should or should not be used.

Additionally, an effort should be made to develop research projects that allow for the use of one of the more complex multitrait-multimethod analyses proposed by Bagozzi and Yi (1993, 1991) for the analysis of convergent and discriminant validity of difference score measures and non-difference score alternatives. It should be noted, however, that the proposals by Bagozzi and Yi (1993, 1991) require that more than three traits be measured by more than three methods, which makes such a study very difficult to design and execute.

APPENDIX A

Scales used in this study

NET PERCEIVED RETURN

1. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a financial loss for me because of such things as its
poor warranty, high maintenance costs, and/or high monthly payments.

2. As far as I'm concerned, if this financial loss happened to me, it
would be 1 2 3 4 5 6 7.

3. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a social loss for me because my friends and relatives
would think less highly of me.

4. As far as I'm concerned, if this social loss happened to me, it
would be 1 2 3 4 5 6 7.

5. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a performance loss for me because it would run extremely
poorly.

6. As far as I'm concerned, if this performance loss happened to me, it
would be 1 2 3 4 5 6 7.

7. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a psychological loss for me because it would not fit well
with my self-image or self-concept (i.e., the way I think about myself).

8. As far as I'm concerned, if this psychological loss happened to me,
it would be 1 2 3 4 5 6 7.

9. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a physical loss for me because it would not be very
safe or would become unsafe.

10. As far as I'm concerned, if this physical loss happened to me, it
would be 1 2 3 4 5 6 7.

11. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a loss of convenience for me because I would have to
waste a lot of time and effort getting it adjusted and repaired.

12. As far as I'm concerned, if this loss of convenience happened to
me, it would be 1 2 3 4 5 6 7.

13. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a financial gain for me because of such things as its
fine warranty, low maintenance costs, and/or reasonable monthly
payments.

14. As far as I'm concerned, if this financial gain happened to me, it
would be 1 2 3 4 5 6 7.

15. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a social gain for me because my friends and relatives
would think more highly of me.

16. As far as I'm concerned, if this social gain happened to me, it
would be 1 2 3 4 5 6 7.

17. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a performance gain for me because it would run extremely
well.

18. As far as I'm concerned, if this performance gain happened to me,
it would be 1 2 3 4 5 6 7.

19. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a psychological gain for me because it would fit in well
with my self-image or self-concept (i.e., the way I think about
myself).

20. As far as I'm concerned, if this psychological gain happened to me,
it would be 1 2 3 4 5 6 7.

21. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a physical gain for me because it would be very safe and
would remain safe.

22. As far as I'm concerned, if this physical gain happened to me, it
would be 1 2 3 4 5 6 7.

23. I think that it is 1 2 3 4 5 6 7 that the purchase of a (brand)
would lead to a gain in convenience for me because I would not have to
waste much time and effort getting it adjusted and repaired.

24. As far as I'm concerned, if this gain in convenience happened to
me, it would be 1 2 3 4 5 6 7.

Odd-numbered items measured on a scale of 1 to 7, where: 1 =
Improbable, 7 = Probable. Even-numbered items measured on a scale of 1
to 7, where: 1 = Unimportant, 7 = Important

NET PERCEIVED RETURN--SUBJECTIVE COMPARISON

1. The purchase of car x is 1 2 3 4 5 6 7 to result in a financial gain
for me than it is a financial loss for me due to such things as its
good warranty, low maintenance costs, and/or low monthly payments.

2. The purchase of car x is 1 2 3 4 5 6 7 to result in a social gain
for me than it is a social loss for me because my friends and relatives
will think more highly of me.

3. The purchase of car x is 1 2 3 4 5 6 7 to result in a performance
gain for me than it is a performance loss for me because the vehicle
would run extremely well.

4. The purchase of car x is 1 2 3 4 5 6 7 to result in a psychological
gain for me than it is a psychological loss for me because the vehicle
would fit in well with my self-image or self-concept (i.e., the way I
think about myself).

5. The purchase of car x is 1 2 3 4 5 6 7 to result in a physical gain
for me than it is a physical loss for me because it would be very safe
and would remain safe.

6. The purchase of car x is 1 2 3 4 5 6 7 to result in a gain in
convenience for me than it is a loss of convenience for me because I
would not have to waste much time and effort getting it adjusted and
repaired.

Measured on a scale of 1 to 7, where: 1 = Less Likely, 7 = More Likely

SUBJECTIVE DISCONFIRMATION OF EXPECTATIONS

1. Car x would be a 1 2 3 4 5 6 7 purchase than the typical 4-door
economy car because of such things as its good warranty, low
maintenance costs, and/or low monthly payments.

2. My friends and relatives would think 1 2 3 4 5 6 7 highly of me if
I purchased car x rather than a typical 4-door economy car.

3. Car x's performance (i.e., the way it runs) would be 1 2 3 4 5 6 7
than the typical 4-door economy car.

4. Car x would fit my self-image or self-concept (i.e., the way I
think about myself) 1 2 3 4 5 6 7 than the typical 4-door economy car.

5. Car x would be 1 2 3 4 5 6 7 safe than the typical 4-door economy
car.

6. Car x would inconvenience me 1 2 3 4 5 6 7 than the typical 4-door
economy car because of the time and effort necessary to get it adjusted
and repaired. (negatively scored)

Item 1 scored on a scale of 1 to 7, where: 1 = Much Poorer, 7 = Much
Better

Items 3 and 4 scored on a scale of 1 to 7, where: 1 = Much Worse, 7 =
Much Better

Items 2, 5 and 6 scored on a scale of 1 to 7, where: 1 = Much Less, 7 =
Much More

DISCONFIRMATION OF EXPECTATIONS

Expectations

1. A typical small 4-door economy car would be a poor choice because of
such things as their poor warranties, high maintenance costs, and/or
high monthly payments.

2. I think that the purchase of a typical small 4-door economy car
would cause my friends and relatives to think less highly of me.

3. I think that the purchase of a typical small 4-door economy car
would cause a performance loss for me because it would run extremely
poorly.

4. I think that the purchase of a typical small 4-door economy car
would not fit well with my self-image or self-concept (i.e., the way I
think about myself).

5. I think that a typical small 4-door sedan would not be very safe or
would become unsafe.

6. I think that the purchase of a typical small 4-door economy car would
inconvenience me because I would have to waste a lot of time and effort
getting it adjusted and repaired.

Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 =
Strongly Agree

Perceptions

1. The purchase of car x would be a poor choice because of such things
as their poor warranties, high maintenance costs, and/or high monthly
payments.

2. Purchase of a car x would cause my friends and relatives to think
less highly of me.

3. Purchase of a car x would cause a performance loss for me because it
would run extremely poorly.

4. Purchase of a car x would not fit well with my self-image or
self-concept (i.e., the way I think about myself).

5. Car x would not be very safe or would become unsafe.

6. Purchase of a car x would inconvenience me because I would have to
waste a lot of time and effort getting it adjusted and repaired.

Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 =
Strongly Agree

ADDITIONAL ITEMS

1. If I were to purchase a small 4-door economy car, car x would be an
excellent choice.

2. I expect car x to have a very high resale value.

Measured on a scale of 1 to 7, where: 1 = Strongly Disagree, 7 =
Strongly Agree

APPENDIX B

TABLES AVAILABLE FROM THE AUTHORS

Regarding the Disconfirmation of Expectations: the tables presenting descriptive statistics, inter-item correlations and coefficient alphas for the expectation ($E_i$) and perception ($P_j$) items summed to develop the expectation (EXP) and perception (PER) components of the difference score measure of the disconfirmation of expectations (DISC) construct, and the table presenting the descriptive statistics, inter-item correlations and coefficient alpha for the items ($SD_i$) that were summed to form the subjective difference measure of the disconfirmation of expectations construct (SDISC).

Regarding Net Perceived Return: the table presenting descriptive statistics, inter-item correlations and alphas for the $RET_i$ measures that were summed to form the overall perceived return (OPRe) component used in the construction of the difference score measure of the net perceived return (NPRe) construct, and the tables presenting the descriptive statistics, inter-item correlations and alphas for the probability of gain ($PG_i$) and importance of gain ($IG_i$) measures that were used to calculate the $RET_i$ items. Note that the probability of gain ($PG_i$) and importance of gain ($IG_i$) items are not used as summative scales in the study. Coefficients alpha merely provide consistent information concerning all measures used in the study. The table presenting the descriptive statistics, inter-item correlations and alphas for the $RSK_i$ measures that were summed to create the overall perceived risk (OPR) component necessary to construct the difference score measure of the net perceived return (NPRe) construct, and the tables presenting the descriptive statistics, inter-item correlations and alphas for the probability of loss ($PL_i$) and importance of loss ($IL_i$) measures that were used to calculate the $RSK_i$ items. Again, note that the probability of loss ($PL_i$) and importance of loss ($IL_i$) items are not used as summative scales in the study. Coefficients alpha merely provide consistent information concerning these measures. The table presenting the descriptive statistics, inter-item correlations and alpha for the $ISLG_{ij}$ measures that were summed to form the subjective difference measure of the net perceived return construct ($SNPRe_j$), and the tables presenting the descriptive statistics, inter-item correlations and alpha for the $I_{ij}$ and $SLG_{ij}$ items used to construct the $ISLG_{ij}$ measures.

REFERENCES

Bagozzi, R.P. (1990). Structural equation models in marketing research. In W.D. Neal (ed.), Proceedings of the First Annual Advanced Research Techniques Forum. Chicago: American Marketing Association.

Bagozzi, R.P. & Y. Yi (1991). Multitrait-multimethod matrices in consumer research. Journal of Consumer Research, 17(4), 426-439.

Balloun, J.L. & A.B. Oumlil (1986). Jackknife: A general purpose program for multivariate jackknife analyses. Behavior Research Methods, Instruments and Computers, 18(1), 47-49.

Bernadin, H.J. & K.M. Alvares (1975). The effects of organizational level on perception of role conflict resolution strategy. Organizational Behavior and Human Performance, 14(1), 1-9.

Campbell, D.T. & D.W. Fiske (1959). Convergent and discriminant validity by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

Churchill, G.A. & C. Surprenant (1982). An investigation into the determinants of customer satisfaction. Journal of Marketing Research, 19(4), 491-504.

Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.

Cronbach, L.J. & G.C. Gleser (1953). Assessing similarity between profiles. Psychological Bulletin, 50, 456-473.

Cronbach, L.J. & L. Furby (1970). How should we measure change--or should we? Psychological Bulletin, 74(1), 68-80.

Feldt, L.S., D.J. Woodruff & F.A. Salih (1991). Statistical inference for coefficient alpha. Applied Psychological Measurement, 11(1), 93-103.

Frank, L.L. & J.R. Hackman (1975). Effects of interviewer-interviewee similarity on objectivity in college admissions. Journal of Applied Psychology, 60(3), 356-360.

Green, C.N. & D.W. Organ (1973). An evaluation of causal models linking the received role with job satisfaction. Administrative Science Quarterly, 18(1), 95-103.

Johns, Gary (1981). Difference score measures of organizational behavior variables: A critique. Organizational Behavior and Human Decision Processes, 27(3), 443-463.

Kavanagh, M.J., A.C. MacKinney & L. Wolins (1971). Issues in managerial performance: Multitrait-multimethod analyses of ratings. Psychological Bulletin, 75(1), 34-49.

LaTour, S.A. & N.C. Peat (1979). Conceptual and methodological issues in satisfaction research. Advances in Consumer Research, 6(1), 431-437.

Lord, F.M. (1958). The utilization of unreliable difference scores. Journal of Educational Psychology, 49(3), 150-152.

Mosier, C.I. (1951). Batteries and profiles. In E.F. Lindquist (ed.), Educational Measurement. Washington, D.C.: American Council on Education.

Nunnally, J.C. (1959). Tests and Measurement. New York: McGraw-Hill.

Nunnally, J.C. (1978). Psychometric Theory. New York: McGraw-Hill.

Ogden, J.P. (1990). Turn-of-month evaluations of liquid profits and stock returns: A common explanation of the monthly and January effects. Journal of Finance, 45(4), 1259-1272.

Oliver, R.L. & J.E. Swan (1989). Equity and disconfirmation perceptions as influences on merchant and product satisfaction. Journal of Consumer Research, 16(3), 372-383.

Peter, J.P., G.A. Churchill, Jr. & T.J. Brown (1993). Caution in the use of difference scores in consumer research. Journal of Consumer Research, 19(4), 655-662.

Peter, J.P. (1981). Construct validity: A review of basic issues and marketing practices. Journal of Marketing Research, 18(2), 133-145.

Peter, J.P. & L.X. Tarpey, Sr. (1975). A comparative analysis of three consumer decision strategies. Journal of Consumer Research, 2(1), 29-37.

Prakash, V. & J.W. Lounsbury (1983). A reliability problem in the measurement of disconfirmation of expectations. Advances in Consumer Research, 10(1), 244-249.

Senger, J. (1971). Managers' perceptions of subordinates' competence as a function of personal value orientations. Academy of Management Journal, 14(4), 415-423.

Tse, D.K. & P.C. Wilton (1988). Models of consumer satisfaction formation: An extension. Journal of Marketing Research, 25(2), 204-212.

Thomas Alexander Vernon, Missouri Southern State University

R. Anthony Inman, Louisiana Tech University

Gene Brown, University of Missouri-Kansas City

Table 1
Inputs Necessary to Calculate Reliability of Difference Score Measure
of Disconfirmation of Expectations Construct

Component       Reliability   Variance

Expectations       0.79         45.04
Perceptions        0.81         42.29

Intercomponent Correlation = 0.58
Difference Score Reliability = 0.52

Table 2
Inputs Necessary to Calculate Reliability of Difference Score
Measure of Net Perceived Return Construct

Component                   Reliability   Variance

Overall Perceived Return       0.80        2474.82
Overall Perceived Risk         0.77        1969.33

Intercomponent Correlation = -0.07
Difference Score Reliability = 0.80
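
The difference score reliabilities in Tables 1 and 2 can be reproduced from the inputs listed there. The sketch below assumes the standard formula for the reliability of a difference between two component scores, in which the component variances are weighted by their reliabilities and twice the component covariance is subtracted from both numerator and denominator. The formula as written and the function name are assumptions made here for illustration.

```python
import math


def difference_score_reliability(rel_1, var_1, rel_2, var_2, r_12):
    """Reliability of (component 1 - component 2).

    rel_*: component reliabilities, var_*: component variances,
    r_12: correlation between the two components.
    """
    cov_term = 2.0 * r_12 * math.sqrt(var_1 * var_2)  # twice the covariance
    return (var_1 * rel_1 + var_2 * rel_2 - cov_term) / (var_1 + var_2 - cov_term)


# Table 1: disconfirmation of expectations (expectations, perceptions).
print(round(difference_score_reliability(0.79, 45.04, 0.81, 42.29, 0.58), 2))  # 0.52
# Table 2: net perceived return (overall perceived return, overall perceived risk).
print(round(difference_score_reliability(0.80, 2474.82, 0.77, 1969.33, -0.07), 2))  # 0.8
```

The contrast between the two tables shows how the intercomponent correlation drives the result: with similar component reliabilities, a correlation of 0.58 pulls the difference score reliability down to 0.52, whereas a correlation near zero leaves it at about 0.80.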

Table 3
Summary of Reliabilities for the Three Methods Used to Measure the
Disconfirmation of Expectations Construct

Method Reliability

Difference Score Method 0.52
Subjective Difference Method 0.72
Single Statement Method 0.81

Table 4
Summary of Reliabilities for the Three Methods Used to Measure
the Net Perceived Return Construct


Method Reliability

Difference Score Method 0.80
Subjective Difference Method 0.83
Single Statement Method 0.77

Table 5
Hypothesis Test of Reliability Difference For Disconfirmation of
Expectations Construct

Difference Score (#1) vs. Subjective Difference (#2)

$H_0$: $A_1 = A_2$
$H_1$: $A_1 \neq A_2$

Inputs for t calculation:   $\alpha_1$ = 0.52
                            $\alpha_2$ = 0.72
                            $\rho$ = -0.16
                            N = 310
Results:                    t = -4.85
                            DOF = 308
                            p-value = 0.000

Difference Score (#1) vs. Single Statement (#3)

$H_0$: $A_1 = A_3$
$H_1$: $A_1 \neq A_3$

Inputs for t calculation:   $\alpha_1$ = 0.52
                            $\alpha_3$ = 0.81
                            $\rho$ = 0.43
                            N = 310
Results:                    t = -9.36
                            DOF = 308
                            p-value = 0.000

Note: Numbers have been rounded to 2 places for illustrative purposes.
Greater than 2 place accuracy was used in all calculations.

Table 6
Hypothesis Test of Reliability Difference For Net Perceived
Return Construct

Difference Score (#1) vs. Subjective Difference (#2)

$H_0$: $A_1 = A_2$
$H_1$: $A_1 \neq A_2$

Inputs for t calculation:   $\alpha_1$ = 0.80
                            $\alpha_2$ = 0.83
                            $\rho$ = -0.58
                            N = 310
Results:                    t = -2.03
                            DOF = 308
                            p-value = 0.04

Difference Score (#1) vs. Single Statement (#3)

$H_0$: $A_1 = A_3$
$H_1$: $A_1 \neq A_3$

Inputs for t calculation:   $\alpha_1$ = 0.80
                            $\alpha_3$ = 0.77
                            $\rho$ = -0.69
                            N = 310
Results:                    t = 1.62
                            DOF = 308
                            p-value = 0.11

Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all calculations.

Table 7
Nomological Validity Investigation, Descriptive Statistics and Lower
Bound of Reliability Estimates
                                       Overall       Expected
                                       Impression    Resale Value

Mean                                      4.21          3.68
Standard Deviation                        1.51          1.52
Skewness                                 -0.21          0.21
Kurtosis                                 -0.27         -1.24
Lower Bound of Reliability Estimate       0.27          0.14

Note: Lower bounds of reliability estimates are the coefficients of
multiple determination ($R^2$) resulting from the regression of the
overall impression and expected resale value variables on all
measures used in the study.

Table 8: Nomological Validity Investigation--Hypothesis 5

$H_0$: $p_{sin} = p_{dif} = p_{sub}$
$H_1$: not all $p_i$ are the same

ANOVA TABLE

Source    Sum of Squares     DF    Mean-Square     F      p-value

Method          3.08           2       1.54       1.07     0.34
Error        1338.55         927       1.44

Results: Fail to reject $H_0$

                      $p_{sin}$   $p_{dif}$   $p_{sub}$

Mean                    0.28        0.32        0.18
Standard Deviation      1.21        1.23        1.16
Sample Size             310         310         310

Note: Numbers have been rounded to 2 places for illustrative purposes.
Greater than 2 place accuracy was used in all calculations.

sin = single statement method, dif = difference score method,
sub = subjective difference method

Table 9: Nomological Validity Investigation--Hypothesis 6

$H_0$: $p_{sin} = p_{dif} = p_{sub}$
$H_1$: not all $p_i$ are the same

ANOVA TABLE

Source    Sum of Squares     DF    Mean-Square     F      p-value

Method          8.79           2       4.39       2.57     0.08
Error        1584.14         927       1.71

Results: Fail to reject $H_0$

                      $p_{sin}$   $p_{dif}$   $p_{sub}$

Mean                    0.39        0.25        0.49
Standard Deviation      1.22        1.06        1.59
Sample Size             310         310         310

Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.

sin = single statement method, dif = difference score method,
sub = subjective difference method

Table 10: Nomological Validity Investigation--Hypothesis 7

$H_0$: $p_{sin} = p_{dif} = p_{sub}$
$H_1$: not all $p_i$ are the same

ANOVA TABLE

Source    Sum of Squares     DF    Mean-Square     F      p-value

Method          2.08           2       1.04       0.75     0.47
Error        1292.1          927

Results: Fail to reject $H_0$

                      $p_{sin}$   $p_{dif}$   $p_{sub}$

Mean                    0.20        0.30        0.21
Standard Deviation      1.21        1.15        1.23
Sample Size             310         310         310

Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.

sin = single statement method, dif = difference score method,
sub = subjective difference method

Table 11: Nomological Validity Investigation--Hypothesis 8

$H_0$: $p_{sin} = p_{dif} = p_{sub}$
$H_1$: not all $p_i$ are the same

ANOVA TABLE

Source    Sum of Squares     DF    Mean-Square     F      p-value

Method          3.08           2       1.54       1.07     0.34
Error        1338.55         927       1.44

Results: Fail to reject $H_0$

                      $p_{sin}$   $p_{dif}$   $p_{sub}$

Mean                    2.42        0.20        0.32
Standard Deviation      1.16        1.08        1.30
Sample Size             310         310         310

Note: Numbers have been rounded to 2 places for illustrative
purposes. Greater than 2 place accuracy was used in all
calculations.

sin = single statement method, dif = difference score method,
sub = subjective difference method

Table 12
Multitrait-Multimethod Matrix Analysis of Variance Table for
Difference Score and Subjective Difference Measures

Source             DF     SS     MS      F     Variance   Index

R (respondents)    309    586    1.90    3.32    0.33      .37
R X T (traits)     309    259    0.84    1.47    0.13      .19
R X M (methods)    309    219    0.71    1.24    0.07      .11
E (error)          309    176    0.57            0.57

N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2

Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).

Table 13
Multitrait-Multimethod Matrix Analysis of Variance Table for
Difference Score and Single Statement Measures

Source             DF     SS     MS      F     Variance   Index

R (respondents)    309    693    2.24    5.53    0.46      .53
R X T (traits)     309    276    0.89    2.20    0.24      .38
R X M (methods)    309    145    0.47    1.16    0.03      .07
E (error)          309    125    0.41            0.41

N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2

Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).

Table 14
Multitrait-Multimethod Matrix Analysis of Variance Table for
Subjective Difference and Single Statement Measures

Source             DF     SS     MS      F     Variance   Index

R (respondents)    309    569    1.82    4.08    0.35      .53
R X T (traits)     309    124    0.40    0.89    0.00      .38
R X M (methods)    309    407    1.32    2.91    0.43      .07
E (error)          309    140    0.45            0.45

N (number of respondents) = 310
n (number of traits or constructs) = 2
m (number of methods) = 2

Note: The analysis of variance table was constructed, variance
estimates were made, and the indexes were constructed with
methodology consistent with and outlined in Kavanagh, MacKinney
and Wolins (1971).