
Article Information

  • Title: A comparison of test-retest reliabilities using the self-talk use questionnaire.
  • Authors: Hardy, James; Hall, Craig R.
  • Journal: Journal of Sport Behavior
  • Print ISSN: 0162-7341
  • Year: 2005
  • Issue: September
  • Language: English
  • Publisher: University of South Alabama
  • Keywords: Athletes; Self talk; Sports psychology

A comparison of test-retest reliabilities using the self-talk use questionnaire.


Hardy, James ; Hall, Craig R.


Self-talk can be thought of as a construct concerned with athletes' multidimensional, sport-related self-verbalizations, which seem to serve instructional and motivational functions (Hardy, Hall, & Hardy, in press). Although elite athletes and coaches both support the use of appropriate (positive) self-talk (Gould, Hodge, Peterson, & Giannini, 1989), our knowledge about this mental skill is quite limited. This limitation is somewhat surprising given that the use of cognitive restructuring interventions has been shown to be a more powerful treatment (d = .79) than the use of relaxation (d = .73), mental rehearsal (d = .57), and goal setting (d = .54) interventions for the enhancement of sporting performance (Meyers, Whelan, & Murphy, 1996).

A possible reason for the current state of affairs in the self-talk literature may be a lack of descriptive data upon which to further examine self-talk's relationships. Hardy and colleagues (Hardy, Gammage, & Hall, 2001; Hardy, Hall, & Hardy, in press, 2004) attempted to remedy this problem. Consequently, an inductive qualitative approach was utilized in their initial study (Hardy et al., 2001). It was found that both the content (i.e., what is said) and the functions of self-talk (i.e., why athletes employ self-talk) were multidimensional. Quantitative findings obtained via the Self-Talk Use Questionnaire (STUQ) in subsequent studies supported and extended these qualitative results. That is, athletes reported frequent use of self-talk as it had been categorized qualitatively, and sex, sport, and competitive level differences were examined. The STUQ was developed as a preliminary attempt to quantify athletes' use of self-talk and to supplement Hardy et al.'s previous qualitative findings. The STUQ was based on a similar descriptive instrument used to examine mental imagery, the Imagery Use Questionnaire (IUQ; Hall, Rodgers, & Barr, 1990), as well as Hardy et al.'s (2001) qualitative findings. The IUQ is a valid and reliable general measure of the frequency of athletes' use of mental imagery. It places emphasis on the imagery-related habits of athletes (i.e., when athletes use imagery) as well as the content of their imagery. Hardy et al.'s qualitative findings helped guide the generation of items relevant to the mental skill of self-talk. Suggestions from an experienced sport psychology consultant and a national level soccer coach facilitated the wording of "athlete friendly" items. The STUQ assesses the frequency of the use of self-talk. It places an emphasis on when athletes employ self-talk, the content of athletes' self-talk, athletes' use of the specific functions of self-talk (i.e., the purpose of self-talk), and how athletes employ self-talk (e.g., use of self-talk in combination with imagery).

With regard to the content of athletes' self-talk, although differences across sex and skill level were absent, Hardy et al. (in press) found that team and individual sport athletes employ self-talk of differing content. Hardy et al.'s (2004) findings related to how athletes use self-talk were somewhat different from the findings for the content of self-talk. Athletes reported an increasing use of self-talk as their competitive season progressed. Furthermore, although male and female athletes were not found to differ on how they employed self-talk, significant effects for skill level and sport type were present.

With regard to athletes' use of the functions of self-talk, Hardy et al. (in press) did not uncover significant differences between male and female, and skilled and less skilled athletes. They did demonstrate, however, that individual sport athletes make greater use of self-talk in general and, more precisely, greater use of nearly all specific functions of self-talk, as compared to their team sport counterparts. In addition, a significant main effect for setting was revealed, in that athletes reported significantly greater use of self-talk in competition-related than in practice-related situations. A significant main effect for temporal phase was also found. That is, differences were found between the use of self-talk before, during, and after practice and competition. Regardless of setting, self-talk was employed most frequently during, as opposed to before or after, practice and competition.

A noted limitation of Hardy et al.'s (in press) findings was the need to interpret them with some caution, as psychometric information on the STUQ is lacking. Given that "without solid measurement, it is difficult to challenge, disconfirm, and/or extend psychological theory in sport and exercise psychology" (Duda, 1998, p. xxiii), there is an obvious need to examine the psychometric properties of the STUQ. Some frequently employed methods to assess properties of questionnaires are not, however, applicable to the STUQ. For example, because the STUQ's items are not grouped into sub-scales representing a range of self-talk related factors, examination of the instrument's factor structure, through exploratory or confirmatory techniques, as well as examination of the STUQ's sub-scale internal consistency, are not conceptually appropriate. Examination of the STUQ's internal consistency via an item-total test approach was, however, possible. As a result, this was one purpose of the present study.

The study's primary purpose, however, focused on a second relevant psychometric property, test-retest reliability. According to Thomas and Nelson (2001), test-retest reliability or stability "is one of the most severe tests of consistency" (p. 188). The test-retest method involves administering a test on two separate occasions in an identical manner. The stability of the response variable is of critical importance to the test-retest assessment. Not only does the relative stability of the response variable influence the length of the test-retest interval (Portney & Watkins, 1993), it can also determine whether the examination of an instrument's test-retest stability should even be attempted (Schutz, 1998). For example, if an underlying response variable is not stable, it would make little sense to assess the test-retest stability of an instrument designed to measure such a dynamic construct. With regard to the present study's underlying variable, self-talk, there is no evidence to date indicating that athletes' general use of self-talk naturally changes as a function of time alone. Although preliminary research has found the use of self-talk to alter across (a) practice and competitive settings, (b) preparatory phases (before, during, and after practice/competition) (Hardy et al., in press), and (c) training cycles of the season (off-, early regular, and late regular) (Hardy et al., 2004), none of these independent variables is based exclusively on time itself; each deals with very distinct situations for the athlete.

Traditionally, within the sport psychology literature the test-retest reliability or stability of psychometric questionnaires has been assessed through the use of the Pearson (interclass) correlation with a small sample of athletes (e.g., Anderson & Cychosz, 1994; Hall & Barr, 1992; Pelletier, Fortier, Vallerand, Tuscon, Briere, & Blais, 1995). Unfortunately there are limitations to the utilization of this approach. As Thomas and Nelson pointed out, use of the Pearson correlation is only suitable when assessing the relationship between two different variables. This situation is clearly not the case with a test-retest design--the same variable is measured on two separate occasions. Consequently, Thomas and Nelson suggest that the intraclass correlation should be employed when concerned with the scoring of the same variable across time (e.g., Brewer et al., 2000). However, it should be noted that correlations are an indication of relationship and do not offer information regarding agreement (Bland & Altman, 1986; Nevill, 1996) and so are unable to detect systematic bias in responses from one time to another. Furthermore, both the Pearson correlation and the intraclass correlation are often used as summary statistics obtained by pooling relevant items. Wilson and Batterham (1999) and Nevill, Lane, Kilgour, Bowes, and Whyte (2001) indicated that the use of such statistics may not provide a clear picture of the stability of an instrument's items. This is because individual items with poor stability that might be present cannot be clearly identified due to the averaging out process inherent in the use of summary statistics.
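
To make this limitation concrete, consider the following minimal sketch (Python with NumPy; the data are hypothetical and not drawn from the present study). A perfect Pearson correlation can coexist with complete disagreement whenever every respondent shifts by the same amount on retest:

```python
import numpy as np

# Hypothetical 9-point responses: every participant scores 2 points higher on retest.
test = np.array([3, 4, 5, 6, 7], dtype=float)
retest = test + 2

r = np.corrcoef(test, retest)[0, 1]        # Pearson correlation between occasions
exact_agreement = np.mean(test == retest)  # proportion giving identical responses

print(f"Pearson r = {r:.2f}")                                    # 1.00 despite the shift
print(f"Proportion of exact agreement = {exact_agreement:.2f}")  # 0.00
```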

As a result, Nevill et al. (2001) supported Wilson and Batterham's (1999) proposal to use a within-individual, item-by-item approach to test-retest designs. It should be noted that Bland and Altman (1986) first forwarded Wilson and Batterham's general approach. They recommended the use of the proportion of agreement, computed for each item. The proportion of agreement is "based on the proportion of participants that record the same response on two separate occasions" (Nevill et al., 2001, p. 273). However, it was Nevill et al.'s contention that Wilson and Batterham's overly complex "bootstrapped" item-by-item approach also lacked the ability to detect systematic bias and to distinguish between "near misses" and "wide disagreements". To this end, Nevill and colleagues recommended the use of a modified proportion of agreement procedure. This method entailed the calculation of test-retest differences and then the reporting of the percentage of individuals whose differences were found to be within a reference value of no practical importance. With regard to the variable under investigation in their study, social physique anxiety, a relatively stable trait measured on a 5-point scale, a reference value of ±1 was adopted, and it was forwarded that most participants (i.e., 90%) should record differences within this value (Nevill et al.).
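
A minimal sketch of this modified procedure is given below for illustration only; it assumes a simple NumPy implementation with hypothetical responses, and the helper name proportion_within is ours rather than anything from the original work:

```python
import numpy as np

def proportion_within(test, retest, reference=1):
    """Percentage of respondents whose test-retest difference for one item falls
    within +/- reference, a difference judged to be of no practical importance."""
    diffs = np.asarray(retest, dtype=float) - np.asarray(test, dtype=float)
    return 100.0 * np.mean(np.abs(diffs) <= reference)

# Hypothetical responses to a single 5-point item on two occasions
time1 = np.array([3, 4, 2, 5, 4, 3, 1, 4])
time2 = np.array([4, 4, 1, 5, 5, 3, 3, 4])
print(proportion_within(time1, time2, reference=1))  # 87.5, i.e., 7 of 8 within +/-1
```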

Although the constructs under investigation in the present study (self-talk; 9-point scale) and in Nevill et al.'s (2001) study (social physique anxiety; 5-point scale) are different in nature and are scored using different response scales, a similar within-individual, item-by-item approach was employed in the present study. To reflect these differences, Nevill et al.'s reference value of ±1 was altered to meet the demands of the present study (when applicable). This alteration was carried out because identical proportion of agreement values (e.g., 95%) using the same limits of agreement (e.g., ±1) obtained from questionnaires utilizing 9- and 5-point response scales, respectively, are not equivalent--the 9-point scale proportion of agreement reflects greater relative stability. Thus, it was expected that most (90%) participants would report responses within ±2 of each other. As such, it was hypothesized that the STUQ items would be relatively stable over time.

In sum, the general aim of the present study was to illustrate the use of the proportion of agreement method for examining test-retest reliability, using the STUQ as an example. An examination of the instrument's general internal consistency and test-retest reliability/stability was undertaken. It was expected that the STUQ would display good reliability.

Method

Participants

Participants were recruited from volleyball (n = 74) and basketball (n = 27) activity classes; the sample comprised 101 Kinesiology undergraduate volunteers (44 males, 57 females) with a mean age of 20.92 years (SD = 1.19). The number of participants in the present study corresponds with Nevill et al.'s (2001) recommendations concerning minimum sample size for the examination of questionnaires' reliability via the use of non-parametric approaches.

Measures

A modified version of the STUQ (Hardy et al., in press, 2004) was administered. The instrument was modified to be relevant to the sample employed. This resulted in the items assessing the use of self-talk in competition being dropped, as the athletes did not play volleyball or basketball competitively. The modified version of the STUQ comprised 36 items contained within 4 sections, with an emphasis on the frequency of athletes' use of self-talk. Following a self-statement oriented definition of self-talk, athletes completed the 4 questions in Section 1 dealing with when athletes generally use self-talk. Section 2 contained 9 questions related to the content of self-talk (i.e., what athletes say to themselves). Section 3 comprised 12 items that assessed the specific functions of self-talk (i.e., the reasons why athletes talk to themselves). Finally, Section 4 contained 11 questions about how athletes use self-talk (e.g., consistency of self-talk and belief in self-talk).

Participants responded to the majority of the items using a 9-point scale (1 = never, 9 = all the time). Two items required the use of a 5-point scale (1 = not at all consistent/strongly disbelieve, 5 = completely consistent/strongly believe). For the purposes of the employed analyses, only those items that were responded to in the above manner were included in the data analyses. In other words, only items responded to via Likert-type scales were utilized (n = 24). This illustrates a limitation noted by Hardy et al. (2004) with regard to the unusual ratio-based response format of the STUQ's content-related questions.

Procedure

Permission to approach participants was first gained from their respective activity class instructors. All participants were over 18 years of age; thus, parental consent was not required. The nature of the study was explained to each participant. Each volunteer was informed of the nature of his or her involvement in the study via a letter of information. Informed consent was implied by completion of the STUQ. The STUQ was administered in weeks 4 and 5 of the six-week volleyball/basketball activity courses. On each occasion, the STUQ took approximately 10 minutes to fill out and was completed at the beginning of the activity class. The time frame of one week employed in the present study is substantially shorter than Kline's (1993) recommended three-month gap between survey administrations. Kline proposed the extended gap in order to minimize the influence of the recall of individuals' responses. A much shorter gap was employed in the present study in order to reduce the influences that time of season (i.e., use of self-talk early versus late in the course) and improved ability (i.e., learning effect) might have on individuals' responses.

Data Analysis

First, the STUQ's internal consistency was assessed using an item-total test approach. To this end, Cronbach's alpha was calculated. As the STUQ does not have subscales, the consistency of responses across the entire 24 items was assessed. The items were normally distributed (i.e., standard deviations greater than one and skewness less than two) for both the test and retest data. Second, in order to assess test-retest stability, the recommendations of Nevill et al. (2001) were followed--a non-parametric approach proposed by Bland and Altman (1999) was adopted. Thus, the proportion of agreement and the proportion of test-retest differences found within a ±2 reference value (or ±1 when appropriate) were calculated. The non-parametric median sign test was employed to test for the presence of systematic bias. A Bonferroni-corrected significance level of p < .002 (p = .05 / 24) was employed.
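
A rough sketch of these analyses is given below for illustration; it assumes NumPy and SciPy (the binomtest function, SciPy 1.7 or later), uses hypothetical helper names, and is not the software actually used by the authors.

```python
import numpy as np
from scipy import stats

def cronbach_alpha(items):
    """Cronbach's alpha across all items (no subscales); items is an
    (n_participants, n_items) array of responses from a single occasion."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def sign_test_bias(test, retest):
    """Two-sided sign test on the direction of test-retest differences for one
    item; a small p value flags a systematic shift between occasions."""
    diffs = np.asarray(retest) - np.asarray(test)
    pos, neg = int(np.sum(diffs > 0)), int(np.sum(diffs < 0))  # ties are dropped
    return stats.binomtest(pos, n=pos + neg, p=0.5).pvalue

bonferroni_alpha = 0.05 / 24  # Bonferroni-corrected significance level, roughly .002
```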

Results

Mean and standard deviation frequency values are shown in Table 1. Overall, the descriptive statistics presented in Table 1 are slightly lower but comparable to those reported in the literature (Hardy et al., in press, 2004).

Internal consistency

Participants' responses given on the first data collection point were analyzed for the purpose of examining the survey's internal consistency. The result from the test of internal consistency indicated that the STUQ items have good internal consistency, Cronbach's alpha = .94.

Test-retest reliability

In order to illustrate the value of Bland and Altman's (1986) proportion of agreement approach, test-retest reliability was assessed in three ways. First, the traditional and inappropriate approach to test-retest reliability via the interclass correlation was used. Second, the less traditional but more appropriate approach via the intraclass correlation was conducted. Finally, the most appropriate approach to test-retest reliability was undertaken, a modified version of Bland and Altman's proportion of agreement approach. Table 2 contains correlation and agreement values for each of the 24 STUQ items examined.

It can be seen that the average interclass correlation for the STUQ items was .66 (ranging from .55 to .80; see Table 2). Reliance on interclass correlations alone would suggest that there were moderate to fairly strong positive relationships between the responses initially collected and the retest responses for the STUQ items examined. This finding might suggest that the test-retest reliability of the STUQ items examined was marginal. Calculation of intraclass correlations (ICC) using a two-way random effects, absolute agreement approach generated an average ICC of .66 (ranging from .54 to .80; see Table 2). According to Vincent (1999), such ICC coefficients would suggest that the STUQ items examined possess marginal test-retest stability. More specifically, however, only 6 items of the 24 examined demonstrated ICC values greater than .70. Again, it must be reiterated--correlations do not give a measure of stability or agreement, only association (Ludbrook, 1997).
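
For readers wishing to reproduce coefficients of this kind, the sketch below shows one way the single-measures, absolute-agreement form of the two-way random effects ICC can be computed from a participants-by-occasions matrix. It is an illustration only (the function name icc_2_1 is ours), not the authors' analysis code.

```python
import numpy as np

def icc_2_1(scores):
    """Single-measures, absolute-agreement ICC from a two-way random effects
    decomposition; scores is an (n_participants, k_occasions) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand_mean = scores.mean()
    subject_means = scores.mean(axis=1)
    occasion_means = scores.mean(axis=0)

    ss_subjects = k * np.sum((subject_means - grand_mean) ** 2)
    ss_occasions = n * np.sum((occasion_means - grand_mean) ** 2)
    ss_error = np.sum((scores - grand_mean) ** 2) - ss_subjects - ss_occasions

    ms_subjects = ss_subjects / (n - 1)        # between-participants mean square
    ms_occasions = ss_occasions / (k - 1)      # between-occasions mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_occasions - ms_error) / n
    )
```

Applied to an item's test and retest columns, this returns a single coefficient comparable in kind to those reported in Table 2.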

A better understanding of the STUQ's stability over time can be gleaned from the proportion of agreement percentages. As indicated in Table 2, the proportion of agreement ranged from 21% to 70% across the 24 items. The proportion of agreement within the respective specified reference values, however, ranged from 81% to 96%. Moreover, while only 11 items had proportions of agreement greater than 90%, all but 2 of the 24 items (i.e., the effort control and goal function items) had agreement values greater than 85%. With regard to the presence of systematic bias, results from the median sign tests indicated a significant negative bias for 2 items. Participants reported significantly higher responses on the retest for the planned self-talk item (45 participants reported differences below the median and just 16 above the median) and for the item assessing the use of self-talk before practice (48 participants reported differences below the median and just 20 above the median). Together, results from the proportion of agreement and median sign test procedures suggest that the majority of the 24 STUQ items examined are reasonably stable.

Discussion

The aim of the present study was to generate reliability information concerning the STUQ. A fairly new method for assessing the stability of survey items was employed. Overall, supportive evidence for the a priori hypotheses was found. Specifically, the STUQ items examined appear to be internally consistent and relatively stable over time.

With regard to the different test-retest techniques presented, it can be seen that varying pictures would emerge depending on which technique was relied upon, ranging from marginal through to adequate stability for 22 of the 24 items examined. The low (Pearson and ICC) correlation coefficients may be due in part to the restrictive sample of the athletic population employed in the study. All volleyball and basketball players took part in their respective sport at the recreational level. It is possible that this led to narrow variance that subsequently impacted the coefficients (Wilson & Batterham, 1999). It is proposed that Nevill et al.'s (2001) proportion of agreement protocol employed in the present study is the best approach to assessing a survey's test-retest reliability or stability. As such, researchers interested in examining a survey's test-retest reliability would do well to avoid the use of correlations, which cannot offer information about agreement or stability (Bland & Altman, 1986; Nevill, 1996), and instead utilize Nevill and colleagues' method. It should be noted that a second approach to the proportion of agreement technique is available to researchers. It is possible to create the limits of agreement based on 95% confidence intervals (Bland & Altman, 1999). The use of confidence intervals was not appropriate for the present study, however, because confidence intervals would likely create boundaries of agreement that include decimal places, whereas the STUQ's response scale involves self-ratings of whole numbers only.

When the present study's proportion of agreement values are compared to previous research that has employed this technique, the STUQ items examined fare favorably against items from the Social Physique Anxiety Scale (SPAS; Hart, Leary, & Rejeski, 1989). One possible explanation for this comes from the work of Nevill et al. (2001) and Wilson and Batterham (1999), which suggests that some of the SPAS's items could be reworded to improve their test-retest stability. Alternatively, the present study's use of a different reference value (±2) may have contributed to the appearance of the STUQ items' superior stability. It should be noted, however, that the nature of self-talk is different from the trait of social physique anxiety and that participants responded to the STUQ items (in most cases) via a 9-point scale, not the 5-point scale the SPAS utilizes.

The above point reflects a limitation to the proportion of agreement approach employed in the present study; there is an element of subjectivity regarding the use of limits of agreement. Specifically, if limits of agreement are utilized, what should these boundaries of agreement be? Wilson and Batterham (1999) have presented an argument that limits of agreement should not be employed if the variable under investigation is discrete in nature, that is, if the Likert-type responses represent distinct categories. If item responses are conceptualized as continuous in nature, Bland and Altman (1999) forward the somewhat subjective approach of employing reference values of no practical importance, whereby differences within the limits of agreement are not clinically important. The range of possible responses is not currently considered (e.g., 1 to 5 vs. 1 to 9) in the proportion of agreement protocol. It should be noted that the use of a ±1 limit of agreement on a 5-point Likert-type scale is not equivalent to the use of the same reference value on a 9-point Likert-type scale. Such differences in the meaning of reference values guided our use of a ±2 limit of agreement for items responded to via a 9-point Likert-type scale, although it is acknowledged that this reference value may be liberal. (Interestingly, the 5-point spread obtained from utilizing ±2 agreement limits on a 9-point response scale is actually relatively more conservative than a 3-point range of agreement on a 5-point response scale.) If a ±1 reference value were employed in the present investigation, a much different story would emerge (see Table 2 for proportions of agreement within ±1 for each of the items). As shown in Table 2, reliance on a ±1 reference value would indicate that, with the exception of the two STUQ items scored on a 5-point scale, none of the 24 STUQ items examined met the proportion criterion of 90%. Thus, although Bland and Altman (1999) comment that "the decision about what is acceptable agreement is a clinical one" (p. 139), it would seem that there is a need for future research to extend the proportion of agreement method to incorporate the potential variance of responses.
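
As a rough check on that parenthetical comparison, the agreement bands can be expressed as fractions of their response scales:

    ±2 on a 9-point scale: (2 × 2 + 1) / 9 = 5/9 ≈ 0.56 of the response categories
    ±1 on a 5-point scale: (2 × 1 + 1) / 5 = 3/5 = 0.60 of the response categories

That is, the apparently wider ±2 band actually spans a smaller share of its scale.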

Although the present results may help alleviate some concerns regarding Hardy and coworkers' (in press, 2004) STUQ-related findings, they are not without their problems. Due to poor item stability, caution is needed when interpreting findings related to the effort control and goal functions. Furthermore, due to response format differences for some of the STUQ items and the sample employed, not all items on the STUQ were examined. As a result, reliability information on the content- and competition-related STUQ items is still absent. It should be noted, however, that Hardy et al. (2004; Study 2) reproduced their content-related findings in a replication study. Overall, recent use of the STUQ has generated preliminary findings that should serve as a basis for initial discussion and facilitate more in-depth examinations of self-talk in the sport and exercise domains.

References

Anderson, D. F., & Cychosz, C. M. (1994). Development of an exercise identity scale. Perceptual and Motor Skills, 78, 747-751.

Bland, J. M. & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, i, 307-310.

Bland, J. M. & Altman, D. G. (1999). Measuring agreement in methods comparison studies. Statistical Methods in Medical Research, 8, 135-160.

Brewer, B. W., Van Raalte, J. L., Petitpas, A. J., Sklar, J. H., Pohlman, M. H., Krushell, R. J., Ditmer, T. D., Daly, J. M., & Weinstock, J. (2000). Preliminary psychometric evaluation of a measure of adherence to clinic-based sport injury rehabilitation. Physical Therapy in Sport, 1, 68-74.

Duda, J. L. (1998). Advances in sport and exercise psychology measurement. Morgantown, WV: Fitness Information Technology.

Gould, D., Hodge, K., Peterson, K., & Giannini, J. (1989). An exploratory examination of strategies used by elite coaches to enhance self-efficacy in athletes. Journal of Sport & Exercise Psychology, 11, 128-140.

Hall, C.R., & Barr, K. A. (1992). The use of imagery by rowers. International Journal of Sport Psychology, 23, 243-261.

Hall, C. R., Rodgers, W. M., & Barr, K. A. (1990). The use of imagery by athletes in selected sports. International Journal of Sport Psychology, 4, 1-10.

Hardy, J., Gammage, K. L., & Hall, C. R. (2001). A descriptive study of athlete self-talk. The Sport Psychologist, 15, 306-318.

Hardy, J., Hall, C. R., & Hardy, L. (in press). Quantifying athletes' use of self-talk. Journal of Sports Sciences.

Hardy, J., Hall, C. R., & Hardy, L. (2004). A note on how athletes use self-talk. Journal of Applied Sport Psychology, 16, 251-257.

Hart, E. H., Leary, M. R., & Rejeski, W. J., (1989). The measurement of social physique anxiety. Journal of Sport & Exercise Psychology, 11, 94-104.

Kline, P. (1993). Handbook of psychological testing. London: Routledge.

Ludbrook, J. (1997). Comparing methods of measurement. Clinical and Experimental Pharmacology and Physiology, 24, 193-203.

Meyers, A. W., Whelan, J. P., & Murphy, S. M. (1996). Cognitive behavioral strategies in athletic performance enhancement. In M. Hersen, R. M. Eisler, & M. Miller (Eds.), Progress in behavior modification (Vol. 30, pp. 137-164). Pacific Grove, CA: Brooks/Cole.

Nevill, A. M. (1996). Validity and measurement agreement in sports performance. Journal of Sports Sciences, 14, 199.

Nevill, A. M., Lane, A. M., Kilgour, L. J., Bowes, N., & Whyte, G. P. (2001). Stability of psychometric questionnaires. Journal of Sports Sciences, 19, 273-278.

Pelletier, L. G., Fortier, M. S., Vallerand, R. J., Tuscon, K. M., Briere, N. M., & Blais, M. R. (1995). Toward a new measure of intrinsic motivation, extrinsic motivation, and amotivation in sports: The Sport Motivation Scale (SMS). Journal of Sport & Exercise Psychology, 17, 35-53.

Portney, L. G., & Watkins, M. P. (1993). Foundations of clinical research: Applications to practice. Stamford, CT: Appleton & Lange.

Schutz, R. W. (1998). Assessing the stability of psychological traits and measures. In J. L. Duda (Ed.), Advances in sport and exercise psychology measurement (pp. 393-408). Morgantown, WV: Fitness Information Technology.

Thomas, J. R., & Nelson, J. K. (2001). Research methods in physical activity (4th ed.). Champaign, IL: Human Kinetics.

Vincent, W. J. (1999). Statistics in kinesiology (2nd ed.). Champaign, IL: Human Kinetics.

Wilson, K., & Batterham, A. (1999). Stability of questionnaire items in sport and exercise psychology: Bootstrap limits of agreement. Journal of Sports Sciences, 17, 725-734.

Address correspondence to: James Hardy, School of Sport, Health, and Exercise Sciences, University of Wales, Bangor, George Building, Bangor, Gwynedd LL57 2PX, UK. Email: j.t.hardy@bangor.ac.uk. Fax: 01248 371053.

James Hardy and Craig R. Hall

University of Western Ontario
Table 1

Descriptive statistics for the STUQ items examined

STUQ item                          Time 1 Mean   Time 1 SD   Time 2 Mean   Time 2 SD

When athletes use self-talk
Before a practice 2.91 1.81 3.34 1.81
During a practice 5.64 1.83 5.36 1.81
After a practice 3.24 1.92 2.99 1.58
Away from a practice 3.03 2.01 2.60 1.61

Use of the functions of self-talk

Skill function 5.53 2.10 5.47 1.86
Strategy function 4.91 1.96 4.95 1.93
Psyching function 5.67 2.28 5.55 2.22
Relaxation function 4.54 2.39 4.69 2.23
Nerve control function 4.71 2.29 4.77 2.14
Focusing function 5.68 1.93 5.47 1.85
Self-confidence function 5.30 2.28 4.98 2.06
Mental preparation function 5.14 2.14 5.17 2.00
Coping function 5.26 2.34 4.99 2.15
Motivation function 5.12 2.19 5.00 2.11
Effort control function 4.83 2.20 4.52 2.13
Goal function 4.59 2.34 4.39 2.03

How athletes use self-talk

Before attempting skills 5.54 2.23 5.47 1.90
During execution of skills 4.10 2.22 4.45 1.99
Self-talk with imagery 5.58 1.99 5.11 2.08
Self-talk with physical practice 5.42 1.90 5.32 1.88
Self-talk alone 4.47 2.03 4.44 1.88
Planned self-talk 2.89 1.93 3.34 1.92
Consistent self-talk * 3.00 0.91 3.13 0.80
Belief in self-talk * 3.66 0.79 3.68 0.77

Note. Items were scored on a 9-point scale except where indicated. * denotes that the item was scored on a 5-point scale.

Table 2

Test-retest statistics for the STUQ items

STUQ item                          Inter-class correlation coefficient   Intra-class correlation coefficient   PA (%)

When athletes use
self-talk

Before a practice 0.65 0.66 33 (33%)
During a practice 0.67 0.66 47 (47%)
After a practice 0.56 0.54 34 (34%)
Away from a practice 0.66 0.62 46 (46%)

Use of the functions
of self-talk

Skill function 0.65 0.66 31 (31%)
Strategy function 0.69 0.69 29 (29%)
Psyching function 0.80 0.80 39 (39%)
Relaxation function 0.74 0.75 34 (34%)
Nerve control function 0.72 0.72 31 (31%)
Focusing function 0.64 0.64 30 (30%)
Self-confidence function 0.69 0.69 30 (30%)
Mental preparation function 0.76 0.76 37 (37%)
Coping function 0.74 0.74 31 (31%)
Motivation function 0.69 0.69 34 (34%)
Effort control function 0.63 0.62 21 (21%)
Goal function 0.62 0.60 25 (25%)

How athletes use self-talk

Before attempting skills 0.73 0.73 27 (27%)
During execution of skills 0.64 0.65 32 (32%)
Self-talk with imagery 0.58 0.60 27 (27%)
Self-talk with physical practice 0.58 0.59 20 (20%)
Self-talk alone 0.62 0.62 24 (24%)
Planned self-talk 0.70 0.67 40 (40%)
Consistent self-talk * 0.55 0.55 55 (54%)
Belief in self-talk * 0.64 0.64 71 (69%)

STUQ item                          PA ±1 (%)   PA ±2 (%)

When athletes use
self-talk

Before a practice 71 (70%) 90 (89%)
During a practice 76 (75%) 91 (90%)
After a practice 66 (65%) 88 (87%)
Away from a practice 78 (77%) 90 (89%)

Use of the functions
of self-talk

Skill function 64 (64%) 89 (88%)
Strategy function 75 (74%) 91 (90%)
Psyching function 75 (74%) 91 (90%)
Relaxation function 67 (66%) 87 (86%)
Nerve control function 66 (65%) 86 (85%)
Focusing function 62 (61%) 93 (92%)
Self-confidence function 67 (66%) 87 (86%)
Mental preparation function 73 (72%) 92 (91%)
Coping function 69 (68%) 88 (87%)
Motivation function 69 (68%) 89 (88%)
Effort control function 60 (59%) 83 (82%)
Goal function 60 (59%) 82 (81%)

How athletes use self-talk

Before attempting skills 68 (67%) 91 (90%)
During execution of skills 62 (61%) 88 (87%)
Self-talk with imagery 62 (61%) 88 (87%)
Self-talk with physical practice 68 (67%) 87 (86%)
Self-talk alone 65 (64%) 90 (89%)
Planned self-talk 71 (70%) 87 (86%)
Consistent self-talk * 93 (92%) 93 (92%)
Belief in self-talk * 97 (96%) 97 (96%)

Note. PA = proportion of agreement. * denotes that this item was scored on a 5-point Likert-type scale; accordingly, a reference value of ±1 was consistently employed.