Abstract: Using generalizability theory (G-theory) as a theoretical framework, this study examined the variability and reliability of classroom instructors’ analytic assessments of EFL writing by undergraduate students at a Turkish university. Ninety-four EFL papers written by Turkish-speaking students in a large-scale classroom-based English proficiency exam were scored analytically by three EFL raters on two assessment categories (communicative level and linguistic accuracy level). The results showed great rater variation. The variance component for scoring categories (c) accounted for 7.25% of the total score variance, suggesting that part of the difference in writing scores could be attributed to the scoring category itself. Further, the dependability coefficient was .53 for the current scenario, and even when the number of raters was increased to 10, the dependability coefficient was only .79. This variability had a tremendous impact on the reliability of analytic scoring of EFL papers. The findings provide evidence that classroom teachers should be appropriately trained to score EFL compositions. Important implications are discussed.
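The relationship between the number of raters and the dependability coefficient can be illustrated with a minimal sketch. It is not the study's actual D-study, which would use the estimated variance components (not given in the abstract). It assumes, purely for illustration, that all absolute error variance is rater-linked and therefore shrinks in proportion to 1/n when scores are averaged over n raters. The function name `project_dependability` and its parameters are hypothetical.

```python
def project_dependability(phi_observed: float, n_observed: int, n_target: int) -> float:
    """Project a dependability coefficient to a different number of raters.

    Assumption (not from the source): the coefficient has the form
    phi = var_person / (var_person + var_error / n_raters),
    i.e. every absolute-error component is divided by the number of raters.
    """
    if not 0.0 < phi_observed < 1.0:
        raise ValueError("phi_observed must lie strictly between 0 and 1")
    if n_observed < 1 or n_target < 1:
        raise ValueError("rater counts must be positive integers")

    # Single-rater error variance expressed relative to person variance:
    # phi = 1 / (1 + ratio / n)  =>  ratio = n * (1 - phi) / phi
    error_ratio = n_observed * (1.0 - phi_observed) / phi_observed

    # Re-evaluate the same formula with the target number of raters.
    return 1.0 / (1.0 + error_ratio / n_target)


if __name__ == "__main__":
    # Values reported in the abstract: phi = .53 with three raters.
    for n_raters in (3, 5, 10):
        phi = project_dependability(0.53, 3, n_raters)
        print(f"raters={n_raters:2d}  projected dependability={phi:.2f}")
```

Under this simplifying assumption, the projection from the reported .53 with three raters to ten raters rounds to the reported .79. A full analysis would treat the category facet and its interactions separately.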