Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2011
Volume: 2011
Publisher: ACL Anthology
Abstract: The availability of learner corpora, especially
those which have been manually error-tagged
or shallow-parsed, is still limited. This means
that researchers do not have a common development
and test set for natural language processing
of learner English such as for grammatical
error detection. Given this background,
we created a novel learner corpus
that was manually error-tagged and shallow-parsed.
This corpus is available for research
and educational purposes on the web. In
this paper, we describe it in detail together
with its data-collection method and annotation
schemes. Another contribution of this
paper is that we take the first step toward
evaluating the performance of existing POS-tagging/chunking
techniques on learner corpora
using the created corpus. These contributions
will facilitate further research in related
areas such as grammatical error detection and
automated essay scoring.