Journal: Conference of the European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2009
Volume: 2009
Publisher: ACL Anthology
Abstract: This paper describes POS tagging experiments with semi-supervised training as an extension to the (supervised) averaged perceptron algorithm, first introduced for this task by Collins (2002). Experiments with iterative training on a standard-sized supervised (manually annotated) dataset (10^6 tokens) combined with a relatively modest amount (on the order of 10^8 tokens) of unsupervised (plain) data in a bagging-like fashion showed significant improvement on the POS classification task for typologically different languages, yielding better than state-of-the-art results for English and Czech (4.12% and 4.86% relative error reduction, respectively; absolute accuracies being 97.44% and 95.89%).
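
The relationship between the relative error reductions and the absolute accuracies quoted in the abstract can be sketched as a short calculation. This is only an illustration: the function names are hypothetical, and the baseline accuracies it derives are implied by the arithmetic rather than stated in the abstract.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline error removed by the new system.

    Both accuracies are percentages (e.g. 97.44).
    """
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc: float, rer: float) -> float:
    """Baseline accuracy (percent) implied by a new accuracy and a
    relative error reduction given as a fraction (e.g. 0.0412)."""
    new_err = 100.0 - new_acc
    return 100.0 - new_err / (1.0 - rer)


# Figures taken from the abstract; the baselines are derived, not quoted.
english_baseline = implied_baseline_accuracy(97.44, 0.0412)
czech_baseline = implied_baseline_accuracy(95.89, 0.0486)
print(f"Implied English baseline accuracy: {english_baseline:.2f}%")
print(f"Implied Czech baseline accuracy: {czech_baseline:.2f}%")
```

Under these assumptions, the implied earlier results are roughly 97.33% for English and 95.68% for Czech, which shows how a gain of about 0.1 to 0.2 percentage points in accuracy corresponds to a relative error reduction of 4 to 5%.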