Journal: Conference on European Chapter of the Association for Computational Linguistics (EACL)
Publication year: 2009
Volume: 2009
Publisher: ACL Anthology
Abstract: Sentence fluency is an important component of overall text readability, but few studies in natural language processing have sought to understand the factors that define it. We report the results of an initial study into the predictive power of surface syntactic statistics for the task; we use fluency assessments done for the purpose of evaluating machine translation. We find that these features are weakly but significantly correlated with fluency. Machine and human translations can be distinguished with accuracy over 80%. The performance of pairwise comparison of fluency is also very high, over 90% for a multi-layer perceptron classifier. We also test the hypothesis that the learned models capture general fluency properties applicable to human-written text. The results do not support this hypothesis: prediction accuracy on the new data is only 57%. This finding suggests that developing a dedicated, task-independent corpus of fluency judgments will be beneficial for further investigations of the problem.