Basic article information

Title: Automatic Textual Complexity Analysis of French Textbooks
Authors: Mihai Dascalu; Larise Lucia Stavarache; Stefan Trausan-Matu; et al.
Journal name: Romanian Conference on Human-Computer Interaction
Print ISSN: 2344-1690
Publication year: 2014
Issue: RoCHI
Pages: 55-60
Language: English
Publisher: Matrix ROM
Abstract: Research on automatic textual complexity assessment focuses mainly on English, and only a few adaptations currently exist for other languages. Starting from a validated model for English that addresses discourse analysis and textual complexity assessment, we introduce in this paper a textual complexity model for French trained on 200 documents extracted from French textbooks, pre-classified into five textual complexity classes. Analysis factors cover multiple dimensions and are grouped into the following main categories: surface, syntax, morphology, semantics and discourse analysis. These metrics are then combined through Support Vector Machines in order to improve prediction accuracy. At a global level, besides purely quantitative surface factors, parts of speech and different cohesion metrics have proved to be reliable predictors of the textual difficulty of school materials. Overall, our study creates a consistent background for building reliable models that automatically evaluate the complexity of French texts.