期刊名称:Conference on European Chapter of the Association for Computational Linguistics (EACL)
出版年度:2017
卷号:2017
页码:681-687
语种:English
出版社:ACL Anthology
摘要:The majority of approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text. We argue that syntactic and discourse features play a significantly more prominent role than they were given in the past. We show that they achieve state-of-the-art performance in author and gender identification on a literary corpus while keeping the feature set small: the used feature set is composed of only 188 features and still outperforms the winner of the PAN 2014 shared task on author verification in the literary genre.