
Article Information

  • Title: Vive la Différence! Text Mining Gender Difference in French Literature
  • Authors: Shlomo Argamon; Linguistic Cognition Lab; Dept. of Computer Science
  • Journal: DHQ
  • Print ISSN: 1938-4122
  • Publication year: 2009
  • Volume: 3
  • Issue: 02
  • Publisher: Alliance of Digital Humanities
  • Abstract: In this study, a corpus of 300 male-authored and 300 female-authored French literary and historical texts is classified for author gender using the Support Vector Machine (SVM) implementation SVMLight, achieving up to 90% classification accuracy. The sets of words that were most useful in distinguishing male and female writing are extracted from the support vectors. The results reinforce previous findings from statistical analyses of the same corpus, and exhibit remarkable cross-linguistic parallels with the results garnered from SVM models trained in gender classification on selections from the British National Corpus. It is found that female authors use personal pronouns and negative polarity items at a much higher rate than their male counterparts, and male authors demonstrate a strong preference for determiners and numerical quantifiers. Among the words that characterize male or female writing consistently over the time period spanned by the corpus, a number of cohesive semantic groups are identified. Male authors, for example, use religious terminology rooted in the church, while female authors use secular language to discuss spirituality. Such differences would take an enormous human effort to discover by a close reading of such a large corpus, but once identified through text mining, they frame intriguing questions which scholars may address using traditional critical analysis methods.