Article Information

  • Title: Retrieval Models for Genre Classification
  • Authors: Stein, Benno; Eissen, Sven Meyer zu
  • Journal: Scandinavian Journal of Information Systems
  • Year: 2008
  • Volume: 20
  • Issue: 1
  • Page: 3
  • Publisher: Association for Information Systems
  • Abstract: Genre provides a characterization of a document with respect to its form or functional trait. Genre is orthogonal to topic, rendering genre information a powerful filter technology for information seekers in digital libraries. However, an efficient means for genre classification is an open and controversially discussed issue. This paper gives an overview and presents new results related to automatic genre classification of text documents. We present a comprehensive survey which contrasts the genre retrieval models that have been developed for Web and non-Web corpora. With the concept of genre-specific core vocabularies the paper provides an original contribution related to computational aspects and classification performance of genre retrieval models: we show how such vocabularies are acquired automatically and introduce new concentration measures that quantify the vocabulary distribution in a sensible way. Based on these findings we construct lightweight genre retrieval models and evaluate their discriminative power and computational efficiency. The presented concepts go beyond the existing utilization of vocabulary-centered, genre-revealing features and open new possibilities for the construction of genre classifiers that operate in real-time.