
Article information

  • Title: Addressing the Problem of Coherence in Automatic Text Summarization: A Latent Semantic Analysis Approach
  • Author: Abdulfattah Omar
  • Journal: International Journal of English Linguistics
  • Print ISSN: 1923-869X
  • Electronic ISSN: 1923-8703
  • Year of publication: 2017
  • Volume: 7
  • Issue: 4
  • Page: 33
  • DOI: 10.5539/ijel.v7n4p33
  • Publisher: Canadian Center of Science and Education
  • Abstract:

    This article addresses the problem of coherence in the automatic summarization of prose fiction texts. Despite continuing advances in summarization theory, applications, and industry, many problems remain unresolved in relation to the application of summarization theory to literature. This can be attributed in part to the peculiar nature of literary texts, which are not amenable to standard summarization processes. This study therefore seeks to bridge the gap between literature and summarization theory by proposing a summarization system based on semantic approaches for extracting more meaningful and coherent summaries. Given that a lack of coherence within summaries has negative implications for understanding the original texts, it follows that more effective methods should be developed for extracting coherent summaries. To this end, a hybrid of statistical (TF-IDF) and semantic (Latent Semantic Analysis, LSA) methods was used to derive the most distinctive features and extract summaries from 10 English novellas. For evaluation purposes, both intrinsic and extrinsic methods are used to determine the quality of the extracted summaries. Results indicate that integrating LSA into feature extraction methods achieves better summarization performance in terms of the coherence of the extracted summaries.
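The abstract names TF-IDF and LSA but does not describe the pipeline itself, so the following is only a minimal sketch of how the two might be combined for extractive summarization, not the author's system. It assumes that each sentence is treated as one "document" for TF-IDF, that LSA is approximated by the leading eigenpairs of the sentence Gram matrix (found by power iteration), and that sentences are ranked by their length in the latent space. All function names and parameters (`tfidf_rows`, `leading_eigenpairs`, `summarize`, `n_keep`, `k`) are hypothetical.

```python
import math
import random
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def tfidf_rows(sentences):
    """Return one TF-IDF row per sentence (assumption: sentence = document)."""
    docs = [Counter(tokenize(s)) for s in sentences]
    vocab = sorted({term for doc in docs for term in doc})
    n = len(docs)
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    # Smoothed IDF, so terms that occur in every sentence keep a small weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return [[doc[t] * idf[t] for t in vocab] for doc in docs]


def leading_eigenpairs(G, k, iters=300):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration + deflation."""
    n = len(G)
    G = [row[:] for row in G]  # work on a copy; deflation mutates the matrix
    rng = random.Random(0)     # fixed seed keeps the sketch deterministic
    pairs = []
    for _ in range(min(k, n)):
        v = [rng.random() + 0.1 for _ in range(n)]
        lam = 0.0
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                lam = 0.0
                break
            v = [x / norm for x in w]
            lam = norm  # for a unit vector v, |G v| converges to the eigenvalue
        if lam < 1e-12:
            break
        pairs.append((lam, v))
        for i in range(n):  # remove the component that was just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs


def summarize(sentences, n_keep=2, k=2):
    """Keep the n_keep sentences with the largest norm in the k-dim latent space."""
    A = tfidf_rows(sentences)
    n = len(A)
    # Gram matrix A A^T: its eigenvalues are the squared singular values of A,
    # and its eigenvectors are the sentence-side singular vectors.
    G = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(n)] for i in range(n)]
    pairs = leading_eigenpairs(G, k)
    scores = [math.sqrt(sum(lam * u[j] ** 2 for lam, u in pairs)) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: -scores[j])[:n_keep]
    return [sentences[j] for j in sorted(ranked)]  # restore original order
```

Returning the selected sentences in their original order is a deliberate choice in this sketch: it keeps the narrative sequence intact, which is one simple way to avoid making the extract less coherent than the source. A production system would more likely use a linear algebra library for the decomposition.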

All rights reserved by the National Center for Philosophy and Social Sciences Documentation