Article Information

  • Title: AN ENHANCED EXTRACTIVE TEXT SUMMARIZATION METHOD FOR MULTIPLE DOCUMENTS
  • Authors: ADIBA MAHJABIN NITU ; MD. PALASH UDDIN ; PRIYANKA BASAK TUMPA
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2019
  • Volume: 97
  • Issue: 23
  • Pages: 3475-3485
  • Publisher: Journal of Theoretical and Applied
  • Abstract: Text summarization has become important for extracting required information in a short time. Several extractive summarization techniques have been developed for English text, but little work has been done on summarizing Bengali text. This paper proposes an improved extractive Bengali text summarization technique that enhances the word scoring process, the position value heuristic, and the summary generation procedure of our previously presented summarizer. In the word scoring procedure, each word is preprocessed using noise removal, tokenization, stop word removal, and stemming. A heuristic is then applied to calculate each word's score by checking the word across all input documents. For sentence scoring, a modified heuristic gives the highest priority to the middle sentence and lower priority to the sentences above and below it. Finally, the top k sentences are extracted from each cluster of sentences produced by the K-means clustering algorithm, and the extracted sentences are sorted by their order of appearance in the original document(s), so the final summary is synchronized with the original document(s). Experimental results show that the proposed technique produces better summaries for end users than the existing method.
  • Keywords: Text Summarization; Extractive Summarization; Bengali Text Summarization; Heuristics; Synchronized Summary
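
The pipeline outlined in the abstract (word scoring, a position heuristic favoring the middle sentence, top-k selection per cluster, and restoring original order) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the linear position weighting, the frequency-based word score, and all function names here are assumptions for illustration. Tokenization is plain whitespace splitting (no Bengali stemming or noise removal), the input is treated as a single document, and cluster labels are assumed to come from any K-means implementation rather than being computed here.

```python
from collections import Counter, defaultdict


def position_weights(n):
    """Assumed heuristic: weight 1.0 at the middle, decreasing linearly outward."""
    mid = (n - 1) / 2
    return [1.0 - abs(i - mid) / (mid + 1) for i in range(n)]


def word_scores(sentences, stop_words=frozenset()):
    """Assumed word score: raw frequency across all input sentences."""
    counts = Counter()
    for s in sentences:
        counts.update(w for w in s.lower().split() if w not in stop_words)
    return counts


def sentence_scores(sentences, stop_words=frozenset()):
    """Sentence score = position weight * sum of its word scores."""
    scores = word_scores(sentences, stop_words)
    weights = position_weights(len(sentences))
    return [
        weights[i] * sum(scores[w] for w in s.lower().split() if w not in stop_words)
        for i, s in enumerate(sentences)
    ]


def summarize(sentences, labels, k=1, stop_words=frozenset()):
    """Pick the top-k sentences per cluster, then restore original order.

    `labels[i]` is the cluster id of sentence i (assumed to come from K-means).
    """
    scored = sentence_scores(sentences, stop_words)
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        by_cluster[label].append(idx)
    chosen = []
    for members in by_cluster.values():
        members.sort(key=lambda i: scored[i], reverse=True)
        chosen.extend(members[:k])
    # Sorting by index keeps the summary in the source document's order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    toy = ["a b", "a a c", "b d", "a e"]
    print(summarize(toy, labels=[0, 0, 1, 1], k=1))
```

Sorting the selected indices before emitting the sentences is what keeps the summary "synchronized" with the source in the sense the abstract describes; for multiple documents, the position weights would presumably be computed per document before clustering.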