
Basic article information

  • Title: Arabic Text Summarization Using Latent Semantic Analysis
  • Authors: Fadl Mutaher Ba-Alwi; Ghaleb H. Gaphari; Fares Nasser Al-Duqaimi
  • Journal: Current Journal of Applied Science and Technology
  • Print ISSN: 2457-1024
  • Year of publication: 2015
  • Volume: 10
  • Issue: 2
  • Pages: 1-14
  • Language: English
  • Publisher: Sciencedomain International
  • Abstract: The main objective of this paper is to address Arabic text summarization using the latent semantic analysis (LSA) technique. LSA is a vector-space semantic method for analyzing the relationships among a set of sentences. It provides both a word description and a sentence description for each concept or topic. LSA builds a word-by-sentence semantic matrix for a document or a set of documents. Each word in a matrix row can be represented by one of several word variations: root, stem, or original word. The root is empirically identified as the most effective word representation, yielding an F-score of 63% and an average ROUGE score of 48.5%. LSA is implemented with the root representation and different weighting techniques, and the optimal combination is then selected and used as the proposed summarizer for Arabic text summarization. The summarizer is then implemented again with the input documents pre-processed by a POS tagger. The performance and effectiveness of the summarizer are measured both manually and automatically in terms of summarization accuracy. Experimental results show that the summarizer obtains a higher level of accuracy as compared with a human summarizer. At a compression rate of 25%, it obtains an F-score of 68% and an average ROUGE score of 59% for Arabic text summarization.
  • Keywords: Text summarization; text mining; text extractive summary; text processing and NLP
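
The LSA pipeline outlined in the abstract (a word-by-sentence matrix, a decomposition into concepts, then sentence selection at a given compression rate) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `lsa_summarize`, the whitespace tokenizer (standing in for Arabic root extraction), the raw term-frequency weighting, the number of concepts `k`, and the sentence-scoring rule (length of each sentence vector in the singular-value-weighted concept space) are all assumptions chosen for illustration.

```python
import numpy as np


def lsa_summarize(sentences, compression_rate=0.25, k=2):
    """Pick sentences with a simple LSA score (illustrative sketch only)."""
    # Assumption: whitespace tokens stand in for roots/stems of Arabic words.
    tokenized = [s.lower().split() for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    row = {w: i for i, w in enumerate(vocab)}

    # Word-by-sentence matrix; raw term frequency is one possible weighting.
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[row[w], j] += 1.0

    # SVD: A = U @ diag(s) @ Vt. Row i of Vt describes every sentence
    # with respect to concept i; s[i] is that concept's strength.
    _, s, vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))

    # Assumed scoring rule: norm of each sentence's column in the
    # top-k concept space, weighted by the singular values.
    scores = np.sqrt(((s[:k, None] * vt[:k]) ** 2).sum(axis=0))

    # Keep roughly compression_rate of the sentences, at least one,
    # and return them in their original document order.
    n_keep = max(1, round(compression_rate * len(sentences)))
    keep = sorted(np.argsort(-scores)[:n_keep])
    return [sentences[i] for i in keep]
```

Calling `lsa_summarize(sentences, compression_rate=0.25)` on a list of eight sentences returns two of them. Swapping the tokenizer for a root extractor or changing the weighting only affects how the matrix `A` is filled.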