首页    期刊浏览 2025年06月21日 星期六
登录注册

文章基本信息

  • 标题:Determining Extractive Summary for a Single Document Based on Collaborative Filtering Frequency Prediction and Mean Shift Clustering
  • 本地全文:下载
  • 作者:Ahmed M. El-Refaiy ; Ahmed R. Abas ; Ibrahim M. El-Henawy
  • 期刊名称:IAENG International Journal of Computer Science
  • 印刷版ISSN:1819-656X
  • 电子版ISSN:1819-9224
  • 出版年度:2019
  • 卷号:46
  • 期号:3
  • 页码:494-505
  • 出版社:IAENG - International Association of Engineers
  • 摘要:This paper presents a new unsupervised algorithm for determining extractive summary for a single document using term frequency prediction, which is obtained from memory-based collaborative filtering (CF) approach, and Mean Shift Clustering algorithm. The new algorithm uses Term-Sentence Collaborative Filtering (TSCF) for predicting term frequency. These term frequencies are used in sentence ranking according to the presence percentage of each word/term in each sentence. TSCF computes term frequencies for either terms present or missing (sparse) in a sentence via collaborative filtering prediction algorithm. The new algorithm uses Mean Shift Clustering algorithm as a final framework to group sentences according to their ranks to get more coherent summaries. Experiments show the effect of using different weighting functions including: Term Frequency (TF), Term Frequency Inverse Document Frequency (TFIDF) and binary TF. In addition, they show the effect of using different distance metrics that support sparse matrices representations including: Cosine, Euclidean and Manhattan. Experiments also, show the effect of using L1 and L2 normalization. ROUGE is used as a fully automatic metric in text summarization on DUC2002 datasets. Results show ROUGE-1, ROUGE-2, ROUGE-L and ROUGE-SU4 average recall, precision and f-measure scores, which show the effectiveness of the new algorithm. Results show that the proposed TSCF algorithm has promising results and outperforms related baseline techniques in many ROUGE scores.
  • 关键词:Extractive Text Summarization; Collaborative Filtering Prediction; Term frequency; Information retrieval; Mean Shift Clustering
国家哲学社会科学文献中心版权所有