首页    期刊浏览 2024年11月28日 星期四
登录注册

文章基本信息

  • 标题:Frequent itemset-based feature selection and Rider Moth Search Algorithm for document clustering
  • 本地全文:下载
  • 作者:Madhulika Yarlagadda ; K. Gangadhara Rao ; A. Srikrishna
  • 期刊名称:Journal of King Saud University @?C Computer and Information Sciences
  • 印刷版ISSN:1319-1578
  • 出版年度:2022
  • 卷号:34
  • 期号:4
  • 页码:1098-1109
  • 语种:English
  • 出版社:Elsevier
  • 摘要:Document clustering has recently been paid great attention in retrieval, navigation, and summarization of huge volumes of documents. With a better document clustering approach, computers can organize a document corpus automatically to a meaningful cluster for enabling efficient navigation, and browsing of the corpus. Document navigation and browsing is a valuable complement to the deficiencies of information retrieval technologies. This paper introduces Modsup-based frequent itemset and Rider Optimization-based Moth Search Algorithm (Rn-MSA) for clustering the documents. At first, the input documents are given to the pre-processing step, and then, the extraction is carried out based on TF-IDF and Wordnet features. Once the extraction is done, the feature selection is carried out based on frequent itemset for the establishment of feature knowledge. At last, the document clustering is done using the proposed Rn-MSA, which is designed by combining Rider Optimization Algorithm (ROA), and the Moth Search Algorithm (MSA). The performance of the document clustering based on proposed Modsup + Rn-MSA is evaluated in terms of precision, recall, F-Measure, and accuracy. The developed document clustering method achieves the maximal precision of 95.90%, maximal recall of 96.41%, maximal F-Measure of 96.41%, and the maximal accuracy of 95.12% that indicates its superiority.
国家哲学社会科学文献中心版权所有