
Article Information

  • Title: A Proposed Multi-Domain Approach for Automatic Classification of Text Documents
  • Authors: Abdelrahman M. Arab; Ahmed M. Gadallah; Akram Salah
  • Journal: International Journal on Soft Computing
  • Electronic ISSN: 2229-7103
  • Year of publication: 2017
  • Volume: 8
  • Issue: 1
  • Page: 1
  • DOI: 10.5121/ijsc.2017.8101
  • Publisher: Academy & Industry Research Collaboration Center (AIRCC)
  • Abstract: Classification is an important technique used in information retrieval. Supervised classification suffers from certain limitations concerning the collection and labeling of the training dataset. When facing Multi-Domain classification, multiple training datasets and classifiers are needed, which is relatively difficult. In this paper, an unsupervised classification system is proposed that can manage the Multi-Domain classification problem as well. It is a multi-domain system where each domain is represented by an ontology. A document is mapped onto each ontology based on the weights of the mutual tokens between them with the help of fuzzy sets, resulting in a mapping degree of the document with each domain. An experiment was carried out, showing satisfactory classification results with an improvement in the evaluation results of the proposed system compared to Apache Lucene.
  • Keywords: Information Retrieval; Ontology; Machine Learning; Document Classification; Fuzzy Sets