Article Information

  • Title: A Survey on Various Approaches in Document Clustering
  • Authors: K. Sathiyakumari; G. Manimekalai; V. Preamsudha
  • Journal: International Journal of Computer Technology and Applications
  • Electronic ISSN: 2229-6093
  • Year of publication: 2011
  • Volume: 2
  • Issue: 5
  • Pages: 1534-1539
  • Publisher: Technopark Publications
  • Abstract: Document clustering is the process of partitioning a collection of texts into subgroups whose members are similar in content. The purpose of document clustering is to meet human interests in information searching and understanding. Nowadays paper documents are increasingly kept in electronic form because of quicker access and smaller storage requirements, so retrieving relevant documents from a large database is a major issue. Text mining is not a standalone task that human analysts typically engage in. The goal is to transform text composed of everyday language into a structured, database format. In this way, heterogeneous documents are summarized and presented in a uniform manner. Among others, the challenging problems of document clustering are large volume, high dimensionality, and complex semantics.
  • Keywords: text mining; document clustering; information extraction