
Article Information

  • Title: Enhancing Text Clustering Using Concept-based Mining Model
  • Authors: Lincy Liptha R. ; Raja K. ; G.Tholkappia Arasu
  • Journal: International Journal of Electronics and Computer Science Engineering
  • Electronic ISSN: 2277-1956
  • Publication year: 2012
  • Volume: 1
  • Issue: 2
  • Pages: 550-556
  • Publisher: Buldanshahr : IJECSE
  • Abstract: Text mining techniques are mostly based on statistical analysis of a word or phrase. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in the same document while one contributes more to the meaning of the text than the other. Hence, the terms that capture the semantics of the text should be given more importance. Here, a new concept-based mining model is introduced. It analyses terms at the sentence, document, and corpus levels. The model consists of sentence-based concept analysis, which calculates the conceptual term frequency (ctf); document-based concept analysis, which finds the term frequency (tf); corpus-based concept analysis, which determines the document frequency (df); and a concept-based similarity measure. The ctf, tf, and df measures for a corpus are calculated by the proposed algorithm, called the Concept-Based Analysis Algorithm. With this model, web documents are clustered efficiently, and the quality of the resulting clusters significantly surpasses that of traditional single-term-based approaches.
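
The three levels of analysis named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Concept-Based Analysis Algorithm: the abstract does not say how concepts are extracted or how ctf is defined, so this sketch assumes that every lowercase word token counts as a "concept", that ctf is simply the count of a term within one sentence, and that sentences end at ".", "!" or "?". The function names (`tokenize`, `split_sentences`, `analyze_corpus`) and the sample corpus are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real concept extraction."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document):
    """Naive sentence splitter on '.', '!' and '?' (an assumption of this sketch)."""
    parts = re.split(r"[.!?]+", document)
    return [p.strip() for p in parts if p.strip()]


def analyze_corpus(documents):
    """Return (ctf, tf, df) for a list of document strings.

    ctf[d][s] : Counter of term counts in sentence s of document d
    tf[d]     : Counter of term counts in document d
    df        : Counter of how many documents contain each term
    """
    ctf, tf, df = [], [], Counter()
    for document in documents:
        # Sentence level: one Counter per sentence.
        sentence_counts = [Counter(tokenize(s)) for s in split_sentences(document)]
        # Document level: sum the sentence counters.
        doc_counts = Counter()
        for counts in sentence_counts:
            doc_counts.update(counts)
        ctf.append(sentence_counts)
        tf.append(doc_counts)
        # Corpus level: each distinct term counts once per document.
        df.update(doc_counts.keys())
    return ctf, tf, df


# Hypothetical two-document corpus for illustration only.
corpus = [
    "Text mining finds patterns. Mining text is useful.",
    "Clustering groups similar documents.",
]
ctf, tf, df = analyze_corpus(corpus)
```

Here `ctf[0][0]["mining"]` is the sentence-level count, `tf[0]["mining"]` the document-level count, and `df["mining"]` the number of documents containing the term. A concept-based similarity measure would combine these values, but the abstract gives no formula for it, so none is sketched here.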