
Article Information

  • Title: A Survey on Concept Based Mining Model using Various Clustering Techniques
  • Authors: J. Durga; D. Sunitha; S.P. Narasimha
  • Journal: International Journal of Advanced Research In Computer Science and Software Engineering
  • Print ISSN: 2277-6451
  • Electronic ISSN: 2277-128X
  • Publication year: 2012
  • Volume: 2
  • Issue: 4
  • Publisher: S.S. Mishra
  • Abstract: Most common techniques in text mining are based on the statistical analysis of a term, either a word or a phrase. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one contributes more to the meaning of its sentences than the other. A new concept-based mining model that analyzes terms on the sentence, document, and corpus levels is introduced. The concept-based mining model can effectively discriminate between terms that are unimportant with respect to sentence semantics and terms that hold the concepts representing the sentence meaning. The proposed mining model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed on the sentence, document, and corpus levels rather than by the traditional analysis of the document only. The proposed model can efficiently find significant matching concepts between documents, according to the semantics of their sentences. The similarity between documents is calculated with a new concept-based similarity measure, which takes full advantage of the concept analysis measures on the sentence, document, and corpus levels. The experiments provide an extensive comparison between the concept-based analysis and the traditional analysis. Experimental results demonstrate a substantial enhancement of clustering quality using the sentence-based, document-based, corpus-based, and combined-approach concept analysis.
  • Keywords: concept-based mining model; sentence-based; document-based; corpus-based; concept analysis; conceptual term frequency; concept-based similarity