Article Information

  • Title: Text Mining: Pattern Extraction and Classification (Data management and Distribution)
  • Authors: B Sankara Babu ; Dr. K. Rajasekhara Rao
  • Journal: International Journal of Innovative Research in Science, Engineering and Technology
  • Print ISSN: 2347-6710
  • Online ISSN: 2319-8753
  • Publication year: 2016
  • Volume: 5
  • Issue: 6
  • Page: 10064
  • DOI: 10.15680/IJIRSET.2015.0506105
  • Publisher: S&S Publications
  • Abstract: Data mining refers to the process of retrieving knowledge by discovering novel and relevant patterns from large datasets. Clustering and classification are two distinct phases in data mining that work to provide an established, proven structure from a voluminous collection of facts. In this paper, our focus is to analyze clusters of documents obtained via unsupervised clustering techniques and to compare the performance of classification algorithms on the documents. A cluster is a group of objects that belong to the same class. In other words, similar objects are grouped in one cluster and dissimilar objects are grouped in another cluster using the k-means algorithm. Classification is the task of assigning instances to predefined classes. We have a training set containing data that have been previously categorized, and based on this training set the algorithms find the category to which new data points belong, using the secure hashing algorithm. The k-means algorithm is used for classification and the SHA-256 algorithm is used to protect the data securely as a digital hash code.
  • Keywords: Data mining; Clustering and Classification; K-means algorithm; SHA-256 algorithm
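
The abstract names two building blocks, k-means grouping and SHA-256 hashing, without giving details of how the paper combines them. The following is only a minimal sketch of each piece, not the authors' implementation. It assumes 2-D points as a stand-in for document feature vectors and hand-picked initial centroids; the names `kmeans`, `sha256_hex`, and the toy data are hypothetical and chosen for illustration.

```python
import hashlib


def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points; `centroids` holds the initial centers."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, members in zip(centroids, clusters):
            if members:
                n = len(members)
                updated.append((sum(m[0] for m in members) / n,
                                sum(m[1] for m in members) / n))
            else:
                updated.append(c)  # keep an empty cluster's center in place
        if updated == centroids:
            break  # converged: assignments can no longer change
        centroids = updated
    return centroids, clusters


def sha256_hex(text):
    """Return the SHA-256 digest of `text` as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Toy data (assumed): two visibly separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, [(1.0, 1.0), (8.0, 8.0)])

# Hash a textual form of the first cluster so later changes can be detected.
digest = sha256_hex(repr(clusters[0]))
print(centroids)
print(digest)
```

Note that a hash of this kind only makes tampering detectable: SHA-256 is a one-way digest, so the original data cannot be recovered from it.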