Article Information

  • Title: Spam Outlier Detection in High Dimensional Data: Ensemble Subspace Clustering Approach
  • Authors: Suresh S. Kapare; Bharat A. Tidke
  • Journal: International Journal of Computer Science and Information Technologies
  • Electronic ISSN: 0975-9646
  • Publication year: 2015
  • Volume: 6
  • Issue: 3
  • Pages: 2326-2329
  • Publisher: TechScience Publications
  • Abstract: High-dimensional data is increasingly common in domains such as social networking sites, biomedical data, and sports. Many data sets are represented with hundreds or thousands of dimensions. As the number of dimensions grows, the "curse of dimensionality" makes traditional outlier detection methods inefficient. Increasing the dimensionality of data objects makes it difficult to find outliers, that is, points that do not fit in any group (cluster). Outlier detection has important applications in fraud detection, network robustness analysis, error elimination in scientific data, sports data analysis, and intrusion detection. Most such applications are high-dimensional domains in which the data can contain hundreds of dimensions. Spam can be link-based or content-based. This paper proposes ensemble subspace clustering as a paradigm for spam outlier detection in high-dimensional data sets. The proposed method divides the original high-dimensional data set into subspace clusters using a subspace clustering algorithm. An improved k-means algorithm is then used to find an outlier cluster, which is further merged with other clusters depending on a consensus function. An outlier cluster that does not merge with any other subspace cluster is called a final outlier.
  • Keywords: Outlier; high dimensional data; subspace; ensemble; clustering
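
The pipeline outlined in the abstract (split the data into subspaces, cluster each one, find a candidate outlier cluster, then combine the results) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random feature subsets stand in for the subspace clustering algorithm, plain k-means with farthest-first initialization stands in for the improved k-means, and a simple vote across subspaces stands in for the consensus function. All names and parameters (`ensemble_outliers`, `n_subspaces`, `subspace_size`, `min_votes`) are hypothetical.

```python
import random
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point.

    Assumption: farthest-first initialization, chosen here only so that
    isolated points tend to receive their own center.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def candidate_outliers(data, dims, k):
    """Cluster the projection onto `dims`; return indices of the smallest cluster."""
    projected = [[row[d] for d in dims] for row in data]
    labels = kmeans(projected, k)
    counts = Counter(labels)
    smallest = min(counts, key=counts.get)
    return {i for i, lab in enumerate(labels) if lab == smallest}


def ensemble_outliers(data, n_subspaces=15, subspace_size=3, k=3,
                      min_votes=0.6, seed=0):
    """Flag points that land in the candidate outlier cluster in at least
    a `min_votes` fraction of randomly chosen subspaces (a stand-in for
    the consensus function)."""
    rng = random.Random(seed)
    n_dims = len(data[0])
    votes = Counter()
    for _ in range(n_subspaces):
        dims = rng.sample(range(n_dims), subspace_size)
        votes.update(candidate_outliers(data, dims, k))
    return sorted(i for i, v in votes.items() if v / n_subspaces >= min_votes)
```

Calling `ensemble_outliers(data)` on a list of equal-length numeric rows returns the sorted indices of the rows flagged as outliers. The vote threshold is the main design choice in this sketch: a point that looks isolated in only a few subspaces is treated as absorbed by the regular clusters, loosely mirroring the idea of an outlier cluster that fails to merge with any other cluster.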