Article Information

  • Title: Evaluation of Multivariate Outlier Detection Methods with Benchmark Medical Datasets
  • Authors: Zahra Nazari; Dongshik Kang
  • Journal: International Journal of Computer Science and Network Security
  • Print ISSN: 1738-7906
  • Year of publication: 2018
  • Volume: 18
  • Issue: 4
  • Pages: 36-43
  • Publisher: International Journal of Computer Science and Network Security
  • Abstract: Outliers are unusual data points that are inconsistent with the other observations in a dataset. Outlier detection methods have been researched in diverse application domains, and it has recently been recognized that there is a direct mapping between outliers in data and real-world anomalies. Outlier detection is important because outliers in data sometimes translate into significant information in a wide variety of application domains (Chandola et al. 2007). Several types of outlier detection methods have been developed, and a number of surveys and reviews have been carried out to distinguish their advantages and disadvantages. Outlier detection methods are highly domain oriented, so an evaluation is needed to find an appropriate one for the intended domain. In this study we evaluate widely used multivariate outlier detection methods, namely distance-based, statistical-based, and clustering-based methods, on medical datasets. Five benchmark medical datasets (Heart Disease, Breast Cancer, Pima Indian Diabetes, Liver Disorders, and Thyroid Gland) are used for the experiments. To assess the effectiveness of these outlier detection methods, the datasets are classified and their total variances are calculated before and after outlier detection. Eight well-known individual and ensemble classifiers are used for data classification. Finally, a comparative review is performed to distinguish the advantages and disadvantages of each method and their respective effects on classifier accuracy.
  • Keywords: Outlier Detection; Data Mining; Machine Learning; Data Clustering; Pattern Recognition
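
The abstract describes comparing total variance before and after outlier detection. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that "total variance" means the sum of per-feature sample variances (the trace of the covariance matrix), and it stands in a simple k-nearest-neighbour distance score for the distance-based method. The function names, the `k` and `n_sigma` parameters, and the toy data are all hypothetical and do not come from the paper.

```python
import math
import statistics


def total_variance(rows):
    """Sum of per-feature sample variances (trace of the covariance matrix)."""
    columns = list(zip(*rows))
    return sum(statistics.variance(col) for col in columns)


def knn_outlier_scores(rows, k=3):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(rows):
        dists = sorted(math.dist(p, q) for j, q in enumerate(rows) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def remove_outliers(rows, k=3, n_sigma=2.0):
    """Drop points whose score exceeds mean + n_sigma * stdev of all scores.

    The cutoff rule is an assumption made for this sketch.
    """
    scores = knn_outlier_scores(rows, k)
    cutoff = statistics.mean(scores) + n_sigma * statistics.stdev(scores)
    return [r for r, s in zip(rows, scores) if s <= cutoff]


# Toy data (hypothetical, not from the paper): a tight cluster plus one distant point.
data = [(1.0, 2.0), (1.1, 1.9), (0.9, 2.1), (1.2, 2.2), (0.8, 1.8),
        (1.0, 2.1), (1.1, 2.0), (0.9, 1.9), (10.0, 12.0)]

cleaned = remove_outliers(data)
print("total variance before:", total_variance(data))
print("total variance after: ", total_variance(cleaned))
```

On this toy data the distant point is dropped and the total variance falls sharply. The study additionally classifies each dataset before and after outlier detection, a step this sketch leaves out.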