
Article information

  • Title: MISSING VALUE IMPUTATION USING FUZZY POSSIBILISTIC C MEANS OPTIMIZED WITH SUPPORT VECTOR REGRESSION AND GENETIC ALGORITHM
  • Authors: P.SARAVANAN ; P.SAILAKSHMI
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year of publication: 2015
  • Volume: 72
  • Issue: 1
  • Publisher: Journal of Theoretical and Applied
  • Abstract: High-quality data mining results require high-quality input data, so missing values in a data set should be estimated to improve data quality. This makes efficient imputation methods important. If values are Missing At Random (MAR), they can be estimated from the available data. The proposed system estimates them using a combination of the fuzzy c-means and possibilistic c-means algorithms. This combines the advantages of fuzzy c-means, in which a data point can belong to more than one cluster and which therefore handles overlapping data well, with those of possibilistic c-means, which handles noisy data effectively. The proposed system considers both the membership function and the typicality of the data. The fuzzy-possibilistic c-means method is optimized using a Genetic Algorithm with Support Vector Regression (SVRGA), whose main purpose is to minimize the error. The Support Vector Regression model must be trained on complete records, and the Genetic Algorithm selects new parameters from the existing population. If the error is found to be minimal, the parameters are assumed to be optimized and the data set no longer contains incomplete records. Otherwise, the missing values are estimated again using fuzzy-possibilistic c-means clustering with the new parameters. The system is tested on two real-world data sets, Iris and marine db, at various standard missing ratios. Performance is measured using Root Mean Square Error (RMSE) and compared with a competing method. The graphs show that the proposed system performs well.
  • Keywords: Missing Value Imputation; Fuzzy Possibilistic C Means; Support Vector Regression; Genetic Algorithm; Multiple Imputation
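
The clustering-based estimation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fpcm_impute` and `rmse`, the column-mean starting fill, the membership-weighted estimate for missing entries, and the fixed parameters (`n_clusters`, `m`, `eta`) are all assumptions made for illustration. The SVR and Genetic Algorithm tuning loop is omitted entirely, so the parameters here are simply held constant.

```python
import numpy as np


def fpcm_impute(X, n_clusters=3, m=2.0, eta=2.0, n_iter=50, seed=0):
    """Fill NaN entries of X with a simplified fuzzy-possibilistic c-means loop.

    Hypothetical helper: parameters are fixed here, not tuned by SVR/GA.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Assumption: start from the column means of the observed entries.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    # Initialise cluster centers from randomly chosen records.
    centers = filled[rng.choice(len(filled), n_clusters, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distances, shape (n_records, n_clusters).
        d2 = ((filled[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        # Memberships: for each record, they sum to 1 across clusters.
        u = d2 ** (-1.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)
        # Typicalities: for each cluster, they sum to 1 across records.
        t = d2 ** (-1.0 / (eta - 1.0))
        t /= t.sum(axis=0, keepdims=True)
        # Centers are weighted by both membership and typicality.
        w = u ** m + t ** eta
        centers = (w.T @ filled) / w.sum(axis=0)[:, None]
        # Assumption: re-estimate missing entries as a membership-weighted
        # mix of the cluster centers; observed entries are never changed.
        estimate = u @ centers
        filled[missing] = estimate[missing]
    return filled


def rmse(true_values, estimates):
    """Root mean square error between two equally shaped arrays."""
    diff = np.asarray(true_values, dtype=float) - np.asarray(estimates, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

To evaluate such a sketch in the way the abstract describes, one would hide a known fraction of entries in a complete data set, call `fpcm_impute` on the masked copy, and compute `rmse` between the hidden true values and their estimates.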