Journal: International Journal of Computer Trends and Technology
Electronic ISSN: 2231-2803
Year: 2015
Volume: 22
Issue: 2
Pages: 53-63
DOI: 10.14445/22312803/IJCTT-V22P111
Publisher: Seventh Sense Research Group
Abstract: Clustering plays a major role in data mining: building models from an input data set, predicting future data trends to support decision making, simulating and analysing models, and diagnosing diseases. Currently, when an Artificial Intelligence (AI) technique is used for data preprocessing and classification in the diagnosis of diseases such as diabetes, prior knowledge of the clustered data is required. The lack of such prior knowledge has led to lower classification accuracy. In this work, a cascade of the KMeans clustering algorithm and an Artificial Neural Network (ANN) is proposed for clustering a diabetes dataset. The proposed model is implemented in two stages. In the first stage, KMeans clustering is used to preprocess the dataset after an initial filtering operation. In the second stage, the ANN is used to classify the preprocessed dataset. The cascaded model was applied to the Pima Indian Diabetes Dataset (PIDD), obtained from a public repository. Experimental results show that the KMeansANN model achieved an accuracy of 99.2%. Further analysis also revealed that the KMeansANN cascade outperformed the ANNKMeans cascade, establishing that the two cascaded models are not commutative: the order of the stages matters.
Keywords: Data mining; diabetes disease; Pima Indian Diabetes Dataset; ANN; K-means clustering; pre-processed data; classification
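
The two-stage cascade described in the abstract (KMeans preprocessing followed by ANN classification) can be illustrated with a minimal, standard-library-only sketch. Several parts are assumptions rather than details from the paper: the abstract does not say how the clustering output is used to preprocess the data, so this sketch assumes samples are kept only when their class label agrees with the majority label of their cluster; the data are synthetic two-dimensional points, not the PIDD; and the network size, learning rate, and all function names (`kmeans`, `filter_by_cluster`, `train_ann`) are hypothetical choices made for illustration.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a cluster index for each point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def filter_by_cluster(points, labels, assign, k):
    """Assumed preprocessing rule: keep samples matching their cluster's majority label."""
    majority = {}
    for c in range(k):
        cluster_labels = [y for y, a in zip(labels, assign) if a == c]
        if cluster_labels:
            majority[c] = max(set(cluster_labels), key=cluster_labels.count)
    return [(p, y) for p, y, a in zip(points, labels, assign) if majority[a] == y]


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_ann(samples, hidden=3, epochs=200, lr=0.5, seed=0):
    """One-hidden-layer network trained by stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]  # +1 for bias
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, y in samples:
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
            hb = h + [1.0]
            out = sigmoid(sum(w * v for w, v in zip(w2, hb)))
            delta_out = out - y  # cross-entropy gradient at the output pre-activation
            # Update the hidden layer first so it uses the pre-update output weights.
            for j in range(hidden):
                delta_h = delta_out * w2[j] * h[j] * (1 - h[j])
                for i in range(n_in + 1):
                    w1[j][i] -= lr * delta_h * xb[i]
            for j in range(hidden + 1):
                w2[j] -= lr * delta_out * hb[j]

    def predict(x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1] + [1.0]
        return 1 if sigmoid(sum(w * v for w, v in zip(w2, hb))) >= 0.5 else 0

    return predict


# Synthetic stand-in data: two overlapping Gaussian classes (not the PIDD).
rng = random.Random(42)
points, labels = [], []
for _ in range(100):
    y = rng.randint(0, 1)
    centre = 3.0 * y
    points.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)))
    labels.append(y)

assign = kmeans(points, k=2)                           # stage 1: cluster
kept = filter_by_cluster(points, labels, assign, k=2)  # stage 1: filter
predict = train_ann(kept)                              # stage 2: classify
accuracy = sum(predict(p) == y for p, y in kept) / len(kept)
print(f"kept {len(kept)} of {len(points)} samples; training accuracy {accuracy:.3f}")
```

Because the filter discards samples that disagree with their cluster, accuracy measured on the retained samples is likely to be higher than accuracy on the unfiltered data, which is worth keeping in mind when comparing figures from cascaded models of this kind. Reversing the stages (classifying first, then clustering) would be a different pipeline, consistent with the abstract's observation that the cascade is not commutative.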