Journal: Indian Journal of Computer Science and Engineering
Print ISSN: 2231-3850
Electronic ISSN: 0976-5166
Publication year: 2021
Volume: 12
Issue: 5
Pages: 1224-1237
DOI: 10.21817/indjcse/2021/v12i5/211205047
Language: English
Publisher: Engg Journals Publications
Abstract: The continued expansion of modern technology has made it possible to combine IoT and Artificial Intelligence. Data mining techniques have been applied to a large number of datasets generated by various sources in order to extract useful information and make predictions. The performance of any data mining technique may vary depending on the type of dataset used and the application area under consideration. Hence, finding the data mining technique that is most appropriate for a specific application domain and its associated datasets would be very advantageous. The main motivation for this study is that in most developing countries health data consists of conventional datasets, but with the increasing use of sensor-based technologies it becomes necessary to analyze sensor datasets as well. To meet this requirement, the performance of well-known data mining techniques for disease prediction was analyzed, including Decision Tree, K-Nearest Neighbors, Naïve Bayes, Support Vector Machine, Random Forest Tree, and Logistic Regression. The analysis used four datasets (Heart Disease, Breast Cancer, MIT_BIH Arrhythmia, and Activity Recognition) collected through smart sensors or clinical examination of patients. These datasets were used to evaluate the considered techniques and to select the most suitable one with high prediction accuracy. The results show that Random Forest Tree provides the best outcome for all the considered datasets. The performance of all the data mining techniques is compared extensively in terms of accuracy, precision, recall, and f1_score. Spyder IDE is used for the evaluation.
Keywords: Internet of Things (IoT); Data Mining Techniques (DMT); Disease diagnosis; Artificial Intelligence; Naïve Bayes (NB); Support Vector Machine (SVM); Random Forest Tree (RFT); Decision Tree (DT); K-Nearest Neighbors (KNN)
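
Note: the four evaluation measures named in the abstract (accuracy, precision, recall, and F1 score) can all be derived from confusion-matrix counts. The sketch below is a minimal pure-Python illustration for a binary task and is not the authors' code. The function name binary_metrics and the label lists are hypothetical, and returning 0.0 when a denominator is zero is an assumption; the record does not say how the paper handled that case or multi-class datasets.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the chosen positive label.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only; not data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

In a comparison like the one the abstract describes, the same function would be applied to each classifier's predictions on a held-out test split, so that all techniques are scored on identical data.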