Journal: Journal of Theoretical and Applied Information Technology
Print ISSN: 1992-8645
Electronic ISSN: 1817-3195
Publication year: 2017
Volume: 95
Issue: 6
Publisher: Journal of Theoretical and Applied
Abstract: Tuberculosis is one of the leading causes of death from infectious disease worldwide. There is increasing evidence that the genetic diversity of Mycobacterium tuberculosis may have important clinical consequences, so combining clinical and genetic data with social and demographic data is crucial to understanding the epidemiology of this disease. To help doctors predict and diagnose tuberculosis, this research proposes two data mining models that could serve as decision support tools in clinics. To develop the models, a tuberculosis dataset of 265 patient records was collected from the Tropical area teaching hospital in Khartoum State, Sudan. The first model was built using classification algorithms. After intensive experiments, three classification algorithms were selected: Naïve Bayes, CN2, and Classification Tree. The model was implemented in the Orange application. After validation, Classification Tree was selected as the best algorithm, with accuracy 0.9358, sensitivity 0.9632, and specificity 0.8933. The second model was built using the K-means clustering algorithm and was also implemented in Orange. The results of the clustering model are discussed and evaluated. The two models could be used to cross-check a diagnosis and to predict tuberculosis.
Keywords: Tuberculosis; Data Mining; Classification; Clustering.
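
Note on the reported metrics: accuracy, sensitivity, and specificity are standard ratios over a binary confusion matrix. The following is a minimal sketch of how such figures are computed, not the authors' Orange workflow. The function name `diagnostic_metrics` and the counts in the example are hypothetical illustrations and are not taken from the paper's dataset.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: diseased patients correctly classified as diseased
    fn: diseased patients wrongly classified as healthy
    tn: healthy patients correctly classified as healthy
    fp: healthy patients wrongly classified as diseased
    """
    positives = tp + fn          # all truly diseased cases
    negatives = tn + fp          # all truly healthy cases
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
    }


# Hypothetical counts for illustration only (not the paper's results).
example = diagnostic_metrics(tp=90, fn=10, tn=40, fp=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity measures how many true tuberculosis cases a classifier catches, while specificity measures how many non-cases it correctly rules out; a diagnostic support tool is usually judged on both rather than on accuracy alone.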