Article Information

Title: A Proficient System for Automatic Detection of Risk Level in Disease Detection using Association Rule Based DRF Algorithm
Authors: B. Gomathy; S. M. Ramesh; A. Shanmugam; et al.
Journal: International Journal of Advances in Soft Computing and Its Applications
Print ISSN: 2074-8523
Publication year: 2013
Volume: 5
Issue: 3
Publisher: International Center for Scientific Research and Studies
Abstract: Predicting heart disease, breast cancer, tumors, and other serious diseases is a challenging research problem. Current research in this area struggles to provide accurate solutions for predicting such deadly diseases. In this paper, a Discriminative Rule Framing (DRF) algorithm is proposed to analyze and predict the survivability of disease in a patient. Association rule mining is used to reveal hidden biological patterns and derive association rules from a huge medical data set. The initial rules generated through association rule mining, along with a subset of the data set's attributes, are given as input to the DRF risk analysis system to predict the risk level of a given data set. The significance of DRF is evaluated using the confidence, support, and lift metrics. Experimental results show that the predictions of DRF are more accurate than those of other existing algorithms.
Keywords: Association Rule Mining; Feature Matching; Risk Analysis; Convictional Measures; CART; Machine Learning.

1. Introduction

Discovering useful information in a huge data set is a tedious job. To assist in this discovery, a technique named data mining is used. It draws ideas from many disciplines, such as statistics, database systems, etc. Storing information in digitized form helps medical departments store and maintain patients' information in a database. Electronic storage of information is economically feasible, and this characteristic has stimulated modern medicine to generate an enormous amount of health care data. The information contained in a medical data set is interesting and useful for the diagnosis of diseases and for patient care.

Models for finding patterns in data can be designed using data mining. A classifier is needed in order to predict serious human diseases, and physicians nowadays use classifier models to diagnose diseases. Therefore, to analyze huge data sets, association rule mining is used to extract interesting associations, causal structures, correlations, frequent patterns, etc., indicating the relationship between the procedures performed on patients and the reports generated for diagnosis. Detecting the most threatening diseases, such as brain tumors and breast cancer, at an early stage increases the survival of patients. Extensive data analysis research has been carried out on detecting such diseases using different data mining algorithms.
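
The abstract names support, confidence, and lift as the metrics used to evaluate the rules. As a rough illustration of how these three standard measures are computed for a single association rule, the following Python sketch may help. It is not the DRF algorithm; the records, attribute names, and function names are hypothetical and do not come from the paper.

```python
from typing import Iterable, List, Set


def support(records: List[Set[str]], items: Iterable[str]) -> float:
    """Fraction of records that contain every item in `items`."""
    if not records:
        raise ValueError("records must not be empty")
    wanted = set(items)
    return sum(1 for record in records if wanted <= record) / len(records)


def confidence(records: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(records, antecedent)
    if base == 0:
        raise ValueError("antecedent never occurs in the records")
    return support(records, set(antecedent) | set(consequent)) / base


def lift(records: List[Set[str]],
         antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """confidence(rule) / support(consequent)."""
    cons = support(records, consequent)
    if cons == 0:
        raise ValueError("consequent never occurs in the records")
    return confidence(records, antecedent, consequent) / cons


# Hypothetical toy patient records (assumed for illustration only).
records = [
    {"high_bp", "smoker", "heart_disease"},
    {"high_bp", "heart_disease"},
    {"smoker"},
    {"high_bp", "smoker", "heart_disease"},
    {"high_bp"},
]

# Score the hypothetical rule {high_bp} -> {heart_disease}.
rule_support = support(records, {"high_bp", "heart_disease"})
rule_confidence = confidence(records, {"high_bp"}, {"heart_disease"})
rule_lift = lift(records, {"high_bp"}, {"heart_disease"})
print(rule_support, rule_confidence, rule_lift)
```

For this toy data the rule holds in 3 of 5 records (support about 0.6) and in 3 of the 4 records containing the antecedent (confidence about 0.75). A lift above 1 means the antecedent and consequent co-occur more often than they would if they were independent.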
Improved sensitivity and specificity increase the survival rate of patients and also decrease the workload of radiologists.

In this paper, an algorithm named DRF is introduced to determine the existence of a disease and its risk level. Initially, the data set is preprocessed using normalization techniques. This preprocessing enhances association rule mining so that, by assigning weights, it can discover medically significant rules from a huge collection of medical data sets. The DRF algorithm has two stages. In stage 1, class labels are framed from the preprocessed data set, and the base rule for the DRF algorithm is generated from them. With this approach, rule formation is based on user need, and it can be adapted to any kind of medical data set. The S-grid takes the base rule and frames a heuristic matrix. This matrix is easily accessible, and it plays a vital role in framing true and branch rules. For each item in the data set, an s-count value is calculated depending on the values in the heuristic matrix.

In the second stage of the algorithm, true and base rules are framed. Based on these rules, maximum and minimum values are calculated and the heuristic rate is estimated. Along with these values, a threshold is set to filter the items in a data set according to the requirements of risk analysis. Fig. 1 shows the flow of the DRF algorithm. Risk analysis is the tool most widely used by data mining methods for defining and analyzing undesirable events. Medical data sets usually hold millions of records. Analyzing such a collection of records manually is time-consuming, and it is difficult to process all such types of data. Therefore, a