Journal: International Journal of Applied Management and Technology
Publication year: 2008
Volume: 6
Issue: 2
Pages: 8
Publisher: Walden University
Abstract: The increasing use of computers leads to an accumulation of organizational data, which demands sophisticated data handling techniques. Many data handling concepts have evolved to support data analysis and knowledge discovery. Data warehouse and data mining techniques play an important role in data analysis for knowledge discovery. These techniques typically address four basic applications: data classification, data clustering, association between data items, and discovery of sequential patterns in the data. Various algorithms for classification on large data sets have proved efficient in classifying variables with known or certain characteristics. However, they are less effective when applied to the analysis of variables with unknown or uncertain characteristics, and when creating classes by combining multiple correlated variables in real-world data. This paper presents a methodology that addresses two major issues of data classification using decision trees: 1) classification of variables with unknown or uncertain characteristics, and 2) creation of classifications by combining multiple correlated variables.
Keywords: User Intervention; Unknown Characteristics; Known Characteristics; Data Warehouse; Data Mining; Decision Tree; Guillotine Cut; Oblique Tree; Entropy; Gain; Uncertainty Coefficient