期刊名称:International Journal of Advanced Computer Science and Applications(IJACSA)
印刷版ISSN:2158-107X
电子版ISSN:2156-5570
出版年度:2018
卷号:9
期号:12
DOI:10.14569/IJACSA.2018.091232
出版社:Science and Information Society (SAI)
摘要:The raw material of our paper is a well known and commonly used type of supervised algorithms: decision trees. Using a training data, they provide some useful rules to classify new data sets. But a data set with missing values is always the bane of a data scientist. Even though decision tree algorithms such as ID3 and C4.5 (the two algorithms with which we are working in this paper) represent some of the simplest pattern classification algorithms that can be applied in many domains, but with the drawback of missing data the task becomes harder because they may have to deal with unknown values in two major steps: at training step and at prediction step. This paper is involved in the processing step of databases using trees already constructed to classify the objects of these data sets. It comes with the idea to overcome the disturbance of missing values using the most famous and the central concept of the game theory approach which is the Nash equilibrium.
关键词:Decision tree; ID3; C4.5; missing data; game theory; Nash equilibrium