Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Publication year: 2016
Volume: 16
Issue: 12
Pages: 117-127
Publisher: International Journal of Computer Science and Network Security
Abstract: Many critical applications, such as medical diagnosis, text analysis, website phishing detection, and many others, need an automated tool to enhance the decision-making process. Employing association rules in the classification process is one data-mining technique for making critical decisions more accurately. This is known as the association classification (AC) technique. However, most AC algorithms are not scalable, as they are affected by the size of the dataset. Furthermore, the trade-off between an algorithm's accuracy and the time needed to build the model is critical: some AC algorithms achieve high accuracy but take a long time to build a model, while others build a model quickly but have low accuracy. To address these problems, this paper proposes the Fast Classification Based on Association Rules (FCBA) algorithm, which uses new internal and external pruning methods to generate association rules with an enhanced Apriori algorithm. We compare the proposed algorithm with four well-known AC algorithms, namely CBA, CMAR, MCAR and FACA, on 11 UCI datasets. Most of the datasets are medical, and they vary in size, which allows us to evaluate both the scalability and the accuracy of the algorithms. Our extensive experimental study shows that FCBA is more scalable than the other algorithms. In addition, FCBA outperforms them with regard to accuracy and the time taken to build the model. FCBA ranks first on 64% of the datasets and second on 36%, with an average model-building time of less than 0.01 seconds. Thus, it achieves the highest accuracy and the fastest average model-building time of the algorithms compared. On the medical datasets FCBA performs even better: it ranks first on 67% of the datasets and second on 33%, with an average time of less than 0.01 seconds.
Keywords: Data mining; Association Classification; Apriori; Medical diagnosis