Abstract: Utility itemset mining, which finds itemsets based on utility factors, has established itself as an essential form of data mining. The utility of an itemset is defined in terms of item quantity and an interest factor. Researchers have developed various methods to mine these itemsets, but most of them are not scalable. A scalable approach is now required to meet the growing needs of data mining. This paper proposes a novel Spark-based technique, called Absolute High Utility Itemset Mining (AHUIM), for mining the data in a distributed way. The technique is suitable for small as well as large datasets. Its performance is measured on parameters such as speed, scalability, and accuracy.
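To make the notion of utility concrete, the following is a minimal sketch, not the AHUIM algorithm itself. It assumes the common formulation in which an item's utility in a transaction is its quantity multiplied by a per-item interest factor (here a hypothetical unit profit), and an itemset's utility is summed over the transactions that contain every item of the set. The toy data, the function names, and the brute-force enumeration are all illustrative assumptions; a distributed Spark implementation would partition this work rather than enumerate every candidate on one machine.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]

# Hypothetical interest factor per item (assumed here to be unit profit).
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * interest factor over transactions containing the whole itemset."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force enumeration of itemsets whose utility reaches min_utility.

    Exponential in the number of distinct items; for illustration only.
    """
    items = sorted({item for tx in transactions for item in tx})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


if __name__ == "__main__":
    print(high_utility_itemsets(transactions, unit_profit, min_utility=15))
```

The exponential cost of this enumeration is the scalability problem the abstract refers to: the number of candidate itemsets doubles with every additional distinct item.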