Journal title: Indian Journal of Innovations and Developments
Print ISSN: 2277-5382
Online ISSN: 2277-5390
Publication year: 2016
Volume: 5
Issue: 4
Pages: 1-13
Language: English
Publisher: Indian Society for Education and Environment
Abstract: Objectives: The main objective of this work is to provide effective big data classification by analyzing the classification performance of existing machine learning algorithms. Methods/Statistical analysis: In MapReduce, after pre-processing and initial partitioning, features are selected by reducing the dimensionality using the SPARK ITFS technique, and classification is then performed at each node. The Extreme Learning Machine (ELM) is used in this work to provide efficient big data classification with reduced computation and network traffic cost. The work also aims to reduce overall memory consumption and time complexity. Findings: Various research works have been analyzed and evaluated. In the works analyzed, classification is performed using machine learning techniques such as Artificial Neural Networks (ANN), Gradient Boosting, Support Vector Machine (SVM), Random Forests, Naïve Bayes, etc. Although these techniques classify well, classification performance can be further enhanced by using more advanced machine learning approaches. The proposed Distributed Kernel-ELM (DK-ELM) gives better performance in terms of accuracy, precision, recall and response time than existing machine learning algorithms such as SVM, ANN and Partial Least Square-Discriminant Analysis (PLS-DA) used in partition-based aggregation methods. Application/Improvements: The findings of this work show that DK-ELM provides better results than the other approaches.
Keywords: Big Data Classification; Artificial Neural Networks; Support Vector Machine; Partial Least Square-Discriminant Analysis; Distributed Kernel-ELM.