Journal: International Journal of Electronics, Communication and Soft Computing Science and Engineering
Print ISSN: 2277-9477
Year of publication: 2015
Volume: 4
Issue: Special 3
Publisher: IJECSCSE
Abstract: Big Data concerns large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and data collection capacity, Big Data is now rapidly expanding in all domains of engineering and science, including the physical, biological and biomedical sciences. New tools and new algorithms are needed to deal with this huge amount of data. Advances in hardware and software technologies have enabled us to collect, store and distribute data on a very large scale. The process of discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. Mining Big Data has opened many new challenges and opportunities. Existing data mining techniques face great difficulties when they are required to handle the unprecedented heterogeneity, volume, variety, speed, privacy, accuracy and trust issues that come with big data and big data mining. Extracting knowledge in the form of patterns from these massive, growing data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches. Researchers have suggested various architectures, which mainly focus on mining big data once it is stored in data repositories. This paper gives a conceptual model of a parallel data mining architecture for big data. The suggested architecture processes the big data or data stream using a parallel data mining scheme prior to storage in data repositories.
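The idea of mining a data stream in parallel before it reaches the repository can be illustrated with a minimal Python sketch. It is not the architecture from the paper, which the abstract does not detail. All names here (`chunked`, `mine_chunk`, `mine_then_store`, the list used as a repository) are hypothetical, simple item-frequency counting stands in for the mining step, and a thread pool stands in for whatever parallel workers a real system would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def chunked(stream, size):
    """Yield successive lists of at most `size` records from an iterable stream."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def mine_chunk(chunk):
    """Hypothetical mining step: count in how many records each item occurs."""
    counts = Counter()
    for record in chunk:
        counts.update(set(record))  # count each item once per record
    return counts


def mine_then_store(stream, repository, chunk_size=2, workers=4):
    """Mine chunks in parallel, merge the partial results, then store the records.

    For simplicity this sketch buffers all chunks in memory; a real stream
    processor would work on a bounded window instead.
    """
    chunks = list(chunked(stream, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(mine_chunk, chunks))
    patterns = Counter()
    for partial in partials:
        patterns.update(partial)  # merge per-chunk counts
    for chunk in chunks:
        repository.extend(chunk)  # storage happens only after mining
    return patterns


if __name__ == "__main__":
    records = [["a", "b"], ["a", "c"], ["b", "c"], ["a"]]
    repo = []
    print(mine_then_store(records, repo).most_common())
```

Item counts are used because partial results from independent chunks can be merged by simple addition, which is what makes the per-chunk work easy to run in parallel. Other mining tasks may need a more involved merge step.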