Journal: International Journal of Innovative Research in Computer and Communication Engineering
Print ISSN: 2320-9798
Electronic ISSN: 2320-9801
Publication year: 2019
Volume: 7
Issue: 5
Pages: 3122-3128
DOI: 10.15680/IJIRCCE.2019.0705087
Publisher: S&S Publications
Abstract: Data is being created from many sources across the globe, and most of it is unstructured,
leaving it unclear which data to keep and which to exclude from the repository. Retaining data in
every format produces big data, which creates a major problem for data maintenance and for
securing repositories such as data centers and cloud repositories. Here we provide a method to
identify which data should be kept and maintained and which data should be discarded from the repository.
For a weather prediction system, for example, past data is needed and arrives in different formats, but the
complete data set is not required; part of the data can be discarded permanently, which helps
researchers maintain the data efficiently. This scenario involves mining the repository according to our
requirements and applying machine learning techniques to identify which data is most valuable and needed for
research operations such as designing a prediction model and identifying a novel approach in a
specific domain. This research can be applied in any domain that holds large amounts of big data. The health
care domain is the platform used most often for mining activities and for identifying novel findings in the
real world. Resolving the big data issue in the medical domain leads to a good architecture for maintaining
data and also makes it possible to secure the data efficiently.