
Basic Article Information

  • Title: An Efficient Approach for Storage of Big Data Streams in Distributed Stream Processing Systems
  • Authors: Sultan Alshamrani; Quadri Waseem; Abdullah Alharbi
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Publication year: 2020
  • Volume: 11
  • Issue: 5
  • DOI: 10.14569/IJACSA.2020.0110514
  • Publisher: Science and Information Society (SAI)
  • Abstract: Besides centralized management, processing, and querying, storage is one of the important components of big data management. There is always a large demand for storing immense volumes of heterogeneous data in different formats. In big data stream processing applications, storage is given priority and plays a major role in historical data analysis. During stream processing, some of the incoming data and intermediate results are a good source of future samples. These samples can be used in future evaluations to eliminate many of the mistakes made in storing and maintaining big data streams. Hence, a big data stream application requires efficient storage support for historical queries. Researchers, scientists, and academics are working to develop a sophisticated storage mechanism that keeps the most useful data for future reference by means of stream archive storage. However, a stream processing system cannot store the entire incoming stream for future reference. A technique is needed to discard expired data and free space in the archive storage for more incoming data. Hence, in view of the storage space limitation, integration issues, and the associated cost, we try to optimize the stream archive storage and free more space for future data. The proposed enhanced algorithm helps delete obsolete data (past retention or expired) and free space for new incoming data on a distributed platform. Our paper presents an Enhanced Time Expired Algorithm (ETEA) for stream archive storage in a distributed environment, which removes obsolete data based on time expiration and provides space for new incoming data for historical data analysis during skew time (hot spots). We also evaluated the efficiency of our algorithm using the skew factor. The experimental results show that our approach is 98% efficient and faster than other conventional techniques.
  • Keywords: Distributed stream databases; storage optimization; stream archive storage; time expiration
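
The abstract describes removing obsolete records from a stream archive based on time expiration. The record does not give the details of ETEA, so the following is only a minimal single-node sketch of generic time-based expiry, not the paper's algorithm. The class name `StreamArchive`, its `retention` parameter, and the `put`/`expire` methods are all hypothetical names chosen for illustration; the distributed and skew-handling aspects are not modeled.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class StreamArchive:
    """Toy archive (hypothetical) that drops records once they are
    at least `retention` time units old."""
    retention: float
    _heap: list = field(default_factory=list)  # (timestamp, seq, key), oldest first
    _data: dict = field(default_factory=dict)  # key -> (timestamp, value)
    _seq: int = 0                              # tie-breaker for equal timestamps

    def put(self, key, value, timestamp):
        """Store or overwrite a record with its arrival timestamp."""
        self._data[key] = (timestamp, value)
        heapq.heappush(self._heap, (timestamp, self._seq, key))
        self._seq += 1

    def expire(self, now):
        """Delete every record with timestamp <= now - retention.

        Returns the number of records actually removed.
        """
        cutoff = now - self.retention
        removed = 0
        while self._heap and self._heap[0][0] <= cutoff:
            ts, _, key = heapq.heappop(self._heap)
            current = self._data.get(key)
            # Skip stale heap entries left behind by an overwritten key.
            if current is not None and current[0] == ts:
                del self._data[key]
                removed += 1
        return removed

    def __len__(self):
        return len(self._data)
```

Keeping the timestamps in a min-heap means each expiry pass touches only the records that are actually due, rather than scanning the whole archive; a real system would more likely expire whole time-partitioned segments at once.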