
Article information

  • Title: PREDICTIVE ANALYSIS OF LOCALITY-AWARE STORAGE-TIER DATA BLOCKS OVER HADOOP
  • Authors: NAWAB MUHAMMAD FASEEH QURESHI ; DONG RYEOL SHIN ; ISMA FARAH SIDDIQUI
  • Journal: Journal of Theoretical and Applied Information Technology
  • Print ISSN: 1992-8645
  • Electronic ISSN: 1817-3195
  • Year: 2017
  • Volume: 95
  • Issue: 12
  • Pages: 2658
  • Publisher: Journal of Theoretical and Applied
  • Abstract: The term Big Data analytics refers to large-scale solutions for managing giant datasets in a parallel environment. Hadoop is an ecosystem that processes large datasets in a distributed computing scenario. The ecosystem is divided into four sub-projects: HDFS, MapReduce, YARN and Hadoop Common. The Hadoop Distributed File System (HDFS) is the backbone of the ecosystem and helps store and process large datasets. HDFS was recently upgraded to a heterogeneous storage-tier environment that handles data block processing over multiple storage devices, i.e. DISK, SSD and RAM. The block placement policy dispatches data blocks to these devices without calculating I/O transfer parameters or considering locality. Moreover, HDFS selects random Datanodes, which may be located in the next rack and therefore have a longer path than nodes in the local rack. This increases data block processing latency and substantially delays replica management in the heterogeneous storage tier. To resolve this issue, we propose a predictive analysis that builds a locality-aware storage-tier node summary and predicts the nearest available storage tier for block job processing. The experimental evaluation shows that the proposed approach reduces data block transfer time overhead and replica transfer time overhead, and shortens node paths to an optimal accessibility over the cluster.
  • Keywords: Hadoop; HDFS; Locality-aware; network distance; storage-tier.
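
The abstract's core idea, choosing the nearest available storage tier by network distance, can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Candidate` record, the `TIER_RANK` ordering and the tie-breaking rule are assumptions introduced here for illustration. The distance function follows the usual tree-topology convention, counting hops from each node up to the closest common ancestor (0 for the same node, 2 within a rack, 4 across racks).

```python
from dataclasses import dataclass

# Hypothetical tier ordering (lower = faster); an assumption, not from the paper.
TIER_RANK = {"RAM": 0, "SSD": 1, "DISK": 2}


@dataclass(frozen=True)
class Candidate:
    path: str               # topology path, e.g. "/rack1/node3"
    tier: str               # one of "RAM", "SSD", "DISK"
    available: bool = True  # whether the tier can currently accept a block


def network_distance(a: str, b: str) -> int:
    """Hops from each node up to their closest common ancestor in the topology tree."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def nearest_tier(client: str, candidates):
    """Pick the closest available candidate; break distance ties by faster tier."""
    usable = [c for c in candidates if c.available]
    if not usable:
        return None
    return min(
        usable,
        key=lambda c: (network_distance(client, c.path), TIER_RANK[c.tier]),
    )


# Illustrative node summary (made-up paths, not experimental data).
summary = [
    Candidate("/rack2/node5", "RAM"),
    Candidate("/rack1/node2", "DISK"),
    Candidate("/rack1/node3", "SSD"),
    Candidate("/rack1/node1", "RAM", available=False),
]
choice = nearest_tier("/rack1/node1", summary)
print(choice.path, choice.tier)
```

In this sketch the rack-local SSD is preferred over the remote RAM tier because distance is compared first. A real placement policy would also weigh the I/O transfer parameters the abstract mentions, which are omitted here.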