Article information

  • Title: Impact of Data Compression on the Performance of Column-oriented Data Stores
  • Authors: Tsvetelina Mladenova; Yordan Kalmukov; Milko Marinov
  • Journal: International Journal of Advanced Computer Science and Applications (IJACSA)
  • Print ISSN: 2158-107X
  • Online ISSN: 2156-5570
  • Year: 2021
  • Volume: 12
  • Issue: 7
  • DOI: 10.14569/IJACSA.2021.0120747
  • Language: English
  • Publisher: Science and Information Society (SAI)
  • Abstract: Compressing data in traditional relational database management systems significantly improves system performance: smaller data takes less time to transfer across the communication environment and makes I/O operations more efficient. Column-oriented database management systems should perform even better, since each attribute is stored in a separate column and its successive values are stored and accessed sequentially on disk. That further increases compression efficiency, as the entire column is compressed or decompressed at once. The aim of this research is to determine whether data compression can improve the performance of HBase running on a small Hadoop cluster consisting of one name node and nine data nodes. The test scenario performs Insert and Select queries on multiple records, with and without data compression. Four compression algorithms natively supported by HBase are tested: SNAPPY, LZO, LZ4 and GZ. Results show that data compression in HBase greatly improves system performance in terms of storage savings. It shrinks data by a factor of 5 to 10 (depending on the algorithm) without any noticeable additional CPU load. That allows smaller but significantly faster SSDs to be used as the cluster’s primary data storage. Furthermore, the substantial decrease in network traffic is an additional benefit with a major impact on big data processing.
  • Keywords: Column-oriented data stores; data compression; distributed non-relational databases; benchmarking column-oriented databases