Journal: International Journal of Computer Science and Network Security
Print ISSN: 1738-7906
Year: 2009
Volume: 9
Issue: 10
Pages: 116-123
Publisher: International Journal of Computer Science and Network Security
Abstract: Mainstream lossless data compression algorithms have been studied extensively in recent years; however, rather less attention has been paid to their block-based behavior. The aim of this study was therefore to investigate the block performance of these methods. The main idea of this paper is to break the input into blocks of different sizes, compress each block separately, and compare the results to determine the optimal block size. Selecting the optimal block size involves a tradeoff between compression ratio and processing time. We found that, for PPM, BWT, and LZSS, a block size greater than 32 KiB may be optimal. For Huffman coding and LZW, a moderately sized block (16 KiB for Huffman and 32 KiB for LZW) is better. We also use the mean block standard deviation (MBSD) and locality of reference to explain the compression ratio. We found that good data locality implies a large skew in the data distribution, and that the greater the skew in the data distribution and the MBSD, the better the compression ratio; there is a positive correlation between MBSD and compression ratio.
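To illustrate the experiment the abstract describes, the minimal sketch below splits an input into fixed-size blocks, compresses each block independently, and reports the overall compression ratio alongside an MBSD value computed from per-block byte-frequency counts. This is a sketch under assumptions: zlib (DEFLATE) stands in for the five algorithms the paper actually studies, `corpus.bin` is a hypothetical input file, and the MBSD computation is one plausible reading of the metric, not the authors' exact definition.

```python
import zlib
import statistics

def block_compress_ratio(data: bytes, block_size: int) -> float:
    """Split data into fixed-size blocks, compress each block
    independently, and return the ratio original/compressed.
    zlib is a stand-in for the compressors studied in the paper."""
    compressed_total = 0
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        compressed_total += len(zlib.compress(block))
    return len(data) / compressed_total

def mean_block_standard_deviation(data: bytes, block_size: int) -> float:
    """MBSD sketch (assumed definition): for each block, take the
    standard deviation of its 256 byte-frequency counts, then
    average those values over all blocks. A higher value indicates
    a more skewed, and thus more compressible, distribution."""
    stdevs = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        counts = [0] * 256
        for b in block:
            counts[b] += 1
        stdevs.append(statistics.pstdev(counts))
    return statistics.mean(stdevs)

if __name__ == "__main__":
    # corpus.bin is a hypothetical test file standing in for the
    # paper's corpus; sweep block sizes from 4 KiB to 64 KiB.
    payload = open("corpus.bin", "rb").read()
    for size in (4096, 8192, 16384, 32768, 65536):
        ratio = block_compress_ratio(payload, size)
        mbsd = mean_block_standard_deviation(payload, size)
        print(f"{size // 1024:>3} KiB  ratio={ratio:.3f}  MBSD={mbsd:.2f}")
```

Run against a test file, a sweep like this exposes the tradeoff the abstract mentions: smaller blocks limit how much context each compressor can exploit (lowering the ratio), while larger blocks raise per-block memory and processing cost.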