In this paper we address the problem of trading optimally, and in a principled way, the compressed size/decompression time of LZ77 parsings by introducing what we call the Bicriteria LZ77-Parsing problem. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by T. Symmetrically, we can exchange the role of the two resources and thus ask for minimizing the decompression time provided that the compressed space is bounded by a fixed amount given in advance. We address this goal in three stages: (i) we introduce the novel Bicriteria LZ77-Parsing problem which formalizes in a principled way what data-compressors have traditionally approached by means of heuristics; (ii) we solve this problem efficiently, up to a negligible additive constant, in O(nlog2 n) time and optimal O(n) words of working space, by proving and deploying some specific structural properties of a weighted graph derived from the possible LZ77-parsings of the input file; (iii) we execute a preliminary set of experiments which show that our novel compressor is very competitive to all the highly engineered competitors (such as Snappy, lzma, bzip2), hence offering a win-win situation in theory&practice.
Bicriteria data compression
FARRUGGIA, ANDREA;FERRAGINA, PAOLO;FRANGIONI, ANTONIO;VENTURINI, ROSSANO
2014-01-01
Abstract
In this paper we address the problem of trading optimally, and in a principled way, the compressed size/decompression time of LZ77 parsings by introducing what we call the Bicriteria LZ77-Parsing problem. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by T. Symmetrically, we can exchange the role of the two resources and thus ask for minimizing the decompression time provided that the compressed space is bounded by a fixed amount given in advance. We address this goal in three stages: (i) we introduce the novel Bicriteria LZ77-Parsing problem which formalizes in a principled way what data-compressors have traditionally approached by means of heuristics; (ii) we solve this problem efficiently, up to a negligible additive constant, in O(nlog2 n) time and optimal O(n) words of working space, by proving and deploying some specific structural properties of a weighted graph derived from the possible LZ77-parsings of the input file; (iii) we execute a preliminary set of experiments which show that our novel compressor is very competitive to all the highly engineered competitors (such as Snappy, lzma, bzip2), hence offering a win-win situation in theory&practice.File | Dimensione | Formato | |
---|---|---|---|
bczip_soda14.pdf
accesso aperto
Tipologia:
Documento in Post-print
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
444.77 kB
Formato
Adobe PDF
|
444.77 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.