We revisit the fundamental problem of compressing an integer dictionary that supports efficient rank and select operations by exploiting two kinds of regularities arising in real data: repetitiveness and approximate linearity. Our first contribution is a Lempel-Ziv parsing properly enriched to also capture approximate linearity in the data and still be compressed to the kth order entropy. Our second contribution is a variant of the block tree structure whose space complexity takes advantage of both repetitiveness and approximate linearity, and results highly competitive in time too. Our third and final contribution is an implementation and experimentation of this last data structure, which achieves new space-time trade-offs compared to known data structures that exploit only one of the two regularities
Repetition- and Linearity-Aware Rank/Select Dictionaries
Paolo Ferragina;Giovanni Manzini;Giorgio Vinciguerra
2021-01-01
Abstract
We revisit the fundamental problem of compressing an integer dictionary that supports efficient rank and select operations by exploiting two kinds of regularities arising in real data: repetitiveness and approximate linearity. Our first contribution is a Lempel-Ziv parsing properly enriched to also capture approximate linearity in the data and still be compressed to the kth order entropy. Our second contribution is a variant of the block tree structure whose space complexity takes advantage of both repetitiveness and approximate linearity, and results highly competitive in time too. Our third and final contribution is an implementation and experimentation of this last data structure, which achieves new space-time trade-offs compared to known data structures that exploit only one of the two regularitiesFile | Dimensione | Formato | |
---|---|---|---|
LIPIcs-ISAAC-2021-64.pdf
accesso aperto
Tipologia:
Versione finale editoriale
Licenza:
Creative commons
Dimensione
960.96 kB
Formato
Adobe PDF
|
960.96 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.