Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations. We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art. We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger.

Fast compressed tries through path decompositions

GROSSI, ROBERTO;
2014-01-01

Abstract

Tries are popular data structures for storing a set of strings, where common prefixes are represented by common root-to-node paths. More than 50 years of usage have produced many variants and implementations to overcome some of their limitations. We explore new succinct representations of path-decomposed tries and experimentally evaluate the corresponding reduction in space usage and memory latency, comparing with the state of the art. We study the following applications: compressed string dictionary and monotone minimal perfect hash for strings. In compressed string dictionary, we obtain data structures that outperform other state-of-the-art compressed dictionaries in space efficiency while obtaining predictable query times that are competitive with data structures preferred by the practitioners. On real-world datasets, our compressed tries obtain the smallest space (except for one case) and have the fastest lookup times, whereas access times are within 20% slower than the best-known solutions. In monotone minimal perfect hash for strings, our compressed tries perform several times faster than other trie-based monotone perfect hash functions while occupying nearly the same space. On real-world datasets, our tries are approximately 2 to 5 times faster than previous solutions, with a space occupancy less than 10% larger.
2014
Grossi, Roberto; Ottaviano, Giuseppe
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/766399
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 37
  • ???jsp.display-item.citation.isi??? ND
social impact