CINECA IRIS Institutional Research Information System

Fast and compact Hamming distance index

Gog, Simon; Venturini, Rossano

2016-01-01

Abstract

Searching for similar objects in a collection is a core task of many applications in databases, pattern recognition, and information retrieval. Because similarity-preserving hash functions such as SimHash exist, indexing these objects reduces to solving the Approximate Dictionary Queries problem. In this problem we have to index a collection of fixed-size keys so that we can efficiently retrieve all keys that are at Hamming distance at most κ from a query key. In this paper we propose new solutions for the Approximate Dictionary Queries problem. These solutions combine succinct data structures with an efficient representation of the keys to significantly reduce the space usage of the state-of-the-art solutions without introducing any time penalty. Finally, by exploiting the triangle inequality, we can also significantly speed up the query time of the existing solutions.
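To make the problem statement concrete, the following is a minimal Python sketch of the query the abstract describes: given fixed-size keys stored as integers, return every key within Hamming distance κ of a query. It is not the authors' index. The succinct data structures and key representation used in the paper are not described in this record, so the sketch only shows a brute-force baseline and one simple, well-known way of using the triangle inequality to skip candidates. The names `hamming`, `search_bruteforce`, and `PivotIndex`, and the single-pivot scheme itself, are assumptions made for illustration.

```python
from typing import List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two keys differ."""
    return bin(a ^ b).count("1")


def search_bruteforce(keys: List[int], query: int, k: int) -> List[int]:
    """Baseline: compare the query against every key."""
    return [x for x in keys if hamming(x, query) <= k]


class PivotIndex:
    """Hypothetical illustration, not the paper's structure.

    Stores each key together with its distance to one fixed pivot so that
    some candidates can be rejected without computing their distance
    to the query.
    """

    def __init__(self, keys: List[int], pivot: int) -> None:
        self.pivot = pivot
        self.entries: List[Tuple[int, int]] = [(hamming(x, pivot), x) for x in keys]

    def search(self, query: int, k: int) -> List[int]:
        dq = hamming(query, self.pivot)
        result = []
        for d, x in self.entries:
            # Triangle inequality: |d(x, pivot) - d(query, pivot)| <= d(x, query).
            # If the left-hand side already exceeds k, x cannot be a match.
            if abs(d - dq) > k:
                continue
            if hamming(x, query) <= k:
                result.append(x)
        return result


keys = [0b0000, 0b0001, 0b0011, 0b0111, 0b1111]
index = PivotIndex(keys, pivot=0b1111)
matches = index.search(0b0000, 1)
```

In this usage example the pivot filter discards the last three keys before any distance to the query is computed, and the remaining candidates are verified exactly, so the result matches the brute-force baseline.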


Year: 2016
ISBN: 978-1-4503-4290-2, 978-1-4503-4069-4
Appears in types: 4.1 Conference proceedings contribution

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/800751
