We initiate a study on the fundamental relation between data sanitization (i.e., the process of hiding confidential information in a given dataset) and frequent pattern mining, in the context of sequential (string) data. Current methods for string sanitization hide confidential patterns introducing, however, a number of spurious patterns that may harm the utility of frequent pattern mining. The main computational problem is to minimize this harm. Our contribution here is twofold. First, we present several hardness results, for different variants of this problem, essentially showing that these variants cannot be solved or even be approximated in polynomial time. Second, we propose integer linear programming formulations for these variants and algorithms to solve them, which work in polynomial time under certain realistic assumptions on the problem parameters.
Hide and Mine in Strings: Hardness and Algorithms
Alessio Conte;Roberto Grossi;Nadia Pisanti;Giulia Punzi;
2020-01-01
Abstract
We initiate a study on the fundamental relation between data sanitization (i.e., the process of hiding confidential information in a given dataset) and frequent pattern mining, in the context of sequential (string) data. Current methods for string sanitization hide confidential patterns introducing, however, a number of spurious patterns that may harm the utility of frequent pattern mining. The main computational problem is to minimize this harm. Our contribution here is twofold. First, we present several hardness results, for different variants of this problem, essentially showing that these variants cannot be solved or even be approximated in polynomial time. Second, we propose integer linear programming formulations for these variants and algorithms to solve them, which work in polynomial time under certain realistic assumptions on the problem parameters.| File | Dimensione | Formato | |
|---|---|---|---|
|
Hide&Mine.pdf
solo utenti autorizzati
Tipologia:
Versione finale editoriale
Licenza:
NON PUBBLICO - Accesso privato/ristretto
Dimensione
155.16 kB
Formato
Adobe PDF
|
155.16 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
|
ICDM_Hide_and_Mine_is_Hard.pdf
accesso aperto
Tipologia:
Documento in Post-print
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
161.87 kB
Formato
Adobe PDF
|
161.87 kB | Adobe PDF | Visualizza/Apri |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


