Elastic Degenerate (ED) strings and Elastic Founder (EF ) graphs are two versions of acyclic components of pangenomes. Both ED strings and EF graphs (which we collectively name variable strings) extend the well-known notion of indeterminate string. Recent work has extensively investigated algorithmic tasks over these structures, and over several other variable strings notions that they generalise. Among such tasks, the basic operation of matching a pattern into a text, which can serve as a toolkit for many pangenomic data analyses using these data structures, deserves special attention. In this paper we: (1) highlight a clear taxonomy within both ED strings and EF graphs ranging through variable strings of all types, from the linear string up to the most general one; (2) investigate the problem PvarT(X, Y ) of matching a solid or variable pattern of type X into a variable text of type Y ; (3) using as a reference the quadratic conditional lower bounds that are known for PvarT(solid,ED) and PvarT(solid,EF ), for all possible types of variable strings X and Y we either prove the quadratic conditional lower bound for PvarT(X, Y ), or provide non-trivial, often sub-quadratic, upper bounds, also exploiting the above-mentioned taxonomy.

A Unifying Taxonomy of Pattern Matching in Degenerate Strings and Founder Graphs

Rocco Ascone;Giulia Bernardini;Alessio Conte;Massimo Equi;Roberto Grossi;Nadia Pisanti
2024-01-01

Abstract

Elastic Degenerate (ED) strings and Elastic Founder (EF ) graphs are two versions of acyclic components of pangenomes. Both ED strings and EF graphs (which we collectively name variable strings) extend the well-known notion of indeterminate string. Recent work has extensively investigated algorithmic tasks over these structures, and over several other variable strings notions that they generalise. Among such tasks, the basic operation of matching a pattern into a text, which can serve as a toolkit for many pangenomic data analyses using these data structures, deserves special attention. In this paper we: (1) highlight a clear taxonomy within both ED strings and EF graphs ranging through variable strings of all types, from the linear string up to the most general one; (2) investigate the problem PvarT(X, Y ) of matching a solid or variable pattern of type X into a variable text of type Y ; (3) using as a reference the quadratic conditional lower bounds that are known for PvarT(solid,ED) and PvarT(solid,EF ), for all possible types of variable strings X and Y we either prove the quadratic conditional lower bound for PvarT(X, Y ), or provide non-trivial, often sub-quadratic, upper bounds, also exploiting the above-mentioned taxonomy.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1272612
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact