Partial Order Alignment (POA) was introduced by Lee et al. in 2002 to allow the alignment of a string to a graph-like structure representing a set of aligned strings (a Multiple Sequence Alignment, MSA). However, the POA edit transcript (the sequence of edit operations that describe the alignment) does not reflect the possible elasticity of the MSA (different gaps sizes in the aligned string), leaving room for possible misalignment and its propagation in progressive MSA. Elastic-Degenerate Strings (ED-strings) are strings that can represent the outcome of an MSA by highlighting gaps and variants as a list of strings that can differ in size and that can possibly include the empty string. In this paper, we define a method that optimally aligns a string to an ED-string, the latter compactly representing an MSA, overcoming the ambiguity in the POA edit transcript while maintaining its time and space complexity.
Optimal Sequence Alignment to ED-Strings
Pisanti, N
2022-01-01
Abstract
Partial Order Alignment (POA) was introduced by Lee et al. in 2002 to allow the alignment of a string to a graph-like structure representing a set of aligned strings (a Multiple Sequence Alignment, MSA). However, the POA edit transcript (the sequence of edit operations that describe the alignment) does not reflect the possible elasticity of the MSA (different gaps sizes in the aligned string), leaving room for possible misalignment and its propagation in progressive MSA. Elastic-Degenerate Strings (ED-strings) are strings that can represent the outcome of an MSA by highlighting gaps and variants as a list of strings that can differ in size and that can possibly include the empty string. In this paper, we define a method that optimally aligns a string to an ED-string, the latter compactly representing an MSA, overcoming the ambiguity in the POA edit transcript while maintaining its time and space complexity.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.