PaTre: a method for Paralogy Trees Construction

PISANTI, NADIA;MARANGONI, ROBERTO;FERRAGINA, PAOLO;FRANGIONI, ANTONIO;LUCCIO, FABRIZIO
2003-01-01

Abstract

Genomes can be described as collections of clusters, the gene families, whose members are called paralogs. Paralogs are genes that most probably share a duplication history and show significant sequence similarity, even if they perform slightly different biological functions. Among the mechanisms that have increased genomic information during biological evolution, gene duplication is probably the most important. To better understand duplication events, the first step is to investigate the history of gene families in order to detect which duplication events have taken place, and in which relative (partial) order. Here we present a method, called PaTre, that, given a gene family, attempts to construct the paralogy tree of the family. We work under the hypothesis that every family member derives from the duplication of another member. By paralogy tree we mean a directed tree in which the root represents the most ancient paralog of the family and each oriented arc (a, b) represents a duplication event from the template gene a to its copy b. Note that gene a survives the event and can serve as the template for more than one duplication event, so more than one arc can leave a. PaTre uses new algorithmic techniques motivated by the specific application at hand. The reliability of the inferential process has been tested by means of a simulator that implements different hypotheses on the duplication-with-modification paradigm, and on three examples of biological gene families belonging to both lower and higher organisms.
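To make the definition of a paralogy tree concrete, the following is a minimal Python sketch of the data structure only. It is not PaTre's inference procedure, which the abstract does not detail. The gene names (g1 to g4) and the function names build_paralogy_tree and is_ancestor are hypothetical, chosen for illustration. The sketch takes a list of arcs (a, b), checks that they form a directed tree with a single root, and answers whether one gene is an ancestor of another, which is the partial order on duplication events mentioned above.

```python
def build_paralogy_tree(arcs):
    """Build a tree from (template, copy) arcs.

    Returns (root, children, parent). Raises ValueError if the arcs
    do not form a directed tree with exactly one root.
    """
    children = {}   # template gene -> list of its copies
    parent = {}     # copy -> the template it was duplicated from
    genes = set()
    for template, copy in arcs:
        genes.update((template, copy))
        if copy in parent:
            raise ValueError(f"{copy} has more than one template")
        parent[copy] = template
        children.setdefault(template, []).append(copy)

    # The root is the only gene that is never the copy in any arc.
    roots = genes - parent.keys()
    if len(roots) != 1:
        raise ValueError("expected exactly one root")
    root = next(iter(roots))

    # Every gene must be reachable from the root (rules out detached cycles).
    seen, stack = set(), [root]
    while stack:
        gene = stack.pop()
        seen.add(gene)
        stack.extend(children.get(gene, ()))
    if seen != genes:
        raise ValueError("some genes are not reachable from the root")
    return root, children, parent


def is_ancestor(parent, a, b):
    """True if gene b derives from gene a through one or more duplications."""
    while b in parent:
        b = parent[b]
        if b == a:
            return True
    return False


# Hypothetical family: g1 is the template of g2 and g3; g3 is the template of g4.
arcs = [("g1", "g2"), ("g1", "g3"), ("g3", "g4")]
root, children, parent = build_paralogy_tree(arcs)
```

In this example g1 has two outgoing arcs, matching the remark that a template can be duplicated more than once. Genes g2 and g4 are not comparable under is_ancestor, which is why the order of duplication events is only partial.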
2003
Pisanti, Nadia; Marangoni, Roberto; Ferragina, Paolo; Frangioni, Antonio; Savona, A.; Pisanelli, C.; Luccio, Fabrizio

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/185022