Objective: Modeling phylogenetic interactions is an open issue in many computational biology problems. In the context of gene function prediction we introduce a class of kernels for structured data leveraging on a hierarchical probabilistic modeling of phylogeny among species. Methods and materials: We derive three kernels belonging to this setting: a sufficient statistics kernel, a Fisher kernel, and a probability product kernel. The new kernels are used in the context of support vector machine learning. The kernels adaptivity is obtained through the estimation of the parameters of a tree structured model of evolution using as observed data phylogenetic profiles encoding the presence or absence of specific genes in a set of fully sequenced genomes. Results: We report results obtained in the prediction of the functional class of the proteins of the budding yeast Saccharomyces cerevisae which favorably compare to a standard vector based kernel and to a non-adaptive tree kernel function. A further comparative analysis is performed in order to assess the impact of the different components of the proposed approach. Conclusions: We show that the key features of the proposed kernels are the adaptivity to the input domain and the ability to deal with structured data interpreted through a graphical model representation.
|Autori:||NICOTRA L; MICHELI A|
|Titolo:||Modeling Adaptive Kernels from Probabilistic Phylogenetic Trees|
|Anno del prodotto:||2009|
|Digital Object Identifier (DOI):||10.1016/j.artmed.2008.08.007|
|Appare nelle tipologie:||1.1 Articolo in rivista|