Kernel-based learning models such as support vector machines (SVMs) can seamlessly deal with graph structures thanks to suitable kernel functions that compute a particular similarity between pairs of data samples. In the last two decades, a plethora of graph kernels have been proposed, motivated by theoretical properties or designed specifically for an application domain. Computing graph kernels however presents a significant cost for both training and inference, since predictions on unseen graphs require evaluating the kernel e.g. between the new sample and all support vectors, and solutions to an SVM optimization problem are not guaranteed to be sparse. In this paper, we present a method to select a minimum set of spanning vectors for the solutions of SVMs, based on the rank-revealing QR decomposition of the kernel Gram matrix. We apply it on 18 real-world classification tasks on chemical compounds, showing its effectiveness to reduce the computational burden in performing inference on unseen graphs by minimizing the number of support vectors without penalizing accuracy. This in turn gives us a tool to better analyze the trade-off between accuracy, expressiveness and inference cost among different graph kernels.

Minimum Spanning Set Selection in Graph Kernels

Tortorella D.;Micheli A.
2023-01-01

Abstract

Kernel-based learning models such as support vector machines (SVMs) can seamlessly deal with graph structures thanks to suitable kernel functions that compute a particular similarity between pairs of data samples. In the last two decades, a plethora of graph kernels have been proposed, motivated by theoretical properties or designed specifically for an application domain. Computing graph kernels however presents a significant cost for both training and inference, since predictions on unseen graphs require evaluating the kernel e.g. between the new sample and all support vectors, and solutions to an SVM optimization problem are not guaranteed to be sparse. In this paper, we present a method to select a minimum set of spanning vectors for the solutions of SVMs, based on the rank-revealing QR decomposition of the kernel Gram matrix. We apply it on 18 real-world classification tasks on chemical compounds, showing its effectiveness to reduce the computational burden in performing inference on unseen graphs by minimizing the number of support vectors without penalizing accuracy. This in turn gives us a tool to better analyze the trade-off between accuracy, expressiveness and inference cost among different graph kernels.
2023
978-3-031-42794-7
978-3-031-42795-4
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1213678
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact