Counting the number of subgraphs, or patterns, of a certain kind is at the heart of data mining, and st-paths are one of the most basic graph patterns to express connectivity. The problem of counting the number of st-paths in a graph, both directed and undirected, has been studied since the 70s, and is one of the original \#P-complete problems introduced by Valiant [Valiant, SIAM Journal on Computing, 1979]. However, counting can be a heavy task and known algorithms already struggle on graphs with hundreds of nodes. For this reason we propose a novel approach: we assess whether the number of st-paths of an undirected graph is at least a given number z. Instead of finding paths one-by-one (i.e., listing), our algorithm is based on decomposing and collapsing computational tasks arranged in a tree-like structure to enhance the effectiveness of each step in growing the number of paths found. Extensive experimental results on real-world datasets show the algorithm scaling to graphs with millions of nodes and edges, with z in the trillions. Its performance is orders of magnitude better than state-of-the-art listing algorithms adapted to this task.
An Efficient Algorithm for Assessing the Number of st-Paths in Large Graphs
Punzi, Giulia
;Conte, Alessio
;Grossi, Roberto
;
2023-01-01
Abstract
Counting the number of subgraphs, or patterns, of a certain kind is at the heart of data mining, and st-paths are one of the most basic graph patterns to express connectivity. The problem of counting the number of st-paths in a graph, both directed and undirected, has been studied since the 70s, and is one of the original \#P-complete problems introduced by Valiant [Valiant, SIAM Journal on Computing, 1979]. However, counting can be a heavy task and known algorithms already struggle on graphs with hundreds of nodes. For this reason we propose a novel approach: we assess whether the number of st-paths of an undirected graph is at least a given number z. Instead of finding paths one-by-one (i.e., listing), our algorithm is based on decomposing and collapsing computational tasks arranged in a tree-like structure to enhance the effectiveness of each step in growing the number of paths found. Extensive experimental results on real-world datasets show the algorithm scaling to graphs with millions of nodes and edges, with z in the trillions. Its performance is orders of magnitude better than state-of-the-art listing algorithms adapted to this task.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.