Continuous skyline queries on multicore architectures

DE MATTEIS, TIZIANO; MENCAGLI, GABRIELE
2016-01-01

Abstract

The emergence of real-time decision-making applications in domains like high-frequency trading, emergency management, and service level analysis in communication networks has led to the definition of new classes of queries. Skyline queries are a notable example. Their result consists of all the tuples whose attribute vector is not dominated (in the Pareto sense) by that of any other tuple. Because of their popularity, skyline queries have been studied in terms of both sequential algorithms and parallel implementations for multiprocessors and clusters. Within the Data Stream Processing paradigm, traditional database queries on static relations have been revised to operate on continuous data streams. Most past papers propose sequential algorithms for continuous skyline queries, whereas very few works target implementations on parallel machines. This paper helps fill this gap by proposing a parallel implementation for multicore architectures. We propose (i) a parallelization of the eager algorithm based on the notion of Skyline Influence Time, (ii) optimizations of the reduce phase and load-balancing strategies to achieve near-optimal speedup, and (iii) a set of experiments with both synthetic benchmarks and a real dataset to show the effectiveness of our implementation.
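To make the Pareto-dominance definition in the abstract concrete, the following is a minimal sketch of a static skyline computation. It is not the eager algorithm or the parallel implementation proposed in the paper. It is a naive quadratic scan over a fixed set of tuples. The names `dominates`, `skyline`, and `hotels`, the sample data, and the convention that smaller attribute values are preferred are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b.

    Assumed convention (not stated in the abstract): smaller is better,
    so a dominates b when a is no worse in every attribute and strictly
    better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: Sequence[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical tuples: (price, distance to the city centre).
hotels = [(50.0, 8.0), (80.0, 2.0), (60.0, 9.0), (90.0, 1.0), (85.0, 3.0)]
print(skyline(hotels))  # [(50.0, 8.0), (80.0, 2.0), (90.0, 1.0)]
```

Here `(60.0, 9.0)` is dropped because `(50.0, 8.0)` is better in both attributes, and `(85.0, 3.0)` is dropped because `(80.0, 2.0)` is. A continuous skyline query would additionally have to maintain this result as tuples enter and leave a sliding window over the stream, which this sketch does not attempt.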
2016
DE MATTEIS, Tiziano; Di Girolamo, Salvatore; Mencagli, Gabriele

Use this identifier to cite or link to this document: https://hdl.handle.net/11568/793914