Divide-and-Conquer (DaC) is a sequential programming paradigm which models a large class of algorithms used in real-life applications. Although suitable to extract parallelism in a straightforward way, the parallel implementation of DaC algorithms still requires some expertise in parallel programming tools by the programmer. In this paper we aim at providing to non-expert programmers a high-level solution for fast prototyping parallel DaC programs on multicores with minimal programming effort. Following the rationale of parallel design pattern methodology, we design a C++11-compliant template interface for developing parallel DaC programs. The interface is implemented using different back-end frameworks (i.e. OpenMP, Intel TBB and FastFlow) supporting source code reuse and a certain amount of performance portability. Experiments on a 24-core Intel server show the effectiveness of our approach: with a reduced programming effort the programmer easily prototypes parallel versions with performance comparable with hand-made parallelizations.

A divide-and-conquer parallel pattern implementation for multicores

DANELUTTO, MARCO;DE MATTEIS, TIZIANO;MENCAGLI, GABRIELE;TORQUATI, MASSIMO
2016-01-01

Abstract

Divide-and-Conquer (DaC) is a sequential programming paradigm which models a large class of algorithms used in real-life applications. Although suitable to extract parallelism in a straightforward way, the parallel implementation of DaC algorithms still requires some expertise in parallel programming tools by the programmer. In this paper we aim at providing to non-expert programmers a high-level solution for fast prototyping parallel DaC programs on multicores with minimal programming effort. Following the rationale of parallel design pattern methodology, we design a C++11-compliant template interface for developing parallel DaC programs. The interface is implemented using different back-end frameworks (i.e. OpenMP, Intel TBB and FastFlow) supporting source code reuse and a certain amount of performance portability. Experiments on a 24-core Intel server show the effectiveness of our approach: with a reduced programming effort the programmer easily prototypes parallel versions with performance comparable with hand-made parallelizations.
2016
9781450346412
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/809431
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 10
  • ???jsp.display-item.citation.isi??? ND
social impact