A Calculus for Parallel Computations over Multidimensiona l Dense Arrays

DI COSMO, R; Z, Li; Pelagatti, Susanna

We present a calculus to formalize and give costs to parallel computations over multidimensional dense arrays. The calculus extends a simple distribution calculus (proposed in some previous work) with computation and data collection. We consider an SPMD programming model in which process interaction can take place using point-to-point as well as collective operations, much in the style of MPI. We want to give a rigorous description of all stages of data parallel applications working over dense arrays: initial distribution (i.e., partition and replication) of arrays over a set of processors, parallel computation over distributed data, exchange of intermediate results and ﬁnal data gather. In the paper, beside deﬁning the calculus, we give it a formal semantics, prove equations between different combinations of operations, and show how to associate a cost to operation combinations. This last feature makes possible to make quantitative cost-driven choices between semantically equivalent implementation strategies.