ISC'14

June 22–26, 2014
Leipzig, Germany

Session Details

 
Name: Tutorial 04: Hybrid Parallel Programming with MPI & OpenMP
 
Time: Sunday, June 22, 2014
09:00 am - 01:00 pm
 
Room:   Lecture Room 11
CCL - Congress Center Leipzig
 
Breaks: 08:00 am - 10:30 am Welcome Coffee
 
Presenter:   Georg Hager, RRZE
  Gabriele Jost, Supersmith
  Rolf Rabenseifner, HLRS
 
Abstract:   Most HPC systems are clusters of shared memory nodes. Such SMP nodes range from small multi-core CPUs to large many-core CPUs. Parallel programming may combine distributed memory parallelization across the node interconnect (e.g., with MPI) with shared memory parallelization inside each node (e.g., with OpenMP or MPI-3.0 shared memory).
This tutorial analyzes the strengths and weaknesses of several parallel programming models on clusters of SMP nodes. Multi-socket-multi-core systems in highly parallel environments are given special consideration. MPI-3.0 introduced a new shared memory programming interface, which can be combined with inter-node MPI communication. It can be used for direct neighbor accesses similar to OpenMP or for direct halo copies, and enables new hybrid programming models. These models are compared with various hybrid MPI+OpenMP approaches and pure MPI. This tutorial also includes a discussion of the OpenMP support for accelerators. Benchmark results are presented for modern platforms such as Intel Xeon Phi and Cray XC30. Numerous case studies demonstrate the performance-related aspects of hybrid programming, and application categories that can take advantage of this model are identified. Tools for hybrid programming such as thread/process placement support and performance analysis are presented in a "how-to" section.

Content Level
25% Introductory, 50% Intermediate, 25% Advanced

Attendee Requirements
None.

Audience Prerequisites
Some knowledge about parallel programming with MPI and OpenMP.

Targeted Audience
People who are in charge of developing efficient parallel software on clusters of shared memory nodes.