Practical Introduction to GPU Programming with OpenACC

OpenACC is a directive-based programming model for highly parallel systems that lets the compiler generate portable GPU code automatically. In this tutorial, we will get to know the programming model through examples, learn how to use the associated tool environment, and apply first performance-optimization strategies to our programs.
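To give a first impression of what "directive-based" means, here is a minimal sketch of a SAXPY routine (y = a*x + y) annotated with a single OpenACC directive. It is not taken from the course material: the function name `saxpy` and the choice of data clauses are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical illustration (not part of the course exercises):
 * computes y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    assert(n >= 0);

    /* The directive asks an OpenACC compiler to offload the loop:
     *   copyin(x[0:n]) - copy x to the device before the loop
     *   copy(y[0:n])   - copy y to the device and back afterwards
     * A compiler without OpenACC support ignores the pragma and
     * runs the loop sequentially on the CPU. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the parallelism is expressed only through the pragma, the same source still compiles as ordinary sequential C. With the PGI compiler, OpenACC is typically enabled with the `-acc` flag, and `-Minfo=accel` makes the compiler report what it generated for each loop; the exact invocation used in the course may differ.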

Requirements & Setup

The course will be held on a GPU-equipped supercomputer with all necessary software and tools configured.

For the course, you will need:

  • A laptop with an SSH client, ideally with X forwarding (Download SSH keys)
  • OPTIONAL: An editor for code changes (vim will suffice and can also be used directly on the supercomputer)

We will be using the JURON supercomputer located at Forschungszentrum Jülich. It combines IBM POWER8+ CPUs with NVIDIA Tesla P100 GPUs in a new CPU-GPU architecture. Since the supercomputer contains non-public hardware, you are required to sign a Usage Agreement form to participate in this course. The agreement is handed out during the course and will look like the following: File:User Agreement JURON.pdf

Depending on the speed of the local network connection, it may make sense to install the following GPU programming tools locally:

  • CUDA Toolkit – Free from NVIDIA; can be installed even if your system has no GPU. Comes with the Visual Profiler, which we will use to analyze our application
  • PGI Compiler – The Community Edition is free; this is the OpenACC compiler we will be using. It includes analysis tools (as an alternative or addition to the Visual Profiler)

Exercises