Tips and References

Startup, Tips and References

Startup

Repository

All of the documentation for this tutorial and the example code is stored in GitHub.

Just clone the repository in the normal way, e.g.:

git clone -b gridka17
The gridka17 branch preserves the material as presented at GridKA17. If you want to use the latest version (the tutorial does get improved and updated), clone the master branch instead, but then you should probably also work from the latest copy of the instructions on GitHub.

Setup and Running

The current version of this tutorial has been developed on OS X 10.12 (with Threading Building Blocks and Boost installed using Homebrew) and also tested with Fedora 26 (install the TBB and Boost runtime and development RPMs).

Most modern Linux distributions should work fine, as long as you have an up-to-date C++ compiler (C++11 support is required) with TBB and Boost installed.

A Dockerfile that sets up Fedora 26 correctly is in the repository (docker/cpp-concurrency-fedora); you can also use the docker image graemeastewart/cpp-concurrency-fedora from Docker Hub. If you use the docker image remember to use a bind mount to a persistent filesystem for your work!

docker run -it -v $HOME:/workspace graemeastewart/cpp-concurrency-fedora

Both the GridKA VMs and the docker container have a variety of editors installed: nano, jed, emacs, geany, vim. The advantage of using the docker container is that you can use an editor or IDE on your host. To use an X editor, like geany, use the -X option to ssh (VM) or pass in the correct DISPLAY setting using -e (docker). For example, for docker on the Mac:

docker run -it -e DISPLAY=$(ipconfig getifaddr en0):0 \
-v $HOME:/workspace graemeastewart/cpp-concurrency-fedora

(You may need to run xhost +$(ipconfig getifaddr en0) on the Mac host; make sure that XQuartz allows network clients to connect in Preferences->Security.)

Tips

C++11 in General

To compile concurrent C++11 programs you'll need some compiler and linker flags:

  • -std=c++11 - Use the C++11 standard (-std=c++14 also works, if your compiler supports that)
  • -lpthread - Enable POSIX thread support, which is the underlying thread library used by libstdc++ on Linux platforms

You may find the Makefile in the repository useful. It compiles any .cc file into a like-named executable with the correct compiler flags.
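As a quick check that the flags above work in your environment, here is a minimal sketch of a threaded function. The function name parallel_sum and the file name in the build comment are illustrative assumptions, not part of the tutorial's example code.

```cpp
// Build (assuming this is saved with a main() in a file called sum_threads.cc):
//   g++ -std=c++11 sum_threads.cc -o sum_threads -lpthread
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` by giving each of `n_threads` threads its own contiguous slice.
// Each thread writes only to its own element of `partial`, so no lock is needed.
long parallel_sum(const std::vector<int>& data, unsigned n_threads) {
    if (n_threads == 0) n_threads = 1;
    const std::size_t size = data.size();
    const std::size_t chunk = (size + n_threads - 1) / n_threads;  // round up
    std::vector<long> partial(n_threads, 0);
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&data, &partial, size, chunk, t] {
            const std::size_t begin = std::min(size, t * chunk);
            const std::size_t end = std::min(size, begin + chunk);
            partial[t] = std::accumulate(data.begin() + begin, data.begin() + end, 0L);
        });
    }
    // Wait for every worker before reading the partial results.
    for (auto& w : workers) w.join();
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

Calling parallel_sum(values, 4) from a main() and printing the result is enough to confirm that the compiler and thread library are set up. If the program builds but fails to link or to start threads, the -lpthread option is the first thing to check.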

TBB

Depending on how TBB has been installed in your environment, you may need to add the correct include paths and link options to your compiler command. In the school VMs and the Docker container no special options are needed beyond -ltbb.

Intel's documentation for TBB is a convenient reference.

CMake

If you know CMake, there is a CMakeLists.txt file provided that will compile all of the tutorial examples and solutions. As there is a pre-existing Makefile in the source directory, you must do an out-of-source build (which is best practice anyway).

You may need to download the FindTBB module for CMake to work properly, which can be done by using the git submodule in the repository:

git submodule init
git submodule update

Using CMake is very easy:

  1. Run cmake ../path/to/the/source in the place you want to build the examples
  2. Then just use make as normal

If you add your own programs you'll need to change the CMakeLists.txt file.

Tips

Top and Threads

If you run top -H you will see all running threads. This is useful for a quick check that you have a multi-threaded program running! The H key toggles the thread display on and off once top is running.

So top -H -u $USER would probably be the most useful command to use.
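To try this out, you need a program whose threads stay alive long enough to be seen. The following is a minimal sketch under that assumption; spin_threads is a hypothetical helper name, not something from the tutorial code.

```cpp
#include <atomic>
#include <chrono>
#include <thread>
#include <vector>

// Keep `n_threads` threads busy for roughly `run_for`, long enough to spot
// them in `top -H`. Returns how many threads actually started.
unsigned spin_threads(unsigned n_threads, std::chrono::milliseconds run_for) {
    std::atomic<unsigned> started{0};
    const auto deadline = std::chrono::steady_clock::now() + run_for;
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&started, deadline] {
            ++started;
            // Busy-wait so that each thread shows up with non-zero %CPU.
            while (std::chrono::steady_clock::now() < deadline) {
            }
        });
    }
    for (auto& w : workers) w.join();
    return started.load();
}
```

Calling spin_threads(4, std::chrono::seconds(30)) from main() and running top -H -u $USER in a second terminal should show the worker threads alongside the main thread, which is idle while it waits in join().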

Timing

To measure any performance boost from threading you need to take accurate timing measurements. The easiest way to do this is with the chrono library from C++11.

#include <chrono>
#include <iostream>

int main() {
    // Prologue stuff

    auto start = std::chrono::high_resolution_clock::now();
    // Do work here
    auto end = std::chrono::high_resolution_clock::now();

    auto duration = end - start;

    // Convert the elapsed time to floating-point milliseconds for printing
    std::cout << "That took: " << std::chrono::duration<float, std::milli>(duration).count() << " ms" << std::endl;

    // Epilogue stuff
    return 0;
}

Note that it's important to time the interesting bit of the program only (which is why time ./my_prog isn't so useful).

Caveat emptor: if you run on virtual machines (e.g., GridKA School machines or lxplus at CERN) there is the possibility of interference and jitter. So always take a few timing measurements, just to check that things are stable.
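One way to take several measurements is to wrap the timed section in a small helper. This is a sketch only: time_repeated is a hypothetical name, and it uses std::chrono::steady_clock (a monotonic clock) as an assumption, where the example above uses high_resolution_clock.

```cpp
#include <algorithm>
#include <chrono>
#include <vector>

// Run `work` `repeats` times and return the wall-clock duration of each run
// in milliseconds, sorted in ascending order.
template <typename Work>
std::vector<double> time_repeated(Work work, unsigned repeats) {
    std::vector<double> ms;
    ms.reserve(repeats);
    for (unsigned i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto end = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(end - start).count());
    }
    std::sort(ms.begin(), ms.end());
    return ms;
}
```

With the results sorted, the first element is the fastest run and the element at index size() / 2 is the median. A large gap between the two suggests that jitter is affecting the measurements and more repeats are worthwhile.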

References

Generally Useful C++ Resources

Concurrency and Parallel Programming

  • Baptiste Wicht has a nice tutorial on C++11 concurrency basics.
  • Probably the best book on C++ concurrency is C++ Concurrency in Action by Anthony Williams, published by Manning. This book covers lock free programming in some detail.
  • Nicolai M. Josuttis' The C++ Standard Library: A Tutorial and Reference, Second Edition has a good chapter on C++11 concurrency.
  • Jeff Preshing's An Introduction to Lock-Free Programming.
  • A great general introduction to the computer science of concurrency, which discusses many of the classic synchronisation problems in detail, is The Little Book of Semaphores by Allen B. Downey. It is available online. (Essential reading to know what the Sleeping Barber problem is!)
  • A highly practical example of concurrent data access is the Read-Copy-Update pattern.
  • A modern book on parallel programming patterns and exploiting concurrency is Structured Parallel Programming by Michael McCool, James Reinders and Arch Robison (Elsevier, 2012).

TBB

  • The Intel website has a good deal of reference and tutorial information.
  • Intel's TBB YouTube Channel.
  • James Reinders, one of the authors of TBB, wrote the O'Reilly book Intel Threading Building Blocks. It is rather old now (2007) and, although it has good discussions of concurrency in general, its TBB-specific information is a bit out of date.