Compiling For a GPU

Running on a GPU can accelerate a code, but it requires GPU-specific programming and compilation. Several options are available for building GPU-enabled programs.


OpenACC is a standard

Available NVIDIA CUDA Compilers

Module  Version    Module Load Command
cuda    11.0.228   module load gcc/9.2.0 cuda/11.0.228
cuda    10.1.168   module load cuda/10.1.168
cuda    10.2.89    module load cuda/10.2.89
cuda    11.0.228   module load cuda/11.0.228
cuda    9.2.148.1  module load cuda/

Module  Version    Module Load Command
nvhpc   20.9       module load nvhpc/20.9

GPU architecture -arch

According to the CUDA documentation, “in the CUDA naming scheme, GPUs are named sm_xy, where x denotes the GPU generation number, and y the version in that generation.” The documentation contains details about the architecture and the corresponding xy value. On Rivanna, the GPU nodes are K80, P100, V100, and RTX 2080 Ti, which are Kepler, Pascal, Volta, and Turing, respectively. In summary, please use the following values when compiling CUDA code on Rivanna.

GPU Type     Architecture  xy  CUDA Version
K80          Kepler        37  5 - 10 (deprecated from 11)
P100         Pascal        60  8+
V100         Volta         70  9+
RTX 2080 Ti  Turing        75  10+
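
When building for a single GPU type, the xy value goes straight into nvcc's -arch=sm_xy flag. The following is a minimal sketch in POSIX shell. The helper name arch_flag and the source file name my_app.cu are hypothetical and are not provided by CUDA or by Rivanna.

```shell
# arch_flag is a hypothetical helper, not part of CUDA or the module system.
# It maps a Rivanna GPU type to the nvcc -arch flag with the matching xy value.
arch_flag() {
    case "$1" in
        K80)           printf '%s\n' "-arch=sm_37" ;;
        P100)          printf '%s\n' "-arch=sm_60" ;;
        V100)          printf '%s\n' "-arch=sm_70" ;;
        "RTX 2080 Ti") printf '%s\n' "-arch=sm_75" ;;
        *)             printf 'unknown GPU type: %s\n' "$1" >&2; return 1 ;;
    esac
}

# Example use (my_app.cu is a placeholder file name):
#   nvcc $(arch_flag V100) -o my_app my_app.cu
```

An unrecognized GPU type returns a non-zero status, so a build script can stop early instead of compiling for the wrong architecture.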

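CUDA 10 is the only major version that supports all four GPU types: Turing requires version 10 or later, and Kepler is deprecated from version 11. Therefore, if your code must run on every GPU type, please load CUDA version 10: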

module load cuda/10.2.89

and pass one -gencode flag per target architecture to nvcc, e.g.

-gencode arch=compute_37,code=sm_37 \
-gencode arch=compute_60,code=sm_60 \
-gencode arch=compute_70,code=sm_70 \
-gencode arch=compute_75,code=sm_75
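
The flag list above can also be generated from the xy values rather than typed by hand. This is a sketch in POSIX shell. The helper gencode_flags and the file name my_app.cu are hypothetical, and the script only prints the nvcc command rather than running it.

```shell
# gencode_flags is a hypothetical helper: it prints one -gencode option
# for each compute capability (xy value) passed as an argument.
gencode_flags() {
    flags=""
    for cc in "$@"; do
        # Append the next option, separated by a space after the first one.
        flags="${flags:+$flags }-gencode arch=compute_${cc},code=sm_${cc}"
    done
    printf '%s\n' "$flags"
}

# xy values for the K80, P100, V100, and RTX 2080 Ti.
NVCC_FLAGS=$(gencode_flags 37 60 70 75)

# Print the resulting compile command (my_app.cu is a placeholder file name).
printf 'nvcc %s -o my_app my_app.cu\n' "$NVCC_FLAGS"
```

Keeping the xy values in one place makes it easy to drop an architecture, for example removing 37 when moving to CUDA 11.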