CUDA

A good resource for CUDA and Modern CMake is this talk by CMake developer Robert Maynard at GTC 2017.

There are two ways to enable CUDA support. If CUDA is not optional, add it to the LANGUAGES of your project call:
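As a minimal sketch of the non-optional case, assuming a hypothetical project named MY_PROJECT:

```cmake
# CUDA became a first-class language in CMake 3.8
cmake_minimum_required(VERSION 3.8)

# MY_PROJECT is a placeholder name; CUDA and CXX are both enabled up front,
# so configuration fails early if no CUDA compiler can be found.
project(MY_PROJECT LANGUAGES CUDA CXX)
```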

You’ll probably want CXX listed in the project’s LANGUAGES as well. If CUDA is optional, put this somewhere conditionally instead:

enable_language(CUDA)

To check whether CUDA is available, use CheckLanguage:

include(CheckLanguage)
check_language(CUDA)

You can see if CUDA is present by checking CMAKE_CUDA_COMPILER (was missing until CMake 3.11).
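Putting the pieces together, the optional case might look like the following sketch. The status message is only illustrative.

```cmake
include(CheckLanguage)
check_language(CUDA)

# check_language sets CMAKE_CUDA_COMPILER only if a working compiler was found
if(CMAKE_CUDA_COMPILER)
  enable_language(CUDA)
else()
  message(STATUS "No CUDA compiler found; building without CUDA support")
endif()
```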

You can check variables like CMAKE_CUDA_COMPILER_ID (for nvcc, this is "NVIDIA"; Clang support was added in CMake 3.18). You can check the version with CMAKE_CUDA_COMPILER_VERSION.
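For instance, a project could branch on these variables after the CUDA language has been enabled. This is only a sketch: the 9.0 threshold is an arbitrary, hypothetical minimum chosen for illustration.

```cmake
if(CMAKE_CUDA_COMPILER_ID STREQUAL "NVIDIA")
  message(STATUS "Using nvcc ${CMAKE_CUDA_COMPILER_VERSION}")
  # Hypothetical minimum version, purely for illustration
  if(CMAKE_CUDA_COMPILER_VERSION VERSION_LESS 9.0)
    message(WARNING "This project is untested with CUDA older than 9.0")
  endif()
elseif(CMAKE_CUDA_COMPILER_ID STREQUAL "Clang")
  message(STATUS "Using Clang as the CUDA compiler")
endif()
```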

If you are looking for CUDA’s standard level, CMake 3.17 added a new collection of compiler features, like cuda_std_11. These have the same benefits that you are already used to from the cxx versions.
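A short sketch of requesting a CUDA standard level through a compile feature, assuming CMake 3.17 or newer and a hypothetical target mylib built from mylib.cu:

```cmake
add_library(mylib STATIC mylib.cu)

# PUBLIC: anything linking to mylib is also compiled with at least CUDA C++11
target_compile_features(mylib PUBLIC cuda_std_11)
```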

This is the easy part; as long as you use .cu for CUDA files, you can just add libraries exactly like you normally would.
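For example, with hypothetical file and target names, a CUDA library and a C++ executable that uses it could be declared like this:

```cmake
# The .cu extension tells CMake to compile this source with the CUDA compiler
add_library(mylib STATIC mylib.cu mylib.h)

# A plain C++ executable can link to the CUDA library like any other target
add_executable(myapp main.cpp)
target_link_libraries(myapp PRIVATE mylib)
```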

You can also use separable compilation:

set_target_properties(mylib PROPERTIES
                            CUDA_SEPARABLE_COMPILATION ON)

You can also directly make a PTX file with the CUDA_PTX_COMPILATION property.
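As a sketch, PTX generation applies to object libraries, so a hypothetical kernels.cu could be turned into a PTX file like this:

```cmake
# CUDA_PTX_COMPILATION is only honored on OBJECT libraries
add_library(kernels_ptx OBJECT kernels.cu)
set_target_properties(kernels_ptx PROPERTIES CUDA_PTX_COMPILATION ON)
```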

When you build CUDA code, you generally should be targeting an architecture. If you don’t, you compile to PTX, which provides the basic instructions but is compiled at runtime, making it potentially much slower to load.

All cards have an architecture level, like “7.2”. You have two choices to make. The first is the code level: this reports a version, like “5.0”, to the code being compiled, which will then take advantage of all features up to 5.0 but not beyond (assuming well-written code and standard libraries). The second is the target architecture, which must be equal to or greater than the code architecture. It needs to have the same major number as your target card and be equal to or less than the target card’s level, so 7.0 would be a common choice for our 7.2 card. Finally, you can also generate PTX; this will work on all future cards but will be compiled just in time.
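One way to express these choices is the CUDA_ARCHITECTURES target property. This is a sketch that assumes CMake 3.18 or newer, which is later than the versions discussed elsewhere in this section, and a hypothetical target mylib. Note that this property uses one number for both the code level and the target architecture, so it does not cover the mixed "5.0 code on a 7.0 target" case described above.

```cmake
set_target_properties(mylib PROPERTIES
  # 70-real:    native binary code for compute capability 7.0 (also runs on 7.2)
  # 70-virtual: PTX for 7.0, JIT-compiled at load time on newer cards
  CUDA_ARCHITECTURES "70-real;70-virtual")
```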

Using targets should work similarly to CXX, but there’s a problem. If you include a target that includes compiler options (flags), most of the time the options will not be protected by the correct includes (and the chance of them having the correct CUDA wrapper is even smaller). Here’s what a correct compiler options line should look like:
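As an illustration, assuming the flag in question is -fopenmp and the target is a hypothetical mylib, a language-guarded options line could be sketched like this:

```cmake
target_compile_options(mylib INTERFACE
  # Passed as-is when compiling C++ sources
  "$<$<COMPILE_LANGUAGE:CXX>:-fopenmp>"
  # Forwarded to the host compiler through nvcc when compiling CUDA sources
  "$<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=-fopenmp>")
```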

However, if you are using almost any find_package along with the Modern CMake methods of targets and inheritance, everything will break. I’ve learned that the hard way.

For now, here’s a pretty reasonable solution, as long as you know the un-aliased target name. It’s a function that will fix a C++ only target by wrapping the flags if using a CUDA compiler:
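One possible shape for such a function is sketched below. The name cuda_convert_flags and the example target some_cxx_target are assumptions made for illustration. It also assumes the existing flags are plain options that do not already contain generator expressions.

```cmake
function(cuda_convert_flags EXISTING_TARGET)
  get_property(old_flags TARGET ${EXISTING_TARGET}
               PROPERTY INTERFACE_COMPILE_OPTIONS)
  if(NOT "${old_flags}" STREQUAL "")
    # nvcc expects host-compiler flags as a comma-separated list after -Xcompiler
    string(REPLACE ";" "," cuda_flags "${old_flags}")
    # Keep the original flags for C++ and wrap them for CUDA
    set_property(TARGET ${EXISTING_TARGET} PROPERTY INTERFACE_COMPILE_OPTIONS
      "$<$<COMPILE_LANGUAGE:CXX>:${old_flags}>$<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=${cuda_flags}>")
  endif()
endfunction()

# Usage, with a hypothetical un-aliased target name:
# cuda_convert_flags(some_cxx_target)
```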

CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES: Place for built-in Thrust, etc
CMAKE_CUDA_COMPILER: NVCC with location

You can use FindCUDAToolkit to find a variety of useful targets and variables even without enabling the CUDA language.
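A short sketch, assuming CMake 3.17 or newer (where the FindCUDAToolkit module is available) and a hypothetical C++ executable myapp that calls the CUDA runtime API:

```cmake
# Works even if the CUDA language has not been enabled
find_package(CUDAToolkit REQUIRED)

add_executable(myapp main.cpp)
# CUDA::cudart is one of the imported targets the module provides
target_link_libraries(myapp PRIVATE CUDA::cudart)
```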

If you want to support an older version of CMake, I recommend at least including the FindCUDA from CMake version 3.9 in your cmake folder (see the CLIUtils GitHub organization for a git repository). You’ll want two features that were added: CUDA_LINK_LIBRARIES_KEYWORD and cuda_select_nvcc_arch_flags, along with the newer architectures and CUDA versions.

To use the old CUDA support, you use find_package:

find_package(CUDA 7.0 REQUIRED)
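With the old module, targets are created through its own macros rather than add_library. A sketch with hypothetical target and file names:

```cmake
find_package(CUDA 7.0 REQUIRED)
message(STATUS "Found CUDA ${CUDA_VERSION_STRING} at ${CUDA_TOOLKIT_ROOT_DIR}")

# cuda_add_library comes from FindCUDA and runs nvcc on the .cu sources
cuda_add_library(mylib STATIC mylib.cu)
```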

You also might want to allow a user to check for the arch flags of their current hardware:
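A sketch using cuda_select_nvcc_arch_flags from FindCUDA. The Auto argument asks the module to detect the GPUs installed on the build machine, and the variable name ARCH_FLAGS is just a choice made here.

```cmake
find_package(CUDA 7.0 REQUIRED)

# Detect the architectures of the locally installed cards
cuda_select_nvcc_arch_flags(ARCH_FLAGS Auto)
list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})
message(STATUS "CUDA arch flags: ${ARCH_FLAGS}")
```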