Monday, May 27, 2019

Learn CMake's Scripting Language in 15 Minutes

As explained in my previous post, every CMake-based project must contain a script named CMakeLists.txt. This script defines targets, but it can also do a lot of other things, such as finding third-party libraries or generating C++ header files. CMake scripts have a lot of flexibility.
Every time you integrate an external library, and often when adding support for another platform, you’ll need to edit the script. I spent a long time editing CMake scripts without really understanding the language, as the documentation is quite scattered, but eventually, things clicked. The goal of this post is to get you to the same point as quickly as possible.
This post won’t cover all of CMake’s built-in commands, as there are hundreds, but it is a fairly complete guide to the syntax and programming model of the language.

Hello World

If you create a file named hello.txt with the following contents:
message("Hello world!")         # A message to print
…you can run it from the command line using cmake -P hello.txt. (The -P option runs the given script, but doesn’t generate a build pipeline.) As expected, it prints “Hello world!”.
$ cmake -P hello.txt
Hello world!

All Variables Are Strings

In CMake, every variable is a string. You can substitute a variable inside a string literal by surrounding it with ${}. This is called a variable reference. Modify hello.txt as follows:
message("Hello ${NAME}!")       # Substitute a variable into the message
Now, if we define NAME on the cmake command line using the -D option, the script will use it:
$ cmake -DNAME=Newman -P hello.txt
Hello Newman!
When a variable is undefined, it defaults to an empty string:
$ cmake -P hello.txt
Hello !
To define a variable inside a script, use the set command. The first argument is the name of the variable to assign, and the second argument is its value:
set(THING "funk")
message("We want the ${THING}!")
Quotes around arguments are optional, as long as there are no spaces or variable references in the argument. For example, I could have written set("THING" funk) in the first line above – it would have been equivalent. For most CMake commands (except if and while, described below), the choice of whether to quote such arguments is simply a matter of style. When the argument is the name of a variable, I tend not to use quotes.

You Can Simulate a Data Structure using Prefixes

CMake does not have classes, but you can simulate a data structure by defining a group of variables with names that begin with the same prefix. You can then look up variables in that group using nested ${} variable references. For example, the following script will print “John Smith lives at 123 Fake St.”:
set(JOHN_NAME "John Smith")
set(JOHN_ADDRESS "123 Fake St")
set(PERSON "JOHN")
message("${${PERSON}_NAME} lives at ${${PERSON}_ADDRESS}.")
You can even use variable references in the name of the variable to set. For example, if the value of PERSON is still “JOHN”, the following will set the variable JOHN_NAME to “John Goodman”:
set(${PERSON}_NAME "John Goodman")

Every Statement is a Command

In CMake, every statement is a command that takes a list of string arguments and has no return value. Arguments are separated by (unquoted) spaces. As we’ve already seen, the set command defines a variable at file scope.
As another example, CMake has a math command that performs arithmetic. The first argument must be EXPR, the second argument is the name of the variable to assign, and the third argument is the expression to evaluate – all strings. Note that on the third line below, CMake substitutes the string value of MY_SUM into the enclosing argument before passing the argument to math.
math(EXPR MY_SUM "1 + 1")                   # Evaluate 1 + 1; store result in MY_SUM
message("The sum is ${MY_SUM}.")
math(EXPR DOUBLE_SUM "${MY_SUM} * 2")       # Multiply by 2; store result in DOUBLE_SUM
message("Double that is ${DOUBLE_SUM}.")
There’s a CMake command for just about anything you’ll need to do. The string command lets you perform advanced string manipulation, including regular expression replacement. The file command can read or write files, or manipulate filesystem paths.
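As a minimal sketch of those two commands, the following script upper-cases a string, does a regular expression replacement, then writes the result to a file and reads it back. The variable names and the scratch.txt file name are made up for illustration; it can be run with cmake -P like the earlier examples.

```cmake
set(GREETING "Hello world")

string(TOUPPER "${GREETING}" SHOUTED)               # Upper-case copy; store result in SHOUTED
string(REGEX REPLACE "o" "0" LEET "${GREETING}")    # Replace every "o" with "0"; store in LEET
message("${SHOUTED} / ${LEET}")                     # Prints: HELLO WORLD / Hell0 w0rld

# Write the result to a scratch file next to this script, then read it back
set(SCRATCH "${CMAKE_CURRENT_LIST_DIR}/scratch.txt")
file(WRITE "${SCRATCH}" "${LEET}")
file(READ "${SCRATCH}" CONTENTS)
message("${CONTENTS}")                              # Prints: Hell0 w0rld
```

Note that both commands follow the same pattern as math: a keyword selects the operation, and one of the arguments names the variable that receives the result.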

Flow Control Commands

Even flow control statements are commands. The if/endif commands execute the enclosed commands conditionally. Whitespace doesn’t matter, but it’s common to indent the enclosed commands for readability. The following checks whether CMake’s built-in variable WIN32 is set:
if(WIN32)
    message("You're running CMake on Windows.")
endif()
CMake also has while/endwhile commands which, as you might expect, repeat the enclosed commands as long as the condition is true. Here’s a loop that prints all the Fibonacci numbers up to one million:
set(A "1")
set(B "1")
while(A LESS "1000000")
    message("${A}")                 # Print A
    math(EXPR T "${A} + ${B}")      # Add the numeric values of A and B; store result in T
    set(A "${B}")                   # Assign the value of B to A
    set(B "${T}")                   # Assign the value of T to B
endwhile()
CMake’s if and while conditions aren’t written the same way as in other languages. For example, to perform a numeric comparison, you must specify LESS as a string argument, as shown above. The documentation explains how to write a valid condition.
if and while are different from other CMake commands in that if the name of a variable is specified without quotes, the command will use the variable’s value. In the above code, I took advantage of that behavior by writing while(A LESS "1000000") instead of while("${A}" LESS "1000000") – both forms are equivalent. Other CMake commands don’t do that.
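Here is a small sketch of that difference, using a made-up variable named COUNT. The first two conditions are equivalent, while the final message call shows that an ordinary command treats the same unquoted word as plain text:

```cmake
set(COUNT "3")

if(COUNT GREATER "2")                   # Unquoted name: if looks up the value of COUNT
    message("COUNT is greater than 2")
elseif(COUNT EQUAL "2")
    message("COUNT is exactly 2")
else()
    message("COUNT is less than 2")
endif()

if("${COUNT}" GREATER "2")              # Explicit variable reference: same result
    message("Still greater than 2")
endif()

message(COUNT)                          # No lookup here. Prints: COUNT
```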

Lists are Just Semicolon-Delimited Strings

CMake has a special substitution rule for unquoted arguments. If the entire argument is a variable reference without quotes, and the variable’s value contains semicolons, CMake will split the value at the semicolons and pass multiple arguments to the enclosing command. For example, the following passes three arguments to math:
set(ARGS "EXPR;T;1 + 1")
math(${ARGS})                                   # Equivalent to calling math(EXPR T "1 + 1")
On the other hand, quoted arguments are never split into multiple arguments, even after substitution. CMake always passes a quoted string as a single argument, leaving semicolons intact:
set(ARGS "EXPR;T;1 + 1")
message("${ARGS}")                              # Prints: EXPR;T;1 + 1
If more than two arguments are passed to the set command, the second and subsequent arguments are joined by semicolons, then assigned to the specified variable. This effectively creates a list from the arguments:
set(MY_LIST These are separate arguments)
message("${MY_LIST}")                           # Prints: These;are;separate;arguments
You can manipulate such lists using the list command:
set(MY_LIST These are separate arguments)
list(REMOVE_ITEM MY_LIST "separate")            # Removes "separate" from the list
message("${MY_LIST}")                           # Prints: These;are;arguments
The foreach/endforeach command accepts multiple arguments. It iterates over all arguments except the first, assigning each one to the named variable:
foreach(ARG These are separate arguments)
    message("${ARG}")                           # Prints each word on a separate line
endforeach()
You can iterate over a list by passing an unquoted variable reference to foreach. As with any other command, CMake will split the variable’s value and pass multiple arguments to the command:
foreach(ARG ${MY_LIST})                         # Splits the list; passes items as arguments
    message("${ARG}")                           # Prints each item on a separate line
endforeach()
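The quoting rule matters here. As a short sketch, the following counts the items with the list command, then shows what happens if the variable reference passed to foreach is quoted by mistake: the list arrives as a single argument, so the loop body runs only once.

```cmake
set(MY_LIST These are separate arguments)

list(LENGTH MY_LIST COUNT)              # Count the items; store result in COUNT
message("${COUNT}")                     # Prints: 4

foreach(ARG "${MY_LIST}")               # Quoted: the whole list is passed as one argument
    message("${ARG}")                   # Prints once: These;are;separate;arguments
endforeach()
```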

Functions Run In Their Own Scope; Macros Don’t

In CMake, you can use a pair of function/endfunction commands to define a function. Here’s one that doubles the numeric value of its argument, then prints the result:
function(doubleIt VALUE)
    math(EXPR RESULT "${VALUE} * 2")
    message("${RESULT}")
endfunction()

doubleIt("4")                           # Prints: 8
Functions run in their own scope. None of the variables defined in a function pollute the caller’s scope. If you want to return a value, you can pass the name of a variable to your function, then call the set command with the special argument PARENT_SCOPE:
function(doubleIt VARNAME VALUE)
    math(EXPR RESULT "${VALUE} * 2")
    set(${VARNAME} "${RESULT}" PARENT_SCOPE)    # Set the named variable in caller's scope
endfunction()

doubleIt(RESULT "4")                    # Tell the function to set the variable named RESULT
message("${RESULT}")                    # Prints: 8
Similarly, a pair of macro/endmacro commands defines a macro. Unlike functions, macros run in the same scope as their caller. Therefore, all variables defined inside a macro are set in the caller’s scope. We can replace the previous function with the following:
macro(doubleIt VARNAME VALUE)
    math(EXPR ${VARNAME} "${VALUE} * 2")        # Set the named variable in caller's scope
endmacro()

doubleIt(RESULT "4")                    # Tell the macro to set the variable named RESULT
message("${RESULT}")                    # Prints: 8
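To see the scoping difference side by side, here is a minimal sketch in which a function and a macro each set the same variable. The names setInFunction, setInMacro and LEAK are made up for illustration:

```cmake
function(setInFunction)
    set(LEAK "from function")           # Set in the function's own scope only
endfunction()

macro(setInMacro)
    set(LEAK "from macro")              # Set directly in the caller's scope
endmacro()

setInFunction()
message("After function: '${LEAK}'")    # Prints: After function: ''

setInMacro()
message("After macro: '${LEAK}'")       # Prints: After macro: 'from macro'
```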
Both functions and macros accept an arbitrary number of arguments. Unnamed arguments are exposed to the function as a list, through a special variable named ARGN. Here’s a function that doubles every argument it receives, printing each one on a separate line:
function(doubleEach)
    foreach(ARG ${ARGN})                # Iterate over each argument
        math(EXPR N "${ARG} * 2")       # Double ARG's numeric value; store result in N
        message("${N}")                 # Print N
    endforeach()
endfunction()

doubleEach(5 6 7 8)                     # Prints 10, 12, 14, 16 on separate lines

Including Other Scripts

CMake variables are defined at file scope. The include command executes another CMake script in the same scope as the calling script. It’s a lot like the #include directive in C/C++. It’s typically used to define a common set of functions or macros in the calling script. It uses the variable CMAKE_MODULE_PATH as a search path.
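As a rough sketch, assuming a folder named cmake next to the calling script that may contain a file named Helpers.cmake (both names are hypothetical), including a module by name could look like this. The OPTIONAL and RESULT_VARIABLE arguments are used only so the script also runs when the file is absent:

```cmake
# Add the hypothetical "cmake" folder to the module search path
list(APPEND CMAKE_MODULE_PATH "${CMAKE_CURRENT_LIST_DIR}/cmake")

# Search CMAKE_MODULE_PATH for Helpers.cmake and execute it in this scope.
# OPTIONAL suppresses the error if the module is missing;
# RESULT_VARIABLE receives the full path, or NOTFOUND.
include(Helpers OPTIONAL RESULT_VARIABLE HELPERS_PATH)

if(HELPERS_PATH)
    message("Loaded ${HELPERS_PATH}")
else()
    message("Helpers.cmake was not found")
endif()
```

Any functions, macros or variables defined by Helpers.cmake would then be available to the rest of the calling script.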
The find_package command looks for scripts of the form Find*.cmake and also runs them in the same scope. Such scripts are often used to help find external libraries. For example, if there is a file named FindSDL2.cmake in the search path, find_package(SDL2) is equivalent to include(FindSDL2.cmake). (Note that there are several ways to use the find_package command – this is just one of them.)
CMake’s add_subdirectory command, on the other hand, creates a new scope, then executes the script named CMakeLists.txt from the specified directory in that new scope. You typically use it to add another CMake-based subproject, such as a library or executable, to the calling project. The targets defined by the subproject are added to the build pipeline unless otherwise specified. None of the variables defined in the subproject’s script will pollute the parent’s scope unless the set command’s PARENT_SCOPE option is used.
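A parent CMakeLists.txt that pulls in a subproject might look roughly like the sketch below. The folder mylib, the targets MyLib and ParentApp, and the variable MYLIB_VERSION are all hypothetical; the sketch assumes mylib/CMakeLists.txt defines a library target named MyLib.

```cmake
# Parent CMakeLists.txt (sketch with hypothetical names)
cmake_minimum_required(VERSION 3.10)
project(Parent)

add_subdirectory(mylib)                     # Runs mylib/CMakeLists.txt in a new scope

add_executable(ParentApp "main.cpp")
target_link_libraries(ParentApp MyLib)      # Link against the target defined by the subproject

# This is empty unless mylib/CMakeLists.txt contains something like:
#     set(MYLIB_VERSION "1.0" PARENT_SCOPE)
message("${MYLIB_VERSION}")
```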
[Figure: some of the scripts involved when you run CMake on the Turf project.]

Getting and Setting Properties

A CMake script defines targets using the add_executable, add_library or add_custom_target commands. Once a target is created, it has properties that you can manipulate using the get_property and set_property commands. Unlike variables, targets are visible in every scope, even if they were defined in a subdirectory. All target properties are strings.
add_executable(MyApp "main.cpp")        # Create a target named MyApp

# Get the target's SOURCES property and assign it to MYAPP_SOURCES
get_property(MYAPP_SOURCES TARGET MyApp PROPERTY SOURCES)

message("${MYAPP_SOURCES}")             # Prints: main.cpp
Other target properties include LINK_LIBRARIES, INCLUDE_DIRECTORIES and COMPILE_DEFINITIONS. Those properties are modified, indirectly, by the target_link_libraries, target_include_directories and target_compile_definitions commands. At the end of the script, CMake uses those target properties to generate the build pipeline.
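To illustrate the indirect relationship, here is a short sketch that reuses the MyApp target from above. The definitions DEMO_FLAG and OTHER_FLAG are made up for illustration. Like the earlier snippet, it belongs in a CMakeLists.txt rather than a standalone script, since it defines a target.

```cmake
add_executable(MyApp "main.cpp")                    # Create a target named MyApp

# This command appends to the target's COMPILE_DEFINITIONS property
target_compile_definitions(MyApp PRIVATE DEMO_FLAG=1)

# The same property can also be modified directly using set_property
set_property(TARGET MyApp APPEND PROPERTY COMPILE_DEFINITIONS "OTHER_FLAG")

get_property(MYAPP_DEFS TARGET MyApp PROPERTY COMPILE_DEFINITIONS)
message("${MYAPP_DEFS}")                            # Prints: DEMO_FLAG=1;OTHER_FLAG
```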
There are properties for other CMake entities, too. There is a set of directory properties at every file scope. There is a set of global properties that is accessible from all scripts. And there is a set of source file properties for every C/C++ source file.
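Global properties can be handy when a value needs to outlive a function's scope. The sketch below uses a made-up property name, DEMO_NOTES, and a made-up function, addNote, to collect values from inside a function without using PARENT_SCOPE:

```cmake
function(addNote NOTE)
    # Append to a global property; this is visible outside the function's scope
    set_property(GLOBAL APPEND PROPERTY DEMO_NOTES "${NOTE}")
endfunction()

addNote("first")
addNote("second")

get_property(NOTES GLOBAL PROPERTY DEMO_NOTES)      # Read the property into NOTES
message("${NOTES}")                                 # Prints: first;second
```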
Congratulations! You now know the CMake scripting language – or at least, it should be easier to understand large scripts using CMake’s command reference. Otherwise, the only thing missing from this guide, that I can think of, is generator expressions. Let me know if I forgot anything else!

How to Build a CMake-Based Project

CMake is a versatile tool that helps you build C/C++ projects on just about any platform you can think of. It’s used by many popular open source projects including LLVM, Qt, KDE and Blender.
All CMake-based projects contain a script named CMakeLists.txt, and this post is meant as a guide for configuring and building such projects. This post won’t show you how to write a CMake script – that’s getting ahead of things, in my opinion.
As an example, I’ve prepared a CMake-based project that uses SDL2 and OpenGL to render a spinning 3D logo. You can build it on Windows, MacOS or Linux.
The information here applies to any CMake-based project, so feel free to skip ahead to any section. However, I recommend reading the first two sections first.
If you don’t have CMake yet, there are installers and binary distributions on the CMake website. In Unix-like environments, including Linux, it’s usually available through the system package manager. You can also install it through MacPorts, Homebrew, Cygwin or MSYS2.

The Source and Binary Folders

CMake generates build pipelines. A build pipeline might be a Visual Studio .sln file, an Xcode .xcodeproj or a Unix-style Makefile. It can also take several other forms.
To generate a build pipeline, CMake needs to know the source and binary folders. The source folder is the one containing CMakeLists.txt. The binary folder is where CMake generates the build pipeline. You can create the binary folder anywhere you want. A common practice is to create a subdirectory named build inside the source folder, next to CMakeLists.txt.
By keeping the binary folder separate from the source, you can delete the binary folder at any time to get back to a clean slate. You can even create several binary folders, side-by-side, that use different build systems or configuration options.
The cache is an important concept. It’s a single text file in the binary folder named CMakeCache.txt. This is where cache variables are stored. Cache variables include user-configurable options defined by the project such as CMakeDemo’s DEMO_ENABLE_MULTISAMPLE option (explained later), and precomputed information to help speed up CMake runs. (You can, and will, re-run CMake several times on the same binary folder.)
You aren’t meant to submit the generated build pipeline to source control, as it usually contains paths that are hardcoded to the local filesystem. Instead, simply re-run CMake each time you clone the project to a new folder. I usually add the rule *build*/ to my .gitignore files.

The Configure and Generate Steps

As you’ll see in the following sections, there are several ways to run CMake. No matter how you run it, it performs two steps: the configure step and the generate step.
The CMakeLists.txt script is executed during the configure step. This script is responsible for defining targets. Each target represents an executable, library, or some other output of the build pipeline.
If the configure step succeeds – meaning CMakeLists.txt completed without errors – CMake will generate a build pipeline using the targets defined by the script. The type of build pipeline generated depends on the type of generator used, as explained in the following sections.
Additional things may happen during the configure step, depending on the contents of CMakeLists.txt. For example, in our sample CMakeDemo project, the configure step also:
  • Finds the header files and libraries for SDL2 and OpenGL.
  • Generates a header file demo-config.h in the binary folder, which will be included from C++ code.
In a more sophisticated project, the configure step might also test the availability of system functions (as a traditional Unix configure script would), or define a special “install” target (to help create a distributable package). If you re-run CMake on the same binary folder, many of the slow steps are skipped during subsequent runs, thanks to the cache.
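For readers curious how a script can define a cached option and generate a header from it, one common approach looks roughly like the sketch below. This is not taken from CMakeDemo's actual CMakeLists.txt; the template file name demo-config.h.in and the description string are assumptions.

```cmake
# Define a user-configurable cache variable with a default of 0.
# A value given on the command line or already stored in the cache takes precedence.
set(DEMO_ENABLE_MULTISAMPLE 0 CACHE BOOL "Enable multisampling")

# Copy the (hypothetical) template to the binary folder, replacing
# @DEMO_ENABLE_MULTISAMPLE@ with the variable's current value.
# The template might contain a line such as:
#     #define DEMO_ENABLE_MULTISAMPLE @DEMO_ENABLE_MULTISAMPLE@
configure_file("demo-config.h.in" "${CMAKE_CURRENT_BINARY_DIR}/demo-config.h")
```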

Running CMake from the Command Line

Before running CMake, make sure you have the required dependencies for your project and platform. For CMakeDemo on Windows, you can run setup-win32.py. For other platforms, check the README.
You’ll often want to tell CMake which generator to use. For a list of available generators, run cmake --help.
Create the binary folder, cd to that folder, then run cmake, specifying the path to the source folder on the command line. Specify the desired generator using the -G option. If you omit the -G option, cmake will choose one for you. (If you don’t like its choice, you can always delete the binary folder and start over.)
mkdir build
cd build
cmake -G "Visual Studio 15 2017" ..
If there are project-specific configuration options, you can specify those on the command line as well. For example, the CMakeDemo project has a configuration option DEMO_ENABLE_MULTISAMPLE that defaults to 0. You can enable this configuration option by specifying -DDEMO_ENABLE_MULTISAMPLE=1 on the cmake command line. Changing the value of DEMO_ENABLE_MULTISAMPLE will change the contents of demo-config.h, a header file that’s generated by CMakeLists.txt during the configure step. The value of this variable is also stored in the cache so that it persists during subsequent runs. Other projects have different configuration options.
cmake -G "Visual Studio 15 2017" -DDEMO_ENABLE_MULTISAMPLE=1 ..
If you change your mind about the value of DEMO_ENABLE_MULTISAMPLE, you can re-run CMake at any time. On subsequent runs, instead of passing the source folder path to the cmake command line, you can simply specify the path to the existing binary folder. CMake will find all previous settings in the cache, such as the choice of generator, and re-use them.
cmake -DDEMO_ENABLE_MULTISAMPLE=0 .
You can view project-defined cache variables by running cmake -L -N . from the binary folder. For CMakeDemo, the output lists the DEMO_ENABLE_MULTISAMPLE option with its default value of 0.

Running cmake-gui

I prefer the command line, but CMake also has a GUI. The GUI offers an interactive way to set cache variables. Again, make sure to install your project’s required dependencies first.
To use it, run cmake-gui, fill in the source and binary folder paths, then click Configure.
If the binary folder doesn’t exist, CMake will prompt you to create it. It will then ask you to select a generator.
After the initial configure step, the GUI will show you a list of cache variables, similar to the list you see when you run cmake -L -N . from the command line. New cache variables are highlighted in red. (In this case, that’s all of them.) If you click Configure again, the red highlights will disappear, since the variables are no longer considered new.
The idea is that if you change a cache variable, then click Configure, new cache variables might appear as a result of your change. The red highlights are meant to help you see any new variables, customize them, then click Configure again. In practice, changing a value doesn’t introduce new cache variables very often. It depends how the project’s CMakeLists.txt script was written.
Once you’ve customized the cache variables to your liking, click Generate. This will generate the build pipeline in the binary folder. You can then use it to build your project.

Running ccmake

ccmake is the console equivalent to cmake-gui. Like the GUI, it lets you set cache variables interactively. It can be handy when running CMake on a remote machine, or if you just like using the console. If you can figure out the CMake GUI, you can figure out ccmake.

Building with Unix Makefiles

CMake generates a Unix makefile by default when run from the command line in a Unix-like environment. Of course, you can generate makefiles explicitly using the -G option. When generating a makefile, you should also define the CMAKE_BUILD_TYPE variable. Assuming the source folder is the parent:
cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug ..
You should define the CMAKE_BUILD_TYPE variable because makefiles generated by CMake are single-configuration. Unlike a Visual Studio solution, you can’t use the same makefile to build multiple configurations such as Debug and Release. A single makefile is capable of building exactly one build type. By default, the available types are Debug, MinSizeRel, RelWithDebInfo and Release. Watch out – if you forget to define CMAKE_BUILD_TYPE, you’ll probably get an unoptimized build without debug information, which is useless. To change to a different build type, you must re-run CMake and generate a new makefile.
Personally, I also find CMake’s default Release configuration useless because it doesn’t generate any debug information. If you’ve ever opened a crash dump or fixed a bug in Release, you’ll appreciate the availability of debug information, even in an optimized build. That’s why, in my other CMake projects, I usually delete the Release configuration from CMakeLists.txt and use RelWithDebInfo instead.
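One common idiom for restricting the build types, and for avoiding an empty CMAKE_BUILD_TYPE, is sketched below. It is an assumption about how this could be done, not necessarily what the projects mentioned above do. It would typically go in CMakeLists.txt after the project command.

```cmake
if(CMAKE_CONFIGURATION_TYPES)
    # Multi-configuration generators (Visual Studio, Xcode): offer only these two
    set(CMAKE_CONFIGURATION_TYPES "Debug;RelWithDebInfo" CACHE STRING "" FORCE)
elseif(NOT CMAKE_BUILD_TYPE)
    # Single-configuration generators (Makefiles, Ninja): pick a default build type
    set(CMAKE_BUILD_TYPE "RelWithDebInfo" CACHE STRING "Build type" FORCE)
endif()
```

With something like this in place, forgetting to pass -DCMAKE_BUILD_TYPE on the command line would fall back to RelWithDebInfo rather than an unoptimized build without debug information.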
Once the makefile exists, you can actually build your project by running make. By default, make will build every target that was defined by CMakeLists.txt. In CMakeDemo’s case, there’s only one target. You can also build a specific target by passing its name to make:
make CMakeDemo
The makefile generated by CMake detects header file dependencies automatically, so editing a single header file won’t necessarily rebuild the entire project. You can also parallelize the build by passing -j 4 (or a higher number) to make.
CMake also exposes a Ninja generator. Ninja is similar to make, but faster. The generator produces a build.ninja file, which is similar to a Makefile. The Ninja generator is also single-configuration. By default, Ninja chooses its number of parallel jobs (the -j option) based on the number of available CPUs.

Building with Visual Studio

We’ll generate a Visual Studio .sln file from the CMake command line. If you have several versions of Visual Studio installed, you’ll want to tell cmake which version to use. Again, assuming that the source folder is the parent:
cmake -G "Visual Studio 15 2017" ..
The above command line will generate a Visual Studio .sln file for a 32-bit build. There are no multiplatform .sln files using CMake, so for a 64-bit build, you must specify the 64-bit generator:
cmake -G "Visual Studio 15 2017 Win64" ..
Open the resulting .sln file in Visual Studio, go to the Solution Explorer panel, right-click the target you want to run, then choose “Set as Startup Project”. Build and run as you normally would.
Note that CMake adds two additional targets to the solution: ALL_BUILD and ZERO_CHECK. ZERO_CHECK automatically re-runs CMake when it detects a change to CMakeLists.txt. ALL_BUILD usually builds all other targets, making it somewhat redundant in Visual Studio. If you’re used to setting up your solutions a certain way, it might seem annoying to have these extra targets in your .sln file, but you get used to it. CMake lets you organize targets and source files into folders, but I didn’t demonstrate that in the CMakeDemo sample.
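For reference, organizing targets and source files into folders might look something like the following sketch. The target name, file names and folder names are hypothetical.

```cmake
# Allow targets to be grouped into folders in the IDE
set_property(GLOBAL PROPERTY USE_FOLDERS ON)

add_executable(MyApp "main.cpp" "util/helpers.cpp")

# Show the MyApp target under a folder named "Apps" in the Solution Explorer
set_property(TARGET MyApp PROPERTY FOLDER "Apps")

# Show util/helpers.cpp under a group named "Utilities" inside the target
source_group("Utilities" FILES "util/helpers.cpp")
```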
As with any Visual Studio solution, you can change the build type at any time from the Solution Configuration drop-down list. The CMakeDemo sample uses CMake’s default set of build types: Debug, MinSizeRel, RelWithDebInfo and Release. Again, I find the default Release configuration rather useless as it doesn’t produce any debug information. In my other CMake projects, I usually delete the Release configuration from CMakeLists.txt and use RelWithDebInfo instead.

Built-In CMake Support in Visual Studio 2017

In Visual Studio 2017, Microsoft introduced another way to use CMake with Visual Studio. You can now open the source folder containing CMakeLists.txt from Visual Studio’s File → Open → Folder menu. This new method avoids creating intermediate .sln and .vcxproj files. It also exposes 32-bit and 64-bit builds in the same workspace. It’s a nice idea that, in my opinion, falls short for a few reasons:
  • If there are any source files outside the source folder containing CMakeLists.txt, they won’t appear in the Solution Explorer.
  • The familiar C/C++ Property Pages are no longer available.
  • Cache variables can only be set by editing a JSON file, which is pretty unintuitive for a Visual IDE.
I’m not really a fan. For now, I intend to keep generating .sln files by hand using CMake.

Building with Xcode

The CMake website publishes a binary distribution of CMake for MacOS as a .dmg file. The .dmg file contains an app that you can drag & drop to your Applications folder. Note that if you install CMake this way, cmake won’t be available from the command line unless you create a link to /Applications/CMake.app/Contents/bin/cmake somewhere. I prefer installing CMake from MacPorts because it sets up the command line for you, and because dependencies like SDL2 can be installed the same way.
Specify the Xcode generator from the CMake command line. Again, assuming that the source folder is the parent:
cmake -G "Xcode" ..
This will create an .xcodeproj folder. Open it in Xcode. (I tested in Xcode 8.3.1.) In the Xcode toolbar, click the “active scheme” drop-down list and select the target you want to run.
After that, click “Edit Scheme…” from the same drop-down list, then choose a build configuration under Run → Info. Again, I don’t recommend CMake’s default Release configuration, as the lack of debug information limits its usefulness.
Finally, build from the Product → Build menu (or the ⌘B shortcut), run using Product → Run (or ⌘R), or click the big play button in the toolbar.
It’s possible to make CMake generate an Xcode project that builds a MacOS bundle or framework, but I didn’t demonstrate that in the CMakeDemo project.

Building with Qt Creator

Qt Creator provides built-in support for CMake using the Makefile or Ninja generator under the hood. I tested the following steps in Qt Creator 3.5.1.
In Qt Creator, go to File → Open File or Project… and choose CMakeLists.txt from the source folder you want to build.
Qt Creator will prompt you for the location of the binary folder, calling it the “build directory”. By default, it suggests a path adjacent to the source folder. You can change this location if you want.
When prompted to run CMake, make sure to define the CMAKE_BUILD_TYPE variable since the Makefile generator is single-configuration. You can also specify project-specific variables here, such as CMakeDemo’s DEMO_ENABLE_MULTISAMPLE option.
After that, you can build and run the project from Qt Creator’s menus or using the Shift+Ctrl+B or F5 shortcuts.
If you want to re-run CMake, for example to change the build type from Debug to RelWithDebInfo, navigate to Projects → Build & Run → Build, then click “Run CMake”.
The CMakeDemo project contains a single executable target, but if your project contains multiple executable targets, you can tell Qt Creator which one to run by navigating to Projects → Build & Run → Run and changing the “Run configuration” to something else. The drop-down list is automatically populated with a list of executable targets created by the build pipeline.

Other CMake Features

  • You can perform a build from the command line, regardless of the generator used: cmake --build . --target CMakeDemo --config Debug
  • You can create build pipelines that cross-compile for other environments with the help of the CMAKE_TOOLCHAIN_FILE variable.
  • You can generate a compile_commands.json file that can be fed to Clang’s LibTooling library.
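As a rough illustration of the cross-compiling case, a toolchain file is itself just a CMake script that sets a few variables before the project is configured. The sketch below assumes a Linux ARM target and a cross-compiler named arm-linux-gnueabihf-gcc; the file name and compiler names are placeholders that will differ on your system.

```cmake
# arm-toolchain.cmake (hypothetical file name)
set(CMAKE_SYSTEM_NAME Linux)                        # Target operating system
set(CMAKE_SYSTEM_PROCESSOR arm)                     # Target processor

set(CMAKE_C_COMPILER arm-linux-gnueabihf-gcc)       # Assumed cross-compiler names
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-g++)

# Search for headers and libraries in the target environment only,
# but never search it for programs to run on the host
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
```

You would then pass the file on the first CMake run for a new binary folder, for example with -DCMAKE_TOOLCHAIN_FILE=../arm-toolchain.cmake.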
I really appreciate how CMake helps integrate all kinds of C/C++ components and build them in all kinds of environments. It’s not without its flaws, but once you’re proficient with it, the open source world is your oyster, even when integrating non-CMake projects. My next post will be a crash course in CMake’s scripting language.
If you wish to become a power user, and don’t mind forking over a few bucks, the authors’ book Mastering CMake offers a big leap forward. Their article in The Architecture of Open Source Applications is also an interesting read.