GCC is not the only open source C++ compiler that’s available out there on the interwebs — there’s also Clang, which is part of the LLVM Compiler Infrastructure. Each has its strengths and weaknesses, and both are excellent compilers for a wide variety of platforms. Whenever possible, I try to compile the Linux code I write with both compilers, for the simple reason that sometimes one will catch an error or emit a warning that the other one missed. I find that having my code build cleanly with at least two compilers and all warnings enabled is a useful best practice for improving code quality.

As it turns out, Clang on Linux has a very interesting capability: it can use GCC’s version of the standard C++ library (libstdc++), or its own version (libc++). The choice of standard library may be specified using a command-line argument (-stdlib=libstdc++ or -stdlib=libc++, respectively). Furthermore, if you have multiple versions of GCC installed on your Linux system, you can provide an optional command-line argument to Clang that specifies the GCC version whose libstdc++ you want to use (--gcc-toolchain=<path>).
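As a minimal sketch of how those flags combine, the helper below only assembles and prints a clang++ command line; it does not invoke the compiler. The function name compose_clang_cmd, the -std=c++14 flag, and the file hello.cpp are illustrative assumptions and are not part of clang-builder.

```shell
# Assemble (but do not run) a clang++ command line.
#   $1    = standard library: libstdc++ or libc++
#   $2    = GCC installation prefix for --gcc-toolchain, or "" to omit it
#   $3... = source files
compose_clang_cmd() {
    ccc_stdlib="$1"
    ccc_toolchain="$2"
    shift 2
    ccc_cmd="clang++ -std=c++14 -stdlib=${ccc_stdlib}"
    if [ -n "${ccc_toolchain}" ]; then
        ccc_cmd="${ccc_cmd} --gcc-toolchain=${ccc_toolchain}"
    fi
    echo "${ccc_cmd} $*"
}

# libstdc++ from a specific GCC installation
compose_clang_cmd libstdc++ /usr/local/gcc/7.2.0 hello.cpp

# Clang's own libc++, no GCC toolchain flag
compose_clang_cmd libc++ "" hello.cpp
```

Printing the command first makes it easy to check which library and toolchain a build script is about to select before anything is compiled.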

This post is a followup to Building GCC on Linux, and in it I’m going to describe a set of scripts I wrote for building Clang on Linux. I call this collection clang-builder and it is available on GitHub.

Build Steps Overview

There is one caveat when using clang-builder on Linux: the scripts assume that an installation of GCC already exists that has been built with gcc-builder. The upshot of this requirement is that, on Linux, you should build at least one version of GCC before attempting to build Clang.

When writing the scripts, I wanted the Clang versions I was building to be as modern as possible. This meant that in almost all cases they would need to rely on GCC toolchains that were newer than the distro-provided compiler. For example, on my CentOS 7.2.1511 VM, the official distro version of GCC is 4.8.5 — positively ancient! On my Ubuntu 16.04 workstation, the distro GCC is 5.4, which is also pretty old. So I decided to require Clang installations built with clang-builder on Linux to depend on GCC installations built with gcc-builder.

As with gcc-builder, the process is pretty straightforward, and usually goes like this:

  1. Build a recent version of GCC with gcc-builder
  2. Install Clang-specific prerequisites
  3. Clone the clang-builder repo
  4. Customize some build variables
  5. Perform the build
  6. Stage the build
  7. Create an installation package (.tgz or .rpm) and install it
  8. Use the newly installed compiler
  9. Remove temporary build files

As with gcc-builder, the idea behind the scripts is to install each version of Clang in its own unique directory hierarchy with only its own files; no files from any other compiler are present. Clang is built using CMake, and the intended installation directory may be specified by assigning a value to the CMAKE_INSTALL_PREFIX variable on the CMake command line. The scripts create a unique directory name based on the compiler’s version number and assign that name to CMAKE_INSTALL_PREFIX.

I’ve chosen /usr/local/clang as the default root directory into which individual Clang versions are installed. Each version is installed into its own subdirectory having the version number as its name. For example: Clang 4.0.0 would be installed into /usr/local/clang/4.0.0; Clang 4.0.1 would be installed in /usr/local/clang/4.0.1; and so on.
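The naming scheme can be sketched in a few lines of shell. This is only an illustration of the layout described above; the variable name CLANG_ROOT is hypothetical, and the real scripts may compute the prefix differently.

```shell
# Root directory under which each Clang version gets its own subdirectory.
CLANG_ROOT=/usr/local/clang        # hypothetical variable name

for version in 4.0.0 4.0.1 5.0.1; do
    prefix="${CLANG_ROOT}/${version}"
    # This is the form of the argument that CMake would receive.
    echo "-DCMAKE_INSTALL_PREFIX=${prefix}"
done
```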

In the following sections, I’ll walk through the process of building Clang 5.0.1, with GCC 7.2 as the default GCC toolchain. When the process is finished, the compiler will be installed into /usr/local/clang/5.0.1.

Note that I’ll also use the environment variable BUILD_TOP_DIR to represent the work directory into which the clang-builder repository is cloned. After cloning, all of the build scripts will be found in ${BUILD_TOP_DIR}/clang-builder (assuming you use the default repo directory name when cloning). Whenever I specify a relative path in the steps below, it will be relative to ${BUILD_TOP_DIR}/clang-builder; for example, ./dist means ${BUILD_TOP_DIR}/clang-builder/dist, etc.

Step 1: Build a Recent Version of GCC

If you’ve already built one or more versions of GCC using gcc-builder, then you’re all set and can move to the next step. If not, then I recommend that you pick a recent one, such as GCC 7.2 (which is the most recent at the time of this writing) and build and install it.

Step 2: Install Clang-Specific Prerequisites

In addition to the prerequisites for gcc-builder, you’ll also need:

  • CMake 3.4.3 or higher
  • Python 2.7
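If you want to check these prerequisites before starting, something like the following sketch may help. The helper name version_ge is my own, and the check assumes that the Python 2.7 interpreter, if present, is on your PATH as python2.7; adjust as needed for your distro.

```shell
# Return success if dotted version $1 is greater than or equal to $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Report on CMake, if it is installed at all.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | awk 'NR == 1 { print $3 }')
    if version_ge "${cmake_version}" 3.4.3; then
        echo "CMake ${cmake_version} is new enough"
    else
        echo "CMake ${cmake_version} is too old; 3.4.3 or higher is needed"
    fi
else
    echo "CMake not found"
fi

# Report on Python 2.7 (assumes the binary is named python2.7).
if command -v python2.7 >/dev/null 2>&1; then
    echo "Python 2.7 found"
else
    echo "Python 2.7 not found"
fi
```

Comparing the version fields numerically matters here, because a plain string comparison would rank 3.10 below 3.4.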

At the time of this writing, clang-builder only supports building compressed tarballs or RPM files. If there’s enough demand, or someone volunteers to help, I’ll add support for building DEB files.

Step 3: Clone the Repo

Clone the clang-builder git repo and check out the branch for the major/minor version of Clang that you want to build. As of this writing, the branches available for checkout are clang50, clang40, clang39 and clang38.

$ cd ${BUILD_TOP_DIR}
$ git clone https://github.com/BobSteagall/clang-builder.git
$ cd clang-builder
$ git checkout clang50

In this example we’re building Clang 5.0.1, so I checked out the clang50 branch.
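Judging from the branch names listed above, the branch appears to be the word clang followed by the major and minor version digits. The sketch below encodes that observation; the function name branch_for_version is hypothetical, and the pattern is an inference from the four listed branches rather than a documented rule.

```shell
# Derive a branch name such as "clang50" from a version such as "5.0.1".
branch_for_version() {
    bfv_major=$(echo "$1" | cut -d. -f1)
    bfv_minor=$(echo "$1" | cut -d. -f2)
    echo "clang${bfv_major}${bfv_minor}"
}

branch_for_version 5.0.1
branch_for_version 3.9.1
```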

Step 4: Customize the Build Variables

Before building, you’ll need to customize some of the variables exported by clang-build-vars.sh. Start up your favorite editor and open that file. Near the top you’ll find CLANG_VERSION, which must be edited to specify the version of Clang you want to download and build:

##- Customize this variable to specify the version of Clang that you wish
##  to download and build.
##
export CLANG_VERSION=5.0.X

For our example build of Clang 5.0.1, this variable should be set to:

export CLANG_VERSION=5.0.1

This variable must be set correctly for the other clang-builder scripts to work. As mentioned above, with this definition our build will eventually be installed into /usr/local/clang/5.0.1.
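If you would rather script this edit than open an editor, a sed one-liner can do it. The sketch below works on a scratch file that stands in for clang-build-vars.sh, so it is safe to run anywhere; pointing it at the real file is left to you.

```shell
# Work on a scratch copy so the real clang-build-vars.sh is untouched.
vars_file=$(mktemp)
printf '%s\n' 'export CLANG_VERSION=5.0.X' > "${vars_file}"

# Rewrite the CLANG_VERSION line. Writing to a new file and moving it
# into place avoids relying on any particular form of "sed -i".
sed 's/^export CLANG_VERSION=.*/export CLANG_VERSION=5.0.1/' \
    "${vars_file}" > "${vars_file}.new" && mv "${vars_file}.new" "${vars_file}"

# Show the result.
grep '^export CLANG_VERSION=' "${vars_file}"
```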

You should next look at the GCC_VERSION and GCC_INSTALL_PREFIX variables:

##- If building on Linux, customize these variables to specify the location
##  of the preferred GCC toolchain partner on this system.  It is important
##  that the variable GCC_INSTALL_PREFIX be defined to have the same value 
##  as the --prefix flag used to configure the default GCC toolchain.
##
if [ `uname` == "Linux" ]
then
    export GCC_INSTALL_PREFIX=/usr/local/gcc/7.2.0
    export GCC_VERSION=7.2.0
fi

As mentioned in the comments, the important variable here is GCC_INSTALL_PREFIX. It is used to specify the location of the default GCC toolchain to be used by the version of Clang that you’re building. In this example, we’re using GCC 7.2 built with gcc-builder as the default GCC toolchain, and specifying the default path to that compiler. Of course, if you have built GCC 7.2 and installed it in a different directory, then you would set GCC_INSTALL_PREFIX to that directory.
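Before starting a long build, it can be worth confirming that GCC_INSTALL_PREFIX really points at a GCC installation. Here is a minimal sketch; the function name check_gcc_prefix is hypothetical, and it assumes the usual layout in which the gcc and g++ drivers live in the bin subdirectory of the prefix.

```shell
# Succeed only if the given prefix contains executable gcc and g++ drivers.
check_gcc_prefix() {
    [ -x "$1/bin/gcc" ] && [ -x "$1/bin/g++" ]
}

GCC_INSTALL_PREFIX=/usr/local/gcc/7.2.0
if check_gcc_prefix "${GCC_INSTALL_PREFIX}"; then
    echo "GCC toolchain found in ${GCC_INSTALL_PREFIX}"
else
    echo "No GCC toolchain in ${GCC_INSTALL_PREFIX}; build one with gcc-builder first"
fi
```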

There is no rule that says you need to use a different version of GCC for each version of Clang you build. On one of my workstations, I use builds of Clang 4.0.1, 5.0.0, and 5.0.1 that were all built with GCC 7.2 as the default GCC toolchain. You can mix and match GCC and Clang versions at will; however, keep in mind that you’ll want to use a version of GCC that is about the same age as the version of Clang you’re building. The two distributions evolve their feature sets at about the same rate, and by following this advice, you’ll minimize the possibility of having a GCC toolchain whose libstdc++ is incompatible with the Clang you’re building.

The GCC_VERSION variable is not quite as important; it is required only if you are going to create an RPM installation package.

There are several other variables in the clang-build-vars.sh file that can be customized to fine-tune your build, and they are described below in Other Build Variables.

Step 5: Perform the Build

This step consists of executing the build-clang.sh script. This script manages the process of downloading the required source packages, unpacking them, configuring them, building them, and running the compiler test suite that comes with the Clang distribution:

$ ./build-clang.sh | tee build.log

Alternatively, if you want to save time and build Clang without running the tests, you can use:

$ ./build-clang.sh -T | tee build.log

As you can see, I’m using the tee command to copy output from the build process into a log file. I find the log file to be pretty useful on those occasions where the build fails or does something wonky.
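One thing to be aware of is that a plain pipe only sends standard output to tee, so error messages written to standard error appear on the terminal but not in the log. Adding 2>&1 before the pipe captures both. The sketch below shows the difference using a stand-in function, fake_build, which is hypothetical and simply mimics a script that writes to both streams.

```shell
# Hypothetical stand-in for ./build-clang.sh: writes to stdout and stderr.
fake_build() {
    echo "configuring"
    echo "warning: something odd" >&2
}

log=$(mktemp)

# Merge stderr into stdout so that tee records both streams.
fake_build 2>&1 | tee "${log}"

echo "--- log contents ---"
cat "${log}"
```

With the real script, the equivalent would be ./build-clang.sh 2>&1 | tee build.log. If your shell is bash, setting set -o pipefail beforehand also makes the pipeline’s exit status reflect a failed build rather than the status of tee.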

Step 6: Stage the Build

Assuming that the build succeeds, and you are satisfied with the test results, the next step is to run the stage-clang.sh script to create a dummy installation that I call the staging area.

$ ./stage-clang.sh

The staging area will be created in the ./dist subdirectory. The files installed there become the source for building whichever installation package you choose to create in Step 7. In our example here of building Clang 5.0.1, the ./dist subdirectory will contain the compiler in ./dist/usr/local/clang/5.0.1 and some scripts for setting environment variables will be found in ./dist/usr/local/bin.

Step 7: Make an Installation Package

Create a Compressed Tarball. If you want to create a compressed tarball for subsequent installations, run:

$ ./pack-clang.sh

The resulting tarball will be in the ./packages subdirectory. Note that the tarball will have the Linux distro name as a substring in the tarball file name; on my CentOS-7 VM, the tarball file name is kewb-clang501-CentOS-7-x86_64.tgz. To install the tarball:

$ cd /
$ sudo tar -zxvf ${BUILD_TOP_DIR}/clang-builder/packages/kewb-clang501-CentOS-7-x86_64.tgz

or, alternatively, from within the build directory:

$ sudo tar -zxvf ./packages/kewb-clang501-CentOS-7-x86_64.tgz -C /

If you are satisfied that everything is working correctly, then at some point you’ll probably want to set ownership of the installed files to root:

$ cd /usr/local
$ sudo chown -R root:root clang/5.0.1/
$ sudo chown root:root bin/*clang501*

Installing from a compressed tarball is my preferred technique these days. It’s quick and easy to do the installation, it’s trivial to remove an installation, and absolutely no system configuration files, directories, or databases are affected.
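To illustrate why this works so cleanly, here is a small sketch that packs, lists, extracts, and removes a miniature stand-in for an installation. Everything happens under a scratch directory, so nothing on your system is touched; the file names are placeholders, and the sketch assumes (as the install commands above imply) that paths in the tarball are relative to the root directory.

```shell
work=$(mktemp -d)

# Build a tiny stand-in for the staging area and pack it with
# root-relative paths.
mkdir -p "${work}/dist/usr/local/clang/5.0.1/bin"
echo "placeholder" > "${work}/dist/usr/local/clang/5.0.1/bin/clang"
tar -czf "${work}/example.tgz" -C "${work}/dist" usr

# List the contents before extracting anything.
tar -tzf "${work}/example.tgz"

# Extract under a scratch root instead of "/".
mkdir -p "${work}/root"
tar -xzf "${work}/example.tgz" -C "${work}/root"
ls "${work}/root/usr/local/clang/5.0.1/bin"

# Removing the installation is a single directory delete.
rm -rf "${work}/root/usr/local/clang/5.0.1"
```

Listing a real package with tar -tzf before extracting it as root is a cheap way to confirm exactly which paths it will write.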

Create an RPM. It’s also possible to create an RPM:

$ ./make-clang-rpm.sh -v

The resulting RPM will be created in the ./packages subdirectory. On my CentOS-7 VM, the RPM file name is kewb-clang501-0.el7.x86_64.rpm. You can install it using rpm or yum on the command line. Of course, you’ll need to make sure that your GCC toolchain is also installed before you can use your new Clang compiler.

Step 8: Use Your Custom Clang Build

In order to use your new compiler installation, some path variables need to be adjusted. There are two ways to do this.

Option A: Source the script /usr/local/bin/setenv-for-clang501.sh, which was created with the compiler and installed as part of Step 7 above. This is the simplest approach.

$ source /usr/local/bin/setenv-for-clang501.sh

This script makes a copy of some existing path environment variables and then modifies those variables so that your compiler installation can be used. If you would like to stop using your new compiler and restore the old path environment variables, you can source /usr/local/bin/restore-default-paths-clang501.sh:

$ source /usr/local/bin/restore-default-paths-clang501.sh
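The save-and-restore idea can be sketched as follows. This is not the content of the installed scripts; the function and variable names (save_paths, restore_paths, SAVED_PATH, SAVED_LD_LIBRARY_PATH) are hypothetical and only illustrate the general mechanism.

```shell
# Remember the current values (hypothetical variable names).
save_paths() {
    SAVED_PATH="${PATH}"
    SAVED_LD_LIBRARY_PATH="${LD_LIBRARY_PATH-}"
}

# Put the remembered values back. Note that a previously unset
# LD_LIBRARY_PATH comes back as an empty string in this simple version.
restore_paths() {
    PATH="${SAVED_PATH}"
    LD_LIBRARY_PATH="${SAVED_LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}

save_paths
PATH="/usr/local/clang/5.0.1/bin:${PATH}"
echo "compiler directory first: ${PATH%%:*}"

restore_paths
echo "first entry after restore: ${PATH%%:*}"
```

Because the functions change variables in the current shell, code like this has to be sourced rather than executed, which is why the installed scripts are used with source.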

OR

Option B: Modify your PATH environment variable so that the directory $CLANG_INSTALL_PREFIX/bin appears before the directory where your system default compiler is installed (which is usually in /usr/bin or /usr/local/bin). You will also need to modify your LD_LIBRARY_PATH environment variable so that the $CLANG_INSTALL_PREFIX/lib, $GCC_INSTALL_PREFIX/lib, and $GCC_INSTALL_PREFIX/lib64 directories appear first in that path. For this example of building Clang 5.0.1 with a default GCC toolchain of GCC 7.2, the paths would be:

$ export PATH=/usr/local/clang/5.0.1/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/clang/5.0.1/lib:/usr/local/gcc/7.2.0/lib:\
/usr/local/gcc/7.2.0/lib64:$LD_LIBRARY_PATH
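The same thing can be written in terms of the two prefix variables, which makes it easier to reuse for other version pairs. This is a sketch under the assumption that you define the two prefixes yourself in your shell; it also avoids leaving a trailing colon in LD_LIBRARY_PATH when that variable was previously unset, since the dynamic linker treats an empty entry as the current directory.

```shell
CLANG_INSTALL_PREFIX=/usr/local/clang/5.0.1
GCC_INSTALL_PREFIX=/usr/local/gcc/7.2.0

# Put the Clang bin directory ahead of everything else.
PATH="${CLANG_INSTALL_PREFIX}/bin:${PATH}"

# Prepend the library directories; append the old value only if it is
# set and non-empty, so no empty entry is created.
LD_LIBRARY_PATH="${CLANG_INSTALL_PREFIX}/lib:${GCC_INSTALL_PREFIX}/lib:${GCC_INSTALL_PREFIX}/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

export PATH LD_LIBRARY_PATH

echo "${PATH%%:*}"
echo "${LD_LIBRARY_PATH}"
```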

Step 9: Clean Up

After you have built and installed Clang, are satisfied it’s working, and have decided to keep it, you can remove all of the working directories using the clean-clang.sh script:

$ ./clean-clang.sh

This script will delete the unpacked source and build directories for LLVM and libc++, the staging area ./dist, and the packages output directory ./packages.

Other Build Variables

There are several other build variables defined in clang-build-vars.sh. Although they all have default values that I think are reasonable, you may want to change one or more of them. The following is a brief description of the variables and their meanings.

Package Naming

The following two variables contribute to your installation’s name. The first specifies the installation package name, while the second sets the name of the Clang build triple.

##- Customize this variable to name the installation; the custom name
##  is displayed when a user invokes clang or clang++ with the -v flag
##  ("clang -v").
##
export CLANG_VENDOR="(KEWB Computing Build)"

##- Customize this variable to define the middle substring in the Clang
##  build triple.
##
export CLANG_CUSTOM_BUILD_TAG=kewb

The effects of customizing these variables can be seen by running clang -v:

[dev2 ~]$ clang -v
(KEWB Computing Build) clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
Target: x86_64-kewb-linux-gnu
Thread model: posix
InstalledDir: /usr/local/clang/5.0.1/bin
Found candidate Clang installation: /usr/local/clang/5.0.1/lib/gcc/x86_64-kewb-linux-gnu/5.0.1
Selected Clang installation: /usr/local/clang/5.0.1/lib/gcc/x86_64-kewb-linux-gnu/5.0.1
Candidate multilib: .;@m64
Selected multilib: .;@m64

In the output above, the first line begins with the custom package name (KEWB Computing Build), and the Target line shows the custom build triple (x86_64-kewb-linux-gnu).
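If you want to pull the build tag out of that output in a script, a little awk and cut will do. The sketch below parses a sample Target line so that it runs without Clang installed; on a real system the input would come from clang -v 2>&1, because that output is written to standard error.

```shell
# Sample line copied from the output above.
sample='Target: x86_64-kewb-linux-gnu'

# The triple is the second whitespace-separated field of the Target line.
triple=$(printf '%s\n' "${sample}" | awk '/^Target:/ { print $2 }')

# The custom build tag is the second dash-separated component.
tag=$(echo "${triple}" | cut -d- -f2)

echo "triple: ${triple}"
echo "tag:    ${tag}"
```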

Installation Location

The next variable is used to specify the directory in which your Clang build will be installed.

##- Customize this variable to specify where this version of Clang will
##  be installed.
##
export CLANG_INSTALL_PREFIX=/usr/local/clang/$CLANG_VERSION

Given that we’ve set CLANG_VERSION=5.0.1 in Step 4 above, from this we can see that the installation directory for our build will be /usr/local/clang/5.0.1.

Timestamp

The next variable specifies a timestamp that is applied to all files in the final installation package (.tgz or .rpm).

##- Customize this variable to specify the installation's time stamp.
##
export CLANG_TIME_STAMP=201712161000

It seems like a silly thing, but I like to have the files in my compiler installations all have the same timestamp, usually the day I complete the first full installation of that compiler.
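The value has the form CCYYMMDDhhmm, which is the format accepted by touch -t, so 201712161000 means 10:00 on 16 December 2017. One way to apply such a stamp to a whole directory tree is sketched below on a scratch directory; this illustrates the idea only, and the scripts themselves may go about it differently.

```shell
CLANG_TIME_STAMP=201712161000      # CCYYMMDDhhmm: 16 Dec 2017, 10:00

# A scratch tree standing in for the staging area (placeholder names).
tree=$(mktemp -d)
mkdir -p "${tree}/bin" "${tree}/lib"
: > "${tree}/bin/clang"
: > "${tree}/lib/libexample.a"

# Stamp every file and directory in the tree.
find "${tree}" -exec touch -t "${CLANG_TIME_STAMP}" {} +

ls -l "${tree}/bin"
```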

Concurrent Build and Test

The next variable specifies the number of concurrent processes to be used by make when building and testing the installation.

##- Customize this variable if you want to change the arguments passed
##  to "make" that specify the number of threads used to build Clang.
##
export CLANG_BUILD_THREADS_ARG='-j8'

I’ve had the fewest build problems and shortest overall build times with Clang when the number of build jobs is equal to the number of cores on my system. The scripts will also run Clang unit tests after the build is complete, using the same number of cores for testing as for building.
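Rather than hard-coding the job count, you could derive it from the number of cores on the machine. The following is a sketch of that idea, not something the scripts do by default; it tries nproc first, falls back to getconf, and settles for a single job if neither is available.

```shell
# Detect the number of online cores, falling back to 1.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Build the argument in the same form as the default shown above.
export CLANG_BUILD_THREADS_ARG="-j${cores}"

echo "${CLANG_BUILD_THREADS_ARG}"
```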

Summary

As with GCC, building Clang is a lengthy process. On my Xeon W3690-based workstation, building and testing with six cores takes about 45-60 minutes. On my newer Core i7-based system at work, building and testing with eight cores takes about 30-40 minutes.

Thanks for stopping by.

–Bob
