VCPKG — a C++ package manager

2020-05-21 C++, vcpkg, conan, package manager, Nexus

Recently I embarked on a project to extract the external dependencies of Personal Desktop and build them in a separate Azure job, providing the results as artifacts to the main job which builds Personal Desktop. I took this as an opportunity to rethink how we use external libraries in PD and started with some research. This blogpost describes what I found.

Most scripting languages (Python, JavaScript, Ruby, …) come with an integrated package manager (pip, npm, gem, …) used to deal with the external dependencies of a project. And even some modern compiled languages do (Rust has Cargo, Go has modules, C# has NuGet). C++, however, does not (though this might change with the Packaging C++ Modules proposal). It turns out that there are package managers for C++ which we can use today. Before going into the details, let us first step back a little bit.

Package managers

When a project grows to a certain size, it inevitably comes across a problem which has already been solved by someone else (or by many someone elses). A sound engineering practice, embodied in the “do not reinvent the wheel” adage, would dictate that we use that someone else’s work. Of course, there might be good reasons to diverge from this practice. The solution might be unavailable, or it might not be a good one (though we are often too quick to reach this conclusion). Another obstacle can be the amount of effort it takes to integrate the solution into our project, and this is exactly what package managers aim to solve. So, what are our options when a standard package manager is not available?

  1. Go against the best practice and just reinvent the wheel
  2. Import the external code into the project as is
  3. Write our own package manager
  4. See if there is some nonstandard package manager we could use

This post is about the last option, so, without further ado, what package managers does C++ have?

Existing C++ package managers

A cursory Google search will come up with a surprising number of possibilities.

  1. CMake’s ExternalProject This consists of a CMake function which creates a custom target to download, update/patch, configure, build, install and test an external project. To integrate an external library, one uses the ExternalProject_Add function to describe how to build the dependency (a minimal sketch is shown after this list).

  2. Meson’s Wrap Somewhat similar to the previous option, except that it is aimed at the Meson build system. Additionally, it has a curated list of prepared recipes (i.e. build description files) called the WrapDB, so that integrating an external dependency is as easy as downloading a single file from the online wrap database. The database currently contains 116 packages.

  3. Hunter, built on CMake’s ExternalProject Hunter expands on CMake’s ExternalProject by providing a set of ready-made recipes. Additionally, it allows for versioning and has the notion of a binary cache which several projects can share to avoid unnecessarily rebuilding common dependencies. Currently, it has 527 different recipes.

  4. dds (build system, very opinionated) dds integrates package management into the build system itself. Packages are described using a JSON file format and, if a project follows a standard layout, creating this file is very straightforward, since dds infers most of what it needs from the layout. While dds can manage the full build process, it also has some rudimentary integration with CMake (which allows one to use dds packages in a CMake project). Currently, there is no ready-to-use online collection of dds packages.

  5. Conan Conan is a dependency manager for C++ in the spirit of Python’s pip. It has support for binary packages, versioning and dependencies, and is build-system agnostic (it has integrations for the more popular build systems). A project’s dependencies are listed in a conanfile.txt file and are downloaded/built using the conan command. Package recipes are written in Python. The Conan project is backed by the DevOps company JFrog.

  6. Buckaroo (built on the buck build system from Facebook) A dependency manager written in F#, tightly integrated with the buck build system. It does not have the notion of a binary package; instead it relies on a build artifact cache (as, e.g., provided by buck). Package recipes are written in Python. Currently it has 358 official recipes.

  7. build2 An all-in-one system (build system & package manager) written in C++. It prides itself on precise change detection. Currently it has 60 packages and doesn’t support binary packages.

  8. vcpkg A package manager from Microsoft, somewhat similar to CMake’s ExternalProject_Add. It provides CMake recipes for packages. It is similar to Unix system packages in that the versions of the packages in the repository should, in theory, be integrated and work together. Currently it has official recipes (ports in their terminology) for 1381 packages.

  9. others
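
To give a more concrete idea of the first option, here is a minimal sketch of what pulling in a dependency via ExternalProject might look like (the zlib URL and the flags are illustrative placeholders, not taken from our build; a real project would also pin URL_HASH):

    # minimal ExternalProject sketch (illustrative only)
    include(ExternalProject)
    ExternalProject_Add(zlib_external
        URL         https://zlib.net/zlib-1.2.11.tar.gz
        CMAKE_ARGS  -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR>
                    -DCMAKE_BUILD_TYPE=Release
        INSTALL_DIR ${CMAKE_BINARY_DIR}/external
    )
    # the main project then has to be pointed at the installed headers/libraries,
    # e.g. by adding ${CMAKE_BINARY_DIR}/external to CMAKE_PREFIX_PATH

Note that ExternalProject downloads and builds the dependency at build time of the enclosing project, which is why it is typically used in a so-called superbuild setup.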

The problem then becomes which one to choose.

Choosing a package manager for Personal

Notice that both the "package manager" part and the "Personal" part of the title are important. In particular, we are not interested in a build system, and we are not trying to choose a package manager which would be ideal for general use (for some considerations in this direction, see the following blogpost by Corentin Jabot). Let's try to write down a list of requirements.

Personal requirements

  0. multiplatform (supports at least Win, Linux and MacOS)
  1. easy to integrate into our build
  2. supports the external dependencies we use (zlib, boost, cryptopp, curl, pkcs11pp, wxwidgets), preferably without needing any work from us (wishful thinking, here)
  3. supports patching external dependencies and, in particular, gives us control over how they are built
    3.1. supports building external dependencies from source
    3.2. supports splitting the build (building the dependencies separately from the main project)

General requirements

  4. has a healthy community of developers & users
  5. supports a wide range of packages
  6. has a central repository of packages, but allows for custom repositories
  7. sane & secure naming scheme
  8. source based (for the reasoning, see the above mentioned blogpost)

(credit: requirements 5.–8. were taken from Corentin’s post)

It is important to note that the requirements, so far, have no weights on them; however, since our goal is to improve Personal, requirements 1.–3. should be considered hard requirements. Looking at 1., we can eliminate the managers which require changing the build system. This leaves us with ExternalProject, Hunter, Conan and vcpkg. So how do they fare? Well, none of them really satisfies requirement 7, and all of them satisfy 0 and 8 (and 3.1, which is a subset of 8). We'll look at requirement 4 separately, so let's exclude these and concentrate on the others. We look at the options one by one:

ExternalProject It of course satisfies 0.–2. and 8. Requirement 4 might be debatable (the Prague team at Nexus feels very healthy), but realistically speaking it is not satisfied (no one in Prague wants to go anywhere near the build system unless forced). The other requirements are clearly not satisfied.

Hunter Integrating Hunter seems to definitely require work; however, it's hard to guess how complicated the work is without trying. For 2, we're missing wxwidgets and pkcs11pp. Unfortunately, wxwidgets is almost a hard requirement; we would need to create a wxwidgets package from scratch. For 3, Hunter, by default, builds the packages when building the project; however, package builds are cached in a separate directory which can be shared between builds, and it even has a facility for packaging these into binary artifacts which can be automatically downloaded during the build of the project. For 5, it currently supports 527 packages. For 6, its central repository consists of git repositories in the hunter-packages namespace on GitHub. The only search interface is the repository search available on GitHub, which, on the other hand, does the job.

Package recipes are simple CMake files which use predefined hunter helper functions.
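
For illustration, a recipe for a hypothetical package foo would look roughly as follows (the pattern follows Hunter's documentation for creating new packages; the package name, URL and SHA1 are placeholders):

    # cmake/projects/foo/hunter.cmake -- hypothetical package "foo"
    include(hunter_add_version)
    include(hunter_cacheable)
    include(hunter_download)
    include(hunter_pick_scheme)

    hunter_add_version(
        PACKAGE_NAME foo
        VERSION 1.2.3
        URL "https://example.com/foo-1.2.3.tar.gz"
        SHA1 0000000000000000000000000000000000000000  # placeholder hash
    )
    # foo uses a standard CMake build, so the generic url_sha1_cmake scheme applies
    hunter_pick_scheme(DEFAULT url_sha1_cmake)
    hunter_cacheable(foo)
    hunter_download(PACKAGE_NAME foo)

A project then consumes the package with hunter_add_package(foo) followed by the usual find_package(foo CONFIG REQUIRED).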

Conan Integrating Conan, at least according to the documentation, seems to be straightforward. Unfortunately, the same is true of vcpkg, which turned out to require quite a lot of work. We are skeptical and give Conan a fuzzy - on this one. For 2, the story is almost the same as for Hunter, except that Conan does have an unofficial wxwidgets package, which earns it a +. For 3, Conan has a separate step to install the required dependencies (much like pip install). For 5, the official package repository Conan Center states that it has "126,889 Binary Packages Indexed", but that probably just means that each configuration (version, flags, …) of each package is counted separately. The packages are built from recipes in the conan-center-index GitHub repo, which currently has 355 recipes. For 6, Conan has the Conan Center repository of official packages, an official repo of beta packages, conan-community, and a community repository of OSS packages, bincrafters.

Package recipes are Python programs somewhat similar to setup.py for Python packages.

Vcpkg Integrating vcpkg, at least according to the documentation, seems to be straightforward. Unfortunately, much of this depends on the quality of the package recipes, which varies a lot. Our experience was that zlib, curl and cryptopp were trivial, boost required some small changes on our side, and we created a recipe (port in vcpkg parlance) for pkcs11pp, which was not hard. However, dealing with wxwidgets was a nightmare. This earns vcpkg a clear -. This brings us to 2: here vcpkg is a theoretical winner (it has a wxwidgets package), however it is probably not that different from Conan, so we give it a +. For 3, dependencies are built using the vcpkg command (vcpkg install) and can be exported as a zip file to be consumed by the project build step. For 5, there are 1381 official packages, and for 6, recipes are provided in a GitHub repo; searching is possible through the vcpkg search command. Custom repositories work by creating a ports directory with package recipes and passing its path to vcpkg on the command line.

Package recipes consist of a directory (somewhat inspired by the debian directory of Debian packages) which contains a simple metadata file called CONTROL, a portfile.cmake which drives the build, optional patch files (used to patch the package sources) and other CMake helper files. The portfile uses vcpkg helper functions and is typically quite simple.
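
As an example, a portfile for a hypothetical library foo hosted on GitHub could look roughly like this (the helper functions are the ones vcpkg provides at the time of writing; REPO, REF and SHA512 are placeholders):

    # ports/foo/portfile.cmake -- hypothetical port "foo"
    vcpkg_from_github(
        OUT_SOURCE_PATH SOURCE_PATH
        REPO example/foo
        REF v1.2.3
        SHA512 0                    # placeholder; a real port pins the archive hash here
        HEAD_REF master
        # PATCHES fix-build.patch   # patch files from the port directory would be listed here
    )
    vcpkg_configure_cmake(
        SOURCE_PATH ${SOURCE_PATH}
        PREFER_NINJA
    )
    vcpkg_install_cmake()
    # every port installs the upstream license as its "copyright" file
    file(INSTALL ${SOURCE_PATH}/LICENSE DESTINATION ${CURRENT_PACKAGES_DIR}/share/foo RENAME copyright)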

The table below gives a brief summary of this analysis. The requirement numbers stand for: integration (1), package support (2), package customization (3), community (4), number of packages (5), and package repos & search (6).

| Package manager | Requirement | Status / Notes |
| --- | --- | --- |
| ExternalProject | 1, 2 | satisfied |
| | 3 | not satisfied |
| | 4 | might be debatable (the Prague team at Nexus feels very healthy); realistically speaking, it is not satisfied (no one in Prague wants to go anywhere near the build system unless forced) |
| | 5, 6 | not satisfied |
| Hunter | 1 | (?) |
| | 2 | supports zlib, boost, curl, cryptopp |
| | 3 | see Hunter binaries |
| | 4 | see below |
| | 5, 6 | supports 527 packages; the only search interface is the GitHub repository search (each package has its own repo) |
| Conan | 1 | (?) |
| | 2 | supports zlib, boost, curl, cryptopp; wxwidgets has an unofficial package |
| | 3 | yes; packages are installed by running conan install and support building from source as well as binary packages |
| | 4 | see below |
| | 5, 6 | currently has 355 official recipes and three package repos: conan center, conan community and bincrafters |
| Vcpkg | 1 | in principle easy, in practice quite hard |
| | 2 | theoretically supports all of them, but wxwidgets needs heavy patching |
| | 3 | yes; packages are built by vcpkg install and binary packages can be created by vcpkg export; packages are consumed by providing a toolchain file to CMake and using find_package |
| | 4 | see below |
| | 5, 6 | supports 1381 packages; recipes are provided in a GitHub repo, and searching is possible through the vcpkg search command |

Community Health

Finally, let's look at the communities around the package managers. For simplicity, we will only look at vcpkg and Conan here. Hunter, while definitely not a one-man show, still has a much smaller community than the other two.

The following is inspired by (well, more precisely, shamelessly cribbed from) Kevin Ottens's blog post on planet.kde.org, and we use his ComDaAn scripts.

The following two graphs try to visualize project activity. They show the committers to each project's repository on the x-axis (sorted by the date of their first commit). The y-axis shows time in weeks. For each committer and week, a dot is put on the graph if the committer committed to the repo in that week. The color of the dot corresponds to the number of commits: the more commits in the week, the darker the dot. The curve on the left side of the graph (the first dot in each line) visualizes the ability of the project to attract new contributors: the steeper it is, the higher the rate of new committers. This rate isn't by itself a useful metric; it is also important that committers stay. This can be seen on the right-hand side: the denser it is, the more successful the project is in retaining committers.

Comparing these plots, we can see that the rates of new committers are very similar, although Conan has about half as many committers (533) as vcpkg (1122). The vcpkg project seems to be slightly better at retaining committers, while Conan seems to have more active committers. Out of the 1122 vcpkg committers, 67 have an e-mail address @microsoft.com, indicating that they are Microsoft employees. For Conan, it is hard to determine which committers are JFrog employees (there is only one Conan committer with a @conan.io address). Overall, both seem to be very healthy projects.

We would now like to see more of the structure of the community. The following two graphs show a vertex for each committer. Two vertices are linked together if the committers committed to the same file in the repo, indicating that they worked on the same code. The color of a vertex corresponds to its centrality in the graph; roughly speaking, central nodes are those with short paths to the other nodes. The more central nodes there are, the more people, again roughly speaking, have deep knowledge of the code.

We can see that the Conan community has proportionally about twice as many highly central nodes. For vcpkg, it seems that most committers work on a limited number of recipes and there isn't as much collaboration on common code.

The final choice

We have two clear candidates --- Conan and vcpkg, with Conan having a slight edge on most metrics, although the community (and the number of packages) of vcpkg is larger. In the end, we chose vcpkg, since we did not want to introduce another language (Python) into our build system. On the other hand, Conan would have been a completely valid choice too.

We now finish with some technical details of our implementation.

Implementing vcpkg in Personal

We started by splitting our main Personal mono-repo into two repositories: one for the external libraries (personal.libs), the other for the main application (personal).

External libraries The personal.libs repo is used for building our dependencies using vcpkg. These dependencies are then imported during the build process of the main repo (personal). The personal.libs repo contains:

  1. a snapshot of the official vcpkg repository as a submodule in ./vcpkg. Vcpkg handles versioning differently from the other package managers, closer to how system package managers manage versions: the idea is that a snapshot (commit) of the repo contains package versions which are known to work together. This means that there is much less flexibility in choosing package versions, at the cost of a (theoretical) guarantee that the versions work together. In practice, this is not too big a problem: if we need a different package version, we provide it in our "ports overlay" (vcpkg's term for a repository of custom recipes).

  2. a directory ./ports containing our custom recipes; these cover the libraries for which we have custom patches (boost, wxwidgets) and libraries where we need different versions than those available in the vcpkg repo. See the ports overlay documentation for more info.

  3. a directory ./triplets which contains our compiler toolchain settings (global compiler flags, dynamic vs. static linkage, architecture, ...) for the various platforms we support; a sketch of such a triplet file is shown after this list. See the triplet files documentation for more info.

  4. the package-list.txt file which contains the list of packages that should be built (required by personal)
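
The triplet files mentioned in point 3 are small CMake scripts which set vcpkg variables; the following is a hypothetical sketch in the spirit of our nx-64-linux-rel triplet (the values are illustrative, not our actual settings):

    # triplets/nx-64-linux-rel.cmake -- illustrative values only
    set(VCPKG_TARGET_ARCHITECTURE x64)
    set(VCPKG_CRT_LINKAGE dynamic)
    set(VCPKG_LIBRARY_LINKAGE static)
    set(VCPKG_CMAKE_SYSTEM_NAME Linux)
    # build only release variants of the packages
    set(VCPKG_BUILD_TYPE release)
    # global compiler flags applied to every package build
    set(VCPKG_CXX_FLAGS "-O2")
    set(VCPKG_C_FLAGS "${VCPKG_CXX_FLAGS}")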

When a new dependency is needed, we first check whether a recipe for the library is available in the official vcpkg repo (in the vcpkg/ports subdirectory of personal.libs). If it is not (or if the official recipe needs some modifications), we create a new recipe for it in our "local" ports directory. Using the new dependency in personal is then, ideally, just a matter of adding a single find_package call to the personal CMakeLists.txt. Sometimes (as was the case for the wxWidgets dependency), integrating the package into personal needs more work, typically pointing include and library paths to the correct subdirectory of ${_VCPKG_INSTALLED_DIR}/${VCPKG_TARGET_TRIPLET}.
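
For the easy case, the CMake side is just the usual find_package / target_link_libraries pair. The snippet below is an illustrative sketch (the personal target name and the imported target names are assumptions; the actual names depend on our project and on each package's CMake config):

    # illustrative excerpt from a CMakeLists.txt consuming vcpkg-built packages
    find_package(ZLIB REQUIRED)
    find_package(CURL CONFIG REQUIRED)
    target_link_libraries(personal PRIVATE ZLIB::ZLIB CURL::libcurl)

    # the wxWidgets-style hard case: point the find module's search hints
    # into the vcpkg installed tree
    set(wxWidgets_ROOT_DIR "${_VCPKG_INSTALLED_DIR}/${VCPKG_TARGET_TRIPLET}")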

The build process

To build the external libraries on a development machine (so that they can be used when compiling the main application locally), the following vcpkg invocations are sufficient:

    # or ./vcpkg/bootstrap-vcpkg.bat on windows
    # this builds the vcpkg command and pulls in necessary dependencies (like cmake, ...)
    # it is sufficient to run this only once
    $ ./vcpkg/bootstrap-vcpkg.sh

    # remove outdated packages
    # replace vcpkg with vcpkg.exe on windows; also adjust the triplet (nx-64-linux-rel) appropriately
    $ ./vcpkg/vcpkg remove --outdated --overlay-ports=$PWD/ports --overlay-triplets=$PWD/triplets --triplet nx-64-linux-rel

    # install the packages that personal requires
    # replace vcpkg with vcpkg.exe on windows; also adjust the triplet (nx-64-linux-rel) appropriately
    $ ./vcpkg/vcpkg install --overlay-ports=$PWD/ports --overlay-triplets=$PWD/triplets --triplet nx-64-linux-rel @package-list.txt

When building personal, it is then sufficient to pass -DVCPKG_TARGET_TRIPLET=nx-64-linux-rel -DCMAKE_TOOLCHAIN_FILE=$PERSONAL_LIBS_REPO/vcpkg/scripts/buildsystems/vcpkg.cmake to the cmake command when configuring the build (again, the triplet specification needs to be adjusted to match the target platform).

To build the packages on Azure, we need one more step. Since the build of personal.libs can't share its environment with the build of the personal repo (in general, the VM images are unique to a build job and are destroyed immediately after it finishes), we need to export the built packages as build artifacts so that they can be used by the build of personal. This is done by running the export subcommand of vcpkg:

   $ ./vcpkg/vcpkg export --output=personal.libs.Ubuntu18 --7zip --overlay-ports=$PWD/ports --overlay-triplets=$PWD/triplets --triplet nx-64-linux-rel @package-list.txt

Additionally, to save time, we implement caching so that we do not unnecessarily rebuild vcpkg and packages which have not changed. We put the following CacheBeta@0 task near the top of our job:

    - task: CacheBeta@0
      displayName: 'Cache vcpkg & package builds'
      inputs:
        # As 'key' use the content of the response file, vcpkg's commit id and build agent name.
        # The key must be one liner, each segment separated by pipe, non-path segments enclosed by
        # double quotes.
        # key: $(Build.SourcesDirectory)/package-list.txt | "$(Agent.Name)"
        key: '"v3-32" | "$(Agent.OS)" | $(Build.SourcesDirectory)/package-list.txt | $(Build.SourcesDirectory)/vcpkg/ports/**/CONTROL | $(Build.SourcesDirectory)/ports/**/CONTROL'
        restoreKeys: |
          "v3-32" | "$(Agent.OS)" | $(Build.SourcesDirectory)/package-list.txt | $(Build.SourcesDirectory)/vcpkg/ports/**/CONTROL
          "v3-32" | "$(Agent.OS)" | $(Build.SourcesDirectory)/package-list.txt | $(Build.SourcesDirectory)/ports/**/CONTROL
          "v3-32" | "$(Agent.OS)" | $(Build.SourcesDirectory)/package-list.txt
          "v3-32" | "$(Agent.OS)"
        path: '$(Build.SourcesDirectory)/vcpkg'
        cacheHitVar: CACHE_RESTORED

The cache keys, which determine what version is taken from the cache, start with the most strict one: a concatenation of a fixed key (v3-32, used so that we can manually force a clean build by changing it), the VM image OS, the hash of package-list.txt, the hashes of the vcpkg ports' CONTROL files and the hashes of our own ports' CONTROL files. The directory which is cached is the ./vcpkg directory in the repo, which contains the builds of the packages.

To use the packages in the personal build on Azure, we first download and extract the exported artifacts from the personal.libs build by including the following tasks in azure-pipelines.yaml:

    - task: DownloadPipelineArtifact@2
      displayName: 'Download compiled dependencies'
      inputs:
        buildType: specific
        project: '$(System.TeamProjectId)'
        pipeline: 52
        buildVersionToDownload: 'latest'
        artifact: personal.libs.ubuntu18
        downloadPath: $(System.DefaultWorkingDirectory)
    - task: ExtractFiles@1
      displayName: 'Extract compiled dependencies'
      inputs:
        archiveFilePatterns: 'personal.libs.ubuntu18.7z'
        destinationFolder: $(System.DefaultWorkingDirectory)
        cleanDestinationFolder: false
    - task: CopyFiles@2
      displayName: "Move compiled dependencies to vcpkg-libs"
      inputs:
        sourceFolder: $(System.DefaultWorkingDirectory)/personal.libs.ubuntu18
        contents: '**'
        targetFolder: $(Build.SourcesDirectory)/vcpkg-libs
        overWrite: true
        cleanDestinationFolder: false

Finally, we modify CMakeLists.txt to use find_package to find most of the vcpkg packages (wxWidgets needs some special casing, since find_package does not work for it) and provide cmake with the right toolchain and triplet options on the command line:

 -DVCPKG_TARGET_TRIPLET=nx-64-linux-rel -DCMAKE_TOOLCHAIN_FILE=$(Build.SourcesDirectory)/vcpkg-libs/scripts/buildsystems/vcpkg.cmake

Conclusion

I've tried to show in this blogpost that, although there currently isn't any standard dependency management system for C++, there are several viable candidates which have already attracted a considerable community around them. I have looked at Conan and Vcpkg in more detail and then briefly described some technical details of our Vcpkg integration.

Given all that is happening in this area (and, in particular, the Packaging C++ Modules proposal), it will be interesting to see how the dependency management landscape for C++ evolves.