depsaw: analyze and stop overbuilding in bazel

Summary #

Today, I’m introducing depsaw: an experimental tool that can be used to reduce overbuilding and overtesting in bazel.

In its current state, it is a tool that is used to analyze the dependency graph and commit frequencies per file to produce a list of targets that could be optimized significantly.

Here is an example from the bazel codebase itself:

$ depsaw trigger-scores-map ~/workspace/bazel "//:bazel-distfile"
targets:
- name: //src/main/java/com/google/devtools/build/lib/analysis:srcs
  rebuilds: 3604
  immediate_dependents: 1
  total_dependents: 5
  score: 3604
- name: //src/test/shell:srcs
  rebuilds: 3092
  immediate_dependents: 1
  total_dependents: 4
  score: 3092
- name: //src/main/java/com/google/devtools/build/lib/bazel:srcs
  rebuilds: 2741
  immediate_dependents: 1
  total_dependents: 5
  score: 2741
- name: //src/test/java/com/google/devtools/build/lib/rules:srcs
  rebuilds: 2647
  immediate_dependents: 1
  total_dependents: 5
  score: 2647
... # a few thousand other lines

This shows that //src/main/java/com/google/devtools/build/lib/analysis:srcs caused 3604 rebuilds of //:bazel-distfile and its dependents over the history of the repo.

In the future, I’d like to also introduce tooling to help automatically reduce or split dependencies. If you want to just dive in, look at the readme and give it a shot! If you’re interested in learning the story and design considerations, read on.

Backstory: overbuilding and overtesting downstream targets #

At Cruise, we use bazel as our build system for a significant chunk of the code base. Combined with a monorepo (where all source code for an organization is in a single code repository), it has allowed for a fluid experience with building and testing software. This will not be a deep dive into bazel and its capabilities, but bazel at a high level organizes projects in the following way:

Source files are consumed into buildable units known as targets. Each target is generated from a rule.
Targets can be dependants for other targets, runnable as tests, or as a generic executable (e.g. a command-line interface).
Bazel runs in two phases: an analyze phase that is able to understand the relationship between these dependencies.

Bazel has rules and toolchains for a wide variety of languages, allowing intermingling of python on C/C++ dependencies (e.g. with pybind), or adding in a static file (e.g. yaml or some binary asset) that a target depends upon.

Like all codebases, functionality gravitates toward a few upstream dependencies, on which thousands or more downstream targets depend on. For example, a utility library for wrapping shell scripts might have hundreds of thousands of dependants.

To reduce the cost of building everything, all the time, build systems often cache their results. Bazel is no exception, providing both local and remote caching functionality. This means that you only rebuild a target when it, or its dependencies, actually changes.

As this continues, we can run into overbuilding, where downstream targets are rebuliding too often, even when the code they actually depend on don’t change. This happens due to:

Depending on only a handful of files in a target that has a large number of files in its source files.
Depending on a target which pulls in another dependency, which is not used in your particular code path or is included erreously.

So how do you solve overbuilding? That’s what this post and depsaw are all about.

Fixing overbuilding and overtesting #

If you have a codebase where things are being overbuilt, the general workflow is to solve that is along the lines of:

Find the targets that are causing the most rebuilds: it’s good to scope the problem to the high-value targets.
Identify why the target is contributing so much to the builds. Apply the appropriate solution.
Back to 1.

Let’s dive into each of those in detail.

1: Find the targets causing the most rebuilds #

The targets that are causing rebuilds comes down to a couple dimensions:

Invalidations of the target’s builds. Targets can invalidate frequently because:
- They depend on a things that also invalidate frequently.
- There are a frequent changes to their source files.
Dependants. If you have a target that hundreds or thousands of other targets depend on, that can cascade and result in invalidating downstream targets very frequently.

We can create a scoring like:

score(target) = total_recursive_dependants(target) * len(commits_to_build(target))

The score is very high if there are a lot of dependents, or the target itself is changing very frequently. This is actually very similar to the metrics that Spotify used to analyze their overbuilding for their mobile apps.

Here’s a table to help you get a sense of the score:

score	dependents	average_distinct_commits_per_file	num_input_files	commits_dependencies_built
1000	100	1	10	0
501	10	1	1	50
500	10	10	5	0
200	2	1	100	0

Targets that have a lot of dependents will score higher, since the invalidate multiple targets.
Targets that have a lot of rapidly input files will score higher.
Targets that have a fair number of dependents, and depend on a fair number of targets, will score higher.
Targets that have a fair number of dependents, as well as have a high number of files modified, will score higher.

Using depsaw, you can get these scores with the following, running it against a git repository using bazel:

TARGET=YOUR_TARGET_HERE
depsaw trigger-scores-map $(pwd) "${TARGET}" --format=csv --since 2024-11-01 --deps-file "${DEPS_FILE}" > /tmp/deps.csv

Here’s some rough pseudocode of the algorithm:

def commits_to_build(target):
   commits = commits_that_modified_files(targets.input_files)
   for dep in target.dependencies:
      commits = commits | commits_to_build(dep)
   return commits

Basically, looking at all the commits that modified the input files of your target, and taking the union of that and all the commits that would have triggered a build of a dependency.

This is also recursive - which means that as you build the list of commits for one target, you can easily build it for all the targets it depends on at the same time.

We can build this algorithm by first getting the list of modified files in git, per commit, via a command like:

git log --numstat | awk '/^[0-9-]+/{ print $NF}' | sort | uniq -c | sort -nr

And use the information around dependencies extracted from bazel:

bazel query "deps(//...)" --output streamed_jsonproto`

2: Causes and solutions #

Fundamentally, overbuilding comes from depending on targets that are too big, for one reason or another.

too many files in a single package #

In this case, there are too many files that are grouped in the same target, such that targets must now depend on that single mega-target, rebuilding all dependants.

Solution: files should be split up into more granular targets. Using a code search tool to find the references to various files and imports. Analyzing those would help you determine which refactors are easiest by hand.

too much functionality in a single file #

In this case, there is a single file that is depended on by too many targets.

Solution: separate the file into multiple other files, then split those up into separate targets. Modify dependents to use the split dependency. Using a code analysis tool, you can see which functions are used the most, then factor those into separate targets.

unnescessary dependencies #

In this case, there are dependencies that are completely unnescessary. Anecdotally I think this is overexagerrated as an issue, but it is possible without some proper pruning that a target that was previously relevant no longer is.

Solution: remove that target.

conclusions and other thoughts #

This tooling has already been helpful: I’ve found opportunities to factor dependencies of some of my targets by 30%!

The above process is a start, but, like depsaw, there’s a lot more to do to make eliminating overbuilding a more automated and simpler exercise. I will probably have a follow-up post at some point when I have some better insights. If you have some ideas, please join the conversation at the bottom or contact me!

In the meantime, here’s some other musings:

Considering sibling dependencies for more score accuracy #

Even the score above is a bit naive - it ignores the fact that there are common dependencies that would be pulled in via other means anyway. for example, a utility library may be pulled in even if a different dependent is removed:

graph LR
    A[A]
    B[B]
    C[C]
    D[D]

    A --> B
    A --> C
    B --> D
    B --> E
    C --> D

In this case, removing B from A wouldn’t actually prevent rebuilds of Target A when D changes, since it would still be pulled in through B and C. So the value in removing a dependency would be more accurate if it included a way to measure the reduction in builds by removing B, and thereby removing the dependencies that are unique to it (like E).

For now, this can be emulated in depsaw by first re-running depsaw before and after removing a dependency - this will tell you what the actual net difference would be in eliminating that specific dependent from the graph.

Bazel is amazing #

Although separate from the point of this post, bazel is an extremely powerful tool. I would argue that for a monorepo to succeed, you need some sort of system that is used to define the complex relationship between software units, and is able to give you descriptive answers about these relationships. To that end, bazel provides the query command. You can use it to answer the following questions, and more:

question	bazel query
What targets does my code depend on?	`bazel query "deps(//foo)"`
What targets depend on the code I’m working on?	`bazel query --infer_universe_scope "rdeps(//..., //foo)"`
Why does foo depend on bar?	`bazel query "somepath(//foo, //bar)"`
What are the test targets that depend on me?	`bazel query "tests(//...) intersect rdeps(//..., //foo)"`

These types of queries are critical to identifying common build problems, including:

overbuilding: rebuilding targets that are not needed.
overtesting: testing downstreams that are not actually affected by your change.
identifying how an extraneous target is pulled into your dependencies.