Khanlou | Continuous Integration

July 16, 2019

Continuous Integration

A few months ago, Aaron Patterson tweeted about some three-year-old uncommitted code:

tfw you work on a patch for 3 years and it’s like 2k lines and then you commit it and hope CI passes

Aaron’s CI build, naturally, didn’t pass.

At this point, Martin Fowler (the Martin Fowler, that Martin Fowler) chimed in:

is this where I get to be the annoying internet pedant and point out that if a patch has been developed for 3 years it’s most certainly not CI?

Why am I dredging up a Twitter dunk from 3 months ago? Because this tweet, these 146 characters, has completely upended the way I think about branches, working in teams, and continuous integration in general.

Up until reading that tweet (a tweet!), almost every use of the phrase “continuous integration” in my day-to-day life revolved around setting up some build server (usually but not always something like BitRise, BuddyBuild, Travis, etc.) to build my code and run tests before I’m allowed to merge a branch.

But Martin’s tweet made me realize something. Build servers are just a tool used in the service of continuous integration. They don’t represent the philosophy behind it. The name carries the meaning quite well — continuously (as frequently as possible) integrate your work into some shared place, usually the mainline branch.

The implications are manifest. Integrating frequently means:

You don’t have to deal with horrible merges.
Not only do your teammates know what you’re working on at any given time, they can begin using it before you’re finished.
If you have to drop your work for months (or, in Aaron’s case, years), the work you’ve done isn’t in limbo, doesn’t go out of date, and can be built on by the rest of your team.
Code review on shorter diffs is more fruitful than code review on longer diffs.
Code review earlier in the process can yield more useful discussion on architecture and high-level patterns than code review later in the process, after the feature is fully built.
The fault lines between features form earlier, and any bugs that happen because of the interaction between two features emerge earlier, making them easier to solve and alleviating crunch time before a release — a problem that even the Swift compiler team has run into.

Code that isn’t integrated into your team’s mainline branch should be considered a liability at best, and dead at worst. Your job is to get the small, workable units of code merged in as early as possible.

A problem presents itself, however. You need to build a feature that takes 1,000 lines of code, but you’d like to merge it in smaller chunks. How can you merge the code in if it’s not finished?

Broadly, the strategy is called “branch by abstraction”. You “branch” your codebase, not using git branches, but rather branches in the code itself. There’s no single way to do branch by abstraction; there are many techniques, each useful in different situations.

The easiest way to do this is to create functions and objects that aren’t actually called from anywhere. For example, making a new feature can involve writing the model objects first and making a PR from just those. Super easy to code review and get out of the way before the meat of your feature is ready.
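As a rough sketch of what such a first PR might contain, here are two model types for an imaginary playlist feature. The names Track and Playlist are made up for illustration. Nothing else in the app refers to them yet, so merging them can't change any existing behavior.

```swift
// Hypothetical model types for a feature that isn't finished yet.
// No existing code calls these, so they are safe to merge on their own.
struct Track: Equatable {
    let title: String
    let durationInSeconds: Int
}

struct Playlist: Equatable {
    let name: String
    var tracks: [Track] = []

    // Total running time, summed over every track in the playlist.
    var totalDurationInSeconds: Int {
        tracks.reduce(0) { $0 + $1.durationInSeconds }
    }
}
```

A diff like this is small enough to review in a few minutes, and the later PRs that add networking or UI can build on types the team has already agreed on.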

Of course, the humble if statement is also a great way to apply this technique; use it liberally with feature flags to turn features on and off. (A feature flag doesn’t have to be complicated. A global constant boolean gets you pretty far. Feature flags don’t have to come from a remote source! I would recommend against compile-time #if statements, however. Code that doesn’t get compiled might as well be dead.)
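Here is a minimal sketch of that kind of flag, assuming an imaginary checkout feature. FeatureFlags, isNewCheckoutEnabled, and the two title functions are all hypothetical names. The point is that a regular runtime if keeps both code paths compiled and type-checked, which a compile-time #if would not.

```swift
// Hypothetical flag: a plain constant, flipped in code once the feature is ready.
enum FeatureFlags {
    static let isNewCheckoutEnabled = false
}

// Stand-ins for the existing code path and the in-progress one.
func legacyCheckoutTitle() -> String { "Checkout" }
func newCheckoutTitle() -> String { "Express Checkout" }

// A runtime `if`: the unfinished branch still has to compile on every build,
// even though the flag keeps it switched off by default.
func checkoutTitle(newCheckoutEnabled: Bool = FeatureFlags.isNewCheckoutEnabled) -> String {
    if newCheckoutEnabled {
        return newCheckoutTitle()
    } else {
        return legacyCheckoutTitle()
    }
}
```

Taking the flag as a defaulted parameter is one possible design choice: production call sites just write checkoutTitle(), while tests can exercise the new path with checkoutTitle(newCheckoutEnabled: true) before the flag is ever flipped.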

Lastly, you can take advantage of the language features you have access to. Get creative! I like to use Swift’s typealiases to enable old code to work while I rename a type or refactor some code. The Swift standard library uses this feature to great effect:

public typealias BidirectionalSlice<T> = Slice<T> where T : BidirectionalCollection

BidirectionalSlice used to be its own type, but with conditional conformance, a separate type wasn’t needed any more. However, in order for consumer code to continue compiling, a typealias was added to ease the transition between a world with this discrete type and a world without, all the while never breaking anyone’s code. If you’re interested, this whole file is full of such goodies.
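The same trick works in app code. As a small sketch, suppose a type once called UserProfile is being renamed to AccountProfile (both names are hypothetical). The alias lets the rename land in one small PR while the old call sites are migrated in later ones.

```swift
// The type under its new name.
struct AccountProfile {
    let displayName: String
}

// The old name survives as an alias, so code written against it keeps compiling.
typealias UserProfile = AccountProfile

// An existing function that hasn't been migrated yet and still uses the old name.
func greeting(for profile: UserProfile) -> String {
    "Hello, \(profile.displayName)!"
}

// Newly written code uses the new name; both refer to the exact same type.
func farewell(for profile: AccountProfile) -> String {
    "Goodbye, \(profile.displayName)!"
}
```

Because the alias and the new name are one type, values pass freely between old and new code. Marking the alias with Swift's @available(*, deprecated, renamed: "AccountProfile") attribute would additionally warn callers that still use the old name, and once the last of them is gone the alias can be deleted.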

If your team decides to take this approach, then you’re going to have a lot more code reviews in front of you. If there’s a lot of latency in your pull request pipeline, then that latency will begin to block you and your teammates’ efforts to continuously integrate their changes into the mainline branch.

It’s important to remember that code review is a blocked thread of execution, and, as a reviewer, your job is to unblock your teammates as quickly as possible. Prioritizing code review over almost all other work speeds up a slow code review pipeline, and enables steady integration of your coworkers’ code.

Ultimately, continuous integration isn’t a build server, it’s a mindset. Keep your outstanding branches small and get code into the mainline branch steadily and frequently, and your team will see appreciable yields.