A crufty codebase is an experience every developer has had. From the outset, you plan for a beautiful architecture, elegant data flow, and simple code, but after years of bug fixes and feature additions, the codebase morphs into a husk of its former self, displeasing and tiresome to work on.
As you write your programs, you make sequential straightforward changes, like adding features or fixing bugs. Each individual change is, as you intended, good, but the combination is something that neither you nor anyone else on your team can be satisfied with. This concept has relatives in systems theory, philosophy, economics, and political theory. In this post, we’ll explore some of those theories and use them to evaluate our practice.
As a sort of fictional case study, we have Mike Hadlow’s The Lava Layer Anti-Pattern. In it, he describes a situation where each of four developers continue to modernize the data layer of an app. Through no fault of management or any of the developers, the final state of the code is fragmented and confusing, various modules in disagreement.
Broadly, this concept is known as emergence. Emergence describes a system whose high-level behavior is fundamentally different than its low-level components. Emergence can be seen in lots of fields, including (but certainly not limited to) biology, computer science, and traffic. Note that the theory of emergence doesn’t ascribe any value to the high-level behavior; neutral low-level actions can cause both positive or negative high-level outcomes. The cruft in our codebases comes from a series of changes that weren’t intended to have an effect on the overall structure of the program, but nevertheless do.
There are two theories in economics that are useful for analyzing codebases. First, broken window theory. The broken window theory suggests that fixing little things like broken windows or graffiti reduces the incidence of more serious crimes, like muggings. There are broken windows in your codebase, and you can fix them them pretty easily. Increasing consistency will make future committers more reticent to make unsightly and changes. Use tools like
clang-format to clean up whitespace and make code conform to standards, reducing needless comments in your code reviews.
The second economic theory that’s notable here is the Tragedy of the Commons. The Tragedy of the Commons refers to a simple idea: in a situation where a group shares a common resource, individuals can exploit that shared resource, enjoying the gains personally, and distributing the costs communally. It’s pretty easy to see how this applies to environmental issues, like pollution or overfishing.
The prototypical example is a group of ranchers that share a plot of land for their cows. As each rancher considers adding another cow to their herd, she evaluates the potential upside and downside. Upside: one more cow, which means more milk/beef/hide; downside: overgrazing of the shared land, for which the costs of repairing will be shared among all the ranchers. This is also known as an externality.
Think of your codebase as the commons: multiple people working in the same space. When working on a story or a ticket, you can take out technical debt at any point. Technical debt is formed when code is written suboptimally to optimize some other value, usually time. Especially on a team, taking a shortcut and finishing your work more quickly will make you look good in front of your teammates or your manager. Since someone has to pay back that debt, so you’re incentivized to throw your teammates or even a future version of yourself under the bus.
And it doesn’t even require ill intentions. Emergence can happen even when you have no designs either way. Have you ever reviewed a pull request that looked great in diff form, but when browsing the code later, it became obvious that the fix was unsound? It’s certainly happened to me. It’s possible (and sometimes even easy!) to accidentally write pull requests in such a way that they look good on GitHub, but closer examination of the fix’s surroundings reveals that it solves the problem in the wrong place or in the wrong way. This is most obvious with duplication. A diff can’t tell you if you’ve rewritten code that already exists in the codebase, and that can be a really simple way of building up technical debt, having code fall out of sync, and leaving holes for bugs to hide in.
Meditations on Moloch is an awesome blog post generalizing this concept to society, capitalism, and ideology. In it, the author quotes a short passage from the Principia Discordia, a conversation between a human and a goddess, which sums it up nicely.
”I am filled with fear and tormented with terrible visions of pain. Everywhere people are hurting one another, the planet is rampant with injustices, whole societies plunder groups of their own people, mothers imprison sons, children perish while brothers war. O, woe.”
WHAT IS THE MATTER WITH THAT, IF IT IS WHAT YOU WANT TO DO?
“But nobody wants it! Everybody hates it.”
OH. WELL, THEN STOP.
But how do we stop? Emergent behavior acts like entropy: reversing it takes work and, more importantly, coordination.
There are no real secrets here. A crufty codebase demands attention, and it won’t fix itself. Code will need to be refactored over time. For a team, coordination mostly happens at the code review level. This doesn’t just mean checking for style, but checking, at a very rigorous level, for flaws, duplication, confusing code, and spots that will otherwise cause problems in the future. Code review makes sure the team is synchronized.
For teams of one, solving your problem involves holistic attention to your codebase. No one else is around to catch your mistakes, so you have to put a lot of deliberation into it. On the other hand, there’s no one else to muck it up after you fix it.
But the best solution is just to never get yourself into the problem. Mike Hadlow suggests generally waiting for more maturity in the technologies that you introduce. He also seems to recommend waiting until you can replace a component wholesale instead of piecemeal.
This is your craft, and building good software is your primary job. Write code that doesn’t need documentation; barring that, write documentation. Write tests. Don’t write comments, but use good naming and method extraction to obviate them. With attention and coordination, you can make your codebase enjoyable to work in again.