Is backward compatibility even possible?
Writing a new API? Have you considered your version compatibility guarantees? It’s harder than you might expect.
API compatibility is pervasive. On the web, the misspelling of the HTTP “Referer” header has persisted since 1995, and every browser today pretends to be Netscape Navigator just in case some webserver still cares.
With all this effort, it’s worth asking: is it even possible to have a truly backward-compatible API?
(If you’re new to APIs, you can substitute “menus”)
What is backward compatibility?
In the context of APIs, let’s say you’ve got an API v1, and you’re considering releasing a new version, vNEXT. If version vNEXT is backward-compatible, then existing users should be able to use the same code that worked with v1, without any changes.
Some kinds of changes are obviously not backward-compatible:
Deleting anything visible to the user (and renaming is a kind of deleting)
Changing the types of existing input or output fields
Changing default values
Making fields required when they used to be optional
But as Hyrum’s Law says: With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.
More concretely, what if your harmless bug fix breaks a behavior that your users were relying on?
This isn’t just a joke. At work, I had regular discussions when writing release notes to decide whether behavior changes counted as “breaking” or not. Here are some innocuous changes that might break things for your users, based on real issues I’ve seen.
Bob’s business budgeting app
Let’s imagine a concrete scenario where we can make some changes. You’re Bob (hi, I’m Bobbie!) and you’ve created a business budgeting app. You have a bunch of users and even some corporate partners, and you’re looking forward to making some improvements:
Making it easier to set an expense title
Fixing a bug where user inputs are parsed incorrectly
Adding better logging
Improving performance
All of these improvements should be backward-compatible, right? Right?
Making it easier to set an expense title
Your budgeting app has a CLI, or rather, for some reason, a chat interface (popular with the kids!). Currently, there’s a command to start a new expense, `create expense`, that takes no other arguments.
Your users often ask to set the title of the expense directly from the command, so you add the feature. Now, users can set the title by specifying it as the last argument: `create expense "My new title"`.
Oops! You just broke some existing usage. What if one of your users had a script that had `create expense enter`, an accidental typo they had made because they were voice-coding? Now that same command will have different behavior. But do you even care? It’s probably not a big deal.
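To make the breakage concrete, here is a minimal sketch of the two parsers. The names `parse_v1` and `parse_v2` and the returned dictionaries are hypothetical, and the sketch assumes v1 silently ignored any extra words (which is why the typo went unnoticed). The only point is that the same input line gives a different result in each version.

```python
import shlex

def parse_v1(line):
    """v1 (assumed): `create expense` takes no arguments; extra words are ignored."""
    words = shlex.split(line)
    if words[:2] == ["create", "expense"]:
        return {"action": "create_expense", "title": None}
    raise ValueError(f"unknown command: {line!r}")

def parse_v2(line):
    """vNEXT: everything after `create expense` becomes the expense title."""
    words = shlex.split(line)
    if words[:2] == ["create", "expense"]:
        title = " ".join(words[2:]) or None
        return {"action": "create_expense", "title": title}
    raise ValueError(f"unknown command: {line!r}")

# The script with the typo behaves differently after the upgrade.
old = parse_v1("create expense enter")
new = parse_v2("create expense enter")
print(old["title"], new["title"])
```

Users who typed exactly `create expense` see no difference; only the users relying on the unspecified "extra words are ignored" behavior are affected.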
Fixing a bug where user inputs are parsed incorrectly
In that same budgeting app, you discovered a new issue. If a user inputs a number with a leading zero (like “071” instead of “71”), your system interprets it as an octal (base-8) number instead of decimal. This is not at all what you intended, and your customers would agree.
71 in octal = 57 in decimal (image via coolconversion.com)
Let’s just parse all numbers in decimal and ignore leading zeros entirely. Or alternatively, reject the input and tell the user to remove any leading zeros.
Oops, these are both breaking changes - the same input now produces a different result or error.
Maybe you can continue with the old behavior, but add an additional warning message or log about octal numbers. Probably not breaking, but does it really help the users? They probably prefer the app to accurately track their finances…
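The three behaviors can be sketched side by side. The function names below are hypothetical and the parsing is simplified to whole numbers; this is an illustration of the trade-off, not the app's actual code.

```python
def parse_amount_v1(text):
    """Current (buggy) behavior: a leading zero switches parsing to base 8."""
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)

def parse_amount_decimal(text):
    """Option A: always parse as decimal, ignoring leading zeros."""
    return int(text, 10)

def parse_amount_strict(text):
    """Option B: reject leading zeros and ask the user to fix the input."""
    if len(text) > 1 and text.startswith("0"):
        raise ValueError(f"remove leading zeros from {text!r}")
    return int(text, 10)

print(parse_amount_v1("071"))       # 57
print(parse_amount_decimal("071"))  # 71
```

For the input "071", v1 returns 57, option A returns 71, and option B raises an error. Any caller that stored or compared the old result sees a change either way.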
Adding better logging
Okay, this octal bug has made you nervous. You want to be careful, so instead of changing anything directly, you added a new log message every time the situation occurs. Users should be able to read the log message, and you’ll also collect some telemetry to learn how common this situation is across the whole userbase.
You ship this new logging version, and the complaints immediately start rolling in. Users are reporting a new error: No space left on device.
Oops! Your log message produces a visible side effect - it takes up space on the user’s filesystem. With your new version, the same user inputs now produce loads of logs, and that can cause systems to crash.
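One way to limit this particular side effect is to cap how much disk the new log can ever use. The sketch below uses Python's standard `logging.handlers.RotatingFileHandler`; the logger name, file name, and size limits are assumptions chosen for illustration.

```python
import logging
import logging.handlers
import os

def make_bounded_logger(log_dir, max_bytes=10_000, backups=2):
    """Return a logger whose files never exceed roughly max_bytes * (backups + 1)."""
    logger = logging.getLogger("budget.octal")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the sketch from also writing to the root logger
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "octal.log"),  # hypothetical file name
        maxBytes=max_bytes,
        backupCount=backups,
    )
    logger.addHandler(handler)
    return logger, handler

def total_log_bytes(log_dir):
    """Sum the size of every file in the log directory."""
    return sum(
        os.path.getsize(os.path.join(log_dir, name))
        for name in os.listdir(log_dir)
    )
```

With these settings the handler keeps the current file plus two rotated backups and deletes older data, so a chatty message costs a bounded amount of space. It is still a behavior change: old log lines now disappear.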
(Fun fact: as a software engineer, one of the first production outages I caused was related to adding a log message to learn more about usage - when the thing I was trying to log didn’t exist, the application crashed (a total rookie mistake). Also, the automated rollback of my code failed, because the same crash took down the deployment API. Oops!)
Improving performance
After you finish apologizing to all your users for the recent breakages, you decide to make some performance improvements. Everyone loves faster code, right? This is a safe thing to do and surely nothing could go wrong.
The new app now calculates the yearly tax summary almost instantaneously. It’s a huge improvement over the previous version, which used to take several seconds. You ship it.
…and oops, one of your biggest partners has called you to complain. You’ve broken their website, which embeds your app as part of their business management suite. It turns out their code expected your calculation to take at least 5 seconds. Now that it’s faster, users encounter lots of errors and results that don’t make any sense.
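How can faster code break a caller? Here is a hedged sketch of the kind of race the partner might have written. Everything in it is hypothetical (`tax_summary`, `partner_page`, the template string), and the delays are scaled down from seconds to fractions of a second. The partner starts the calculation first and loads its template afterwards, which only works while the calculation is the slower of the two.

```python
import asyncio

async def tax_summary(delay):
    """Stand-in for the yearly tax summary; `delay` simulates compute time."""
    await asyncio.sleep(delay)
    return {"total": 1234}

async def partner_page(calc_delay):
    """Partner code that assumes the template loads before the result arrives."""
    state = {}

    async def run_calc():
        result = await tax_summary(calc_delay)
        # Hidden assumption: state["template"] is already set by now.
        state["rendered"] = state["template"].format(**result)

    task = asyncio.create_task(run_calc())
    await asyncio.sleep(0.05)  # loading the template takes a little while
    state["template"] = "Total: {total}"
    try:
        await task
    except KeyError:
        return "error"
    return state["rendered"]

print(asyncio.run(partner_page(0.2)))  # slow calculation: Total: 1234
print(asyncio.run(partner_page(0.0)))  # fast calculation: error
```

The bug was always in the partner's code, but it was invisible until the timing changed.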
In frustration, you quit your job and return to the construction industry. At least here, no one expects a house upgrade without disruption.
Can we fix it?
I’m sorry, Bob. We can still fix anything. Just not in a backward-compatible way.
A principled engineer might argue that some of these were avoidable mistakes. If we were better at the design stage, we might remember the guideline to prefer flags to args and write better tests to prevent bugs entirely. But the rest were the user’s fault - it’s not our problem that they wrote race conditions and didn’t set up log rotation!
But for your business, is it valuable to point fingers? You shipped a new version, and customers saw that it was broken. That’s already bad enough without blaming them, too. With the Hyrum’s Law perspective on API compatibility, you can’t assume that any change is backward-compatible.
No change is without risk. What can you do to manage that risk?
You can keep around a copy of the old code. It’s technically easy to do this with a library or software, but it can get expensive quickly when you’re running a SaaS with infrastructure costs. Customer support is more complicated, and maintaining security patches for older versions takes extra effort too.
Or, you can roll out new changes carefully. Canary deployments, feature flags, observability tools, and opt-in configuration are all ways to reduce the risk of a given change. These are also complex or expensive; ultimately, it’s easier to think about your product when everyone is running the latest and greatest version.
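As a rough illustration of a feature flag with opt-in and a percentage rollout, consider the sketch below. The function names and the idea of gating the new number parser are assumptions for this example; real flag systems usually add persistence, targeting rules, and a kill switch.

```python
import hashlib

def rollout_bucket(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100

def use_new_parser(user_id, rollout_percent, opted_in=frozenset()):
    """Decide whether this user gets the new behavior.

    Users who explicitly opted in always get it; everyone else gets it
    only if their bucket falls below the current rollout percentage.
    """
    if user_id in opted_in:
        return True
    return rollout_bucket(user_id) < rollout_percent

# Start at 0%, let volunteers opt in, then raise the percentage gradually.
print(use_new_parser("alice", 0, opted_in={"alice"}))
print(use_new_parser("bob", 0))
```

Because the bucket is derived from a hash of the user ID, a user who sees the new behavior at 10% keeps seeing it at 50%, which keeps support conversations simpler during the rollout.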
As always, you should maintain strong communication with your users and customers. Release notes and semantic versioning highlight expected breakages so that they can plan ahead. In the long run, it’s inevitable that you will break someone’s workflow. Hopefully it will be an edge case like Bob’s changes, but regardless - help them fix it, whether that’s by changing their workflow or your product.
Breaking changes are a design choice
True backward compatibility is a very difficult target. Instead of sticking to backward compatibility at all costs, you should consider breaking changes as an intentional design option.
Have you ever stumbled when using tabs vs. spaces in a Makefile? It’s incredibly frustrating to realize your script was broken because you used the wrong kind of invisible whitespace character.
# won't work, command starts with spaces
build:
    go build -o myprogram main.go

# works, command starts with a tab "\t"
test: build
	go mod tidy
	go test
That’s a backward compatibility decision. Stuart Feldman, the creator of Make, recalls:
Within a few weeks of writing Make, I already had a dozen friends who were using it.
So even though I knew that "tab in column 1" was a bad idea, I didn't want to disrupt my user base.
So instead I wrought havoc on tens of millions.
Stuart Feldman, via Michael Stillwell (beebo.org): Tabs and Makefile
Consider the tens of millions! Is it worth it to sacrifice them to backward compatibility?