Continuous delivery is a software development practice in which a team or a company strives to keep their software in a deliverable state at all times. Closely related is continuous deployment, in which the software is actually released to production as often as possible; continuous delivery is a prerequisite for continuous deployment. From a business point of view, continuous delivery allows a company to quickly adapt to changing market conditions, respond to user feedback, and so on. However, what is in my opinion a much stronger argument in favor of CD is that it enforces a large number of good software development practices; it is a core principle that can be referred to when making decisions about the practice of software development, ranging from how version control is handled to how testing and releasing are automated.
In this post I’ll go into the software development practice of version control, and discuss a number of strategies in the context of continuous delivery.
Version control, automated verifications and continuous integration
One of the core principles of CD is to keep the build green. All developers on a project work on a shared codebase and need to be able to rely on it always being in a working, deployable state. I will mostly use Git terminology, but the same ideas apply to other version control systems.
Trunk-based development
The simplest way for multiple developers to collaborate on a single codebase is probably Trunk-Based Development, in which all developers work on a single branch, usually the master branch. This collaboration method strongly implies Continuous Integration, in which changes are, as the name implies, integrated continuously, multiple times per day. This keeps everyone up to date on the latest developments, so that new and updated features quickly become known throughout the entire team. It intends to prevent people from working on islands, isolated from the rest of the team for more than a day.
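In Git terms, the day-to-day loop of trunk-based development can be as small as the following sketch (the test script name is a placeholder for whatever your project actually uses):

```sh
# A minimal trunk-based integration loop (a sketch, not a prescription).
git pull --rebase origin master   # integrate the latest changes from master locally
./run-tests.sh                    # run the automated test suite (placeholder script)
git push origin master            # share your work as soon as the tests pass
```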
The risk with trunk-based development, however, is that every push from any developer can break the build. Breaking the build leaves master in an unreleasable state; it interrupts the continuous deployment flow, and it will impede everyone working on the codebase until it has been resolved.
There are a number of ways to mitigate this risk. A major factor in successfully applying CD is test automation. One way to prevent breaking the build is to make sure all automated tests are green before integrating back into master. This can be enforced, for example, with a pre-push or pre-commit hook. An important note here is that all changes from the main codebase should first be integrated locally, so that the tests run against the combination of your work and everyone else's.
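As a sketch of what such a check could look like as a pre-push hook (the test runner is a placeholder; a pre-commit variant would look much the same):

```sh
#!/bin/sh
# .git/hooks/pre-push – a sketch of a hook that aborts the push when the
# automated tests fail. The test command is a placeholder for your project's
# own test runner.

git fetch origin

# Refuse to push if master has moved on since we last integrated it, so the
# tests always cover our work combined with everyone else's.
if ! git merge-base --is-ancestor origin/master HEAD; then
    echo "Your branch is behind origin/master; integrate it first (git pull --rebase)." >&2
    exit 1
fi

./run-tests.sh || {
    echo "Tests failed; push aborted." >&2
    exit 1
}
```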
One challenge is that running all tests does not scale for several categories of projects. If running all the tests takes longer than a minute, which is often the case with integration tests or end-to-end tests for client-facing applications (using e.g. Selenium), it becomes increasingly likely that changes have landed on master that are not integrated with the local work by the time the test run finishes. This problem also grows with the number of developers working on the codebase. It's possible to opt to not run all tests and accept that the build or release may break in the slower tests, but this reduces reliability and will eventually impede continuous delivery – not to mention that developers will feel less responsible when the build breaks.
Another approach is, as Facebook used to say and still puts into practice, to “move fast and break things”: rely less on automated testing and just deploy to production. What Facebook does is automatically make a new release available to a small subset of its users, then monitor incoming error reports. If errors come in, the release is not rolled out any further; instead, developers have to fix the problem and make a new release. If there are no new errors, the number of people who get to see the new or changed feature is gradually increased until it's deployed to 100% of users.
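I don't know how Facebook implements this internally, but the core mechanism, deterministically assigning each user to a bucket and comparing that bucket against the current rollout percentage, can be sketched in a few lines of shell (the user id and percentage are placeholders):

```sh
# A sketch of percentage-based rollout: hash the user id into one of 100
# buckets and only serve the new release to buckets below the current
# rollout percentage.
ROLLOUT_PERCENT=5
user_id="$1"

bucket=$(( $(printf '%s' "$user_id" | cksum | cut -d ' ' -f 1) % 100 ))

if [ "$bucket" -lt "$ROLLOUT_PERCENT" ]; then
    echo "serve new release to $user_id"
else
    echo "serve current release to $user_id"
fi
```

Because the bucket is deterministic for a given user, raising the percentage only ever adds users; nobody flips back and forth between the old and the new release.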
This approach works great for not-that-critical applications, which in my opinion includes most customer-facing applications. Critical errors that completely break everything for users should of course be avoided, but in practice most errors are subtle or merely “not ideal” rather than actually critical. Whether this approach will work for your application will, of course, depend on a large number of factors. What I think is important to consider is “How bad is it if something breaks?” I believe most bugs in released software end up being minor inconveniences, and if the release goes to only a small subset of users, it will be a minor inconvenience for only a small number of users. Even if there is a large problem, the impact stays limited – and when you have CD set up properly and can do rapid releases, a fix can be rolled out quickly.
In my personal experience, I've only worked with this approach on two projects. The first was one of my first projects at Xebia, at UPC, in a small team (4-5 developers) with the project lasting just six weeks. For many people on that team it was the first time they worked with Git, so there was a bit of figuring out to do. It seemed to work well enough for that application, since we communicated well and did most things in pairs, but the project was too short for CD to really be put into practice.
The other project was at NS, where usually just two people worked on a project. Initially we did trunk-based development, with most features being done in just a few days, but we would regularly switch to a feature-branch approach, mostly to avoid disrupting each other while developing. Feature branching allows developers to have their code tested and deployed before merging with the mainline, which helps avoid breaking the build and disrupting both other developers and the Continuous Deployment flow.
Feature branching
In a nutshell, feature branching is where a developer creates a branch off the mainline source code, does their work, and merges it back into the mainline. Before that merge is made, a number of verifications are performed: the full suite of automated tests is run, a code review is done by peers (in the same team, another team, a senior developer, etc.), a product owner inspects the new feature and provides feedback, and so on. To ensure that the master build remains green after merging, it's best to only allow merging branches that are up to date with master; that is, after the merge, the only thing that has changed on the mainline is the one new feature. Most version control hosting software (GitHub, GitLab, Bitbucket, etc.) has settings to enforce this; look for a fast-forward-only option. This ensures a linear history, and it means that after every merge a release is possible with a relatively small change compared to the previous release.
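As a sketch, the life of such a branch could look like this on the command line (branch and script names are placeholders; in practice the final merge usually happens through the hosting platform's merge button with the fast-forward-only option enabled):

```sh
git checkout -b feature/new-thing master   # start the branch from the current master
# ... commit work, push, get it reviewed ...

# Bring the branch up to date with master before merging, so the merge
# introduces nothing but this one feature.
git fetch origin
git rebase origin/master
./run-tests.sh                             # placeholder for the project's test suite

# Merge with a fast-forward only; this fails if master has moved on again.
git checkout master
git pull --ff-only origin master
git merge --ff-only feature/new-thing
git push origin master
```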
Feature branching allows teams and developers to work in isolation, and to not be interrupted if the build does go red for whatever reason. It also allows for a more formal approach to code review, as well as tweaking the commit contents and messages before merging. It is, in my opinion, a more scalable approach to software development than trunk-based development. It too can be combined with the Facebook approach of deploying to a limited number of users; writing and maintaining a full suite of end-to-end tests takes a lot of time and effort, and the payoff may not justify that investment. The ‘partial deployment’ approach allows developers to spend less time on comprehensive tests, which in turn leaves more time for adding or changing features, as well as for more frequent deployments.
In my personal experience, end-to-end web application tests (first using FitNesse, later Protractor, both using Selenium / WebDriver under the hood) tend to be unstable, slow, not comprehensive, hard to debug, and a challenge to keep updated. When the test tools are not stable, they do not contribute to CD. When they are not comprehensive, they leave gaps and chances of breaking things for end users.
An extension of the feature branching model is the git-flow model, which adds an additional layer between the integration branch (master in the previous examples, called develop in git-flow) and the “ready for deployment” branch. This adds a layer of indirection, a buffer so to speak, with more margin for error and for breaking the build. I don't think this is a good strategy: because breaking the build does not hurt as much, there is less incentive to fix its root causes. Going to production also involves more ritual and process, with several extra steps. These rituals can be automated to some degree, but it's better not to have them in the first place than to try to hide them – the more rituals, automated or otherwise, the more overhead there is impeding CD.
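For comparison, cutting a release in the commonly described git-flow model involves roughly these steps (version and branch names are placeholders):

```sh
# Cutting a release in git-flow: branch off develop, stabilise, then merge
# into both master and develop.
git checkout -b release/1.2.0 develop
# ... last fixes, version bumps, release testing ...
git checkout master
git merge --no-ff release/1.2.0
git tag -a 1.2.0 -m "Release 1.2.0"
git checkout develop
git merge --no-ff release/1.2.0
git branch -d release/1.2.0
```

Each of these steps is one more place where the process can stall, which is exactly the kind of overhead described above.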
The “simple” feature branching approach is what I've used most in projects; I've found that other developers and I naturally gravitate towards it. It takes away a lot of the headache of keeping master up to date, it pushes conflict resolution (if necessary) back to the moment of integration instead of it happening all the time, and it allows automated tests to run on a CI server and block the merge if there is a regression or other issue. What probably improved code quality the most, though, was that with the help of tools like GitHub, GitLab and Stash, more formal code reviews became possible.
Before we switched to using these systems, we'd go around and shop for anyone willing to review our code. This meant pulling someone out of their concentration, which was understandably considered annoying. The other problem with our code reviews was that instead of the reviewer taking their time and going through the code, it was more often the developer showing the code, browsing through it and explaining it. With the more formal tooling, it turned into an asynchronous process in which reviewers could go through the code in their own time, at a moment of their choosing. The main thing there is to agree not to leave reviews open for too long, or the original developer will already have moved on to the next thing. This usually happens anyway, but it should be kept to a minimum to prevent excessive context switching for both the developer and the reviewer.
Most version control schemes are a variation of the above two or three approaches. The fork & pull request flow as popularised by code hosting platforms like GitHub is a variation of the feature branch approach, with the specific variation that the fork doesn’t need to integrate back into the upstream codebase – it can live on as its own product if the original authors of the forked project do not agree with changes made.
The Linux kernel, the project Git was originally designed for and one of the largest codebases under version control, uses a highly distributed approach: instead of branches and pull requests, it has a hierarchy of maintainers, with Linus Torvalds as the main integrator who merges the changes sent via e-mail by the top-level maintainers into his mainline version of the kernel. Pages like Submitting Patches and First Kernel Patch try to explain the process to some degree.
Whichever approach you decide on as a team, stick with one approach and be consistent, so that everyone knows what's going on. Second, as a team, learn to understand Git: learn to read the status of your local repository from the command-line output, be aware of the difference between local and remote branches, and know what you have locally and what is shared with others. Keep your history clean, and make liberal use of tools like amend and interactive rebase to clean up the history you are creating – but only as long as you haven't shared it with others yet. Rewriting history after pushing is highly discouraged in trunk-based development; in fact, all major Git hosting solutions offer the option of protecting a branch, and I recommend enabling that for the master branch at all times, so nobody can rewrite its history.
I do rewrite history in feature branches using (interactive) rebasing and force-pushing, mostly because while the initial work for a feature might be just a handful of commits, the fixes and rework that follow code review tend to add a lot of “fix this”, “fix that” commits. By using interactive rebasing, you can rewrite history so that when the feature is finally merged to master, it appears to have been done right the first time. That's a small fiction, of course, and changes will still be added later, but at least the initial development effort will be clean in your Git history. Don't do this if more than one person is working on the same branch, as it will cause a lot of pain; prefer pair programming if you really have to, or agree to make a mess of the history while developing and have a single developer clean it up once development is done. Also keep in mind that this is advanced Git usage.
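As a sketch of that cleanup on a review branch (the branch name is a placeholder):

```sh
# Squash the "fix this", "fix that" review commits into the commits they
# belong to, then update the remote branch.
git fetch origin
git rebase -i origin/master        # mark the fixup commits as "fixup" or "squash"

# --force-with-lease refuses to overwrite the remote branch if someone else
# pushed to it in the meantime, unlike a plain --force.
git push --force-with-lease origin feature/new-thing
```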
Keep in mind that with any branch-based approach (i.e. anything that is not trunk-based), you can amend the code, the commits and everything else up until the final merge. If you go for such an approach, use that freedom to ensure your history is as clean as it can be, and that when you do hit the merge button, your product is ready to go to production.