Continuous Delivery
David Farley about Continuous Delivery
Petra Kiers, 24 Feb 2014
In trunk-based development, all developers commit their changes directly to the master branch. This collaboration method heavily implies Continuous Integration, in which changes are, as the name implies, continuously integrated multiple times per day. This keeps everyone up to date on the latest developments, so that new and updated features quickly become known throughout the entire team. It is intended to prevent people from working on islands, isolated from the rest of the team for more than a day.
The risk with trunk-based development, however, is that every push from any developer carries the risk of breaking the build. Breaking the build leaves master in an unreleasable state. It interrupts the continuous deployment flow, and will impede everyone working on the codebase until it has been resolved.
There are a number of ways to mitigate this risk. A major factor in successfully applying CD is test automation. One way to prevent breaking the build is to make sure the automated tests are all green before integrating back into master. This can be enforced, for example, by setting up a pre-push or pre-commit hook. An important note here is that when integrating back with the main codebase, all local changes should first be integrated with the latest master before running the tests, so the tests run against the combined result.
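As a minimal sketch of how such a hook could work (assuming a repository with origin/master as the mainline and `make test` as the test command, both of which are illustrative assumptions rather than anything prescribed here), a pre-push hook might refuse the push when the latest master has not yet been integrated locally, and otherwise run the test suite:

```python
#!/usr/bin/env python3
# .git/hooks/pre-push -- sketch of a hook that blocks a push when the local
# branch is behind origin/master, or when the automated tests fail.
# The test command ("make test") and the branch name are assumptions.
import subprocess
import sys

def run(*cmd):
    return subprocess.run(cmd, capture_output=True, text=True)

# Fetch the latest mainline so the ancestry check below is meaningful.
run("git", "fetch", "origin", "master")

# If origin/master is not an ancestor of HEAD, there are mainline changes
# that have not been integrated locally; ask the developer to do that first.
ancestor = run("git", "merge-base", "--is-ancestor", "origin/master", "HEAD")
if ancestor.returncode != 0:
    print("origin/master has changes you have not integrated; pull or rebase first.")
    sys.exit(1)

# Run the automated tests; a non-zero exit code aborts the push.
tests = subprocess.run(["make", "test"])
sys.exit(tests.returncode)
```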
One challenge is that running all tests does not scale for several categories of projects. If running all the tests takes longer than a minute, which is often the case when running integration tests or end-to-end tests for client-facing applications (using e.g. Selenium), it becomes increasingly likely that there have been changes on master that are not integrated with the local work by the time the test run finishes. This problem also grows with the number of developers working on the codebase. It’s possible to opt not to run all tests, and accept that the build / release may break in the tests that take longer to finish, but this reduces reliability and will eventually impede continuous delivery - not to mention developers will feel less responsible when the build breaks.
An alternative approach is, as Facebook used to say and still puts into practice, to “move fast and break things”. That is, rely less on automated testing and just deploy to production. What Facebook does is automatically make new releases available to a small subset of their users, then monitor incoming error reports. If errors come in, the release is not made available to more users; instead, developers have to fix the problem and make a new release. If there are no new errors, the number of people that get to see the new or changed feature is gradually increased until it’s deployed to 100% of users.
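A minimal sketch of this kind of percentage rollout, assuming some stable user identifier and a rollout percentage kept in configuration (the function and feature names below are purely illustrative):

```python
import hashlib

def in_rollout(user_id: str, feature: str, percentage: int) -> bool:
    """Deterministically bucket a user into a 0-99 range per feature, so the
    same user stays in (or out of) the rollout while the percentage is
    gradually increased."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percentage

# Example: start at 5% of users, widen towards 100% if no new errors come in.
if in_rollout(user_id="user-42", feature="new-checkout", percentage=5):
    pass  # serve the new release / feature
else:
    pass  # serve the current stable release
```

Because the bucket is derived deterministically from the user ID, a given user does not flip in and out of the rollout as the percentage grows, which keeps the incoming error reports comparable between stages.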
This works great for non-critical applications, which in my opinion covers most customer-facing applications. Critical errors that completely break everything for users should of course be avoided, but in practice most errors are subtle or just “not ideal” rather than actually critical. Whether this approach will work for your application will, of course, depend on a large number of factors. What I think is important to consider is: how bad is it if something breaks? I believe most bugs in released software end up being minor inconveniences, and if the release is deployed to only a small subset of users, it will be a minor inconvenience to only a small number of users. Even if there is a large problem, the impact will be low - and when you have CD set up properly and can do rapid releases, a fix can be rolled out quickly.
In my personal experience, I’ve only worked with this approach on two projects. The first was one of my earliest projects at Xebia, at UPC, in a small team of 4-5 developers on a project lasting just six weeks. For many people in that team this was the first time working with Git, so there was a bit of figuring out to do. It seemed to work well enough for that application, since we had good communication and did most things in pairs, but the project was too short for CD to really be put into practice.
The other project was at NS, where usually just two people were working on a project. Initially we did trunk-based development, with most features being done in just a few days, but we would regularly switch to a feature branch approach, mostly to avoid the other developer and myself disrupting each other while developing. Feature branching allows developers to have their code tested and deployed before merging with the mainline, which can help avoid breaking the build and disrupting both other developers and the Continuous Deployment flow.
To make sure the master build remains green, it’s best to only allow merging of branches which are up to date with master; that is, after the merge the only difference from the previous mainline is the one new feature. Most version control management platforms (GitHub, GitLab, Bitbucket, etc.) have options to enforce this; look for a fast-forward only setting. This ensures a linear history, and ensures that after every merge a release is possible with a relatively small change compared to the previous release.
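As a sketch of what such a gate might look like outside of the hosted platforms, a small script on the CI server could refuse to merge any branch that is not up to date with master; the branch name is an assumption, and only standard Git commands (merge-base, --ff-only) are relied on:

```python
#!/usr/bin/env python3
# Sketch of a CI-side gate: only allow a fast-forward merge of a feature
# branch whose history already contains the tip of origin/master.
import subprocess
import sys

branch = sys.argv[1] if len(sys.argv) > 1 else "feature/my-change"  # assumed name

subprocess.run(["git", "fetch", "origin", "master", branch], check=True)

# master must be an ancestor of the feature branch, i.e. the branch is
# up to date and merging it is a pure fast-forward.
up_to_date = subprocess.run(
    ["git", "merge-base", "--is-ancestor", "origin/master", f"origin/{branch}"]
).returncode == 0

if not up_to_date:
    print(f"{branch} is behind origin/master; rebase or merge master first.")
    sys.exit(1)

# Fast-forward only: this fails instead of creating a merge commit, so
# master always ends up pointing at a commit that was tested as-is.
subprocess.run(["git", "checkout", "master"], check=True)
subprocess.run(["git", "merge", "--ff-only", f"origin/{branch}"], check=True)
print("Merged; master is still releasable with one small change on top.")
```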
Feature branching allows teams and developers to work in isolation, and not to be interrupted if the build does turn red for whatever reason. It also allows for a more formal approach to code review, as well as tweaking the commit contents and messages before merging. It is, in my opinion, a more scalable approach to software development than trunk-based development. It can also be combined with the Facebook approach of deploying to a limited number of users; writing and maintaining a full suite of end-to-end tests takes a lot of time and effort, and its payoff may not be that high. The ‘partial deployment’ approach allows developers to spend less time on comprehensive tests, which in turn allows more time for adding or changing features, as well as more frequent deployments.
In my personal experiences, end-to-end web application tests (first using FitNesse, later Protractor, both using Selenium / WebDriver under the hood) tend to be unstable, slow, not comprehensive, hard to debug, and a challenge to keep updated. When the test tools are not stable, they do not contribute to CD. When they are not comprehensive, they leave gaps and chances at breaking things for end-users.
An extension of the feature branching model is the git-flow model, which adds an additional layer between the integration branch (master in the previous examples, called develop in git flow) and the “ready for deployment” branch. This adds a layer of indirection, a buffer so to speak, where there is more margin for error and breaking the build. I don’t think this is a good strategy: because breaking the build does not hurt as much, there is less incentive to fix its root causes. There is also more ritual and process involved in going to production, with several extra steps. These rituals could be automated to some degree, but it’s better not to have the ritual in the first place than to try to hide it - the more rituals, automated or otherwise, the more overhead there is impeding CD.
The “simple” feature branching approach is what I’ve used the most in projects, and I’ve found that other developers and I naturally gravitate towards it. It takes away a lot of the headache of keeping up to date with master, it pushes conflict resolution (if necessary) back to the moment of integration instead of all the time, and it allows automated tests on a CI server to run and block the merge if there is a regression or issue. What probably improved code quality the most, though, was that with the help of tools like GitHub, GitLab and Stash, more formal code reviews became possible.
Before we switched to using these systems, we’d go around and shop for anyone willing to review our code. This meant pulling someone out of their concentration, which was considered somewhat annoying. The other problem with our code reviews was that instead of the reviewer taking their time and going through the code, it was more often the developer just showing the code, browsing through it and explaining it. With the more formal tooling, it turned into an asynchronous process in which the reviewer could review the code in their own time, at a moment of their choosing. The main thing there is that you have to agree not to leave reviews open for too long, or the original developer will already have moved on to the next thing. This happens to some degree anyway, but it should be kept to a minimum to prevent excessive context switching for both the developer and the reviewer.