How I Learned To Stop Worrying and Pushed To Master

11/22/2021

If you've worked on a team of developers of any size, you've had to agree on how your code is branched and integrated. For about fifteen years now, I've been following some form or another of a GitFlow branching model. If you're not familiar with GitFlow, it is a branching model where you isolate work in feature branches and carefully fold them into release branches that make their way back into your main branch. If this sounds complicated, hang tight - I'll bring in some diagrams.

All of that to say, in the past couple of months, my team has switched to using trunk-based development and seen some huge gains in our productivity and overall morale. The idea behind trunk-based development is that every commit goes straight to the main branch. It means that you have multiple developers committing to the same branch multiple times a day.

Why would anyone do this? Isn't it a nightmare to manage releases and bugs? I can already feel the nervous twitching from the GitFlow purists. Just give me a moment to make my case. If you approach it with an open mind, I think you may even end up agreeing with me.

Why GitFlow Wasn't Working For Us

GitFlow is a set of checks and balances designed to slow you down and ensure higher quality in your main branch.

The flow is essentially:

Branch off of the develop branch
Iterate on your feature until it is completed
Merge back to develop
A release branch is cut to merge back to the main branch and get deployed

In practice, slowing down the flow of integration can arguably introduce more bugs and lower quality than moving fast. After all, if you are in danger, your natural instinct is to run faster. Why would we think that we would be more likely to avoid danger by slowing down?

Upon reflection, GitFlow was causing several issues for our team at the time I reevaluated it.

GitFlow Was Slowing Us Down

This is partly by design, but the merge process was slowing down our development. For a feature to be considered ready to merge into the main branch, it had to go through:

Code review (in GitHub)
Passing unit tests
A build on CircleCI
Deployment to a user acceptance testing environment
Manual and automated QA (often multiple rounds)

Our velocity was taking a huge hit. We were under-resourced in QA to handle checking and re-checking every feature for regressions, only to merge into develop and find that bad merges had introduced new ones.

Our three-developer team was consistently completing between 6 and 8 story points per sprint, which was a serious red flag for hitting some of our deadlines. I knew something had to change or we were going to be behind on delivery. After changing our flow, we got as high as 16 points per sprint. When bugs started pouring in from QA, we dropped to 12, but so far we have never gone lower.

Merge Of Doom

One of the biggest issues with a sophisticated branching model is that branches tend to diverge from each other. No matter how much you try to keep developers working on their own corners of the application, there will inevitably be overlap, and where there is overlap, there are conflicts.

If you have three developers working on the same application and each developer completes two or three stories over the course of a sprint, you may have hundreds or even thousands of files being committed, particularly if you are working on a greenfield application. If even a handful of the same files get touched for multiple features in the same sprint, there will be conflicts. That turns out to be a best-case scenario.

Conflicts are costly even when managed well. We would often pair program through conflicts to make sure we weren't accidentally overwriting something important that another developer worked on. It would often mean a couple of hours a week with two or more developers comparing notes on a conflict, trying to make sure their changes didn't get lost.

When managed poorly, conflicts can be even more costly. Features we thought were completed would be missing important requirements. Bugs that were fixed would be broken again. New bugs would be introduced.

This was amplified by the time that passed between when a branch was cut from develop and when it was merged back. For bigger features, this could be one or even two weeks. The more time that passed, the more the branch diverged from the rest of the code.

If a feature was left undone for a month or more, it was almost impossible to resurrect and merge back into the codebase because the shape of things would change so much over time.

The Solution: Delete All the Branches

One day, I was programming something and couldn't get it working. My eight-year-old son came upstairs to get something and I decided to use him as my rubber duck. After I explained the issue in a way I thought an eight-year-old could grasp, he looked at me and said "why don't you just delete all your code and start over?" What I'm proposing here is the process equivalent of deleting all of your code. What I'm proposing is deleting all of your branches but one.

This all came together for me when I was catching up on YouTube and stumbled across Dave Farley's video Continuous Integration vs Feature Branch Workflow. He explained many of the inherent problems that I was seeing in my team's ability to deliver code quickly. At first I had a negative gut reaction to it because of all of my years of feature branching. Then I remembered that my career has been full of nothing but mindset changes and pivots. The question was whether continuous integration could work in the real world with multiple developers pushing code constantly. I did some investigation and found that a lot of large companies use the CI model, so it seemed that it should be able to scale, perhaps better than branching.

With trunk-based development, or continuous integration, developers are encouraged to push their code to the main branch frequently. Not just when a feature is finished, but every time there is something new and useful. Features can be toggled on and off with feature flags, so your code releases should theoretically be separate from your feature releases, which is a huge benefit for delivery.
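To make the feature flag idea concrete, here is a minimal sketch in TypeScript. The flag names, the FeatureFlags type and the resolveFlags, isEnabled and checkoutLabel helpers are all made up for illustration; a real project might load flags from configuration or a flag service instead of hard-coding them.

```typescript
// Hypothetical flag names, invented for this example.
type FlagName = "newCheckout" | "betaDashboard";

type FeatureFlags = Record<FlagName, boolean>;

// Every flag is off unless an environment explicitly turns it on.
const defaultFlags: FeatureFlags = {
  newCheckout: false,
  betaDashboard: false,
};

// Layer per-environment overrides on top of the defaults.
function resolveFlags(overrides: Partial<FeatureFlags>): FeatureFlags {
  return { ...defaultFlags, ...overrides };
}

function isEnabled(flags: FeatureFlags, name: FlagName): boolean {
  return flags[name];
}

// The unfinished code path is already merged and deployed,
// but it only runs where the flag is switched on.
function checkoutLabel(flags: FeatureFlags): string {
  return isEnabled(flags, "newCheckout") ? "New checkout" : "Classic checkout";
}

const stagingFlags = resolveFlags({ newCheckout: true });
const productionFlags = resolveFlags({});
```

With this shape, the same commit can be deployed everywhere while the feature itself is only visible where its flag is on, which is what keeps code releases separate from feature releases.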

Changes We Made

Moving to continuous integration meant changing our processes at many levels. I thought through all the things that were slowing us down from the point of a developer finishing a task to the code finally getting merged. For our team, we made the following adjustments.

Code Reviews After Merge

We were reviewing code before merging it using GitHub's pull requests. Since we didn't have branches anymore, we also no longer had pull requests, which meant our code reviews needed to take place at a different point. We transitioned to a weekly meeting where we walk through the code that was committed the previous week. It's a time to suggest improvements, to find areas where we're doing similar things in different ways and combine our efforts, and to get a better understanding of the pieces of the system that we don't touch as much. I feel that it's actually been more productive than the asynchronous pull requests we were doing before, because it gives the author of the code an opportunity to explain what the code is doing and why certain choices were made.

QA In Staging

On a typical day, we push code to the main branch 10 to 15 times. If the code builds and the tests pass, it automatically deploys to a staging environment.

Before, we were building ephemeral UAT environments for each feature so it could be tested independently before graduating to the develop branch, where we would test that it integrated with the other features correctly. At that point, we would cut a release and merge it to staging, which was a close replica of production and the last point of failure before pushing to production.

We eliminated every environment but staging and production. We are using Serverless Stack (SST) to run our own developer environments and test locally. SST uses all of the AWS infrastructure we use once deployed, but it allows us to emulate Lambda invocation locally, which is the best of both worlds. Because of that, our local environments essentially serve as a testing ground for our features before they get merged back to the main branch.

We do have a manual QA step, but since we are continuously integrating our code, every issue that is found is considered a new bug in our system. It means we have more bug cards, but it also allows us to close down features more quickly because there is nothing holding them back. This has reduced our context switching because we can finish whatever feature we're building and then tackle the bugs that come in behind it. There's no pulling down a divergent branch, reinstalling new dependencies and trying to get back into the state of the branch we were in several days prior when we were working on it.

Manual Release to Production

Every time we push code, it builds, tests and releases to staging and then goes into a hold. At that point, we can click a button and it will release to production. We can release to production as often as needed, but so far, it's usually been once or twice a week.
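The gating logic of that flow can be sketched in a few lines of TypeScript. This is only a model of the steps described above, not an actual CI configuration; the Push and Outcome types and the runPipeline function are hypothetical names chosen for the example.

```typescript
// Hypothetical model of a single push moving through the pipeline.
interface Push {
  buildOk: boolean;
  testsOk: boolean;
  // True once someone clicks the release button.
  approvedForProduction: boolean;
}

type Outcome =
  | { status: "failed"; failedAt: "build" | "test" }
  | { status: "on-hold"; deployedTo: "staging" }
  | { status: "released"; deployedTo: "production" };

function runPipeline(push: Push): Outcome {
  // A failed build or test run stops the push before any deploy.
  if (!push.buildOk) return { status: "failed", failedAt: "build" };
  if (!push.testsOk) return { status: "failed", failedAt: "test" };

  // Build and tests passed, so the staging deploy happens automatically.
  // Production waits for a manual approval.
  if (!push.approvedForProduction) {
    return { status: "on-hold", deployedTo: "staging" };
  }
  return { status: "released", deployedTo: "production" };
}
```

The point of the hold is that staging always reflects the latest green commit, while production only moves when a person decides it should.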

Are We Even Really Agile Now?

Because of the new flow, we barely live inside of formal sprints other than keeping up with a few of the ceremonies like standups and pop-up sprint planning sessions. We aren't so much working inside a sprint as we are pushing continual value to our product. Our flow is basically Kanban and it honestly works really well with the continuous integration flow.

That said, I think what we're doing is very much in line with the spirit of Agile. After all, the first principle of the Agile Manifesto is "Our highest priority is to satisfy the customer through early and continuous delivery of valuable software."

Is It Working?

For us, the answer is yes. I'm always hesitant to be overly prescriptive about particular solutions or processes, but in our case, we've doubled or tripled our productivity, gotten much more responsive to issues and bugs, and barely ever have a merge conflict that touches more than a couple of files.

That said, GitFlow and other git branching strategies work well for a lot of people. If it works for you, I say keep doing what you're doing. For the rest of us, continuous integration has been a game-changer.

Tyson Cadenhead leads the dev team at Vestaboard

He has a passion for Functional Programming, GraphQL, Serverless architecture and React. When he's not writing code or working with his team, Tyson enjoys drawing, playing guitar, growing vegetables and spending time with his wife, two boys and his dog and cat.
