What I've Learned Working with Continuous Deploy

My Experience

Before we dive in, I'll briefly discuss my engineering background. Most of my experience comes from working on a small, ten-person engineering team. Each engineer on the team has fewer than three years of experience, and many are fresh out of college. Our team works in two-week sprints, and everyone is a full-stack engineer.

Software Development Life Cycle

Before discussing any trade-offs of continuous deploy, I think we need to understand our software development life cycle and the responsibilities of an engineer:

1. prioritize the feature or bug
2. scope the size of the feature or bug
3. plan and design the feature or bug fix
4. implement the feature or fix locally and write unit and integration tests
5. develop a test plan used by other engineers to manually test the changes
6. conduct a code review (or a few) to catch bugs, fix styling, etc.
7. deploy the change to our staging infrastructure
8. run the manual test plan
9. make any changes from testing and repeat steps 4-8, if necessary
10. release the fix to production
11. run the test plan on production to verify the behaviour

I think understanding this life cycle is fundamental to understanding the overhead continuous deploy imposes on individual engineers.

How does continuous deploy play with our development process?

Continuous deploy has the advantage of allowing a team to move quickly and deploy changes when they're ready. For my team, new features and bug fixes can make it to production with little friction. Our continuous integration setup runs tests and automatically deploys changes if the tests pass. This means we don't need to wait to deploy a change, which can be particularly advantageous for a small team trying to move fast.
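To make the "deploy if the tests pass" idea concrete, here's a minimal sketch of that gate in Python. This isn't our actual pipeline: run_pipeline and the test and deploy callables are hypothetical names I made up for illustration, and a real setup would live in your CI tool's configuration rather than in a function like this.

```python
from typing import Callable, Iterable


def run_pipeline(
    test_suites: Iterable[Callable[[], bool]],
    deploy: Callable[[], None],
) -> bool:
    """Run every test suite in order; deploy only if all of them pass.

    Returns True if the deploy happened, False if a suite blocked it.
    (Hypothetical helper, not part of any real CI tool.)
    """
    for suite in test_suites:
        if not suite():
            # A single failing suite stops the pipeline before deploy.
            return False
    deploy()
    return True
```

The point of the sketch is that nothing between "tests pass" and "deploy" involves a human, which is exactly why the steps before and after it matter so much.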

So, as an engineer, all I need to do is follow the software development life cycle steps and my fix can be deployed. Simple, right?

Okay, but what's the catch?

There's actually a lot of overhead in each step of the process. It works well if you don't have buggy code and you understand the scope of the change. But in my experience so far, no code is without bugs. Even with code reviews and unit and integration tests with high coverage, we still encounter bugs. Testing doesn't catch everything. Variations in environment configuration, network infrastructure, or inefficient code can all result in a failed deploy.

I've realized that a well-written, reproducible test plan can go a long way in catching these problems. But it takes a nontrivial amount of time to write a detailed test plan. There are a lot of considerations you need to take into account, like:

How will this impact the rest of the system?
What other pieces of code use this interface?
How will this be used now? How about in the future?
What edge cases did I consider and how can we test them?
How will this work on production?
How will my change scale with more data?

These are just a few example questions, and they become increasingly difficult to answer as the scope of the change increases and you only have an engineer or two to answer them.
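One way to keep a test plan reproducible is to write it down as data rather than prose. Here's a rough sketch of what that could look like. The names ManualTestPlan and PlanStep are hypothetical, and this is just one possible shape, not a tool my team uses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class PlanStep:
    action: str    # what the tester does
    expected: str  # what the tester should observe


@dataclass
class ManualTestPlan:
    change: str
    steps: List[PlanStep] = field(default_factory=list)
    # Environment name -> indexes of the steps that passed there.
    passed: Dict[str, Set[int]] = field(default_factory=dict)

    def record_pass(self, environment: str, step_index: int) -> None:
        """Mark one step as passed in one environment."""
        if not 0 <= step_index < len(self.steps):
            raise IndexError(f"no step {step_index} in this plan")
        self.passed.setdefault(environment, set()).add(step_index)

    def is_verified(self, environment: str) -> bool:
        """True only if the plan has steps and all of them passed here."""
        done = self.passed.get(environment, set())
        return bool(self.steps) and done == set(range(len(self.steps)))
```

Tracking results per environment makes it obvious when a plan passed on staging but nobody has run it on production yet.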

Another area for problems is context. If your reviewer doesn't have the context to properly review the code and execute the test plan, breaking bugs may make it into production. How do you ensure that problem context is shared equally between a few engineers?

Let's say a breaking change does make it to production and someone else is preparing to release their feature. Is it their responsibility to verify your changes work before releasing theirs? I'd say yes, at least to some extent. If they don't, they might deploy a change on top of a bad change.

What I've Learned

As you can see, on a small team there's a lot for one or two engineers to consider when releasing a feature. I think a few conditions need to be met for continuous deploy to work successfully for our small team:

Understand the scope of your feature or bug fix
Write a detailed, reproducible test plan
Run the test plan in development, staging, and production
Make sure any other recently deployed changes don't impact your code
Revert broken changes in source control to prevent stacking bugs
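The last two conditions can be sketched as a pair of checks over a deploy history. This is a simplified illustration under my own assumptions: the Deploy record, its verified flag, and both function names are hypothetical, and in practice this information would come from your source control and deploy tooling.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Deploy:
    revision: str   # e.g. a commit hash
    verified: bool  # True once the test plan passed on production


def safe_to_stack(history: List[Deploy]) -> bool:
    """True if the most recent deploy (if any) has been verified."""
    return not history or history[-1].verified


def last_known_good(history: List[Deploy]) -> Optional[str]:
    """Return the newest verified revision, or None if there is none."""
    for deploy in reversed(history):
        if deploy.verified:
            return deploy.revision
    return None
```

If safe_to_stack returns False, the idea is to stop and revert to last_known_good before releasing anything new on top.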

I think this list is a good set of responsibilities for a good engineer. But what does it mean when one or two engineers are juggling these varying levels of context? It means every engineer shares the burden of scoping, managing, testing, and verifying deployment of their changes. And while you could think of this as unnecessary overhead, you can also think of it as a method of holding engineers accountable.

Is continuous deploy faster for small teams?

It's a trade-off. Like any other engineering decision, continuous deploy comes with both advantages and disadvantages. Whether it's faster probably depends on the size of the change, feature, or bug you're fixing. Sometimes it's nice to have specialized roles, and other times it's nice to have a jack-of-all-trades.

What I can say is that continuous deploy, when executed correctly, helps engineers understand the full scope of their tasks and gain a more holistic understanding of their responsibilities.

Posted on March 10, 2019