The Zero Downtime Release

One of our financial providers (one of the top 3) released a new version of their online web and mobile banking a while back.  They were up and down for much of 24 hours.  Yes, they notified us a day in advance that this was going to happen, but I find it really hard to believe that a major business takes that long to cut over to a new release in today’s online world.  24 hours.  Wow.

It’s actually possible to install and cut over to a new release while production is up.  Yes, even with non-trivial sites.  I’m surprised at how many folks don’t believe this.

But don’t most sites go offline for new product releases?  Unfortunately, many sites do.  Perhaps the release process is risky, and everything needs to be verified.  Or perhaps the site is still in rapid feature-build mode, and can’t afford to take time to automate internal processes.  If the site has to go down for a release, then the tendency is to put more features into each release.  This makes the release even more complex and costly.  These releases often happen on weekends, with all hands present to quickly resolve issues.  Great way to spend a weekend.

Automating development processes, however, can make the release process very fast, routine and inexpensive.  Perhaps even a push-button release.  What kind of competitive advantage would that give your business?  You would not have to wait for a specific release window to push features required for a new customer.  What if, instead of a release every quarter, you could do a release every week?  Or every day?  Or even several times a day?  Your product team might be free to test new features in production, instead of offline.  If the features are popular, leave them in.  If not, take them out.
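One common way to try features in production and then leave them in or take them out is a feature flag with a percentage rollout.  Here is a minimal sketch in Python.  The FeatureFlags class, the flag name and the in-memory storage are all made up for illustration; a real site would keep its flags in configuration or a database.

```python
import hashlib


class FeatureFlags:
    """Hypothetical in-memory flag store, for illustration only."""

    def __init__(self):
        # flag name -> percentage of users (0-100) who see the feature
        self._rollout = {}

    def set_rollout(self, flag, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag, user_id):
        percent = self._rollout.get(flag, 0)  # unknown flags are off
        # Hash flag + user so each user gets a stable answer across requests.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100  # stable bucket in 0..99
        return bucket < percent


flags = FeatureFlags()
flags.set_rollout("new_transfer_screen", 10)   # try it on roughly 10% of users
flags.set_rollout("new_transfer_screen", 100)  # popular: leave it in for everyone
flags.set_rollout("new_transfer_screen", 0)    # not popular: take it out
```

Changing the percentage is a data change, not a code change, so turning a feature on or off does not need its own release.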

It does take work to get to the point where you can install a new release while production is up.  Even more work with complex core systems.  You may need to find and cut out activities that are not really adding much value, but that you may believe are necessary because you have always done them.  It also requires automating more of your development/release value chain: software builds, (complete) regression testing, and the actual cutover to production.
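To make the automated cutover concrete, here is a rough sketch of one well-known approach, often called blue-green deployment: install the new release on an idle copy of production, test it there, and only then switch traffic.  This is not necessarily how any particular site does it.  The Router class, the cut_over function, and the deploy and smoke_test callables are all hypothetical stand-ins for your own load balancer and tooling.

```python
class Router:
    """Hypothetical stand-in for a load balancer that sends all traffic to one environment."""

    def __init__(self, live):
        self.live = live

    def switch_to(self, env):
        self.live = env  # a single pointer swap, so users never see a gap


def cut_over(router, idle_env, new_version, deploy, smoke_test):
    """Install new_version on the idle environment; switch traffic only if checks pass.

    deploy and smoke_test are caller-supplied callables (assumptions for this sketch).
    Returns True if traffic was switched, False if the release was abandoned.
    """
    deploy(idle_env, new_version)   # production keeps serving from router.live
    if not smoke_test(idle_env):    # automated regression / health checks
        return False                # the live environment was never touched
    router.switch_to(idle_env)      # the actual cutover
    return True


# Usage with toy stand-ins for real deployment tooling.
versions = {"blue": "1.0", "green": "1.0"}
router = Router(live="blue")


def deploy(env, version):
    versions[env] = version


def smoke_test(env):
    return versions[env] == "1.1"  # placeholder for a real test suite


switched = cut_over(router, "green", "1.1", deploy, smoke_test)
```

If the checks fail, the old environment is still live and untouched, so a bad release costs you nothing in uptime.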

It may take a while to get to the point where you can cut over to a new release while production is live.  There is a learning curve.  You will need to get your teams on board.  It may be best to set some short-term objectives that knock down the hurdles, so you can make progress towards this goal.

Hold your team accountable for uptime.  Make sure you include new release cutover time in your uptime reporting.  Your customers do.
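A quick bit of arithmetic shows how much this matters.  The numbers below are made up for illustration: a 30-day month with 1 hour of unplanned outage, and a worst-case release window in which the site is unavailable for a full 24 hours.  The uptime_percent function is a hypothetical helper, not part of any reporting tool.

```python
def uptime_percent(period_hours, unplanned_down_hours, release_down_hours):
    """Availability over a period, counting release cutover time as downtime."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    down = unplanned_down_hours + release_down_hours
    return 100.0 * (period_hours - down) / period_hours


month = 30 * 24  # 720 hours in a 30-day month

# Hypothetical month: 1 hour of unplanned outage, plus a 24-hour release window.
print(round(uptime_percent(month, 1, 0), 2))   # release time ignored: 99.86
print(round(uptime_percent(month, 1, 24), 2))  # release time counted: 96.53
```

Leaving the release window out of the report makes the month look like 99.86% uptime.  Your customers experienced 96.53%.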