Continuous Verification with Kayenta and Spinnaker

Verification is the last, but not least, step in any deployment. Even though no code is going to be pushed to production without passing through QA testing, you still need to verify that the updates work as expected in a production environment, with live traffic.

When you manage a continuous delivery process and move as fast as possible, some errors are likely to slip into production releases. The verification step tries to ensure that those mistakes affect as few of your site visitors as possible, and that you can discover them and roll back the deployment quickly when something slips through.

Kayenta is an automated canary analysis tool that is integrated with Spinnaker. Spinnaker manages the canary deployment; Kayenta determines whether or not the canary should be pushed to production. The process can be fully automated, and there’s no need to switch between tools to manage the canary deployment and analyze the canary’s performance. When Kayenta determines that a canary passes, the canary is automatically promoted to full production. If it fails, the canary is automatically destroyed and kept out of production. It can all happen with little to no human intervention.

Here’s how it works.

Setting up the Canary Analysis

A canary analysis is one technique for reducing the risk of letting faulty code slip into production sites. The general process is as follows. You create two new deployments: one runs the new code, and the other is an exact replica of the current production site. Then you divert a small portion of the production traffic, usually around 1%, away from the production site and distribute it equally between the baseline (the exact replica of the production site) and the canary (the new code or configurations). Creating a new baseline, instead of comparing the canary to the production site, ensures that startup effects and long-running functions don’t skew the comparison between the two deployments.
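
The traffic split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Spinnaker or a load balancer actually routes requests: the function names, the 1% constant, and the random per-request routing are all assumptions made for the example.

```python
import random
from collections import Counter

# Assumed share of production traffic diverted to the experiment (about 1%).
CANARY_FRACTION = 0.01


def route_request(rng, canary_fraction=CANARY_FRACTION):
    """Pick a target deployment for one request.

    Half of the diverted traffic goes to the canary and half to the
    baseline; everything else stays on the production deployment.
    """
    roll = rng.random()  # uniform value in [0.0, 1.0)
    if roll < canary_fraction / 2:
        return "canary"
    if roll < canary_fraction:
        return "baseline"
    return "production"


def simulate(n_requests, seed=0):
    """Route n_requests simulated requests and count hits per deployment."""
    rng = random.Random(seed)
    return Counter(route_request(rng) for _ in range(n_requests))


if __name__ == "__main__":
    for target, hits in sorted(simulate(100_000).items()):
        print(f"{target}: {hits}")
```

Because the baseline and the canary receive the same share of traffic and are started at the same time, their metrics can be compared like for like.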

Pass or Fail?

The next step is to determine whether the canary deployment is any good, that is, whether it behaves at least as well as the baseline deployment on key performance metrics.

Canary analysis can be done manually, and that’s how it’s done in many organizations. A team member looks at logs and graphs that compare CPU usage, memory usage, error rates and other metrics. Human error or poor judgment in that analysis can easily lead an organization to promote faulty code. Manual analysis is also very slow, so companies that want to move quickly and optimize their continuous delivery cycle will run into a verification bottleneck.

For companies that prioritize moving fast and not breaking things, automated canary analysis is key. When Kayenta does the canary analysis, a team member still has to configure the metrics to measure, but only once. Kayenta runs statistical tests on every metric you want to consider and produces an aggregate score, which maps to one of three outcomes: success, manual intervention, or fail. Canaries that pass can be promoted automatically and canaries that fail can be rolled back automatically; borderline cases are flagged for human intervention.
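
To make the scoring idea concrete, here is a minimal sketch of how per-metric statistical tests can be rolled up into a single verdict. It is not Kayenta’s implementation: the Mann-Whitney U test with a normal approximation, the assumption that every metric is "lower is better", the function names, and the 95 and 75 thresholds are all choices made for this example.

```python
import math
from statistics import NormalDist, mean


def mann_whitney_p(a, b):
    """Two-sided p-value for a Mann-Whitney U test.

    Uses the normal approximation without a tie correction, which is
    good enough for an illustration but not for production use.
    """
    n1, n2 = len(a), len(b)
    # U counts the pairs in which a's value beats b's; ties count as half.
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    expected = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - expected) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


def metric_passes(baseline, canary, alpha=0.05):
    """Fail a metric only if the canary is significantly worse (higher)."""
    significant = mann_whitney_p(canary, baseline) < alpha
    return not (significant and mean(canary) > mean(baseline))


def judge(metrics, pass_score=95.0, marginal_score=75.0):
    """Score = percentage of passing metrics, mapped to a verdict.

    metrics maps a metric name to a (baseline_samples, canary_samples) pair.
    The two thresholds are hypothetical values chosen for this sketch.
    """
    passed = sum(metric_passes(b, c) for b, c in metrics.values())
    score = 100.0 * passed / len(metrics)
    if score >= pass_score:
        return score, "success"
    if score >= marginal_score:
        return score, "manual intervention"
    return score, "fail"
```

A pipeline could then act on the verdict: promote the canary on "success", destroy it on "fail", and notify a team member on "manual intervention".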

Using Kayenta as part of Spinnaker helps you safely speed up your continuous deployment pipeline and minimize the opportunities for human error. It provides a better, more accurate canary analysis than is possible manually and frees up your team members to work on other projects. In other words, it removes another obstacle to a mature CI/CD process, increases the number of deployments possible per month and reduces both manual steps and the chance of things going wrong.