Community post by Stacey Potter
At GitOps Days 2022, Jason Morgan, Technical Evangelist at Buoyant and co-chair of the CNCF Business Value Subcommittee, demonstrated how to make Flux, Flagger, and Linkerd work together. He also showed an example of using an ingress to do A/B testing based on headers or cookies. Jason talked through the theory and whiteboarded for those who prefer visualizing these scenarios. The presentations include everything you need to know, to do this on your own.
Before we get too deep in the technical weeds of the demos, let’s give you a brief overview of the tools and what they do.
What is Linkerd?
Linkerd is an ultra-light, ultra-simple, and ultra-powerful service mesh for Kubernetes. It makes running services easier and safer by giving you runtime debugging, observability, reliability, and security—all without requiring any changes to your code. Linkerd is fully open source and the only graduated service mesh project in the Cloud Native Computing Foundation (CNCF).
How does Linkerd work?
Linkerd works by installing a set of lightweight, transparent micro-proxies next to each service instance. These proxies automatically handle all traffic between services. Because they’re transparent, these proxies act as highly instrumented out-of-process network stacks, sending telemetry to, and receiving control signals from, the control plane. This design allows Linkerd to measure and manipulate traffic to and from your service without excessive latency.
What is Flagger?
Flagger is a progressive delivery tool that automates the release process for applications running on Kubernetes. It reduces the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics and running conformance tests. Flagger is a CNCF project and part of the Flux family of GitOps tools.
How does Flagger work?
Flagger implements several deployment strategies (Canary releases, A/B testing, Blue/Green mirroring) using a service mesh or an ingress controller for traffic routing. For release analysis, Flagger can query Prometheus, InfluxDB, Datadog, New Relic, CloudWatch, Stackdriver, or Graphite. And for alerting, it uses Slack, MS Teams, Discord, and Rocket. Flagger can be configured with Kubernetes custom resources and is compatible with any CI/CD solutions made for Kubernetes. Since Flagger is declarative and reacts to Kubernetes events, it can be used in GitOps pipelines with tools like Flux, JenkinsX, Carvel, Argo, etc. To learn more about how it works, read the docs.
Stop Worrying and use A/B Testing
Now that you’ve got a basic understanding of Linkerd and Flagger, let’s get into Jason’s demos to learn how to keep calm and trust A/B testing. Because Flagger allows for different rollouts with either your service mesh or ingress, Jason shows two demos in his talks, one for header-based routing with Nginx and the other one for weight-based routing with Linkerd.
Header-Based Routing with Nginx:
Jason starts by showing his current setup and a whiteboard image of his Kubernetes cluster, which shows a generator and two primary pods running the podinfo application (created by Stefan Prodan for testing and workshops). He then changes the application, saves, git commits, and pushes it, then forces Flux to reconcile this change (this is for the sake of the demo, Flux does this automagically without a forced reconciliation). Flagger is watching this deployment and sees an incoming change and spins up two new pods with the latest versions of the app (which start receiving test traffic). Because this demo shows header-based routing, the standard user won’t get the traffic routing and see the new application. Jason changes his header and shows the application automatically updating with the new change (in this case, it’s a simple logo change). Jason returns his headers to a standard user and waits for the last step, which is the canary validation.
The traffic shift is illustrated in the Flagger Canary Stages image below:
While Nginx is setting where the traffic is going and handling the routing, Linkerd is providing the metrics that power Flagger’s decisions on whether to move forward with the canary rollout. You can see that the validation was successful because Flux and Flagger automatically begin terminating the pods with the old app versions. Jason also notes that when you’re integrating Nginx with Linkerd, there’s a cool feature called set service-upstream. Nginx, by default, is going to try and route to individual pods in a service. With the set service upstream value, you can tell Nginx to default routing to the Kubernetes service instead of a particular ingress. This allows Nginx to play seamlessly well with Linkerd in your environment. You won’t have to do any configuration to Nginx or Linkerd to tell them to work together, you just add the proxy, and everything behaves as normal.
Weight-based Routing with Linkerd:
In the second demo, Jason starts by modifying the canary deployment to comment out header-based routing and making it weight-based instead (with maxWeight and stepWeight settings). He then pushes the canary but still needs to change the (logo in the) app; he Git pushes the change, and when Flux reconciles that change, a percentage of traffic is routed to the new version of the app. Now we see Flagger spinning up the new deployment with the new pod versions. Once the success criteria have been met, it takes the old deployment (which still has however many pods of the old version), deploys the new version of the app, turns off the old versions, shifts traffic back, and deletes the canary version.
According to Jason what you get with both examples “is a lot of power with very little work.” It’s clear how much the Linkerd and Flagger communities have researched the needs of the market and deployment patterns in the wild. Both make it easy to implement best practices without getting lost and to integrate with all the other Cloud Native tooling you and your organization need.
Here’s the video in its entirety if you’d like to watch it from start to finish:
Next Steps
Check out the Linkerd Getting Started docs and if you have questions, visit the Linkerd Slack. If you’d like to follow along or try these demos out for yourself, check out Jason’s repository, or visit the Flagger Docs for Linkerd Canary Deployments and other tutorials. If you have questions or need help, reach out on CNCF Slack in the #flux and #flagger channels.
Did you miss the GitOps Days conference? No worries, you can watch all sessions on-demand at the GitOps Days 2022 Playlist.