Case Study

Nemlig.com

Boosting performance and security with Cilium

Challenge

Nemlig.com is Denmark’s leading online grocery shopping company. During the initial phase of building their Kubernetes platform, the platform team at Nemlig knew that the CNI embedded in their Kubernetes distribution, k3s, was not ready for their production needs. They were also interested in adopting eBPF, recognizing its potential to meet future networking, observability, and security needs.

Solution

Nemlig selected Cilium as their CNI solution due to its performance, supportive community, comprehensive network policies, eBPF capabilities, and seamless integration with their Kubernetes distribution, k3s. Cilium also replaced the default kube-proxy and iptables for networking in their cluster, and it ships with Envoy, which can run as a DaemonSet.

Impact

Implementing Cilium in their Kubernetes platform has enabled Nemlig to enhance agility, replace kube-proxy and iptables in their clusters, and reduce the operational burden of managing their network. Adopting Cilium has also given them new capabilities, most notably the token-based authentication service they built on top of it.

Published: June 24, 2024

By the numbers

50,000

Grocery boxes delivered daily

400,000

Customers

500,000

Order lines daily

Building a Kubernetes Platform with Cilium

Nemlig has a platform engineering team of four individuals responsible for building and operating their Kubernetes platform. Their platform uses the k3s Kubernetes distribution and operates across both VMware and the cloud.

The Nemlig platform engineering team knew from the outset that using kube-proxy and Flannel for their Kubernetes platform’s networking would not cut it. Instead, they wanted a solution that would replace kube-proxy and reduce the operational burden of managing the network.

“Flannel is a fine CNI, but it’s not ready for the enterprise in any way, shape, or form, just for your home set-up or Raspberry Pi. There was no reason to think that it would be viable to use Flannel in production.

One of the primary gains we wanted out of our CNI was to offload complexity and make it easier to run Kubernetes clusters on the networking side because there is so much going on. The more help you can get, the more things are properly handled for you, and the more tooling provided, the better.”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com

After reviewing their options, they chose Cilium for its high performance and security, supportive community, comprehensive network policies, and eBPF capabilities.

“I’ve been following the Cilium project for 5+ years and I thought the package Cilium provided was really capable at the time. I didn’t have the chance in my previous company to use Cilium in production, but I used it for some hosting I built for myself and found it really nice to work with. When I joined Nemlig three years ago, I started building the Kubernetes platform I coined Kubernemlig.

It is the obvious choice to use Cilium as the CNI because of its performance, the capabilities of eBPF, and Cilium network policies. Also, the general traction and the community around it have a nice vibe. The project also has great velocity on getting things out of the door with new features, fixing bugs, and so on.”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com

After making their choice, they disabled Flannel, installed Cilium with Helm, and configured it to use the features Cilium provides out of the box.

“We disabled Flannel, installed Cilium with Helm, and configured it to meet our needs. First, we added kube-proxy replacement and set up native routing. We also use the Maglev load balancing algorithm, which is good. Envoy Proxy via Cilium is used for our token-based authentication as a service. We are leveraging Cilium network policies because they have a much richer feature set than the Kubernetes network policy and better fit what we want to do around firewalling and controlling network traffic. Finally, at times we dive into the network flows with Hubble. We use so many features of Cilium!”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com
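
The setup Lars describes can be pictured as a handful of Helm values. What follows is a minimal sketch under stated assumptions, not Nemlig's actual configuration: the value names (`kubeProxyReplacement`, `routingMode`, `ipv4NativeRoutingCIDR`, `loadBalancer.algorithm`, `hubble.relay.enabled`) follow recent Cilium charts and may differ in older versions, and the CIDR and API server address are placeholders. The helper functions `flatten` and `helm_set_args` are hypothetical names introduced only for this illustration.

```python
# Sketch: render a nested dict of Cilium Helm values as `helm install --set` arguments.
# The value names are assumptions based on recent Cilium charts; verify them against
# the chart reference for the version you install.

def flatten(values, prefix=""):
    """Flatten nested dicts into (dotted.key, value) pairs, sorted by key."""
    pairs = []
    for key, value in sorted(values.items()):
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            pairs.extend(flatten(value, path))
        else:
            pairs.append((path, value))
    return pairs


def helm_set_args(values):
    """Turn a values dict into a flat list of `--set key=value` arguments."""
    args = []
    for path, value in flatten(values):
        if isinstance(value, bool):
            value = "true" if value else "false"  # Helm expects lowercase booleans
        args += ["--set", f"{path}={value}"]
    return args


CILIUM_VALUES = {
    "kubeProxyReplacement": True,              # Cilium handles service load balancing
    "routingMode": "native",                   # no overlay/tunnel
    "ipv4NativeRoutingCIDR": "10.42.0.0/16",   # placeholder: k3s default pod CIDR
    "loadBalancer": {"algorithm": "maglev"},   # consistent-hashing backend selection
    "hubble": {"relay": {"enabled": True}},    # flow visibility
    "k8sServiceHost": "203.0.113.10",          # placeholder API server address
    "k8sServicePort": 6443,
}

command = [
    "helm", "install", "cilium", "cilium/cilium", "--namespace", "kube-system",
] + helm_set_args(CILIUM_VALUES)

print(" ".join(command))
```

On k3s, disabling the bundled networking is usually done on the server side with flags such as `--flannel-backend=none` and `--disable-kube-proxy` before Cilium is installed; whether Nemlig used exactly these flags is not stated in the case study.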

Building Token-Based Authentication as a Service with Cilium and Envoy

With Cilium as their CNI and their Kubernetes platform expanding, Nemlig faced a new requirement: enabling developers to deploy applications that can be accessed from the Internet. To meet this business demand, they needed a layer of security to ensure that only users with a valid token issued by their backend servers could access the exposed applications on their Kubernetes platform. To build this security layer, the Nemlig platform engineering team used Cilium’s built-in Envoy to provide JWT verification for clients attempting to connect to these exposed applications.

“The need for this implementation came from the fact that we were about to allow developers to deploy applications on our platform that were exposed externally outside our internal network. For this reason, we needed something to establish a base layer of security so that only users with a valid token (issued from our backend servers) would be allowed to talk to the exposed applications. 

For this new requirement, we knew Envoy could provide JWT verification, and we knew we already had Envoy through Cilium on our Kubernetes platform. It seemed obvious to implement the JWT verification via Cilium’s Envoy configuration to provide it as a service on the platform and alleviate the load of implementing this logic in the services themselves.

The benefit of this implementation is that we can establish a perimeter around our platform where we can enforce this control without instrumenting any of the services themselves, allowing us to get JWT validation before traffic even hits the services.”

Anton Due, Senior DevOps Engineer, Nemlig.com

Cilium’s Envoy allowed them to seamlessly add JWT verification for externally exposed applications without having to add an additional component to their stack or modify the services themselves.
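
To make concrete what "JWT verification at the perimeter" involves, here is a small, self-contained sketch of the checks a proxy performs before forwarding a request: verify the signature, the issuer, and the expiry. It is an illustration only. Envoy does this in its own JWT authentication filter, typically with asymmetric keys fetched from a JWKS endpoint; this sketch assumes HS256 with a shared secret purely so it can run on the Python standard library. The function names, secret, and issuer URL are all hypothetical, and nothing here reflects Nemlig's actual token format.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw):
    """Base64url-encode bytes without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment):
    """Decode an unpadded base64url string."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims, secret):
    """Issue a token (stands in for the backend that hands out tokens)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    digest = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(digest)}"


def verify_hs256(token, secret, issuer, now=None):
    """Return the claims if the token is valid, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except (ValueError, TypeError):
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # never trust an unexpected algorithm
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # signature mismatch
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired
    if claims.get("iss") != issuer:
        return None  # issued by someone else
    return claims


SECRET = b"demo-secret"                    # hypothetical
ISSUER = "https://auth.example.invalid"    # hypothetical
token = sign_hs256({"iss": ISSUER, "sub": "user-1", "exp": 2000}, SECRET)

for label, candidate in [("valid", token), ("garbage", "not-a-token")]:
    accepted = verify_hs256(candidate, SECRET, ISSUER, now=1000) is not None
    print(label, "forward to service" if accepted else "reject at perimeter")
```

Because the check runs in the proxy, a request that fails any of these tests never reaches the application, which is the property Anton highlights above.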

How Cilium and eBPF Changed the Cloud Native Ecosystem 

One of the key factors behind Nemlig’s decision to adopt Cilium was its eBPF foundation. The team values the performance enhancements, versatility, and transformative impact that eBPF, as the underlying technology of Cilium, has had on the cloud native ecosystem.

“When looking at which technology to choose, what I found interesting was the new paradigm that Cilium unlocked by building on top of eBPF. I found it very promising, and I’ve been proven correct. I think that was a very good decision to build Cilium on top of eBPF and change the cloud native ecosystem. 

There are so many security and observability solutions and projects today that use eBPF, like Tetragon. With the speed at which security incidents can happen in today’s world, having your system enforce rules while something bad is happening is a really strong capability. eBPF is now traversing into other areas, like security, and I’m excited about building more with it in the future.”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com

“Looking back over the past three years, Cilium exceeded my expectations of alleviating some of the pains of managing the network side of running Kubernetes clusters. And as time has gone on, there are many new features we benefit from. For example, it is now possible to separate the Cilium agent DaemonSet workload from the one for Envoy, giving us some benefit with regard to robustness and failure domains. Then with CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy, we are able to enable the token-based authentication as a service that we built.”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com

Future Plans and Further Improvements with Cilium

Cilium has been a tremendous success for Nemlig. With Cilium, they have been able to improve their business agility and security, replace kube-proxy, and enable the development of new capabilities for their platform.

“I would say that the Kubernetes project, what we did and what we’ve built over the last three years, has enabled Nemlig to be much more agile. With Cilium and Envoy, we are also able to offer good security posture out of the box with the platform instead of having to ask the developers to implement it themselves later on.”

Anton Due, Senior DevOps Engineer, Nemlig.com

With Cilium as a key part of their platform, the Nemlig team already has plans to build on it further. They intend to use Cilium network policies for Layer 7 visibility, improve their token-based authentication as a service, use Tetragon for runtime policy enforcement, and test out Cluster Mesh.

“We did a PoC using Cilium network policies for layer 7 visibility and enforcing micro-segmentation of east-west and north-south traffic. I want us to integrate that for all our workloads and find a way to automate network policy creation so that when a new service is onboarded to Kubernemlig, we generate network traffic in a test cluster and then automatically write the needed network policies so the correct policies can be in place for production as early as possible. That way you don’t have the classic ‘I forgot to open that port’ issue.

We also want to polish the way we apply the token-based authentication as a service we built so that it’s easier for us to use, and leverage Tetragon to enforce different policies. Finally, we have been looking at Cluster Mesh for the benefits it brings when migrating services from cluster to cluster in the cloud.”

Lars Bengtsson, Lead DevOps Engineer, Nemlig.com
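
The policy automation Lars describes, observing traffic in a test cluster and then writing the matching policies, could look roughly like the sketch below. It is a hypothetical illustration, not Nemlig's tooling: the flow records are a simplified, made-up shape rather than Hubble's actual flow schema, `policy_from_flows` is an invented helper, and the labels and ports are placeholders. The generated manifest uses the standard `CiliumNetworkPolicy` fields (`endpointSelector`, `ingress`, `fromEndpoints`, `toPorts`).

```python
import json
from collections import defaultdict


def policy_from_flows(service_labels, flows, name):
    """Build a CiliumNetworkPolicy manifest (as a dict) allowing the observed ingress.

    `flows` is a list of simplified, hypothetical records with `source` and
    `destination` label dicts plus `port` and `protocol`.
    """
    ports_by_source = defaultdict(set)
    for flow in flows:
        if flow["destination"] != service_labels:
            continue  # traffic for some other service
        source = tuple(sorted(flow["source"].items()))
        ports_by_source[source].add((flow["port"], flow["protocol"]))

    ingress = []
    for source, ports in sorted(ports_by_source.items()):
        ingress.append({
            "fromEndpoints": [{"matchLabels": dict(source)}],
            "toPorts": [{
                "ports": [
                    {"port": str(port), "protocol": protocol}
                    for port, protocol in sorted(ports)
                ],
            }],
        })

    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "endpointSelector": {"matchLabels": dict(service_labels)},
            "ingress": ingress,
        },
    }


# Traffic "observed" in a test cluster (placeholder data).
OBSERVED_FLOWS = [
    {"source": {"app": "web"}, "destination": {"app": "cart"}, "port": 8080, "protocol": "TCP"},
    {"source": {"app": "web"}, "destination": {"app": "cart"}, "port": 8080, "protocol": "TCP"},
    {"source": {"app": "checkout"}, "destination": {"app": "cart"}, "port": 9090, "protocol": "TCP"},
    {"source": {"app": "web"}, "destination": {"app": "search"}, "port": 8080, "protocol": "TCP"},
]

policy = policy_from_flows({"app": "cart"}, OBSERVED_FLOWS, "cart-ingress")
print(json.dumps(policy, indent=2))
```

A generator like this only allows what was actually exercised in the test cluster, so the quality of the resulting policy depends on how representative the generated test traffic is.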