Case Study

PostFinance

Enhancing performance, scalability and observability with Cilium

Challenge

PostFinance, a Swiss bank, serves over 5 million personal and business customers, offering top-tier solutions and smart innovations for managing finances.

The platform team at PostFinance initially developed their Kubernetes clusters using kube-proxy and iptables for cluster networking. However, as the number of clusters on their platform increased, they encountered performance, scalability, and observability challenges. These issues slowed down their platform and complicated debugging due to limited visibility. In response, they sought an open source networking solution that was widely adopted by the Kubernetes community to address these problems.

Solution

PostFinance addressed their performance, scalability, and observability challenges by replacing kube-proxy-based networking with Cilium. They chose Cilium because it is open source, built on eBPF, stable, and offers better performance and observability than other solutions.

Impact

By adopting Cilium, PostFinance successfully eliminated kube-proxy from their platform, resulting in enhanced performance. Additionally, the integration of Hubble has provided them with deeper network observability, enabling them to swiftly identify and rectify issues.

Published: June 12, 2024

By the numbers

4 million customer transactions/day

562 Kubernetes nodes

2.5 thousand namespaces

PostFinance operates an on-premise, VM-based, vanilla Kubernetes platform. This setup includes 25 shared clusters and accommodates 750 users across three environments: development, testing, and production.

They initially built their Kubernetes platform using kube-proxy and iptables for networking but soon ran into issues with performance, scalability, and observability.

“Using kube-proxy, and with our clusters growing, it was becoming a challenge to simply start a pod. It took several seconds, and up to a minute, for a pod to gain connectivity or for services to map to a pod, massively impacting the scalability of our platform.”

Clément Nussbaumer, Systems Engineer, PostFinance

Recognizing these challenges, PostFinance began searching for a new open source solution that was widely adopted by the Kubernetes community. After a brief evaluation, they decided to adopt Cilium because it is stable, built on eBPF, open source, and has become the industry standard networking solution across the cloud native community. Cilium replaced kube-proxy and now provides PostFinance with better performance and deeper observability in their network, and it integrates well into their wider Kubernetes platform.

“Being built on eBPF, performance-wise Cilium is great. With Cilium, you also have access to quite some in-depth debugging capabilities if you want, or if you are curious about how it’s implemented. And having worked with kube-proxy and iptables, I prefer that we don’t use them anymore.”

Clément Nussbaumer, Systems Engineer, PostFinance

“What I like the most about Cilium at the moment is all of the potential it has. It integrates well into the whole cloud native landscape and adds features like ingress and Gateway API support to round out our Kubernetes platform.”

Luana Cusseddu, Systems Engineer, PostFinance

Unlocking Better Observability with Hubble

Migrating to Cilium to address their kube-proxy performance and scalability issues came with additional observability benefits. PostFinance now relies on Cilium Hubble for network observability, which is a significant upgrade from the traditional Linux network observability tools they initially used.

“In terms of observability in Kubernetes, it was quite tricky to know exactly where the packets were flowing and what could be blocking them. Basic Linux networking tools, like tcpdump, lack the context of cluster topology and it quickly becomes complicated with dynamic pod IPs. Observing these network flows is quite simple to do now with Hubble including analyzing the flows, filtering, etc.

“We export the Hubble metrics, flows, and particularly the connection drops. With the source label and namespace label, we can quickly help when one of our customers or teams is having issues or a bad configuration in the namespace.”

Clément Nussbaumer, Systems Engineer, PostFinance
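
As a rough illustration of the triage described in the quote above, the following Python sketch counts dropped flows per source namespace. It assumes the flows have been exported as JSON lines in which each record carries a "flow" object with "verdict" and "source.namespace" fields, in the style of Hubble's JSON output. The sample records, the namespace names, and the function name count_drops_by_namespace are hypothetical and do not reflect PostFinance's actual tooling.

```python
import json
from collections import Counter


def count_drops_by_namespace(lines):
    """Count dropped flows per source namespace from JSON-lines flow records."""
    drops = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        # Assumption: each record wraps the flow under a "flow" key;
        # fall back to the record itself if it does not.
        flow = record.get("flow", record)
        if flow.get("verdict") != "DROPPED":
            continue
        # Traffic from outside the cluster may carry no namespace.
        namespace = flow.get("source", {}).get("namespace") or "(no namespace)"
        drops[namespace] += 1
    return drops


# Hypothetical sample records, for illustration only.
SAMPLE = [
    '{"flow": {"verdict": "DROPPED", "source": {"namespace": "team-a"}}}',
    '{"flow": {"verdict": "FORWARDED", "source": {"namespace": "team-b"}}}',
    '{"flow": {"verdict": "DROPPED", "source": {"namespace": "team-a"}}}',
    '{"flow": {"verdict": "DROPPED", "source": {}}}',
]

if __name__ == "__main__":
    for namespace, count in count_drops_by_namespace(SAMPLE).most_common():
        print(f"{namespace}: {count}")
```

Grouping by source namespace mirrors the idea in the quote: a spike in drops for one namespace points the platform team toward the tenant whose configuration most likely needs attention.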

Cilium Helps Businesses Scale 

With Cilium already serving as the key networking layer in their platform, the PostFinance team is currently testing Cilium’s WireGuard node-to-node encryption and evaluating Gateway API as the next steps in their Cilium journey.

“We are currently migrating to node-to-node encryption with WireGuard and are just waiting to deploy that to production. We’re also looking into using Gateway API.”

Luana Cusseddu, Systems Engineer, PostFinance

Migrating to Cilium has been a major success for the PostFinance team. They have been able to improve the performance of their clusters, resolve the scalability issues they experienced with kube-proxy, and improve observability and debugging for their teams and customers.

“With Cilium, our pods start up much faster, scale faster, and more. We’ve rarely had issues with Cilium, or had it be the cause of an incident which is a good thing because if you don’t notice something then it is not in the way and working as intended.”

Clément Nussbaumer, Systems Engineer, PostFinance