Guest post by Tobi Knaup, CEO of D2iQ
The open-source ecosystem has evolved quickly from niche projects with limited corporate backing into the default way to develop software. Both small and large organisations are using open-source software to speed up innovation and product development.
A state of enterprise open source survey found that 95% of enterprises take open source seriously, 75% report that open-source software is extremely important to their IT strategy, and 77% of respondents plan to ramp up open-source use in the next year.
Meanwhile, 71% of the UK government’s tech workers report they use more open source than five years ago. The U.S. Department of Defense even issued a memorandum stating its preference for open-source software over proprietary software.
As Kubernetes has matured, the market has consolidated around it, and there are now many Kubernetes offerings with different architectures, features, and interfaces. Some are less open and flexible than others, and they come with different restrictions, dependencies, and licensing terms.
Deviating from an open Kubernetes standard can cause issues. As The Journal of Cloud Computing notes, “without an appropriate standardized format, ensuring interoperability, portability, compliance, trust, and security is difficult.”
What Is Pure Upstream Open-Source Kubernetes?
Upstream Kubernetes is the open-source version of Kubernetes hosted and maintained by the Cloud Native Computing Foundation, where code and documentation are developed and contributions are made. It consists of core ‘plain vanilla’ Kubernetes for orchestrating containers without add-on applications – all publicly accessible for inspection, modification, and redistribution.
Free and open-source software projects have good intentions – making tech that helps the whole community. Anyone can access the code and collaborate to fix bugs, add patches, and optimise performance quickly. But project growth can lead to diverging goals and perspectives – or ‘forks’ in the code.
What Is a Fork of Kubernetes?
A fork of Kubernetes is a version of the open-source project developed separately from the main workstream. Forking happens when part of the development community or a third-party vendor makes a copy of the upstream project with modifications to start a completely independent line of development.
You might fork Kubernetes because of a difference in opinion (technical or personal), because development of the upstream project has stalled, or because you want different functionality. That can happen in open-source or proprietary environments.
When an open source Kubernetes fork improves the original code, other forks can utilise it, combining the code with their fork to better meet the needs of developers and end users.
But for Kubernetes forks in proprietary environments, vendors or cloud companies will change the source code to meet their own needs, repackage software, and offer it to customers as a proprietary distribution. They may alter the add-ons needed to run Kubernetes in production.
This not only complicates management of the solution, but also risks vendor lock-in.
The Problem with Forking Kubernetes
It’s hard to deploy and manage Kubernetes at scale. Many organisations use proprietary distributions to obtain enterprise support for their container platform, but this has led to significantly forked versions of Kubernetes.
Some challenges include:
Complications with Patches, Bug Fixes, Upgrades, and New Features
Every new update makes it harder to make changes work with a custom distribution. It’s a slow and costly process. Vendors who fork Kubernetes often have an older version of the cluster API because it takes them six months or more to get improvements and bug fixes from the upstream.
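One rough way to gauge this lag is to compare the Kubernetes version a distribution reports with the latest upstream release. The sketch below is illustrative only: the function names are hypothetical, the version strings are made up, and it assumes the distribution keeps the upstream `vMAJOR.MINOR.PATCH` prefix and appends its own suffix after it.

```python
import re

# Matches the leading "vMAJOR.MINOR.PATCH" part and ignores any vendor suffix.
# Assumption: the distribution keeps the upstream numbering as a prefix.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_version(version: str) -> tuple[int, int, int]:
    """Return (major, minor, patch) from a Kubernetes-style version string."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def minor_releases_behind(distribution_version: str, upstream_version: str) -> int:
    """Count how many minor releases a distribution trails the upstream by."""
    dist = parse_version(distribution_version)
    upstream = parse_version(upstream_version)
    if dist[0] != upstream[0]:
        raise ValueError("major versions differ; minor lag is not meaningful")
    # A distribution that is level with or ahead of the upstream has no lag.
    return max(0, upstream[1] - dist[1])


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only.
    print(minor_releases_behind("v1.25.4+vendor.1", "v1.28.2"))
```

A number like this says nothing about which fixes are missing, but tracking it over time shows whether a distribution is keeping pace with the upstream or falling further behind.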
Vendor Lock-in
Forks of Kubernetes create lock-in, i.e. a customer cannot easily replace or migrate the solution. Lock-in removes the flexibility to move your applications and data seamlessly between public, private, and on-premises services. It also leaves you with fewer options as your company grows. Even if the source code is open-source, vendors can wrap Kubernetes in features that prevent migration to other platforms without extra cost and excess resource allocation.
Lack of Functionality
A forked version of Kubernetes can break application functionality. Some custom distributions rely on proprietary APIs and CLIs to get full functionality, which creates lock-in. If the custom distribution only runs on the vendor’s custom Linux kernel, it also creates lock-in. Eventually, the fork gets harder to maintain, and the latest upstream patches cannot be merged into it without major work on patch and feature compatibility. If a product is discontinued, you may be out of luck.
Less Secure
A fork of Kubernetes could potentially run less secure code. If a vulnerability is found in open-source code and fixed by the community in the upstream, a forked version of the code may not benefit from the fix, since it differs from the upstream.
Lack of Interoperability
Vendors can modify code for their custom distributions or the supporting applications you need to make Kubernetes run in production. While an altered version of Kubernetes will work with a particular vendor’s application stack and management tools, these proprietary modifications lock you into customized component builds, stopping you from integrating with other upstream open-source projects. If their stack comprises multiple products, it’s very hard to achieve interoperability, which can cause lots of downstream issues as you scale.
Technical Debt
It’s hard to merge back a fork that has diverged drastically from the upstream. We call this technical debt – the cost of maintaining source code caused by straying from the main branch of joint development. More changes to forked code mean more money and time to rebase the fork onto the upstream project.
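How far a fork has drifted can be estimated by counting the commits on each side of the split. The sketch below is a minimal illustration, not a vendor's actual process: it assumes a local clone of the fork with a Git remote named `upstream` that tracks the original repository and its `master` branch, and the function names are hypothetical. It relies on `git rev-list --left-right --count`, which prints the two commit counts separated by whitespace.

```python
import subprocess


def parse_divergence(rev_list_output: str) -> dict[str, int]:
    """Parse the two counts printed by `git rev-list --left-right --count A...B`.

    The left count is commits only in A (here: the upstream), the right
    count is commits only in B (here: the fork).
    """
    behind, ahead = (int(field) for field in rev_list_output.split())
    return {"behind": behind, "ahead": ahead}


def fork_divergence(
    repo_path: str,
    upstream_ref: str = "upstream/master",  # assumed remote and branch name
    fork_ref: str = "HEAD",
) -> dict[str, int]:
    """Return how many commits the fork is behind and ahead of the upstream."""
    result = subprocess.run(
        [
            "git", "-C", repo_path,
            "rev-list", "--left-right", "--count",
            f"{upstream_ref}...{fork_ref}",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_divergence(result.stdout)
```

The `ahead` count is a rough proxy for the fork-only changes that must be carried through every rebase, and the `behind` count for the upstream work not yet merged. Commit counts do not measure effort directly, but when both numbers keep growing, the debt is growing with them.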
Pure Upstream Kubernetes Is the Way Forward
Pure upstream open-source Kubernetes is the focal point where decisions are made and contributions happen, and it comes with a built-in community that continuously improves the source code.
A pure, upstream solution allows sharing ideas with the larger community and getting new features and releases accepted upstream. Every project and product based on the upstream can benefit from previous work when it picks up a future release or merges recent (or all) upstream patches.
While anyone can copy, install, or distribute Kubernetes from the upstream repository, larger companies and organisations need certified products, tested and hardened for enterprise use. As such, organisations rely on vendors to turn upstream Kubernetes into downstream products that meet their business needs.