Guest post by Joram Wilander, Director of Engineering at Mattermost, Inc.

Introduction

Most products that run as Software-as-a-Service (SaaS) are built to be multi-tenant, meaning that a single instance or deployment is meant to be used by multiple organizations. There’s a good reason for this: it’s generally easier to scale and operate multi-tenant applications.  

But in this new age of containers, orchestration, infrastructure-as-code, and Kubernetes, where it’s cheaper, faster, and simpler to deploy a new instance of an application, that may no longer be the case. Here at Mattermost we thought there was enough merit in that idea that we built our Cloud architecture on top of our single-tenant application.

Why is Mattermost Single-Tenant in the First Place?

Before I get into the details of how that works, a bit of background on Mattermost’s architecture is warranted. When we first started developing Mattermost back in 2014, we wanted to create a collaboration chat platform that was 1) open source and 2) self-hosted. We didn’t have any plans for running Mattermost as SaaS, and we really wanted to build a product that was focused on the organization using it. So, we architected Mattermost to be single-tenant. There can be teams within a single Mattermost instance, but there are no barriers preventing the users of those teams from interacting with each other. Users are the top-level entity, and their relationship with teams is 1:n.

For the first five or so years of Mattermost’s existence this served us very well. It let us focus on making a great experience for our self-hosted users and customers. But then, a couple of years ago, we wanted to expand into offering Mattermost as a hosted service. 

Taking Mattermost to the Cloud

Now the engineering team was faced with the question: how? A lot of the engineers at Mattermost had worked on large SaaS systems in the past and knew well the benefits that being multi-tenant brings. We also knew that re-architecting Mattermost to be multi-tenant would be no small feat, because being single-tenant was at the core of how it was built. We were also still a fairly small start-up at the time and didn’t have the resources to fork our product and build something separate for Cloud. The question we asked was: did we have to re-architect at the application layer, or was there another way to solve this? Perhaps it could be solved at the infrastructure layer, say with something like Kubernetes? To cut the story short, after numerous discussions, proofs of concept, and a bunch of research, we were confident we could do it.

So, we did it. And it looks like this:

[Figure: Mattermost Cloud architecture]

In short, we leveraged the orchestration power of Kubernetes and added a second, much more Mattermost-aware layer of orchestration on top of it. That layer lets us deploy individual instances of Mattermost quickly and on demand when a customer signs up for a workspace. As a result, each customer with a workspace gets their own deployment of single-tenant Mattermost and a set of pods that serves only them.
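To make the one-installation-per-workspace idea concrete, here is a minimal Python sketch. It is not our actual code: the names `Installation` and `SignupOrchestrator`, the `mm-` namespace prefix, and the default of two replicas are all hypothetical and exist only to illustrate that each sign-up yields a dedicated, isolated deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Installation:
    """One single-tenant deployment (hypothetical model, for illustration only)."""
    customer_id: str
    namespace: str  # assumption: each installation is isolated in its own namespace
    replicas: int   # pods that serve only this customer


class SignupOrchestrator:
    """Toy stand-in for the Mattermost-aware layer that sits above Kubernetes."""

    def __init__(self, default_replicas=2):
        self._default_replicas = default_replicas
        self._installations = {}

    def sign_up(self, customer_id):
        # Idempotent: a customer who already has a workspace keeps the same one.
        if customer_id in self._installations:
            return self._installations[customer_id]
        installation = Installation(
            customer_id=customer_id,
            namespace=f"mm-{customer_id}",  # hypothetical naming scheme
            replicas=self._default_replicas,
        )
        self._installations[customer_id] = installation
        return installation


orchestrator = SignupOrchestrator()
acme = orchestrator.sign_up("acme")
globex = orchestrator.sign_up("globex")
```

The point of the sketch is that nothing is shared between `acme` and `globex`: isolation comes from giving each customer separate resources rather than from tenant checks inside the application.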

Additionally, all of this wouldn’t be possible without a ton of other great, open source CNCF projects that we built our Cloud with:

Benefits

In addition to not having to re-architect our entire application to be multi-tenant, we also get a number of other benefits from building our Cloud architecture around a single-tenant application:

None of these are small potatoes either, especially since Mattermost was founded with data privacy and security as core principles.

Challenges

Running any SaaS product is hard. It’s even harder when you’re doing any sort of trailblazing off the beaten path. Here’s a non-exhaustive list of some challenges we faced. All of these are deep enough to be their own blog posts, and some of them already are.

Missteps

As happens whenever you build anything complex, there were lessons to be learned. Some of our missteps were:

High-level Architecture

Getting into some more detail, at a high level there are three primary components of the system:

The Customer Web Server is a fairly standard web server. It provides the front-end portal that our customers use to sign up, handles billing and customer account information, and tells the Provisioning Server to create or delete deployments of Mattermost, which we call installations on the backend. This is what serves pages like this.

The Provisioning Server is the brain, or command-and-control, of our entire Cloud architecture. It is responsible for creating and managing Kubernetes clusters; deploying, configuring, and managing Mattermost installations on an appropriate cluster; and scheduling and rolling through updates to both clusters and installations. This is the primary part of that additional Mattermost-aware orchestration layer I mentioned. It has a REST API and is built using a micro-monolith architecture that lets us ship a single binary that can run one or more supervisors to handle cluster, installation, and other responsibilities separately. If you’d like to learn more about the specifics, check out the original technical specification.
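The supervisor idea can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the Provisioning Server's real design: the state names, the `Resource` and `Supervisor` classes, and the `run_once` helper are all hypothetical. It shows one program in which each process enables only the supervisors it should run.

```python
from dataclasses import dataclass

# Hypothetical lifecycle states; the real Provisioning Server's states may differ.
NEXT_STATE = {
    "creation-requested": "creation-in-progress",
    "creation-in-progress": "stable",
}


@dataclass
class Resource:
    name: str
    state: str = "creation-requested"


class Supervisor:
    """Owns one kind of resource and advances each item one step per pass."""

    def __init__(self, kind, resources):
        self.kind = kind
        self.resources = resources

    def do(self):
        for resource in self.resources:
            # Resources with no next state (e.g. "stable") are left untouched.
            resource.state = NEXT_STATE.get(resource.state, resource.state)


def run_once(supervisors, enabled):
    """Run only the supervisors enabled for this process, mimicking a single
    binary that can take on cluster work, installation work, or both."""
    ran = []
    for supervisor in supervisors:
        if supervisor.kind in enabled:
            supervisor.do()
            ran.append(supervisor.kind)
    return ran


clusters = [Resource("cluster-1")]
installations = [Resource("acme"), Resource("globex", state="stable")]
supervisors = [
    Supervisor("cluster", clusters),
    Supervisor("installation", installations),
]

# This process handles installations only; another process could handle clusters.
ran = run_once(supervisors, enabled={"installation"})
```

Keeping each responsibility behind its own supervisor is what allows the same binary to be split across processes later without splitting the codebase.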

Finally, there is our Kubernetes Operator, which handles all the low-level Kubernetes management of the Mattermost installations. It creates all the Kubernetes resources in the cluster needed for Mattermost to run, and it knows how to perform rolling updates and other maintenance tasks. If you’re not familiar with Kubernetes operators, you can think of one as the knowledge of a system administrator who is an expert on the application, codified into software that manages the deployment on Kubernetes for you. While we rely heavily on our operator for our Cloud, it is also usable by customers who want to self-host Mattermost on their own Kubernetes clusters.
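For readers new to operators, the core pattern is a reconcile loop: compare the desired state with what is observed and compute the actions that close the gap. The Python sketch below illustrates that pattern plus a one-pod-at-a-time rolling update. It is a generic illustration, not the Mattermost Operator's code; the resource names, the image tags, and both function names are placeholders.

```python
def reconcile(desired, observed):
    """Return the actions needed to make `observed` match `desired`.

    Both arguments map a resource name to its spec (a plain dict here).
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            actions.append(("update", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def rolling_update(pod_versions, target_version):
    """Yield the pod list after each step, replacing one pod at a time so the
    remaining pods keep serving traffic."""
    pods = list(pod_versions)
    for index, version in enumerate(pods):
        if version != target_version:
            pods[index] = target_version
            yield list(pods)


# Placeholder specs; the image tags and port are illustrative values only.
desired = {"deployment": {"image": "mattermost:2"}, "service": {"port": 8065}}
observed = {"deployment": {"image": "mattermost:1"}, "stale-job": {}}

actions = reconcile(desired, observed)
steps = list(rolling_update(["1", "1", "2"], "2"))
```

A real operator runs this comparison continuously against the Kubernetes API, so drift is corrected whenever it appears rather than only at deploy time.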

There’s a lot more to it than just that but that’s a good high-level summary.

Infrastructure

There are two main pieces of infrastructure that make up Mattermost Cloud: the command-and-control (CnC) Kubernetes cluster and the application Kubernetes clusters.

There is only one CnC cluster per deployment of the entire Cloud infrastructure. Both the Customer Web Server and Provisioning Server run within it, as well as all the other utilities and services we need to run a Cloud service such as logging and metrics collection.

Application Kubernetes clusters are where the Mattermost installations that our customers use are deployed. These clusters are created as needed by the Provisioning Server and there can be as many of them as needed.
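One simple way to picture "created as needed" is a first-fit placement: put a new installation on the first cluster with room, and create another cluster when none has any. The Python sketch below assumes a fixed per-cluster installation cap, which is a simplification for illustration; the `Cluster` and `Scheduler` classes and the `app-cluster-N` names are hypothetical and do not describe how the Provisioning Server actually decides placement.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    name: str
    capacity: int  # assumption: a simple cap on installations per cluster
    installations: list = field(default_factory=list)

    def has_room(self):
        return len(self.installations) < self.capacity


class Scheduler:
    """First-fit placement that grows the fleet when every cluster is full."""

    def __init__(self, capacity_per_cluster):
        self.capacity_per_cluster = capacity_per_cluster
        self.clusters = []

    def place(self, installation):
        cluster = next((c for c in self.clusters if c.has_room()), None)
        if cluster is None:
            # No existing cluster has room, so a new one is created on demand.
            name = f"app-cluster-{len(self.clusters) + 1}"
            cluster = Cluster(name, self.capacity_per_cluster)
            self.clusters.append(cluster)
        cluster.installations.append(installation)
        return cluster.name


scheduler = Scheduler(capacity_per_cluster=2)
placements = [scheduler.place(name) for name in ["acme", "globex", "initech"]]
```

With a cap of two, the third installation triggers a second cluster, so the number of application clusters tracks demand instead of being fixed up front.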

We currently run everything on AWS and leverage S3 and RDS for file storage and databases, respectively. In the future we’d love to go multi-cloud, potentially even giving customers the choice of which cloud provider they’re deployed on.

Want to Learn More?

Check out these other blog posts from our Cloud team:

Join the ~Developers: Cloud channel on our community server if you want to discuss anything you just read with us.

Mattermost is actively hiring SREs and Cloud Engineers. Apply here!