Published: Jan 3, 2020 by John Placais
TL;DR: Don't move to Docker and Kubernetes at the start of a project. Design your architecture to be ready for them, and only adopt Kubernetes when it becomes necessary.
Docker and Kubernetes are famous for their role in massive corporations such as Google, ADP, and SAP. In fact, Google built the predecessor to Kubernetes (called Borg) in order to manage their container environment. Containers and container orchestration systems can be a wonderful way to manage a large scale software project where dozens of servers are required, but can actually be a detriment to smaller projects that haven’t reached scale just yet.
For the uninitiated, it helps to understand how Docker and Kubernetes are used. In an enterprise environment, the Docker workflow starts with pulling a base image from a repository, usually Docker Hub. This image contains, at minimum, an operating system. You then add whatever it is you need to run, such as a website or a database, by writing a Dockerfile. The Dockerfile describes the following:
- A target OS
- Files to copy into the container
- Scripts to be run as part of building the image
- Which ports to expose from the container to the outside world
- The command to run upon starting your container
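The list above maps almost one-to-one onto Dockerfile instructions. Here is a minimal sketch; the base image, file paths, port, and start command are all hypothetical placeholders, not a real application:

```dockerfile
# Target OS: start from a base image
FROM ubuntu:20.04

# Files to copy into the container
COPY ./app /opt/app

# Scripts run as part of building the image
RUN apt-get update && apt-get install -y python3

# Port exposed from the container to the outside world
EXPOSE 8080

# Command to run upon starting the container
CMD ["python3", "/opt/app/server.py"]
```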
With that Dockerfile you can build an image. Once you have built all the images your environment needs, you need to store them in a central repository for all your hosts to pull from. This repository will also keep track of versioning. We required an internal container registry to securely store our images, as they contained sensitive data and code.
Any time we need to update the images, we create, run, and maintain scripts as a part of our build.
Next, you will want to create your Docker hosts. Initially, you can set this up manually, but to really take advantage of containers, you will want to leverage Kubernetes and script out your desired state. Manual setup quickly becomes error-prone as you create more images that need to communicate with each other.
Kubernetes has a hierarchy that helps manage the complexity. At the lowest level is the container. Above that is a pod, which can contain one or more containers together in a group. A pod is deployed on an individual host, so if you want things co-located, keep them in the same pod.
Kubernetes nodes are the machines (physical or virtual) that run your pods. A node's capacity (how many CPU cores and how much memory) determines which pods the scheduler can place on it.
And finally, nodes are grouped together into a Kubernetes cluster, which defines your entire environment across all nodes.
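A minimal pod manifest illustrates the bottom of that hierarchy: one or more containers grouped and co-located in a pod, with per-container resource requests that the scheduler matches against node capacity. All names, images, and values here are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod                      # hypothetical pod name
  labels:
    app: web
spec:
  containers:
  - name: web                        # first container in the pod
    image: my-registry/web:1.0       # hypothetical image
    resources:
      requests:
        cpu: "500m"                  # half a CPU core
        memory: "256Mi"
  - name: log-sidecar                # second container, co-located on the same host
    image: my-registry/logger:1.0    # hypothetical image
```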
Deployments are the mainstay tool for creating and updating new versions of your pods in the cluster. A deployment can describe the following:
- Labels, such as an app name or a tier
- A selector, which identifies the pods the deployment manages
- A template of what to deploy (containers)
- Optionally, probe definitions
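Those pieces fit together in a single manifest. A sketch of a deployment, where the app name, image, and replica count are assumptions for illustration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
  labels:
    app: web           # label: app name
    tier: frontend     # label: tier
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web         # selector: which pods this deployment manages
  template:            # template of what to deploy
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: my-registry/web:1.0   # hypothetical image
        ports:
        - containerPort: 8080
```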
Deployments can be rolled out in a variety of ways, including rolling updates and blue/green deployments. This helps you manage a rollout while keeping 100% uptime if needed, and you can also leverage deployments to roll back a bad release.
Probes allow Kubernetes to monitor the health of each of your containers. They come in two flavors: liveness (is the container healthy?) and readiness (is it ready to receive traffic after startup?). Prebuilt probe mechanisms cover pretty much anything you will need, such as making an HTTP request and checking the response, or executing a command in the container and checking its exit status.
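Both probe types are declared per container inside the pod template. A sketch of the two built-in mechanisms just mentioned; the paths, port, and timings are hypothetical:

```yaml
# Inside a container spec in the pod template:
livenessProbe:            # is it healthy?
  httpGet:                # built-in HTTP request/response probe
    path: /healthz
    port: 8080
  periodSeconds: 10
readinessProbe:           # is it ready after startup?
  exec:                   # built-in command-execution probe
    command: ["cat", "/tmp/ready"]
  initialDelaySeconds: 5
```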
Kubernetes Services are what help you manage the networking for and between deployed pods as they start up and shut down, as well as load balancing traffic between pods. Any external network traffic comes into a pod through a service. And unlike pods and containers, services are persistent and stay up.
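A service selects pods by label and gives them one stable, load-balanced address. A sketch, assuming a hypothetical `app: web` label on the target pods:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  selector:
    app: web            # routes traffic to pods carrying this label
  ports:
  - port: 80            # port the service exposes
    targetPort: 8080    # port the pods listen on
  type: LoadBalancer    # balance external traffic across the pods
```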
Why and when to use Docker and Kubernetes
So now we have a high-level view of the basic concepts of Kubernetes. This should hopefully give you a glimpse into the complexity of the container world, and we haven’t even talked about persisting data, cybersecurity, or debugging.
So why use Kubernetes and Docker? First, Kubernetes at its heart is a DevOps tool for large scale applications. Kubernetes deploys and runs your application regardless of what technologies you are using in your containers. Kubernetes makes a clean separation between your app and how to deploy your app.
Second, Kubernetes gives you the power to be elastic and scalable on a large scale. You can grow and shrink the number of instances of your pods easily, as well as add new servers to your environment if necessary.
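That elasticity can even be declarative: a HorizontalPodAutoscaler grows and shrinks the replica count for you based on load. A sketch, where the target deployment name, replica bounds, and CPU threshold are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-deployment    # hypothetical deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods above 70% average CPU
```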
Third, resiliency: Kubernetes uses probes to monitor your pods and automatically replaces any that seem to be malfunctioning. If one of your containers stops responding or functioning, Kubernetes handles shutdown and cleanup, and ensures the number of replicas you requested stays up and running.
As cool and useful as Kubernetes is, I would not recommend it to every application developer. The problem is that Kubernetes is the shiny new toy - everyone wants to use it for everything. Developers want to containerize their apps and make it easy for other people to deploy their creations. They are probably tired of Puppet and Chef. They want to be a part of the revolution. Developers have a fear of missing out, and a strong desire to have Kubernetes on their resume.
So my answer is (as it is with many of these kinds of situations) it depends. If you are creating an application for a handful of people that will comfortably run on a single server today, then switching to Kubernetes would be premature. Remember one of the tenets of software architecture - “Added complexity reduces productivity”. There is a learning curve with Kubernetes, and there is a burden to work with it and manage it. If you already have a complex environment (a few dozen servers) and are struggling to manage it all with traditional tools, then you already have the complexity that mandates the implementation of Kubernetes, and it can help you manage and scale from here on as you grow.
Also, depending on the solution you are trying to build, you can alternatively build scalable solutions with a cloud-provided serverless approach. For example, leveraging Azure Functions, Azure Event Hubs, and Azure Data Lake can give you scalability with less complexity (as before, choosing a serverless architecture only makes sense depending on what your product is trying to do!).
For our current project, we know that we will eventually need the scalability and management of a Kubernetes cluster, but we are delaying the jump to it for as long as possible so we can focus on software development. We have designed the product with a microservice architecture that will make it easy to migrate to Kubernetes when we have to.