Kubernetes

Want to automate pretty much every aspect of a deployed application’s lifecycle? Having issues with reliability using docker-compose? Don’t want to care about how to expose this application to the outside? Want a reproducible infrastructure of containers deployed in a fast and resilient manner? Then Kubernetes is for you. In a word, it’s a container orchestrator aimed at data centers.

First, you need to deploy a master node, which holds the configuration and runs the main API server, then a few worker nodes connecting to it; of course, you can have multiple masters for high availability. There is plenty of information on kubernetes.io on how to get started.

Now that it is up and running, you have to choose between two modes of development: either a file-based one encoding the desired state of the system, or a command-line one where you evolve the running system until you’re happy with it, then dump and clean up the configuration to get back to the file-based approach. I found that devops tend to prefer the latter; here I’ll go with the former.

Let’s take an example: I’m currently working on deploying Drynx, a decentralized and secure system for machine learning on distributed datasets, to show its power via our demonstrator. To better understand what Drynx is computing, we want to actually show these datasets in our web interface, so we want to provide an endpoint to retrieve them, i.e. an nginx serving some files via HTTP. In Kubernetes terms, this means having a deployment for some nginx pod, exposed through a service.

  • pod: an instance of an application, running inside a container, which can be killed at any time for any number of reasons: a pod update, eviction by the system, maintenance needed on the host, …
    • so if you want state to be stored or shared between containers, you need to use a volume
      • in our case, we actually have the datasets shared between nginx and the Drynx nodes
    • one quite useful feature is initContainers, which run to completion before the main containers start; it’s usually where you generate the configuration
      • the way to transfer state between the init containers and the normal ones is an emptyDir volume, which lives as long as the pod does (so it survives container restarts)
  • deployment: handles pods: how many you need at any point, how to transition to a new application version; it will (re)start/kill its managed pods to reach the state you want
    • it’s the main type of file to write, as it contains a template for the pods to create; I’ve yet to find a common use case for writing a pod file directly
  • service: entrypoint for the HTTP calls; as the pods can come up at any IP and can be quite short-lived, we want a stable address to connect to, usually one that is also routable from outside the datacenter
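
The pieces above could be sketched as a single manifest, a minimal assumption of what the nginx dataset endpoint might look like; the names (nginx-datasets, the busybox init image, the placeholder dataset) are illustrative, not taken from the actual Drynx demonstrator:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-datasets
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-datasets
  template:
    metadata:
      labels:
        app: nginx-datasets
    spec:
      # emptyDir lives as long as the pod, so it survives container restarts
      volumes:
        - name: datasets
          emptyDir: {}
      # initContainers run to completion before the main containers start;
      # here one could fetch or generate the datasets to serve
      initContainers:
        - name: fetch-datasets
          image: busybox
          command: ["sh", "-c", "echo 'placeholder dataset' > /data/dataset.csv"]
          volumeMounts:
            - name: datasets
              mountPath: /data
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          volumeMounts:
            - name: datasets
              mountPath: /usr/share/nginx/html
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-datasets
spec:
  # a stable address in front of the short-lived pods; switch to
  # type: LoadBalancer (or add an Ingress) to make it routable from
  # outside the datacenter
  type: ClusterIP
  selector:
    app: nginx-datasets
  ports:
    - port: 80
      targetPort: 80
```

With the file-based approach this would be applied with `kubectl apply -f nginx-datasets.yaml`; with the command-line approach you would instead build up a similar state imperatively, then dump it with something like `kubectl get deployment nginx-datasets -o yaml`.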

So with that, you have an application that can crash and be upgraded without you caring anymore about how it is exposed to the outside or how running connections are handled. You have a clear view of what is running, how it is available and how well it is running, without caring about little details like physical location or hardware.