Deprecation of Dockershim
A couple of days ago I found a story on Hacker News titled "Kubernetes is deprecating Docker runtime support". It made me think a bit about the impact of this news, both for Kubernetes and for Docker. Since Docker popularized container technology, could this be the end of Docker?
Let’s talk a bit about Docker and the evolution of the container ecosystem to understand how we got here.
Understanding Docker
Docker is not just a container runtime, it’s an entire platform to build, publish and run containers that has a lot of features built on top, like networking, volume management, etc.
Looking at Docker from 10,000 feet, you can find two components, docker and dockerd. The docker client (a CLI tool) allows you to issue commands to dockerd to pull, build or run an image, among other things, and dockerd is the Docker daemon process that receives client requests and performs the requested actions.
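To make the client/daemon split a bit more concrete, here is a minimal sketch of what a client does under the hood: it sends an HTTP request to dockerd over a Unix socket. I'm assuming the default socket path /var/run/docker.sock and the Engine API's GET /version endpoint; the class and function names are mine, not part of any Docker library.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks over a Unix domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        # The host name is only used for the Host header; no TCP is involved.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def docker_version(socket_path: str = "/var/run/docker.sock") -> dict:
    """Ask the daemon listening on socket_path for its version info."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/version")
        response = conn.getresponse()
        return json.loads(response.read())
    finally:
        conn.close()
```

On a machine where dockerd is running and your user can read the socket, calling docker_version() should return roughly the same information that the docker version command prints for the server side.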
Evolution of container runtimes: OCI and CRI
Since its first release in 2013, Docker and the entire container ecosystem have been evolving, and two standards emerged from it: the Open Container Initiative (OCI) and the Container Runtime Interface (CRI).
With OCI, Docker broke down the monolith and separated the code for low-level interaction with the Linux kernel that allowed it to run containers into a library called libcontainer and a tool called runc, which became the reference implementation of the OCI runtime specification.
So at its core the OCI runtime specification specifies the on-disk format of a container, called a bundle (a config.json file plus a root filesystem), how the resources from the Linux kernel (cgroups, namespaces) are allocated, and how to run it. There is no network management and no image pulling specified at this level, just how to run containers.
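As a rough illustration of what an OCI runtime like runc consumes, the sketch below writes a tiny bundle: a directory with a config.json and a rootfs folder. The field names follow the OCI runtime specification as I understand it, but the values (hostname, memory limit, namespace list) and the helper names are just assumptions for the example, and the rootfs is left empty, so this is not something runc could actually start as-is.

```python
import json
from pathlib import Path


def minimal_oci_config(args: list[str], hostname: str = "demo") -> dict:
    """Build a small config.json payload in the shape of the OCI runtime spec."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": args,  # the command to run inside the container
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        "root": {"path": "rootfs", "readonly": True},
        "hostname": hostname,
        "linux": {
            # Kernel namespaces the runtime should create for the container.
            "namespaces": [{"type": t} for t in ("pid", "network", "ipc", "uts", "mount")],
            # cgroup limits; 64 MiB of memory is an arbitrary example value.
            "resources": {"memory": {"limit": 64 * 1024 * 1024}},
        },
    }


def write_bundle(bundle_dir: Path, args: list[str]) -> Path:
    """Create bundle_dir/rootfs and bundle_dir/config.json, return the config path."""
    (bundle_dir / "rootfs").mkdir(parents=True, exist_ok=True)
    config_path = bundle_dir / "config.json"
    config_path.write_text(json.dumps(minimal_oci_config(args), indent=2))
    return config_path
```

Notice that nothing in the config says where the root filesystem came from or how the container is wired to a network. That is exactly the part left to higher-level tools.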
In April 2016 Docker Engine 1.11 was released; this was the first version of Docker built on runc and containerd.
containerd is a high level container runtime that is available as a daemon and can manage the complete container lifecycle of its host system: image transfer and storage, container execution and supervision, low-level storage and network attachments, etc.
Underneath, containerd calls runc to create the containers.
The release of Kubernetes 1.5 introduced the Container Runtime Interface (CRI), a plugin interface which enables the kubelet to use a wide variety of container runtimes (containerd, cri-o, etc.) without the need to recompile. CRI consists of protocol buffers, a gRPC API and libraries, with additional specifications and tools under active development.
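The idea behind a plugin interface like CRI can be sketched without any gRPC at all. Below, RuntimeService is a hypothetical Python rendering of three of the CRI RuntimeService calls (RunPodSandbox, CreateContainer, StartContainer) with heavily simplified arguments, FakeRuntime is a stand-in for a real runtime, and start_pod plays the role of the kubelet. None of these names come from Kubernetes itself.

```python
import itertools
from typing import Protocol


class RuntimeService(Protocol):
    """Simplified stand-in for a few CRI RuntimeService RPCs."""

    def run_pod_sandbox(self, pod_name: str) -> str: ...
    def create_container(self, sandbox_id: str, name: str, image: str) -> str: ...
    def start_container(self, container_id: str) -> None: ...


class FakeRuntime:
    """Pretend runtime that only hands out IDs and records what was started."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self.started: list[str] = []
        self._ids = itertools.count(1)

    def run_pod_sandbox(self, pod_name: str) -> str:
        return f"{self.prefix}-sandbox-{next(self._ids)}"

    def create_container(self, sandbox_id: str, name: str, image: str) -> str:
        return f"{self.prefix}-{name}-{next(self._ids)}"

    def start_container(self, container_id: str) -> None:
        self.started.append(container_id)


def start_pod(runtime: RuntimeService, pod_name: str, containers: dict[str, str]) -> list[str]:
    """Kubelet-like logic: it only knows the interface, not the runtime behind it."""
    sandbox_id = runtime.run_pod_sandbox(pod_name)
    container_ids = []
    for name, image in containers.items():
        container_id = runtime.create_container(sandbox_id, name, image)
        runtime.start_container(container_id)
        container_ids.append(container_id)
    return container_ids
```

Swapping FakeRuntime("containerd") for FakeRuntime("cri-o") requires no change to start_pod, which is the whole point: the kubelet talks to the interface and any runtime that implements it can be plugged in.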
Why the deprecation of Dockershim?
Long story short, the Docker engine daemon (dockerd) doesn’t support CRI, meaning that in order for Kubernetes to keep using Docker to run containers, the Kubernetes maintainers need to use a shim and maintain it at their own cost.
The Dockershim flow
When the kubelet wants to create a container the following steps are required:
- The kubelet calls dockershim through the CRI interface (gRPC) to request the creation of the container. The dockershim is embedded directly in the kubelet, acts as a CRI server and receives the creation request.
- The dockershim gets the kubelet request and translates it to something that the Docker daemon (dockerd) can understand.
- As we saw with Docker 1.11, Docker externalized container creation to containerd, meaning that dockerd then asks containerd to create the container.
- Once containerd gets the request, it creates a process called containerd-shim and lets it operate the container. This happens because the container process needs a parent process to attach to in order to collect stats, state, etc. If containerd itself were the parent, all child container processes on the host would have to exit whenever containerd hangs or gets upgraded. For the same reason, containerd and the shim are not parent-child processes.
- The containerd-shim just calls runc, a command line tool that handles all the low-level container operations and implements the OCI interface standard.
- After runc starts the container, it exits and the containerd-shim becomes the container’s parent process. The shim is responsible for collecting the status of the container process and reporting it to containerd, and for taking over the container’s child processes after the process with PID 1 in the container exits, cleaning up and ensuring that there are no zombie processes.
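The translation step that dockershim performs is essentially an adapter. The sketch below shows the flavor of it: take a CRI-style create request and turn it into a Docker Engine API call. The request shape is loosely based on CRI's CreateContainerRequest and the output on Docker's POST /containers/create, but both are heavily simplified, and the function name and the container naming scheme are made up for the example. This is not the real dockershim code.

```python
from urllib.parse import urlencode


def cri_to_docker_create(cri_request: dict) -> tuple[str, str, dict]:
    """Translate a simplified CRI create request into (method, path, JSON body)."""
    config = cri_request["config"]
    # Illustrative naming only; the real shim uses its own naming convention.
    name = f'{cri_request["pod_sandbox_id"]}_{config["metadata"]["name"]}'
    body = {
        "Image": config["image"]["image"],
        # CRI "command" corresponds to Docker's Entrypoint, CRI "args" to Cmd.
        "Entrypoint": list(config.get("command", [])),
        "Cmd": list(config.get("args", [])),
        "Labels": dict(config.get("labels", {})),
    }
    return "POST", "/containers/create?" + urlencode({"name": name}), body


# Example request as the kubelet side might describe it (values are invented).
example_request = {
    "pod_sandbox_id": "sandbox123",
    "config": {
        "metadata": {"name": "app"},
        "image": {"image": "nginx"},
        "command": ["nginx"],
        "args": ["-g", "daemon off;"],
    },
}
method, path, body = cri_to_docker_create(example_request)
```

Every field that exists on one side and not the other, and every change in either API, is something the shim maintainers have to track by hand.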
The containerd via cri-plugin
When we look at the kubelet flow when using containerd 1.1+ we can see that the list of dependencies and complexity gets reduced:
- The kubelet calls containerd directly via the CRI interface (gRPC) to request the creation of the container.
- Before containerd 1.1 there was a cri-containerd daemon process that implemented the CRI interface, received the requests from the kubelet and forwarded them to containerd. Now containerd implements CRI directly as a built-in plugin and receives the requests itself. It then creates a process called containerd-shim and lets it operate the container. This happens because the container process needs a parent process to attach to in order to collect stats, state, etc. If containerd itself were the parent, all child container processes on the host would have to exit whenever containerd hangs or gets upgraded. For the same reason, containerd and the shim are not parent-child processes.
- The containerd-shim just calls runc, a command line tool that handles all the low-level container operations and implements the OCI interface standard.
- After runc starts the container, it exits and the containerd-shim becomes the container’s parent process. The shim is responsible for collecting the status of the container process and reporting it to containerd, and for taking over the container’s child processes after the process with PID 1 in the container exits, cleaning up and ensuring that there are no zombie processes.
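Putting the two flows side by side as plain data makes the difference easy to see. The component names come straight from the steps described in this post; the helper functions are just for illustration.

```python
# Call chains as described in this post, from the kubelet down to runc.
DOCKERSHIM_CHAIN = ["kubelet", "dockershim", "dockerd", "containerd", "containerd-shim", "runc"]
CRI_PLUGIN_CHAIN = ["kubelet", "containerd", "containerd-shim", "runc"]


def hops(chain: list[str]) -> list[tuple[str, str]]:
    """Return each caller/callee pair in the chain."""
    return list(zip(chain, chain[1:]))


def dropped_components(old: list[str], new: list[str]) -> list[str]:
    """Components present in the old chain that the new chain no longer needs."""
    return [component for component in old if component not in new]


if __name__ == "__main__":
    print(len(hops(DOCKERSHIM_CHAIN)), "hops with dockershim")
    print(len(hops(CRI_PLUGIN_CHAIN)), "hops with the containerd CRI plugin")
    print("no longer needed:", dropped_components(DOCKERSHIM_CHAIN, CRI_PLUGIN_CHAIN))
```

Two fewer components between the kubelet and runc means two fewer things to version, upgrade and debug on every node.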
As you can see, maintaining the dockershim brings a lot of pain to the Kubernetes team: they need to manage another component just to keep supporting Docker (tracking deprecations, API changes, etc.). At the end of the day Docker is not special, and it should expose a CRI interface just like other container runtimes (containerd, cri-o).
So in order to continue using Docker for Kubernetes deployments in the future, the Docker client or daemon will need to become CRI compliant. Meanwhile, everyone should start planning the transition to a CRI-compatible runtime such as containerd.
Some managed Kubernetes services like AKS or GKE already support CRI runtimes; others like EKS are still working on supporting them.
Don’t forget to check if the container tools that you are using also support CRI runtimes and don’t just expect to find Docker.
On a final note, this doesn’t mean that you should stop using Docker for local development. Right now the Docker platform, in my understanding, still gives developers the best user experience and a huge set of tools that simply make our life easier (e.g. docker-compose).
Here are some articles that explain in more detail the reasons behind the deprecation and what it will entail for Kubernetes and its users: