Docker Swarm and Kubernetes are the most widespread container orchestrators for running microservices spread across the nodes of a cluster. They provide high availability, scalability, security and easy management to complex software architectures.
Docker Swarm reaches this goal with the stack concept, a set of related Docker services; Kubernetes does it with different types of objects, such as Deployment, ReplicationController, StatefulSet and Service, that extend the meaning of containers and drive the controllers and the scheduler to run containers on the nodes of the cluster, providing them with all the network and storage resources they need.
Kubernetes is more powerful than Docker Swarm because it is programmable and it integrates easily with different network and storage plugins, letting any vendor use the same interface. For networking the reference interface is the Container Network Interface (CNI); for storage it is the Container Storage Interface (CSI), developed as a standard for exposing arbitrary block and file storage systems to containerized workloads.
Although Kubernetes is more powerful than Swarm, this does not justify a lack of interest in Docker Swarm, which remains easier to configure and offers a series of interesting features, like the stack concept, that are not present in Kubernetes.
This article continues the work described in my first article http://www.securityandit.com/system/docker-swarm-vs-kubernetes-part-1/, where I compared Swarm services and stacks against Kubernetes objects. The focus is now on volumes in Swarm and Kubernetes.
Volumes in Docker Swarm and Kubernetes
In Docker Swarm, volumes are simply directories outside of the container's filesystem that hold persistent data and are mounted inside the containers.
There are three types of mounts:
- A bind mount makes a file or directory on the host available to the container it is mounted within.
- A named volume holds data that can be shared between a container and the host machine, as well as between multiple containers. This is the default mode and, as Docker suggests, named volumes are the preferred mechanism for persisting data generated by and used by Docker containers.
- A tmpfs mount creates a tmpfs filesystem inside a container for volatile data.
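The three mount types can be sketched in compose syntax; this is a hypothetical fragment with made-up paths and names, not taken from the stack deployed below:

```yaml
version: "3.3"
services:
  app:
    image: nginx:latest
    volumes:
      - /opt/html:/usr/share/nginx/html   # bind mount: a host path made visible in the container
      - appdata:/var/lib/appdata          # named volume, created and managed by Docker
      - type: tmpfs
        target: /tmp/cache                # tmpfs mount for volatile data, lost on restart
volumes:
  appdata:
```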
Following is a simple stack in Docker Swarm with one service that mounts a directory from the host machine where the nginx container stores its HTML data. The volume is of bind type.
[root@swarm-01 ]# vi compose_dev.yml
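The content of compose_dev.yml is not shown in the original; a minimal sketch consistent with the service output below (the host path is hypothetical) could be:

```yaml
version: "3.3"
services:
  nginx:
    image: nginx:latest
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /glustervol1/html:/usr/share/nginx/html   # bind mount from the host holding the html data
    deploy:
      replicas: 1
```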
[root@swarm-01 ]# docker stack deploy --prune --compose-file /glustervol1/docker-compose/compose_dev.yml stack-dev --with-registry-auth
[root@swarm-01 docker-compose]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
ojjj4tqlhxry stack-dev_nginx replicated 1/1 nginx:latest *:80->80/tcp,*:443->443/tcp
It's possible to inspect the container to show the volume mounted inside the service:
[root@swarm-01 ]# docker inspect a6647078b071
As shown, the volume is of bind type because a directory on the host was made available to the service, which is composed of only one replica.
Docker Swarm has a lot of limitations compared to Kubernetes:
- Docker Swarm doesn't support any kind of automatic volume creation and mounting. In the example above, the source directory must already be present on the hosts where the Swarm containers are running. If we use some external volume like NFS or iSCSI, it must be manually discovered and mounted on every node of the cluster.
- In Docker Swarm it's not possible to replicate a service while automatically creating a new volume for each replica. Suppose we have a service like MongoDB with n replicas configured in a replica set: in Docker Swarm each replica cannot get its own volume, so it's necessary to create separate services like mongodb-01 and mongodb-02 configured in high availability.
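As a sketch of this workaround, each MongoDB replica becomes its own service with its own named volume (all service and volume names here are hypothetical):

```yaml
version: "3.3"
services:
  mongodb-01:
    image: mongo
    volumes:
      - mongodb-01-data:/data/db   # each service must declare its own volume by hand
  mongodb-02:
    image: mongo
    volumes:
      - mongodb-02-data:/data/db
volumes:
  mongodb-01-data:
  mongodb-02-data:
```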
Kubernetes moves beyond the concept of a directory mounted inside a container by introducing three new concepts: PersistentVolume, PersistentVolumeClaim and StorageClass.
- A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster, just like a node is a cluster resource, associated with a physical resource like a disk or external storage reachable via NFS, iSCSI, Gluster, etc.
- A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod: pods consume node resources and PVCs consume PV resources. It's used to mount a persistent volume into a pod; the pod can claim the volume without knowing its details.
- A StorageClass provides a way for administrators to describe the "classes" of storage they offer. It makes it possible to create a PersistentVolume automatically through a cloud provisioner API.
The benefits compared to Docker Swarm are:
- Volumes are no longer simple host directories but resources that are created or used automatically. For example, in Docker Swarm an iSCSI target must be manually discovered and mounted on every node of the cluster and then mounted into containers like any other directory; in Kubernetes it's possible to use an iSCSI volume directly by defining all its details in the persistent volume configuration.
- It's possible to abstract the volume resource by claiming it when it's needed. For example, a pod needs a physical volume of n GB; it claims it and Kubernetes automatically creates a new volume using the right cloud plugin driver. The new volume could be a disk in Google Cloud or AWS, or a remote NFS or Gluster directory.
- Kubernetes offers an object called StatefulSet for managing pods that consume persistent storage. This allows something that is not possible with Docker Swarm: having n replicas of a database service that automatically mount different persistent volumes. This is a good solution for running databases in high availability inside the Kubernetes cluster.
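A minimal StatefulSet sketch illustrating the per-replica volumes (names and sizes are hypothetical): the volumeClaimTemplates section makes Kubernetes create a separate PVC, and therefore a separate volume, for each of the three replicas.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb
  replicas: 3
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:   # one PVC per replica: data-mongodb-0, data-mongodb-1, data-mongodb-2
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```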
Let me show some examples to explain these concepts better.
The first example shows how the StorageClass object is used. Remember that with a storage class it's possible to define a storage abstraction type that allocates a volume with certain characteristics when a claim is created.
A storage abstraction belonging to the "fast" class can be created in this way:
[root@kali storage_class]# vi storage-pv-fast.yaml
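The file content is not shown in the original; based on the provisioner and the type parameter mentioned below, storage-pv-fast.yaml presumably looks like this:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd   # GCE persistent disk provisioner
parameters:
  type: pd-ssd                      # provision SSD-backed disks
```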
[root@kali storage_class]# kubectl create -f storage-pv-fast.yaml
[root@kali storage_class]# kubectl get storageclass
When volume provisioning is invoked, the parameter type: pd-ssd and any referenced secrets are passed to the volume plugin kubernetes.io/gce-pd via a CreateVolume call. In response, the external volume plugin provisions a new volume and then automatically creates a PersistentVolume object to represent it. Kubernetes then binds the new PersistentVolume object to the PersistentVolumeClaim, making it ready to use.
The fast storage class above dynamically creates a new volume (PV) of the size configured in the persistent volume claim: a new physical disk is automatically created. When the PVC is defined, the storage class binds the newly created PV to it.
[root@kali storage_class]# vi storage-pv-fast-claim.yaml
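A sketch of the claim, assuming it requests the 10Gi shown in the output below (note that the output reports the standard class, so the exact storageClassName in the original file may differ):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fast
spec:
  storageClassName: fast   # the class defined above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```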
[root@kali storage_class]# kubectl create -f storage-pv-fast-claim.yaml
persistentvolumeclaim "pvc-fast" created
[root@kali storage_class]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-fast Bound pvc-936fcf00-fc53-11e7-b1bc-42010a840041 10Gi RWO standard 17s
[root@kali storage_class]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-936fcf00-fc53-11e7-b1bc-42010a840041 10Gi RWO Delete Bound default/pvc-fast standard 1m
In other words, the persistent volume claim is bound to a persistent volume associated with a new disk. It's as if a new volume had been created manually:
[root@kali storage_class]# gcloud compute disks list|grep 0041
gke-persistent-disk-tu-pvc-936fcf00-fc53-11e7-b1bc-42010a840041 us-central1 10 pd-standard READY
The volume is mounted inside the container by referencing the PVC resource, not the PV.
Next is a pod called debian that mounts the volume above on the /data directory. The pod simply writes something to a text file inside the mounted directory.
[root@kali storage_class]#vi debian.yaml
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: pvc-fast
  containers:
    - name: debian
      image: debian
      volumeMounts:
        - name: task-pv-storage
          mountPath: "/data"
      command: ["/bin/sh", "-c", "while true; do date >> /data/test.txt; echo 'Hello from the second container' >> /data/test.txt; sleep 5; done"]
[root@kali storage_class]# kubectl create -f debian.yaml
[root@kali storage_class]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
debian-01 1/1 Running 0 7m 10.40.0.7 gke-persistent-disk-tuto-default-pool-eedbf06b-k311
Inside the container, it's possible to see that the volume is of bind type, like in Docker Swarm.
[root@kali storage_class]# docker inspect 69dc814dd6dc
"Mounts": [
Another advanced feature offered by Kubernetes is the ability to connect directly to an external file server like NFS or GlusterFS.
It's not necessary to mount the NFS volume on every node of the cluster, as would be done with Docker Swarm; it's enough to declare a persistent volume of NFS type like this:
[root@kali kubernets]# vi nfs-volume.yaml
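The file content is truncated in the original; a sketch of an NFS persistent volume, with a hypothetical server IP and export path, could be:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  storageClassName: nfs
  nfs:
    server: 10.0.0.10      # FIXME: use the right IP of the NFS server
    path: /exports/html    # hypothetical export path
```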
[root@kali kubernets]# kubectl create -f nfs-volume.yaml
As we said, pods consume PVCs, so we need to claim the volume in order to use it, and that is done with a persistent volume claim:
[root@kali kubernets]# vi nfs-volume-claim.yaml
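A sketch of the claim, matching the size and access mode reported in the output below:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-volume-claim
spec:
  storageClassName: nfs
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
```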
[root@kali kubernets]# kubectl create -f nfs-volume-claim.yaml
[root@kali kubernets]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-volume-claim Bound nfs-volume 10Gi RWX nfs 3s
The PVC bound to the volume above is ready to be consumed by a pod. I will use a Deployment object, already explained in my previous article.
[root@kali kubernets]# vi httpd.yaml
containers:
  - image: httpd:alpine
    ports:
      - containerPort: 80
    volumeMounts:
      - mountPath: "/usr/local/apache2/htdocs/"
        name: nfs-volume-storage
[root@kali kubernets]# kubectl create -f httpd.yaml
deployment "httpd" created
[root@kali kubernets]# kubectl describe deployment httpd
/usr/local/apache2/htdocs from nfs-volume-storage (rw)
This last example makes it easy to understand the difference between a storage class, which automatically creates a persistent volume by calling the plugin provisioner when a pod claims it, and a persistent volume that is created manually and bound when a pod claims it.
When users are done with their volume, they can delete the PVC object from the API, which allows reclamation of the resource. You can find all the details at https://kubernetes.io/docs/concepts/storage/persistent-volumes/.
An important volume type that I would like to mention is the local persistent volume. The Kubernetes scheduler understands which node a local persistent volume belongs to. With HostPath volumes, a pod referencing the volume may be moved by the scheduler to a different node, resulting in data loss; with local persistent volumes, the scheduler ensures that a pod using one is always scheduled to the same node.
With this type of object, combined with StatefulSets, it's possible to run databases in high availability inside a Kubernetes cluster without using remote storage like Ceph. You can find all the details at this link: https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/.
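A sketch of a local persistent volume (path, node name and class are hypothetical); the nodeAffinity section is what lets the scheduler pin the consuming pod to the node that actually holds the disk:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1        # local disk mounted on one specific node
  nodeAffinity:                  # ties this PV, and any pod using it, to that node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-01
```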
As you can see, volume management in Kubernetes is more functional and flexible than in Docker Swarm, which offers nothing special in this area; Kubernetes takes an important step forward in abstracting storage space from business processes.
Kubernetes offers a lot of plugins for managing different types of volumes, on premises and in the cloud, and it gives us the possibility to integrate with new storage through the well-documented Go API available here: https://github.com/kubernetes/client-go