Docker Swarm and Kubernetes are the most widespread container orchestrators for running microservices spread across the nodes of a cluster. They bring high availability, scalability, security and easier management to complex software architectures.
Docker Swarm reaches this goal with the stack concept, a set of related Docker services. Kubernetes does it with different types of objects, such as Deployment, ReplicationController, StatefulSet and Service, that extend the meaning of containers and are created through REST API requests sent to the API server, the core of the cluster.
Kubernetes is more powerful than Docker Swarm because it integrates natively with different cloud providers, supports more plugins than other solutions, and offers many higher-level objects that make it possible to manage software deployment in a more robust and efficient way.
This does not justify a lack of interest in Docker Swarm, which remains easier to configure and offers some interesting features, like the stack concept, that are not present in Kubernetes.
This article continues the work started in my first article, http://www.securityandit.com/system/docker-swarm-vs-kubernetes-part-1/, where I compared Swarm services and stacks with Kubernetes objects.
The focus is now on volumes in Swarm and Kubernetes.
Volumes in Docker Swarm and Kubernetes
In Docker Swarm, volumes are simply directories outside of the container’s filesystem that hold persistent data and are mounted inside the containers.
There are three types of volumes:
- A bind mount makes a file or directory on the host available to the container it is mounted within.
- A named volume is created and managed by Docker. Data in named volumes can be shared between a container and the host machine, as well as between multiple containers. This is the default mode and, as the Docker documentation suggests, named volumes are the preferred mechanism for persisting data generated by and used by Docker containers.
- A tmpfs mount places a tmpfs inside a container for volatile data.
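As a minimal sketch of the three types, the Compose fragment below declares one mount of each kind using the long volume syntax. The service name `app`, the volume name `app-data` and all paths are hypothetical and chosen only for illustration.

```yaml
version: "3.6"
services:
  app:                               # hypothetical service name
    image: nginx:latest
    volumes:
      # Bind mount: a host directory made available inside the container
      - type: bind
        source: /srv/html            # hypothetical host path, must already exist
        target: /usr/share/nginx/html
      # Named volume: created and managed by Docker
      - type: volume
        source: app-data
        target: /var/lib/app
      # tmpfs: in-memory filesystem for volatile data
      - type: tmpfs
        target: /tmp/cache
volumes:
  app-data:                          # declares the named volume used above
```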
The following is a simple stack in Docker Swarm with one service that mounts a directory from the host machine, where the nginx container stores its HTML data. The volume is of bind type.
[root@swarm-01 ]# vi compose_dev.yml
[root@swarm-01 ]# docker stack deploy --prune --compose-file /glustervol1/docker-compose/compose_dev.yml stack-dev --with-registry-auth
[root@swarm-01 docker-compose]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
ojjj4tqlhxry stack-dev_nginx replicated 1/1 nginx:latest *:80->80/tcp,*:443->443/tcp
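The contents of compose_dev.yml are not reproduced above. A minimal sketch of a file that would produce a service like the one listed might look as follows; the host path is an assumption, while the service name, image and published ports are taken from the `docker service ls` output.

```yaml
version: "3.2"
services:
  nginx:                             # deployed as stack-dev_nginx
    image: nginx:latest
    ports:
      - "80:80"
      - "443:443"
    deploy:
      replicas: 1
    volumes:
      - type: bind
        source: /glustervol1/nginx/html   # assumed host directory, not from the article
        target: /usr/share/nginx/html
```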
It’s possible to inspect the container to show the volume mounted inside the service:
[root@swarm-01 ]# docker inspect a6647078b071
As shown, the volume is of bind type because a directory on the host was made available to the service, which is composed of only one replica.
Docker Swarm has a lot of limitations compared with Kubernetes.
- Docker Swarm doesn’t support any kind of automatic volume creation and mounting. In the example above, the source directory must be present and mounted on the hosts where the service containers are running. If we use an external volume like NFS or iSCSI, it must be manually discovered and mounted on every node of the cluster.
- In Docker Swarm it’s not possible to replicate a service and automatically create a new volume for each replica. Suppose we have a service like MongoDB with n replicas configured as a replica set. In Docker Swarm it’s not possible to have a different volume for each replica; it’s necessary to create different services like mongodb-01, mongodb-02, etc.
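The second limitation can be sketched with a Compose file that declares one service per MongoDB member, each with its own named volume. The volume names and the replica set name `rs0` are hypothetical. Named volumes are local to the node where the container runs, so in practice each service would also need to be pinned to a node.

```yaml
version: "3.2"
services:
  mongodb-01:
    image: mongo
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]   # rs0 is an assumed name
    volumes:
      - mongo-data-01:/data/db
  mongodb-02:
    image: mongo
    command: ["mongod", "--replSet", "rs0", "--bind_ip_all"]
    volumes:
      - mongo-data-02:/data/db
volumes:
  mongo-data-01:                     # one volume per service, declared by hand
  mongo-data-02:
```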
In Kubernetes, the idea of a plain directory mounted inside a container is superseded by three new concepts: Persistent Volume, Persistent Volume Claim and Storage Class.
- A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically through a storage class. It is a resource in the cluster just like a node is a cluster resource, and it is backed by a physical resource like a disk or an external storage system reachable via NFS, iSCSI, Gluster, etc.
- A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a pod: pods consume node resources and PVCs consume PV resources. It is used for mounting a persistent volume into a pod. Pods can claim a volume without knowing its details.
- The StorageClass provides a way for administrators to describe the “classes” of storage they offer. It makes it possible to create a Persistent Volume automatically through the cloud provider’s provisioner API.
The benefits compared with Docker Swarm are:
- The volumes are no longer simple directories of the host but resources that are created or used automatically. Example: in Docker Swarm an iSCSI target must be manually discovered and mounted on every node of the cluster, and then mounted into containers like any other directory. In Kubernetes, it’s possible to use an iSCSI volume directly by defining all its details in the persistent volume configuration.
- It’s possible to abstract the volume resource and claim it when it is needed. Example: a pod needs a persistent volume of n GB; it claims it and Kubernetes automatically creates a new volume using the right cloud plugin driver. The new volume could be a disk in Google Cloud or AWS, or a remote NFS or Gluster directory.
- Kubernetes offers an object called StatefulSet for managing pods that consume persistent storage. It permits something that is not possible with Docker Swarm: having n replicas of a database service that automatically mount different persistent volumes.
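To make the first and third points concrete, here is a hedged sketch of two manifests: a PersistentVolume backed directly by an iSCSI target, and a StatefulSet whose `volumeClaimTemplates` section gives every replica its own claim. All names, the portal address, the IQN and the sizes are placeholders, and the StatefulSet assumes a headless Service called `mongodb` already exists.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: iscsi-pv                     # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  iscsi:
    targetPortal: 192.0.2.20:3260    # placeholder address
    iqn: iqn.2001-04.com.example:storage.disk1   # placeholder IQN
    lun: 0
    fsType: ext4
    readOnly: false
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb               # headless Service assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo
          volumeMounts:
            - name: data
              mountPath: /data/db
  # One PVC is created per replica: data-mongodb-0, data-mongodb-1, ...
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

With the StatefulSet, scaling to n replicas yields n distinct claims without declaring separate services by hand.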
It’s important to understand that these benefits are possible thanks to volume plugins that enable integration with external storage systems such as Amazon EBS, Google Cloud Persistent Disk, the Gluster file system, etc. Kubernetes ships its own volume plugins for this; Docker has a comparable mechanism in its volume plugins, see https://docs.docker.com/engine/extend/legacy_plugins/#volume-plugins.
Let me show some examples to explain these concepts better.
In the first example I will show how the storage class object is used. Recall that a storage class defines a storage abstraction that allocates a volume with certain characteristics when a claim is created.
A storage abstraction belonging to the “fast” class can be created in this way:
[root@kali storage_class]# vi storage-pv-fast.yaml
[root@kali storage_class]# kubectl create -f storage-pv-fast.yaml
[root@kali storage_class]# kubectl get storageclass
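The contents of storage-pv-fast.yaml are not shown above. A minimal sketch of a “fast” class on Google Cloud, assuming the in-tree GCE persistent disk provisioner and SSD-backed disks, could look like this; both the provisioner and the `type` parameter are assumptions about the author's setup.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd    # assumed in-tree GCE PD provisioner
parameters:
  type: pd-ssd                       # assumed disk type for a "fast" class
```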
The fast storage class is ready to dynamically create a new volume (PV) of the size configured in the persistent volume claim. It is ready, but no volume is created yet.
When the PVC is defined, the storage class above will create a new volume using the Google Cloud provisioner driver: a new physical disk will be created automatically.
[root@kali storage_class]# vi storage-pv-fast-claim.yaml
[root@kali storage_class]# kubectl create -f storage-pv-fast-claim.yaml
persistentvolumeclaim "pvc-fast" created
[root@kali storage_class]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc-fast Bound pvc-936fcf00-fc53-11e7-b1bc-42010a840041 10Gi RWO standard 17s
[root@kali storage_class]# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-936fcf00-fc53-11e7-b1bc-42010a840041 10Gi RWO Delete Bound default/pvc-fast standard 1m
In other words, the persistent volume claim is bound to a persistent volume backed by a new disk. It is as if a new volume had been created manually:
[root@kali storage_class]# gcloud compute disks list|grep 0041
gke-persistent-disk-tu-pvc-936fcf00-fc53-11e7-b1bc-42010a840041 us-central1 10 pd-standard READY
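The claim file storage-pv-fast-claim.yaml is not reproduced above. A minimal sketch of a claim with the name, size and access mode seen in the `kubectl get pvc` output might be the following. The `storageClassName` value is an assumption based on the text; the transcript reports the class `standard`, so the author's actual file may differ.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-fast
spec:
  accessModes:
    - ReadWriteOnce                  # shown as RWO in the output
  storageClassName: fast             # assumption, see note above
  resources:
    requests:
      storage: 10Gi
```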
The volume can be mounted inside a pod by referencing the PVC resource, not the PV.
Next comes a Pod called debian that mounts the volume above on the /data directory. The Pod simply writes something to a text file inside the mounted directory.
[root@kali storage_class]# vi debian.yaml
- name: task-pv-storage
- name: debian
- mountPath: "/data"
while true; do
date >> /data/test.txt;
echo "Hello from the second container" >> /data/test.txt;
[root@kali storage_class]# kubectl create -f debian.yaml
[root@kali storage_class]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
debian-01 1/1 Running 0 7m 10.40.0.7 gke-persistent-disk-tuto-default-pool-eedbf06b-k311
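Only fragments of debian.yaml survive above. A complete manifest consistent with those fragments could be sketched as follows; the image tag, the `sleep` interval and the closing `done` are assumptions added to make the loop well-formed, while the names, the mount path and the shell lines come from the fragments.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: debian-01
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: pvc-fast          # references the PVC, not the PV
  containers:
    - name: debian
      image: debian                  # assumed image
      volumeMounts:
        - mountPath: "/data"
          name: task-pv-storage
      command: ["/bin/sh", "-c"]
      args:
        - |
          while true; do
            date >> /data/test.txt;
            echo "Hello from the second container" >> /data/test.txt;
            sleep 1;
          done
```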
Inspecting the container shows that the volume is of bind type, as in Docker Swarm.
[root@kali storage_class]# docker inspect 69dc814dd6dc
"Mounts": [
Another advanced feature introduced by Kubernetes is the possibility of connecting directly to an external file server like NFS or GlusterFS.
It’s not necessary to mount the NFS volume on every node of the cluster, as would be done with Docker Swarm; it’s enough to declare a persistent volume of NFS type like this:
[root@kali kubernets]# vi nfs-volume.yaml
# FIXME: use the right IP
[root@kali kubernets]# kubectl create -f nfs-volume.yaml
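The body of nfs-volume.yaml is mostly missing above. A hedged sketch of an NFS-backed PersistentVolume matching the name, size, access mode and class that appear in the later `kubectl get pvc` output is shown below; the server address and export path are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                  # shown as RWX in the output
  storageClassName: nfs
  nfs:
    # FIXME: use the right IP
    server: 192.0.2.10               # placeholder address
    path: /exports/data              # hypothetical export path
```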
As we said, pods consume PVCs. This means we need to claim the volume before it can be used, and that is done with a persistent volume claim:
[root@kali kubernets]# vi nfs-volume-claim.yaml
[root@kali kubernets]# kubectl create -f nfs-volume-claim.yaml
[root@kali kubernets]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
nfs-volume-claim Bound nfs-volume 10Gi RWX nfs 3s4
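A claim that would bind to a volume like this one can be sketched as follows. The name, size, access mode and class are taken from the output above; the rest is the standard PVC layout and is an assumption about the author's file.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-volume-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs              # must match the class of the PV
  resources:
    requests:
      storage: 10Gi
```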
The PVC bound to the volume above is ready to be consumed by a pod. I will use a Deployment object, already explained in my previous article.
[root@kali kubernets]# vi httpd.yaml
- image: httpd:alpine
- containerPort: 80
- mountPath: "/usr/local/apache2/htdocs/"
- name: nfs-volume-storage
[root@kali kubernets]# kubectl create -f httpd.yaml
deployment "httpd" created
[root@kali kubernets]# kubectl describe deployment httpd
/usr/local/apache2/htdocs from nfs-volume-storage (rw)
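Only a few lines of httpd.yaml survive above. A complete Deployment consistent with them might look like the sketch below; the replica count, the `app: httpd` label and the `apps/v1` API version are assumptions, while the image, port, mount path and volume name come from the fragments.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
spec:
  replicas: 2                        # assumed replica count
  selector:
    matchLabels:
      app: httpd                     # assumed label
  template:
    metadata:
      labels:
        app: httpd
    spec:
      containers:
        - name: httpd
          image: httpd:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: "/usr/local/apache2/htdocs/"
              name: nfs-volume-storage
      volumes:
        - name: nfs-volume-storage
          persistentVolumeClaim:
            claimName: nfs-volume-claim
```

Because the claim uses ReadWriteMany, all replicas can mount the same NFS export at once.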
As you can see, volume management in Kubernetes is more functional and flexible than in Docker Swarm, which doesn’t offer anything special.
We can say that Kubernetes takes a step forward in abstracting storage space from business processes.
In the next article I will talk about the different approaches Docker Swarm and Kubernetes take to managing the network.