Docker Swarm vs Kubernetes, Part 1


Docker Swarm and Kubernetes are the most widespread container orchestrators for running microservices spread across the nodes of a cluster. They bring high availability, scalability, security and easier management to complex software architectures.

Docker Swarm reaches this goal with the service concept, a request to run containers; Kubernetes does it with different types of objects, like Deployment, ReplicationController, StatefulSet and Service, which extend the meaning of a Pod and are created through REST API calls sent to the API server, the core of the cluster.

Kubernetes is more powerful than Docker Swarm: it integrates natively with different cloud providers, supports many more plugins than other solutions, and offers many higher-level objects that allow managing software deployments in a more robust and efficient way.

This does not justify a lack of interest in Docker Swarm, which remains easier to configure and offers some interesting features, like the stack concept, that Kubernetes does not have.

In this first article I will compare the two approaches through different examples, using as a laboratory a Docker Swarm cluster configured on premise on two CentOS 7 servers, and Google Kubernetes Engine for the Kubernetes cluster.

I will not explain how to configure a Swarm or Kubernetes cluster. For Swarm you can find all the information at https://docs.docker.com/get-started/part4/#create-a-cluster; for Kubernetes read https://kubernetes.io/docs/getting-started-guides/scratch/.

Let's start discussing the most important differences: Swarm services versus Kubernetes Pods, and load balancing in Swarm and Kubernetes. Networking and volumes are addressed in the next article (Docker Swarm vs Kubernetes, Part 2).

Swarm services and Kubernetes Pods

Swarm mode, now native in Docker, is very simple to set up and configure. Its most important functionality is deploying application services across the Docker Engines belonging to the same cluster.

An application service is a request, or task, to orchestrate containers, sent from the swarm manager to the swarm workers. Before going inside the service definition, I will explain the difference between the basic units of the Swarm and Kubernetes worlds: containers and Pods. This is important because there is a lot of confusion about it.

A Docker container is a normal process running in the operating system, well isolated by different Linux namespaces: PID, network, IPC and mount.

Namespaces have been implemented in the Linux kernel since 2002 and are now used by Docker to isolate containers from the external world. Isolated means that the container can see only its own file system and network, and only the processes running inside it. Inter-process communication (IPC) is reserved to the processes of the container.

For example, it's possible to verify the presence of a network namespace for a Docker container in this simple way:

[root@swarm-01 ~]# docker run --name namespace_04 -itd --network=green busybox
fb43171af488661adf1a9946cd40f976febb82463282d72b075ad3f334415e62
[root@swarm-01 ~]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fb43171af488 busybox "sh" 2 seconds ago Up 1 seconds namespace_04
[root@swarm-01 ~]# docker exec -it fb43171af488 ip addr show
93: eth0@if94: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:41:0a:03 brd ff:ff:ff:ff:ff:ff
inet 10.85.10.3/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe41:a03/64 scope link
valid_lft forever preferred_lft forever
[root@swarm-01 ~]# ln -s /var/run/docker/netns /var/run/netns
[root@swarm-01 ~]# ip netns ls
80c64628846e (id: 5)
1-5eded2abc0 (id: 4)
2c7e95af1b56 (id: 0)
[root@swarm-01 ~]# ip netns exec 80c64628846e ip addr show
93: eth0@if94: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
link/ether 02:42:0a:41:0a:03 brd ff:ff:ff:ff:ff:ff
inet 10.85.10.3/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::42:aff:fe41:a03/64 scope link
valid_lft forever preferred_lft forever

As you can see above, the interface of the container belongs to a network namespace (80c64628846e). For each container it creates, Docker allocates a virtual Ethernet device (eth0@if94 in the example above) whose peer end is attached to a Linux bridge on the host (docker0 for containers on the default bridge network).
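
For a container attached to the default bridge network, the host side of these veth devices can be listed with a command like the following (a quick check, not tied to the overlay network used in the example above):

[root@swarm-01 ~]# ip link show master docker0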

The other namespaces are used in a similar way to let each container access only its own resources. You can find more information about namespaces in this wonderful article: http://crosbymichael.com/creating-containers-part-1.html.

After explaining the container concept, I can better introduce the service concept that is the core of Swarm.

A service, as Docker says, is the definition of the tasks to execute on the worker nodes. A task is a request to run a container, and it is assigned by the swarm manager to the worker nodes according to the number of replicas to scale to. Here is how to create an nginx service with two replicas in a cluster formed by two workers:

[root@swarm-02 docker-compose]# docker service create \
> --replicas 2 \
> --name nginx \
> --update-delay 10s \
> nginx:1.12.2

The manager assigns to the two workers the task of running an nginx container. In fact:

[root@swarm-02 docker-compose]# docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
uutsqgd1ez9f nginx.1 nginx:1.12.2 swarm-03 Running Preparing 7 minutes ago
prvaojfoaqcx nginx.2 nginx:1.12.2 swarm-02 Running Running 7 minutes ago

We can update the container image for nginx. The swarm manager applies the update to the nodes according to the policy described by --update-delay: each task is updated with a delay of 10 seconds from the previous one.

[root@swarm-02 docker-compose]# docker service update --image nginx:1.13.8 nginx
nginx
Since --detach=false was not specified, tasks will be updated in the background.
In a future release, --detach=false will become the default.

It’s possible to verify the new versions:

[root@swarm-02 docker-compose]# docker service ps nginx
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
teoquh562yn6 nginx.1 nginx:1.13.8 swarm-03 Running Running 42 seconds ago
uutsqgd1ez9f \_ nginx.1 nginx:1.12.2 swarm-03 Shutdown Shutdown 48 seconds ago
krbuqi67eeex nginx.2 nginx:1.13.8 swarm-02 Running Running 59 seconds ago
prvaojfoaqcx \_ nginx.2 nginx:1.12.2 swarm-02 Shutdown Shutdown about a minute ago

As shown above, the swarm cluster orchestrates services, i.e. groups of containers, across the nodes of the cluster, offering some useful functionality: scaling, automated deployment, service discovery and load balancing, which will be explained in the next paragraph.
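
Scaling, for example, is a single command; assuming the nginx service created above, the following raises it to four replicas:

[root@swarm-02 docker-compose]# docker service scale nginx=4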

Kubernetes runs Docker containers without changing the low-level Docker implementation, simply adding new objects that bring more benefits. The objects are represented as JSON objects, created through REST API calls to the API server of the cluster, and they already include the functionality provided by Swarm services.

The Kubernetes controllers that watch objects like Deployment, ReplicationController and StatefulSet orchestrate and schedule the running of containers by creating the basic and most important object, the Pod, which is run by the kubelet process present on every node of the cluster.

A Pod is a group of containers with shared network and storage. The network is shared because the veth interface of every container of a Pod is created in the network namespace of a pause container, which owns the namespace shared by all the containers of the Pod. The containers of a Pod can also share volumes, mounting them in different directories.

A Kubernetes Pod is the smallest deployable unit, orchestrated by Kubernetes as a single unit. The containers inside a Pod can communicate via localhost and share data. The first question that comes to mind is: why and when should we co-locate containers inside a Pod rather than spread them over different Pods, as the Docker philosophy would suggest?

The answer depends on the nature of the containers and the relation between them. If the containers must share a lot of data, or must always run on the same node, it can make sense to aggregate them in a single Pod; otherwise it is better to run them in different Pods.

Docker supports the docker-compose wrapper for running containers from YAML files; Kubernetes, with the kubectl utility, has the same approach. The job of kubectl is to convert the YAML file into a JSON object and send it with a REST request to the API server.

Here is how to create a Pod containing two Docker containers:

[root@kali persisten-vol-01]# vi pod.yaml
kind: Pod
apiVersion: v1
metadata:
  name: pod-test
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  containers:
    - name: nginx
      image: nginx
      ports:
        - containerPort: 80
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: task-pv-storage
    - name: debian
      image: debian
      volumeMounts:
        - mountPath: "/pod-data"
          name: task-pv-storage
      command: ["/bin/sh"]
      args:
        - "-c"
        - >
          while true; do
            date >> /pod-data/index.html;
            echo Hello from the second container >> /pod-data/index.html;
            sleep 1;
          done
[root@kali persisten-vol-01]# kubectl create -f pod.yaml
pod “pod-test” created

In the YAML file there is the definition of a Pod with two containers: nginx and debian. The two containers are managed as a single unit, and they share network and storage.

Each container mounts the same volume (I will explain this kind of volume later) in a different directory. The state of the Pod can be verified in this way:

[root@kali nfs]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-test 1/1 Running 0 1h

As shown above, the Pod appears as a single unit, like a single Docker container. With the following commands it's possible to check the two containers inside the Pod and log into them:

[root@kali persisten-vol-01]# kubectl get pods --all-namespaces -o=jsonpath='{range .items[*]}{"\n"}{.metadata.name}{":\t"}{range .spec.containers[*]}{.image}{", "}{end}{end}' | grep nginx-test
nginx-test: nginx, debian,
[root@kali persisten-vol-01]# kubectl exec -it nginx-test -c nginx -- /bin/bash
[root@nginx /]# exit
[root@kali persisten-vol-01]# kubectl exec -it nginx-test -c debian -- /bin/bash
[root@debian /]#

When a Pod dies it is not recovered, and when recreated it will take another IP address, just like a plain Docker container. To solve these problems, Docker Swarm provides parameters to put in the compose file, like restart_policy, while Kubernetes uses a dedicated object called ReplicationController.
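
For reference, in a version 3 compose file deployed with docker stack deploy, such a restart policy could be expressed like this (a minimal sketch, not taken from the examples of this article):

version: '3'
services:
  nginx:
    image: nginx:1.12.2
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3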

A ReplicationController ensures that a specified number of Pod replicas is running at any one time. If a Pod dies for some reason, it is automatically recreated. You can find more information on the Kubernetes site.

Another object, called Deployment, is now suggested instead of ReplicationController. The Deployment object is the modern way to deploy a set of containers while controlling the version history. I say containers, but it's more correct to say Pods, because these objects run Pods, which in turn run containers inside them.

Here is an example of how to use it to deploy a set of nginx containers on a Kubernetes cluster composed of three nodes:

[root@kali Deployment]# vi nginx-deployment.yaml
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          ports:
            - containerPort: 80
[root@kali Deployment]# kubectl create -f nginx-deployment.yaml
deployment “nginx-deployment” created
[root@kali Deployment]# kubectl get deployments
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 3 3 3 3 18s

As I said, the Deployment object, like the other objects, runs Pods. In fact it's always possible to check the Pods running inside the Deployment:

[root@kali Deployment]# kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deployment-431080787-0vd0x 1/1 Running 0 26s
nginx-deployment-431080787-c7nxg 1/1 Running 0 26s
nginx-deployment-431080787-s8jb3 1/1 Running 0 26s
[root@kali Deployment]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
nginx-deployment-431080787-0vd0x 1/1 Running 0 42s 10.8.1.7 gke-cluster-1-default-pool-47fff2b6-4ns1
nginx-deployment-431080787-c7nxg 1/1 Running 0 42s 10.8.0.7 gke-cluster-1-default-pool-47fff2b6-jmxb
nginx-deployment-431080787-s8jb3 1/1 Running 0 42s 10.8.1.6 gke-cluster-1-default-pool-47fff2b6-4ns1
[root@kali Deployment]# kubectl get rs
NAME DESIRED CURRENT READY AGE
nginx-deployment-431080787 3 3 3 1m

Here is the power of the Deployment object: it's possible to edit the manifest, changing the nginx version, and roll it out again:

[root@kali Deployment]# kubectl edit deployment/nginx-deployment
[root@kali Deployment]# kubectl rollout status deployment/nginx-deployment
Waiting for rollout to finish: 1 out of 3 new replicas have been updated…
Waiting for rollout to finish: 2 out of 3 new replicas have been updated…
Waiting for rollout to finish: 2 out of 3 new replicas have been updated…
Waiting for rollout to finish: 2 out of 3 new replicas have been updated…
Waiting for rollout to finish: 1 old replicas are pending termination…
Waiting for rollout to finish: 1 old replicas are pending termination…
Waiting for rollout to finish: 1 old replicas are pending termination…
deployment “nginx-deployment” successfully rolled out
[root@kali Deployment]# kubectl get pods --show-labels
NAME READY STATUS RESTARTS AGE LABELS
nginx-deployment-2078889897-0c0gt 1/1 Running 0 1m app=nginx,pod-template-hash=2078889897
nginx-deployment-2078889897-6ff1k 1/1 Running 0 30s app=nginx,pod-template-hash=2078889897
nginx-deployment-2078889897-xgprb 1/1 Running 0 20s app=nginx,pod-template-hash=2078889897
[root@kali Deployment]# kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx-deployment 3 3 3 3 5m
[root@kali Deployment]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.11.240.1 <none> 443/TCP 17m
[root@kali Deployment]# kubectl describe deployments
Name: nginx-deployment
Namespace: default
CreationTimestamp: Thu, 04 Jan 2018 18:40:17 +0100
Labels: app=nginx
Annotations: deployment.kubernetes.io/revision=2
Selector: app=nginx
Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.9.1
Port: 80/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: nginx-deployment-2078889897 (3/3 replicas created)
Events:
Type Reason Age From Message
---- ------ --- ---- -------
Normal ScalingReplicaSet 6m deployment-controller Scaled up replica set nginx-deployment-431080787 to 3
Normal ScalingReplicaSet 2m deployment-controller Scaled up replica set nginx-deployment-2078889897 to 1
Normal ScalingReplicaSet 1m deployment-controller Scaled down replica set nginx-deployment-431080787 to 2
Normal ScalingReplicaSet 1m deployment-controller Scaled up replica set nginx-deployment-2078889897 to 2
Normal ScalingReplicaSet 1m deployment-controller Scaled down replica set nginx-deployment-431080787 to 1
Normal ScalingReplicaSet 1m deployment-controller Scaled up replica set nginx-deployment-2078889897 to 3
Normal ScalingReplicaSet 1m deployment-controller Scaled down replica set nginx-deployment-431080787 to 0
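
As an alternative to editing the Deployment interactively, the same rollout can be triggered with a single command; assuming the Deployment above, something like this updates the image in one shot:

[root@kali Deployment]# kubectl set image deployment/nginx-deployment nginx=nginx:1.9.1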

Another useful object is the StatefulSet, intended for managing stateful applications and distributed systems. A good article about it is this one: http://blog.kubernetes.io/2017/01/running-mongodb-on-kubernetes-with-statefulsets.html. A minimal example is sketched below.
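
Just to give an idea of its shape, a minimal StatefulSet could look like the following sketch (names, image and storage size are assumptions, not taken from the article above):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.12.2
          ports:
            - containerPort: 80
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi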

For tasks to be scheduled like a crontab script, Kubernetes offers the CronJob object, which manages time-based jobs that run once at a specified point in time or repeatedly on a schedule. For parallel jobs it's possible to use the Job object.
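
A minimal CronJob could be sketched like this (the apiVersion has changed across Kubernetes releases; batch/v1beta1 is assumed here):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox
              args:
                - /bin/sh
                - -c
                - date; echo Hello from the CronJob
          restartPolicy: OnFailure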

This is the strength of Kubernetes: it provides many higher-level objects that help to better manage distributed applications. Every object uses other objects to orchestrate Pods, which contain containers. Unlike Swarm services, these objects don't load balance the traffic, because they only orchestrate the running of containers. For load balancing Kubernetes uses the Service object which, from this point of view, is similar to a Swarm service, but its only scope is balancing.

Load Balancing in Swarm and in Kubernetes

Load balancing in Swarm and Kubernetes is implemented by services. Docker Swarm offers a simple load balancing that forwards all the traffic directed to exposed ports to all the replicated containers. Kubernetes, in its usual style, offers different types of balancing, which will be explained in this paragraph.

Let's start with load balancing in Docker Swarm, taking the opportunity to speak about the stack concept, present in Swarm and not in Kubernetes.

The stack concept is present only in Swarm and represents a set of logically related services. It is a feature not present in Kubernetes, but this limit can be overcome with Helm, which allows packaging a set of Kubernetes resources with variables resolved at run time.

You might use a stack for many reasons: for example, an application formed by different containers could be split into different stacks to logically provide multiple environments like development, testing and preproduction. Unfortunately it's not possible to define dependencies between stacks, while in Kubernetes it's possible to do that simply with a Helm chart, as sketched below.
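
With Helm 2, for example, chart dependencies can be declared in a requirements.yaml file inside the chart; here is a minimal sketch with assumed chart names and versions:

dependencies:
  - name: mysql
    version: 0.3.0
    repository: https://kubernetes-charts.storage.googleapis.com
  - name: wordpress
    version: 0.8.0
    repository: https://kubernetes-charts.storage.googleapis.com

Running helm dependency update then downloads the dependent charts into the charts/ directory of the parent chart.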

The containers defined inside a stack are automatically converted into services running on the nodes of the cluster. This is different from a Kubernetes Pod, because the stack is not a single unit like the Pod, but a set of services that share a network.

Every service, with its containers, is accessible on a layer-two overlay network (implemented with the VXLAN protocol) present on every node of the cluster. Swarm manager nodes assign each service in the swarm a unique DNS name and load balance the running containers.

Here is how to create a swarm stack formed by two services reachable on an internal network 10.0.0.0/24 (another stack would get another subnet).

[root@swarm-02 ~]# docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
flc4fkfk7ds1kl1zqey7xigln0 swarm-02 Ready Active Reachable
lqi9fnu3oi8ak5tbh84s8terq * swarm-03 Ready Active Leader
[root@kali-02 docker-compose]# vi compose-dev.yaml
version: '3'
services:
  nginx:
    image: /nginx
    depends_on:
      - "haproxy"
    volumes:
      - /gfs/dev/nginx/conf.d:/etc/nginx/conf.d
    deploy:
      replicas: 1
    ports:
      - "480:80"
      - "4443:443"
  apache:
    image: apache
    deploy:
      replicas: 1
    volumes:
      - /gfs/dev/apache/conf.d:/etc/apache2/conf.d
[root@kali-02 docker-compose]# docker stack deploy --compose-file /gfs/docker-compose/compose_dev.yml stack-dev --with-registry-auth
Creating network stack-dev_default
Creating service stack-dev_nginx
Creating service stack-dev_apache
Creating service stack-dev_bind
[root@kali-02 docker-compose]# docker stack ps stack-dev
ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
zteim31v034p stack-dev_apache apache swarm-02.test Running Running 7 seconds ago
fp0nb4hde7ok stack-dev_nginx nginx swarm-03.test Running Running 7 seconds ago

As shown below, the units of the stack are converted into services, and their names are automatically resolved by an internal DNS:

[root@kali-02 docker-compose]# docker service ls
ID NAME MODE REPLICAS IMAGE PORTS
3fneyrpk9zx6 stack-dev_apache replicated 1/1 apache
icus5w8i9xda stack-dev_nginx replicated 1/1 /nginx *:480->80/tcp,*:4443->443/tcp
[root@kali-02 docker-compose]# netstat -an|grep 453
tcp6 0 0 :::453 :::* LISTEN
udp6 0 0 :::453 :::*
unix 2 [ ACC ] STREAM LISTENING 42453 /var/run/docker/metrics.sock
[root@kali-02 docker-compose]# netstat -an |grep 4443
tcp6 0 0 :::4443 :::* LISTEN
[root@kali-02 docker-compose]# netstat -an |grep 480
tcp6 0 0 :::480 :::* LISTEN
[root@kali-02 docker-compose]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
6abbc0ffc65e apache “/bin/sh -c ‘rm -r…” About a minute ago Up About a minute stack-dev_apache.1.zteim31v034phwq1fzcaq0g1l
[root@kali-02 docker-compose]# docker exec -it 6abbc0ffc65e /bin/sh
/ # ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
18: eth0@if19: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue state UP
link/ether 02:42:0a:00:00:05 brd ff:ff:ff:ff:ff:ff
inet 10.0.0.5/24 scope global eth0
valid_lft forever preferred_lft forever
inet 10.0.0.4/32 scope global eth0
valid_lft forever preferred_lft forever
20: eth1@if21: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
link/ether 02:42:ac:12:00:03 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.3/16 scope global eth1
valid_lft forever preferred_lft forever
/ # ping stack-dev_apache
PING stack-dev_apache (10.0.0.4): 56 data bytes
64 bytes from 10.0.0.4: seq=0 ttl=64 time=0.068 ms
64 bytes from 10.0.0.4: seq=1 ttl=64 time=0.080 ms
--- stack-dev_apache ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.068/0.074/0.080 ms
/ # exit

Every node of the cluster runs a load balancing proxy that forwards the traffic directed to the exposed ports to all the replicas of that service. The nginx service is running on a single node, but it is reachable from the other nodes too, thanks to the load balancing proxy running on every node of the cluster. This can be verified in this way:

[root@swarm-02 compose]# curl -v --silent http://swarm-02:480 2>&1 | grep HTTP
> GET / HTTP/1.1
< HTTP/1.1 302 Found
[root@swarm-02 compose]# curl -v --silent http://swarm-03:480 2>&1 | grep HTTP
> GET / HTTP/1.1
< HTTP/1.1 302 Found

Any node of the cluster can be used as a load balancer, but no external load balancer is provided. I suggest creating a reverse proxy in front of the cluster in order to manage the traffic towards all the stacks. For example, you can create a virtual server like https://dev.mydomain.com and another like https://pre.mydomain.com that proxy to the right exposed ports as defined in the YAML file, as in the sketch below.
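
A minimal nginx virtual server for the development stack could look like the following sketch (hypothetical hostnames and certificate paths; it simply forwards the traffic to the port published by the stack, reachable on any swarm node thanks to the routing mesh):

server {
    listen 443 ssl;
    server_name dev.mydomain.com;
    # hypothetical certificate paths
    ssl_certificate     /etc/nginx/ssl/dev.mydomain.com.crt;
    ssl_certificate_key /etc/nginx/ssl/dev.mydomain.com.key;

    location / {
        # 480 is the port published by stack-dev_nginx in the compose file above
        proxy_pass http://swarm-02:480;
    }
}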

Kubernetes unfortunately doesn't have the stack concept, but it's possible to reach a similar result by using the Service object appropriately, even if the resulting solution is not isolated like a swarm stack.

The Service object in Kubernetes is an abstraction which defines a logical set of Pods, while in Swarm a service defines a logical set of containers. It's a REST object, like a Pod and the other objects. Like all REST objects, a Service definition can be POSTed to the apiserver to create a new instance.

It must be explicitly created, as in this example:

[root@kali compose]# vi my-service.yaml
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376

The configuration above creates a new Service that balances the traffic received on port 80 across all the Pods with label app: MyApp, forwarding it to their targetPort 9376.

For this purpose, every node in a Kubernetes cluster runs kube-proxy, which is responsible for implementing a form of virtual IP for Services. There are different proxy modes:

  1. iptables mode: kube-proxy installs iptables rules which capture the traffic to the Service's clusterIP (which is virtual) and port, and redirect it to one of the Service's backend Pods.
  2. userspace mode: a random port is opened on every node of the cluster, where the proxy listens; it receives the traffic and forwards it to the backend Pods.

Compared to Docker, there are different types of Service:

  1. ClusterIP: it can be used only within the cluster scope; the Service is reachable only from inside the cluster.
  2. LoadBalancer: it exposes the Service externally using the cloud provider's load balancer, through the provider interface defined by Kubernetes.
  3. NodePort: it's similar to the way services are exposed in Docker Swarm. A port is opened on every node of the cluster and the traffic towards this port is NATted to the backend Pods (see the sketch after this list).
  4. External: in this case the management of the virtual IP address is the responsibility of the administrator.
  5. Ingress: strictly speaking not a Service type but a separate object, an Ingress can provide load balancing, SSL termination and name-based virtual hosting.
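
A NodePort variant of the my-service definition above might look like this (the nodePort value is an assumption; if omitted, Kubernetes picks one from the 30000-32767 range):

kind: Service
apiVersion: v1
metadata:
  name: my-service-nodeport
spec:
  type: NodePort
  selector:
    app: MyApp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 9376
      nodePort: 30080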

I will explain how the first two types of Service work by implementing an example provided by Google at https://cloud.google.com/kubernetes-engine/docs/tutorials/persistent-disk.

I will work on a cluster formed by three nodes, directly on Google Cloud:

[root@kali ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
gke-cluster-test-default-pool-a18beb44-m6dc Ready <none> 9m v1.7.11-gke.1
gke-cluster-test-default-pool-a18beb44-n2wm Ready <none> 9m v1.7.11-gke.1
gke-cluster-test-default-pool-a18beb44-st4b Ready <none> 9m v1.7.11-gke.1

Every Pod running on a node will get an IP address belonging to a well-defined subnet. In the Google network it's possible to see the following static routes:

[root@kali ~]# gcloud compute routes list
gke-cluster-test-80b95a23-9e653b33-f478-11e7-b713-42010a800018 default 10.36.2.0/24 us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-n2wm 1000
gke-cluster-test-80b95a23-9eaab7c6-f478-11e7-b713-42010a800018 default 10.36.0.0/24 us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-st4b 1000
gke-cluster-test-80b95a23-9ebcaeb4-f478-11e7-b713-42010a800018 default 10.36.1.0/24 us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-m6dc 1000

These routes allow any Pod to speak with a Pod running on another node, while the rest of the GCP network doesn't need to know anything about the Pod addresses. Each node is assigned its own /24 subnet (10.36.0.0/24, 10.36.1.0/24 and 10.36.2.0/24 in this cluster), and every Pod scheduled on a node gets an IP address from that node's subnet.

If the cluster is on premise, different strategies must be implemented. A possible solution is to use the VXLAN protocol, like Swarm does. You can find more details at https://kubernetes.io/docs/getting-started-guides/scratch/.

After this necessary introduction to the Kubernetes cluster, let's implement and explain the example provided by Google. I start by creating the Pod through a Deployment object, already described above (I don't show the commands for creating the Google disk and the secrets used; you can find them directly in the Google example).

[root@kali ~]# vi mysql.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:5.6
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql
                  key: password
          ports:
            - containerPort: 3306
              name: mysql
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          gcePersistentDisk:
            pdName: mysql-disk
            fsType: ext4
[root@kali ~]# kubectl create -f mysql.yaml
deployment “mysql” created
[root@kali ~]# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
mysql-3368603707-hcrxq 1/1 Running 0 18m 10.36.2.5 gke-cluster-test-default-pool-a18beb44-n2wm
[root@kali ~]# kubectl exec -it mysql-3368603707-hcrxq -- ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue state UP group default
link/ether 0a:58:0a:24:02:05 brd ff:ff:ff:ff:ff:ff
inet 10.36.2.5/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::fc0e:dcff:fe89:66da/64 scope link
valid_lft forever preferred_lft forever

Now it's possible to create the service. The mysql service will be of type ClusterIP, because it must be reachable only from inside the cluster. The virtual IP address created is not routable: this is the default behaviour.

[root@kali ~]# vi mysql-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  type: ClusterIP
  ports:
    - port: 3306
  selector:
    app: mysql
[root@kali ~]# kubectl create -f mysql-service.yaml
[root@kali ~]# kubectl get pod
NAME READY STATUS RESTARTS AGE
mysql-3368603707-hcrxq 1/1 Running 0 30s
[root@kali ~]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.39.240.1 443/TCP 5m
mysql ClusterIP 10.39.251.240 3306/TCP 29s
[root@kali ~]# kubectl describe svc mysql
Name: mysql
Namespace: default
Labels: run=mysql
Annotations: <none>
Selector: run=mysql
Type: ClusterIP
IP: 10.39.251.240
Port: <unset> 3306/TCP
TargetPort: 3306/TCP
Endpoints: 10.36.2.5:3306
Session Affinity: None
Events: <none>

As shown above, a virtual cluster IP has been created and it is available on every node of the cluster. This virtual IP is managed by iptables rules. In fact, on every node of the cluster it's possible to see the following destination NAT rules, which translate the cluster virtual IP address into the IP address of the mysql container.

gke-cluster-test-default-pool-a18beb44-m6dc ~ # iptables -t nat --list
Chain KUBE-SEP-F4VH6QHAVWHHLQG2 (1 references)
target prot opt source destination
KUBE-MARK-MASQ all -- 10.36.2.5 anywhere /* default/mysql: */
DNAT tcp -- anywhere anywhere /* default/mysql: */ tcp to:10.36.2.5:3306
Chain KUBE-SVC-M7XME3WTB36R42AM (1 references)
target prot opt source destination
KUBE-SEP-F4VH6QHAVWHHLQG2 all -- anywhere anywhere /* default/mysql: */
KUBE-SVC-M7XME3WTB36R42AM tcp -- anywhere 10.39.251.240 /* default/mysql: cluster IP */ tcp dpt:mysql

The clusterIP is reachable from any Pod, including the mysql Pod itself, where it's possible to telnet to it. Remember that the cluster DNS server watches the Kubernetes API for new Services and creates a set of DNS records for each one. In this case a record for the new mysql Service is created (the same approach used in Docker Swarm).

[root@kali ~]# kubectl exec -it mysql-3368603707-hcrxq -- /bin/bash
root@mysql-3368603707-hcrxq:/# telnet mysql 3306
Trying 10.39.251.240…
Connected to mysql.default.svc.cluster.local.
Escape character is ‘^]’.
J
5.6.38$+M#eSO€sg,CM:wm<!/fmysql_native_password

Now we can create the wordpress Pod and its related service, which must be reachable from outside and will therefore be of LoadBalancer type. This load balancer will be created on Google Cloud through the proper plugin. A good article about how the load balancer works can be found at https://medium.com/google-cloud/kubernetes-from-load-balancer-to-pod-3f2399637b0c.

Let's start by creating the wordpress Deployment that runs the related Pod.

[root@kali ~]# vi wordpress.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - image: wordpress
          name: wordpress
          env:
            - name: WORDPRESS_DB_HOST
              value: mysql:3306
            - name: WORDPRESS_DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql
                  key: password
          ports:
            - containerPort: 80
              name: wordpress
          volumeMounts:
            - name: wordpress-persistent-storage
              mountPath: /var/www/html
      volumes:
        - name: wordpress-persistent-storage
          gcePersistentDisk:
            pdName: wordpress-disk
            fsType: ext4
[root@kali ~]# kubectl create -f wordpress.yaml
deployment "wordpress" created
[root@kali ~]# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE
mysql-3368603707-84vv5 1/1 Running 0 1m 10.36.2.5 gke-cluster-test-default-pool-a18beb44-n2wm
wordpress-3479901767-tkqf7 1/1 Running 0 35s 10.36.2.6 gke-cluster-test-default-pool-a18beb44-n2wm

You can verify that the mysql service is reachable from the wordpress Pod:

[root@kali ~]# kubectl exec -it wordpress-3479901767-tkqf7 -- /bin/bash
root@wordpress-3479901767-tkqf7:/var/www/html#
root@wordpress-3479901767-tkqf7:/var/www/html#
root@wordpress-3479901767-tkqf7:/var/www/html# apt-get update
Get:1 http://security.debian.org stretch/updates InRelease [63.0 kB]
Ign:2 http://cdn-fastly.deb.debian.org/debian stretch InRelease
Get:3 http://cdn-fastly.deb.debian.org/debian stretch-updates InRelease [91.0 kB]
Get:4 http://cdn-fastly.deb.debian.org/debian stretch Release [118 kB]
Get:5 http://security.debian.org stretch/updates/main amd64 Packages [333 kB]
Get:6 http://cdn-fastly.deb.debian.org/debian stretch Release.gpg [2434 B]
Get:7 http://cdn-fastly.deb.debian.org/debian stretch-updates/main amd64 Packages [6499 B]
Get:8 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 Packages [9531 kB]
Fetched 10.1 MB in 1s (5362 kB/s)
Reading package lists… Done
root@wordpress-3479901767-tkqf7:/var/www/html# apt-get install telnet
Reading package lists… Done
Building dependency tree
Reading state information… Done
The following additional packages will be installed:
netbase
The following NEW packages will be installed:
netbase telnet
0 upgraded, 2 newly installed, 0 to remove and 2 not upgraded.
Need to get 91.1 kB of archives.
After this operation, 206 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 netbase all 5.4 [19.1 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian stretch/main amd64 telnet amd64 0.17-41 [72.0 kB]
Fetched 91.1 kB in 0s (483 kB/s)
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package netbase.
(Reading database … 13090 files and directories currently installed.)
Preparing to unpack …/archives/netbase_5.4_all.deb …
Unpacking netbase (5.4) …
Selecting previously unselected package telnet.
Preparing to unpack …/telnet_0.17-41_amd64.deb …
Unpacking telnet (0.17-41) …
Setting up netbase (5.4) …
Setting up telnet (0.17-41) …
update-alternatives: using /usr/bin/telnet.netkit to provide /usr/bin/telnet (telnet) in auto mode
update-alternatives: warning: skip creation of /usr/share/man/man1/telnet.1.gz because associated file /usr/share/man/man1/telnet.netkit.1.gz (of link group telnet) doesn’t exist
root@wordpress-3479901767-tkqf7:/var/www/html# telnet mysql 3306
Trying 10.39.251.240…
Connected to mysql.default.svc.cluster.local.
Escape character is ‘^]’.

Let's create the load balancer service. Remember that Kubernetes has the concept of a cloud provider, a module which provides an interface for managing TCP load balancers. Google implements this interface to automatically create the load balancer on Google Cloud.
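
The wordpress-service.yaml file used below is not shown in this excerpt; based on the labels used above it should look roughly like the following sketch (the exact file is in the Google example):

apiVersion: v1
kind: Service
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: wordpress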

[root@kali ~]# kubectl create -f wordpress-service.yaml
[root@kali ~]# kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.39.240.1 443/TCP 18m
mysql ClusterIP 10.39.251.240 3306/TCP 10m
wordpress LoadBalancer 10.39.251.240 35.225.202.61 80:30350/TCP 2m

A new load balancer has been created. The traffic arriving at 35.225.202.61:80 is forwarded to the nodes on port 30350, where kube-proxy is listening and balancing the traffic towards the Pods. In fact, on every node you can find kube-proxy listening:

gke-cluster-test-80b95a23-9e653b33-f478-11e7-b713-42010a800018 ~ # netstat -anp | grep 30350
tcp6 0 0 :::30350 :::* LISTEN 1836/kube-proxy

For details about the load balancer and the balanced pool you can use the following commands:

[root@kali ~]# gcloud compute forwarding-rules list
NAME REGION IP_ADDRESS IP_PROTOCOL TARGET
a5a33aabbf60c11e78a8a42010a80002 us-central1 35.225.202.61 TCP us-central1/targetPools/a5a33aabbf60c11e78a8a42010a80002
[root@kali ~]# gcloud compute forwarding-rules describe a5a33aabbf60c11e78a8a42010a80002 --region us-central1
IPAddress: 35.225.202.61
IPProtocol: TCP
creationTimestamp: ‘2018-01-10T05:44:42.230-08:00’
description: ‘{“kubernetes.io/service-name”:”default/wordpress”}’
id: ‘777115605059479077’
kind: compute#forwardingRule
loadBalancingScheme: EXTERNAL
name: a5a33aabbf60c11e78a8a42010a80002
portRange: 80-80
region: https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/regions/us-central1
selfLink: https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/regions/us-central1/forwardingRules/a5a33aabbf60c11e78a8a42010a80002
target: https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/regions/us-central1/targetPools/a5a33aabbf60c11e78a8a42010a80002
[root@kali1 ~]# gcloud compute target-pools describe --region us-central1 a5a33aabbf60c11e78a8a42010a80002
creationTimestamp: ‘2018-01-10T05:44:38.451-08:00’
description: ‘{“kubernetes.io/service-name”:”default/wordpress”}’
healthChecks:
- https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/global/httpHealthChecks/k8s-43dc8aa4ea2f6134-node
id: '1619866515853256233'
instances:
- https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/zones/us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-m6dc
- https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/zones/us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-n2wm
- https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/zones/us-central1-a/instances/gke-cluster-test-default-pool-a18beb44-st4b
kind: compute#targetPool
name: a5a33aabbf60c11e78a8a42010a80002
region: https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/regions/us-central1
selfLink: https://www.googleapis.com/compute/v1/projects/ow-conectus-demo/regions/us-central1/targetPools/a5a33aabbf60c11e78a8a42010a80002
sessionAffinity: NONE

This type of service is very useful if the objective is to expose the service to the internet, but you should pay attention to the pricing: in my opinion cloud load balancers are quite expensive.

To save money, I suggest using a classic external IP address NATted to an internal IP, where you can install and configure nginx or HAProxy as reverse proxy and load balancer. In this scenario you get the configuration flexibility of these products, certainly superior to the Google load balancer.

Conclusion

I believed in Swarm, but I could not go to production with it because of many issues not resolved by Docker. After getting to know Kubernetes well, I can say that comparing Kubernetes with Swarm makes little sense: Kubernetes is a Ferrari, Swarm is a modest little car. That's all.

This is another article about the same topic: Docker Swarm vs Kubernetes, Part 2.

Don’t hesitate to contact me for any questions.
