Kubernetes the easy way part 01


In this series of articles I will describe how to install and configure a Kubernetes cluster in high availability, suitable for production use, following an easy approach as opposed to the hard way proposed by https://github.com/kelseyhightower/kubernetes-the-hard-way, which nevertheless remains a good way to understand and learn what happens behind the scenes of a Kubernetes cluster's infrastructure.

As a first easy solution, I have chosen an Ubuntu deployment strategy that uses MicroK8s to deploy the Kubernetes cluster in high availability; starting from last year, MicroK8s HA is ready for production, as described at https://microk8s.io/high-availability. In the next article, continuing in the same spirit, I will dig into another Ubuntu solution based on Juju.

Aware of how important it is to keep control of an infrastructure in production, I will also describe how the MicroK8s architecture works, covering its limits and benefits.

The reference cluster is formed by 3 virtual machines, an haproxy in front of the cluster, used to expose the ingress services on the standard http and https ports, and a nfs server that provides a network file system to mount and share inside the cluster, all running Ubuntu 20.04.2 LTS.

Let me start with the microk8s installation.

Microk8s HA Installation

Before starting with the installation, the hostname of every virtual machine must be resolvable via DNS. If that is not possible, as in my case, I suggest adding the following entries to the /etc/hosts file of each node:

root@microk8s01:~# vi /etc/hosts
10.10.10.1 microk8s01
10.10.10.2 microk8s02
10.10.10.3 microk8s03

The apiserver, in order to allow running commands inside the pods via kubectl exec, must connect to the kubelet on port 10250, and it needs to resolve the hostname of the node where the pod is running. Without this, kubectl exec does not work when the pod runs on a node different from the one where the command is executed.
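
A quick, hedged way to double-check that resolution works is to query the system resolver (which also reads /etc/hosts) from every node:

# run on each node: every hostname must resolve to the expected address
for h in microk8s01 microk8s02 microk8s03; do
  getent hosts "$h"
done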

Moreover, in order to use the usual administration commands for the cluster, it's also necessary to set these aliases:

root@microk8s01:~# alias helm='microk8s helm3'
root@microk8s01:~# alias kubectl='microk8s kubectl'
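
These aliases live only in the current shell; a minimal sketch to make them persistent, assuming an interactive bash shell on each node:

echo "alias kubectl='microk8s kubectl'" >> ~/.bashrc
echo "alias helm='microk8s helm3'" >> ~/.bashrc
source ~/.bashrc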

Remember that helm, even if not used in this article, has become a de facto standard for Kubernetes application management.

Let's go on with the installation of MicroK8s on each node of the cluster, via the snap package manager – a way to package applications, like rpm, with everything installed under the directory /var/snap. The latest stable release will be installed:

root@microk8s01:~# snap install microk8s --classic --channel=1.21/stable
2021-05-21T07:54:19Z INFO Waiting for automatic snapd restart...
microk8s (1.21/stable) v1.21.1 from Canonical✓ installed
root@microk8s01:~# snap info microk8s
name:      microk8s
summary:   Lightweight Kubernetes for workstations and appliances
publisher: Canonical✓
store-url: https://snapcraft.io/microk8s
contact:   https://github.com/ubuntu/microk8s
license:   unset
description: |
  MicroK8s is the smallest, simplest, pure production Kubernetes for clusters, laptops, IoT and
  Edge, on Intel and ARM. One command installs a single-node K8s cluster with carefully selected
  add-ons on Linux, Windows and macOS.  MicroK8s requires no configuration, supports automatic
  updates and GPU acceleration. Use it for offline development, prototyping, testing, to build your
  CI/CD pipeline or your IoT apps.
commands:
  - microk8s.add-node
  - microk8s.cilium
  - microk8s.config
  - microk8s.ctr
  - microk8s.dashboard-proxy
  - microk8s.dbctl
  - microk8s.disable
  - microk8s.enable
  - microk8s.helm
  - microk8s.helm3
  - microk8s.inspect
  - microk8s.istioctl
  - microk8s.join
  - microk8s.juju
  - microk8s.kubectl
  - microk8s.leave
  - microk8s.linkerd
  - microk8s
  - microk8s.refresh-certs
  - microk8s.remove-node
  - microk8s.reset
  - microk8s.start
  - microk8s.status
  - microk8s.stop
services:
  microk8s.daemon-apiserver:            simple, enabled, inactive
  microk8s.daemon-apiserver-kicker:     simple, enabled, active
  microk8s.daemon-cluster-agent:        simple, enabled, active
  microk8s.daemon-containerd:           simple, enabled, active
  microk8s.daemon-control-plane-kicker: simple, enabled, inactive
  microk8s.daemon-controller-manager:   simple, enabled, inactive
  microk8s.daemon-etcd:                 simple, enabled, inactive
  microk8s.daemon-flanneld:             simple, enabled, inactive
  microk8s.daemon-kubelet:              simple, enabled, inactive
  microk8s.daemon-kubelite:             simple, enabled, active
  microk8s.daemon-proxy:                simple, enabled, inactive
  microk8s.daemon-scheduler:            simple, enabled, inactive

At this stage, each node is an independent single-node MicroK8s cluster. The commands below can be executed, with the same output, on any node.

root@microk8s01:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s01 NotReady 20s v1.21.1-3+1f02fea99e2268
root@microk8s01:~# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-f7868dd95-zstn4 1/1 Running 0 30s
calico-node-z4kqw 1/1 Running 0 30s

As shown above, the calico plugin is used as container network interface: it assigns the IP addresses to containers and configures the routing for inter-pod communication. There are two ways to configure the calico data plane: IP-in-IP and VXLAN tunneling. MicroK8s, as shown below, uses the VXLAN tunnel (for more information about calico you can read my article https://www.securityandit.com/network/kubernetes-network-cluster-architecture/):

microk8s01:~# kubectl get IPPool
NAME AGE
default-ipv4-ippool 11m
root@microk8s01:~# kubectl describe IPPool default-ipv4-ippool
Name: default-ipv4-ippool
Namespace:
Labels:
Annotations: projectcalico.org/metadata: {"uid":"12623a0d-f763-4ec8-b45a-51588dda5e4f","creationTimestamp":"2021-05-21T08:38:12Z"}
API Version: crd.projectcalico.org/v1
Kind: IPPool
Metadata:
Creation Timestamp: 2021-05-21T08:38:12Z
Generation: 1
Managed Fields:
API Version: crd.projectcalico.org/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:projectcalico.org/metadata:
f:spec:
.:
f:blockSize:
f:cidr:
f:ipipMode:
f:natOutgoing:
f:nodeSelector:
f:vxlanMode:
Manager: Go-http-client
Operation: Update
Time: 2021-05-21T08:38:12Z
Resource Version: 370
Self Link: /apis/crd.projectcalico.org/v1/ippools/default-ipv4-ippool
UID: 172a3dc1-4ca5-45f7-83b5-1d395eb974e6
Spec:
Block Size: 26
Cidr: 10.1.0.0/16
Ipip Mode: Never
Nat Outgoing: true
Node Selector: all()
Vxlan Mode: Always
Events:

Calico is my preferred network plugin because it's stable and, unlike other CNIs such as flannel, it implements network policies, which are important for cluster segregation.
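
As a hedged illustration of what network policies make possible (a sketch, not something required by this setup; the myapp namespace is made up): the policy below would restrict ingress traffic so that only pods of the same namespace can talk to each other.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace-only
  namespace: myapp
spec:
  podSelector: {}            # applies to every pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}        # only pods of the same namespace may connect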

MicroK8s offers the possibility to enable different modules, which are of two types: plain binaries, like helm3, that must be enabled on every node, and Kubernetes resources, that need to be enabled on only one node of the cluster. Before forming the cluster, I suggest enabling the dns module on the first node, microk8s01, which will be used as the starting point of the cluster. Theoretically this module could also be enabled after forming the cluster.

root@microk8s01:~# microk8s enable dns
Enabling DNS
Applying manifest
serviceaccount/coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
clusterrole.rbac.authorization.k8s.io/coredns created
clusterrolebinding.rbac.authorization.k8s.io/coredns created
Restarting kubelet
DNS is enabled
root@microk8s01:~# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7f9c69c78c-wjxzq 1/1 Running 0 8m9s
calico-kube-controllers-f7868dd95-zstn4 1/1 Running 0 28m
calico-node-z4kqw 1/1 Running 0 28m

CoreDNS mounts internally a configmap where the DNS forwarders are specified: it could be useful to replace them with internal DNS servers:

root@microk8s01:~# kubectl describe configmap coredns -n kube-system|grep forw
forward . 8.8.8.8 8.8.4.4
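
A hedged sketch of how the forwarders could be changed (10.10.10.53 is just a placeholder for an internal DNS server): edit the configmap and adjust the forward line of the Corefile. CoreDNS should pick up the change if the reload plugin is enabled in the Corefile; otherwise restart the coredns pod.

root@microk8s01:~# kubectl -n kube-system edit configmap coredns
# inside the Corefile, replace:
#   forward . 8.8.8.8 8.8.4.4
# with, for example:
#   forward . 10.10.10.53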

For creating the cluster, on the first node it's enough to generate the join token:

root@microk8s01:~# microk8s add-node
From the node you wish to join to this cluster, run the following:
microk8s join 10.10.10.1:25000/68f8dce22bc0168b29c773f2b8c5484b/b8fe1583b804
If the node you are adding is not reachable through the default interface you can use one of the following:
microk8s join 10.10.10.1:25000/68f8dce22bc0168b29c773f2b8c5484b/b8fe1583b804

On the second node, microk8s02, it's easy to join in this way:

root@microk8s02:~# snap install microk8s --classic --channel=1.21/stable
microk8s (1.21/stable) v1.21.1 from Canonical✓ installed
root@microk8s02:~# microk8s join 10.10.10.1:25000/68f8dce22bc0168b29c773f2b8c5484b/b8fe1583b804
Contacting cluster at 10.10.10.1
Waiting for this node to finish joining the cluster. ..

The approach used for adding nodes, based on a bootstrap token, is well explained in the Kubernetes documentation: https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/. After repeating the same process for the third node, we have a cluster formed by three nodes which is, as explained later, the minimum number of nodes required.

root@microk8s01:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
microk8s01 Ready 55m v1.21.1-3+1f02fea99e2268
microk8s02 Ready 11m v1.21.1-3+1f02fea99e2268
microk8s03 Ready 5m52s v1.21.1-3+1f02fea99e2268
root@microk8s01:~# kubectl get pods
No resources found in default namespace.
root@microk8s01:~# kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-7f9c69c78c-wjxzq 1/1 Running 0 35m
calico-kube-controllers-f7868dd95-zstn4 1/1 Running 0 55m
calico-node-tz7hk 1/1 Running 0 12m
calico-node-gb5dw 1/1 Running 0 12m
calico-node-2kgj5 1/1 Running 0 6m

It's time to enable RBAC authorization (https://kubernetes.io/docs/reference/access-authn-authz/rbac/), the Kubernetes way of assigning roles to users and groups that have authenticated to the cluster. In MicroK8s the default user is admin, authenticated by token (https://kubernetes.io/docs/reference/access-authn-authz/authentication/).

root@microk8s01:~# microk8s enable rbac
Enabling RBAC
Reconfiguring apiserver
Adding argument --authorization-mode to nodes.
Configuring node 10.10.10.1
Configuring node 10.10.10.2
Restarting nodes.
Configuring node 10.10.10.1
Configuring node 10.10.10.2
RBAC is enabled

The command reconfigures the apiserver of every node of the cluster, adding the RBAC authorization mode to its startup arguments.
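
As a hedged example of what RBAC now enforces (the namespace, role name and user are made up for illustration): a Role and a RoleBinding granting a developer read-only access to pods in a single namespace.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
  namespace: myapp
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-reader-binding
  namespace: myapp
subjects:
- kind: User
  name: developer        # hypothetical user
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io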

The next step is to have a way to expose Kubernetes services outside of the cluster. As I explained in more depth in my article https://www.securityandit.com/network/kubernetes-service-nodeport-ingress-and-loadbalancer/, there are two solutions: one ingress based and the other NodePort based. The ingress based solution is more flexible, scalable and secure, and it should be present in any production Kubernetes cluster.

Ingress Installation in microk8s

The ingress controller is enabled with 'microk8s enable ingress', but unfortunately nginx listens only on localhost for the http and https ports because, despite the presence of hostPort, which forces it to listen in the host network namespace, the nginx publish-status-address parameter is set to 127.0.0.1.

In order to expose the controller, I created a NodePort service for the ingress daemon set and used an haproxy, running on an external machine, to proxy the http and https ports to the NodePorts randomly assigned by Kubernetes.

root@microk8s01:~# microk8s enable ingress
Enabling Ingress
ingressclass.networking.k8s.io/public created
namespace/ingress created
serviceaccount/nginx-ingress-microk8s-serviceaccount created
clusterrole.rbac.authorization.k8s.io/nginx-ingress-microk8s-clusterrole created
role.rbac.authorization.k8s.io/nginx-ingress-microk8s-role created
clusterrolebinding.rbac.authorization.k8s.io/nginx-ingress-microk8s created
rolebinding.rbac.authorization.k8s.io/nginx-ingress-microk8s created
configmap/nginx-load-balancer-microk8s-conf created
configmap/nginx-ingress-tcp-microk8s-conf created
configmap/nginx-ingress-udp-microk8s-conf created
daemonset.apps/nginx-ingress-microk8s-controller created
Ingress is enabled
root@microk8s01:~# kubectl get pods -n ingress
NAME READY STATUS RESTARTS AGE
nginx-ingress-microk8s-controller-jhtxq 1/1 Running 0 66s
nginx-ingress-microk8s-controller-dh7rh 1/1 Running 0 66s
nginx-ingress-microk8s-controller-qxxhl 1/1 Running 0 66s
root@microk8s01:~# ss -lp |grep http
root@microk8s01:~# ss -lp |grep https
root@microk8s01:~# vi nodeport-service-ingress.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-microk8s
  namespace: ingress
spec:
  type: NodePort
  selector:
    name: nginx-ingress-microk8s
  ports:
  # By default and for convenience, the targetPort is set to the same value as the port field.
  - port: 80
    name: http
  - port: 443
    name: https
root@microk8s01:~# kubectl apply -f nodeport-service-ingress.yaml
service/nginx-ingress-microk8s unchanged
root@microk8s01:~# kubectl get service -n ingress
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nginx-ingress-microk8s NodePort 10.152.183.118 80:30982/TCP,443:32264/TCP 42s

After that, let's configure an haproxy external load balancer running outside of the cluster, which is the entry point for https traffic. In this way it's possible to reach our Kubernetes services through a URL like https://myservice.mydomain.

root@haproxy:#apt install haproxy
root@haproxy:#vi /etc/haproxy/haproxy.cfg
frontend www.mysite.com
    bind 10.0.0.4:80
    bind 10.0.0.4:443 ssl crt /etc/ssl/certs/mysite.pem
    http-request redirect scheme https unless { ssl_fc }
    default_backend web_servers

backend web_servers
    balance roundrobin
    server server1 10.10.10.1:32264 check maxconn 20 ssl verify none
    server server2 10.10.10.2:32264 check maxconn 20 ssl verify none
    server server3 10.10.10.3:32264 check maxconn 20 ssl verify none

The haproxy redirects http to https and balances all the https traffic to the https NodePort of the ingress. Below are the remaining steps done on the haproxy machine: a self signed certificate is used, and certificate verification towards the backends is skipped by adding verify none to the server lines.

root@haproxy:/etc/haproxy# openssl req -newkey rsa:2048 -nodes -keyout key.pem -x509 -days 365 -out certificate.pem
root@haproxy:/etc/haproxy# cat certificate.pem >>/etc/ssl/certs/mysite.pem
root@haproxy:/etc/haproxy# cat key.pem >> /etc/ssl/certs/mysite.pem
root@haproxy:/etc/haproxy# systemctl start haproxy
root@haproxy:/etc/haproxy# systemctl status haproxy
root@microk8s01:/etc/haproxy# curl -v https://www.mysite.com
< date: Fri, 21 May 2021 10:16:27 GMT
< content-type: text/html
< content-length: 146
< strict-transport-security: max-age=15724800; includeSubDomains
<
404 Not Found

The Kubernetes dashboard can now be installed and made reachable through a normal https URL.

Kubernetes Dashboard Installation in microk8s

Kubernetes Dashboard is a general purpose, web-based UI for Kubernetes clusters. It can be installed following the instructions in the official GitHub repository https://github.com/kubernetes/dashboard, but MicroK8s provides it as a module that, behind the scenes, uses the same yaml manifests.

root@microk8s01:/etc/# microk8s enable dashboard
Enabling Kubernetes Dashboard
Enabling Metrics-Server
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
Warning: apiregistration.k8s.io/v1beta1 APIService is deprecated in v1.19+, unavailable in v1.22+; use apiregistration.k8s.io/v1 APIService
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/microk8s-admin created
Adding argument --authentication-token-webhook to nodes.
Configuring node 10.10.10.1
Configuring node 10.10.10.2
Configuring node 10.10.10.3
Restarting nodes.
Configuring node 10.10.10.1
Configuring node 10.10.10.2
Configuring node 10.10.10.3
Metrics-Server is enabled
Applying manifest
serviceaccount/kubernetes-dashboard created
service/kubernetes-dashboard created
secret/kubernetes-dashboard-certs created
secret/kubernetes-dashboard-csrf created
secret/kubernetes-dashboard-key-holder created
configmap/kubernetes-dashboard-settings created
role.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created
rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created
deployment.apps/kubernetes-dashboard created
service/dashboard-metrics-scraper created
deployment.apps/dashboard-metrics-scraper created

As you can see, there is something curious here: why is it necessary to add authentication-token-webhook on every node of the cluster? The answer is that the metrics server, which collects metrics like CPU or memory consumption for containers and nodes, must authenticate to the kubelet, and the authentication-token-webhook parameter added to the kubelet allows it to validate the service account token used for that.
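
A quick, hedged sanity check (the args path follows the snap revision shown later in this article; the flag should now appear in the kubelet arguments of each node):

grep authentication-token-webhook /var/snap/microk8s/2210/args/kubelet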

To expose the Kubernetes dashboard, the best way is to use an ingress created in this way:

root@microk8s01:~# vi ingress-dashboard.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: k8s-dashboard
  namespace: kube-system
  annotations:
    kubernetes.io/ingress.class: public
    nginx.ingress.kubernetes.io/ssl-passthrough: "true"
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  rules:
  - host: k8s-micro-dashboard.mydomain
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kubernetes-dashboard
            port:
              number: 443
root@microk8s01:~# kubectl apply -f ingress-dashboard.yaml
root@microk8s01:~# kubectl get ingress -A |grep dash
kube-system k8s-dashboard k8s-micro-dashboard.mydomain 127.0.0.1 80, 443 2d
root@microk8s01:~# cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
EOF
root@microk8s01:~# cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
EOF
root@microk8s01:~# kubectl -n kube-system get secret $(kubectl -n kube-system get sa/kubernetes-dashboard -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"

It's now possible to reach the Kubernetes dashboard via https://k8s-micro-dashboard.mydomain, where the virtual host resolves to the IP address of the haproxy that proxies to the Kubernetes NodePort. The authentication to the dashboard is implemented via a token that, through RBAC, is associated with the cluster-admin role. In a production environment, it would be better to have a login that, by username and password, grants only the roles required by the principle of least privilege.
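
As a hedged sketch of the least-privilege direction (the service account name is made up; view is the built-in read-only ClusterRole): instead of cluster-admin, a dedicated account could be bound to view and its token used to log into the dashboard.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard-readonly         # hypothetical read-only account
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: dashboard-readonly
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view                        # built-in read-only ClusterRole
subjects:
- kind: ServiceAccount
  name: dashboard-readonly
  namespace: kube-system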

For implementing a scenario like that, as Kubernetes suggests in https://kubernetes.io/docs/reference/access-authn-authz/authentication/, it's preferable to use OpenID Connect as the authentication strategy. I suggest using dex as the OIDC provider service, and gangway to easily enable OIDC authentication flows for a Kubernetes cluster. Unfortunately, as explained below, in MicroK8s it's not possible to add the OIDC parameters to the apiserver, and this prevents, for now, using that approach.

I noted that MicroK8s also provides portainer as a module, a GUI originally created for Docker Swarm management and reconverted for Kubernetes. Despite its ability to manage different clusters and its support for LDAP authentication, I prefer to use the Kubernetes dashboard, because portainer feels like a suit tailored for Swarm and badly refitted for Kubernetes.

To have a Kubernetes cluster really ready for production, we should also have shared storage that can be mounted inside pods for sharing data and configuration, or for centralizing logs.

Storage configuration

In MicroK8s it's possible to enable the storage module, which creates a host path storage class for dynamically provisioning persistent volumes that mount a file or directory from the host node's filesystem into your pod. Such a volume cannot be shared between pods running on different nodes, so it is not very useful in a production environment.

In order to do this, it's enough to enable the module in this way:

root@microk8s01:~# microk8s enable storage
root@microk8s01:~# kubectl get sc
NAME                          PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
microk8s-hostpath (default)   microk8s.io/hostpath   Delete          Immediate           false                  14d
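
For completeness, a hedged sketch of how such a host path volume would be requested (the claim name and size are arbitrary):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-volume          # arbitrary name, for illustration only
spec:
  storageClassName: microk8s-hostpath
  accessModes:
  - ReadWriteOnce             # host path volumes are node-local
  resources:
    requests:
      storage: 1Gi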

For a production environment, it's better to have a shared network file system that can be mounted read-write by many nodes. We have two choices:

  1. Create a nfs persistent volume that will be bound when a persistent volume claim requests it through its storage class (see the sketch after this list).
  2. Install a nfs external provisioner that creates the nfs persistent volumes dynamically when a persistent volume claim requests them through the storage class.
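
A hedged sketch of the first choice: a statically defined NFS persistent volume pointing at the same nfs server used later in this article (the name, size and storage class are illustrative).

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-static-pv            # illustrative name
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-static   # a claim with the same storageClassName binds to it
  nfs:
    server: 10.10.10.5
    path: /data/microk8s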

In this article I will install the nfs external provisioner, assuming an external nfs file server reachable at the 10.10.10.5 address:

root@microk8s01:~# helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
root@microk8s01:~# helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
    --set nfs.server=10.10.10.5 \
    --set nfs.path=/data/microk8s
root@microk8s01:~# kubectl get sc
NAME                          PROVISIONER                                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
microk8s-hostpath (default)   microk8s.io/hostpath                            Delete          Immediate           false                  14d
nfs-client                    cluster.local/nfs-subdir-external-provisioner   Delete          Immediate           true                   14d

The nfs server must export a local directory accessible to all microk8s nodes; its /etc/exports is configured in this way:

root@nfs-server:~# vi /etc/exports
/data/microk8s 10.10.10.1(rw,sync,no_root_squash)
/data/microk8s 10.10.10.2(rw,sync,no_root_squash)
/data/microk8s 10.10.10.3(rw,sync,no_root_squash)
root@nfs-server:~# exportfs -a

The nfs volumes can be created in this simple way:

root@microk8s01:# cat <<EOF | kubectl apply -f -
> apiVersion: v1
> kind: PersistentVolumeClaim
> metadata:
>   name: nfs-volume
> spec:
>   storageClassName: nfs-client
>   accessModes:
>     - ReadWriteMany
>   resources:
>     requests:
>       storage: 5Gi
> EOF
persistentvolumeclaim/nfs-volume created
root@microk8s01:# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS        REASON   AGE
pvc-ec5ffba7-20a8-5fab-b974-721a35b9c1f8   5Gi        RWX            Delete           Bound    default/nfs-volume    nfs-client       

The persistent volume bound to nfs-volume is ready to be mounted inside any pod, by declaring it in a deployment manifest like this:

root@microk8s01:# cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1 # tells the deployment to run 1 pod matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - mountPath: "/var/log/nginx/"
          name: testvolume
      restartPolicy: Always
      volumes:
      - name: testvolume
        persistentVolumeClaim:
          claimName: nfs-volume
EOF
root@microk8s01:# kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-6bd6c9685f-fqtgq   1/1     Running   0          6s
root@microk8s01:# kubectl exec -it nginx-6bd6c9685f-fqtgq -- /bin/bash
root@nginx-6bd6c9685f-fqtgq:/# df -k|grep data
29324176 10976972 16834552 40% /etc/hosts
10.10.10.5:/data/microk8s/default-nfs-volume-pvc-ec5ffba7-20a8-5fab-b974-721a35b9c1f8 343811072 217825280 108497920 67% /var/log/nginx

We now have a cluster with all the important services needed for production use installed and configured, like ingress, dashboard and storage. We should also add a monitoring stack like prometheus/grafana. This can be enabled with 'microk8s enable prometheus', but in my case it didn't work correctly, so I decided to install the prometheus operator directly, following https://grafana.com/docs/grafana-cloud/quickstart/prometheus_operator/, which is out of the scope of this article.

The last missing piece, before running it in a production environment, is a deeper knowledge of the solution, in order to have more control over every part of it.

Inside the microk8s

This Kubernetes cluster is easy to configure but, in order to manage it in a production environment, it's very important to understand how the infrastructure is deployed and configured.

For this purpose, as a starting point to deepen the knowledge of the infrastructure, we can take a look at the output of the 'snap info microk8s' command, where all the services running in the cluster are shown:

root@microk8s01:/etc/haproxy# snap info microk8s |grep daemon
microk8s.daemon-apiserver: simple, enabled, inactive
microk8s.daemon-apiserver-kicker: simple, enabled, active
microk8s.daemon-cluster-agent: simple, enabled, active
microk8s.daemon-containerd: simple, enabled, active
microk8s.daemon-control-plane-kicker: simple, enabled, inactive
microk8s.daemon-controller-manager: simple, enabled, inactive
microk8s.daemon-etcd: simple, enabled, inactive
microk8s.daemon-flanneld: simple, enabled, inactive
microk8s.daemon-kubelet: simple, enabled, inactive
microk8s.daemon-kubelite: simple, enabled, active
microk8s.daemon-proxy: simple, enabled, inactive
microk8s.daemon-scheduler: simple, enabled, inactive

I was surprised to see the controller, scheduler, etcd and proxy services inactive, and my first reaction was: how does the cluster work this way?

Digging into the situation and reading the MicroK8s documentation (https://microk8s.io/), I found that, starting from MicroK8s 1.20, all the Kubernetes components – apiserver, controller-manager, scheduler, kube-proxy, kubelet and the datastore – run inside the single kubelite process. In fact, both the apiserver port, 16443, and the kubelet port, 10250, are owned by the kubelite process.

root@microk8s01:/etc/# kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:16443
CoreDNS is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://127.0.0.1:16443/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
root@microk8s01:/etc/# ss -lp |grep 16443
tcp LISTEN 0 4096 *:16443 *:* users:(("kubelite",pid=122424,fd=32))
root@microk8s01:/etc/# ss -lp |grep 10250
tcp LISTEN 0 4096 *:10250 *:* users:(("kubelite",pid=122424,fd=32))
root@microk8s01:/etc/haproxy# ps -afe |grep kubelite
root 122424 1 6 10:26 ? 00:02:23 /snap/microk8s/2210/kubelite --scheduler-args-file=/var/snap/microk8s/2210/args/kube-scheduler --controller-manager-args-file=/var/snap/microk8s/2210/args/kube-controller-manager --proxy-args-file=/var/snap/microk8s/2210/args/kube-proxy --kubelet-args-file=/var/snap/microk8s/2210/args/kubelet --apiserver-args-file=/var/snap/microk8s/2210/args/kube-apiserver --kubeconfig-file=/var/snap/microk8s/2210/credentials/client.config --start-control-plane=true
root 153903 1583 0 11:05 pts/0 00:00:00 grep --color=auto kubelite

Kubelite receives, as input parameters, the configuration files used by the apiserver, controller-manager, scheduler, kube-proxy and kubelet. The arguments are stored in these files:

  1. Scheduler args: /var/snap/microk8s/2210/args/kube-scheduler.
  2. Apiserver args: /var/snap/microk8s/2210/args/kube-apiserver.
  3. Controller args: /var/snap/microk8s/2210/args/kube-controller-manager.
  4. Kubelet args: /var/snap/microk8s/2210/args/kubelet.
  5. Kube-proxy args: /var/snap/microk8s/2210/args/kube-proxy.

Taking a look, for example, at the apiserver configuration, we can see the common apiserver parameters.

root@microk8s01:~/# cat /var/snap/microk8s/2210/args/kube-apiserver
--cert-dir=${SNAP_DATA}/certs
--service-cluster-ip-range=10.152.183.0/24
--authorization-mode=RBAC,Node
--service-account-key-file=${SNAP_DATA}/certs/serviceaccount.key
--client-ca-file=${SNAP_DATA}/certs/ca.crt
--tls-cert-file=${SNAP_DATA}/certs/server.crt
--tls-private-key-file=${SNAP_DATA}/certs/server.key
--kubelet-client-certificate=${SNAP_DATA}/certs/server.crt
--kubelet-client-key=${SNAP_DATA}/certs/server.key
--secure-port=16443
--token-auth-file=${SNAP_DATA}/credentials/known_tokens.csv
--insecure-port=0
--storage-backend=dqlite
--storage-dir=${SNAP_DATA}/var/kubernetes/backend/
--allow-privileged=true
--service-account-issuer='https://kubernetes.default.svc'
--service-account-signing-key-file=${SNAP_DATA}/certs/serviceaccount.key
--feature-gates=RemoveSelfLink=false
--requestheader-client-ca-file=${SNAP_DATA}/certs/front-proxy-ca.crt
--requestheader-allowed-names=front-proxy-client
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=${SNAP_DATA}/certs/front-proxy-client.crt
--proxy-client-key-file=${SNAP_DATA}/certs/front-proxy-client.key

The limit of this cluster is that some Kubernetes apiserver parameters are not accepted. In fact, I tried to add the OIDC parameters – well explained in this Ask Ubuntu answer https://askubuntu.com/questions/1243695/microk8s-oidc-with-keycloak/1339374#1339374 – and it didn't work. This prevents installing an OIDC provider, like dex (https://github.com/dexidp/dex), together with an OIDC proxy, like gangway (https://github.com/heptiolabs/gangway), which together offer a GUI, backed by well known protocols like LDAP, for username and password authentication.
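
In general, extra flags are added by appending them to the relevant args file and restarting kubelite; a hedged sketch below (the OIDC values are placeholders and, as noted above, these particular flags were not honored in my tests):

# append the desired flags to the apiserver args file on every node
echo '--oidc-issuer-url=https://dex.mydomain' >> /var/snap/microk8s/2210/args/kube-apiserver
echo '--oidc-client-id=kubernetes' >> /var/snap/microk8s/2210/args/kube-apiserver
# restart the kubelite daemon so the new arguments are read
systemctl restart snap.microk8s.daemon-kubelite.service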

Kubelite runs on every node of the cluster, which means that every node is also a controller node: it's not possible to separate the control plane from the worker nodes. To restart kubelite, with all the controllers inside it, it's possible to use systemd:

root@microk8s01:#systemctl restart snap.microk8s.daemon-kubelite.service

The container runtime, running on each node so that pods can run, is containerd, which can be restarted in this way:

root@microk8s01:#systemctl restart snap.microk8s.daemon-containerd.service

Instead of docker, when it's necessary to troubleshoot a container issue, the ctr command can be used: it is installed on each node in /snap/microk8s/2210/bin/ctr, or, more simply, it can be invoked natively through microk8s ctr:

root@microk8s01:#kubectl describe pod nginx-6bd6c9685f-fqtgq  |grep Conta
Containers:
    Container ID:   containerd://cc6b48c5ebbe3914b9e918f6842e0ed356a10bcad684b1f679e8e34ac75b8101
  ContainersReady   True
root@microk8s01:# /snap/microk8s/2210/bin/ctr -namespace k8s.io -address /var/snap/microk8s/common/run/containerd.sock containers ls |grep cc6b48c5ebbe3914b9e918f6842e0ed356a10bcad684b1f679e8e34ac75b8101
cc6b48c5ebbe3914b9e918f6842e0ed356a10bcad684b1f679e8e34ac75b8101 sha256:d1a364dc548d5357f0da3268c888e1971bbdb957ee3f028fe7194f1d61c6fdee io.containerd.runc.v1
root@microk8s01:#microk8s ctr containers ls|grep cc6b48c5ebbe3914b9e918f6842e0ed356a10bcad684b1f679e8e34ac75b8101
cc6b48c5ebbe3914b9e918f6842e0ed356a10bcad684b1f679e8e34ac75b8101 sha256:d1a364dc548d5357f0da3268c888e1971bbdb957ee3f028fe7194f1d61c6fdee io.containerd.runc.v1

As said before, the network plugin used is calico. It is formed by a controller, which continually watches the apiserver, a daemon set running on every node of the cluster, which contains the daemons that distribute routing information to the other nodes and configure the routes on the local node, and the calico configuration expressed by CRD resources:

root@microk8s01:# kubectl get pods -A |grep calico
kube-system calico-kube-controllers-f7868dd95-vnbtf 1/1 Running 0 14d
kube-system calico-node-bb4dw 1/1 Running 0 14d
kube-system calico-node-k8h4s 1/1 Running 0 14d
kube-system calico-node-6bd9m 1/1 Running 0 14d
root@microk8s01:# kubectl get daemonset -A |grep calico
kube-system calico-node 3 3 3 3 3 kubernetes.io/os=linux 14d
root@microk8s01:# kubectl get crd -A |grep calico
bgpconfigurations.crd.projectcalico.org 2021-05-19T14:19:41Z
bgppeers.crd.projectcalico.org 2021-05-19T14:19:41Z
blockaffinities.crd.projectcalico.org 2021-05-19T14:19:41Z
clusterinformations.crd.projectcalico.org 2021-05-19T14:19:41Z
felixconfigurations.crd.projectcalico.org 2021-05-19T14:19:41Z
globalnetworkpolicies.crd.projectcalico.org 2021-05-19T14:19:41Z
globalnetworksets.crd.projectcalico.org 2021-05-19T14:19:41Z
hostendpoints.crd.projectcalico.org 2021-05-19T14:19:41Z
ipamblocks.crd.projectcalico.org 2021-05-19T14:19:41Z
ipamconfigs.crd.projectcalico.org 2021-05-19T14:19:41Z
ipamhandles.crd.projectcalico.org 2021-05-19T14:19:41Z
ippools.crd.projectcalico.org 2021-05-19T14:19:41Z
networkpolicies.crd.projectcalico.org 2021-05-19T14:19:41Z
networksets.crd.projectcalico.org 2021-05-19T14:19:41Z

A metrics server, which scrapes metrics from the kubelet port 10250, is running and configured, and it permits monitoring in real time the CPU and memory usage of pods and nodes:

root@microk8s01:# kubectl get pods -A |grep metric
kube-system metrics-server-8bbfb4bdb-fpg64 1/1 Running 0 12d
root@microk8s01:~# kubectl top nodes
W0603 13:00:15.879802 214384 top_node.go:119] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
microk8s01 250m 6% 4302Mi 54%
microk8s02 188m 4% 3714Mi 47%
microk8s03 127m 3% 3410Mi 43%

As datastore, dqlite is used instead of etcd. It is an alternative backing store, running inside the kubelite process, that also uses the raft algorithm to elect a leader, with the same approach as etcd. The minimum number of nodes for a highly available raft cluster is three, and this explains why a MicroK8s cluster must be formed by at least three nodes. More information about the dqlite utilities, used for example for backup and restore, can be found at https://microk8s.io/docs.
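
As a hedged operational pointer: the dqlite data directory comes from the --storage-dir apiserver argument shown above, and the dbctl sub-command appears in the snap command list, but check its help for the exact backup/restore syntax on your release.

# the dqlite data lives under the snap data directory of each node
ls /var/snap/microk8s/2210/var/kubernetes/backend/
# backup and restore of the datastore are handled by the dbctl command
microk8s dbctl --help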

Conclusions

This MicroK8s solution is easy to install and very useful for testing, demo and development environments. Canonical has released it for production and, in this context, it's better to be aware of the following characteristics, which could become limits to the scalability of the solution:

  1. Every node of the cluster is a controller node where the kubelite process, which contains the whole control plane, is running. This could have an impact on the workload pods and, of course, there is no way to separate the controllers from the worker nodes.
  2. The apiserver and the controllers may not accept all the standard Kubernetes arguments, preventing you from customizing the cluster according to your needs.

Despite all this, the solution is light and flexible and, after enabling all the services necessary for a production environment, I think it is really worth trying in production.
