Like many techies out there, I’ve accumulated various Raspberry Pi-like development boards over the years. And also like many techies, most of them have been sitting in a “tech I’ll use someday” drawer.

Well that someday finally came for me :)

I had a few weeks off from work over the winter holidays, which gave me plenty of time to take stock of all the hardware I had and what I could do with it. This included:

  • A five-disk RAID enclosure exposed over USB3
  • Raspberry Pi model B (OG model)
  • CubbieBoard 1
  • Banana Pi M1
  • HP Netbook (2012??)

Out of those 5 pieces of hardware, I was only using the RAID and the Netbook, together acting as a subpar NAS. Since the Netbook didn’t support USB3, I wasn’t getting the full speed potential out of my RAID.

Life Goals!

That RAID was being done a disservice by the netbook, so I set some goals for a better setup:

  1. A NAS with USB3 and Gigabit Ethernet
  2. A better way to manage the software on the device
  3. (bonus) Ability to stream some media off the RAID to my Fire TV

Since none of the devices I had supported both USB3 and Gigabit Ethernet, I sadly had to do some shopping.

The board I landed on was the ROC-RK3328-CC. It had all the specs I wanted, and had decent enough OS support.

With the hardware specs addressed (and the board on its way), I turned my attention to the second goal.

Managing the software on the device

Part of what I felt made my dev-board projects fail in the past was my lack of reproducibility and documentation. Whenever I got everything set up the way I needed it, I never wrote down the steps I took or the blog posts I followed. And when something eventually went wrong months or years later, I had no record of what I had originally done to help me fix it.

So I said to myself “this time it will be different!”

This time will be different

I turned to a beast I know quite well, Kubernetes!

While K8s is a pretty heavy-handed solution to a pretty simple problem, after almost three years managing clusters at $dayjob using various solutions (home-grown, kops, etc.), it’s also something I’m deeply familiar with.

Plus, deploying it outside of a cloud environment, and on ARM devices for that matter, seemed like an interesting challenge.

I also figured that since my existing hardware didn’t have the specs I needed for the NAS, I could at least cluster them, and maybe some of the software that didn’t need the higher specs could run on my older devices.

Kubernetes on ARM

Since I hadn’t had a chance at work to try using the kubeadm tool for provisioning clusters, I figured now was a perfect time to take it for a test drive.

For my OS I decided on Armbian, as it had the most support across all the boards I had.

I found a good blog post for setting up Kubernetes on a Raspberry Pi using HypriotOS. Since I wasn’t too confident that HypriotOS would be available for all my boards, I adapted the instructions to Debian/Armbian.

Prerequisites

Before starting, I needed to install:

  • Docker
  • kubelet
  • kubeadm
  • kubectl

Docker needed to be installed using their convenience script (the method their docs call out for Raspbian).

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh

And then I installed the Kubernetes components based on the instructions from the Hypriot blog, adapted to pin all the dependencies to specific versions.

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet=1.13.1-00 kubectl=1.13.1-00 kubeadm=1.13.1-00
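One extra step that isn’t in the Hypriot post, but which I think is worth it: holding the packages so a routine apt-get upgrade can’t move the pinned versions out from under the cluster.

# prevent apt from upgrading the pinned Kubernetes packages
apt-mark hold kubelet kubeadm kubectl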

Raspberry Pi B

I hit my first hiccup when trying to bootstrap a cluster on my original Raspberry Pi B.

$ kubeadm init
Illegal instruction

Turns out that Kubernetes dropped support for ARMv6. Oh well, that left the CubbieBoard and the Banana Pi.

Banana Pi

The same process on the Banana Pi initially seemed to have much more success, but the kubeadm init command eventually timed out waiting for the control plane to become healthy.

error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Checking what the containers were doing with docker ps, I saw that kube-controller-manager and kube-scheduler had both been up for at least 4-5 minutes, but kube-apiserver was only about a minute or two old.

$ docker ps
CONTAINER ID   COMMAND                  CREATED              STATUS           
de22427ad594   "kube-apiserver --au…"   About a minute ago   Up About a minute
dc2b70dd803e   "kube-scheduler --ad…"   5 minutes ago        Up 5 minutes     
60b6cc418a66   "kube-controller-man…"   5 minutes ago        Up 5 minutes     
1e1362a9787c   "etcd --advertise-cl…"   5 minutes ago        Up 5 minutes     

Obviously the api-server was dying, or an external process was killing it and restarting it.

Checking the logs, I saw some pretty standard-looking startup procedures, a line noting that it had started listening on the secure port, and then a long pause before lots of TLS handshake errors.

20:06:48.604881  naming_controller.go:284] Starting NamingConditionController
20:06:48.605031  establishing_controller.go:73] Starting EstablishingController
20:06:50.791098  log.go:172] http: TLS handshake error from 192.168.1.155:50280: EOF
20:06:51.797710  log.go:172] http: TLS handshake error from 192.168.1.155:50286: EOF
20:06:51.971690  log.go:172] http: TLS handshake error from 192.168.1.155:50288: EOF
20:06:51.990556  log.go:172] http: TLS handshake error from 192.168.1.155:50284: EOF
20:06:52.374947  log.go:172] http: TLS handshake error from 192.168.1.155:50486: EOF
20:06:52.612617  log.go:172] http: TLS handshake error from 192.168.1.155:50298: EOF
20:06:52.748668  log.go:172] http: TLS handshake error from 192.168.1.155:50290: EOF

And then the server would shut down shortly after. Some more Googling brought me to this issue, which seemed to indicate that this was possibly caused by slow crypto performance on some ARM devices.

I took a leap and figured that maybe the api-server was being overwhelmed by the repeated retries of the scheduler and controller-manager.

The scheduler and controller-manager run as static pods, so moving their manifest files out of the manifests directory (which the kubelet watches) would tell the kubelet to terminate those pods.

mkdir /etc/kubernetes/manifests.bak
mv /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/manifests.bak/
mv /etc/kubernetes/manifests/kube-controller-manager.yaml /etc/kubernetes/manifests.bak/

Tailing the logs of the api-server, I saw it get further than before, but it was still dying around the 2 minute mark. Then I remembered: the manifest probably contained a liveness probe with timeouts set much too low for my slow-as-crap device (that’s the technical term).

So I checked /etc/kubernetes/manifests/kube-apiserver.yaml, and sure enough…

livenessProbe:
  failureThreshold: 8
  httpGet:
    host: 192.168.1.155
    path: /healthz
    port: 6443
    scheme: HTTPS
  initialDelaySeconds: 15
  timeoutSeconds: 15

My pod was getting killed after 135 seconds (initialDelaySeconds + timeoutSeconds * failureThreshold = 15 + 15 * 8 = 135). I bumped the initialDelaySeconds up to 120…
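The probe section then looked like this (only the initial delay changed):

livenessProbe:
  failureThreshold: 8
  httpGet:
    host: 192.168.1.155
    path: /healthz
    port: 6443
    scheme: HTTPS
  initialDelaySeconds: 120
  timeoutSeconds: 15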

Success! Well, I still got the handshake errors (presumably from the kubelet), but it made it through the startup.

20:06:54.957236  log.go:172] http: TLS handshake error from 192.168.1.155:50538: EOF
20:06:55.004865  log.go:172] http: TLS handshake error from 192.168.1.155:50384: EOF
20:06:55.118343  log.go:172] http: TLS handshake error from 192.168.1.155:50292: EOF
20:06:55.252586  cache.go:39] Caches are synced for autoregister controller
20:06:55.253907  cache.go:39] Caches are synced for APIServiceRegistrationController controller
20:06:55.545881  controller_utils.go:1034] Caches are synced for crd-autoregister controller
...
20:06:58.921689  storage_rbac.go:187] created clusterrole.rbac.authorization.k8s.io/cluster-admin
20:06:59.049373  storage_rbac.go:187] created clusterrole.rbac.authorization.k8s.io/system:discovery
20:06:59.214321  storage_rbac.go:187] created clusterrole.rbac.authorization.k8s.io/system:basic-user

Once the api-server was up, I moved the controller and scheduler YAMLs back into the manifests directory, and they started up normally as well.
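That was just the reverse of the earlier move:

mv /etc/kubernetes/manifests.bak/kube-scheduler.yaml /etc/kubernetes/manifests/
mv /etc/kubernetes/manifests.bak/kube-controller-manager.yaml /etc/kubernetes/manifests/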

Now to double-check: could I get everything to boot up normally if I left all the files in the manifests directory and just increased the livenessProbe initial delay?

20:29:33.306983  reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.Service: Get https://192.168.1.155:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.1.155:6443: i/o timeout
20:29:33.434541  reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.ReplicationController: Get https://192.168.1.155:6443/api/v1/replicationcontrollers?limit=500&resourceVersion=0: dial tcp 192.168.1.155:6443: i/o timeout
20:29:33.435799  reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.PersistentVolume: Get https://192.168.1.155:6443/api/v1/persistentvolumes?limit=500&resourceVersion=0: dial tcp 192.168.1.155:6443: i/o timeout
20:29:33.477405  reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1beta1.PodDisruptionBudget: Get https://192.168.1.155:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0: dial tcp 192.168.1.155:6443: i/o timeout
20:29:33.493660  reflector.go:134] k8s.io/client-go/informers/factory.go:132: Failed to list *v1.PersistentVolumeClaim: Get https://192.168.1.155:6443/api/v1/persistentvolumeclaims?limit=500&resourceVersion=0: dial tcp 192.168.1.155:6443: i/o timeout
20:29:37.974938  controller_utils.go:1027] Waiting for caches to sync for scheduler controller
20:29:38.078558  controller_utils.go:1034] Caches are synced for scheduler controller
20:29:38.078867  leaderelection.go:205] attempting to acquire leader lease  kube-system/kube-scheduler
20:29:38.291875  leaderelection.go:214] successfully acquired lease kube-system/kube-scheduler

Yes, it eventually worked, but these older devices were probably not going to be suitable for running the control plane, given that repeated TLS connections caused such a drastic slowdown. But I now had a working K8s install on ARM!

Moving on…

Mounting the RAID

Since SD cards are not suitable for long-term sustained writes, I wanted to have the more volatile parts of the filesystem persisted on a more durable medium, in this case the RAID. I gave it 4 partitions:

  • 50GB
  • 2x 20GB
  • 3.9TB

I didn’t have a precise use for the 20GB partitions, but I wanted to leave some options open for the future. A rough sketch of the partitioning is below.
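I didn’t record the exact commands I used, but with parted the layout would look roughly like this (assuming the enclosure shows up as /dev/sda, which is a guess on my part):

# a sketch, not the exact commands; /dev/sda is an assumed device name
parted /dev/sda -- mklabel gpt
parted /dev/sda -- mkpart root ext4 1MiB 50GiB
parted /dev/sda -- mkpart spare1 ext4 50GiB 70GiB
parted /dev/sda -- mkpart spare2 ext4 70GiB 90GiB
parted /dev/sda -- mkpart raid ext4 90GiB 100%
mkfs.ext4 /dev/sda1    # the 50GB root partition
mkfs.ext4 /dev/sda4    # the 3.9TB raid partition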

In the /etc/fstab file I mounted the 50GB partition at /mnt/root and the 3.9TB partition at /mnt/raid, and then bind-mounted the etcd and docker directories onto the 50GB partition.

UUID=655a39e8-9a5d-45f3-ae14-73b4c5ed50c3 /mnt/root ext4 defaults,rw,user,auto,exec 0 0
UUID=0633df91-017c-4b98-9b2e-4a0d27989a5c /mnt/raid ext4 defaults,rw,user,auto 0 0
/mnt/root/var/lib/etcd /var/lib/etcd none defaults,bind 0 0
/mnt/root/var/lib/docker /var/lib/docker none defaults,bind 0 0
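For those bind mounts to work, the source and target directories have to exist before Docker and the kubelet start, so something along these lines is needed first (paths mirroring the fstab entries above):

# one-time setup: create mount points and bind-mount sources
mkdir -p /mnt/root /mnt/raid
mount /mnt/root
mkdir -p /mnt/root/var/lib/etcd /mnt/root/var/lib/docker
mkdir -p /var/lib/etcd /var/lib/docker
mount -a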

The ROC-RK3328-CC Arrives

With the new board in hand, I fire it up, install the K8s prerequisites, and run kubeadm init. After a few minutes it succeeds and prints the join command to run on other nodes.

Success! No need to fiddle with timeouts.

Since this board is also the one that needs to host the RAID, I need to set up the mounts again as well. Putting it all together:

1. Disk mounts in /etc/fstab

UUID=655a39e8-9a5d-45f3-ae14-73b4c5ed50c3 /mnt/root ext4 defaults,rw,user,auto,exec 0 0
UUID=0633df91-017c-4b98-9b2e-4a0d27989a5c /mnt/raid ext4 defaults,rw,user,auto 0 0
/mnt/root/var/lib/etcd /var/lib/etcd none defaults,bind 0 0
/mnt/root/var/lib/docker /var/lib/docker none defaults,bind 0 0

2. Install Docker and K8s binaries

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet=1.13.1-00 kubectl=1.13.1-00 kubeadm=1.13.1-00

3. Set a unique hostname (important once I add multiple nodes)

hostnamectl set-hostname k8s-master-1

4. Initialize Kubernetes

I skip the mark-control-plane phase so the node doesn’t get tainted as a master, which lets me schedule normal pods on it as well.

kubeadm init --skip-phases mark-control-plane

5. Install a network plugin

The Hypriot blog post was a little out of date here, as Weave is now also a supported network plugin on ARM.

export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
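With the KUBECONFIG exported, it’s easy to sanity-check the result; the node should go from NotReady to Ready once the Weave pods are running:

kubectl get nodes
kubectl get pods -n kube-system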

6. Add node labels

Since I’ll need the NAS server to run on this node, I need to mark it with some labels I can use when scheduling.

kubectl label nodes k8s-master-1 marshallbrekka.raid=true
kubectl label nodes k8s-master-1 marshallbrekka.network=gigabit
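Those labels can then be referenced in a nodeSelector to pin a workload to this node. A minimal sketch (the pod name and image are placeholders, not my actual NAS deployment):

apiVersion: v1
kind: Pod
metadata:
  name: needs-raid
spec:
  # only schedule onto nodes labeled with marshallbrekka.raid=true
  nodeSelector:
    marshallbrekka.raid: "true"
  containers:
  - name: app
    image: some-arm-image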

Joining other nodes to the cluster

Setting up my other devices (Banana Pi, CubbieBoard) was just as easy: I followed the first 3 steps (customizing the mounts for the drives or flash storage each had available), and then ran a kubeadm join command like the one below instead of kubeadm init.
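The join command is the one kubeadm init printed when the master came up; it looks something like this (the token and CA hash here are placeholders):

kubeadm join 192.168.1.155:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>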

Finding ARM docker containers

While I could normally build most docker containers I wanted from my Mac, doing so for ARM was not as easy. I did find many blog posts showing how to use QEMU to accomplish the task, but I ended up finding most of the apps I needed already built, many of them from linuxserver.

Next Steps

I still don’t have my initial device setup quite as automated/scripted as I would like, but the few commands I do have to run (mounts, docker, kubeadm) are now well documented in a Git repo. The rest of my apps are also defined as K8s YAMLs in that repo, which makes it trivial to re-create the setup if I need to rebuild from scratch for any reason.

Looking forward, there are a few things I would like to do:

  1. Make the masters HA
  2. Add monitoring/alerting so I know when components fail
  3. Change my router DHCP settings to use my in-cluster DNS so apps are more easily discoverable (who wants to remember private IPs?)
  4. Run MetalLB to expose cluster services to my private network (DNS, etc)