Category Archives: Software

Using podman to run a postgresql server

I’m moving over to podman for containers where possible, because I like where the project is going. I have no specific objections to docker, and in fact use it for several of my own projects, but podman feels more kubernetes-ish to me somehow.

When I set up k3s, except for trivial instances, I generally use postgresql as the backend storage. Again, this is a personal preference, as I have a lot of experience with postgresql, and it’s the database I’m most comfortable using. But installing and maintaining postgresql can be a chore, and the project publishes a really great container image that makes it easy to run. So in this post I’ve decided to document how I use podman to create a postgres instance.

#!/usr/bin/env bash

CONTAINER_NAME=k3spg
VOLUME_NAME=k3spg-data
DB_NAME=k3s
DB_USER=k3s
DB_PORT=5432

PASSWORD=$(date +%s | sha256sum | base64 | head -c 32 ; echo)
echo "${PASSWORD}" > ${CONTAINER_NAME}.pg-pw.txt

podman volume create ${VOLUME_NAME}
podman create \
        -v ${VOLUME_NAME}:/var/lib/postgresql/data \
        -e POSTGRES_PASSWORD=${PASSWORD} \
        -e POSTGRES_DB=${DB_NAME} \
        -e POSTGRES_USER=${DB_USER} \
        -p ${DB_PORT}:5432 \
        --restart on-failure \
        --name ${CONTAINER_NAME} \
        -d \
        postgres:12.4

podman generate systemd \
        --new \
        --files \
        --name \
        ${CONTAINER_NAME}

echo "Your db user password is ${PASSWORD}"

Progress on pvc storage


I’ve been quiet, but also busy. After building and configuring my Cloudshell 2, I played around with several potential uses for it. Meanwhile, I ran into several issues with the local-path-provisioner in k3s. Some of these were due to my own sloppiness; I managed to create two postgresql pods that played in a lot of the same spaces, and all manner of mischief ensued.

Tonight, I got nfs-client-provisioner working. This involved setting up an nfs server on the Odroid xu4 that powers the Cloudshell2, which I’ll describe in another post.

The nfs provisioner for kubernetes requires that the nfs client be installed on all the nodes in the cluster. This turned out to be pretty easy. Ansible to the rescue:

ansible -m package -a 'name=nfs-common' master:workers

The next step was to set up the provisioner’s configuration. Most of what I did here was based on the NFS section of Isaac Johnson’s post at FreshBrewed.Science. I grabbed a copy of the sample config first:

wget https://raw.githubusercontent.com/jromers/k8s-ol-howto/master/nfs-client/values-nfs-client.yaml

These values need to be changed to match your NFS server configuration.

replicaCount: 1

nfs:
  server: 10.0.96.30
  path: /storage1/pvc
  mountOptions:

storageClass:
  archiveOnDelete: false

Once that was done, it was time to deploy the client. That can be accomplished via helm:

helm install nfs -f values-nfs-client.yaml \
    stable/nfs-client-provisioner \
    --set image.repository=quay.io/external_storage/nfs-client-provisioner-arm

Wait a few minutes for the deploy to finish, and there we have it:

$ kubectl get storageclasses
NAME                   PROVISIONER                                RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path                      Delete          WaitForFirstConsumer   false                  23d
nfs-client             cluster.local/nfs-nfs-client-provisioner   Delete          Immediate              true                   57m
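
To confirm that dynamic provisioning actually works, a quick test is to create a small claim against the new storage class and watch it bind. A minimal sketch, with a placeholder claim name and size:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-test-claim
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi

Apply it with kubectl apply -f, and kubectl get pvc should show the claim go to Bound once the provisioner has created a directory for it on the NFS share.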

The current state of pi-kube


A closer look at the cluster

Hardware and Operating System

Currently, the cluster consists of 5 Raspberry Pi 4B systems with 4GB memory. Each of these has a 16gb micro SD card and a 16gb USB flash drive. I use the SD cards for boot only, but the root filesystem is on the flash drives. I’m not overclocking anything in the cluster yet, mostly because I don’t trust the power supply arrangement enough.

The systems are running Ubuntu 20.04 minimal, the stock distribution code from Canonical.

A 6th Raspberry Pi 4B acts as a file server. It’s equipped with a 256gb USB flash drive. The file server runs Ubuntu 19.04 (it’s due for a rebuild soon). I use gluster to mount the flash drive on all of the cluster servers. Currently, I’m not using replication or sharding with gluster, although at some point I intend to pursue that. The file system is mounted as /data/shared on each member system in the cluster.
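
For reference, the mount on each node is just an fstab entry plus the glusterfs client package. A sketch of what that looks like, with a hypothetical server hostname (fileserver) and volume name (gv-shared):

# /etc/fstab on each cluster member; requires the glusterfs-client package
fileserver:/gv-shared  /data/shared  glusterfs  defaults,_netdev  0  0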

Kubernetes configuration

The system runs k3s from Rancher. One RPi 4B serves as the master (and is also a node); there are 4 RPi 4B nodes besides this.

The stock k3s configuration is deployed, using sqlite for storage and Traefik to manage ingresses. Cert-manager is used to manage letsencrypt requests. External (internet) DNS for ingress is provided by AWS Route 53, although this is not currently automated. My home network uses pihole for DNS and DHCP; static DHCP leases are used for all hardware nodes. Pihole also provides internal DNS spoofing of the external domain names, since the external IP address of my router does not work on the internal network.

The entire system is deployed via a set of ansible playbooks. You can see those playbooks here on GitHub.

Reboot


Last night, I tore down the entire stack and rebuilt it. This was to accommodate two things:

  • The release of Ubuntu 20.04, which required an upgrade, and
  • A hardware change involving using USB flash drives for the root file system on all nodes, rather than just a micro SD card.

As a result of this, I’ve come up with some changes to node provisioning – the steps required to go from bare hardware to an operating node. I’m planning a step-by-step guide for building out the cluster based on this. Stay tuned.

Getting Cert-manager to work


I’ve been sort-of following the series that Lee Carpenter is doing over at carpie.net, but for a while I was hung up on getting cert-manager to work. The specific failure mode I had was this:

My external IP address (the IP assigned to my router by the cable company) for some reason isn’t routed correctly from inside my home network. The IP responds to pings, and DNS resolves it, but any SSH, HTTP or HTTPS traffic (and presumably any other TCP connections) all hang indefinitely. This appears to be a router issue, since my router, a TP-Link Archer 20-based model, doesn’t use an alternate port for its web admin UI. The router presents the UI on port 443 with a self-signed certificate, and redirects port 80 traffic to 443. I suspect that the web server embedded in the router’s firmware is catching my web connections (the ones that originate inside the network) and doesn’t know what to do with them, so they just hang.

External connections are properly routed, as I’ve got port-forwarding configured to send the traffic to the kube cluster.

Here’s why this is a problem: cert-manager has a “sanity check” it runs before issuing a certificate request; if you are using the http01 verification strategy, cert-manager tries to reach the verification challenge response URL before it sends any cert requests to letsencrypt. This makes sense, since there’s no reason to send a request if letsencrypt can’t find the verification challenge response.

Except, in this case, the response actually is correctly configured, and if you hit that URL from outside of my home network, you would see it. The sanity check, however, runs from inside the network, so it was failing, and thus no certificate for me!

The solution to this was simple: I run pi-hole on my home network, as both a DHCP server and a DNS server. So all I had to do was “spoof” my external DNS name on the internal network, so that it resolved to the internal address of the kube cluster, rather than the external address of the router.
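
Since pihole is dnsmasq underneath, the spoofing itself comes down to a single address directive dropped into its dnsmasq configuration. A sketch, using my external domain and a placeholder internal cluster address:

# e.g. /etc/dnsmasq.d/02-kube-spoof.conf on the pihole host
# resolve the external name to the cluster's internal address instead of the router
address=/kube.thejimnicholson.com/10.0.96.20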

At least, it sounds simple. In reality, it proved to be difficult, mainly because I made a decision when I started building my cluster to use Ubuntu server (which is a full 64-bit OS) rather than Raspbian (which runs userspace in 32-bit, even on the Raspberry Pi 4). And I’m running Ubuntu 19.10, which means that (by default) I’m using systemd-resolved to handle DNS resolution.

I’ve long ago gotten over my distaste for systemd, but man, systemd-resolved is pure evil. If you think you understand how Linux DNS resolution works, be prepared to feel dumb. I won’t go into all the reasons why I think what they’ve done with resolution in systemd is evil, but I will say this: no matter what I did, the cert-manager pods seemed to not use my internal DNS server, until I fully disabled (and apt purged) systemd-resolved, and did a whole bunch of other stuff to get resolv.conf back to what anyone who’s used Unix for 30 years would expect.
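
I won’t reproduce every step here, but the shape of what finally worked was roughly the following sketch (the nameserver address is a placeholder for your internal DNS server):

# stop systemd-resolved and keep it from coming back
sudo systemctl disable --now systemd-resolved
# /etc/resolv.conf is normally a symlink into /run/systemd/resolve; make it a real file
sudo rm /etc/resolv.conf
echo "nameserver 10.0.96.2" | sudo tee /etc/resolv.conf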

I actually walked away from this for a while, because it was so frustrating. And in the course of trying to figure out what was wrong, I rebuilt the kube cluster without traefik, and installed metallb and nginx using Grégoire Jeanmart’s helpful articles as a guide. Let me be clear: traefik was NOT the problem, and not even related. My issue was with DNS. But at this point, I’ve got the cluster working with cert-manager, so I think I’m just going to leave it the way it is for now.

Refactoring update


This is going to be kind of a rambling post. Sorry in advance.

I’ve done some reworking of both the hardware and software. These are presented in no particular order.

My original stack had a 1tb external HD wired up via a SATA2-to-USB3 connector, which has proved problematic; I was connecting it to the master node, but I had persistent low-voltage warnings. I’ve got some plans to run the HD off a powered USB hub, but with COVID-19 messing with everyone’s work and shipment schedules, it might be a month before the required parts arrive. So in the meantime, I’ve replaced the drive with a USB flash drive. This is only a 16gb drive, but it’s enough to let me tinker with NFS and persistent volumes.

With that settled, I’ve done some ansible work to get an NFS server configured on the master node, and have NFS clients running on each of the worker nodes. This lets me have a common pool for persistent volume claims to work off. I haven’t actually started using PVCs yet, so no idea how well this will work, but it’s a start.
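
The server side of that is a single line in /etc/exports on the master node. A sketch, with a placeholder export path and subnet:

# /etc/exports on the master node (path and subnet are placeholders)
/data/nfs  10.0.96.0/24(rw,sync,no_subtree_check)

# then re-export and verify
sudo exportfs -ra
sudo exportfs -v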

I’ve added a Rock Pi 4 from Radxa to the stack. Eventually, once the power issues are resolved, my plan is to convert this to a dedicated NAS (perhaps using Open Media Vault) and take the NFS server burden off the master node. The Rock Pi might be a challenge, as support for it seems spotty and it seems to run off images specifically created for it; we will see how this experiment works out. If all else fails, I’ve managed to pick up an Atomic Pi that might serve nicely.

I’ve replaced the BrambleCase with a Cloudlet Cluster case, also from C4Labs. I like the BrambleCase a lot, but the Cloudlet case offers easier access to the boards installed in it, which works better at this phase of the project. I’d still recommend, guardedly, the BrambleCase; it’s a fine piece of engineering, albeit a bit tough to assemble (especially with my near-60 eyes and fingers). I’ve kept the Bramble for some other RPi projects I’m planning.

After struggling with internal name resolution issues, I’ve made two sweeping changes. First, I’ve added a dedicated RPi 4 running pihole. My main reason for doing this is that I’ve got a DNS spoofing requirement, which I’ll cover below. The second change was to systematically disable systemd-resolved on all the Ubuntu 19 systems I’m running (which includes Kepler, my day-to-day Linux desktop, which is built off an old Mac Mini, and which probably deserves a whole series of posts itself). I have had nothing but grief and misfortune with systemd-resolved, and it’s bad enough that I’ve decided to disable it anywhere I can. There are a lot of critiques and defenses of systemd and related projects out on the web; I won’t go into the controversies, because I’m not firmly in either the pro- or anti- camp as far as systemd goes, but systemd-resolved violates the principle of least surprise, and the way it works both obscures DNS resolution and intentionally breaks how the classic resolv.conf/glibc resolver works. Systemd-resolved expects a world where there is one contiguous DNS namespace, and all DNS servers agree on all hosts. That doesn’t work for internal networks, which is basically every corporate network and a lot of home networks as well.

I’ve been trying to follow the series that Lee Carpenter has been doing on his RPi/k3s cluster, but I am hung up on getting cert-manager to work. I’ll update more on those issues in another post once I land on a solution, but the gist of things is: the external interface of my router is not reachable from my internal net. As a result, cert-manager fails its self-check, because the self-check tries to make sure that the ACME challenge URL is reachable (from its container) before it actually forwards the request to letsencrypt. This doesn’t work with “regular” DNS for me, because internet DNS resolves project.kube.thejimnicholson.com to my external IP, and the container (inside my network) can’t reach that IP. To try to solve this, I use dnsmasq internally (via pihole). So far, this hasn’t helped, which I’ve tracked down to one of two things: either CoreDNS is configured wrong in my cluster, or the cert-manager containers are hard-wired to use some external DNS rather than refer back to the node’s DNS configuration for resolving names. I’ll have more to say about this once I’ve solved it.

Installing vcgencmd on Ubuntu 19.10


  • Add the ppa for it: sudo add-apt-repository ppa:ubuntu-raspi2/ppa
  • Edit /etc/apt/sources.list.d/ubuntu-raspi2-ubuntu-ppa-eoan.list. Change eoan to bionic, because bionic is the latest release the PPA supports.
  • Do sudo apt update
  • Do sudo apt install libraspberrypi-bin

This will install vcgencmd.
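
A quick way to confirm the install is to ask the firmware for the SoC temperature or throttling state:

vcgencmd measure_temp     # prints something like temp=48.3'C
vcgencmd get_throttled    # throttled=0x0 means no undervoltage or throttling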

This can be easily automated using ansible:

tasks:
  - name: Set up the ppa for vcgencmd
    become: true
    apt_repository:
        repo: ppa:ubuntu-raspi2/ppa
        codename: bionic

  - name: Install ubuntu raspberry pi support library
    become: true
    apt:
        name: libraspberrypi-bin

Taken from here.

Preparing the system OS image


Download the Ubuntu raspberry pi image from the Ubuntu Server for Raspberry Pi page.

Then use the gnome-disks program to copy it onto an SD card.

And then wait.

Once the image has been written, remove the SD card and then reinsert it (if necessary; about half the time it just mounted the new partitions for me). Then you need to do a few things:

  1. Create an empty file called “ssh” on the system-boot partition. There are any number of ways to do this, but something like touch /media/$USER/system-boot/ssh should work if you like working from the command line.
  2. There will be a file in the system-boot partition named nobtcmd.txt. This file contains the kernel command line options used when linux is started. To work with k3s, you want to add cgroup_memory=1 cgroup_enable=memory to the end of the line (a one-liner for this is sketched after this list).
  3. If you want to overclock your Pi, you should add your overclock parameters to usercfg.txt in the system-boot partition. This is slightly different from the way you do it in Raspbian, where you edit config.txt directly; the Ubuntu boot system has things broken out a bit more.
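
For step 2, since nobtcmd.txt is a single line, the edit can be scripted; a sketch, assuming the card is mounted under /media/$USER as in step 1:

# append the cgroup options to the kernel command line
sed -i 's/$/ cgroup_memory=1 cgroup_enable=memory/' /media/$USER/system-boot/nobtcmd.txt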

You need to do these things for each SD card you’re setting up. Once this is done, you will have a system image that will boot Ubuntu on your cards, and you’re ready to start deploying things.