Cluster external iSCSI initiator to Longhorn volume target, via Cilium's CEW feature
14 December, 2021
This post was partly prompted by the realisation that it’s nearly 2022 and I haven’t posted anything for the year…
Anyway, here’s a snappily-titled post that demos a cool feature of Cilium - Cluster External Workloads - that lets you extend cluster networking to external clients, i.e. other virtual machines, so that they can access resources hosted in Kubernetes. I’m a big fan of Cilium: aside from all the security and observability benefits, it also has a lot of cool features such as this one that help bridge the gap between the ‘old world’ and the Cloud Native way of doing things 🌠
For this example we’re going to make use of a Longhorn feature that lets you connect any iSCSI initiator to a Longhorn Volume as a target. This particular use case was prompted by a data recovery scenario: perhaps you have a VM outside of your Kubernetes cluster to which you’d like to present a Kubernetes PV.
Cluster bring-up
I’ve created four virtual machines for my cluster - one as a controlplane / etcd host, and three workers. I’m going to use the venerable RKE for this, so I just need to craft a simple cluster.yaml and run rke up:
cluster_name: cilium
ssh_agent_auth: true
ignore_docker_version: true
nodes:
  - address: 192.168.20.30
    user: nick
    role:
      - controlplane
      - etcd
  - address: 192.168.20.181
    user: nick
    role:
      - worker
  - address: 192.168.20.79
    user: nick
    role:
      - worker
  - address: 192.168.20.184
    user: nick
    role:
      - worker
kubernetes_version: v1.20.9-rancher1-1
ingress:
  provider: nginx
network:
  plugin: none
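With that saved, bringing the cluster up is just a case of running rke up and pointing kubectl at the kubeconfig it generates (the exact kubeconfig filename depends on your config file name and RKE version, so adjust as needed):
$ rke up --config cluster.yaml
$ export KUBECONFIG=$PWD/kube_config_cluster.yaml
$ kubectl get nodes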
Once the cluster is up, install Cilium with the CEW feature enabled:
$ helm repo add cilium https://helm.cilium.io/
$ helm repo update
$ helm install cilium cilium/cilium --version 1.9.9 \
--namespace kube-system \
--set externalWorkloads.enabled=true
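To sanity-check the rollout (and confirm that the clustermesh-apiserver, which external workloads register with, is present), something like this does the trick:
$ kubectl -n kube-system get pods -l k8s-app=cilium
$ kubectl -n kube-system get deploy,svc clustermesh-apiserver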
I like to have a VIP to make network access to nodes in my cluster highly available, so for this I install kube-karp:
$ git clone https://github.com/immanuelfodor/kube-karp
$ cd kube-karp/helm
$ helm install kube-karp . \
--set envVars.virtualIp=192.168.20.200 \
--set envVars.interface=eth0 \
-n kube-karp --create-namespace
$ ping -c 2 192.168.20.200
PING 192.168.20.200 (192.168.20.200) 56(84) bytes of data.
64 bytes from 192.168.20.200: icmp_seq=1 ttl=63 time=1.82 ms
64 bytes from 192.168.20.200: icmp_seq=2 ttl=63 time=2.15 ms
--- 192.168.20.200 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 1.821/1.983/2.146/0.162 ms
This is the IP I’ll use in the next step when configuring Cilium on my cluster external VM.
Configure external workload
I’ve created another VM which won’t be part of my Kubernetes cluster, and it’s called longhorn-client with an IP of 192.168.20.171. This VM has the open-iscsi package installed.
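If you need to install that yourself, it’s roughly the following, depending on your distro (run as root):
$ apt-get install -y open-iscsi    # Debian/Ubuntu
$ zypper install open-iscsi        # openSUSE
$ systemctl enable --now iscsid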
Create and apply the CiliumExternalWorkload resource definition. The name needs to match the hostname of the external VM:
$ cat longhorn-cew.yaml
apiVersion: cilium.io/v2
kind: CiliumExternalWorkload
metadata:
  name: longhorn-client
  labels:
    io.kubernetes.pod.namespace: default
spec:
  ipv4-alloc-cidr: 10.192.1.0/30
$ kubectl apply -f longhorn-cew.yaml
ciliumexternalworkload.cilium.io/longhorn-client created
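At this point nothing has registered against the resource yet, so if you list it the IP column should be empty; it gets populated once the VM joins in a couple of steps’ time:
$ kubectl get cew longhorn-client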
Grab the TLS keys necessary for external workloads to authenticate with Cilium in our cluster, and scp them to our VM:
$ curl -LO https://raw.githubusercontent.com/cilium/cilium/v1.9/contrib/k8s/extract-external-workload-certs.sh
$ chmod +x extract-external-workload-certs.sh
$ ./extract-external-workload-certs.sh
$ ls external*
external-workload-ca.crt external-workload-tls.crt external-workload-tls.key
$ scp external* 192.168.20.171:
Warning: Permanently added '192.168.20.171' (ED25519) to the list of known hosts.
external-workload-ca.crt 100% 1151 497.8KB/s 00:00
external-workload-tls.crt 100% 1123 470.4KB/s 00:00
external-workload-tls.key 100% 1675 636.3KB/s 00:00
Install Cilium on external VM
On the cluster external VM, run the following, adjusting CLUSTER_ADDR for your setup (in my case it’s the VIP):
$ curl -LO https://raw.githubusercontent.com/cilium/cilium/v1.9/contrib/k8s/install-external-workload.sh
$ chmod +x install-external-workload.sh
$ docker pull cilium/cilium:v1.9.9
$ CLUSTER_ADDR=192.168.20.200 CILIUM_IMAGE=cilium/cilium:v1.9.9 ./install-external-workload.sh
After a few seconds this last command should return, and then you can verify connectivity as follows:
$ cilium status
KVStore: Ok etcd: 1/1 connected, lease-ID=7c027b34bb8c593c, lock lease-ID=7c027b34bb8c593e, has-quorum=true: https://clustermesh-apiserver.cilium.io:32379 - 3.4.13 (Leader)
Kubernetes: Disabled
Cilium: Ok 1.9.9 (v1.9.9-5bcf83c)
NodeMonitor: Disabled
Cilium health daemon: Ok
IPAM: IPv4: 2/3 allocated from 10.192.1.0/30, IPv6: 2/4294967295 allocated from f00d::aab:0:0:0/96
BandwidthManager: Disabled
Host Routing: Legacy
Masquerading: IPTables
Controller Status: 18/18 healthy
Proxy Status: OK, ip 10.192.1.2, 0 redirects active on ports 10000-20000
Hubble: Disabled
Cluster health: 5/5 reachable (2021-08-11T11:34:42Z)
And on the Kubernetes side, if you look at the status of the CEW resource we created, you’ll see it now has the external VM’s IP address:
$ kubectl get cew longhorn-client
NAME CILIUM ID IP
longhorn-client 57664 192.168.20.171
From our VM, an additional test is that we should now be able to resolve cluster-internal FQDNs. The install-external-workload.sh script should’ve updated /etc/resolv.conf for us, but note that you might also need to disable systemd-resolved or netconfig (in my case) so that the update doesn’t get clobbered (a rough sketch of that follows the dig test below). If it’s working, you can test it by running the following command:
$ dig +short clustermesh-apiserver.kube-system.svc.cluster.local
10.43.234.249
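If the resolv.conf update does keep getting clobbered, one rough way to stop that on a systemd-resolved based distro is, as root, something like the below (on openSUSE it’s netconfig that needs taming instead, via NETCONFIG_DNS_POLICY in /etc/sysconfig/network/config) - treat it as a sketch rather than a recipe:
$ systemctl disable --now systemd-resolved
$ rm /etc/resolv.conf        # remove the stub-resolver symlink if present
# put back the cluster nameserver the install script wrote, e.g. RKE's default kube-dns ClusterIP
# (check with: kubectl -n kube-system get svc kube-dns)
$ echo "nameserver 10.43.0.10" | tee /etc/resolv.conf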
Install Longhorn
Install Longhorn with the default settings into our target cluster:
$ helm repo add longhorn https://charts.longhorn.io
$ helm repo update
$ helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace
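It’s worth waiting for the pods to settle, and if you don’t have an Ingress pointed at the UI yet, a port-forward to the longhorn-frontend service is one quick way to reach it:
$ kubectl -n longhorn-system get pods
$ kubectl -n longhorn-system port-forward svc/longhorn-frontend 8080:80
The UI is then available at http://localhost:8080.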
With Longhorn rolled out, use the UI to create a new volume, set the frontend to be ‘iSCSI’, and then make sure it’s attached to a host in your cluster. Verify its status:
$ kubectl get lhv test -n longhorn-system
NAME STATE ROBUSTNESS SCHEDULED SIZE NODE AGE
test attached healthy True 21474836480 192.168.20.184 25s
Grab the iSCSI endpoint for this volume either via the UI (under ‘Volume Details’) or via kubectl:
$ kubectl get lhe -n longhorn-system
NAME STATE NODE INSTANCEMANAGER IMAGE AGE
test-e-b8eb676b running 192.168.20.184 instance-manager-e-baea466a longhornio/longhorn-engine:v1.1.2 17m
$ kubectl get lhe test-e-b8eb676b -n longhorn-system -o jsonpath='{.status.endpoint}'
iscsi://10.0.2.24:3260/iqn.2019-10.io.longhorn:test/1
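Note that 10.0.2.24 is a cluster-internal pod IP (served by the instance-manager pod listed above), which an outside VM normally wouldn’t be able to reach at all; because longhorn-client has joined the Cilium network, it can. A quick sanity check from the VM:
$ ping -c 2 10.0.2.24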
Connect iSCSI initiator (client) to Longhorn volume
Now let’s try and connect our cluster-external VM to the Longhorn volume in our target cluster:
$ iscsiadm --mode discoverydb --type sendtargets --portal 10.0.2.24 --discover
10.0.2.24:3260,1 iqn.2019-10.io.longhorn:test
$ iscsiadm --mode node --targetname iqn.2019-10.io.longhorn:test --portal 10.0.2.24:3260 --login
$ iscsiadm --mode node
10.0.2.24:3260,1 iqn.2019-10.io.longhorn:test
This volume is now available as /dev/sdb in my case:
$ journalctl -xn 100 | grep sdb
Aug 12 09:05:11 longhorn-client kernel: sd 3:0:0:1: [sdb] 41943040 512-byte logical blocks: (21.5 GB/20.0 GiB)
Aug 12 09:05:11 longhorn-client kernel: sd 3:0:0:1: [sdb] Write Protect is off
Aug 12 09:05:11 longhorn-client kernel: sd 3:0:0:1: [sdb] Mode Sense: 69 00 10 08
Aug 12 09:05:11 longhorn-client kernel: sd 3:0:0:1: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
Aug 12 09:05:11 longhorn-client kernel: sd 3:0:0:1: [sdb] Attached SCSI disk
$ lsblk /dev/sdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sdb 8:16 0 20G 0 disk
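From here it’s ordinary block-device territory. For a recovery scenario you’d typically mount the volume’s existing filesystem read-only and copy data off; the mount point is arbitrary and the device name is whatever yours ended up as:
$ mkdir -p /mnt/longhorn-test
$ mount -o ro /dev/sdb /mnt/longhorn-test
$ ls /mnt/longhorn-test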
NB: in the current implementation, the path to this iSCSI target (endpoint) is not highly available. It’s useful for some ad-hoc data access and recovery, but you cannot rely on this approach for anything beyond that for now.
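When you’re finished, unmount and log the initiator out before detaching or deleting the volume on the Longhorn side:
$ umount /mnt/longhorn-test
$ iscsiadm --mode node --targetname iqn.2019-10.io.longhorn:test --portal 10.0.2.24:3260 --logout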