Running the Rancher CIS Operator on any Kubernetes cluster

Rancher 2.5 has ushered in a bunch of changes, and some functionality, such as backups and CIS scans, has been moved out into standalone Operators. It’s possible to make use of these on any Kubernetes cluster, not just one that’s been deployed and managed via Rancher, so this post details the steps needed to deploy the CIS Operator specifically, run a scan, and view the results.

First of all, here’s my cluster deployed in AWS. It’s a four-node cluster, deployed using RKE, with pretty much the defaults:

$ kubectl get nodes
NAME                                         STATUS   ROLES               AGE     VERSION
ip-172-31-13-88.eu-west-2.compute.internal   Ready    worker              2m50s   v1.19.2
ip-172-31-4-76.eu-west-2.compute.internal    Ready    worker              2m50s   v1.19.2
ip-172-31-6-16.eu-west-2.compute.internal    Ready    worker              2m50s   v1.19.2
ip-172-31-8-152.eu-west-2.compute.internal   Ready    controlplane,etcd   2m51s   v1.19.2

Install the Operator using the official Rancher Helm charts:

$ helm repo add rancher https://charts.rancher.io
$ helm repo update
$ helm install rancher-cis-benchmark-crd rancher/rancher-cis-benchmark-crd \
  --create-namespace -n cis-operator-system
$ helm install rancher-cis-benchmark rancher/rancher-cis-benchmark \
  -n cis-operator-system
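
If you want to double-check that both releases went in cleanly, Helm can list them:

$ helm ls -n cis-operator-system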

At this point we have some objects created in the cis-operator-system namespace, as well as some new custom resource definitions that we can examine:

$ kubectl get crds | grep cis
clusterscanbenchmarks.cis.cattle.io                   2020-10-22T09:40:57Z
clusterscanprofiles.cis.cattle.io                     2020-10-22T09:40:57Z
clusterscanreports.cis.cattle.io                      2020-10-22T09:40:57Z
clusterscans.cis.cattle.io                            2020-10-22T09:40:57Z

$ kubectl get all -n cis-operator-system
NAME                                READY   STATUS    RESTARTS   AGE
pod/cis-operator-5cc97bd778-4t45g   1/1     Running   0          10m

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cis-operator   1/1     1            1           10m

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/cis-operator-5cc97bd778   1         1         1       10m

$ kubectl get clusterscanbenchmarks
NAME                     CLUSTERPROVIDER   MINKUBERNETESVERSION   MAXKUBERNETESVERSION
cis-1.5                                    1.15.0
eks-1.0                  eks               1.15.0
gke-1.0                  gke               1.15.0
rke-cis-1.5-hardened     rke               1.15.0
rke-cis-1.5-permissive   rke               1.15.0

$ kubectl get clusterscanprofiles
NAME                     BENCHMARKVERSION
cis-1.5-profile          cis-1.5
eks-profile              eks-1.0
gke-profile              gke-1.0
rke-profile-hardened     rke-cis-1.5-hardened
rke-profile-permissive   rke-cis-1.5-permissive
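
A profile is a thin wrapper around a benchmark: its spec names a benchmark version and, optionally, a list of tests to skip. Dumping one out shows how the pieces fit together:

$ kubectl get clusterscanprofile rke-profile-permissive -o yaml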

Let’s use the built-in rke-profile-permissive profile to perform a scan in accordance with version 1.5 of the CIS Kubernetes Benchmark. We’ll create an object of kind ClusterScan that references that profile:

$ kubectl apply -f - << EOF
---
apiVersion: cis.cattle.io/v1
kind: ClusterScan
metadata:
  name: rke-cis
spec:
  scanProfileName: rke-profile-permissive
EOF
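
As an aside, a ClusterScan defined like this runs once, but the spec also supports scheduled scans. Here’s a minimal sketch, with the scheduledScanConfig field names taken from the operator’s CRD at the time of writing, so it’s worth verifying them against the version you’ve installed:

$ kubectl apply -f - << EOF
---
apiVersion: cis.cattle.io/v1
kind: ClusterScan
metadata:
  name: rke-cis-nightly
spec:
  scanProfileName: rke-profile-permissive
  scheduledScanConfig:
    cronSchedule: "0 2 * * *"   # run at 02:00 every day
    retentionCount: 3           # keep the last three reports
EOF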

Back to our one-off scan: we can check its status via the clusterscans custom resource:

$ kubectl get clusterscans
NAME      CLUSTERSCANPROFILE       TOTAL   PASS   FAIL   SKIP   NOT APPLICABLE   LASTRUNTIMESTAMP
rke-cis   rke-profile-permissive                                                 2020-10-22T10:02:53Z
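
The columns stay empty until the run finishes, so rather than polling we can watch the resource until the totals appear:

$ kubectl get clusterscans -w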

While it’s running, if you check the cis-operator-system namespace you’ll see that a number of Pods have been launched to do the actual work of scanning our cluster:

$ kubectl get pods -n cis-operator-system
NAME                                                            READY   STATUS              RESTARTS   AGE
cis-operator-5cc97bd778-4t45g                                   1/1     Running             0          21m
security-scan-runner-rke-cis-kbqfl                              1/1     Running             0          20s
sonobuoy-rancher-kube-bench-daemon-set-8016e26168744e62-2662m   0/2     ContainerCreating   0          8s
sonobuoy-rancher-kube-bench-daemon-set-8016e26168744e62-2dw7q   2/2     Running             0          8s
sonobuoy-rancher-kube-bench-daemon-set-8016e26168744e62-hq4mt   0/2     ContainerCreating   0          8s
sonobuoy-rancher-kube-bench-daemon-set-8016e26168744e62-qbggn   0/2     ContainerCreating   0          8s

After a minute or so, our scan should run to completion:

$ kubectl get clusterscans
NAME      CLUSTERSCANPROFILE       TOTAL   PASS   FAIL   SKIP   NOT APPLICABLE   LASTRUNTIMESTAMP
rke-cis   rke-profile-permissive   92      58     0      0      34               2020-10-22T10:02:53Z

And now we can check out the report:

$ kubectl get clusterscanreports
NAME                  LASTRUNTIMESTAMP                                            BENCHMARKVERSION
scan-report-rke-cis   2020-10-22 10:03:26.744176873 +0000 UTC m=+1304.643435643   rke-cis-1.5-permissive

The report itself is in JSON:

$ kubectl get clusterscanreport scan-report-rke-cis -o jsonpath="{.spec.reportJSON}" | jq
{
  "version": "rke-cis-1.5-permissive",
  "total": 92,
  "pass": 58,
  "fail": 0,
  "skip": 0,

✂️ ---------
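
The truncated part of the report holds the detail: a results array, with each entry containing the checks for one area of the benchmark. Knowing that, jq can answer targeted questions, for example pulling out just the failing checks:

$ kubectl get clusterscanreport scan-report-rke-cis -o jsonpath="{.spec.reportJSON}" \
  | jq '[.results[].checks[] | select(.state == "fail") | {id, description}]'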

Piping the output through jq tidies things up and is handy for one-off queries like that, but the full report still isn’t particularly consumable in the terminal. Obviously the output is designed to be parsed and displayed by something else (i.e. Rancher, duh), but we can also quickly tidy it up with a bit of Python that dumps the output into tables:

#!/usr/bin/env python3

import sys
import json
from prettytable import PrettyTable

# Read the report JSON from stdin, as produced by
# `kubectl get clusterscanreport ... -o jsonpath="{.spec.reportJSON}"`
data = json.load(sys.stdin)

resultsTable = PrettyTable()
summaryTable = PrettyTable()

resultsTable.field_names = ["ID", "Area", "Description", "Result"]
resultsTable.align = "l"
resultsTable.sortby = "ID"

# Each entry in "results" is one area of the benchmark, containing
# the individual checks and their pass/fail/skip/notApplicable state
for r in data["results"]:
    for c in r["checks"]:
        resultsTable.add_row(
            [c["id"], r["description"], c["description"], c["state"]])

# The top-level fields carry the overall tallies for the scan
summaryTable.field_names = ["Total", "Pass", "Fail", "Skip", "N/A"]
summaryTable.align = "r"
summaryTable.add_row(
    [data["total"], data["pass"], data["fail"],
        data["skip"], data["notApplicable"]]
)

print(resultsTable)
print(summaryTable)
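
The script’s only third-party dependency is prettytable, which can be installed from PyPI:

$ pip3 install prettytable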

Finally, if we save that as ~/tmp/scanreport.py and make it executable, we can pipe the output of the previous command, minus jq, through it and see our results:

$ kubectl get clusterscanreport scan-report-rke-cis -o jsonpath="{.spec.reportJSON}" | ~/tmp/scanreport.py

+--------+----------------------------------+-------------------------------------------------------------------------------------------------------------------+---------------+
| ID     | Area                             | Description                                                                                                       | Result        |
+--------+----------------------------------+-------------------------------------------------------------------------------------------------------------------+---------------+
| 1.1.1  | Master Node Configuration Files  | Ensure that the API server pod specification file permissions are set to 644 or more restrictive (Scored)         | notApplicable |
| 1.1.11 | Master Node Configuration Files  | Ensure that the etcd data directory permissions are set to 700 or more restrictive (Scored)                       | pass          |
| 1.1.12 | Master Node Configuration Files  | Ensure that the etcd data directory ownership is set to etcd:etcd (Scored)                                        | notApplicable |
| 1.1.13 | Master Node Configuration Files  | Ensure that the admin.conf file permissions are set to 644 or more restrictive (Scored)                           | notApplicable |
| 1.1.14 | Master Node Configuration Files  | Ensure that the admin.conf file ownership is set to root:root (Scored)                                            | notApplicable |
| 1.1.15 | Master Node Configuration Files  | Ensure that the scheduler.conf file permissions are set to 644 or more restrictive (Scored)                       | notApplicable |
| 1.1.16 | Master Node Configuration Files  | Ensure that the scheduler.conf file ownership is set to root:root (Scored)                                        | notApplicable |
| 1.1.17 | Master Node Configuration Files  | Ensure that the controller-manager.conf file permissions are set to 644 or more restrictive (Scored)              | notApplicable |

✂️ --------

+-------+------+------+------+-----+
| Total | Pass | Fail | Skip | N/A |
+-------+------+------+------+-----+
|    92 |   58 |    0 |    0 |  34 |
+-------+------+------+------+-----+