Kubernetes Security Hardening: CIS Benchmarks and NSA/CISA Guidance
A comprehensive guide to hardening Kubernetes clusters based on the CIS Kubernetes Benchmark and NSA/CISA guidance, covering RBAC, Pod Security Standards, etcd encryption at rest, Falco runtime monitoring, and default-deny network policies.
The NSA and CISA published their Kubernetes Hardening Guide citing container- and cluster-level misconfigurations as the primary source of Kubernetes incidents. Most attacks they documented didn't involve novel exploits; they abused default configurations, overpermissioned service accounts, and absent network segmentation. This guide covers the concrete hardening steps from the CIS Kubernetes Benchmark v1.8 and the NSA/CISA guidance.
Control Plane Hardening
Secure API Server Configuration
The API server is the entry point to all Kubernetes operations. Key hardening flags:
# /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - command:
    - kube-apiserver
    # Disable anonymous authentication
    - --anonymous-auth=false
    # Enable RBAC and Node authorization
    - --authorization-mode=Node,RBAC
    # Require client certificates for kubelet connections
    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
    - --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
    # Enable audit logging
    - --audit-log-path=/var/log/kubernetes/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    # Disable profiling endpoints
    - --profiling=false
    # Enable recommended admission controllers. Do NOT disable the
    # ServiceAccount plugin: it is required for token management,
    # and disabling it is itself a CIS Benchmark finding.
    - --enable-admission-plugins=NodeRestriction,PodSecurity,ResourceQuota,LimitRanger
    # Bind to a specific interface rather than 0.0.0.0
    - --bind-address=10.0.0.1
    # TLS settings
    - --tls-min-version=VersionTLS12
    - --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
Audit Policy Configuration
A comprehensive audit policy captures security-relevant events without logging every list/watch request:
# /etc/kubernetes/audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Log RBAC changes with full request/response bodies
- level: RequestResponse
  resources:
  - group: "rbac.authorization.k8s.io"
    resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
# Log Secret and ConfigMap access at Metadata level only:
# RequestResponse here would write secret payloads into the audit log
- level: Metadata
  resources:
  - group: ""
    resources: ["secrets", "configmaps"]
# Log pod exec, port-forward, and attach
- level: RequestResponse
  verbs: ["create"]
  resources:
  - group: ""
    resources: ["pods/exec", "pods/portforward", "pods/attach"]
# Log all create/update/delete at Metadata level
- level: Metadata
  verbs: ["create", "update", "patch", "delete"]
# Exclude noisy read requests
- level: None
  verbs: ["get", "list", "watch"]
  resources:
  - group: ""
    resources: ["nodes", "pods", "services", "endpoints"]
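With this policy in place, pod-exec events can be pulled out of the audit log with jq. A minimal sketch, where the echoed event stands in for a real line from /var/log/kubernetes/audit.log (field values are illustrative):

```shell
# Extract who exec'd into which pod from audit log events
echo '{"verb":"create","user":{"username":"alice"},"objectRef":{"resource":"pods","subresource":"exec","name":"web-1"}}' |
  jq -r 'select(.objectRef.subresource? == "exec") | "\(.user.username) exec into \(.objectRef.name)"'
```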
Encrypt etcd at Rest
etcd stores all cluster state including Secrets. Without encryption, anyone with access to the etcd data directory can read all secrets:
# /etc/kubernetes/encryption-config.yaml
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
- resources:
  - secrets
  providers:
  - aescbc:
      keys:
      - name: key1
        secret: <base64-encoded-32-byte-key>
  - identity: {} # Fallback for reading unencrypted data (remove after migration)
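The 32-byte key itself can be generated with standard tooling; a sketch, since any cryptographically random 32 bytes, base64-encoded, will do:

```shell
# Generate a random 32-byte key and base64-encode it for the encryption config
head -c 32 /dev/urandom | base64
```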
Apply to kube-apiserver:
- --encryption-provider-config=/etc/kubernetes/encryption-config.yaml
Then re-encrypt all existing secrets:
kubectl get secrets --all-namespaces -o json | kubectl replace -f -
For key rotation, add the new key as the first entry in the keys list, restart the API server, re-encrypt all secrets, then remove the old key.
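During rotation the providers list carries both keys, with the new key listed first so it is used for writes while the old key can still decrypt existing data. A sketch of the mid-rotation state (key names are illustrative):

```yaml
# Mid-rotation: key2 encrypts new writes, key1 still decrypts old data
providers:
- aescbc:
    keys:
    - name: key2  # new key, used for encryption
      secret: <new-base64-encoded-32-byte-key>
    - name: key1  # old key, kept until all secrets are re-encrypted
      secret: <old-base64-encoded-32-byte-key>
```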
RBAC: Least Privilege Service Accounts
Disable Automounting of Service Account Tokens
By default, Kubernetes mounts a service account token in every pod. This is unnecessary for most workloads and provides attackers with API credentials if they compromise a container:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: production
automountServiceAccountToken: false
Or disable at the pod level:
apiVersion: v1
kind: Pod
spec:
  automountServiceAccountToken: false
  serviceAccountName: my-app
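Pods that have not opted out can be found by filtering pod JSON for the absence of the flag. A sketch, where the echoed sample stands in for `kubectl get pods -A -o json` output (note this checks only the pod-level setting, not the service account's):

```shell
# List pods that have not set automountServiceAccountToken: false
echo '{"items":[{"metadata":{"name":"legacy-app"},"spec":{}},{"metadata":{"name":"my-app"},"spec":{"automountServiceAccountToken":false}}]}' |
  jq -r '.items[] | select(.spec.automountServiceAccountToken != false) | .metadata.name'
```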
Create Minimal-Permission Service Accounts
Each workload should have its own service account with only the permissions it needs:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["app-config"] # Scope to specific ConfigMap names
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: config-reader-binding
  namespace: production
subjects:
- kind: ServiceAccount
  name: my-app
  namespace: production
roleRef:
  kind: Role
  name: config-reader
  apiGroup: rbac.authorization.k8s.io
Audit Overpermissioned ClusterRoles
Find service accounts with cluster-admin or wildcard permissions:
# List all cluster-admin bindings
kubectl get clusterrolebindings -o json | \
jq '.items[] | select(.roleRef.name=="cluster-admin") | {name:.metadata.name, subjects:.subjects}'
# Find roles with wildcard verbs or resources
kubectl get clusterroles -o json | \
jq '.items[] | select(.rules[]?.verbs[]? == "*") | .metadata.name'
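The wildcard query above behaves like this against sample data (hypothetical role names, inlined in place of live `kubectl get clusterroles -o json` output):

```shell
# A role with verb "*" is flagged; a narrowly scoped role is not
echo '{"items":[{"metadata":{"name":"too-broad"},"rules":[{"verbs":["*"],"resources":["*"]}]},{"metadata":{"name":"config-reader"},"rules":[{"verbs":["get"],"resources":["configmaps"]}]}]}' |
  jq -r '.items[] | select(.rules[]?.verbs[]? == "*") | .metadata.name'
```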
Pod Security Standards
Pod Security Standards (PSS) replaced PodSecurityPolicy, which was removed in Kubernetes 1.25. The built-in Pod Security Admission controller enforces them at the namespace level via labels, with three profiles:
- Privileged: no restrictions (for trusted system namespaces)
- Baseline: blocks known privilege-escalation vectors
- Restricted: heavily locked down; requires dropping all capabilities and running as non-root
# Apply the restricted profile to the production namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.28
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
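Admission then rejects non-compliant pods in that namespace. For example, a pod like the following (hypothetical) would be refused, because privileged containers violate both the baseline and restricted profiles:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bad-pod
  namespace: production
spec:
  containers:
  - name: app
    image: my-app:1.0
    securityContext:
      privileged: true  # rejected under enforce: restricted
```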
Workloads in this namespace must comply with the restricted profile:
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: app
    image: my-app:1.0
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    resources:
      limits:
        cpu: "500m"
        memory: "128Mi"
      requests:
        cpu: "100m"
        memory: "64Mi"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir: {}
Network Policies: Default-Deny Architecture
Without network policies, all pods can communicate with all other pods across all namespaces. Implement a default-deny policy in every namespace:
# Default deny all ingress and egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
Then explicitly allow required traffic:
# Allow frontend to reach backend on port 8080
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Allow DNS resolution (required for all pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - ports:
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP
Note: Network policies require a CNI plugin that supports them (Calico, Cilium, Weave Net). The built-in kubenet plugin does not enforce network policies.
Runtime Security with Falco
Falco monitors system calls in real-time and alerts on suspicious behavior. It can detect container breakouts, privilege escalation, cryptomining, and data exfiltration:
# Install Falco via Helm
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm install falco falcosecurity/falco \
--namespace falco \
--create-namespace \
--set falco.grpc.enabled=true \
--set falco.grpcOutput.enabled=true \
--set driver.kind=ebpf
Key Falco rules to enable:
# rules/custom.yaml
# Lists referenced by the rules below; adjust to your workloads
- list: allowed_processes
  items: [curl, wget, java, python3]
- list: allowed_ips
  items: ["10.96.0.10"]

# Alert on shell execution inside containers
- rule: Shell Spawned in Container
  desc: Shell was spawned inside a container
  condition: >
    spawned_process and container and
    shell_procs and proc.pname != "sudo"
  output: >
    Shell spawned in container (user=%user.name container=%container.name
    image=%container.image.repository:%container.image.tag
    shell=%proc.name parent=%proc.pname)
  priority: WARNING

# Alert on sensitive file reads
- rule: Read Sensitive File
  desc: An attempt to read sensitive files
  condition: >
    open_read and container and
    (fd.name startswith /etc/shadow or
     fd.name startswith /etc/ssh or
     fd.name = /proc/1/environ)
  output: >
    Sensitive file opened for reading (file=%fd.name user=%user.name
    container=%container.name image=%container.image.repository)
  priority: ERROR

# Alert on network connections to unexpected destinations
- rule: Unexpected Network Outbound
  desc: Container making unexpected outbound connection
  condition: >
    outbound and container and
    not proc.name in (allowed_processes) and
    not fd.sip in (allowed_ips)
  output: >
    Unexpected outbound connection (container=%container.name
    connection=%fd.name ip=%fd.sip port=%fd.sport)
  priority: WARNING
Route Falco alerts to your SIEM via Falcosidekick, which supports Slack, PagerDuty, Elasticsearch, Datadog, and dozens of other outputs.
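Falcosidekick can be enabled through the same Helm chart. A values sketch assuming the chart's `falcosidekick.config.slack` keys (verify against the chart's values.yaml; the webhook URL is a placeholder):

```yaml
# Hypothetical Helm values enabling Falcosidekick with a Slack output
falcosidekick:
  enabled: true
  config:
    slack:
      webhookurl: "https://hooks.slack.com/services/<your-webhook>"
      minimumpriority: "warning"
```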
Image Security
Scan Images in CI/CD
Integrate vulnerability scanning into your pipeline so images with critical CVEs never reach production:
# GitHub Actions example
- name: Scan image with Trivy
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: 'my-registry/my-app:${{ github.sha }}'
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
    exit-code: '1' # Fail the build on critical/high CVEs
Use Distroless or Minimal Base Images
Standard images like ubuntu:latest or python:3.11 contain hundreds of packages that expand the attack surface. Distroless images contain only the application and its runtime dependencies:
# Multi-stage build with a distroless final image
FROM python:3.11-slim AS builder
WORKDIR /app
COPY requirements.txt .
# Install dependencies to a fixed prefix the final image can reference
RUN pip install --prefix=/install -r requirements.txt
COPY . .

FROM gcr.io/distroless/python3-debian12
ENV PYTHONPATH=/install/lib/python3.11/site-packages
COPY --from=builder /install /install
COPY --from=builder /app /app
USER nonroot:nonroot
ENTRYPOINT ["python3", "/app/main.py"]
Enforce Image Signing with Sigstore
Use cosign to sign images in CI/CD and verify signatures at admission time with Sigstore's policy-controller or a policy engine like Kyverno (Kubernetes has no built-in signature verification):
# Sign the image by digest after build
cosign sign --key cosign.key my-registry/my-app@sha256:<digest>
# Verify in a Kyverno policy
kubectl apply -f - <<EOF
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: verify-image-signature
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-image-signature
    match:
      any:
      - resources:
          kinds:
          - Pod
    verifyImages:
    - imageReferences: ["my-registry/*"]
      attestors:
      - entries:
        - keys:
            publicKeys: |
              -----BEGIN PUBLIC KEY-----
              ...
              -----END PUBLIC KEY-----
EOF
CIS Benchmark Automated Scanning
Run kube-bench to audit your cluster configuration against CIS Kubernetes Benchmark:
kubectl apply -f https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
kubectl logs -f job/kube-bench
kube-bench checks over 100 controls covering API server flags, etcd configuration, kubelet settings, and RBAC configuration, mapping each finding to CIS control IDs.
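Failures can be tallied from the job output. A sketch, with a sample summary inlined in place of the real `kubectl logs` output (the exact summary format may vary between kube-bench versions):

```shell
# Count failing checks from a kube-bench summary
summary='== Summary total ==
65 checks PASS
12 checks FAIL
31 checks WARN
0 checks INFO'
echo "$summary" | awk '/checks FAIL/ {print $1}'
```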
The most impactful controls to prioritize: etcd encryption at rest (largest blast radius if compromised), RBAC with minimal service account permissions (the most commonly misconfigured area), default-deny network policies (stops lateral movement), and Falco runtime monitoring (detects active exploitation). These four address the majority of the attack patterns documented in the NSA/CISA guide.