Community post originally published on Neon Mirrors by Chip Zoller
In real life, imposed rules often need exceptions, granted on a case-by-case basis. Policy is no different. While prevention of objectively “bad” behavior should be commonplace and enforced as widely as possible, there are valid situations where a rule may need to be bent slightly. I’ve covered how some of these exceptions work in Kyverno in the past, but I also wanted to explore the possibility of creating some sort of “self-driving” exception system, even if just conceptual in nature. In this blog, I’ll share a fun little concept project I concocted on how to use Kyverno to implement a one-time pass code system for allowing these exceptions. It’s probably not highly practical, but it does give you a sense of what’s possible and just how powerful and flexible Kyverno can be in delivering even semi-crazy use cases like this one.
Chances are high you’re using some sort of validation policies in your cluster if you’re reading this article. And chances are also pretty high that at least one of those policies is in Enforce mode which, as you probably know, will prevent a “bad” resource from being created should it violate one or more rules in the policy. There are several ways to provide exceptions in Kyverno. One is to define an exclude block in a rule and list the exceptions there. Another is to define them centrally in another Kubernetes resource like a ConfigMap. Yet another is to use the formal PolicyException resource introduced in Kyverno 1.9. These are all really useful mechanisms that you should try to employ. But what if in some situations you just wanted to be mostly hands off and provide a bit looser control? What if you could just let developers and other users know how they can get around policy but still with some form of an access system? I thought I’d play around with that idea a bit and wanted to see if I could do something like a one-time pass code system for Kyverno. It turns out that because of the amazing flexibility and power of Kyverno, not only can this be done but it really wasn’t that difficult!
At the end of the day, the idea is this: provide a unique one-time pass code (OTP) back to a user if their resource is blocked by a validate rule, but ensure that the code and its use are documented so they can be audited. And, obviously, prevent any code from being used more than once.
With a combination of a couple of different Kyverno policies which use both validation and mutation for existing resources, this is all possible. The full sequence of how I wanted this to work is shown below.
And here’s how to put this together.
First, we’ll need a Namespace, which I’m calling platform, in which to put our ConfigMap used as the OTP journal. Obviously, in a case where, for some reason, you wanted to implement this in a “real” environment, you’d absolutely want to protect this with RBAC so users can’t read it. This ConfigMap has a key called codes with just some starter codes to give you an idea of the formatting and sample contents.
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
Next, we need to create the validation rules. There are two rules in this policy.
- The invalid-otp rule is universal and not tied to any specific rule or other policy. It matches the creation of Deployments which have the otp label set and denies the request if that code is not in the journal or has already been consumed. This will come into play later.
- The host-namespaces-otp rule is just an existing rule from the Pod Security Standards collection of Kyverno policies which has been slightly modified to look up codes from the ConfigMap mentioned earlier. You’ll see that the OTP code is actually created in the message field of this rule. This is important because in the next phase, we’ll harvest this information to be the input driver for the ConfigMap.
Also, notice how I’ve used spec.applyRules: One in this policy and ordered the rules such that invalid-otp is first. This is to prevent creation of yet another OTP if a user either specifies an invalid one or a code which has already been consumed. Although OTP codes will be generated automatically any time there is a Deployment which fails the host-namespaces-otp rule, we only want a code to be generated when they aren’t trying to specify one in the first place.
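To make the rule ordering concrete, here is a minimal Python sketch of the decision flow the two rules produce together. It is only an illustration of the logic described above, not how Kyverno evaluates policies internally. The function name `evaluate`, its parameters, and the returned strings are hypothetical, and the journal is modeled as a plain list of strings.

```python
from typing import List, Optional


def evaluate(otp_label: Optional[str], journal: List[str], shares_host_namespaces: bool) -> str:
    """Model which rule decides a Deployment CREATE request (illustrative only)."""
    # Rule 1, invalid-otp: only matches when the otp label is set.
    # It denies when the label value is not an entry in the journal.
    if otp_label and otp_label not in journal:
        return "denied by invalid-otp"

    # Rule 2, host-namespaces-otp: its precondition skips the rule when the
    # label holds a valid, unused code. Otherwise the pattern check runs, and
    # a failure returns a message carrying a freshly generated code.
    if (otp_label or "") not in journal and shares_host_namespaces:
        return "denied by host-namespaces-otp (new code issued)"

    return "allowed"
```

Because the first rule short-circuits, a request carrying a bad or already consumed code never reaches the rule that generates new codes. A consumed code fails the membership check simply because its journal entry has had extra text appended to it.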
Below is the full validation policy.
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata:
  name: disallow-host-namespaces-otp
spec:
  validationFailureAction: Enforce
  background: false
  applyRules: One
  rules:
    - name: invalid-otp
      match:
        any:
        - resources:
            kinds:
              - Deployment
            operations:
              - CREATE
            selector:
              matchLabels:
                otp: "?*"
      context:
      - name: otp
        configMap:
          name: otp
          namespace: platform
      preconditions:
        all:
        - key: "{{ request.object.metadata.labels.otp }}"
          operator: AnyNotIn
          value: "{{ parse_yaml(otp.data.codes) }}"
      validate:
        message: The code {{ request.object.metadata.labels.otp }} is invalid or has already been used.
        deny: {}
    - name: host-namespaces-otp
      match:
        any:
        - resources:
            kinds:
              - Deployment
            operations:
              - CREATE
      context:
      - name: otp
        configMap:
          name: otp
          namespace: platform
      preconditions:
        all:
        - key: "{{ request.object.metadata.labels.otp || '' }}"
          operator: AnyNotIn
          value: "{{ parse_yaml(otp.data.codes) }}"
      validate:
        message: >-
          Sharing the host namespaces is disallowed. The fields spec.hostNetwork,
          spec.hostIPC, and spec.hostPID must be unset or set to `false`. To get around this,
          you may use a one-time pass code "{{ random('[0-9a-z]{8}') }}" assigned as the value of
          a label with key "otp". Use of this code will be recorded along with your username.
        pattern:
          spec:
            template:
              spec:
                =(hostPID): false
                =(hostIPC): false
                =(hostNetwork): false
The net effect here is if a user tries to create a “bad” Deployment which violates the host-namespaces-otp rule, it’ll block them but return a message containing the OTP code and how to use it. Notice also how I’m warning in the message that, if you use this code, it’ll be recorded for audit purposes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:
resource Deployment/default/busybox was blocked due to the following policies
disallow-host-namespaces-otp:
host-namespaces-otp: 'validation error: Sharing the host namespaces is disallowed.
The fields spec.hostNetwork, spec.hostIPC, and spec.hostPID must be unset or set
to `false`. To get around this, you may use a one-time pass code "ee4co4k8" assigned
as the value of a label with key "otp". Use of this code will be recorded along
with your username. rule host-namespaces-otp failed at path /spec/template/spec/hostIPC/'
Next, we need to implement the ConfigMap management system so that OTP codes are added when they need to be and removed upon first use. This was the fun part. Let me explain how this works.
First, in the add-otp rule, in order to dynamically add the OTP codes to the ConfigMap, we’re parsing them out of the Event Kyverno generates whenever there’s a blocked resource. This Event, which is just a standard Kubernetes v1 Event, contains the message which in turn contains the OTP we saw earlier. Since Kyverno can match on these Events (you will need to update your resource filter to allow this), we can use that specific Event as the trigger for a mutate-existing rule on our ConfigMap.
Note: if you remove the Event resource filter you will increase the processing load on Kyverno which will, in turn, require more resources.
With this OTP code extracted from the message, we can append it to the ConfigMap.
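As a rough illustration of what the rule's JMESPath expression `split(request.object.message,'"') | [1]` and the subsequent patch do, here is a small Python sketch. It uses ordinary string operations rather than Kyverno's own functions, and the helper names `extract_otp` and `append_code` are hypothetical. The sample message is shortened from the error output shown earlier.

```python
def extract_otp(message: str) -> str:
    """Return the first double-quoted token in the message."""
    # Splitting on the double quote puts the text between the first pair of
    # quotes at index 1, which is where the policy message places the code.
    return message.split('"')[1]


def append_code(codes: str, otp: str) -> str:
    """Append a new list item to the journal text, mirroring the patch."""
    return f"{codes}\n- {otp}"


message = (
    'To get around this, you may use a one-time pass code "ee4co4k8" assigned '
    'as the value of a label with key "otp".'
)
journal = "- ua8v92pg\n- 9akvm2o7"
updated = append_code(journal, extract_otp(message))
print(updated)
```

Note that this approach depends on the code being the first quoted string in the message. If the message text were changed so that something else was quoted earlier, the index would need to change too.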
Second, in the manage-otp rule, we’re watching for the creation of Deployments that set the otp label and, if that value is valid, we’re modifying its entry in the ConfigMap to record the timestamp and the username of the actor who consumed it. This serves a dual purpose: because this information has been appended, the entry no longer matches the bare code, so the code itself is invalidated. That is much better than simply deleting the code from the list, since the record of its use is kept for auditing.
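The invalidation-by-append idea can be sketched in a few lines of Python. This is an approximation using plain string replacement in place of Kyverno's `replace_all()`, with a simple line parser standing in for `parse_yaml()`. The helper names `parse_codes` and `consume` are hypothetical.

```python
from typing import List


def parse_codes(codes: str) -> List[str]:
    """Read the '- item' lines of the journal into a list."""
    return [line[2:].strip() for line in codes.splitlines() if line.startswith("- ")]


def consume(codes: str, otp: str, timestamp: str, username: str) -> str:
    """Rewrite the entry for otp so it carries a usage record."""
    return codes.replace(otp, f"{otp}-{timestamp}-{username}")


journal = "- ua8v92pg\n- 9akvm2o7\n- 1t1h360g"
after = consume(journal, "1t1h360g", "2023-06-21T15:04:59Z", "czoller")

print("1t1h360g" in parse_codes(journal))  # the code is valid before use
print("1t1h360g" in parse_codes(after))    # and no longer matches afterwards
```

Once the entry has been rewritten, an exact-match lookup for the bare code fails, which is what makes the code single-use without removing its history.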
Below is the second policy with both rules.
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata:
  name: manage-otp-list
spec:
  rules:
    - name: add-otp
      match:
        any:
        - resources:
            kinds:
              - v1/Event
            names:
              - "disallow-host-namespaces-otp.?*"
      preconditions:
        all:
        - key: "{{ request.object.reason }}"
          operator: Equals
          value: PolicyViolation
        - key: "{{ contains(request.object.message, 'one-time pass code') }}"
          operator: Equals
          value: true
      context:
      - name: otp
        variable:
          jmesPath: split(request.object.message,'"') | [1]
      mutate:
        targets:
          - apiVersion: v1
            kind: ConfigMap
            name: otp
            namespace: platform
        patchStrategicMerge:
          data:
            codes: |-
              {{ @ }}
              - {{ otp }}
    - name: manage-otp
      match:
        any:
        - resources:
            kinds:
              - Deployment
            operations:
              - CREATE
            selector:
              matchLabels:
                otp: "?*"
      context:
      - name: otp
        configMap:
          name: otp
          namespace: platform
      preconditions:
        all:
        - key: "{{ request.object.metadata.labels.otp }}"
          operator: AnyIn
          value: "{{ parse_yaml(otp.data.codes) }}"
      mutate:
        targets:
          - apiVersion: v1
            kind: ConfigMap
            name: otp
            namespace: platform
            context:
            - name: used
              variable:
                jmesPath: replace_all(target.data.codes,'{{request.object.metadata.labels.otp}}','{{request.object.metadata.labels.otp}}-{{time_now_utc()}}-{{request.userInfo.username}}')
        patchStrategicMerge:
          data:
            codes: |-
              {{ used }}
Try it out with a Deployment which uses a code issued this way (1t1h360g in this example).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
    otp: 1t1h360g
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
When a valid code is consumed, Kyverno will update the ConfigMap to transform this
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g
into this
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g-2023-06-21T15:04:59Z-czoller
Alright, let’s try it out end-to-end and see this whole thing work!
Create a “bad” Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:
resource Deployment/default/busybox was blocked due to the following policies
disallow-host-namespaces-otp:
host-namespaces-otp: 'validation error: Sharing the host namespaces is disallowed.
The fields spec.hostNetwork, spec.hostIPC, and spec.hostPID must be unset or set
to `false`. To get around this, you may use a one-time pass code "uq1s17g8" assigned
as the value of a label with key "otp". Use of this code will be recorded along
with your username. rule host-namespaces-otp failed at path /spec/template/spec/hostIPC/'
Let’s use the code uq1s17g8 just provided.
I’ll take the same “bad” Deployment and add that as the value of a label called otp.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
    otp: uq1s17g8
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
deployment.apps/busybox created
Let’s ensure someone cannot use this same code a second time, so we’ll delete the Deployment we just created.
$ kubectl delete deploy busybox
deployment.apps "busybox" deleted
And try to create the same exact Deployment once again.
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:
resource Deployment/default/busybox was blocked due to the following policies
disallow-host-namespaces-otp:
invalid-otp: The code uq1s17g8 is invalid or has already been used.
There you can see that the same code uq1s17g8 is now flagged as invalid since it was used once before.
As a privileged cluster admin, we can also check our otp ConfigMap and see who used a code and when.
$ kubectl -n platform get cm otp -o yaml
apiVersion: v1
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g-2023-06-21T15:04:59Z-czoller
    - uq1s17g8-2023-06-21T15:10:18Z-jdoe
kind: ConfigMap
metadata:
  annotations:
    policies.kyverno.io/last-applied-patches: |
      manage-otp.manage-otp-list.kyverno.io: replaced /data/codes
  creationTimestamp: "2023-06-20T13:01:27Z"
  name: otp
  namespace: platform
  resourceVersion: "5147565"
  uid: ed2cce4e-6cf4-4309-b2cc-a2c45493ef4e
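If you wanted to turn that journal into something easier to review, a short script could split each entry into its parts. The Python sketch below is one hypothetical way to do it. It assumes entries look exactly like the ones in the output above (an 8-character code, optionally followed by a UTC timestamp and a username), and the names `ENTRY` and `audit` are made up for this example.

```python
import re
from typing import Dict, List, Optional

# Assumed entry shape, taken from the sample output: an 8-character code,
# optionally followed by "-<UTC timestamp>-<username>" once consumed.
ENTRY = re.compile(
    r"^(?P<code>[0-9a-z]{8})"
    r"(?:-(?P<used_at>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)-(?P<user>.+))?$"
)


def audit(codes: str) -> List[Dict[str, Optional[str]]]:
    """Split each journal line into code, used_at, and user (None if unused)."""
    records = []
    for line in codes.splitlines():
        entry = line.strip()
        if entry.startswith("- "):
            entry = entry[2:]
        match = ENTRY.match(entry)
        if match:
            records.append(match.groupdict())
    return records


codes = (
    "- ua8v92pg\n"
    "- 9akvm2o7\n"
    "- 1t1h360g-2023-06-21T15:04:59Z-czoller\n"
    "- uq1s17g8-2023-06-21T15:10:18Z-jdoe"
)
for record in audit(codes):
    print(record)
```

Anchoring on the fixed timestamp shape keeps the parse unambiguous even though both the timestamp and a username may themselves contain hyphens.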
And there you have it, your very own OTP system for Kyverno which is self-managed and allows for auditing.
Even though this concept probably isn’t very practical to use in the real world, I had fun just experimenting with the idea to see if it was possible. Who knows, maybe some of you out there can even use this!