If you have used Kyverno, you may already know that it has so far run with a single replica, which means that it does not run in high availability mode. This becomes a problem if that single instance of Kyverno stops running or fails: incoming requests are no longer validated and processed by Kyverno and, depending on your configuration, new requests to the Kubernetes API server could be blocked. Running multiple replicas also allows Kyverno to scale background processing to handle request volume in large clusters. In short, allowing multiple replicas to be deployed improves Kyverno's resiliency and increases its overall availability.
Kyverno release 1.4 includes several other features, but the most important one is support for running Kyverno in high availability mode. In this post, we discuss the design for high availability in Kyverno.
High Availability Design
At a high level, Kyverno consists of three main components:
- Webhook register – Responsible for receiving admission review requests and monitoring webhook configuration and secrets.
- Policy controller – Responsible for processing validate policies.
- Generate controller – Responsible for processing generate policies.
[Figure: Kyverno high-level architecture]
To support high availability in Kyverno, leader election is enabled in three components: the webhook register, the policy controller (background controller), and the generate controller. Once leader election is enabled, all processing is done by the leader. While one replica is the leader and does all the processing, the other replicas continuously monitor the lease lock. Leadership is lost when the leader shuts down or restarts for any reason, e.g. a node failure. At that point, the lease lock becomes available immediately, triggering a leader election. Once a new leader is elected, it continues processing admission review requests, preventing any interruption.
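To make the lease-lock mechanics concrete, here is a minimal in-memory sketch of lease-based leader election. It is not Kyverno's implementation and does not talk to a Kubernetes cluster. The `Lease` class, its method names, and the replica names `kyverno-a` and `kyverno-b` are hypothetical, and the sketch assumes that a follower can take over either when the leader releases the lock on shutdown or when the leader has not renewed the lease within the lease duration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical in-memory stand-in for a lease lock (not a real API object)."""
    lease_duration_seconds: float = 15.0
    holder_identity: Optional[str] = None
    acquire_time: Optional[float] = None
    renew_time: Optional[float] = None
    lease_transitions: int = 0

    def is_available(self, now: float) -> bool:
        # The lock is free if nobody holds it, or if the holder has stopped
        # renewing for longer than the lease duration (assumed expiry rule).
        if self.holder_identity is None:
            return True
        return now - self.renew_time > self.lease_duration_seconds

    def try_acquire_or_renew(self, candidate: str, now: float) -> bool:
        """Return True if `candidate` is the leader after this call."""
        if self.holder_identity == candidate:
            self.renew_time = now  # the current leader keeps its lease alive
            return True
        if not self.is_available(now):
            return False  # another replica holds a live lease; keep monitoring
        if self.acquire_time is not None:
            # The lease was held before, so leadership is changing hands.
            self.lease_transitions += 1
        self.holder_identity = candidate
        self.acquire_time = now
        self.renew_time = now
        return True

    def release(self, holder: str) -> None:
        # Graceful shutdown: free the lock so a new election can start at once.
        if self.holder_identity == holder:
            self.holder_identity = None


if __name__ == "__main__":
    lease = Lease()
    print(lease.try_acquire_or_renew("kyverno-a", now=0.0))   # True: first leader
    print(lease.try_acquire_or_renew("kyverno-b", now=5.0))   # False: lease is live
    lease.try_acquire_or_renew("kyverno-a", now=10.0)         # leader renews
    # kyverno-a fails after t=10 and stops renewing.
    print(lease.try_acquire_or_renew("kyverno-b", now=20.0))  # False: not yet expired
    print(lease.try_acquire_or_renew("kyverno-b", now=26.0))  # True: takes over
    print(lease.holder_identity, lease.lease_transitions)
```

Time is passed in explicitly as `now` so the behavior is deterministic and easy to follow; a real controller would read the clock and store the lease in the cluster, where every replica can observe it.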
Below is an example of the lease lock used for leader election.

    kubectl get lease -n kyverno kyverno -o yaml | k neat
    apiVersion: coordination.k8s.io/v1
    kind: Lease
    metadata:
      name: kyverno
      namespace: kyverno
    spec:
      acquireTime: "2021-06-08T21:41:41.901518Z"
      holderIdentity: kyverno-6cf7477544-59jq5_0eaa5c04-ea91-45de-84b7-3fb2fedc1928
      leaseDurationSeconds: 15
      leaseTransitions: 10
      renewTime: "2021-06-09T02:16:21.622838Z"

High availability support in Kyverno has been one of the most requested features by the community members. Lack of HA has prevented users from enabling advanced capabilities of Kyverno in production. With the Kyverno 1.4 release, high availability support is available, addressing a major requirement for using Kyverno in production. If you have been waiting for HA support in Kyverno, please try the 1.4 release and provide your feedback! You can reach out to the Kyverno team on the #kyverno Slack channel or mailing list.
Also, if you are already using Kyverno and are looking to simplify policy management across your clusters, please check out the Nirmata Policy Manager for Kyverno. You can register for early access at https://nirmata.com/nirmata-policy-manager/. For general questions about Nirmata and our Kubernetes offerings, please contact us. Please visit our Resources section for a deeper understanding of container management and Kubernetes.