Drift Detection for Kubernetes: The Missing Link in Secure GitOps

Modern Kubernetes environments are meant to be declarative — what you define in Git is what should be running in production. But in reality, things drift. Resources change without review, permissions are updated manually, or configuration baselines diverge. This phenomenon — known as configuration drift — silently erodes the reliability and security of even the most well-engineered platforms.

In this post, we’ll explore what drift detection means for Kubernetes, why it’s critical even in GitOps workflows, and how Policy-as-Code can help teams prevent and detect it automatically.

What is Drift Detection?

Drift detection is the process of identifying when the actual state of a Kubernetes resource deviates from its declared or desired state.
Drift can occur in many ways:

  • A user manually edits a resource (kubectl edit/apply) outside of GitOps workflows.
  • An automated tool or controller changes a setting unexpectedly.
  • A misconfigured Helm or CI/CD pipeline overwrites values.

When drift occurs, your clusters may no longer match compliance, security, or operational expectations — and those changes often go unnoticed until an incident occurs.

Implications for Security

Uncontrolled drift can create shadow configurations — resources that bypass review and violate security policies.
For example:

  • An altered ClusterRole granting unintended privileges.
  • A NetworkPolicy removed to “debug” an app.
  • A PodSecurity setting reverted for convenience.

These deviations can open the door to privilege escalation, data exposure, or compliance violations. In regulated environments (e.g., SOC 2, PCI DSS), even temporary drift can trigger audit failures.

Why Drift Detection Matters — Even with GitOps

GitOps promises “declarative state, continuously reconciled.” However, GitOps tools like Argo CD or Flux only enforce what’s tracked in Git.
They:

  • Reconcile resources at intervals, not instantly.
  • May ignore changes made by other controllers or operators.
  • Often lack fine-grained awareness of who made a change or why.

In practice, drift still occurs between syncs or in non-Git-managed namespaces.
Without an independent drift detection layer, teams assume their clusters are compliant — until they’re not.
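
To make the reconciliation gap concrete, here is a minimal sketch of an Argo CD Application with automated sync and self-heal turned on. The application name, repository URL, path, and target namespace are hypothetical placeholders, not taken from any real setup.

```yaml
# Sketch only: name, repoURL, path, and namespaces are illustrative assumptions
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-rbac
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/platform/cluster-config.git
    targetRevision: main
    path: rbac
  destination:
    server: https://kubernetes.default.svc
    namespace: default
  syncPolicy:
    automated:
      # Remove live resources that were deleted from Git
      prune: true
      # Revert manual changes to resources this Application tracks
      selfHeal: true
```

Even with selfHeal enabled, only the resources rendered from the tracked path are reverted. Anything created outside that path, or in a namespace no Application manages, stays invisible to the GitOps tool.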

Customer Use Case: Detecting Unauthorized RBAC Drift

One of Nirmata’s enterprise customers manages very large Kubernetes clusters with thousands of pods. To maintain least-privilege access, they use external IAM systems for authentication and authorization — meaning ClusterRoles and ClusterRoleBindings should never change manually.

However, platform admins discovered cases where these resources were modified directly in clusters, bypassing GitOps. 

The result:
Unauthorized users gained temporary escalated privileges, and compliance audits flagged RBAC inconsistencies.

To prevent this, the team implemented a drift detection policy using Kyverno and Nirmata Control Hub.

Policy-as-Code to the Rescue

With Policy-as-Code, drift detection can be automated and enforced natively inside Kubernetes. Kyverno policies can monitor resource updates, compare them to expected ownership or audit information, and report violations immediately.

Example Policy: Detect Unauthorized Changes to ClusterRoles and ClusterRoleBindings

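apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: detect-rbac-drift
spec:
  validationFailureAction: Audit
  # Evaluate admission requests only; the rule depends on who made the request
  background: false
  rules:
    - name: detect-unauthorized-rbac-change
      match:
        any:
        - resources:
            kinds:
              - ClusterRole
              - ClusterRoleBinding
              - Role
              - RoleBinding
      # Skip requests from users bound to the cluster-admin ClusterRole
      exclude:
        any:
        - clusterRoles:
          - cluster-admin
      validate:
        # Audit records a violation without blocking the request
        failureAction: Audit
        message: "Unauthorized modification detected in {{request.object.kind}}/{{request.object.metadata.name}}. Only cluster-admin may modify RBAC resources."
        # An empty deny block fails every request that reaches this rule
        deny: {}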

How It Works

  • Runs in Audit mode, evaluating RBAC changes at admission time without blocking them.
  • Reports any change made by a user who is not bound to the cluster-admin ClusterRole.
  • Integrates with Nirmata Control Hub for visibility, alerts, and remediation workflows.

In Audit mode, this simple policy records every RBAC change that does not come from a cluster-admin identity, so modifications made outside the approved, GitOps-controlled process become visible. Switching the failure action from Audit to Enforce blocks those changes instead of only reporting them.
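
As a rough sketch of the blocking variant, the policy below sets the rule-level failureAction to Enforce. It assumes a Kyverno release that supports the rule-level failureAction field, as the policy above already does. The policy name block-rbac-drift is hypothetical, and the extra exclusion for the argocd-application-controller service account is an assumption about where a GitOps controller might run; adjust or drop it to match your own setup.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: block-rbac-drift   # hypothetical name
spec:
  background: false
  rules:
    - name: block-unauthorized-rbac-change
      match:
        any:
        - resources:
            kinds:
              - ClusterRole
              - ClusterRoleBinding
              - Role
              - RoleBinding
      exclude:
        any:
        # Users bound to the cluster-admin ClusterRole
        - clusterRoles:
          - cluster-admin
        # Assumed GitOps controller identity; replace with your own
        - subjects:
          - kind: ServiceAccount
            name: argocd-application-controller
            namespace: argocd
      validate:
        # Enforce rejects the request instead of only reporting it
        failureAction: Enforce
        message: "Unauthorized modification detected in {{request.object.kind}}/{{request.object.metadata.name}}. Only cluster-admin may modify RBAC resources."
        deny: {}
```

Before enforcing, it is worth running the Audit version for a while and reviewing the reports. Any identity that legitimately writes RBAC objects, including system controllers, would be blocked unless it is excluded.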

Setup: Give Kyverno Permission to Read RBAC Resources

Kyverno’s reports controller must have permission to read these RBAC objects in order to generate PolicyReports.

Run the following commands once per cluster:

# Create a ClusterRole that allows the reports controller to read RBAC resources
kubectl create clusterrole kyverno-reports-rbac-permissions \
  --verb=get,list,watch \
  --resource=roles,rolebindings,clusterroles,clusterrolebindings

# Label it for aggregation to the reports controller
kubectl label clusterrole kyverno-reports-rbac-permissions rbac.kyverno.io/aggregate-to-reports-controller=true

This ensures that Kyverno can monitor and report on RBAC drift across the cluster.
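
If you would rather keep this permission in Git than run imperative commands, the same ClusterRole can be written declaratively. The manifest below is a sketch of the equivalent object, using the name and label from the commands above.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kyverno-reports-rbac-permissions
  labels:
    # Aggregates these rules into the Kyverno reports controller's ClusterRole
    rbac.kyverno.io/aggregate-to-reports-controller: "true"
rules:
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources:
      - roles
      - rolebindings
      - clusterroles
      - clusterrolebindings
    verbs: ["get", "list", "watch"]
```

Managing it this way means the permission itself is reviewed and reconciled like any other resource.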

Testing the Policy

Try creating a new Role manually as a non-admin user (or using a service account):

kubectl create role test-admin-violations \
  --verb=get,list --resource=pods -n nginx \
  --as=system:serviceaccount:default:test-admin-sa

Kyverno will detect the unauthorized modification and generate a PolicyViolation event.
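
If the API server rejects the command as Forbidden, the request never reaches Kyverno, because RBAC authorization runs before admission control. The post does not show how test-admin-sa was set up, so the following is only one possible fixture. The Role and RoleBinding names are hypothetical, and the nginx namespace is assumed to exist already.

```yaml
# Hypothetical test fixture; service account name and namespaces mirror the kubectl command above
apiVersion: v1
kind: ServiceAccount
metadata:
  name: test-admin-sa
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: test-role-creator
  namespace: nginx
rules:
  # Allow the service account to create Roles in this namespace
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["roles"]
    verbs: ["create"]
  # RBAC escalation prevention: the creator must already hold
  # the permissions that the new Role grants
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: test-role-creator-binding
  namespace: nginx
subjects:
  - kind: ServiceAccount
    name: test-admin-sa
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: test-role-creator
```

With this in place the service account is allowed to create the Role, but it is not bound to cluster-admin, so the request is exactly the kind of change the policy flags.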

Check for violations:

kubectl get events --field-selector reason=PolicyViolation -A

You should see something like:

NAMESPACE   LAST SEEN   TYPE      REASON            OBJECT                    MESSAGE
nginx       5m1s        Warning   PolicyViolation   role/test-admin-violations   policy detect-rbac-drift/detect-unauthorized-rbac-change fail: Unauthorized modification detected in Role/test-admin-violations. Only cluster-admin may modify RBAC resources.

You can also view the detailed PolicyReport:

kubectl get polr -A

Output:

NAMESPACE   NAME                                   KIND   NAME                     PASS   FAIL   WARN   ERROR   SKIP   AGE
nginx       b5d2a954-e1ed-43e4-9f38-d6bfa5841f12   Role   test-admin-violations    0      1      0      0       0      5m21s

And for full details:

kubectl get polr b5d2a954-e1ed-43e4-9f38-d6bfa5841f12 -n nginx -o yaml

apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  creationTimestamp: "2025-11-01T21:24:29Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: kyverno
  name: b5d2a954-e1ed-43e4-9f38-d6bfa5841f12
  namespace: nginx
  ownerReferences:
  - apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    name: test-admin-violations
    uid: b5d2a954-e1ed-43e4-9f38-d6bfa5841f12
  resourceVersion: "8827222"
  uid: 5410b8f0-19bd-4782-86be-b17126ec9af6
results:
- message: Unauthorized modification detected in Role/test-admin-violations. Only
    cluster-admin may modify RBAC resources.
  policy: detect-rbac-drift
  result: fail
  rule: detect-unauthorized-rbac-change
  scored: true
  source: kyverno
  timestamp:
    nanos: 0
    seconds: 1762032249
scope:
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  name: test-admin-violations
  namespace: nginx
  uid: b5d2a954-e1ed-43e4-9f38-d6bfa5841f12
summary:
  error: 0
  fail: 1
  pass: 0
  skip: 0

This report shows the exact resource, violation message, and policy name — providing auditable drift detection evidence for compliance teams.

Beyond RBAC: Other Drift Detection Possibilities

Drift detection can extend far beyond permissions:

Category          Drift Example                                            Mitigation
Network Security  Unauthorized change in NetworkPolicy                     Detect non-approved CIDR or port updates
Pod Security      Privilege escalation in SecurityContext                  Detect if privileged pods are re-enabled
Image Governance  Container images changed outside of approved registries  Enforce registry and tag validation
Resource Quotas   Namespace resource limits updated manually               Detect quota mismatches with declared values
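
As one illustration of the Pod Security row, here is a sketch of an Audit-mode policy that flags Pods running privileged containers. It follows the optional-anchor pattern style Kyverno uses for this kind of check. The policy and rule names are hypothetical, and it assumes a Kyverno release that supports the rule-level failureAction field.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: detect-privileged-pod-drift   # hypothetical name
spec:
  # No user information is needed, so existing Pods can be scanned too
  background: true
  rules:
    - name: detect-privileged-containers
      match:
        any:
        - resources:
            kinds:
              - Pod
      validate:
        failureAction: Audit
        message: "Privileged containers are not allowed. securityContext.privileged must be unset or false."
        pattern:
          spec:
            # =(...) anchors mean: if the field is present, it must match
            =(ephemeralContainers):
              - =(securityContext):
                  =(privileged): "false"
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"
```

Because this rule does not depend on who made the request, background scanning can also surface privileged Pods that were already running before the policy was installed.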

By layering these checks with AI-driven insights, teams can move from reactive drift detection to predictive prevention, anticipating where drift is likely before it happens.

Summary

Drift detection isn’t just about keeping your YAMLs clean — it’s about trust, traceability, and control in dynamic environments.
Even in GitOps-based workflows, unauthorized or accidental changes can silently compromise your security posture.

By adopting Policy-as-Code with tools like Kyverno and extending it with AI-driven predictive drift detection, platform engineering teams can ensure every Kubernetes cluster stays compliant, consistent, and secure — always.

Next Steps

Remediator Agent for Kubernetes – AI-Powered Policy Remediation