Agentic Control Plane

Stop your AI agent from dropping a Kubernetes namespace — in three steps

David Crowe · 6 min read
governance defense-in-depth kubernetes platform-engineering

If your agent has kubectl configured against a cluster you care about, it can run kubectl delete namespace prod. That’s one tool call. Everything in the namespace — services, deployments, persistent volume claims, secrets, ingress rules, auto-scaling configs — is queued for deletion the moment the API server accepts the request. There is no two-stage commit. No confirmation prompt. No timeout window where an operator can reverse the call.

This is one of the most-asked questions in platform-engineering forums right now: how do I keep my AI assistant from doing irreversible things to my cluster? Cursor and Claude Code users routinely point these tools at production-adjacent kubeconfigs because the agent is genuinely useful for diagnosing pod-state issues, reading logs, and debugging manifest drift. The same kubeconfig that lets the agent run kubectl logs lets it run kubectl delete.

What your agent has the power to do in your cluster

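A typical developer kubeconfig carries broader permissions than the developer realizes. If the user is bound to cluster-admin (common in dev/staging clusters and not unheard of in production), the agent can do everything on the list below. Even the namespaced edit role covers the workload-level items: deleting pods, scaling deployments, patching live workloads, and applying manifests. With a broad enough binding, the agent can: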

  • Delete namespaces (kubectl delete namespace)
  • Drain and cordon nodes (kubectl drain, kubectl cordon)
  • Force-delete pods bypassing termination grace (kubectl delete pod --grace-period=0 --force)
  • Scale deployments to zero (kubectl scale deployment X --replicas=0)
  • Modify cluster-scoped resources (CRDs, ClusterRoles, ClusterRoleBindings)
  • Apply arbitrary manifests against any namespace (kubectl apply -f -)
  • Patch live workloads without a PR (kubectl patch, kubectl edit)

Most of these are useful. Some of them are catastrophic. The agent doesn’t distinguish. It runs whatever command the model emits.
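To see which of these a given kubeconfig actually permits, you can ask the API server directly with kubectl auth can-i. The sketch below wraps that command in Python. The function names and the choice of verb/resource pairs are illustrative assumptions, not part of ACP, and the `run` parameter exists only so the check can be exercised without a live cluster.

```python
import subprocess
from typing import Callable, Dict, List, Optional, Tuple

# (verb, resource) pairs to probe; an illustrative selection, not an exhaustive list.
RISKY_ACTIONS: List[Tuple[str, str]] = [
    ("delete", "namespaces"),
    ("delete", "persistentvolumeclaims"),
    ("patch", "nodes"),  # cordon and drain both modify node objects
    ("delete", "customresourcedefinitions"),
    ("update", "clusterrolebindings"),
]


def can_i(verb: str, resource: str, context: Optional[str] = None,
          run: Callable = subprocess.run) -> bool:
    """Ask the API server whether the current credentials allow `verb` on `resource`."""
    cmd = ["kubectl", "auth", "can-i", verb, resource]
    if context:
        cmd += ["--context", context]
    result = run(cmd, capture_output=True, text=True)
    # kubectl prints "yes" or "no" on stdout; any other output is ignored.
    return "yes" in result.stdout.split()


def audit_kubeconfig(context: Optional[str] = None,
                     run: Callable = subprocess.run) -> Dict[Tuple[str, str], bool]:
    """Probe every risky action and report which ones the kubeconfig permits."""
    return {(verb, res): can_i(verb, res, context, run) for verb, res in RISKY_ACTIONS}
```

Calling audit_kubeconfig("staging") against a real cluster returns a mapping from each (verb, resource) pair to True or False. Any True in that report is something the agent can do with the same credentials.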

Irreversible actions live one tool call away

Kubernetes’ deletion semantics are particularly unforgiving:

  • Namespace deletion is async but cannot be cancelled once the API server accepts it
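  • PersistentVolumes with a Delete reclaim policy (the default for dynamically provisioned storage classes) have their underlying storage deleted once the bound PVC is removed
  • Cluster-scoped resources take their dependents with them: deleting a CRD deletes every custom resource of that type, in every namespace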
  • kubectl delete --grace-period=0 --force skips the termination handler — workloads can’t checkpoint state

An autonomous agent running one of these against the wrong context is not a hypothetical risk. It happens often enough that defensive kubectl wrappers are a small cottage industry on GitHub.
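The reclaim-policy point in the list above is easy to check ahead of time. The following is a minimal sketch that takes the output of kubectl get pv -o json and lists the bound volumes whose storage would be deleted along with their claim. The function name is hypothetical; the fields it reads (spec.persistentVolumeReclaimPolicy and spec.claimRef) are standard PersistentVolume fields.

```python
import json
from typing import Dict, List


def volumes_at_risk(pv_list_json: str) -> List[Dict[str, str]]:
    """Return bound PVs whose storage is deleted when their claim goes away."""
    at_risk = []
    for pv in json.loads(pv_list_json).get("items", []):
        spec = pv.get("spec", {})
        claim = spec.get("claimRef")  # present only when the PV is bound to a PVC
        if spec.get("persistentVolumeReclaimPolicy") == "Delete" and claim:
            at_risk.append({
                "volume": pv["metadata"]["name"],
                "namespace": claim.get("namespace", ""),
                "claim": claim.get("name", ""),
            })
    return at_risk
```

Any volume this returns for a namespace you care about is data that a single kubectl delete namespace would destroy, not just detach.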

Three steps that put a gate between your agent and kubectl

Step 1 — Install the hook

For Cursor, Claude Code, or Codex CLI:

curl -sf https://agenticcontrolplane.com/install.sh | bash

Every shell command goes through ACP’s hook before execution. ACP classifies Bash sub-commands by binary, so kubectl calls land under Bash.kubectl (and helm under Bash.helm).
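The post does not describe how ACP derives the Bash.kubectl name internally. As an illustration of the idea only, classifying a shell command by its binary could look like the sketch below. The function name, the handling of leading environment assignments, and the bare "Bash" fallback are all assumptions.

```python
import os
import re
import shlex

# Matches a leading shell assignment such as KUBECONFIG=/tmp/kc
_ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def tool_name(command: str) -> str:
    """Map a shell command to a tool name of the form Bash.<binary>."""
    tokens = shlex.split(command)
    # Skip environment assignments that precede the binary.
    while tokens and _ASSIGNMENT.match(tokens[0]):
        tokens.pop(0)
    if not tokens:
        return "Bash"
    # Use the basename so /usr/local/bin/helm and helm classify the same way.
    return f"Bash.{os.path.basename(tokens[0])}"
```

This deliberately ignores pipelines, sudo, and subshells. A real classifier has to handle those, or an agent could reach kubectl through a wrapper command.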

Step 2 — Deny destructive kubectl verbs against production contexts

In your dashboard (cloud.agenticcontrolplane.com) → Policies:

{
  "mode": "enforce",
  "tools": {
    "Bash.kubectl": {
      "background": {
        "permission": "deny",
        "patterns": [
          { "match": "kubectl delete namespace.*",       "permission": "deny" },
          { "match": "kubectl delete (deployment|statefulset|daemonset).*", "permission": "deny" },
          { "match": "kubectl drain .*",                 "permission": "deny" },
          { "match": "kubectl delete (pvc|persistentvolumeclaim).*", "permission": "deny" },
          { "match": "kubectl patch .* --type='json'.*", "permission": "deny" },
          { "match": "kubectl apply -f -.*",             "permission": "deny" }
        ]
      },
      "interactive": {
        "permission": "ask",
        "patterns": [
          { "match": "kubectl delete .*--context=.*prod.*",  "permission": "deny" },
          { "match": "kubectl delete .*--namespace=prod.*",  "permission": "deny" },
          { "match": "kubectl delete namespace.*",            "permission": "deny" },
          { "match": "kubectl drain .*",                       "permission": "ask" }
        ]
      }
    }
  }
}

What this policy means: in the background tier, the agent can read (kubectl get, kubectl logs, kubectl describe) but cannot delete or modify cluster state. In the interactive tier, deletion against production contexts is denied outright; deletion against other contexts requires explicit approval.
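One way to reason about the interactive tier is to evaluate it by hand. The sketch below assumes three things the post does not state: that each "match" value is a regular expression anchored at the start of the command, that the first matching pattern wins, and that the tier-level "permission" is the fallback when nothing matches. The decide function is hypothetical, not ACP's evaluator.

```python
import re
from typing import Any, Dict

# The interactive tier from the policy above, copied verbatim.
INTERACTIVE: Dict[str, Any] = {
    "permission": "ask",
    "patterns": [
        {"match": "kubectl delete .*--context=.*prod.*", "permission": "deny"},
        {"match": "kubectl delete .*--namespace=prod.*", "permission": "deny"},
        {"match": "kubectl delete namespace.*", "permission": "deny"},
        {"match": "kubectl drain .*", "permission": "ask"},
    ],
}


def decide(tier: Dict[str, Any], command: str) -> str:
    """Return the permission for a command: first matching pattern, else the tier default."""
    for rule in tier["patterns"]:
        if re.match(rule["match"], command):  # assumption: anchored at command start
            return rule["permission"]
    return tier["permission"]
```

Under those assumptions, decide(INTERACTIVE, "kubectl delete pod web-0 --context=prod-us") is denied, while the same deletion with --context=staging falls through to the tier default and asks for approval.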

The pattern rules cover a few specific high-blast-radius commands. Add your team’s specific patterns — anything that touches kube-system, anything against the cluster’s primary prod context, anything that uses --force or --grace-period=0.
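Before adding team-specific patterns, it helps to test them against commands you expect to be caught and commands you expect to pass. The candidate patterns below are suggestions, not ACP defaults, and they assume the same regular-expression matching as the policy above. They also cover the space-separated and short flag forms (-n kube-system, --context prod-eu), which patterns written only for the --flag=value form would miss.

```python
import re
from typing import List

# Candidate extra rules; the labels and regexes are illustrative assumptions.
EXTRA_PATTERNS = {
    "kube-system": r"kubectl .*(-n|--namespace)[= ]kube-system.*",
    "force": r"kubectl .*--force.*",
    "grace-period-0": r"kubectl .*--grace-period[= ]0(\s.*)?$",
    "prod-context": r"kubectl .*--context[= ]\S*prod.*",
}


def matching_rules(command: str) -> List[str]:
    """Return the labels of every candidate pattern that matches the command."""
    return [label for label, pattern in EXTRA_PATTERNS.items()
            if re.match(pattern, command)]
```

When copying a pattern like these into the JSON policy, remember that JSON requires backslashes to be doubled (for example \\s instead of \s).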

Step 3 — Bind end-user identity, per request

Cluster RBAC is the deeper safety net, but RBAC is configured at the cluster, not at the agent. ACP can enforce per-agent policy that’s tighter than the kubeconfig’s underlying scope:

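from acp_governance import set_context

# Call once per incoming request, inside your handler, before the agent runs.
# `request` is your web framework's request object.
set_context(
    user_token=request.headers["Authorization"],  # end user's token, forwarded as-is
    agent_name="cluster-debug",
    agent_tier="background",  # selects the "background" policy tier from Step 2
)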

If the user’s IdP role doesn’t include kubernetes.delete:namespace, the policy lookup fails the call before it even reaches the cluster. The kubeconfig’s underlying token might allow it, but the user’s identity-scoped policy doesn’t.
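Conceptually, the identity check is a lookup of the required scope in the set of scopes the user's role grants. The sketch below is illustrative only: the scope format is inferred from the single example kubernetes.delete:namespace, and both function names are hypothetical rather than part of the acp_governance package.

```python
from typing import Iterable


def required_scope(verb: str, resource: str) -> str:
    """Build a scope string in the kubernetes.<verb>:<resource> shape (assumed format)."""
    return f"kubernetes.{verb}:{resource}"


def is_permitted(user_scopes: Iterable[str], verb: str, resource: str) -> bool:
    """Allow the call only if the user's own scopes include the required one."""
    return required_scope(verb, resource) in set(user_scopes)
```

The point of the design is that the decision depends on the user's scopes and never on what the kubeconfig token could do, so a debugging role with only read scopes cannot delete a namespace through the agent.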

This matters specifically because most kubeconfigs in dev hands have more permissions than the user’s day-to-day role implies. ACP collapses the gap.

(Free fourth step) — Audit log

Every governed kubectl call writes a structured row: full command, target context, target namespace, agent identity, decision, reason. When something disappears from the cluster, you have the call log in hand before you start digging through API server audit logs.
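As a rough picture of what such a row could contain, here is a sketch that models the fields listed above as a dataclass and serializes one row to JSON. The class name, field names, and the added timestamp are assumptions for illustration; ACP's actual log schema is not shown in this post.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class AuditRow:
    command: str    # full command as the agent issued it
    context: str    # target kubeconfig context
    namespace: str  # target namespace
    agent: str      # agent identity
    decision: str   # e.g. "deny" or "ask"
    reason: str     # which rule produced the decision
    timestamp: str = field(default_factory=_utc_now)  # assumed field, added for ordering

    def to_json(self) -> str:
        """Serialize the row as one JSON object per line."""
        return json.dumps(asdict(self), sort_keys=True)
```

A denied namespace deletion would then appear as a single line with decision set to "deny" and the matching pattern recorded in reason, which is what you search for first when something goes missing.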

The total time investment

  • One curl command (Step 1): ~30 seconds
  • Ten kubectl pattern rules across two tiers (Step 2): ~3 minutes
  • One line in your handler for server-side agents (Step 3): ~1 minute

Three to five minutes from blank slate to “an autonomous agent in this environment cannot delete namespaces, drain nodes, force-delete pods, or modify cluster-scoped resources without an interactive approval and a non-production context.”

Cluster RBAC is necessary. It is not sufficient when the same kubeconfig that lets the agent run kubectl logs lets it run kubectl delete namespace prod. The control plane between agent intent and cluster execution is what closes that gap.
