Stop your AI agent from deleting your production database — in three steps
Last week, a developer described an incident on Hacker News: an autonomous agent — running through Cursor, using Claude Opus as the model — discovered an API token in its environment with destructive scope on the team’s production Railway database, called a delete mutation, and erased the database. Railway’s backup system was co-located with the volume being deleted, so the backups were destroyed in the same call. Recovery wasn’t possible.
This is not an exotic incident. The pattern reproduces with any autonomous agent, any production credential, any platform whose deletion isn’t gated by external policy. It is much easier to hit than people think.
What your agent has access to right now
Modern AI agents — Cursor’s Composer, Claude Code, Codex CLI, your custom CrewAI orchestration — all run inside the environment you’ve already provisioned. They read environment variables. They have access to whatever API tokens are sitting on your machine, in your CI runner, in your Docker container. If a destructive credential is reachable from your shell, it’s reachable from your agent.
The agent doesn’t have to be malicious. It doesn’t have to be jailbroken. It can simply decide — for any reason, including a hallucinated interpretation of your instructions, prompt injection from a document it ingested, or a misread of an ambiguous task — to call a delete endpoint. Once that call leaves the process, there is no governance layer between the model’s decision and the production system on the other end.
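To make that concrete, here is a minimal sketch. The environment variable name, endpoint, and mutation are placeholders, not Railway's real API; the point is that anything exported in the shell the agent runs in is one library call away from a destructive request.

import os
import requests  # any HTTP client already installed in the agent's environment

# Anything exported in the shell is readable by the agent's process...
token = os.environ.get("PAAS_API_TOKEN")  # hypothetical variable name

# ...and from there a destructive call is a single request away.
requests.post(
    "https://api.your-paas.example/graphql",       # placeholder endpoint
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "mutation { ... }"},            # a delete mutation would go here
)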
Irreversible actions live one tool call away
Database deletion is one example of a class. Sending the wrong email to your customer list, pushing broken code to main, transferring funds, escalating IAM permissions, dropping a Kubernetes namespace, force-pushing over your team’s branches — all are one tool call away from execution, and none of them un-send, un-run, or un-revoke after the fact.
The database case is the most viscerally bad of these because the database is usually the cumulative state of the business. But every irreversible action shares the same property: by the time you notice, the call has already left the process and there is nothing the model can do to take it back.
The Railway-Cursor incident won’t be the last. The probability of another one — same shape, different platform, different scope — in the next 60 days is not low.
Three steps that put infrastructure between your agent and the destructive call
The fix isn’t trying to make the model behave better. Any sufficiently general agent can produce the destructive token sequence; prompting steers probabilities, it doesn’t bound them. The fix is putting a control plane between the agent’s tool call and the action — code that runs every time, regardless of what the model wants.
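To be concrete about what "a control plane between the tool call and the action" means, here is an illustrative sketch of the shape. The tool keys and tier names are hypothetical and this is not ACP's implementation, just the idea: a checkpoint that runs on every call, no matter what the model has decided it wants to do.

# Illustrative only: a mandatory checkpoint in front of every tool call.
DENY_IN_BACKGROUND = {"Bash.rm", "Bash.curl", "Bash.kubectl", "Bash.gcloud", "Bash.aws"}

def governed_call(tool: str, command: str, tier: str) -> str:
    if tier == "background" and tool in DENY_IN_BACKGROUND:
        raise PermissionError(f"{tool} is denied in the {tier} tier")
    # Only reached when policy allows it; the real system would execute the tool here.
    return f"executed: {command}"

# A headless agent proposing a destructive call is stopped before anything runs.
governed_call("Bash.rm", "rm -rf /var/lib/postgresql/data", tier="background")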
Here’s how to do it.
Step 1 — Install the hook (one command)
For Cursor, Claude Code, or Codex CLI:
curl -sf https://agenticcontrolplane.com/install.sh | bash
The installer detects which AI client you’re using, writes a small hook script (~/.acp/govern.mjs), and registers it with the client’s PreToolUse / PostToolUse hook surface. From this moment on, every tool call your agent makes — file edits, shell commands, MCP invocations — runs through the hook before it executes. The model can’t route around it. The host enforces it.
Restart your AI client. Done.
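Under the hood, the hook's job is mechanical: the host hands it a description of the pending tool call, and the hook either lets it proceed or blocks it. Hook contracts vary between clients, so treat this Python sketch as an illustration of the shape rather than the contents of govern.mjs; the field names and exit-code convention are assumptions.

#!/usr/bin/env python3
# Rough sketch of a pre-tool-use hook, not ACP's actual govern.mjs.
# Field names and the exit-code convention are assumptions about the host's hook contract.
import json
import sys

event = json.load(sys.stdin)                  # the host serializes the pending tool call
tool = event.get("tool_name", "")
command = event.get("tool_input", {}).get("command", "")

BLOCKED_PREFIXES = ("rm ", "curl ", "kubectl ", "gcloud ", "aws ")
if tool == "Bash" and command.strip().startswith(BLOCKED_PREFIXES):
    print(f"blocked by policy: {command}", file=sys.stderr)
    sys.exit(2)                               # non-zero exit tells the host to refuse the call

sys.exit(0)                                   # anything else proceeds as normal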
Step 2 — Set “deny destructive verbs in background tier” as the default
Open your ACP dashboard (cloud.agenticcontrolplane.com) → Policies. Set:
{
  "mode": "enforce",
  "tools": {
    "Bash.curl":    { "background": { "permission": "deny" } },
    "Bash.rm":      { "background": { "permission": "deny" } },
    "Bash.kubectl": { "background": { "permission": "deny" } },
    "Bash.gcloud":  { "background": { "permission": "deny" } },
    "Bash.aws":     { "background": { "permission": "deny" } }
  }
}
ACP automatically classifies Bash sub-commands by the binary being invoked, so this rule catches every shell call whose first word is curl, rm, kubectl, etc. — regardless of how the agent constructed the command.
The semantics: an agent running headless, without a human at the keyboard, cannot perform mutating actions. To execute a destructive call, the request has to come from an interactive session — a real Cursor / Claude Code window with a person there to see what’s happening.
That single rule eliminates the class of “3am cron-scheduled agent decided to delete the database” incidents.
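If you're wondering how that classification copes with the agent constructing commands in unusual ways, the idea is that the policy key comes from the binary actually being invoked, not from how the command was typed. Here is a simplified sketch of that step; ACP's real parser is not shown here, and presumably also handles pipelines, env prefixes, and similar cases.

import shlex

# Simplified sketch of the classification step, not ACP's actual parser.
def policy_key(command: str) -> str:
    first_word = shlex.split(command)[0]      # the binary being invoked
    binary = first_word.rsplit("/", 1)[-1]    # "/usr/bin/rm" and "rm" classify the same way
    return f"Bash.{binary}"

policy_key("rm -rf /var/lib/postgresql/data")   # -> "Bash.rm"
policy_key("/usr/local/bin/aws rds delete-db-instance --db-instance-identifier prod")   # -> "Bash.aws"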
Step 3 — Bind the end user’s identity, per request
Long-lived destructive tokens shouldn’t be in your agent’s environment in the first place. If you’re running an agent server-side, bind the user’s session token instead — the agent inherits the user’s scope, not a service account’s:
from fastapi import FastAPI, Header
from acp_governance import set_context

app = FastAPI()

@app.post("/run")
def run(req: dict, authorization: str = Header(...)):
    # Bind this request's governance context to the caller's own token:
    # the agent inherits the user's scope, not a service account's.
    set_context(
        user_token=authorization.removeprefix("Bearer ").strip(),
        agent_name="db-maintenance",
        agent_tier="background",
    )
    return run_agent(req)
If the user doesn’t have database:delete scope in their OIDC role, neither does the agent — even if a token with that scope exists somewhere else on the machine. The destructive capability isn’t in reach.
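For illustration, a request into that handler on behalf of a real user then looks like any other authenticated call. The URL and payload shape below are assumptions; the point is that the user's own token rides along and sets the agent's ceiling.

import requests

user_session_token = "<the end user's OIDC session token>"

# Illustrative request to the /run handler above; URL and payload shape are assumptions.
requests.post(
    "https://agents.internal.example/run",
    headers={"Authorization": f"Bearer {user_session_token}"},  # the user's token, not a service account's
    json={"task": "archive stale rows in the analytics tables"},
)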
For shadow-IT setups where you don’t have an IdP wired up yet, mint a tightly-scoped gsk_ API key per agent in the dashboard. Same principle: explicit, scoped, rotatable, audited.
(Free fourth step) — Audit log
Every governed call writes a structured row to your dashboard regardless of decision. When an agent attempts a destructive call (and eventually one will), you’ll have:
- Which agent, on whose behalf
- What command it tried to run
- What the policy decision was, and why
- Full input/output preview (with PII redaction applied if configured)
You don’t have to do anything to get this. It’s the byproduct of installing the hook.
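For a sense of what one of those rows looks like, here is an illustrative example; the field names are assumptions rather than ACP's exact schema.

# Illustrative audit row; field names are assumptions, not ACP's exact schema.
audit_row = {
    "agent": "db-maintenance",
    "on_behalf_of": "jane@example.com",
    "tool": "Bash.aws",
    "command": "aws rds delete-db-instance --db-instance-identifier prod",
    "decision": "deny",
    "reason": "destructive verb in background tier",
    "output_preview": None,   # nothing executed, so nothing to preview
}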
The total time investment
- One curl command (Step 1): ~30 seconds
- One policy entry in the dashboard (Step 2): ~2 minutes
- One line in your request handler (Step 3): ~1 minute
Three minutes from blank slate to “an autonomous agent in this environment cannot delete production state without an interactive human session and an explicit policy override.”
The asymmetry between the time it takes to set this up and the cost of the incident it prevents is large. If you’re running agents in any environment that has access to credentials, and you don’t have a control plane between the agent and those credentials, the only thing standing between you and the next viral story is luck.
You can stop relying on luck for three minutes of work.