Incident series · Part 1 of 11

Stop your AI agent from running `rm -rf` on your filesystem — in three steps

David Crowe · April 26, 2026 · 6 min read

governance defense-in-depth tool-policy filesystem-safety

A developer reported on the Cursor forum that during a development session, the Cursor agent decided to execute rm -rf on the wrong directory. The command ran. The files were gone. There was no --interactive flag, no confirmation prompt, no policy gate — the model emitted the command, the shell executed it.

This is one of the most-reported AI-agent incidents on developer forums right now. It’s also one of the most preventable.

What your agent has access to right now

Modern AI agents — Cursor’s Composer, Claude Code, Codex CLI, your custom orchestration — execute shell commands as the user that launched them. If your shell can run rm -rf $HOME, your agent can. If your shell can dd if=/dev/zero of=/dev/sda, your agent can. The OS doesn’t distinguish “the human typed this” from “the model emitted this through a tool call.”
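A quick way to see this for yourself is to spawn a child process the way a tool call would and compare user IDs. This is a minimal sketch for POSIX systems only (os.getuid does not exist on Windows); the function name child_uid is made up for illustration.

```python
import os
import subprocess
import sys


def child_uid() -> int:
    """Spawn a child process and return the UID it runs under."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getuid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    # Both numbers are the same: the child inherits the launching user.
    print("parent uid:", os.getuid())
    print("child uid: ", child_uid())
```

Whatever that UID is allowed to delete, a command emitted by the agent is allowed to delete too.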

The agent doesn’t have to be malicious. It can simply decide — on a misread of the directory layout, a hallucinated path variable, or an unfortunate completion of a command-construction prompt — to run a destructive command. And modern shells don’t have a “wait, are you sure?” step for the model.

Once the tool call leaves the agent process, there is no governance layer between the model’s decision and the filesystem.

Irreversible actions live one tool call away

rm -rf is one example of a class. dd overwrites raw devices. shred overwrites file contents in place. find . -delete removes everything under the current directory. mkfs formats partitions. git clean -fdx removes untracked and ignored files, including work you never committed. All of these are one tool call away from execution, and none of them un-delete, un-write, or un-format after the fact.

For most developers, the home directory is the cumulative state of months or years of work — uncommitted changes, local configurations, downloaded artifacts that don’t exist anywhere else. Some of it is in source control. Most of it isn’t.

The Cursor forum incident isn’t unique. Multiple variants have been reported across r/cursor, GitHub issues on anthropics/claude-code, and various dev blogs. The probability of another one in your team in the next 60 days is not low.

Three steps that put a control plane between your agent and your filesystem

The fix isn’t smarter prompting. Any general-purpose agent can produce a destructive shell command for whatever filesystem path it has access to. Prompting steers probabilities; it doesn’t bound them. The fix is a gate between the agent’s tool call and the shell: code that runs every time, regardless of what the model wants.

Step 1 — Install the hook

For Cursor, Claude Code, or Codex CLI:

curl -sf https://agenticcontrolplane.com/install.sh | bash

The installer detects which AI client you’re using, writes a small hook script at ~/.acp/govern.mjs, and registers it with the client’s PreToolUse / PostToolUse hook surface. From this point, every shell command the agent runs goes through the hook before it executes. The model can’t route around it. The host enforces it.
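To make the idea of a pre-execution gate concrete, here is a minimal sketch of the pattern. It is not the contents of ~/.acp/govern.mjs, which this post does not show; the names governed_run and deny_rm and the "allow" / "deny" strings are assumptions for illustration.

```python
import shlex
import subprocess
from typing import Callable

Decision = str  # "allow" or "deny" in this sketch


def governed_run(
    command: str, decide: Callable[[str], Decision]
) -> subprocess.CompletedProcess:
    """Run `command` only if the gate allows it; raise before anything executes otherwise."""
    if decide(command) != "allow":
        raise PermissionError(f"blocked by gate: {command}")
    return subprocess.run(shlex.split(command), capture_output=True, text=True)


def deny_rm(command: str) -> Decision:
    """Toy gate: deny any command whose first word is rm."""
    words = command.split()
    return "deny" if words and words[0] == "rm" else "allow"


if __name__ == "__main__":
    print(governed_run("echo hello", deny_rm).stdout.strip())
    try:
        governed_run("rm -rf ./build", deny_rm)
    except PermissionError as err:
        print(err)
```

The point of the design is the ordering: the decision happens in host code before the subprocess is created, so nothing the model writes into the command string can skip it.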

Restart your AI client. Done.

Step 2 — Deny destructive shell verbs by default

ACP automatically classifies Bash sub-commands by the first binary in the command — Bash.rm, Bash.dd, Bash.git, Bash.curl, etc. — so your policy doesn’t need to enumerate every dangerous regex. The classifier puts the call into a category; you set policy on the category.
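A first-binary classifier of the kind described above could look roughly like this. It is a sketch under assumptions, not ACP's implementation: the function name classify and the Bash.unknown fallback are made up here.

```python
import os
import shlex


def classify(command: str) -> str:
    """Map a shell command to a 'Bash.<binary>' category using its first word."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: treat the command as unclassifiable.
        return "Bash.unknown"
    if not tokens:
        return "Bash.unknown"
    # /bin/rm and rm should land in the same category.
    return f"Bash.{os.path.basename(tokens[0])}"


if __name__ == "__main__":
    for cmd in ("rm -rf ./build", "/bin/dd if=/dev/zero of=disk.img", "git clean -fdx"):
        print(cmd, "->", classify(cmd))
```

A classifier this simple labels sudo rm -rf x as Bash.sudo and sees only cd in cd x && rm -rf . How ACP treats wrappers and chained commands is not covered in this post, so check that before relying on a first-word rule.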

Open your dashboard (cloud.agenticcontrolplane.com) → Policies. Set:

{
  "mode": "enforce",
  "tools": {
    "Bash.rm":     { "background": { "permission": "deny" }, "interactive": { "permission": "ask" } },
    "Bash.dd":     { "background": { "permission": "deny" }, "interactive": { "permission": "ask" } },
    "Bash.mkfs":   { "background": { "permission": "deny" }, "interactive": { "permission": "deny" } },
    "Bash.shred":  { "background": { "permission": "deny" }, "interactive": { "permission": "ask" } }
  }
}

The semantics: in the background tier (cron-scheduled agents, headless Composer runs, anything without a human at the keyboard), destructive shell verbs are denied outright. In the interactive tier (your live Cursor / Claude Code session), they require an out-of-band approval: you’ll see an ask prompt in the dashboard before the call executes. The exception is mkfs, which this policy denies in both tiers.
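The lookup this policy implies can be sketched in a few lines. The resolve function below is hypothetical, and so is its behaviour for tools the policy does not list (it falls back to "allow" here); the post does not say what ACP does for unlisted tools or how the "mode" field changes the outcome.

```python
POLICY = {
    "mode": "enforce",
    "tools": {
        "Bash.rm": {"background": {"permission": "deny"}, "interactive": {"permission": "ask"}},
        "Bash.dd": {"background": {"permission": "deny"}, "interactive": {"permission": "ask"}},
        "Bash.mkfs": {"background": {"permission": "deny"}, "interactive": {"permission": "deny"}},
        "Bash.shred": {"background": {"permission": "deny"}, "interactive": {"permission": "ask"}},
    },
}


def resolve(policy: dict, tool: str, tier: str, default: str = "allow") -> str:
    """Return the permission for (tool, tier); `default` is an assumption of this sketch."""
    rule = policy.get("tools", {}).get(tool)
    if rule is None:
        return default
    return rule.get(tier, {}).get("permission", default)


if __name__ == "__main__":
    print(resolve(POLICY, "Bash.rm", "background"))
    print(resolve(POLICY, "Bash.rm", "interactive"))
    print(resolve(POLICY, "Bash.mkfs", "interactive"))
```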

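Those four entries close off the “the agent ran rm -rf on the wrong path while I wasn’t looking” class of incident for the verbs listed. Other destructive commands mentioned earlier, such as find . -delete and git clean -fdx, would need entries of their own.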

Step 3 — Bind end-user identity, per request

If you’re running an agent server-side (a webhook handler, a scheduled job, a multi-tenant SaaS), don’t let the agent inherit the service account’s filesystem permissions. Bind the end-user’s identity instead:

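# Assumes a FastAPI app; adapt the imports if you use another framework.
from fastapi import FastAPI, Header

from acp_governance import set_context

app = FastAPI()

@app.post("/run")
def run(req: dict, authorization: str = Header(...)):
    # Bind this request to the calling end user, not the service account.
    set_context(
        user_token=authorization.removeprefix("Bearer ").strip(),
        agent_name="dev-assistant",
        agent_tier="background",
    )
    # run_agent is your own agent entry point; it is not defined here.
    return run_agent(req)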

ACP’s policy engine resolves permissions against the user’s IdP scopes. If the user’s role doesn’t include filesystem-write on the path the agent is targeting, neither does the agent — even if the service account has broader permissions.
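The rule "the agent gets what the user gets, not what the service account gets" can be illustrated with a small path check. This is a sketch only: the function agent_can_write and the idea of representing a user's scopes as a list of writable root directories are assumptions, not ACP's data model.

```python
import posixpath
from pathlib import PurePosixPath


def agent_can_write(user_write_roots: list[str], target: str) -> bool:
    """True only if `target` sits inside one of the end user's writable roots.

    The service account's own permissions are deliberately not an input.
    """
    # Collapse ".." segments so /a/../b cannot pose as a path under /a.
    normalized = PurePosixPath(posixpath.normpath(target))
    return any(normalized.is_relative_to(root) for root in user_write_roots)


if __name__ == "__main__":
    alice = ["/srv/tenants/alice"]
    print(agent_can_write(alice, "/srv/tenants/alice/notes.txt"))
    print(agent_can_write(alice, "/srv/tenants/bob/notes.txt"))
```

The check is purely textual, so it does not follow symlinks; a real enforcement point would need to resolve those on the host before comparing.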

For shadow-IT setups without an IdP wired in, mint a tightly-scoped gsk_ API key per agent. Same principle: the agent acts as something explicit and scoped, not as the service-as-root.

(Free fourth step) — Audit log

Every governed call writes a structured row regardless of decision. The denied rm -rf attempt is logged with the full command string, the agent identity, the tier, the policy reason, and a timestamp. When you ask “did the agent try to delete anything today?”, the answer has receipts.

You don’t have to do anything to get this. It’s the byproduct of installing the hook.
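For a sense of what such a row might contain, here is a sketch built from the fields listed above. The AuditRow class, its field names, and the JSON-lines encoding are assumptions for illustration; the actual ACP log schema is not shown in this post.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class AuditRow:
    command: str
    agent_name: str
    agent_tier: str
    decision: str
    reason: str
    timestamp: str


def make_row(command: str, agent_name: str, agent_tier: str, decision: str, reason: str) -> AuditRow:
    """Build one audit row stamped with the current UTC time."""
    return AuditRow(
        command=command,
        agent_name=agent_name,
        agent_tier=agent_tier,
        decision=decision,
        reason=reason,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )


def to_json_line(row: AuditRow) -> str:
    """Serialize a row as one JSON line, suitable for an append-only log."""
    return json.dumps(asdict(row), sort_keys=True)


def attempted_deletes(rows: list[AuditRow]) -> list[AuditRow]:
    """Answer 'did the agent try to delete anything?' for rm calls."""
    return [r for r in rows if r.command.split()[:1] == ["rm"]]


if __name__ == "__main__":
    row = make_row("rm -rf ./build", "dev-assistant", "background", "deny", "Bash.rm denied in background tier")
    print(to_json_line(row))
```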

The total time investment

One curl command (Step 1): ~30 seconds
Four policy entries in the dashboard (Step 2): ~2 minutes
One line in your request handler (Step 3, optional): ~1 minute

About three minutes from blank slate to “an autonomous agent in this environment cannot destructively rewrite the filesystem without an interactive human session and an explicit policy override.”

If you’ve ever watched the agent type a path that didn’t look right and felt your stomach drop, three minutes of work removes the part where the dropping matters.

The shell is one tool of many — a real Claude Code session declares 76, most of which you’ve never watched it invoke. The Tool Surface Index groups them all by blast radius so you can set the rest of the posture too, and getting started covers the install above in more detail.

AgenticControlPlane.com
