What our scheduled agents can't do — and why that's the point
The unnerving part of running an agent that builds other agents isn’t the building. It’s that the agents it ships go off and run — on a schedule, on a trigger, while no one is watching — and they take real actions: send the email, write the record, post to the channel, hit the API. An unattended agent that can act is exactly where damage lives. A prompt can’t fix that, because the prompt is the thing you’re trusting, and trust is not a control.
So the question we had to answer about our own fleet wasn’t “is the agent well-behaved?” It was “what is it allowed to do, and what stops it the moment it tries something it shouldn’t?”
The same agent, watched and unwatched, shouldn’t have the same permissions
Here’s the insight the whole posture rests on. An agent you’re driving interactively — you, in the loop, watching each step — is a different risk than the same agent firing at 3 a.m. with nobody home. The interactive one can ask you when it’s unsure. The unattended one can’t. They should not carry the same permissions.
Agentic Control Plane makes that distinction first-class. Every tool call carries a tier — interactive, subagent, background, or api — and policy is written per tier. A human-driven agent might be allowed to run a destructive command after a confirm; the background version of that same agent is denied it outright. Not warned. Denied, before the call runs.
And the tier isn’t something the agent can talk its way out of. A scoped key — the kind a subagent runs on — is forced to its tier server-side, no matter what the request claims. An agent can’t escalate itself by asserting tier: interactive; the gateway already decided what it is.
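As a rough illustration of "forced to its tier server-side", here is a minimal sketch. The `ApiKey` shape, the `forced_tier` field, and the fallback to `background` for missing claims are all assumptions made for the example, not ACP's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

TIERS = ("interactive", "subagent", "background", "api")


@dataclass(frozen=True)
class ApiKey:
    key_id: str
    # Hypothetical field: set when the key is scoped, None otherwise.
    forced_tier: Optional[str] = None


def resolve_tier(key: ApiKey, claimed_tier: Optional[str]) -> str:
    """Return the tier the gateway enforces, regardless of what the request says."""
    if key.forced_tier is not None:
        # Scoped key: the tier claimed in the request is ignored entirely.
        return key.forced_tier
    if claimed_tier in TIERS:
        return claimed_tier
    # Assumption: a missing or unknown claim falls back to a restrictive tier.
    return "background"


subagent_key = ApiKey("key_sub_1", forced_tier="subagent")
resolved = resolve_tier(subagent_key, claimed_tier="interactive")
```

In this sketch the subagent's request asserts `interactive`, and the resolved tier is still `subagent`, because the key decides and the claim does not.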
How a call gets decided
When an agent makes a tool call, it doesn’t reach the tool. It reaches the governance hook first, which decides — deterministically — in four moves:
- Classify the action. Not “this is a Bash call,” but which Bash call: Bash.git, Bash.rm, Bash.curl.api.stripe.com. A file write becomes Write.credentials if it touches a secret; a fetch becomes WebFetch.github.com. The action’s actual blast radius is in its name.
- Resolve the tier: interactive / subagent / background / api, server-authoritative.
- Look it up, most-specific first. Policy is matched on a waterfall from Bash.curl.api.stripe.com down to Bash, first match wins. So you can allow git but deny rm, allow curl to your own API but ask on a curl anywhere else: distinctions a coarse “allow Bash: yes/no” can’t make.
- Decide: allow, deny, ask the human, or modify (redact the params and let it through). Then log it, with the decision and the reason, on the same trail as everything else.
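The classify-and-waterfall steps can be sketched in a few lines. This is an illustration under stated assumptions: the `classify_bash` heuristic, the `POLICY` table, and the `api.example.com` host are hypothetical, and a real classifier would need to handle far more command shapes.

```python
import shlex
from typing import Dict, List, Tuple
from urllib.parse import urlparse


def classify_bash(command: str) -> str:
    """Name a Bash call by its program and, for curl, its target host."""
    parts = shlex.split(command)
    if not parts:
        return "Bash"
    program = parts[0]
    if program == "curl":
        for arg in parts[1:]:
            host = urlparse(arg).hostname  # None for flags like -X
            if host:
                return f"Bash.curl.{host}"
    return f"Bash.{program}"


def candidates(action: str) -> List[str]:
    """Most-specific first: 'Bash.curl.a.b' -> [..., 'Bash.curl', 'Bash']."""
    parts = action.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


# Hypothetical per-tier policy table.
POLICY: Dict[str, Dict[str, str]] = {
    "interactive": {"Bash.git": "allow", "Bash.rm": "ask",
                    "Bash.curl.api.example.com": "allow", "Bash.curl": "ask"},
    "background": {"Bash.git": "allow", "Bash.rm": "deny",
                   "Bash.curl.api.example.com": "allow", "Bash.curl": "deny"},
}


def decide(action: str, tier: str) -> Tuple[str, str]:
    """Return (decision, matched rule); first match on the waterfall wins."""
    rules = POLICY.get(tier, {})
    for name in candidates(action):
        if name in rules:
            return rules[name], name
    return "deny", "default-deny"  # nothing matched: deny by default


action = classify_bash("curl -X POST https://api.stripe.com/v1/charges")
interactive_result = decide(action, "interactive")
background_result = decide(action, "background")
```

Here the same curl to a third-party host falls through to the `Bash.curl` rule in both tiers, which asks the human when one is present and denies when no one is.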
Concretely: say a scheduled digest agent is supposed to read a Slack channel, and a config slip points it at slack.postMessage instead. Classified as a write, resolved to background tier, matched against a policy where background can’t post — and it’s denied at the hook, before Slack is ever called. The outcome isn’t an incident and an apology in the channel. It’s thirty denied rows in the audit log that a human reviews later and fixes. The best control story is a boring one: nothing happened.
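To make the digest scenario concrete, here is a small sketch of a hook that decides and logs before the tool function is ever invoked. The `governed_call` wrapper, the audit row fields, and the `slack.readChannel` action name are hypothetical, and the policy lookup is an exact match to keep the example short.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical policy: background agents may read Slack but not post.
POLICY: Dict[str, Dict[str, str]] = {
    "background": {"slack.readChannel": "allow", "slack.postMessage": "deny"},
}


def governed_call(action: str, tier: str, tool: Callable[[], str],
                  audit_log: List[dict]) -> Optional[str]:
    """Decide, log the decision and reason, and only then (maybe) run the tool."""
    rules = POLICY.get(tier, {})
    decision = rules.get(action, "deny")
    reason = f"{tier} rule for {action}" if action in rules else "default-deny"
    audit_log.append({"action": action, "tier": tier,
                      "decision": decision, "reason": reason})
    if decision != "allow":
        return None  # the tool function is never invoked
    return tool()


slack_calls: List[str] = []


def post_digest() -> str:
    # Stand-in for the real Slack call; records that it was reached.
    slack_calls.append("posted")
    return "ok"


audit_log: List[dict] = []
for _ in range(30):  # thirty misconfigured scheduled runs
    governed_call("slack.postMessage", "background", post_digest, audit_log)
```

After the loop, `slack_calls` is still empty and `audit_log` holds thirty rows marked `deny`, which is the boring outcome described above.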
Control narrows as it delegates
The other place an agent-builder has to be careful is delegation. The agents it ships spawn subagents of their own, and authority has to shrink at every hop — never grow.
When an agent spawns a child, the child’s permissions are the intersection of what it asks for and what its parent actually holds. A parent scoped to github.repos.* and github.issues.read that spawns a child wanting issues.write doesn’t get it: the child can’t hold a permission its parent never had. Budgets narrow the same way: a child runs on the smaller of its own cap and the parent’s remaining balance. And the original human identity rides the whole chain, so an action three agents deep still answers the question “on whose behalf?” The ceiling is the human who started it; nobody below can exceed it.
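The narrowing rule is simple enough to sketch. The `Grant` type, the `delegate` function, and the glob-style scope matching below are assumptions for illustration; the source only states that scopes intersect, budgets take the smaller value, and the human identity is carried through.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Grant:
    on_behalf_of: str          # the human who started the chain
    scopes: Tuple[str, ...]    # may contain wildcards such as github.repos.*
    budget_remaining: float


def delegate(parent: Grant, requested_scopes: Sequence[str],
             requested_cap: float) -> Grant:
    """Build a child grant that can never exceed its parent."""
    # Keep only requested scopes that some parent scope covers.
    granted = tuple(
        scope for scope in requested_scopes
        if any(fnmatchcase(scope, held) for held in parent.scopes)
    )
    # The child runs on the smaller of its own cap and the parent's balance.
    budget = min(requested_cap, parent.budget_remaining)
    # The original human identity is carried through unchanged.
    return Grant(parent.on_behalf_of, granted, budget)


parent = Grant("alice@example.com",
               ("github.repos.*", "github.issues.read"), 5.0)
child = delegate(parent, ["github.repos.read", "github.issues.write"], 20.0)
grandchild = delegate(child, ["github.issues.read"], 1.0)
```

In this sketch the child ends up with `github.repos.read` only and a budget of 5.0. The grandchild asks for `github.issues.read`, which the original parent held but the child does not, so it receives no scopes at all: authority shrinks at each hop and never recovers.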
You can only deny what an agent can reach
Default-deny at runtime is the backstop. The first line is narrowing what a generated agent can even hold. When the builder picks tools for an agent it’s shipping, it prefers platform built-ins over connecting a user’s full OAuth account — notifications.sendEmail from a sandboxed sender over the keys to someone’s whole inbox — because a smaller grant is a smaller blast radius. It can only pick from tools the tenant has actually enabled, and any tool it hallucinates is rejected before the agent ever ships. The agent that reaches production is already scoped down; the runtime hook enforces what’s left.
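The build-time check amounts to validating the builder's tool picks against the tenant's enabled set. The sketch below assumes a hypothetical `validate_tools` function and a made-up hallucinated tool name; it is not the builder's actual code.

```python
from typing import Iterable, List, Set


def validate_tools(selected: Iterable[str], tenant_enabled: Set[str]) -> List[str]:
    """Reject the whole selection if any tool is not enabled for the tenant."""
    picked = list(selected)
    unknown = [tool for tool in picked if tool not in tenant_enabled]
    if unknown:
        # Covers both disabled tools and names the model made up.
        raise ValueError(f"unknown or disabled tools: {unknown}")
    return picked


tenant_enabled = {"notifications.sendEmail", "slack.postMessage"}

approved = validate_tools(["notifications.sendEmail"], tenant_enabled)

try:
    # "notifications.sendFax" is a hypothetical hallucinated tool name.
    validate_tools(["notifications.sendEmail", "notifications.sendFax"],
                   tenant_enabled)
    rejected = False
except ValueError:
    rejected = True
```

Failing the whole selection, rather than silently dropping the unknown tool, is a design choice in this sketch: an agent that ships missing a tool it was designed around is harder to debug than one that never ships.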
None of this depends on the agent being well-written. That’s the point. The control lives below the model, in a layer the agent calls through, not one it can choose to honor. That’s the policy pillar of ACP: allow, deny, ask, or redact — per action, per tier, per agent, enforced at the call. The agents we don’t watch are the ones we trust least, and they’re the ones held tightest.