AI Agent Tool Allowlists: Deny by Default, Scope per Task, Audit Everything

David Crowe · 6 min read


An AI agent tool allowlist is the explicit set of tools an agent is permitted to call; everything not on the list is denied by default. It’s the single highest-leverage control you can put on an agent, because an agent’s capability isn’t what it does but what it can do. In one real Claude Code session we captured, the harness declared 76 tools to the model, and 64 of them were never invoked. Every one of those 64 is standing capability (send email, publish a web page, schedule its own future runs) waiting on nothing but a model deciding to use it.

This page is the reference: what an allowlist is, how to configure one in Claude Code, Codex CLI, and MCP, how to scope it per task, and — honestly — where a client-side allowlist stops holding and what to do about that.

Why deny-by-default, and where it’s wrong

Deny-by-default is the right posture for the same reason it’s the right posture everywhere else in security: the cost asymmetry. A denied tool the agent didn’t need costs you nothing. An allowed tool it misused costs you whatever that tool could do, and for tools that send, pay, or delete, there is no rollback.

But apply it naively and you get the failure mode every developer who’s tried it knows: an agent that has to ask permission for every file read generates hundreds of prompts per session, and by day two you’re approving on reflex. So the practical rule, argued in full here, is:

Allow the core loop — read, search, edit, execute-in-sandbox. That’s the agent’s hands.
Deny the outward tail — send, publish, schedule, remote-spawn — until first legitimate need. Un-deny one tool at a time, not a family.
Gate the irreversible behind a human, if you have an approval mechanism that a human actually reads.

The Tool Surface Index maintains live captures of what each coding agent actually declares, grouped by blast radius, with a verdict per family. Sort your own agent’s tools into those families and the allowlist writes itself.
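A minimal sketch of that sorting step, assuming a hypothetical family table: the tool names and family labels below are illustrative and are not taken from the Tool Surface Index. Each declared tool maps to a family, each family carries a verdict, and anything unmapped falls through to deny.

```python
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    ASK = "ask"
    DENY = "deny"


# Hypothetical family verdicts, following the three-part rule above.
FAMILY_VERDICTS = {
    "core-loop": Verdict.ALLOW,    # read, search, edit, execute-in-sandbox
    "outward": Verdict.DENY,       # send, publish, schedule, remote-spawn
    "irreversible": Verdict.ASK,   # gated behind a human
}

# Hypothetical tool-to-family mapping; replace with your agent's declared tools.
TOOL_FAMILIES = {
    "Read": "core-loop",
    "Edit": "core-loop",
    "Grep": "core-loop",
    "SendEmail": "outward",
    "PublishPage": "outward",
    "DeleteBranch": "irreversible",
}


def verdict_for(tool_name: str) -> Verdict:
    """Return the verdict for a tool; unknown tools are denied by default."""
    family = TOOL_FAMILIES.get(tool_name)
    if family is None:
        return Verdict.DENY
    return FAMILY_VERDICTS.get(family, Verdict.DENY)


def build_allowlist(declared_tools: list[str]) -> dict[str, list[str]]:
    """Group a declared tool surface into allow / ask / deny buckets."""
    buckets: dict[str, list[str]] = {v.value: [] for v in Verdict}
    for name in sorted(declared_tools):
        buckets[verdict_for(name).value].append(name)
    return buckets
```

The important branch is the first one in verdict_for: a tool that nobody has classified yet lands in deny, so a newly declared tool never becomes allowed by omission.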

How to set a tool allowlist in Claude Code

Claude Code reads permission rules from settings.json (project-level .claude/settings.json or user-level ~/.claude/settings.json):

{
  "permissions": {
    "allow": [
      "Read",
      "Edit",
      "Bash(npm run test:*)",
      "Bash(git diff:*)"
    ],
    "deny": [
      "WebFetch",
      "Read(./.env)",
      "Read(./secrets/**)",
      "Bash(curl:*)"
    ]
  }
}

Three things worth knowing that the examples don’t show:

Bash rules are prefix matches on the command string, not a semantic understanding of the command. Bash(git diff:*) allows anything starting with git diff. Anthropic’s own docs are upfront that Bash deny rules can be sidestepped by how a command is written, so treat them as guardrails against accidents, not a security boundary. (More on this below.)
deny beats allow — a rule in both lists is denied.
MCP tools use the form mcp__servername__toolname — so a connected MCP server’s tools can be allowlisted individually, not just as a server.
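To make the first two points concrete, here is a deliberately simplified model of prefix matching with deny precedence. It is not Claude Code’s actual matcher: the rule parsing, the function names, and the choice to return "ask" when nothing matches are all assumptions made for illustration.

```python
from typing import Optional


def parse_bash_rule(rule: str) -> Optional[str]:
    """Extract the command prefix from a rule shaped like 'Bash(git diff:*)'.

    Returns None for rules that are not in that shape.
    """
    if rule.startswith("Bash(") and rule.endswith(":*)"):
        return rule[len("Bash("):-len(":*)")]
    return None


def matches(rule: str, command: str) -> bool:
    """Pure string-prefix match: nothing here inspects what the command does."""
    prefix = parse_bash_rule(rule)
    return prefix is not None and command.startswith(prefix)


def decide(command: str, allow: list[str], deny: list[str]) -> str:
    """Check deny rules first so deny beats allow; no match means ask (assumed)."""
    if any(matches(rule, command) for rule in deny):
        return "deny"
    if any(matches(rule, command) for rule in allow):
        return "allow"
    return "ask"
```

With allow = ["Bash(git diff:*)"] and deny = ["Bash(curl:*)"], this model allows git diff HEAD~1 and denies curl https://example.com, but /usr/bin/curl https://example.com matches neither prefix and falls through to ask. The same effect, written differently, never touches the deny rule.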

How to set one in Codex CLI

Codex ships a deliberately tighter surface — 17 declared tools against Claude Code’s 76 — and controls it with approval policies and sandbox modes rather than per-tool lists, in ~/.codex/config.toml:

approval_policy = "untrusted"     # ask before any command not on the trusted list
sandbox_mode    = "workspace-write"  # writes confined to the workspace

The trade: less granularity, but the sandbox does structural work a string-matched list can’t. What Codex doesn’t give you is per-tool allow/deny on its non-shell tools — the practical route to that is governing at the MCP boundary or the model proxy, same as below.

How to allowlist MCP tools

An MCP server is a package of tools, and connecting one is a family-sized grant — the calendar server you added for scheduling also reads every event. Two levels:

Client-side: the mcp__server__tool rules above — real granularity, but they live in a config file the agent’s host can edit, and they only bind that one client.
Server-side (gateway): front the MCP server with a gateway that exposes only an enabled subset of tools per workspace. The agent’s tools/list simply never includes the rest. Nothing to bypass client-side, because the denied tools don’t exist from the agent’s point of view. This is how ACP does it — the allowlist is enforced where the tool call lands, not where it’s emitted.
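The gateway idea can be sketched in a few lines, assuming the standard MCP JSON-RPC shapes: a tools/list result carrying a tools array of objects with a name, and a tools/call request carrying params.name. The function names and the enabled set are hypothetical, and this is not a description of how ACP is implemented.

```python
import copy


def filter_tools_list(response: dict, enabled: set[str]) -> dict:
    """Return a copy of a tools/list response that lists only enabled tools.

    Error responses and unexpected shapes are passed through unchanged.
    """
    filtered = copy.deepcopy(response)
    result = filtered.get("result")
    if isinstance(result, dict) and isinstance(result.get("tools"), list):
        result["tools"] = [
            tool for tool in result["tools"] if tool.get("name") in enabled
        ]
    return filtered


def guard_tool_call(request: dict, enabled: set[str]) -> bool:
    """Return True if the request may be forwarded to the MCP server.

    Hiding a tool from tools/list is not enough on its own: the gateway
    also has to refuse tools/call requests that name a hidden tool.
    """
    if request.get("method") != "tools/call":
        return True
    params = request.get("params") or {}
    return params.get("name") in enabled
```

Both halves matter. The filter keeps denied tools out of the model’s view, and the guard covers a client that calls a tool by name anyway.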

Scope per task, not per install

A standing allowlist answers “what may this agent ever do.” The better question is “what does it need for this task” — a code-review agent needs read and comment; the same harness doing a release needs push and publish. Two mechanisms make per-task scoping cheap:

Tiers. Split policy by context: interactive (you’re watching) vs background (cron jobs, CI, headless runs). The same tool can be ask interactively and deny in background — an agent running at 3am has no one to ask.
Flip-on-first-need. Start narrow. When the agent legitimately needs a denied tool, the deny message tells you which rule fired; un-denying one tool is a one-line change. That workflow beats guessing the full list upfront, every time.
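A minimal sketch of the tier rule, assuming a hypothetical base policy and two tier names ("interactive" and "background"). The tool names are illustrative. The only logic is that an ask verdict degrades to deny when there is nobody to answer.

```python
from typing import Literal

Decision = Literal["allow", "ask", "deny"]

# Hypothetical base policy; anything not listed is denied.
BASE_POLICY: dict[str, Decision] = {
    "Read": "allow",
    "Edit": "allow",
    "GitPush": "ask",
}


def decide_for_tier(tool: str, tier: str) -> Decision:
    """Resolve a tool's verdict for a tier ("interactive" or "background")."""
    decision: Decision = BASE_POLICY.get(tool, "deny")
    # A background run has no human to prompt, so "ask" becomes "deny".
    if tier == "background" and decision == "ask":
        return "deny"
    return decision
```

Under this policy GitPush resolves to ask in an interactive session and to deny in a background run, while Read stays allowed in both.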

The deeper version of this argument (narrow, typed tools as the precondition for any meaningful policy) is in “agent access control with scoped tools.”

Where client-side allowlists stop holding

Now the honest part. A client-side allowlist — settings.json, hooks, any check that runs inside the agent’s own process — holds against mistakes, which is most of what goes wrong. It does not hold against the two things people quietly assume it does:

String-matching misses semantics. A deny rule such as Bash(rm:*) doesn’t catch the same deletion expressed as a script the agent wrote and then executed. The effects of a shell command aren’t in its prefix. We walk through the documented bypass classes in “Claude Code’s deny list can be bypassed.”
The client can opt out. Claude Code’s --dangerously-skip-permissions flag disables the permission prompts and every hook — including ours. An allowlist the client enforces is an allowlist the client can drop.

The fix isn’t a cleverer regex; it’s moving enforcement to a boundary the agent can’t route around: a gateway that executes the tool calls (denied tools don’t exist), or a model proxy that strips denied tool_calls before they ever reach the harness. Client-side rules stay useful — they’re fast, local, and catch the accidents — but the list that has to hold lives outside the agent’s process.
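As a sketch of the proxy half of that idea, assume the model response uses an OpenAI-style assistant message, where tool_calls is a list of objects with a function.name field. Other APIs shape tool calls differently (Anthropic’s Messages API uses tool_use content blocks), so the field names here are an assumption, as is the function name. Streaming and telling the model why a call vanished are left out.

```python
def strip_denied_tool_calls(
    message: dict, allowed: set[str]
) -> tuple[dict, list[str]]:
    """Drop tool calls whose name is not in the allowlist.

    Returns the cleaned message and the names that were stripped, so the
    proxy can log each denial. The input message is not modified.
    """
    kept: list[dict] = []
    stripped: list[str] = []
    for call in message.get("tool_calls") or []:
        name = (call.get("function") or {}).get("name")
        if name in allowed:
            kept.append(call)
        else:
            stripped.append(str(name))

    cleaned = dict(message)  # shallow copy; only tool_calls is replaced
    if kept:
        cleaned["tool_calls"] = kept
    else:
        cleaned.pop("tool_calls", None)
    return cleaned, stripped
```

Because the check runs in the proxy, a harness started with its permission checks switched off still never receives the denied call.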

Audit is the allowlist’s other half

An allowlist you can’t audit decays silently. Harnesses lazy-load tools, MCP servers connect mid-session, updates add families — in our captured traffic, one session gained 21 tools partway through the day with no prompt and no changelog. The allowlist you wrote in January is matched against a surface you counted in January.
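Spotting that kind of drift is a set difference over declared tool names. The sketch below assumes you already have the names from two captures; how you capture them depends on your harness and is not shown. The function names and the log record fields are hypothetical.

```python
import json
from datetime import datetime, timezone


def diff_surface(previous: set[str], current: set[str]) -> dict[str, list[str]]:
    """Compare two captured tool surfaces by name."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


def audit_record(tool: str, decision: str, rule: str) -> str:
    """Build one JSON log line for a decision, including the rule that made it."""
    return json.dumps(
        {
            "ts": datetime.now(timezone.utc).isoformat(),
            "tool": tool,
            "decision": decision,
            "rule": rule,
        }
    )
```

A non-empty added list is the signal to re-sort: every new name needs a verdict before the next session, not after the first incident.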

So the operational loop is: capture the declared surface, diff it across sessions, and log every allow/deny decision with the rule that made it. That’s the part that turns a config file into a control. Doing it continuously — every declaration captured, every change diffed, every tool a click to allow, flag, or deny — is what ACP does for coding agents:

curl -sf https://agenticcontrolplane.com/install.sh | bash

But the posture stands with any tooling: deny by default, allow the loop, scope to the task, and re-check the count — because the allowlist is only as current as your last look at the surface.
