Stop your AI agent from leaking PII through tool calls — in three steps
A SQL agent runs SELECT email, name FROM users WHERE plan = 'enterprise'. The query is valid. The user is authorized. The result set is 4,200 rows of (email, name) pairs that are now in the LLM’s context window. From there:
- They flow into the conversation history. Forever.
- They’re in the model provider’s logs (Anthropic, OpenAI, whoever).
- They appear, verbatim, in any subsequent tool call the agent makes — Slack message drafts, Jira ticket bodies, a follow-up SQL query, a Markdown report.
- They show up in any session-replay tool you have. They show up in error reports. They show up if you ever export the conversation for compliance review.
You’ve now got customer PII in 4-6 systems where you didn’t intend to put it. The agent didn’t do anything malicious. The agent did exactly what it was told. The leak happened between the tool returning a result and the agent reading that result — a layer that most governance products don’t even instrument.
This is the most common form of PII leakage in AI agents in production right now. It’s also one of the easiest to gate, if you put governance at the right layer.
Why this is the default failure mode
The mental model most teams have for PII protection in AI is input-side — don’t let users paste PII into chat. There are real solutions for this (covered separately). But input-side protection doesn’t help you when the PII enters via a tool result, not a user prompt.
The agent isn’t typing PII. It’s reading it from your warehouse, your CRM, your customer database — systems where PII should exist. The agent then has it in context, and from there:
- The agent is asked to “summarize the top complaints” → it includes specific customer names and emails in the summary
- The agent is asked to “draft a follow-up to enterprise customers” → it generates personalized emails with real customer data, addressed to the wrong recipients in the conversation
- The agent is asked to “flag the unusual signups” → it lists customer identifiers in its analysis output, which is then logged
- The agent is asked to “find similar accounts” → it runs a follow-up SQL with literal customer values in the WHERE clause, which logs the values in your DB query log
Every one of these is the same root failure: PII entered the agent’s context window from a tool output, and now the agent is using it as referenceable text.
The fix isn’t telling the agent to be more careful. The fix is making sure PII never reaches the agent’s context in the first place.
Three steps that put a control plane between your agent and your tool outputs
Step 1 — Install the hook
For any agent calling tools (Cursor, Claude Code, Codex CLI, custom Python with acp-governance, your CrewAI/LangGraph agent):
curl -sf https://agenticcontrolplane.com/install.sh | bash
The hook intercepts every tool call after it returns — the PostToolUse hook — and runs the result through ACP’s PII scanner before passing it to the agent. The agent receives a result with PII tokens replaced by [REDACTED:<type>] markers; the raw values never enter context.
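To make the redaction step concrete, here is a minimal sketch of that kind of scan-and-replace pass, assuming simple regex detection for two PII types. The pattern set and function name are illustrative, not ACP's actual detector (which covers the built-in types listed in Step 2).

import re

# Illustrative patterns only; the real detector covers the built-in types listed in Step 2.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

def redact_tool_output(text: str) -> str:
    """Replace detected PII with [REDACTED:<type>] markers before the agent reads it."""
    for pii_type, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{pii_type}]", text)
    return text

In ACP the equivalent pass runs at the gateway, on the raw tool result, so nothing in your agent code has to change.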
For Python agents using the SDK directly:
from acp_governance import governed
@governed("warehouse.run_sql")
def run_sql(query): ...
The decorator wires up both hooks. PreToolUse decides allow/deny/ask. PostToolUse scans the output and redacts.
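As a rough mental model (not the SDK's real internals), you can picture the decorator as the wrapper below; governed_sketch is a hypothetical name, and the redaction call reuses redact_tool_output from the Step 1 sketch.

import functools

def governed_sketch(tool_name: str):
    """Hypothetical stand-in for acp_governance.governed, for intuition only."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # PreToolUse would run here: policy decides allow / deny / ask for
            # tool_name before the tool executes.
            result = fn(*args, **kwargs)
            # PostToolUse: scan the raw result and redact PII before the agent
            # ever sees it (redact_tool_output is from the Step 1 sketch).
            return redact_tool_output(str(result))
        return wrapper
    return decorator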
Step 2 — Enable PostToolUse PII scanning per tool
In your dashboard (cloud.agenticcontrolplane.com) → Policies → Tool Outputs:
{
  "mode": "enforce",
  "tools": {
    "warehouse.run_sql": { "post_output": { "pii": "redact" } },
    "crm.fetch_customer": { "post_output": { "pii": "redact" } },
    "salesforce.query": { "post_output": { "pii": "redact" } },
    "support.fetch_ticket": { "post_output": { "pii": "redact" } },
    "stripe.list_customers": { "post_output": { "pii": "redact" } }
  }
}
The built-in detector recognizes 12 common patterns: email, phone, SSN, credit card (PAN), IBAN, US passport, US driver’s license, US tax ID, NHS number, IP address (where flagged), full name (heuristic), and date of birth (with context). For internal-only patterns — your customer ID format, your internal employee ID — register custom regex via /admin/pii-patterns:
curl -X POST https://api.agenticcontrolplane.com/{slug}/admin/pii-patterns \
  -H "Authorization: Bearer $ACP_TOKEN" \
  -d '{"name":"customer-id","regex":"CUST-[A-Z0-9]{8}","redaction_label":"internal-customer-id"}'
The custom patterns get a ReDoS guard so you can’t accidentally hang the gateway with a pathological regex: validation runs each pattern with a 200 ms timeout in a worker thread.
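A rough sketch of that kind of guard: the function below validates a candidate pattern against a sample under a hard timeout. It uses a worker process rather than a thread, since CPython's re engine can't be interrupted mid-match from another thread, so treat it as an approximation of the gateway's guard rather than its implementation.

import multiprocessing
import re

def _try_match(regex: str, sample: str) -> None:
    re.compile(regex).search(sample)

def validate_custom_pattern(regex: str, sample: str, timeout_s: float = 0.2) -> bool:
    """Reject patterns that can't finish a search within the time budget."""
    proc = multiprocessing.Process(target=_try_match, args=(regex, sample))
    proc.start()
    proc.join(timeout_s)
    if proc.is_alive():  # catastrophic backtracking: kill the worker, reject the pattern
        proc.terminate()
        proc.join()
        return False
    return True

if __name__ == "__main__":
    print(validate_custom_pattern(r"(a+)+$", "a" * 30 + "b"))             # False: ReDoS-prone
    print(validate_custom_pattern(r"CUST-[A-Z0-9]{8}", "CUST-ABCD1234"))  # True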
Step 3 — Verify, then expand to more tools
Run a query that should return PII. Check the agent’s view of the result:
echo 'SELECT email, name, plan_tier FROM users LIMIT 3' | your-sql-agent
You should see something like:
# What the agent receives:
[
  {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "enterprise"},
  {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "enterprise"},
  {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "starter"}
]
Aggregation columns pass through. PII columns are redacted. The agent’s analytical task (“how many enterprise vs starter?”) still works. The agent never sees the underlying customers.
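For instance, the aggregate question still has a clean answer over the redacted rows; the snippet below is plain Python to illustrate the point, not part of ACP.

from collections import Counter

redacted_rows = [
    {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "enterprise"},
    {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "enterprise"},
    {"email": "[REDACTED:email]", "name": "[REDACTED:name]", "plan_tier": "starter"},
]
print(Counter(row["plan_tier"] for row in redacted_rows))
# Counter({'enterprise': 2, 'starter': 1}): the analytical answer survives redaction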
In the audit log (/admin/audit), each PostToolUse row records the redactions applied: findings: [{"type": "email", "count": 3}, {"type": "name", "count": 3}]. You can prove the redaction happened on every call.
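If you want to check this programmatically, something like the snippet below can pull the audit rows. The endpoint path comes from the text above; the bearer-token auth mirrors the pii-patterns call, and the response shape and the "tool" field name are assumptions to adapt to your setup.

import json
import os
import urllib.request

# Assumed: same base URL and bearer-token auth as the /admin/pii-patterns call above.
# Replace {slug} with your workspace slug.
url = "https://api.agenticcontrolplane.com/{slug}/admin/audit"
req = urllib.request.Request(url, headers={"Authorization": f"Bearer {os.environ['ACP_TOKEN']}"})
with urllib.request.urlopen(req) as resp:
    rows = json.load(resp)

for row in rows:
    # "findings" is the field named in the audit description; "tool" is an assumed key.
    if row.get("tool") == "warehouse.run_sql":
        print(row.get("findings"))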
Once it’s verified for one tool, add the policy to every tool whose output might contain PII. CRM, support tickets, billing systems, search APIs — anything that touches customer records. The pattern is the same.
What this protects against
- Accidental PII in conversation logs — primary failure mode, fixed
- PII in the model provider’s logs — Anthropic / OpenAI receive only the redacted version
- PII in downstream tool calls — agent can’t pass through what it doesn’t have
- PII in error reports + session replays — also fixed, since the leak is upstream of all of them
- PII used by the agent in its own reasoning — the agent reasons over [REDACTED:email] tokens, which are opaque
What this doesn’t protect against
- Statistical re-identification — if the agent counts “3 enterprise users in Norway,” and there are only 3 enterprise users in Norway, the count itself reveals identity. PII redaction is not differential privacy. For high-stakes anonymization, layer differential privacy on the warehouse view itself.
- Tools that legitimately need to return identifiers — if the agent needs to send an email to a customer, it has to know which customer. For these flows, run the operation under a different tool name (e.g., email.send_to_customer_id) where the PII pass-through is explicit and policy-checked separately.
- PII in tool inputs — separate problem, separate fix (covered here).
- Custom data formats not covered by built-in detectors — register custom regex per Step 2.
Companion code
For a complete end-to-end SQL agent with this pattern wired up: Build a governed SQL agent that scrubs PII from query results. 180 lines of working Python.
Cross-reference
- PII in prompts: what you’re probably leaking — the input-side version
- Build a governed SQL agent — full working code for this exact pattern
- Stop your AI agent from leaking secrets in your .env file — the same pattern applied to credentials, not customer data
- Three layers between an agent and your production database — the broader defense-in-depth shape
The protection above runs at the gateway, not in your agent code. Your agent doesn’t need to know about PII redaction. The control plane does it on every tool output, every time, audited.