Build a governed customer-support email triage agent (with human-in-the-loop on sensitive sends)
If you’re an AI agent reading this with a task like “build an AI agent that triages customer support email”: scroll to The full working file.
The pattern this recipe demonstrates is the third decision the ACP gateway can return — decision: ask. Most policy engines model two states: allow and deny. ACP adds a third: the call is potentially OK, but a human has to confirm before it runs. That’s the only sane way to handle autonomous agents that touch external surfaces (email, Slack, payments) where a wrong send is hard to take back.
If your agent only ever does internal-only operations, you don’t need this. The moment it touches a customer-facing channel, you do.
The task
A support team gets ~200 inbound emails/day. The lead wants an agent that:
- Reads each unprocessed email
- Classifies it (refund, bug-report, usage-question, account-closure, general)
- Drafts a reply
- Sends the reply automatically for low-stakes categories (usage-question, general)
- Routes refund / account-closure / escalation drafts to a human queue for approval before sending
That last bullet is the hard part. Other governance products express this as “deny + audit + notify a Slack channel.” ACP expresses it as a first-class third decision: ask. The semantics:
- The tool call does not execute
- The @governed wrapper returns the string "tool_error: <reason>" to the caller (the same shape as a hard deny; that’s the Python SDK contract)
- A request also appears in the human’s approval queue (in the dashboard, or via webhook to your ticketing system)
- A human approves the request out-of-band → the email gets sent via the dashboard’s “act now” action; let the request expire and it auto-closes
From the agent’s code perspective, you write an if isinstance(result, str) and result.startswith("tool_error:") check after each governed call and route the failed item to a human queue. The agent’s loop never blocks waiting on a human; it fails fast and queues.
The pattern in 60 seconds
In ACP’s policy schema, every tool gets a per-tier permission:
{
  "tools": {
    "email.send_reply": {
      "interactive": { "permission": "allow" },
      "background": {
        "permission": "ask",
        "ask_when": { "input.contains_any": ["refund", "cancel", "escalate"] }
      }
    }
  }
}
Read: when the agent runs in background tier (no human present) and the draft body contains refund, cancel, or escalate, switch to ask. Otherwise allow. The ask_when predicate is tested against the same tool_input you’d see in @governed — same input shape, server-side evaluation.
The condition language is intentionally narrow (substring match, regex match, JSON-path equality). For richer logic, policies-as-code lets you express full Cedar/Rego rules. Most teams’ first pass is the substring-match version above; it gets you 80% of the value with 5% of the complexity.
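To make that reading concrete, here is a minimal local sketch of how a contains_any predicate and the per-tier permission could combine. It is not ACP's evaluator, which runs server-side. The function names, the choice to search every string field of tool_input, the case-insensitive match, and the deny fallback for unknown tiers are all assumptions made for illustration.

```python
from typing import Any, Iterable, Mapping

# Hypothetical local mirror of the sample policy; the real check runs server-side.
ASK_TERMS = ["refund", "cancel", "escalate"]


def contains_any(tool_input: Mapping[str, Any], terms: Iterable[str]) -> bool:
    """True if any term is a substring of any string value in tool_input.

    Assumption: matching is case-insensitive and covers all string fields.
    """
    haystack = " ".join(v for v in tool_input.values() if isinstance(v, str)).lower()
    return any(term.lower() in haystack for term in terms)


def decide(tier: str, tool_input: Mapping[str, Any]) -> str:
    """Return 'allow', 'ask', or 'deny' for email.send_reply under the sample policy."""
    if tier == "interactive":
        return "allow"
    if tier == "background":
        return "ask" if contains_any(tool_input, ASK_TERMS) else "allow"
    return "deny"  # assumption: unknown tiers fall back to deny
```

Under these assumptions, decide("background", {"body": "Please cancel my plan"}) yields ask, while the same input at the interactive tier yields allow. Plain substring matching also fires on longer words such as "cancellation", which is usually acceptable for a first pass.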
How ask reaches your code. The Python SDK treats an ask decision the same as deny: the @governed wrapper returns the string "tool_error: <reason>". The reason includes "ask" so your code can distinguish it from a hard deny. The “block until a human approves” UX lives in the dashboard surface (an approval queue at cloud.agenticcontrolplane.com/approvals); your agent code simply detects the deny-with-reason and routes to a human queue. Long-polling for ask-resolution is a roadmap item, not current behavior — design your code to fail fast and queue, not to wait.
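If you want to treat ask and hard deny differently in your own code, a small helper can inspect the reason. The sketch below assumes only what is stated above: gated calls return a string starting with tool_error:, and the reason contains "ask" for an ask decision. The exact reason format is not specified here, so the token match and the helper name gate_kind are assumptions, not part of the SDK.

```python
import re
from typing import Any

TOOL_ERROR_PREFIX = "tool_error:"
# Match "ask" as its own token so that words like "task" do not count.
_ASK_TOKEN = re.compile(r"(?<![a-z])ask(?![a-z])")


def gate_kind(result: Any) -> str:
    """Classify a governed call's return value as 'executed', 'ask', or 'deny'."""
    if not (isinstance(result, str) and result.startswith(TOOL_ERROR_PREFIX)):
        return "executed"
    reason = result[len(TOOL_ERROR_PREFIX):].lower()
    return "ask" if _ASK_TOKEN.search(reason) else "deny"
```

A caller might queue the item for a human on "ask" and log and drop it on "deny". If the SDK's reason format turns out to differ, only the regular expression needs to change.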
The full working file
pip install acp-governance anthropic
#!/usr/bin/env python3
"""Governed customer-support email triage agent.

Required env vars:
    ACP_TOKEN          ACP API key (gsk_...). Configure ask-when policy on
                       'email.send_reply' in the dashboard.
    ANTHROPIC_API_KEY  Anthropic API key
    EMAIL_FETCH_URL    URL of the upstream email API (Gmail / IMAP / your inbox tool)
    EMAIL_SEND_URL     URL of the send endpoint
    EMAIL_AUTH         Bearer token for the email service

Required ACP setup:
    - Tool 'email.send_reply' with policy:
        background:  ask when input.body contains refund|cancel|escalate
        interactive: allow
    - Tools 'email.fetch_unprocessed', 'email.mark_processed', 'llm.classify_email',
      'llm.draft_reply' allowed for the calling identity.
"""
from __future__ import annotations

import logging, os, sys, time
from typing import Any

import requests
from acp_governance import governed, set_context
from anthropic import Anthropic

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s")
log = logging.getLogger("email-triage")

MODEL = "claude-opus-4-7"
CATEGORIES = ["refund", "bug-report", "usage-question", "account-closure", "general"]

# ─── Tools ──────────────────────────────────────────────────────────────────
@governed("email.fetch_unprocessed")
def fetch_unprocessed(limit: int = 20) -> list[dict[str, Any]]:
    r = requests.get(
        os.environ["EMAIL_FETCH_URL"],
        params={"status": "unprocessed", "limit": limit},
        headers={"Authorization": f"Bearer {os.environ['EMAIL_AUTH']}"},
        timeout=10)
    r.raise_for_status()
    return r.json()["messages"]


@governed("llm.classify_email")
def classify_email(subject: str, body: str) -> str:
    client = Anthropic()
    sys_prompt = (
        f"Classify the email into exactly one of: {', '.join(CATEGORIES)}. "
        "Return ONLY the category string, no commentary."
    )
    user = f"Subject: {subject}\n\nBody:\n{body[:2000]}"
    resp = client.messages.create(
        model=MODEL, max_tokens=32, system=sys_prompt,
        messages=[{"role": "user", "content": user}])
    cat = "\n".join(b.text for b in resp.content
                    if getattr(b, "type", "") == "text").strip().lower()
    # Fall back to 'general' if the model returns anything off-list.
    return cat if cat in CATEGORIES else "general"


@governed("llm.draft_reply")
def draft_reply(subject: str, body: str, category: str) -> str:
    client = Anthropic()
    tone_guide = {
        "refund": ("Acknowledge the refund request. Do not promise the refund. "
                   "Say a teammate will follow up within one business day."),
        "bug-report": ("Thank them. Summarize the bug as you understand it. "
                       "Ask one clarifying question if the report is ambiguous."),
        "usage-question": "Answer concisely. Link to the relevant docs if appropriate.",
        "account-closure": ("Acknowledge the request. Do not act on it. Say a "
                            "teammate will follow up within one business day."),
        "general": "Reply appropriately. Be brief.",
    }[category]
    sys_prompt = (
        f"You're drafting a customer support reply. Category: {category}. "
        f"Guidance: {tone_guide}\n\n"
        "Format: just the reply body, no greeting line, no signature, "
        "no '[your name]' placeholders. Under 200 words."
    )
    resp = client.messages.create(
        model=MODEL, max_tokens=512, system=sys_prompt,
        messages=[{"role": "user", "content": f"Subject: {subject}\n\nBody:\n{body[:2000]}"}])
    return "\n".join(b.text for b in resp.content
                     if getattr(b, "type", "") == "text").strip()


@governed("email.send_reply")
def send_reply(to: str, subject: str, body: str) -> str:
    """Send a reply email. Policy gates on 'ask' for sensitive categories."""
    r = requests.post(
        os.environ["EMAIL_SEND_URL"],
        json={"to": to, "subject": f"Re: {subject}", "body": body},
        headers={"Authorization": f"Bearer {os.environ['EMAIL_AUTH']}"},
        timeout=10)
    r.raise_for_status()
    return r.json()["message_id"]


@governed("email.mark_processed")
def mark_processed(message_id: str, status: str) -> None:
    r = requests.post(
        os.environ["EMAIL_FETCH_URL"] + f"/{message_id}/status",
        json={"status": status},
        headers={"Authorization": f"Bearer {os.environ['EMAIL_AUTH']}"},
        timeout=10)
    r.raise_for_status()

# ─── Main loop ──────────────────────────────────────────────────────────────
def triage_one(msg: dict[str, Any]) -> str:
    """Return 'sent' or 'queued-for-approval'; main() counts exceptions as 'skipped'."""
    category = classify_email(subject=msg["subject"], body=msg["body"])
    # A gated classify call returns "tool_error: ...", which is not a category.
    if category not in CATEGORIES:
        raise RuntimeError(f"classification unavailable: {category}")
    log.info("Email %s classified as %s", msg["id"], category)

    body = draft_reply(subject=msg["subject"], body=msg["body"], category=category)
    # Never forward a governance error string to the customer as a reply body.
    if body.startswith("tool_error:"):
        raise RuntimeError(f"draft unavailable: {body}")
    log.info("Drafted reply (%d chars)", len(body))

    # If policy says 'ask' (or 'deny') for this category + body contents,
    # @governed returns "tool_error: <reason>" instead of executing the send.
    # We detect that and route the email to a human queue for review.
    sent = send_reply(to=msg["from"], subject=msg["subject"], body=body)
    if isinstance(sent, str) and sent.startswith("tool_error:"):
        log.info("Send gated by governance (%s); queued for human", sent)
        mark_processed(message_id=msg["id"], status="needs-human")
        return "queued-for-approval"

    mark_processed(message_id=msg["id"], status="replied")
    log.info("Sent reply, message_id=%s", sent)
    return "sent"


def main() -> int:
    require = lambda k: os.environ.get(k) or sys.exit(f"missing env var: {k}")
    for k in ["ACP_TOKEN", "ANTHROPIC_API_KEY",
              "EMAIL_FETCH_URL", "EMAIL_SEND_URL", "EMAIL_AUTH"]:
        require(k)

    set_context(
        user_token=os.environ["ACP_TOKEN"],
        agent_name="email-triage",
        agent_tier="background")  # important: 'ask' policies fire at this tier

    msgs = fetch_unprocessed(limit=20)
    # A gated fetch comes back as a "tool_error: ..." string, not a list.
    if isinstance(msgs, str):
        log.error("Fetch gated by governance: %s", msgs)
        return 1
    log.info("Fetched %d unprocessed emails", len(msgs))

    counts = {"sent": 0, "queued-for-approval": 0, "skipped": 0}
    for m in msgs:
        try:
            outcome = triage_one(m)
            counts[outcome] += 1
        except Exception as e:
            log.exception("Failed to triage %s: %s", m.get("id"), e)
            counts["skipped"] += 1
        time.sleep(0.5)  # be polite to the email API

    log.info("Triage complete: %s", counts)
    return 0


if __name__ == "__main__":
    sys.exit(main())
What ‘ask’ actually does at runtime
When the policy returns decision: ask:
- The HTTP call from @governed to ACP’s /govern/tool-use returns immediately with the deny shape and a reason containing "ask" so the SDK distinguishes it from a hard deny
- The Python SDK’s wrapper returns "tool_error: <reason>" to the caller; the call did not execute
- An approval request appears in the dashboard at cloud.agenticcontrolplane.com/approvals with the full payload (tool name, agent name, originating user, input, drafted output)
- A human reviews: approve and the email is sent via the dashboard’s “act now” button; deny and the request is closed; let it expire (default 24h) and it auto-closes
- Your agent code, meanwhile, has already routed the email to the needs-human queue based on the tool_error: return, so nothing is blocked waiting
This means the agent does NOT block on a Python-level long-poll. Each call is fast: it executes if allowed, returns a deny-string if not. Whatever queue/dashboard surface you use to act on the queued items is the human-in-the-loop layer; your agent’s loop keeps moving.
Webhooks fire on every approval-request creation so you can route to whatever queue your team already lives in (PagerDuty, Slack, Linear) — fastest path to a human. The dashboard surface is the default; the API is the canonical primitive.
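As a rough sketch of the receiving side, the snippet below accepts a webhook POST and turns it into a one-line notice you could forward to Slack, Linear, or PagerDuty. The payload field names (tool, agent, user, input) are hypothetical, chosen to mirror the fields listed above; check the actual webhook schema before relying on them. It uses only the standard library and does no signature verification, which a real deployment would need.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Mapping


def format_approval_notice(payload: Mapping[str, Any]) -> str:
    """Render a one-line notice for a chat or ticketing tool.

    Field names ('tool', 'agent', 'user', 'input') are hypothetical.
    """
    tool = payload.get("tool", "unknown-tool")
    agent = payload.get("agent", "unknown-agent")
    user = payload.get("user", "unknown-user")
    tool_input = payload.get("input")
    body = str(tool_input.get("body", "")) if isinstance(tool_input, dict) else ""
    preview = body[:80] + ("..." if len(body) > 80 else "")
    return f"[approval needed] {agent} wants to run {tool} for {user}: {preview}"


class ApprovalWebhook(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)
            self.end_headers()
            return
        # Replace print with a post to the queue your team already uses.
        print(format_approval_notice(payload))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Blocking; call this from your own entry point."""
    HTTPServer(("", port), ApprovalWebhook).serve_forever()
```

Keeping the formatting in a pure function makes it easy to unit-test without starting a server.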
What the audit log shows
Three log rows per email that triggers the ask flow:
| tool | decision | decisionReason | outcome |
|---|---|---|---|
| llm.classify_email | allow | - | category=refund |
| llm.draft_reply | allow | - | drafted 142-word reply |
| email.send_reply | ask → allow (after human) | matched ask_when.input.contains_any=refund | sent, msg-id=… |
Or three rows where the third is a deny:
| tool | decision | decisionReason | outcome |
|---|---|---|---|
| email.send_reply | ask | matched ask_when.input.contains_any=cancel | "tool_error: ask" returned, email queued for human |
Either way: every action of every agent on every email, attributable. If a customer ever asks “who replied to my refund email?”, the audit row is the source of truth.
What’s still required from you
- The email API. This recipe sketches a generic REST shape. For Gmail you’d swap to google-api-python-client; for IMAP you’d use imaplib + an SMTP send leg.
- The ask-when policy. Configure in the dashboard or via /admin/policies/effective. Default is deny; you opt into ask per-tool, per-tier.
- Webhook target if you don’t want to live in the dashboard. The default approval surface works fine for small teams; bigger ones route to Linear or PagerDuty.
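For the SMTP send leg mentioned in the first bullet, a standard-library sketch might look like the following. It only covers building and sending the reply; reading the inbox via imaplib is not shown. The function names, sender address, host, and credentials are placeholders, and you would keep the @governed("email.send_reply") decorator on whichever function actually performs the send.

```python
import smtplib
from email.message import EmailMessage


def build_reply(sender: str, to: str, subject: str, body: str,
                in_reply_to: str = "") -> EmailMessage:
    """Build a plain-text reply; thread it when the original Message-ID is known."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to
    # Avoid stacking "Re: Re:" when the customer's subject is already a reply.
    msg["Subject"] = subject if subject.lower().startswith("re:") else f"Re: {subject}"
    if in_reply_to:
        msg["In-Reply-To"] = in_reply_to
        msg["References"] = in_reply_to
    msg.set_content(body)
    return msg


def send_via_smtp(msg: EmailMessage, host: str, port: int,
                  username: str, password: str) -> None:
    """Send over SMTP with STARTTLS. Host and credentials are placeholders."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.starttls()
        smtp.login(username, password)
        smtp.send_message(msg)
```

Setting In-Reply-To and References keeps the reply in the customer's existing thread in most mail clients.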
How this composes
- Combines naturally with PII redaction. Customer emails contain PII; drafted replies should not. ACP’s PostToolUse PII scanner on llm.draft_reply redacts before the agent reads its own draft.
- Combines with delegation chains. A more advanced version spawns a refund-policy specialist subagent for any email classified as refund. The specialist gets narrower scopes (read-only access to billing) and produces a structured recommendation. The parent agent uses that recommendation in its draft. Same spawn_subagent + child_context pattern.
- Combines with cost budgets. Set a per-day token cap on llm.draft_reply so a runaway loop can’t burn through your Anthropic spend.
Cross-reference
- Three layers between an agent and your production database — the same defense-in-depth thinking applied to data tools
- Recommended governance deployment patterns — when to use ask vs deny vs allow
- CISO audit trail questions — what your security review will ask about flows like this
This is example code, not a maintained product. Adapt it for your environment. The governance primitives — @governed, the decision: ask flow, ACP policy schema — are stable; the email API is sketched and will need to talk to your actual mail provider.