Agentic Control Plane

Ten questions every CISO should ask about AI agent audit trails

David Crowe · 9 min read
compliance governance audit ciso

Every AI agent governance vendor — every gateway, every policy engine, every observability shim — claims to produce audit trails. Most of them produce something between an unstructured request log and a real, identity-attributed, tamper-evident record of decisions. The difference is invisible in a sales deck. It’s painfully visible the first time your auditor asks for evidence.

If you’re a CISO evaluating an AI agent governance vendor, here are the ten questions that separate “we have logs” from “we have an audit trail.” Run through them in the demo. The answers either come back fluent or they come back hand-wavy — and hand-wavy is the answer.

1. Show me one record. What fields are populated?

Ask the vendor to produce a single audit row from a real agent call — not a marketing JSON sample, but a row from their actual store. Look for, at minimum:

  • timestamp, request_id, trace_id
  • user.sub (the verified identity, not a service account)
  • agent.name, agent.tier (interactive / subagent / background / API)
  • tool.name at the right granularity (more on that in Q3)
  • policy.decision, policy.rule_id, policy.version
  • input_preview, output_preview (with redaction state)
  • outcome.status, duration_ms

If they hand you a row with user_id: "api-key-prod-7f8a", you don’t have an audit trail. You have access logs with extra steps.
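The field list above can be turned into a mechanical check. This is a sketch of a minimum viable audit row and a completeness test against it; the field names follow the list in Q1, and the sample values are hypothetical.

```python
# Minimum field set from Q1. A row missing any of these is a log line,
# not an audit record.
REQUIRED = [
    "timestamp", "request_id", "trace_id", "user.sub", "agent.name",
    "agent.tier", "tool.name", "policy.decision", "policy.rule_id",
    "policy.version", "outcome.status", "duration_ms",
]

# A hypothetical row of the shape a governance layer should emit.
row = {
    "timestamp": "2025-06-01T14:03:22Z",
    "request_id": "req-9f2c",
    "trace_id": "trace-77ab",
    "user.sub": "jane.doe@example.com",  # verified identity, not an API key
    "agent.name": "billing-assistant",
    "agent.tier": "interactive",
    "tool.name": "Bash.kubectl",
    "policy.decision": "allow",
    "policy.rule_id": "rule-042",
    "policy.version": "v17",
    "outcome.status": "success",
    "duration_ms": 412,
}

missing = [f for f in REQUIRED if row.get(f) in (None, "")]
assert not missing, f"not an audit row: missing {missing}"
```

Run the same check against the vendor's real row in the demo. A `user.sub` like `api-key-prod-7f8a` passes the presence check but fails the identity test in Q6.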

2. Can you produce every tool call agent X made on behalf of user Y between T1 and T2, in under 30 seconds?

This is the auditor’s actual question, in operational form. Either the vendor’s data model can answer it directly — WHERE agent_name = 'X' AND user_sub = 'Y' AND ts BETWEEN T1 AND T2 ORDER BY ts — or it can’t, in which case you’ll be writing Python to stitch records together at 11pm the night before the audit.

Ask the vendor to demo this query against their dashboard. Time it.
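The query is trivial when the data model supports it. Here is a sketch against an in-memory SQLite table — the schema and values are illustrative, but the shape of the query is exactly the one above, and the 30-second budget is checked explicitly.

```python
import sqlite3
import time

# Illustrative trail store with the columns the Q2 query needs.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE trail (ts TEXT, agent_name TEXT, user_sub TEXT, tool_name TEXT)"
)
db.executemany(
    "INSERT INTO trail VALUES (?, ?, ?, ?)",
    [
        ("2025-06-01T10:00:00Z", "X", "Y", "Bash.kubectl"),
        ("2025-06-01T11:00:00Z", "X", "Y", "Read.code"),
        ("2025-06-01T12:00:00Z", "X", "Z", "Bash.curl"),           # different user
        ("2025-06-02T09:00:00Z", "X", "Y", "MCP.jira.create_issue"),  # outside window
    ],
)

t1, t2 = "2025-06-01T00:00:00Z", "2025-06-01T23:59:59Z"
start = time.perf_counter()
rows = db.execute(
    "SELECT ts, tool_name FROM trail "
    "WHERE agent_name = ? AND user_sub = ? AND ts BETWEEN ? AND ? ORDER BY ts",
    ("X", "Y", t1, t2),
).fetchall()
elapsed = time.perf_counter() - start

print(rows)            # the two in-window calls agent X made for user Y
assert elapsed < 30.0  # the auditor's 30-second budget
```

Note that lexicographic `BETWEEN` on ISO-8601 timestamps is correct only if every row uses the same timezone and format — one more thing to verify in the vendor's schema.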

3. What’s the granularity of tool?

A row that logs tool: "Bash" is useless — Bash is a wrapper around any executable on the host. The trail needs to classify what the agent actually did, not which interface it used to do it.

Good signs:

  • Bash.curl, Bash.rm, Bash.kubectl, Bash.aws — sub-command classification
  • Read.env, Read.credentials, Read.code — file-class classification
  • MCP.<server>.<tool> — explicit server attribution for MCP calls

Bad sign: a single tool: "shell" field that smushes ls and kubectl delete namespace into the same bucket.

The classification has to happen at the governance layer, not in post-hoc log parsing. Post-hoc parsing is unreliable and disappears the moment the agent uses a slight syntactic variation the parser didn’t anticipate.
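A sketch of what inline classification at the governance layer can look like — run on the command before the row is written, not on logs afterward. The category names mirror the "good signs" list; the parsing here is deliberately minimal and a real classifier needs a far broader ruleset.

```python
import shlex

def classify_bash(command: str) -> str:
    """Map a raw shell command to a classification like Bash.kubectl."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return "Bash.unparsed"  # refuse to guess on malformed input
    # Skip wrappers so `sudo kubectl ...` still classifies as kubectl.
    while tokens and tokens[0] in ("sudo", "env", "nohup"):
        tokens = tokens[1:]
    if not tokens:
        return "Bash.empty"
    binary = tokens[0].rsplit("/", 1)[-1]  # /usr/bin/curl -> curl
    return f"Bash.{binary}"

print(classify_bash("kubectl delete namespace prod"))  # Bash.kubectl
print(classify_bash("sudo /usr/bin/curl https://x"))   # Bash.curl
```

Even this toy version shows the point of Q3: `ls` and `kubectl` land in different buckets, and an unparseable command is flagged rather than silently dumped into a generic `shell` category.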

4. What did the agent see versus what it tried to do?

Tool inputs and tool outputs are different evidence. Inputs tell you what the agent decided. Outputs tell you what the agent learned, which is what conditioned its next decision. A trail that only captures inputs gives you half the causal chain.

For irreversible actions, this matters because output context is often the only way to reconstruct why a destructive call was made. The agent saw “your IAM role allows delete” in a previous tool’s output, then issued the delete. Without the output captured, the trail just shows the delete call with no explanation.

Ask: are inputs and outputs both captured, with redaction applied, and tied to the same request_id?
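When both sides are captured and keyed the same way, reconstructing the causal chain is a grouping operation. A sketch with hypothetical records, grouping previews by `request_id`:

```python
# Hypothetical captured records: inputs are what the agent decided,
# outputs are what it learned before the next decision.
records = [
    {"request_id": "r1", "kind": "input",  "tool": "Read.iam_policy",
     "preview": "read role permissions"},
    {"request_id": "r1", "kind": "output", "tool": "Read.iam_policy",
     "preview": "role allows iam:Delete*"},
    {"request_id": "r2", "kind": "input",  "tool": "Bash.aws",
     "preview": "aws iam delete-role --role-name prod-deploy"},
]

# Group input/output previews under their shared request_id.
by_request = {}
for rec in records:
    by_request.setdefault(rec["request_id"], {})[rec["kind"]] = rec["preview"]

# r2's delete is explainable only because r1's *output* was captured.
for rid, sides in sorted(by_request.items()):
    print(rid, sides.get("input"), "->", sides.get("output"))
```

A trail that dropped the `r1` output would show the destructive `r2` call with no record of the evidence that prompted it.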

5. How is PII handled inside the trail itself?

The audit trail is itself a data store containing potentially sensitive material — chat content, customer records, secrets the agent saw. There are three viable answers and one wrong one:

  • Inline redaction: PII is replaced with [redacted:type] markers before the row is written. The trail is safe to share with auditors and analysts.
  • Tokenization: PII is replaced with reversible tokens; only privileged users can reverse-resolve. Useful when you need exact replay for incident review.
  • Off, by explicit policy: in tightly-scoped systems where no PII can possibly enter the agent’s input, you can disable redaction. This must be a policy decision, not a default.

Wrong answer: “the trail just stores whatever the agent saw, raw.” That’s a data residency and breach-blast-radius problem in a single column.
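Inline redaction is the simplest of the three viable answers to demonstrate. A minimal sketch with two toy detectors — a production redactor needs a far broader pattern set plus context-aware detection, but the `[redacted:type]` marker shape is the thing to look for in the vendor's rows.

```python
import re

# Toy PII detectors; real systems use many more, plus ML-based detection.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace detected PII with [redacted:type] before the row is written."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(f"[redacted:{kind}]", text)
    return text

print(redact("Contact jane.doe@example.com, SSN 123-45-6789"))
# Contact [redacted:email], SSN [redacted:ssn]
```

The critical property is ordering: redaction runs before the write, so raw PII never touches the trail store at all.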

6. What evidence ties a trail entry back to a real user — not a shared service account?

This is the question your existing API audit logs already fail. Every governed agent call needs a verified identity attached at the point of invocation, derived from a JWT, OAuth token, or platform identity provider — not from a static API key shared across every user of the agent.

Ask: how does the user’s identity reach the trail? Is it cryptographically verified at the governance layer, or is it copied from a request header that any caller could forge? If the answer involves the phrase “we trust the upstream service to set the user header,” that’s not an audit trail. That’s a hint.
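The difference between "copied from a header" and "cryptographically verified" is concrete. A stdlib-only sketch of verifying an HS256 JWT signature before attributing the call — real deployments would verify RS256 against the identity provider's published keys, and the shared secret here is purely illustrative:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-shared-secret"  # illustrative; use IdP public keys in practice

def b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def verified_sub(token: str) -> str:
    """Return user.sub only if the token's signature checks out."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise PermissionError("signature mismatch: do not attribute this call")
    return json.loads(b64url_decode(payload_b64))["sub"]

def mint(claims: dict) -> str:
    """Test helper: mint a signed token (in production, the IdP does this)."""
    enc = lambda b: base64.urlsafe_b64encode(b).rstrip(b"=").decode()
    head = enc(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = enc(json.dumps(claims).encode())
    sig = enc(hmac.new(SECRET, f"{head}.{body}".encode(), hashlib.sha256).digest())
    return f"{head}.{body}.{sig}"

print(verified_sub(mint({"sub": "jane.doe@example.com"})))
```

A trail row whose `user.sub` comes out of `verified_sub` is evidence. A trail row whose `user.sub` comes out of `request.headers["X-User"]` is whatever the caller felt like writing.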

7. What evidence ties a trail entry to a specific policy version?

Policies change. Your auditor will ask “was this call allowed under the policy in force at the time?” — not “is this call allowed under the policy in force today?” The trail has to record policy.version (or policy.bundle_hash, or equivalent) so that when policies later get edited, you can still produce the bytes that made the decision at the time.

Bad sign: the dashboard shows the current policy and lets you edit it without versioning. Whoever last clicked Save controls history.
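The `policy.bundle_hash` variant is easy to sketch: hash the exact policy bytes at decision time, write the digest into the row, archive the bytes. The policy content below is made up; the mechanism is the point.

```python
import hashlib
import json

# The exact policy bytes in force when the decision was made.
policy_v17 = json.dumps(
    {"rules": [{"id": "rule-042", "tool": "Bash.kubectl", "decision": "deny"}]},
    sort_keys=True,
).encode()

# Written into the trail row alongside the decision.
bundle_hash = hashlib.sha256(policy_v17).hexdigest()
trail_row = {"policy.decision": "deny", "policy.bundle_hash": bundle_hash}

# At audit time: re-hash the archived bytes and compare. If someone
# edited the policy since, the archived copy is still provably the one
# that made this decision.
assert hashlib.sha256(policy_v17).hexdigest() == trail_row["policy.bundle_hash"]
```

This only works if the policy bytes are archived somewhere the Save button can't reach — which is really Q8 in disguise.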

8. Is the trail tamper-evident, or merely append-only-by-convention?

“Append-only” in most systems means “the API doesn’t expose a delete endpoint.” It doesn’t mean “rows can’t be modified.” Real tamper-evidence requires either:

  • A hash chain (each row’s hash includes the previous row’s hash), or
  • An external sink (rows are streamed to an immutable store like an object lock bucket or a customer-controlled SIEM), or
  • Both.

If the answer is “trust us, we don’t delete things” — that’s a control attestation, not a control. Auditors increasingly know the difference.
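The hash-chain option is a few lines to sketch and a good whiteboard test for the vendor: each row's hash covers its content plus the previous row's hash, so editing any historical row breaks every hash after it. Row contents here are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64

def chain(rows):
    """Append rows, each hashed over its content + the previous hash."""
    prev, out = GENESIS, []
    for row in rows:
        prev = hashlib.sha256(
            (json.dumps(row, sort_keys=True) + prev).encode()
        ).hexdigest()
        out.append({**row, "hash": prev})
    return out

def verify(chained):
    """Recompute the chain; any edited row breaks it from that point on."""
    prev = GENESIS
    for entry in chained:
        row = {k: v for k, v in entry.items() if k != "hash"}
        prev = hashlib.sha256(
            (json.dumps(row, sort_keys=True) + prev).encode()
        ).hexdigest()
        if prev != entry["hash"]:
            return False
    return True

trail = chain([{"tool": "Bash.curl"}, {"tool": "Bash.kubectl"}])
assert verify(trail)
trail[0]["tool"] = "Bash.ls"   # tamper with history...
assert not verify(trail)       # ...and the chain detects it
```

Note what this gives you: tamper *evidence*, not tamper *prevention*. An attacker who can rewrite the whole chain and its anchor can still cheat — which is why pairing the chain with an external immutable sink is the strongest answer.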

9. How long is the trail retained, and where does it live?

Compliance retention windows: SOC 2 typically expects ≥1 year, and customer contracts often push it to 7. HIPAA's documentation requirements impose a 6-year retention period. PCI DSS requires at least 1 year of audit log history, with the most recent 3 months immediately available for analysis. The EU AI Act mandates automatic logging over a high-risk system's lifetime (Article 12), with providers retaining the logs for at least 6 months (Article 19).

Ask:

  • What’s the default retention?
  • Can it be tuned per-tenant?
  • Is the trail stored in your region (data residency)?
  • Can it be exported to a customer-controlled bucket continuously, not just on-demand?

If the answer is “we keep 30 days for free, longer is a paid feature” — that may or may not work, but it has to be a conscious budgeting decision, not a surprise mid-audit.

10. Can you export it to my SIEM in a format my analysts already speak?

The trail isn’t useful if it lives only inside the vendor’s dashboard. Your detection engineers, your incident responders, and your auditors all work in tools that already exist. The trail has to flow into them.

Look for:

  • A real-time stream interface (webhook, Kafka, EventBridge, Pub/Sub)
  • A documented schema (JSON Schema, OpenTelemetry semantic conventions for AI)
  • Direct connectors for the major SIEMs (Splunk, Datadog, Sumo, Elastic) or — better — output that conforms to OCSF or ECS so the connector is trivial

A vendor that says “you can download a CSV from the dashboard” does not have an audit trail integration. They have an export button.
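Conforming output makes the connector trivial because the mapping is mechanical. A sketch of translating the Q1 row into ECS-style field names — the target fields are my approximation of ECS conventions, not a certified mapping, so check them against the spec before building on this.

```python
def to_ecs(row: dict) -> dict:
    """Map a trail row onto ECS-style fields (approximate, verify vs spec)."""
    return {
        "@timestamp": row["timestamp"],
        "trace.id": row["trace_id"],
        "user.name": row["user.sub"],
        "event.action": row["tool.name"],
        "event.outcome": "success" if row["outcome.status"] == "success" else "failure",
        "event.duration": row["duration_ms"] * 1_000_000,  # ECS durations are nanoseconds
        "labels": {
            "agent": row["agent.name"],
            "policy_version": row["policy.version"],
        },
    }

ecs = to_ecs({
    "timestamp": "2025-06-01T14:03:22Z", "trace_id": "trace-77ab",
    "user.sub": "jane.doe@example.com", "tool.name": "Bash.kubectl",
    "outcome.status": "success", "duration_ms": 412,
    "agent.name": "billing-assistant", "policy.version": "v17",
})
print(ecs["event.action"])  # Bash.kubectl
```

If the vendor's schema is documented, this function is an afternoon of work. If it isn't, your detection engineers inherit a reverse-engineering project instead.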

Why this matters more for AI agents than for traditional services

For traditional services, audit trails are a known quantity — every team has been doing it for two decades and the patterns are well-understood. For AI agents, three things compound the difficulty:

  1. Identity attribution is broken by default. Most agent integrations route through a single shared API key, so the per-user identity that compliance requires has to be reconstructed at the governance layer. If you don’t, your trail can’t answer the only question that matters.
  2. Decision opacity is the steady state. The model produces a sequence of tool calls; the reason each call was made is locked inside a context window the model has already moved past. The trail is the only durable record of what the agent decided and why your governance layer responded the way it did.
  3. Irreversibility is more common. Traditional services tend to expose mutating operations behind business-level review. Agents wire directly to whatever credentials are reachable in their environment. The blast radius of a missing log row is bigger.

The audit trail isn’t a feature on a comparison sheet. It’s the substrate compliance, incident response, and post-mortem are built on. If the substrate isn’t there, every layer above it is sand.

A short evaluation rubric

For each of the ten questions above, score the vendor 0 (hand-wave), 1 (partial), 2 (fluent answer with evidence). Anything below 14/20 means you don’t have an audit trail — you have logs that get pitched as one.

The questions are also a forcing function on your own internal architecture. If you’re building governance in-house, run the same ten questions on your own design. The answers tell you which decisions you’ve made and which you’ve deferred.

See the Agentic Control Plane reference architecture → · Compliance-ready AI governance → · AI agent audit trails →
