How to Add Audit Logging and Governance to the OpenAI Agents SDK

The OpenAI Agents SDK is OpenAI’s framework for multi-agent systems with tools, handoffs, and guardrails. It has no native per-agent audit log, no cross-agent scope intersection separate from guardrails, and no delegation-chain provenance for handoffs. Agentic Control Plane (ACP) adds identity-attributed audit, per-agent policy, and budget enforcement — with no changes to your Agent, Runner, or handoff structure.

Starter · 5-minute install. The proxy pattern needs no extra SDK: point the Agents SDK’s OpenAI client at https://api.agenticcontrolplane.com/v1 with your ACP_API_KEY, and add a per-agent x-acp-agent-name header for distinct attribution. See the governance model for how this compares to the decorator-based frameworks, or the frameworks index for other options.

Disambiguation — agent-level vs org-level audit

Important: this is different from OpenAI’s Admin API audit log. OpenAI’s Admin API audit log is org-level — it records API key creation, usage caps, and admin actions. ACP’s audit log is agent-level — it records every LLM call, per-agent, per-user, with full metadata. Different granularity, different audiences.

The OpenAI Agents SDK is also distinct from:

Codex CLI — terminal application, governed via hooks + MCP
Assistants API — older thread-based; tools execute server-side at OpenAI, hard to instrument
AgentKit — newer visual builder + hosted runtime, partial coverage

For SDK code running in your infrastructure with the agents Python library, this is the right page.

Two governance patterns

The OpenAI Agents SDK supports two ACP governance patterns. Pick based on what you’re trying to govern:

Pattern	Where governance fires	Use when
Tool-layer guardrails (recommended for SDK code)	Before/after each tool call, via SDK-native `tool_input_guardrails` / `tool_output_guardrails`	You want per-tool-call interception, output redaction, and audit at the tool granularity
LLM proxy	At every chat-completion request, via `AsyncOpenAI(base_url=ACP)`	You want LLM-call audit, per-agent attribution via `x-acp-agent-name`, and budget/rate-limit enforcement at the model layer

Both can be used together — they intercept different events. The starter at acp-governance-sdks/examples/starters/openai-agents-sdk demonstrates the tool-layer pattern.

Pattern 1: tool-layer guardrails (SDK-native)

The OpenAI Agents SDK v0.14+ supports tool_input_guardrails and tool_output_guardrails as first-class arguments to @function_tool, with @tool_input_guardrail and @tool_output_guardrail decorators for defining the guardrail functions. ACP’s Python SDK exposes the primitive functions (pre_tool_use, post_tool_output, set_context), which you wire into a guardrail pair:

import os
from acp_governance import configure, post_tool_output, pre_tool_use, set_context
from agents import (
    Agent, Runner,
    ToolGuardrailFunctionOutput, ToolInputGuardrailData, ToolOutputGuardrailData,
    function_tool, tool_input_guardrail, tool_output_guardrail,
)

configure(base_url="https://api.agenticcontrolplane.com")


@tool_input_guardrail(name="acp_pre_tool_use")
def acp_input_guardrail(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    allowed, reason = pre_tool_use(data.context.tool_name, data.context.tool_arguments)
    if not allowed:
        return ToolGuardrailFunctionOutput.reject_content(
            message=f"tool_error: {reason or 'denied by ACP policy'}",
        )
    return ToolGuardrailFunctionOutput.allow()


@tool_output_guardrail(name="acp_post_tool_output")
def acp_output_guardrail(data: ToolOutputGuardrailData) -> ToolGuardrailFunctionOutput:
    result = post_tool_output(data.context.tool_name, data.context.tool_arguments, data.output)
    if result and result.get("action") == "redact" and "modified_output" in result:
        return ToolGuardrailFunctionOutput.reject_content(message=str(result["modified_output"]))
    if result and result.get("action") == "block":
        return ToolGuardrailFunctionOutput.reject_content(
            message=f"tool_error: {result.get('reason', 'blocked by ACP policy')}",
        )
    return ToolGuardrailFunctionOutput.allow()


ACP_GUARDRAILS = {
    "tool_input_guardrails": [acp_input_guardrail],
    "tool_output_guardrails": [acp_output_guardrail],
}


@function_tool(**ACP_GUARDRAILS)
def lookup_record(id: str) -> dict:
    """Look up a record by ID."""
    # `db` is your application's own database handle; it is not defined in this snippet.
    return db.records.find_one({"id": id})


def main():
    set_context(
        user_token=os.environ["ACP_USER_TOKEN"],
        agent_name="my-agent",
        agent_tier="background",
    )
    agent = Agent(
        name="my-agent",
        instructions="You are an ACP-governed agent.",
        model="gpt-4o-mini",
        tools=[lookup_record],
    )
    return Runner.run_sync(agent, "Look up record id=abc-123.", max_turns=6)


if __name__ == "__main__":
    print(main().final_output)

What this gives you:

Per-tool-call pre-check (pre_tool_use) — ACP can deny before the tool fires; the model sees tool_error: <reason> and adapts.
Per-tool-call audit + output scan (post_tool_output) — every output is logged, scanned for PII / secrets, and optionally redacted.
Identity propagation via set_context — the verified end-user token is attached to every governed call without threading it through your tool functions.
Native SDK integration — uses the SDK’s own ToolGuardrailFunctionOutput.allow() / .reject_content(...) primitives, so denials flow into traces cleanly.

This is the recommended pattern for governance of SDK code in your infrastructure. The starter folder is a 30-line copy-paste template.
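
The branching inside the output guardrail above can be pulled into a pure function, which makes it possible to unit-test the policy handling without the Agents SDK or a live ACP endpoint. The sketch below is illustrative only: classify_acp_result is a hypothetical helper, not part of acp-governance, and it assumes nothing beyond the result shape used in the guardrail code (the action, modified_output, and reason keys).

```python
from typing import Optional, Tuple


def classify_acp_result(result: Optional[dict]) -> Tuple[str, Optional[str]]:
    """Map a post_tool_output-style result to (decision, message).

    Hypothetical helper mirroring the guardrail branches above:
    - "redact" with a modified_output -> replace the tool output
    - "block" -> replace the output with a tool_error message
    - anything else (including None) -> allow the original output
    """
    if not result:
        return ("allow", None)
    action = result.get("action")
    if action == "redact" and "modified_output" in result:
        return ("replace", str(result["modified_output"]))
    if action == "block":
        reason = result.get("reason", "blocked by ACP policy")
        return ("replace", f"tool_error: {reason}")
    return ("allow", None)


if __name__ == "__main__":
    print(classify_acp_result({"action": "redact", "modified_output": "[REDACTED]"}))
    print(classify_acp_result({"action": "block"}))
    print(classify_acp_result(None))
```

In a guardrail, a "replace" decision would map to ToolGuardrailFunctionOutput.reject_content(message=...) and an "allow" decision to ToolGuardrailFunctionOutput.allow(), as in the code above.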

Pattern 2: LLM proxy

If you want LLM-call audit, per-agent attribution via x-acp-agent-name headers, and budget/rate-limit enforcement at the model layer, point the SDK’s OpenAI client at ACP’s OpenAI-compatible proxy:

import asyncio
import os

from agents import Agent, Runner, set_default_openai_client
from openai import AsyncOpenAI

# Route Agents SDK through ACP's OpenAI-compatible proxy
client = AsyncOpenAI(
    base_url="https://api.agenticcontrolplane.com/v1",
    api_key=os.environ["ACP_API_KEY"],
)
set_default_openai_client(client)

# Define the handoff target first so the researcher can reference it.
writer = Agent(
    name="writer",
    instructions="...",
    tools=[],
)

researcher = Agent(
    name="researcher",
    instructions="...",
    tools=[web_search, hn_search],  # your own @function_tool definitions (not shown)
    handoffs=[writer],  # without this, the researcher cannot hand off to the writer
)


async def main():
    return await Runner.run(researcher, "Research X and hand off to writer")


result = asyncio.run(main())

Every LLM call made by every agent flows through ACP and shows up as client.name: "openai-agents-sdk" in your activity log.

Per-agent attribution

To make each Agent appear as a distinct row in the ACP dashboard, construct a per-agent AsyncOpenAI client with default_headers and attach it to that agent’s model:

def make_client(agent_name: str) -> AsyncOpenAI:
    return AsyncOpenAI(
        base_url="https://api.agenticcontrolplane.com/v1",
        api_key=os.environ["ACP_API_KEY"],
        default_headers={"x-acp-agent-name": agent_name},
    )

# One client per agent, each tagged with x-acp-agent-name
researcher_client = make_client("researcher")
writer_client     = make_client("writer")

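Pass each client to its corresponding Agent through the agent’s model argument, for example by wrapping it in the SDK’s OpenAIChatCompletionsModel(model=..., openai_client=...). Calling set_default_openai_client before each run also works when a run involves a single agent, but it sets one process-wide client, so it cannot distinguish agents across a handoff. Each x-acp-agent-name shows up on the Agents page as a distinct row with its own policy, rate limits, and budget.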

How it works

The AsyncOpenAI client sends chat completion requests to ACP’s OpenAI-compatible endpoint. ACP:

Verifies identity (gsk_ key + optional x-acp-agent-name header)
Runs the governance pipeline — policy, content safety, budget, rate limits
Proxies to the configured upstream model (OpenAI, Anthropic, local)
Emits a structured audit record with full metadata

The SDK’s guardrails run inside the SDK before a request even reaches the OpenAI client. ACP policy runs in the proxy. Most-restrictive-wins — guardrail denial short-circuits before ACP; ACP denial surfaces as an API error.
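
The layering described above can be modeled as an ordered list of checks that stops at the first denial. The following is a toy model for reasoning about the ordering, not ACP or SDK code: run_layers and the layer names are hypothetical, and each lambda stands in for a real guardrail or policy call.

```python
from typing import Callable, List, Tuple

Decision = Tuple[bool, str]  # (allowed, reason)


def run_layers(layers: List[Tuple[str, Callable[[], Decision]]]) -> Tuple[bool, str, List[str]]:
    """Evaluate layers in order and stop at the first denial.

    Returns (allowed, reason, names_of_layers_that_ran).
    """
    ran: List[str] = []
    for name, check in layers:
        ran.append(name)
        allowed, reason = check()
        if not allowed:
            # Later layers never run, mirroring a guardrail denial
            # short-circuiting before the request reaches the proxy.
            return (False, f"{name}: {reason}", ran)
    return (True, "allowed by all layers", ran)


if __name__ == "__main__":
    # The guardrail denies, so the ACP layer is never evaluated.
    print(run_layers([
        ("sdk_guardrail", lambda: (False, "input rejected")),
        ("acp_policy", lambda: (True, "ok")),
    ]))
```

In this model a request succeeds only if every layer allows it, which is the most-restrictive-wins behavior; it also shows why a guardrail denial leaves no trace on the ACP side.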

Mapping to ACP

OpenAI Agents SDK	ACP (today)
`Agent(name=...)`	`agent_name` field via `x-acp-agent-name` header → distinct row on Agents page
`Agent.tools`	Tool calls are audited as part of the LLM call that emits them
`handoffs`	Each agent’s LLM calls appear independently today; handoff-as-chain-link is roadmap
`Runner.run()`	Root of the logical chain; originSub tied to the `gsk_` key owner
Guardrails	Layered before ACP — guardrail decisions are SDK-internal, ACP has no visibility into them

Verify it worked

Run any multi-agent scenario. Open cloud.agenticcontrolplane.com/activity — within 5 seconds you should see rows with client.name: "openai-agents-sdk", one per LLM call. If you set x-acp-agent-name per agent, each agent appears as a distinct row with its own policy history.

What you’ll see in the dashboard

cloud.agenticcontrolplane.com/agents shows each x-acp-agent-name as a distinct row. Policy denials are first-class rows. Handoff sequences between agents appear as separate LLM calls today — rendering them as linked delegation chains is on the roadmap with the native Python adapter.

Roadmap

Shipping today (Pattern 1): Tool-layer guardrails via acp-governance Python SDK — per-tool-call pre/post hooks, identity propagation via set_context, native tool_input_guardrails / tool_output_guardrails integration. See the starter folder for the runnable template.
Shipping today (Pattern 2): OpenAI proxy via AsyncOpenAI(base_url=...) with per-agent attribution via x-acp-agent-name. Works with any Agent, Runner, handoff, or guardrail pattern.
Roadmap: record handoffs as delegation chain links with parentAgent linkage and per-hop scope intersection. Star the SDKs repo for release notification.

Limitations

Handoffs are not yet assembled into delegation chains. Each agent’s LLM calls are recorded independently today; the planned Python adapter will link them as parent/child.
In the proxy pattern, tool calls are audited at the LLM-call level, not separately. Tool invocations emitted in a model response share that call’s policy decision. Per-tool-call gating at the proxy is on the roadmap; Pattern 1’s tool-layer guardrails provide it today.
Tracing integration is parallel, not unified. OpenAI traces go to the OpenAI dashboard; ACP audit logs are independent. Both serve different audiences.
Guardrails are SDK-internal. They run before ACP in the pipeline. ACP has no visibility into guardrail decisions — only calls that made it past guardrails.
In the proxy pattern, originSub today is the workspace key owner. Per-request user identity propagation at the proxy requires the planned native adapter; Pattern 1 attaches the end-user token via set_context.

Troubleshooting

Nothing appears in the dashboard. Confirm base_url ends in /v1. Curl-test: curl -s https://api.agenticcontrolplane.com/v1/models -H "Authorization: Bearer $ACP_API_KEY" should return 200 with a model list.
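
The same check can be scripted in Python with only the standard library. This is a minimal sketch: build_models_request and check_proxy are hypothetical helper names, and parsing the body as JSON assumes the endpoint follows the usual OpenAI-compatible response format.

```python
import json
import os
import urllib.request

ACP_MODELS_URL = "https://api.agenticcontrolplane.com/v1/models"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the same GET request as the curl command above, without sending it."""
    return urllib.request.Request(
        ACP_MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def check_proxy(api_key: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for non-2xx responses,
    so a bad key or wrong URL surfaces as an exception.
    """
    with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
        json.load(resp)  # assumption: the model list is returned as JSON
        return resp.status


if __name__ == "__main__":
    print(check_proxy(os.environ["ACP_API_KEY"]))
```

Splitting request construction from sending keeps the URL and header easy to inspect when debugging a base_url typo.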

All agents show up as one row. Per-agent attribution requires default_headers={"x-acp-agent-name": agent_name} on each agent’s AsyncOpenAI client. Without it, every agent collapses onto one profile.

“Why don’t my agent actions show up in OpenAI’s Admin API audit log?” They won’t. OpenAI’s Admin API audit log is for org-level events (API key changes, usage caps). Per-LLM-call audit is ACP’s job — that’s the whole point of the integration.

Guardrails fire but ACP doesn’t. Expected — guardrails run first, inside the SDK. If the guardrail denies, the request never reaches ACP. Check the SDK’s guardrail logs.

Handoffs show as separate LLM calls rather than linked. This is expected today. The planned native adapter will assemble them as delegation chain links.

Rate limits firing on runs with many handoffs. Each LLM call decrements rate limits. A handoff-heavy run can hit limits fast. Increase the limit in the dashboard or reduce handoff depth.

Does this work with AgentKit? Partial. AgentKit’s hosted runtime limits instrumentation. ACP governs LLM calls that egress AgentKit to a configurable base URL; internal handoff semantics running on OpenAI’s infrastructure can’t be instrumented from outside.

Anthropic Agent SDK — TypeScript adapter with per-tool-call interception (reference for Path 3 pattern)
OpenAI Codex CLI — OpenAI’s terminal coding agent (hooks + MCP)
CrewAI — Python multi-agent framework, same OAI-proxy pattern
LangGraph — graph-based multi-agent
Agent-to-Agent governance — the delegation chain spec
What is an Agent Delegation Chain? — conceptual background