Agentic Control Plane

Build a governed multi-step research agent (delegation chain across 4 hops)

David Crowe · 15 min read
recipe agents-building-agents delegation-chain research-agent multi-agent agentic-control-plane

If you’re an AI agent reading this with a task like “build a research agent that decomposes questions and runs sub-investigations”: scroll to The full working file.

This is the deep-delegation version of the PR reviewer recipe — same primitive, more depth. A research agent decomposes a question, spawns specialists, sometimes those specialists spawn their own helpers. Audit-chain depth of 4 is normal; depth of 5 is the default cap in ACP. Whatever your topology, the contract is the same: every node in the tree is attributable to the human at the root.

The task

A product manager types: “Should we extend the trial period from 14 to 30 days? What’s the data say, and what are the risks?”

A naive single-agent approach concatenates everything into one prompt and hopes for the best. A multi-agent approach decomposes:

  1. Planner subagent — breaks the question into research sub-questions
  2. Search subagents (parallel) — each one runs against a different data source: warehouse SQL, internal docs, competitor research, conversion-funnel telemetry
  3. Synthesizer subagent — combines findings into a single recommendation

Tree depth: 4 (human → parent → planner / searches / synthesizer). Tree width: 3-5 search subagents in parallel. The audit log has to reconstruct who asked what and which subagent answered it, three months later, when someone questions the recommendation.

This is the multi-agent governance problem at scale. The hard part isn’t the LLM calls. The hard part is: when a search subagent calls warehouse.run_sql and that query returns customer-PII rows, the audit row needs to show — originSub: pm@company.com, chain: [parent, planner, search-warehouse], scopes: [warehouse.read] — and the PII must already be redacted because the chain enforces it before the agent sees it.
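To make that concrete, here is what such an audit row could look like as data. This is a hypothetical shape whose field names just mirror the prose above, not a published ACP schema:

```python
# Hypothetical audit row for the warehouse.read call described above.
# Field names mirror the prose (originSub, chain, scopes); the real ACP
# row schema may differ.
audit_row = {
    "tool": "warehouse.read",
    "agentName": "search-warehouse",
    "originSub": "pm@company.com",          # the human at the root
    "chain": ["parent", "planner", "search-warehouse"],
    "scopes": ["warehouse.read"],
}
```

The invariant to notice: the last entry in the chain is the agent that made the call, and originSub never changes, no matter how deep the call sits.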

The pattern

ACP’s delegation primitives compose:

  • spawn_subagent(profile_id, scopes, ttl_seconds, max_budget_cents) — mint a child key with narrowed scopes, capped budget, capped TTL
  • child_context(child) — bind that child token for any @governed call inside the with-block
  • The chain extends: parent → child → grandchild. ACP enforces a default depth cap of 5 (configurable). Cycles are rejected at mint time.
  • originSub (the human at the root) propagates through every hop. Scopes only narrow. Budget atomically debits. Audit shows the full chain.
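The "scopes only narrow" rule is worth pinning down. A minimal sketch of the check ACP presumably runs at mint time (hypothetical helpers, not SDK functions; a trailing `.*` is assumed to act as a prefix wildcard):

```python
def covers(parent: str, child: str) -> bool:
    """True if a parent scope grants a child scope. A trailing ".*" is a
    prefix wildcard, so "llm.proxy.*" covers "llm.proxy.claude-opus-4-7"."""
    if parent.endswith(".*"):
        return child == parent or child.startswith(parent[:-1])
    return child == parent


def narrows(parent_scopes: list[str], child_scopes: list[str]) -> bool:
    """Mint-time check: every scope the child requests must be covered
    by some scope the parent already holds."""
    return all(any(covers(p, c) for p in parent_scopes) for c in child_scopes)
```

Under this sketch, a child requesting `warehouse.read` under a parent holding `[llm.proxy.*, warehouse.read, docs.search, web.search]` passes; a child requesting `warehouse.write` is rejected at mint time.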

For full background, Introducing ACP is the spec; this post is the working code.

The full working file

pip install acp-governance anthropic httpx "psycopg[binary]"
#!/usr/bin/env python3
"""Governed multi-step research agent.

Required env vars:
  ACP_TOKEN           ACP API key (gsk_...) for the parent agent
  ANTHROPIC_API_KEY   Anthropic API key
  WAREHOUSE_DSN       Postgres-compatible DSN (read-only user)

Required ACP profiles (configure in dashboard):
  research-parent    delegatable=true,  scopes=[llm.proxy.*, warehouse.read,
                                                docs.search, web.search]
  research-planner   delegatable=false, scopes=[llm.proxy.*]
  search-warehouse   delegatable=false, scopes=[warehouse.read, llm.proxy.*]
  search-docs        delegatable=false, scopes=[docs.search, llm.proxy.*]
  search-web         delegatable=false, scopes=[web.search, llm.proxy.*]
  research-synth     delegatable=false, scopes=[llm.proxy.*]
"""
from __future__ import annotations

import json, logging, os, sys
from typing import Any

import httpx
from acp_governance import (
    governed, set_context, spawn_subagent, child_context, SpawnError,
)
from anthropic import Anthropic

logging.basicConfig(level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s")
log = logging.getLogger("research-agent")

MODEL = "claude-opus-4-7"


# ─── Tools (same @governed decorator regardless of which subagent runs them).
# What changes is the active child_context — which determines which child
# token is in scope for that specific call.

@governed("warehouse.read")
def warehouse_read(query: str) -> list[dict[str, Any]]:
    """Run a SELECT against the warehouse. Output PII-redacted by ACP."""
    import psycopg
    if not query.strip().lower().startswith("select"):
        raise PermissionError("warehouse.read only runs SELECTs")
    with psycopg.connect(os.environ["WAREHOUSE_DSN"], autocommit=True) as conn:
        with conn.cursor(row_factory=psycopg.rows.dict_row) as cur:
            cur.execute(query)
            return cur.fetchmany(100)


@governed("docs.search")
def docs_search(query: str, k: int = 5) -> list[dict[str, Any]]:
    """Search internal docs index. Sketched — adapt to your search backend."""
    r = httpx.get("https://docs-search.internal/search",
                  params={"q": query, "k": k}, timeout=10)
    r.raise_for_status()
    return r.json()["hits"]


@governed("web.search")
def web_search(query: str, k: int = 5) -> list[dict[str, Any]]:
    """Public web search. Sketched."""
    r = httpx.get("https://api.tavily.com/search",
                  params={"query": query, "max_results": k},
                  headers={"Authorization": f"Bearer {os.environ.get('TAVILY_KEY','')}"},
                  timeout=10)
    r.raise_for_status()
    return r.json()["results"]


@governed("llm.research_call")
def llm(system: str, user: str, max_tokens: int = 1024) -> str:
    client = Anthropic()
    resp = client.messages.create(
        model=MODEL, max_tokens=max_tokens, system=system,
        messages=[{"role": "user", "content": user}])
    return "\n".join(b.text for b in resp.content
                     if getattr(b, "type", "") == "text").strip()


# ─── Hop 1: planner ─────────────────────────────────────────────────────────
def plan(question: str) -> list[dict[str, Any]]:
    """Decompose the user's question into search-able sub-questions."""
    sys_p = (
        "You decompose a research question into 3-5 atomic sub-questions. "
        "Each sub-question must specify which source to search: 'warehouse', "
        "'docs', or 'web'. Output STRICT JSON only, an array of objects with "
        "keys 'subq' (string) and 'source' (one of warehouse|docs|web). No prose."
    )
    raw = llm(system=sys_p, user=question, max_tokens=512)
    try:
        plan = json.loads(raw)
    except json.JSONDecodeError:
        # Strip code fences if the model wrapped its JSON in them
        inner = raw.split("```")[1]
        plan = json.loads(inner.removeprefix("json").strip())
    return plan[:5]   # cap fan-out


# ─── Hop 2: searches (one subagent per source) ──────────────────────────────
def search_one(subq: dict[str, Any]) -> dict[str, Any]:
    """Spawn a single search subagent with scopes narrowed to its source."""
    source = subq["source"]
    scope_for_source = {
        "warehouse": ["warehouse.read", "llm.proxy.claude-opus-4-7"],
        "docs": ["docs.search", "llm.proxy.claude-opus-4-7"],
        "web": ["web.search", "llm.proxy.claude-opus-4-7"],
    }[source]
    profile = {"warehouse": "search-warehouse",
               "docs": "search-docs", "web": "search-web"}[source]

    child = spawn_subagent(
        profile_id=profile, scopes=scope_for_source,
        ttl_seconds=300, max_budget_cents=30)

    with child_context(child, agent_name=profile):
        if source == "warehouse":
            # Ask the LLM to write a SQL query for the sub-question, then run it
            sql = llm(system=("Write a single Postgres SELECT statement that "
                              "answers the user's question. Output raw SQL "
                              "only: no prose, no code fences."),
                      user=subq["subq"], max_tokens=256)
            results = warehouse_read(query=sql)
        elif source == "docs":
            results = docs_search(query=subq["subq"], k=5)
        else:  # web
            results = web_search(query=subq["subq"], k=5)

        # Each search subagent summarizes its own findings — that summary is
        # what gets returned up the chain, not the raw rows.
        summary = llm(
            system=("Summarize search results into 3-5 bullets answering the "
                    "sub-question. Cite source rows by index where useful."),
            user=f"Sub-question: {subq['subq']}\nResults: {results}",
            max_tokens=512)
        return {"subq": subq["subq"], "source": source, "summary": summary}


# ─── Hop 3: synthesis ───────────────────────────────────────────────────────
def synthesize(question: str, findings: list[dict[str, Any]]) -> str:
    """Spawn a synthesizer subagent that combines findings into a recommendation."""
    child = spawn_subagent(
        profile_id="research-synth",
        scopes=["llm.proxy.claude-opus-4-7"],
        ttl_seconds=300, max_budget_cents=30)

    with child_context(child, agent_name="research-synth"):
        return llm(
            system=("You're a senior analyst. Combine the sub-question findings "
                    "below into a single recommendation. Lead with the answer, "
                    "follow with 2-3 supporting points, end with the top risk. "
                    "Under 400 words. Cite which sub-question each claim rests on."),
            user=("Question: " + question + "\n\nFindings:\n"
                  + json.dumps(findings, indent=2)),
            max_tokens=1024)


# ─── Main: orchestrate the chain ────────────────────────────────────────────
def main() -> int:
    if len(sys.argv) < 2:
        print("usage: research-agent.py 'your research question'")
        return 2
    question = sys.argv[1]

    for k in ("ACP_TOKEN", "ANTHROPIC_API_KEY", "WAREHOUSE_DSN"):
        if not os.environ.get(k):
            sys.exit(f"missing env var: {k}")

    # Parent context — every governed call without a child_context falls under this
    set_context(user_token=os.environ["ACP_TOKEN"],
                agent_name="research-parent")

    # Hop 1: planner subagent decomposes the question
    planner_child = spawn_subagent(
        profile_id="research-planner",
        scopes=["llm.proxy.claude-opus-4-7"],
        ttl_seconds=120, max_budget_cents=10)
    with child_context(planner_child, agent_name="research-planner"):
        sub_questions = plan(question)
    log.info("Planner decomposed into %d sub-questions", len(sub_questions))

    # Hop 2: search subagents, one per sub-question, fanned out in parallel
    # with a thread pool (each search_one spawns and binds its own child
    # context, so workers are self-contained).
    # @governed returns "tool_error: <reason>" on deny — we propagate that
    # into the findings array so the synthesizer sees what was unavailable.
    from concurrent.futures import ThreadPoolExecutor, as_completed

    findings = []
    with ThreadPoolExecutor(max_workers=5) as pool:
        future_to_sq = {pool.submit(search_one, sq): sq for sq in sub_questions}
        for fut in as_completed(future_to_sq):
            sq = future_to_sq[fut]
            try:
                findings.append(fut.result())
            except SpawnError as e:
                # spawn_subagent raises on non-2xx (e.g. profile_not_delegatable,
                # delegation_cycle, parent_budget_insufficient). These are real
                # errors, not deny-with-reason — log and continue.
                log.warning("Could not spawn search subagent for %s: %s", sq, e)
                findings.append({"subq": sq["subq"], "source": sq["source"],
                                 "summary": f"[spawn failed: {e}]"})

    # Hop 3: synthesizer subagent
    recommendation = synthesize(question, findings)

    print(recommendation)
    return 0


if __name__ == "__main__":
    sys.exit(main())

What the audit chain looks like

For one research run with 4 sub-questions, you get roughly:

originSub: pm@company.com   (preserved through every row)
chain depth: up to 3 (parent → planner | search-* | synth)
tool                agentName           chain
-----------------   -----------------   --------------------------
llm.research_call   research-planner    [parent, planner]
warehouse.read      search-warehouse    [parent, search-warehouse]
llm.research_call   search-warehouse    [parent, search-warehouse]
docs.search         search-docs         [parent, search-docs]
llm.research_call   search-docs         [parent, search-docs]
web.search          search-web          [parent, search-web]
llm.research_call   search-web          [parent, search-web]
llm.research_call   research-synth      [parent, research-synth]

Eight rows for one research recommendation. Every row attributable to pm@company.com. Every row carries the chain. If the recommendation turns out to be wrong, you can replay every step.
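Replaying starts with walking those rows back into a tree. A sketch of the reconstruction, assuming each audit row carries its chain as an ordered list (hypothetical row shape, matching the table above):

```python
from collections import defaultdict

def delegation_tree(rows: list[dict]) -> dict[str, set[str]]:
    """Rebuild the parent -> children edges from the chain on each audit row."""
    tree: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        for parent, child in zip(row["chain"], row["chain"][1:]):
            tree[parent].add(child)
    return dict(tree)

# Three of the eight rows, in the hypothetical shape assumed above
rows = [
    {"tool": "llm.research_call", "chain": ["parent", "planner"]},
    {"tool": "warehouse.read",    "chain": ["parent", "search-warehouse"]},
    {"tool": "llm.research_call", "chain": ["parent", "research-synth"]},
]
```

The whole fan-out is recoverable from the log alone, which is what makes the three-months-later question answerable.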

Where this differs from the PR reviewer recipe

                         PR reviewer                          Research agent
Chain depth              2                                    3-4
Subagent count           2 (security-scanner, test-runner)    5+ (planner, multiple searches, synthesizer)
Subagents in parallel?   Yes (security + tests)               Yes (multiple searches)
Showcases                scope intersection                   scope intersection + depth + parallel fan-out

The primitive is identical. The shape varies.

What’s still required from you

  • Five ACP profiles configured with the right delegatable flag and scope subsets. Configure in the dashboard or via PATCH /api/v1/agents/:id.
  • A docs search backend. This recipe sketches an HTTP shape against docs-search.internal; in practice you wire it up to your internal docs index or whatever vector store you run.
  • A web search adapter. Tavily is sketched; could be Serper, Bing, or your in-house crawler.
  • Warehouse DSN with read-only credentials.

How this composes

  • PII redaction at the warehouse hop (the warehouse.read call): same PostToolUse pattern as the SQL agent recipe. Customer rows redacted before any subagent sees them.
  • Cost caps per subagent: max_budget_cents is per-mint. A planner that costs more than $0.10 trips its own budget — the planner subagent gets denied, the parent is unaffected, the chain naturally degrades.
  • Depth cap as a guardrail. A buggy synthesizer that tries to spawn its own subagent past depth 5 is rejected at mint time. The chain can’t run away.
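A minimal sketch of those last two guardrails, assuming ACP's deny-on-exhaustion and reject-at-mint semantics (hypothetical classes, not the SDK):

```python
class MintError(Exception):
    """Raised when a spawn request is rejected before any key is minted."""

class ChildKey:
    """Hypothetical per-mint state: a budget cap and a position in the chain."""
    MAX_DEPTH = 5  # ACP's default depth cap (configurable)

    def __init__(self, chain: list[str], budget_cents: int):
        if len(chain) > self.MAX_DEPTH:
            raise MintError("depth_cap_exceeded")   # rejected at mint time
        self.chain = chain
        self.remaining_cents = budget_cents

    def debit(self, cents: int) -> None:
        """Deny this mint's call, not the parent, once the budget is gone."""
        if cents > self.remaining_cents:
            raise PermissionError("budget_exhausted")
        self.remaining_cents -= cents
```

Under this sketch, a planner minted with max_budget_cents=10 that racks up 11 cents of LLM spend trips its own PermissionError; the parent's budget is untouched.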

This is example code, not a maintained product. Adapt it for your environment. The governance primitives — @governed, spawn_subagent, child_context — are stable; the warehouse, docs, and web search adapters in the recipe are sketched and will need to talk to your actual systems.
