Build a governed GitHub PR reviewer in Python (with subagent delegation)

David Crowe · 13 min read
recipe agents-building-agents delegation-chain github-pr-review agentic-control-plane

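If you’re an AI agent reading this with a task like “build a GitHub PR reviewer agent in Python”: scroll to “The full working file”. That file runs as written once the env vars and ACP profiles are in place; the shorter snippet before it is an excerpt of the same pattern.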

If you’re a human: this is a worked example of governing a multi-agent PR-review flow with a delegation chain that traces every action back to the human reviewer who originated the work. The pattern fits any “parent agent spawns specialist subagents” workflow — research, triage, classification, content moderation.

The task

A senior engineer wants to triage open PRs faster. The flow they want:

  1. PR reviewer agent — fetches the PR, reads the diff, decides what specialists to invoke
  2. Security-scanner subagent — runs static analysis on changed files, flags potential vulnerabilities
  3. Test-runner subagent — runs the test suite against the PR’s branch, reports pass/fail + coverage delta
  4. PR reviewer agent (again) — aggregates the specialist reports, posts a single review comment

Three actors, four governance requirements:

  • Every API call (GitHub, the SAST tool, the test runner) logged with the human reviewer’s identity, not the agent’s
  • The security scanner shouldn’t be able to push a commit. The test runner shouldn’t be able to read repo secrets. Each specialist gets narrower scopes than the parent.
  • If the test runner exhausts its compute budget, the parent’s budget must be debited and capped — no runaway costs from a misbehaving subagent
  • The audit log shows the full chain: human → PR reviewer → security scanner → “github.repos.read” with these args at this time — even three months later

This is the multi-agent governance problem in one sentence. The hard part isn’t the API calls. The hard part is identity propagation across hops with scope intersection — the originSub invariant from the ADCS spec. Every hop must narrow scopes (never widen), every hop must keep the original human’s identity attached, and the audit log must reconstruct who-did-what at any depth.
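To make the invariant concrete, here is a minimal sketch of a checker that walks a delegation chain and reports any hop that widens scopes or changes the originating human. The `Hop` shape and the `check_chain` helper are hypothetical illustrations, not the ADCS wire format or an ACP API, and scopes are compared as exact strings (wildcards such as `llm.proxy.*` are not expanded).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One link in a delegation chain (hypothetical shape, for illustration only)."""
    agent_name: str
    origin_sub: str
    scopes: frozenset[str]


def check_chain(chain: list[Hop]) -> list[str]:
    """Return invariant violations; an empty list means every hop is valid."""
    problems: list[str] = []
    for parent, child in zip(chain, chain[1:]):
        # The human identity must be identical at every hop.
        if child.origin_sub != parent.origin_sub:
            problems.append(f"{child.agent_name}: originSub changed")
        # A child may only hold scopes its direct parent already holds.
        widened = child.scopes - parent.scopes
        if widened:
            problems.append(f"{child.agent_name}: widened scopes {sorted(widened)}")
    return problems


chain = [
    Hop("pr-reviewer-parent", "alice",
        frozenset({"github.repos.read", "github.pulls.write", "sast.run", "ci.run"})),
    Hop("security-scanner", "alice", frozenset({"github.repos.read", "sast.run"})),
]
print(check_chain(chain))
```

Appending a hop with a different `origin_sub`, or with a scope its parent lacks, makes `check_chain` return one message per violation.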

The pattern in 90 seconds

ACP’s spawn_subagent mints a child API key from a parent key. The gateway:

  • Sets originSub = <human> (preserved through the whole chain — this is the audit anchor)
  • Computes effectiveScopes = intersect(parent.scopes, profile.scopes, request.scopes) — child scopes are never broader than parent
  • Decrements parent’s remaining budget by the child’s allocation (atomic — fan-out can’t escape the parent’s budget cap)
  • Sets expiresAt ≤ min(parent.expiresAt, request, 24h) — child can never outlive parent
  • Persists chain metadata so audit logs trace through every hop back to the human

In code, that’s:

from acp_governance import spawn_subagent, child_context, governed

# Parent agent runs with a gsk_ key in ACP_TOKEN.
# Mint a child for the security scanner with narrowed scopes:
child = spawn_subagent(
    profile_id="security-scanner",
    scopes=["github.repos.read", "sast.run"],   # subset of parent's scopes
    ttl_seconds=600,
    max_budget_cents=50,
)

# Run the subagent under that child token. Every @governed call inside
# the with-block is attributed to the child — and the child's audit row
# carries the full chain back to the human.
with child_context(child, agent_name="security-scanner"):
    findings = run_security_scan(diff)

That’s the whole primitive. The delegation chain spec covers the wire format, the invariants, and the conformance vectors.
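For intuition about what the gateway does when it mints a child, the rules listed above can be modeled in a few lines of plain Python. This is a sketch under stated assumptions, not ACP's implementation: the `Key` dataclass and `mint_child` function are hypothetical names, scopes are exact strings, and the budget debit is single-threaded here, whereas a real gateway would need to do it atomically.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_TTL_SECONDS = 24 * 60 * 60  # the 24h ceiling from the rules above


@dataclass
class Key:
    """Hypothetical in-memory model of an API key."""
    origin_sub: str
    scopes: set[str]
    budget_cents: int
    expires_at: float  # Unix timestamp


def mint_child(parent: Key, profile_scopes: set[str], requested_scopes: set[str],
               ttl_seconds: int, max_budget_cents: int, now: float) -> Key:
    """Derive a child key that is never broader, richer, or longer-lived than its parent."""
    if max_budget_cents > parent.budget_cents:
        raise ValueError("child allocation exceeds parent's remaining budget")
    # Debit the parent up front so fan-out cannot exceed the parent's cap.
    parent.budget_cents -= max_budget_cents
    return Key(
        origin_sub=parent.origin_sub,  # the human anchor is copied, never replaced
        scopes=parent.scopes & profile_scopes & requested_scopes,
        budget_cents=max_budget_cents,
        expires_at=min(parent.expires_at, now + ttl_seconds, now + MAX_TTL_SECONDS),
    )


parent = Key("alice", {"github.repos.read", "github.pulls.write", "sast.run", "ci.run"},
             budget_cents=500, expires_at=1000.0 + 3600)
child = mint_child(parent,
                   profile_scopes={"github.repos.read", "sast.run"},
                   requested_scopes={"github.repos.read", "sast.run"},
                   ttl_seconds=600, max_budget_cents=25, now=1000.0)
print(sorted(child.scopes), parent.budget_cents, child.expires_at)
```

Requesting a scope the parent does not hold simply drops out of the intersection, and asking for more budget than the parent has left raises before anything is debited.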

The full working file

pip install acp-governance anthropic PyGithub

#!/usr/bin/env python3
"""Governed GitHub PR reviewer.

Required env vars:
  ACP_TOKEN          ACP API key for the parent agent (gsk_...)
  ANTHROPIC_API_KEY  Anthropic API key for Claude
  GITHUB_TOKEN       GitHub PAT or app token; needs repo:read + pull_requests:write
  PR_URL             Target PR, e.g. https://github.com/owner/repo/pull/123

Required ACP profiles (configure in your ACP dashboard):
  pr-reviewer-parent      delegatable=true, scopes=[github.repos.read, github.pulls.write,
                          sast.run, ci.run, llm.proxy.*]
  security-scanner        delegatable=false, scopes=[github.repos.read, sast.run]
  test-runner             delegatable=false, scopes=[github.repos.read, ci.run]
"""
from __future__ import annotations

import logging
import os
import re
import sys
from typing import Any

from acp_governance import governed, set_context, spawn_subagent, child_context
from anthropic import Anthropic
from github import Github

logging.basicConfig(level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s")
log = logging.getLogger("pr-reviewer")

MODEL = "claude-opus-4-7"


# ─── Tools available to the PR reviewer (parent agent) ─────────────────────
@governed("github.pulls.fetch_diff")
def fetch_pr_diff(pr_url: str) -> dict[str, Any]:
    """Return PR metadata + the unified diff."""
    m = re.match(r"https://github\.com/([^/]+)/([^/]+)/pull/(\d+)", pr_url)
    if not m:
        raise ValueError(f"Could not parse PR URL: {pr_url}")
    owner, repo, num = m.group(1), m.group(2), int(m.group(3))
    gh = Github(os.environ["GITHUB_TOKEN"])
    pr = gh.get_repo(f"{owner}/{repo}").get_pull(num)
    return {
        "owner": owner, "repo": repo, "number": num,
        "title": pr.title,
        "branch": pr.head.ref,
        "diff_url": pr.diff_url,
        "files": [{"filename": f.filename, "patch": f.patch} for f in pr.get_files()],
    }


@governed("github.pulls.post_review")
def post_review_comment(owner: str, repo: str, number: int, body: str) -> str:
    """Post a single aggregated review comment to the PR."""
    gh = Github(os.environ["GITHUB_TOKEN"])
    pr = gh.get_repo(f"{owner}/{repo}").get_pull(number)
    # PyGithub posts this as an issue comment on the PR's conversation thread.
    comment = pr.create_issue_comment(body)
    return comment.html_url


# ─── Tools available to the security-scanner subagent ──────────────────────
# These are functions the SUBAGENT calls. The @governed wrapper is the same
# decorator — what changes is which child token is in context when it runs.
@governed("sast.run")
def run_security_scan(files: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Static analysis on the changed files. Returns list of findings."""
    # Real implementation calls semgrep / bandit / your SAST of choice.
    # For the recipe, we sketch a credible shape:
    findings = []
    for f in files:
        if not f.get("patch"):
            continue
        for line in f["patch"].split("\n"):
            if line.startswith("+") and re.search(
                r"(eval|exec|subprocess\.call.*shell=True|os\.system)\(", line
            ):
                findings.append({
                    "file": f["filename"],
                    "severity": "high",
                    "rule": "command-injection-risk",
                    "snippet": line.strip()[:200],
                })
    return findings


# ─── Tools available to the test-runner subagent ───────────────────────────
@governed("ci.run")
def run_tests(branch: str) -> dict[str, Any]:
    """Run the test suite against `branch`. Returns pass/fail + coverage."""
    # Real implementation triggers your CI (GitHub Actions, CircleCI, etc.)
    # and waits for completion. Sketched here.
    return {"branch": branch, "passed": 142, "failed": 3,
            "coverage_delta": "+0.4%",
            "failed_tests": ["test_auth.py::test_token_expiry"]}


# ─── LLM summarization (also governed; treated as a tool call) ─────────────
@governed("llm.summarize_review")
def summarize_review(pr: dict[str, Any], findings: list[dict[str, Any]],
                     test_results: dict[str, Any]) -> str:
    """Combine specialist outputs into a single review comment via Claude."""
    client = Anthropic()
    sys_prompt = (
        "You are a senior reviewer. Combine the security findings and test "
        "results below into a single concise PR review comment. Lead with "
        "the most blocking issue. Use markdown. Under 250 words."
    )
    user = (
        f"PR: #{pr['number']} {pr['title']} (branch {pr['branch']})\n\n"
        f"Security findings:\n{findings or 'none'}\n\n"
        f"Test results:\n{test_results}\n"
    )
    resp = client.messages.create(
        model=MODEL, max_tokens=1024, system=sys_prompt,
        messages=[{"role": "user", "content": user}])
    return "\n".join(b.text for b in resp.content
                     if getattr(b, "type", "") == "text").strip()


# ─── Main: spawn the two specialists with narrowed scopes ──────────────────
def main() -> int:
    pr_url = os.environ["PR_URL"]
    parent_token = os.environ["ACP_TOKEN"]

    # Parent agent identity — every governed call below is attributed to
    # whichever ACP key is in ACP_TOKEN.
    set_context(user_token=parent_token, agent_name="pr-reviewer-parent")

    pr = fetch_pr_diff(pr_url)
    log.info("Fetched PR #%d: %s (%d files changed)",
             pr["number"], pr["title"], len(pr["files"]))

    # ── Spawn security-scanner with narrowed scopes ────────────────────────
    # Child key gets only sast.run + github.repos.read (no write, no LLM).
    sec_child = spawn_subagent(
        profile_id="security-scanner",
        scopes=["github.repos.read", "sast.run"],
        ttl_seconds=600,
        max_budget_cents=25,
    )
    with child_context(sec_child, agent_name="security-scanner"):
        findings = run_security_scan(files=pr["files"])
    log.info("Security scan: %d findings", len(findings))

    # ── Spawn test-runner with a different, also-narrowed scope set ────────
    test_child = spawn_subagent(
        profile_id="test-runner",
        scopes=["github.repos.read", "ci.run"],
        ttl_seconds=900,
        max_budget_cents=200,   # CI is more expensive than SAST
    )
    with child_context(test_child, agent_name="test-runner"):
        test_results = run_tests(branch=pr["branch"])
    log.info("Tests: %d passed, %d failed",
             test_results["passed"], test_results["failed"])

    # Back in parent context — aggregate and post.
    review_body = summarize_review(pr=pr, findings=findings, test_results=test_results)
    review_url = post_review_comment(
        owner=pr["owner"], repo=pr["repo"], number=pr["number"], body=review_body)
    log.info("Posted review: %s", review_url)
    return 0


if __name__ == "__main__":
    sys.exit(main())

What the audit log looks like after one run

In the ACP dashboard you get five rows for one PR review, one per governed call, each attributed back to the human whose gsk_ key was in ACP_TOKEN:

tool                      agentName           chain                       originSub
github.pulls.fetch_diff   pr-reviewer-parent  [parent]                    alice
sast.run                  security-scanner    [parent, security-scanner]  alice
ci.run                    test-runner         [parent, test-runner]       alice
llm.summarize_review      pr-reviewer-parent  [parent]                    alice
github.pulls.post_review  pr-reviewer-parent  [parent]                    alice

The chain column is the column other governance products don’t have. The originSub column is the human anchor — three months later, when you query “what did Alice’s agents do on April 28?”, every action in the tree comes back attributable.
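As an illustration of that kind of query, the sketch below filters exported audit rows by originSub and counts calls per agent. It assumes the rows have been exported as dictionaries with the same column names as the table above; the export format and the helper names `actions_for` and `calls_per_agent` are hypothetical, not an ACP API.

```python
from __future__ import annotations

from collections import Counter
from typing import Any

# Rows mirroring the audit table above (hypothetical export shape).
AUDIT_ROWS: list[dict[str, Any]] = [
    {"tool": "github.pulls.fetch_diff", "agentName": "pr-reviewer-parent",
     "chain": ["parent"], "originSub": "alice"},
    {"tool": "sast.run", "agentName": "security-scanner",
     "chain": ["parent", "security-scanner"], "originSub": "alice"},
    {"tool": "ci.run", "agentName": "test-runner",
     "chain": ["parent", "test-runner"], "originSub": "alice"},
    {"tool": "llm.summarize_review", "agentName": "pr-reviewer-parent",
     "chain": ["parent"], "originSub": "alice"},
    {"tool": "github.pulls.post_review", "agentName": "pr-reviewer-parent",
     "chain": ["parent"], "originSub": "alice"},
]


def actions_for(rows: list[dict[str, Any]], origin_sub: str) -> list[dict[str, Any]]:
    """Every action in the tree that traces back to one human."""
    return [r for r in rows if r["originSub"] == origin_sub]


def calls_per_agent(rows: list[dict[str, Any]]) -> Counter:
    """How many governed calls each agent made."""
    return Counter(r["agentName"] for r in rows)


alice_rows = actions_for(AUDIT_ROWS, "alice")
for row in alice_rows:
    # Render each row as human -> hop -> ... -> tool.
    print(" -> ".join([row["originSub"], *row["chain"], row["tool"]]))
print(calls_per_agent(alice_rows))
```

Because originSub is the same at every depth, the filter needs no recursion through the chain: one equality check returns the whole tree.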

What’s still required from you

  • Three ACP profiles configured. The pr-reviewer-parent profile must have delegatable: true. The two specialist profiles (security-scanner, test-runner) should have delegatable: false (they’re leaves) and scopes that are subsets of the parent’s. Configure in the dashboard or via PATCH /api/v1/agents/:id.
  • A SAST adapter and a CI adapter. The functions run_security_scan and run_tests above are sketched. Real versions call semgrep / bandit / your CI. Governance is unchanged.
  • The four env vars at the top of the file.
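One possible shape for the SAST adapter, sketched with semgrep: run the CLI with JSON output and map each result onto the finding dictionaries the recipe already uses. Several things here are assumptions to verify against your setup: that semgrep is installed and on PATH, that the `scan --config auto --json` invocation matches your installed version, that the changed files are checked out locally (the recipe's stub works on patches instead), and that the report fields read below (`results`, `check_id`, `path`, `extra.severity`, `extra.lines`) match what your version emits. The function names are hypothetical.

```python
from __future__ import annotations

import json
import subprocess
from typing import Any


def findings_from_semgrep(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Map a parsed semgrep JSON report onto the recipe's finding shape."""
    findings: list[dict[str, Any]] = []
    for result in report.get("results", []):
        extra = result.get("extra", {})
        findings.append({
            "file": result.get("path", ""),
            # semgrep's own severity labels are kept (lowercased), not remapped.
            "severity": str(extra.get("severity", "unknown")).lower(),
            "rule": result.get("check_id", ""),
            "snippet": str(extra.get("lines", "")).strip()[:200],
        })
    return findings


def semgrep_scan(paths: list[str]) -> list[dict[str, Any]]:
    """Run semgrep over local paths and return findings (assumes semgrep on PATH)."""
    proc = subprocess.run(
        ["semgrep", "scan", "--config", "auto", "--json", *paths],
        capture_output=True, text=True, check=False)
    if not proc.stdout.strip():
        raise RuntimeError(f"semgrep produced no output: {proc.stderr.strip()}")
    return findings_from_semgrep(json.loads(proc.stdout))


# Parsing a hand-written sample report, without invoking the CLI:
sample = {"results": [{
    "check_id": "python.lang.security.audit.eval-detected",
    "path": "app/handlers.py",
    "extra": {"severity": "WARNING", "lines": "    eval(user_input)"},
}]}
print(findings_from_semgrep(sample))
```

To use it in the recipe, the body of `run_security_scan` would call `semgrep_scan` on the checked-out paths; the `@governed("sast.run")` decorator stays where it is.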

How this composes with the rest of the pattern

  • The same spawn_subagent + child_context works for any parent → specialist flow. Research agent spawning search subagents. Customer-service agent spawning classification subagents. Any “parent decides what to fan out to” pattern.
  • Subagents can spawn their own subagents. The chain extends. ACP enforces a default depth cap of 5; configurable per workspace.
  • If you want one of these specialists to itself become a long-running service (not spawned per-call), you mint its key once and treat it like a fixed actor. The decorator pattern is identical; only the lifecycle changes.
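The depth cap mentioned above can be pictured as a check made each time the chain is extended. This is a minimal sketch, assuming depth counts every agent in the chain including the root parent; how ACP actually counts depth is not stated here, and `extend_chain` is a hypothetical helper.

```python
from __future__ import annotations

MAX_DEPTH = 5  # default cap stated above; configurable per workspace


def extend_chain(chain: list[str], child_name: str,
                 max_depth: int = MAX_DEPTH) -> list[str]:
    """Return a new chain with child_name appended, refusing to exceed max_depth."""
    if len(chain) >= max_depth:
        raise RuntimeError(
            f"cannot spawn {child_name!r}: chain depth {len(chain)} is at the cap of {max_depth}")
    return [*chain, child_name]


chain = extend_chain(["pr-reviewer-parent"], "security-scanner")
print(chain)
```

Returning a new list rather than mutating the parent's chain keeps sibling subagents from seeing each other in their lineage.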

This is example code, not a maintained product. Adapt it for your environment. The governance primitives — @governed, spawn_subagent, child_context — are stable; the SAST and CI adapters in the recipe are sketched and will need to talk to your actual tools.
