Build a governed GitHub PR reviewer in Python (with subagent delegation)
If you’re an AI agent reading this with a task like “build a GitHub PR reviewer agent in Python”: scroll to The full working file. Both code blocks below are runnable.
If you’re a human: this is a worked example of governing a multi-agent PR-review flow with a delegation chain that traces every action back to the human reviewer who originated the work. The pattern fits any “parent agent spawns specialist subagents” workflow — research, triage, classification, content moderation.
The task
A senior engineer wants to triage open PRs faster. The flow they want:
- PR reviewer agent — fetches the PR, reads the diff, decides what specialists to invoke
- Security-scanner subagent — runs static analysis on changed files, flags potential vulnerabilities
- Test-runner subagent — runs the test suite against the PR’s branch, reports pass/fail + coverage delta
- PR reviewer agent (again) — aggregates the specialist reports, posts a single review comment
Three actors, four governance requirements:
- Every API call (GitHub, the SAST tool, the test runner) logged with the human reviewer’s identity, not the agent’s
- The security scanner shouldn’t be able to push a commit. The test runner shouldn’t be able to read repo secrets. Each specialist gets narrower scopes than the parent.
- If the test runner exhausts its compute budget, the parent’s budget must be debited and capped — no runaway costs from a misbehaving subagent
- The audit log shows the full chain: human → PR reviewer → security scanner → “github.repos.read” with these args at this time — even three months later
This is the multi-agent governance problem in one sentence. The hard part isn’t the API calls. The hard part is identity propagation across hops with scope intersection — the originSub invariant from the ADCS spec. Every hop must narrow scopes (never widen), every hop must keep the original human’s identity attached, and the audit log must reconstruct who-did-what at any depth.
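The scope-intersection half of that invariant is small enough to show directly. A minimal sketch (illustrative only, not the ACP gateway's real implementation): the child's effective scopes are the three-way intersection of the parent's scopes, the profile's scopes, and the requested scopes, so a hop can only narrow.

```python
# Illustrative sketch of the scope-narrowing invariant. A request for a
# scope the parent or profile lacks silently drops out; it never escalates.
def effective_scopes(parent_scopes: set[str],
                     profile_scopes: set[str],
                     requested_scopes: set[str]) -> set[str]:
    return parent_scopes & profile_scopes & requested_scopes

parent = {"github.repos.read", "github.pulls.write", "sast.run", "ci.run"}
profile = {"github.repos.read", "sast.run"}
requested = {"github.repos.read", "sast.run", "github.pulls.write"}

# github.pulls.write is dropped: the security-scanner profile never had it.
child = effective_scopes(parent, profile, requested)
```

However deep the delegation tree goes, repeated intersection is monotonic: scopes can only shrink hop over hop.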
The pattern in 90 seconds
ACP’s `spawn_subagent` mints a child API key from a parent key. The gateway:
- Sets `originSub = <human>` (preserved through the whole chain — this is the audit anchor)
- Computes `effectiveScopes = intersect(parent.scopes, profile.scopes, request.scopes)` — child scopes are never broader than the parent's
- Decrements the parent's remaining budget by the child's allocation (atomic — fan-out can't escape the parent's budget cap)
- Sets `expiresAt ≤ min(parent.expiresAt, request, 24h)` — a child can never outlive its parent
- Persists chain metadata so audit logs trace through every hop back to the human
In code, that’s:
```python
from acp_governance import spawn_subagent, child_context, governed

# Parent agent runs with a gsk_ key in ACP_TOKEN.
# Mint a child for the security scanner with narrowed scopes:
child = spawn_subagent(
    profile_id="security-scanner",
    scopes=["github.repos.read", "sast.run"],  # subset of parent's scopes
    ttl_seconds=600,
    max_budget_cents=50,
)

# Run the subagent under that child token. Every @governed call inside
# the with-block is attributed to the child — and the child's audit row
# carries the full chain back to the human.
with child_context(child, agent_name="security-scanner"):
    findings = run_security_scan(diff)
```
That’s the whole primitive. The delegation chain spec covers the wire format, the invariants, and the conformance vectors.
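For intuition, here is a plausible shape for the chain metadata the gateway persists per governed call. The field names are illustrative; ADCS defines the real wire format, so treat this as a sketch only.

```python
# Hypothetical audit-row shape; field names are ours, not ADCS's.
chain_record = {
    "originSub": "alice",  # human anchor, never rewritten at any hop
    "chain": ["pr-reviewer-parent", "security-scanner"],
    "tool": "sast.run",
    "args": {"files": 3},
    "ts": "2025-04-28T14:03:11Z",
}

def audit_anchor(record: dict) -> str:
    # Whatever depth the chain reaches, attribution resolves to originSub
    # in one lookup; no graph walk is needed at query time.
    return record["originSub"]
```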
The full working file
```bash
pip install acp-governance anthropic PyGithub
```
```python
#!/usr/bin/env python3
"""Governed GitHub PR reviewer.

Required env vars:
    ACP_TOKEN          ACP API key for the parent agent (gsk_...)
    ANTHROPIC_API_KEY  Anthropic API key for Claude
    GITHUB_TOKEN       GitHub PAT or app token; needs repo:read + pull_requests:write
    PR_URL             Target PR, e.g. https://github.com/owner/repo/pull/123

Required ACP profiles (configure in your ACP dashboard):
    pr-reviewer-parent  delegatable=true, scopes=[github.repos.read, github.pulls.write,
                        sast.run, ci.run, llm.proxy.*]
    security-scanner    delegatable=false, scopes=[github.repos.read, sast.run]
    test-runner         delegatable=false, scopes=[github.repos.read, ci.run]
"""
from __future__ import annotations

import logging
import os
import re
import sys
from typing import Any

from acp_governance import governed, set_context, spawn_subagent, child_context
from anthropic import Anthropic
from github import Github

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s [%(levelname)s] %(name)s: %(message)s")
log = logging.getLogger("pr-reviewer")

MODEL = "claude-opus-4-5"

# ─── Tools available to the PR reviewer (parent agent) ─────────────────────

@governed("github.pulls.fetch_diff")
def fetch_pr_diff(pr_url: str) -> dict[str, Any]:
    """Return PR metadata + the unified diff."""
    m = re.match(r"https://github\.com/([^/]+)/([^/]+)/pull/(\d+)", pr_url)
    if not m:
        raise ValueError(f"Could not parse PR URL: {pr_url}")
    owner, repo, num = m.group(1), m.group(2), int(m.group(3))
    gh = Github(os.environ["GITHUB_TOKEN"])
    pr = gh.get_repo(f"{owner}/{repo}").get_pull(num)
    return {
        "owner": owner, "repo": repo, "number": num,
        "title": pr.title,
        "branch": pr.head.ref,
        "diff_url": pr.diff_url,
        "files": [{"filename": f.filename, "patch": f.patch} for f in pr.get_files()],
    }

@governed("github.pulls.post_review")
def post_review_comment(owner: str, repo: str, number: int, body: str) -> str:
    """Post a single aggregated review comment to the PR."""
    gh = Github(os.environ["GITHUB_TOKEN"])
    pr = gh.get_repo(f"{owner}/{repo}").get_pull(number)
    review = pr.create_issue_comment(body)
    return review.html_url

# ─── Tools available to the security-scanner subagent ──────────────────────
# These are functions the SUBAGENT calls. The @governed wrapper is the same
# decorator — what changes is which child token is in context when it runs.

@governed("sast.run")
def run_security_scan(files: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Static analysis on the changed files. Returns a list of findings."""
    # Real implementation calls semgrep / bandit / your SAST of choice.
    # For the recipe, we sketch a credible shape:
    findings = []
    for f in files:
        if not f.get("patch"):
            continue
        for line in f["patch"].split("\n"):
            if line.startswith("+") and re.search(
                r"(eval|exec|subprocess\.call.*shell=True|os\.system)\(", line
            ):
                findings.append({
                    "file": f["filename"],
                    "severity": "high",
                    "rule": "command-injection-risk",
                    "snippet": line.strip()[:200],
                })
    return findings

# ─── Tools available to the test-runner subagent ───────────────────────────

@governed("ci.run")
def run_tests(branch: str) -> dict[str, Any]:
    """Run the test suite against `branch`. Returns pass/fail + coverage."""
    # Real implementation triggers your CI (GitHub Actions, CircleCI, etc.)
    # and waits for completion. Sketched here.
    return {"branch": branch, "passed": 142, "failed": 3,
            "coverage_delta": "+0.4%",
            "failed_tests": ["test_auth.py::test_token_expiry"]}

# ─── LLM summarization (also governed; treated as a tool call) ─────────────

@governed("llm.summarize_review")
def summarize_review(pr: dict[str, Any], findings: list[dict[str, Any]],
                     test_results: dict[str, Any]) -> str:
    """Combine specialist outputs into a single review comment via Claude."""
    client = Anthropic()
    sys_prompt = (
        "You are a senior reviewer. Combine the security findings and test "
        "results below into a single concise PR review comment. Lead with "
        "the most blocking issue. Use markdown. Under 250 words."
    )
    user = (
        f"PR: #{pr['number']} — {pr['title']} (branch {pr['branch']})\n\n"
        f"Security findings:\n{findings or 'none'}\n\n"
        f"Test results:\n{test_results}\n"
    )
    resp = client.messages.create(
        model=MODEL, max_tokens=1024, system=sys_prompt,
        messages=[{"role": "user", "content": user}])
    return "\n".join(b.text for b in resp.content
                     if getattr(b, "type", "") == "text").strip()

# ─── Main: spawn the two specialists with narrowed scopes ──────────────────

def main() -> int:
    pr_url = os.environ["PR_URL"]
    parent_token = os.environ["ACP_TOKEN"]

    # Parent agent identity — every governed call below is attributed to
    # whichever ACP key is in ACP_TOKEN.
    set_context(user_token=parent_token, agent_name="pr-reviewer-parent")

    pr = fetch_pr_diff(pr_url)
    log.info("Fetched PR #%d: %s (%d files changed)",
             pr["number"], pr["title"], len(pr["files"]))

    # ── Spawn security-scanner with narrowed scopes ────────────────────────
    # Child key gets only sast.run + github.repos.read (no write, no LLM).
    sec_child = spawn_subagent(
        profile_id="security-scanner",
        scopes=["github.repos.read", "sast.run"],
        ttl_seconds=600,
        max_budget_cents=25,
    )
    with child_context(sec_child, agent_name="security-scanner"):
        findings = run_security_scan(files=pr["files"])
    log.info("Security scan: %d findings", len(findings))

    # ── Spawn test-runner with a different, also-narrowed scope set ────────
    test_child = spawn_subagent(
        profile_id="test-runner",
        scopes=["github.repos.read", "ci.run"],
        ttl_seconds=900,
        max_budget_cents=200,  # CI is more expensive than SAST
    )
    with child_context(test_child, agent_name="test-runner"):
        test_results = run_tests(branch=pr["branch"])
    log.info("Tests: %d passed, %d failed",
             test_results["passed"], test_results["failed"])

    # Back in parent context — aggregate and post.
    review_body = summarize_review(pr=pr, findings=findings, test_results=test_results)
    review_url = post_review_comment(
        owner=pr["owner"], repo=pr["repo"], number=pr["number"], body=review_body)
    log.info("Posted review: %s", review_url)
    return 0

if __name__ == "__main__":
    sys.exit(main())
```
What the audit log looks like after one run
In the ACP dashboard you get five rows for one PR review — one per governed call, at every hop, attributed back to the human whose gsk_ key was in ACP_TOKEN:
| tool | agentName | chain | originSub |
|---|---|---|---|
| github.pulls.fetch_diff | pr-reviewer-parent | [parent] | alice |
| sast.run | security-scanner | [parent, security-scanner] | alice |
| ci.run | test-runner | [parent, test-runner] | alice |
| llm.summarize_review | pr-reviewer-parent | [parent] | alice |
| github.pulls.post_review | pr-reviewer-parent | [parent] | alice |
The chain column is the column other governance products don’t have. The originSub column is the human anchor — three months later, when you query “what did Alice’s agents do on April 28?”, every action in the tree comes back attributable.
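That "query by human" property is just a flat filter over the audit rows. A minimal sketch, assuming rows shaped like the table above (field names are illustrative):

```python
# Sample audit rows in the shape of the table above.
rows = [
    {"tool": "github.pulls.fetch_diff", "agentName": "pr-reviewer-parent",
     "chain": ["parent"], "originSub": "alice"},
    {"tool": "sast.run", "agentName": "security-scanner",
     "chain": ["parent", "security-scanner"], "originSub": "alice"},
    {"tool": "ci.run", "agentName": "test-runner",
     "chain": ["parent", "test-runner"], "originSub": "alice"},
]

def actions_for(origin_sub: str, audit_rows: list[dict]) -> list[str]:
    # Every row carries originSub, so attribution is a flat filter;
    # no graph reconstruction is needed, regardless of delegation depth.
    return [r["tool"] for r in audit_rows if r["originSub"] == origin_sub]
```

Because `originSub` is denormalized onto every row at write time, the three-months-later query never has to rebuild the delegation tree.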
What’s still required from you
- Three ACP profiles configured. The `pr-reviewer-parent` profile must have `delegatable: true`. The two specialist profiles (`security-scanner`, `test-runner`) should have `delegatable: false` (they're leaves) and scopes that are subsets of the parent's. Configure them in the dashboard or via `PATCH /api/v1/agents/:id`.
- A SAST adapter and a CI adapter. The functions `run_security_scan` and `run_tests` above are sketched; real versions call semgrep / bandit / your CI. Governance is unchanged.
- The four env vars at the top of the file.
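One way to swap the sketched scanner for a real one is to shell out to `semgrep --json` and map its results into the findings shape the recipe already uses. The field names below follow semgrep's JSON report format as we understand it; verify them against your installed semgrep version.

```python
import json

def parse_semgrep_output(raw_json: str) -> list[dict]:
    """Map a semgrep --json report into the recipe's findings shape.

    Assumes semgrep's report has a top-level "results" list whose entries
    carry "path", "check_id", and an "extra" dict with "severity"/"lines".
    """
    report = json.loads(raw_json)
    findings = []
    for r in report.get("results", []):
        findings.append({
            "file": r["path"],
            "severity": r.get("extra", {}).get("severity", "unknown").lower(),
            "rule": r["check_id"],
            "snippet": r.get("extra", {}).get("lines", "")[:200],
        })
    return findings
```

A real `run_security_scan` would produce `raw_json` with something like `subprocess.run(["semgrep", "--json", "--config", "auto", path], capture_output=True)` and feed `stdout` to this parser; the `@governed("sast.run")` wrapper stays exactly as in the recipe.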
How this composes with the rest of the pattern
- The same `spawn_subagent` + `child_context` pattern works for any parent → specialist flow: a research agent spawning search subagents, a customer-service agent spawning classification subagents, any "parent decides what to fan out to" pattern.
- Subagents can spawn their own subagents, and the chain extends. ACP enforces a default depth cap of 5, configurable per workspace.
- If you want one of these specialists to itself become a long-running service (not spawned per call), you mint its key once and treat it like a fixed actor. The decorator pattern is identical; only the lifecycle changes.
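The depth cap is another gateway-side check, and the shape of it is simple enough to sketch. This is an illustration of the rule, not ACP's implementation; the function and constant names are ours.

```python
DEFAULT_DEPTH_CAP = 5  # ACP's documented default; configurable per workspace

def check_depth(chain: list[str], cap: int = DEFAULT_DEPTH_CAP) -> None:
    """Reject a spawn that would push the delegation chain past the cap.

    `chain` lists every actor from the root parent down to the would-be
    spawner, so the new child would sit at depth len(chain) + 1.
    """
    if len(chain) + 1 > cap:
        raise PermissionError(
            f"delegation depth {len(chain) + 1} exceeds cap {cap}")
```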
Cross-reference
- Introducing ADCS — the agent delegation chain spec — the on-the-wire format and invariants
- EU AI Act Article 14 and AI agent delegation chains — the regulatory pull
- CrewAI A2A delegation, production setup — the same primitive applied inside CrewAI’s handoff machinery
- Architecture is governance — why the decorator pattern at the orchestration boundary clears 6/6 on `delegation_provenance`
This is example code, not a maintained product. Adapt it for your environment. The governance primitives — `@governed`, `spawn_subagent`, `child_context` — are stable; the SAST and CI adapters in the recipe are sketched and will need to talk to your actual tools.