Agentic Control Plane

Runtime Authorization for AI Agents: Why Static Policies Break

David Crowe · 10 min read

Your RBAC system assigns roles at login. Your AI agent makes 47 tool calls in 90 seconds. By the time a human could review the third call, the agent has already finished.

This is the authorization problem that nobody designed for.


The authorization model was designed for humans

Traditional authorization follows a clean path: user authenticates, gets a role, role grants permissions, permissions are checked on each request. This works because humans are slow. They click one button at a time. They read responses. They make maybe 20 requests per session. Authorization decisions happen at human speed, and the model holds.

AI agents break every assumption underneath that model.

Agents chain tool calls autonomously. They make context-dependent decisions about what to access. They operate at machine speed — dozens of calls per minute, each one a new authorization decision your system needs to evaluate. The question isn’t “does this user have permission?” It’s “does this agent, acting for this user, have permission to do this specific thing, right now, given what it’s already done?”

Traditional apps have two parties: User authenticates with Backend. Identity flows directly. Authorization is clean. Audit trails work.

Agentic apps have three parties: User authenticates with an LLM runtime. The LLM decides to call your backend. Your backend sees a service account token. It has no idea who initiated the request, what the agent has already done, or what it plans to do next. Authorization is blind.
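The difference shows up in the token claims the backend receives. A minimal sketch of the two cases — the claim shapes and identifiers are illustrative, not tied to any specific identity provider:

```typescript
// Two-party: the backend receives the user's own token.
// Identity flows directly; the audit trail names a human.
const directToken = {
  sub: "auth0|8f3a2b1c",                  // the human who made the request
  scope: "crm:read email:send:internal",  // that user's grants
};

// Three-party: the agent runtime calls with a service-account token.
// The user who initiated the request appears nowhere in the claims.
const serviceToken = {
  sub: "svc-agent-runtime",                    // the runtime, not the user
  scope: "crm:read crm:write email:send",      // broad, shared by every user
};

// The backend cannot recover the initiating user from the service token:
const identifiesUser = (t: { sub: string }) => t.sub.startsWith("auth0|");
const direct = identifiesUser(directToken);   // true
const viaAgent = identifiesUser(serviceToken); // false
```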


Why static RBAC fails

Role-Based Access Control was designed for a world where a user logs in, gets a role, and that role governs their entire session. Four things break when you apply this to AI agents.

Speed. An agent makes 50 tool calls while you’re still reviewing the first one. Static role checks at session start don’t help when the agent’s actual behavior unfolds dynamically across dozens of calls. By the time you realize the agent is doing something you didn’t intend, it’s already done.

Context sensitivity. An agent querying your CRM is fine. The same agent querying your CRM and then emailing that data to an external address is a data exfiltration pattern. Static roles evaluate each call in isolation. They can’t evaluate chains of actions, and chains are where the risk lives.
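Catching that pattern requires evaluating the session history, not the call in isolation. A minimal sketch of the idea — the tool names and the single exfiltration rule are illustrative, not a complete detection scheme:

```typescript
type ToolCall = { tool: string };

// Flags a sensitive read followed later by an external send in the same
// session -- a chain that a stateless, per-call role check cannot see.
function isExfiltrationChain(history: ToolCall[], next: ToolCall): boolean {
  const readSensitive = history.some(c => c.tool === "crm:read");
  return readSensitive && next.tool === "email:send:external";
}

const session: ToolCall[] = [{ tool: "crm:read" }];
const flagged = isExfiltrationChain(session, { tool: "email:send:external" });
// flagged === true: each call alone is allowable; the chain is not
const cleanSend = isExfiltrationChain([], { tool: "email:send:external" });
// cleanSend === false: an external send with no prior sensitive read passes
```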

Scope creep. An agent starts with a “read CRM contacts” call. It decides it also needs to “update a contact record” and “create a follow-up task.” Each individual permission might be granted by the user’s role. But the combination — read, modify, create — in the context of a single autonomous session may violate your data governance policy. Static RBAC has no concept of cumulative authorization.

Delegation ambiguity. The user has admin permissions. Should their agent? Static RBAC says yes — the agent inherits whatever the user has. Security says no. An agent operating autonomously at machine speed should never hold the same privilege level as a human who reviews each action before taking it. RFC 8693 (OAuth 2.0 Token Exchange) exists precisely because this is a known problem in delegated access. Most agent frameworks ignore it entirely.
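RFC 8693 defines the request shape for trading a user’s token for a narrower, delegation-aware one. A sketch of building that request — the grant-type and token-type URNs come from the RFC; the scope values, audience, and endpoint are assumptions for illustration:

```typescript
// Build an RFC 8693 token-exchange request body: exchange the user's
// access token for an agent token carrying a reduced scope.
// Parameter names follow the RFC; the values are illustrative.
function buildTokenExchangeRequest(userAccessToken: string): URLSearchParams {
  return new URLSearchParams({
    grant_type: "urn:ietf:params:oauth:grant-type:token-exchange",
    subject_token: userAccessToken,               // the user being acted for
    subject_token_type: "urn:ietf:params:oauth:token-type:access_token",
    requested_token_type: "urn:ietf:params:oauth:token-type:access_token",
    scope: "tool:crm:read tool:email:draft",      // strict subset of the user's scopes
    audience: "https://gateway.example.com",      // the enforcing control plane
  });
}

const body = buildTokenExchangeRequest("user-access-token");
// body.get("grant_type") === "urn:ietf:params:oauth:grant-type:token-exchange"
```

The point of the exchange is that the agent’s token is a distinct credential with its own, smaller scope — not a copy of the user’s.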


What runtime authorization actually means

Runtime authorization is per-call policy evaluation applied to every tool invocation an agent makes — not just at session start, not just at the API gateway, but on every single action the agent attempts.

Four principles define it:

Per-call policy evaluation. Every tool call triggers a policy check. Not every session. Not every minute. Every call. If an agent makes 200 tool calls in a workflow, that’s 200 authorization decisions, each evaluated against the current policy state.

Deny-by-default. Agents have no access until explicitly granted. This aligns with OWASP’s Top 10 for Agentic AI, which identifies excessive agency as a critical risk. An agent that can do anything the user can do — by default, with no explicit grant — is an agent waiting to be exploited. The secure default is zero permissions, with each capability explicitly allowed.
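These first two principles can be sketched in a few lines: the policy check wraps every tool dispatch, and a tool that was never explicitly allowed is denied — the function and policy shapes here are illustrative:

```typescript
type Policy = { allow: Set<string> };

// Deny-by-default: a call is permitted only if explicitly granted.
function authorize(policy: Policy, tool: string): boolean {
  return policy.allow.has(tool);
}

// Every dispatch goes through authorize() -- 200 calls, 200 decisions.
function dispatch(policy: Policy, tool: string, run: () => string): string {
  if (!authorize(policy, tool)) return `DENY ${tool}`;
  return run();
}

const policy: Policy = { allow: new Set(["tool:crm:read"]) };
const allowed = dispatch(policy, "tool:crm:read", () => "rows: 42");
// allowed === "rows: 42"
const denied = dispatch(policy, "tool:crm:write", () => "updated");
// denied === "DENY tool:crm:write" -- never granted, so never executed
```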

Scope inheritance with restriction. An agent’s permissions are always a strict subset of the delegating user’s permissions. The CFO’s agent shouldn’t have CFO-level access. An intern and a CFO should get different agent capabilities because they have different roles — and because an autonomous process operating at machine speed warrants tighter constraints than a human operating deliberately. This is the principle of least privilege applied to delegation.
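Mechanically, restriction is a set intersection: the agent receives only the scopes that are both requested and held by the delegating user. A sketch, with illustrative scope strings:

```typescript
// Grant the agent only scopes that are BOTH requested for the session
// AND held by the delegating user -- the agent can never exceed the user.
function restrictScopes(userScopes: string[], requested: string[]): string[] {
  const held = new Set(userScopes);
  return requested.filter(s => held.has(s));
}

const userScopes = ["tool:crm:read", "tool:crm:write", "tool:email:send:internal"];
const requested  = ["tool:crm:read", "tool:email:send:external"]; // agent asks for more
const agentScopes = restrictScopes(userScopes, requested);
// agentScopes === ["tool:crm:read"] -- the external-send request is dropped,
// because the user never held that scope to delegate
```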

Policy as code. Authorization rules are defined declaratively, version-controlled, and enforced at the gateway. Not buried in application logic. Not scattered across microservices. Defined in one place, enforced consistently, auditable by anyone with repo access.


What this looks like in practice

A sales rep asks their AI agent: “Pull the Q1 pipeline and email a summary to the VP.”

The agent plans three steps: (1) query CRM for Q1 pipeline data, (2) compose the email, (3) send the email.

Runtime authorization evaluates each step as the agent attempts it:

  • Step 1: Agent calls tool:crm:read. User has tool:crm:read in their scopes. Policy check passes. ALLOW.
  • Step 2: Agent composes the email internally. No tool call — this is agent-internal reasoning. No policy check needed.
  • Step 3: Agent calls tool:email:send:external. User has tool:email:send but tenant policy restricts external sends to manager role and above. User’s role is rep. DENY.

The agent completes step 1, drafts the email, but can’t send it. The audit log records the denial with the exact policy rule that triggered it. The user sees: “I prepared the Q1 summary but don’t have permission to email it externally. You can ask your manager to send it, or share it through an internal channel.”

Here’s the policy definition:

policies:
  - role: rep
    allow:
      - tool:crm:read
      - tool:email:draft
      - tool:email:send:internal
    deny:
      - tool:email:send:external
      - tool:crm:write

  - role: manager
    allow:
      - tool:crm:read
      - tool:crm:write
      - tool:email:send:internal
      - tool:email:send:external

Declarative. Version-controlled. Readable by security teams who don’t write code. Enforceable by machines that don’t read intent.
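The walkthrough above can be reproduced with a small evaluator over that policy. This sketch assumes deny-overrides ordering — explicit deny wins, then explicit allow, and anything unlisted is denied by default:

```typescript
type RolePolicy = { role: string; allow: string[]; deny?: string[] };

// Deny-overrides evaluation: deny beats allow; unlisted tools are denied.
function evaluate(policies: RolePolicy[], role: string, tool: string): "ALLOW" | "DENY" {
  const p = policies.find(x => x.role === role);
  if (!p) return "DENY";                          // unknown role: deny by default
  if (p.deny?.includes(tool)) return "DENY";      // explicit deny wins
  return p.allow.includes(tool) ? "ALLOW" : "DENY";
}

// The policy definition above, as data:
const policies: RolePolicy[] = [
  { role: "rep",
    allow: ["tool:crm:read", "tool:email:draft", "tool:email:send:internal"],
    deny:  ["tool:email:send:external", "tool:crm:write"] },
  { role: "manager",
    allow: ["tool:crm:read", "tool:crm:write",
            "tool:email:send:internal", "tool:email:send:external"] },
];

// Steps 1 and 3 from the walkthrough:
const step1 = evaluate(policies, "rep", "tool:crm:read");            // "ALLOW"
const step3 = evaluate(policies, "rep", "tool:email:send:external"); // "DENY"
```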


The enforcement point matters

Where you enforce authorization is as important as what you enforce.

If authorization lives inside the agent — as prompt instructions, system messages, or “permission skills” — you’re relying on the LLM to follow rules. That’s prompt engineering, not security. Snyk’s audit of ClawHub found that permission skills were just suggestions to the LLM, not enforcement layers. The agent could ignore them, and sometimes did.

Authorization must happen at the gateway — the control plane — where the agent can’t bypass it. The control plane evaluates policy before the tool call reaches your backend. The agent doesn’t get to decide whether it has permission. The gateway decides.

// The control plane evaluates policy on every tool call
const decision = checkPermissions({
  user: { sub: "auth0|8f3a2b1c", role: "rep" },
  tool: "email:send:external",
  scopes: userScopes,
  policy: tenantPolicy,
});

// decision.allowed = false
// decision.reason = "role:rep denied tool:email:send:external"
// decision.logged = true (written to audit trail)

This is a hard boundary, not a soft suggestion. The request never reaches your email API. The denial is logged with the user’s verified identity, the tool that was requested, the policy rule that matched, and a timestamp. That audit record exists whether the agent likes it or not.

HIPAA, SOC 2, and GDPR all require demonstrable access controls. The EU AI Act’s Article 14 mandates human oversight of high-risk AI systems. “We told the LLM not to do that” satisfies none of them. Infrastructure-level enforcement does.


Budget as authorization

Runtime authorization isn’t limited to tool access. It extends to usage governance — and this is where most teams have a blind spot.

Per-user rate limits. Per-agent budget caps. Runaway detection for agents stuck in retry loops. These are authorization decisions: you’re authorized to consume $50/day in agent operations, not $5,000.

A single agent in a tight error loop can burn through your entire monthly LLM budget before anyone notices. Rate limiting by request count doesn’t help when one expensive model call costs more than a thousand cheap ones. Budget-aware authorization tracks estimated cost per call and enforces spending limits as a first-class policy.

const budgetCheck = evaluateBudget({
  user: { sub: "auth0|8f3a2b1c" },
  estimatedCost: 0.12,
  windowSpend: 49.95,    // spent so far today
  dailyLimit: 50.00,     // tenant policy
});

// budgetCheck.allowed = false
// budgetCheck.reason = "daily budget exceeded (49.95 + 0.12 > 50.00)"

This is authorization in the economic sense. You wouldn’t give an employee an unlimited corporate credit card with no transaction alerts. Don’t give their agent one either.
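A check like `evaluateBudget` can be backed by something this small — a sketch, with field names following the snippet above and thresholds treated as tenant configuration:

```typescript
type BudgetInput = {
  estimatedCost: number; // projected cost of this call
  windowSpend: number;   // spend so far in the current window
  dailyLimit: number;    // tenant-configured cap
};

// Pre-flight check: deny before the call is made, not after the bill arrives.
function evaluateBudget(b: BudgetInput): { allowed: boolean; reason?: string } {
  const projected = b.windowSpend + b.estimatedCost;
  if (projected > b.dailyLimit) {
    return {
      allowed: false,
      reason: `daily budget exceeded (${b.windowSpend.toFixed(2)} + ` +
              `${b.estimatedCost.toFixed(2)} > ${b.dailyLimit.toFixed(2)})`,
    };
  }
  return { allowed: true };
}

const overCap = evaluateBudget({ estimatedCost: 0.12, windowSpend: 49.95, dailyLimit: 50.0 });
// overCap.allowed === false; the call never reaches the model provider
const underCap = evaluateBudget({ estimatedCost: 0.12, windowSpend: 10.0, dailyLimit: 50.0 });
// underCap.allowed === true
```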


Where to start

If your agents are calling tools today, ask three questions:

  1. Is authorization checked on every call, or just at session start? If it’s session-level, you’re authorizing intent, not actions. The agent’s actual behavior may diverge from what was authorized.

  2. Is it enforced at the infrastructure level, or inside the agent’s prompt? If it’s in the prompt, it’s a suggestion. Infrastructure-level enforcement means the agent can’t bypass it regardless of what the LLM decides to do.

  3. Can your agent exceed its user’s permissions? If the agent inherits the user’s full role with no restriction, you’ve given an autonomous process the same trust level as a deliberate human — at 100x the speed.

GatewayStack’s agentic control plane enforces per-call policy evaluation, deny-by-default scoping, and budget-aware rate limiting at the gateway. The agent framework doesn’t matter. The model provider doesn’t matter. Authorization is enforced at the trust boundary.

Get started · Reference architecture · Data plane vs control plane
