Agentic Control Plane

WebMCP Ships Without Agent Identity. Here's Why That Matters.

David Crowe · 13 min read
standards mcp identity

Chrome 146 Canary recently shipped WebMCP behind a flag. A banking site can now expose a getBalance tool as structured markup, and any browser-native AI agent can call it. The agent gets the account balance. The banking site has no idea which agent asked. No identity. No scoped permissions. No audit trail linking the call to a specific caller.

I filed issue #96 on the WebMCP repo to flag three security gaps. This post explains what’s missing, why the browser layer is the best place to fix it, and what the spec needs before production deployments put real user data behind unidentified tool calls.


What WebMCP is

WebMCP is a W3C Draft Community Group Report published in March 2026. It creates a browser-native API for websites to expose functionality as structured tools to AI agents.

Think of it as MCP for the browser. Anthropic’s Model Context Protocol handles the server side – backend services, databases, APIs. WebMCP handles the client side – live web page interaction, browser-native tool registration, user-facing consent flows. Two layers, same fundamental pattern: tools exposed to agents, agents calling tools autonomously.

The spec supports two tool types. Declarative tools are defined in HTML markup and handled by the browser itself. Imperative tools use navigator.modelContext.registerTool() and execute JavaScript callbacks when invoked. Both types let a website say “here are the things an agent can do on this page.”
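As a rough sketch of the imperative path, assuming only the navigator.modelContext.registerTool() entry point named above: the helper registerIfSupported and the fake navigator object are hypothetical, introduced so the snippet can run outside a browser build with the flag enabled.

```javascript
// Hypothetical helper: registers a tool only when the WebMCP entry point exists.
// `nav` is passed in (rather than read from the global) so the sketch also runs in Node.
function registerIfSupported(nav, tool) {
  const ctx = nav && nav.modelContext;
  if (!ctx || typeof ctx.registerTool !== "function") {
    return false; // WebMCP not available (flag off, or a different browser)
  }
  ctx.registerTool(tool);
  return true;
}

// Illustrative tool descriptor, mirroring the shape used in this post.
const getBalanceTool = {
  name: "getBalance",
  description: "Returns account balance",
  execute: async (input, client) => ({ balance: "$1,234.56" }),
};

// Stand-in for a browser with the flag enabled; it just records registrations.
const registered = [];
const fakeNavigator = { modelContext: { registerTool: (t) => registered.push(t) } };

console.log(registerIfSupported(fakeNavigator, getBalanceTool)); // true
console.log(registerIfSupported({}, getBalanceTool)); // false
```

Feature detection matters here because the API sits behind a flag: a page that calls registerTool() unconditionally would throw in every browser that lacks it.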

The spec is thoughtful. The editors have done serious work on user consent models, the SubmitEvent.agentInvoked property, and the destructiveHint advisory flag. What it doesn’t have is any mechanism for a tool to know who’s calling it.


Gap 1: Agent identity should not be deferred

Issue #54 raised identity at TPAC 2025. The community group resolved to defer it. The reasoning was understandable – identity is hard, and shipping a v1 without it avoids blocking the rest of the spec on an unsolved problem.

But deferral has a cost.

A banking site exposes getBalance as a WebMCP tool. Chrome’s built-in agent calls it. A third-party browser extension with agent capabilities calls it. A research prototype agent calls it. The tool’s execute() callback fires in all three cases. The callback receives a ModelContextClient object. That object exposes requestUserInteraction().

It does not expose which agent is calling.

SubmitEvent.agentInvoked tells the page that an agent acted. It doesn’t tell the page which one. The difference matters. A banking site might trust Chrome’s built-in agent and reject an unknown third-party extension. A healthcare portal might allow an agent that’s passed a security review and block one that hasn’t. Without agent identity in the callback, every caller is anonymous. The tool cannot make per-caller trust decisions.

This is the same three-party problem that exists on the server side. User authenticates with browser. Agent calls tool. Tool sees a valid invocation but has no idea who initiated it or which agent is acting. Server-side MCP has this gap too. WebMCP has the chance to solve it at the platform level.
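To make the per-caller decision concrete, here is a minimal sketch of what a tool could do if the client carried an identifier. The agentId property, the allowlist, and the identifier string are all assumptions for illustration; none of them exist in the current draft.

```javascript
// Hypothetical: `agentId` is NOT part of ModelContextClient today.
// The allowlist entry is an illustrative identifier, not a real one.
const TRUSTED_AGENTS = new Set(["browser-builtin-agent"]);

function decideTrust(client) {
  const agentId =
    client && typeof client.agentId === "string" ? client.agentId : null;
  if (agentId === null) {
    // This is every caller under the current draft.
    return { allow: false, reason: "anonymous caller" };
  }
  if (!TRUSTED_AGENTS.has(agentId)) {
    return { allow: false, reason: `untrusted agent: ${agentId}` };
  }
  return { allow: true, reason: "allowlisted agent" };
}

console.log(decideTrust({}).reason); // anonymous caller
console.log(decideTrust({ agentId: "browser-builtin-agent" }).allow); // true
```

Under the current draft every call takes the first branch, so the only consistent policies available to a tool are "allow everyone" or "allow no one".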


Gap 2: Permissions are binary, not scoped

A tool that reads account data and a tool that initiates a wire transfer have wildly different risk profiles. In the current spec, both are just tools. There’s no mechanism for a tool to declare required access levels. No scoping. No tiered permissions.

The spec includes destructiveHint, which is explicitly advisory. The spec text says the browser or agent can ignore it. This is the right design for a hint – hints should be ignorable. But it means there’s no enforceable permission layer.

Consider a financial services page exposing three tools:

  • getBalance – read-only, low risk
  • transferFunds – write, high risk
  • closeAccount – destructive, critical risk

In the current spec, all three are registered the same way. An agent that’s granted access to the page’s tools gets all of them. There’s no mechanism to say “this agent can read balances but cannot initiate transfers.” It’s all or nothing.

This is the deny-by-default problem applied to the browser. An agent should start with zero permissions and receive only the scopes explicitly granted. WebMCP currently gives agents binary access – either the tool is available or it isn’t. Per-tool scoping with enforceable access levels is missing entirely.

// Current WebMCP: all tools registered equally, no scope differentiation
navigator.modelContext.registerTool({
  name: "getBalance",
  description: "Returns account balance",
  execute: async (input, client) => {
    // No way to require "finance:read" scope
    // No way to reject based on caller permissions
    // client exposes: requestUserInteraction()
    // client does NOT expose: agent identity, granted scopes
    return { balance: "$1,234.56" };
  }
});

navigator.modelContext.registerTool({
  name: "transferFunds",
  description: "Initiates wire transfer",
  destructiveHint: true,  // advisory only -- agent can ignore
  execute: async (input, client) => {
    // Same callback signature as getBalance
    // No enforceable permission boundary
    return { status: "transferred" };
  }
});

The browser is the natural enforcement point. It sits between every agent and every tool. It can enforce scoped permissions the way it enforces CORS, CSP, and the Permissions API. But only if the spec defines the primitives.
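A deny-by-default check of this kind might look like the sketch below. It models the browser side in plain JavaScript; requiredScopes, the grant list, and both function names are hypothetical and not part of the draft.

```javascript
// Hypothetical browser-side check; `requiredScopes` and grant lists are not in the current spec.
function authorize(tool, grantedScopes) {
  const required = Array.isArray(tool.requiredScopes) ? tool.requiredScopes : [];
  if (required.length === 0) {
    // Deny by default: a tool that declares nothing is not treated as public.
    return { allowed: false, missing: [], reason: "tool declares no scopes" };
  }
  const granted = new Set(grantedScopes || []);
  const missing = required.filter((scope) => !granted.has(scope));
  return missing.length === 0
    ? { allowed: true, missing, reason: "all required scopes granted" }
    : { allowed: false, missing, reason: "missing scopes" };
}

// The dispatcher runs the check before the page's callback ever fires.
async function dispatchToolCall(tool, grantedScopes, input) {
  const decision = authorize(tool, grantedScopes);
  if (!decision.allowed) {
    throw new Error(
      `${tool.name} denied: ${decision.reason} ${decision.missing.join(", ")}`.trim()
    );
  }
  return tool.execute(input);
}

const transferFunds = {
  name: "transferFunds",
  requiredScopes: ["finance:write"],
  execute: async () => ({ status: "transferred" }),
};

console.log(authorize(transferFunds, ["finance:read"]).allowed); // false
console.log(authorize(transferFunds, ["finance:read", "finance:write"]).allowed); // true
```

Refusing tools that declare no scopes is one possible design choice; it keeps the default closed, at the cost of forcing every publisher to label even read-only tools.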


Gap 3: Delegation context doesn’t flow to imperative tools

Imperative tools are the powerful ones. They run arbitrary JavaScript in response to agent invocations. The execute() callback receives a ModelContextClient – and that client is sparse.

ModelContextClient exposes requestUserInteraction(). That’s it. No agent identity. No scope information. No correlation ID for audit trails. No delegation chain showing user-to-agent-to-tool provenance.

Here’s what the callback would need to receive. Today the client offers only requestUserInteraction(); every property in the comments below is missing:

// What's needed: delegation context flows to the tool
navigator.modelContext.registerTool({
  name: "getBalance",
  requiredScopes: ["finance:read"],
  execute: async (input, client) => {
    // client.agentId        -> "chrome-builtin-agent"
    // client.grantedScopes  -> ["finance:read"]
    // client.correlationId  -> "req_abc123"
    // client.delegatingUser -> { sub: "user_456" }

    if (!client.grantedScopes.includes("finance:read")) {
      throw new ToolPermissionError("finance:read required");
    }

    // Now the tool can: make per-caller trust decisions,
    // log the correlation ID for audit, reject unknown agents
    return { balance: "$1,234.56" };
  }
});

Without delegation context, imperative tools are executing in a trust vacuum. The tool has no way to log who called it in a way that’s forensically useful. It can log that a call happened, but not which agent, acting for which user, with which permissions. That’s an audit trail with a hole in the middle.
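The hole is easy to see if you try to build the log entry. The sketch below assembles an audit record from whatever the client provides and lists what is absent; the four field names follow the proposal above and are hypothetical, as is the buildAuditRecord helper and the illustrative timestamp.

```javascript
// Fields the post argues a tool needs for a forensically useful log entry.
// All four are hypothetical: today's ModelContextClient carries none of them.
const DELEGATION_FIELDS = ["agentId", "grantedScopes", "correlationId", "delegatingUser"];

function buildAuditRecord(toolName, client, timestamp) {
  const record = { tool: toolName, timestamp, missing: [] };
  for (const field of DELEGATION_FIELDS) {
    if (client && client[field] !== undefined) {
      record[field] = client[field];
    } else {
      record[field] = null; // keep the key so the gap is visible in the log
      record.missing.push(field);
    }
  }
  return record;
}

// Today's client: only requestUserInteraction() is available.
const todaysClient = { requestUserInteraction: async () => {} };
const record = buildAuditRecord("getBalance", todaysClient, "2026-01-01T00:00:00Z");

console.log(record.missing.length); // 4
console.log(record.agentId); // null
```

With the current client, the record contains a tool name and a timestamp and nothing else that identifies the caller.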


Two deeper gaps from the community discussion

The issue thread surfaced two additional gaps worth highlighting.

The permissions vocabulary problem. Who defines what access levels mean? If one publisher defines finance:read and another defines account:view, they might mean the same thing. Without shared semantics, permissions fragment by publisher. Every site invents its own vocabulary. Agents can’t reason consistently about what they’re allowed to do across sites.
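One way to picture shared semantics is a mapping from publisher-specific scope names to a common vocabulary. The sketch below is purely illustrative: the canonical names, the mapping table, and the helper functions are invented for this example and come from no spec.

```javascript
// Hypothetical shared vocabulary: publisher scope name -> canonical capability.
// Neither the mapping nor the canonical names come from any spec.
const CANONICAL = new Map([
  ["finance:read", "account.balance.read"],
  ["account:view", "account.balance.read"],
  ["finance:write", "account.transfer.create"],
]);

function canonicalScope(scope) {
  return CANONICAL.has(scope) ? CANONICAL.get(scope) : null;
}

// Two publisher scopes are equivalent only if both map to the same known capability.
function sameCapability(a, b) {
  const ca = canonicalScope(a);
  return ca !== null && ca === canonicalScope(b);
}

console.log(sameCapability("finance:read", "account:view")); // true
console.log(sameCapability("finance:read", "finance:write")); // false
console.log(sameCapability("finance:read", "acme:peek")); // false (unknown scope)
```

Without some agreed table like this, an agent has to treat every unknown scope as unrelated to every other, which is the fragmentation described above.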

Capability semantics vs. identity. Recording who called a tool without recording what the agent believed the tool would do is forensically incomplete. An agent might call transferFunds believing it was checking a balance, because the tool’s description was misleading or the agent misinterpreted it. An audit trail that shows “Agent X called transferFunds” but not “Agent X believed it was calling a balance-check function” misses the intent layer. For incident response, both the action and the agent’s understanding of the action matter.
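As an illustration of the intent layer, an audit entry could store the description the agent was shown and the agent's own statement of what it was trying to do, next to the action itself. The field names and the buildIntentRecord helper are assumptions for this sketch; the draft defines nothing of the kind.

```javascript
// Hypothetical audit entry that captures intent alongside action.
// `descriptionSeen` is the tool description as presented to the agent at call time;
// `statedIntent` is whatever the agent reported it was trying to do (may be absent).
function buildIntentRecord({ agentId, toolName, descriptionSeen, statedIntent }) {
  return Object.freeze({
    agentId,
    action: toolName,
    descriptionSeen,
    statedIntent: statedIntent ?? null,
    intentRecorded: typeof statedIntent === "string" && statedIntent.length > 0,
  });
}

const entry = buildIntentRecord({
  agentId: "agent-x",
  toolName: "transferFunds",
  descriptionSeen: "Initiates wire transfer",
  statedIntent: "check the account balance",
});

console.log(entry.action); // transferFunds
console.log(entry.statedIntent); // check the account balance
```

An investigator reading this entry can see the mismatch between the action and the stated intent directly, instead of inferring it after the fact.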

These are the kinds of gaps that show up in production when real money moves through real tools.


Why the browser layer is the right enforcement point

Server-side MCP leaves trust enforcement to each tool author. Every MCP server independently decides whether to check identity, how to check it, and what to do when identity is missing. The result is predictable: most don’t check. The authentication gap in AI systems exists precisely because enforcement is optional and distributed.

The browser is different. The browser is already a trust mediator. It enforces CORS – a page’s scripts can’t read cross-origin responses unless the other origin explicitly allows it. It enforces CSP – scripts can’t load from unauthorized sources. It enforces the Permissions API – sites can’t access the camera or microphone without user consent. These aren’t suggestions. They’re platform-level enforcement that no website can bypass.

WebMCP has the same architectural advantage. The browser sits between every agent and every tool on every page. If the spec defines identity primitives, the browser can enforce them the way it enforces every other security boundary. Agent identity verification becomes a platform guarantee, not an application-level hope.

This is the key insight: enforcement at the platform, not at the app. Server-side MCP missed this. WebMCP doesn’t have to.


The two-layer picture

The agentic tool ecosystem now has two layers:

  • Anthropic’s MCP – server-side. Backend services, databases, APIs. Agents call tools over HTTP or stdio. Identity and governance are handled (or not) by each tool server independently.

  • WebMCP – client-side. Browser-native. Agents interact with live web pages through structured tools. The browser mediates every call.

Both layers need identity, scoped permissions, and audit context. Neither has them fully solved. But the browser layer has a structural advantage: centralized enforcement. A fix in the browser runtime applies to every agent on every page. A fix in server-side MCP requires every tool author to adopt it independently.

The agentic control plane pattern applies to both layers. On the server side, it’s a gateway that sits between agents and backends, enforcing identity, policy, and audit on every tool call – and this already exists in production. On the browser side, it’s the browser runtime itself – if the spec includes the primitives. The browser is the control plane for client-side agent interactions. It just needs the identity and governance vocabulary to act like one.


What the spec needs

Three concrete additions would close the gaps identified in issue #96:

1. Agent identity in ModelContextClient. The execute() callback should receive an agentId – a browser-verified identifier for the calling agent. Chrome’s built-in agent, a third-party extension, a web-based agent framework: each gets a distinct identifier. Tools can make per-caller trust decisions. This doesn’t require solving the full identity problem. It requires the browser to tell the tool which agent is invoking it.

2. Enforceable scoped permissions. Tools should be able to declare requiredScopes – not hints, not advisories, but enforced requirements that the browser checks before dispatching the call. An agent that hasn’t been granted finance:write cannot invoke a tool that requires it. The browser enforces this the way it enforces Permissions-Policy.

3. Correlation context. Every tool invocation should carry a correlationId that links the call to the broader agent session. Tools can log it. Audit systems can reconstruct the chain: user initiated a session, agent made these calls, each tool produced these results. Without correlation IDs, you have isolated log entries with no causal chain.

None of these require solving identity for the entire web. They require the browser to expose, in a structured way, the information it already has about who’s calling and what they’ve been granted.
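The correlation idea in item 3 can be sketched as a grouping step over tool logs. This assumes each log entry carries the proposed correlationId plus an ordering field; the entry shape, the `at` field, and the reconstructSessions helper are hypothetical.

```javascript
// Groups log entries by the (proposed, hypothetical) correlationId and orders each
// group by time, so one agent session reads as a causal chain of tool calls.
function reconstructSessions(entries) {
  const sessions = new Map();
  for (const entry of entries) {
    // Entries without an ID are what today's logs look like: isolated events.
    const key = entry.correlationId ?? "uncorrelated";
    if (!sessions.has(key)) sessions.set(key, []);
    sessions.get(key).push(entry);
  }
  for (const calls of sessions.values()) {
    calls.sort((a, b) => a.at - b.at);
  }
  return sessions;
}

// Illustrative log, deliberately out of order.
const log = [
  { correlationId: "req_abc123", at: 2, tool: "transferFunds" },
  { correlationId: "req_abc123", at: 1, tool: "getBalance" },
  { at: 3, tool: "getBalance" },
];

const sessions = reconstructSessions(log);
console.log(sessions.get("req_abc123").map((e) => e.tool).join(" -> ")); // getBalance -> transferFunds
console.log(sessions.get("uncorrelated").length); // 1
```

The third entry shows the failure mode: without an ID it cannot be attached to any session, so it stays an isolated event.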


Where to start

If you’re building tools that agents will call – server-side or browser-side – the identity question is the same. Can the tool verify who’s calling? Can it make per-caller trust decisions? Can it produce an audit trail that ties actions to identities?

  1. Read the issue. Issue #96 lays out the three gaps with concrete recommendations. If you’re building on WebMCP, this is the spec discussion that affects you.

  2. Understand the pattern. The Agentic Control Plane is the architectural pattern for identity, policy, and audit enforcement between agents and tools. It applies whether the tools are server-side MCP endpoints or browser-side WebMCP registrations.

  3. Assess your exposure. If you’re exposing tools through WebMCP today – even behind a flag – ask the same questions you’d ask about server-side agent identity: does the tool know who’s calling? Can it reject callers it doesn’t trust? Can it log calls in a way that’s forensically complete?

The browser can solve the three-party problem at the platform level. The spec just needs identity primitives in v1 – not v2, not “future work,” not deferred to a later draft. Identity deferred is identity absent. And absent identity in a tool-calling protocol is a gap that every production deployment will feel.


The browser already mediates trust for every other web interaction. Agent-to-tool calls shouldn’t be the exception.
