The agentic web has a security problem, and we first proposed the solution forty-nine years ago.
The Confused Deputy
In 1977, Norm Hardy discovered something unsettling at Tymshare (an early commercial multi-tenant computing environment). “Administrative” functionality to be modified by operators only was contained in the directory (SYSX). As a multi-user environment that billed users for compute cycles, billing records needed to be written somewhere secured, in this case (SYSX)BILL. Additionally, the mainframe’s FORTRAN compiler (located at (SYSX)FORT) collected language statistics (stored in a file (SYSX)STAT) and thus needed write access to SYSX. Finally, the FORTRAN compiler accepted user-specified debugging output files. Someone eventually realized they could specify (SYSX)BILL as a debug output target. The compiler had write access to billing records, even though the user didn’t. It didn’t matter: the compiler wrote there anyway.
Hardy named this the Confused Deputy problem. The compiler served two masters: the user and the system operators, and had no mechanism to distinguish which authority it should use for which operation. It was confused about whose interests it was serving.
More concretely, the compiler's permissions were determined by who it was, not what it was asked to do.
Mark Miller and Dean Tribble coined the term ambient authority to refer to this sort of name-based “broad” access to resources. It turns out there is a lot of ambient authority in modern computing, intentional or otherwise: Windows Print Spooler, anyone?
ACLs Are the Problem
Every mainstream OS uses Access Control Lists. When Word saves your document, the OS asks: "Does this user have permission to write to this file?" If yes, proceed. The operation's legitimacy is irrelevant.
This seems intuitive until you realize: Word can do anything you can do. Send your files to Macedonia. Encrypt your disk. Delete your tax returns. You're permitted to do those things, so it is too. This has major implications for security.
From Morningstar's capabilities explainer:
When you run an application, as far as the OS is concerned, everything the application does is done by you. Another way to put this is, an application you run can do anything you can do.
The consequences of this sort of “identity transference” to an LLM agent is bonkers - it’s an identity that is separate from you, acting on your behalf, with functionally unlimited agency. You cannot trust an LLM to color inside the lines.
Capabilities: Don't Separate Designation from Authority
A capability is a single unforgeable token that both designates a resource and authorizes access to it. You can't name something you can't touch.
In an object-capability (ocap) system, object references are capabilities. You exercise authority by invoking methods. You transfer authority by passing references. You can only obtain capabilities through:
Creation: you made the resource
Transfer: someone gave you the capability
Endowment: you were initialized with it
No ambient authority. No identity questions. No confused deputies.
The Fix for Hardy's Compiler
With capabilities, the compiler would receive:
A capability to the statistics file from the operators
A capability to the output file from the user
Two distinct tokens, two distinct authorities, no confusion. The user cannot forge a capability to the billing file because they don't possess one.
Capability Patterns
Capabilities are also composable. Several patterns emerge:
Revocation: I don't give you my capability. I give you a proxy that forwards to my capability, and I retain a kill switch. When I revoke, your reference becomes useless. Unlike ACL revocation, this is immediate and doesn't require the resource to track who has access.
Attenuation: I have read/write access to a filesystem. I create a wrapper that only exposes read() on a specific subdirectory. I pass that capability to you. You get less than I have, and you can't escalate.
Combination: Combine camera access + GPS + clock + a signing key into a single capability that produces authenticated, timestamped, geotagged images. The recipient can't decompose it to access raw camera frames.
Why Now: The Agentic Confused Deputy
MCP servers, A2A agents, and the emerging agentic stack have reintroduced the confused deputy at scale, with the bonus property of superhuman speed.
The MCP Security Best Practices spec explicitly warns:
Attackers can exploit MCP proxy servers that connect to third-party APIs, creating "confused deputy" vulnerabilities.
An MCP server executing on behalf of a user has access to tools and APIs that the user authorized. A malicious prompt can trick the agent into invoking those tools for unintended purposes—classic confused deputy, now with natural language as the attack vector.
From Red Hat's MCP security analysis:
When an MCP server performs an action triggered by a user's request, there is a risk of a "confused deputy" problem. Ideally, the MCP server should execute this action on behalf of the user and with the user's permission. This is not guaranteed, however, and depends on the implementation.
The Windows MCP Security Architecture acknowledges the same threat model:
Cross-prompt injection can enable attackers to include untrusted prompt data and complete a confused deputy attack. In the case of a simple chat app, the implications of a prompt injection could be a jailbreak or leakage of memory data, with MCP the implications could be full remote code execution.
The Sampling Problem
MCP sampling is particularly dangerous. Per the specification:
Sampling in MCP allows servers to implement agentic behaviors, by enabling LLM calls to occur nested inside other MCP server features.
The spec says there "SHOULD always be a human in the loop." SHOULD, not MUST. Human-in-the-loop is an advisory requirement: implementations (or end users with flags like --dangerously-skip-permissions) can auto-approve, if they so choose.
A spec-compliant client can:
Connect to a remote MCP server
Receive a sampling/createMessage request
Auto-approve (spec allows this)
The LLM generates tool calls to local MCP servers
Local servers execute without human approval
The remote server never directly touched your local services. The client did, on behalf of an LLM responding to a prompt crafted by the remote server: a classic confused deputy problem.
Tool Authority Accumulation
When an MCP client connects to multiple servers, it accumulates ambient authority from all of them. The client has:
File system access (from the filesystem server)
Database queries (from the postgres server)
Shell execution (from the shell server)
API credentials (from various integration servers)
Every tool from every connected server is available to the LLM. A prompt injection in any data source can trigger any tool. There's no mechanism to say "this request came from the email server, so it should only have access to email tools."
The OAuth Illusion
Current MCP authorization uses OAuth 2.1. The authorization spec handles authentication—proving who the user is—but not capability delegation—controlling what authority flows to which operation.
OAuth answers "is this user allowed to use this server?" It cannot answer "should this specific operation, triggered by this specific prompt, have access to this specific resource?"
Proposal: A2A Capability Negotiation
This section proposes capability negotiation for A2A, enabling agents to request and receive minimal authority for specific tasks. Rather than accumulating ambient permissions from all connected servers, agents request only the capabilities needed for the task at hand, limiting blast radius from prompt injection or compromise. We have also created a Github Discussion on the A2A project with a Specification Enhancement Proposal: if this sounds good to you, please help us raise its profile!
The Core Problem
Current LLM clients accumulate ambient authority:
Client connects to: filesystem, database, email, shell
Client authority = filesystem ∪ database ∪ email ∪ shell
Prompt injection in email → can invoke shell
This violates the principle of least privilege: a task requiring email search should not carry shell execution authority.
Design Principle
Agents MUST request capabilities scoped to the task, not the session.
A capability is an unforgeable token that both designates a resource and authorizes a specific operation. Agents cannot invoke operations for which they hold no capability, regardless of what the underlying service permits.
Capability Advertisement
Servers MUST advertise available capability grants in their service description: