---
title: "Anthropic Just Made the Case for MCP Gateways"
description: "Anthropic's containment post never names a gateway. But its core lesson, contain agents at a deterministic boundary and treat every allowlist as a capability grant, is exactly what an MCP gateway gives you off the shelf."
canonicalUrl: "https://zuplo.com/blog/2026/06/04/anthropic-made-the-case-for-mcp-gateways"
pageType: "blog"
date: "2026-06-04"
authors: "nate"
tags: "Model Context Protocol, ai-agents, API Security"
image: "https://zuplo.com/og?text=Anthropic%20Just%20Made%20the%20Case%20for%20MCP%20Gateways"
---
Anthropic just published the best argument for an MCP gateway I have read this
year. It never uses the word gateway and endorses no vendor, us included.

[How we contain Claude across products](https://www.anthropic.com/engineering/how-we-contain-claude)
is a candid writeup of how Anthropic keeps claude.ai, Claude Code, and Cowork
from doing real damage as the models get more capable. It circles one question:
how do you cap the blast radius?

Their answer splits in two. The host plane (files and processes on the machine)
stays the job of sandboxes and VMs. The network, tool, and MCP plane (everything
the agent reaches over the wire) needs a different control. If you run API
infrastructure, that second half should look familiar.

<CalloutAudience
  variant="useIf"
  items={[
    `You run or expose MCP servers to AI agents`,
    `You hand Claude Code, Cursor, or ChatGPT access to internal tools and APIs`,
    `You want to contain what an agent can reach, not just watch what it does`,
  ]}
/>

## Two ways to cap an agent's blast radius

Anthropic frames risk as likelihood times damage. Model training drives down
likelihood; nothing drives down damage except containment. That leaves two
levers: supervise behavior with a human in the loop, or contain capability with
sandboxes, virtual machines, and egress controls. Anthropic chose the second:
"rather than supervising what the agent does, we supervise what it's able to
do."

## Approval fatigue is why supervision can't stand alone

The supervision lever decays. "Our telemetry showed users approved roughly 93%
of permission prompts," Anthropic reports, and more prompts mean less attention
each. They call it approval fatigue, and it "appeared within weeks." Trail of
Bits described the same dynamic:
[human review reduced to a rubber stamp](https://blog.trailofbits.com/2025/04/21/jumping-the-line/).

Their fix was not more prompts but a deterministic default: in the Claude Code
sandbox, reads are allowed, writes are allowed inside the workspace, and network
is denied. That single boundary cut permission prompts by 84%, and it holds
whether or not anyone is watching.
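
To make "deterministic default" concrete, here is a minimal sketch of that kind
of policy as a pure function. The `Action` shape, `decide`, and `normalize` are
hypothetical names of mine, not Claude Code's actual API, and the sketch ignores
symlinks, which a real sandbox has to handle at the OS level.

```typescript
// Hypothetical action shape for illustration; not Claude Code's real API.
type Action =
  | { kind: "read"; path: string }
  | { kind: "write"; path: string }
  | { kind: "network"; host: string };

type Decision = "allow" | "deny";

// Resolve "." and ".." segments in an absolute POSIX path so that
// "/work/../etc/passwd" cannot pass as a workspace path.
function normalize(absPath: string): string {
  const out: string[] = [];
  for (const seg of absPath.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") out.pop();
    else out.push(seg);
  }
  return "/" + out.join("/");
}

// Reads: allowed. Writes: allowed only inside the workspace. Network: denied.
function decide(action: Action, workspace: string): Decision {
  switch (action.kind) {
    case "read":
      return "allow";
    case "write": {
      const root = normalize(workspace);
      const prefix = root === "/" ? "/" : root + "/";
      const target = normalize(action.path);
      return target === root || target.startsWith(prefix) ? "allow" : "deny";
    }
    case "network":
      return "deny";
  }
}
```

The function takes no model output, prompt text, or user attention as input,
which is the point: the same action gets the same answer every time.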

The model layer cannot be that boundary on its own either. On Gray Swan's agent
red-teaming benchmark, Claude Opus 4.7 holds prompt-injection success to about
0.1% on a single attempt but 5 to 6% after a hundred adaptive attempts, and
Claude Code's auto mode still lets roughly 17% of overeager actions through. As
Anthropic concludes, "protection in the model layer will never be 100%
effective, which is why it can't stand alone." Our
[Q1 2026 API and agent security scorecard](/blog/q1-2026-api-agent-security-scorecard)
tracks how those miss rates play out in practice.

## Both of Anthropic's worst incidents were egress failures

The two most instructive failures are the same shape: data leaving through a
permitted path.

In the first, a red team phished an Anthropic employee into launching Claude
Code with a prompt telling Claude to read `~/.aws/credentials` and POST them to
an external endpoint. Across 25 retries it exfiltrated 24 times. The
instructions arrived through the user, so the model layer had nothing anomalous
to catch. Anthropic's takeaway: "the only defense that holds is the
environment," egress controls that block the POST regardless of intent.

In the second, Cowork's egress allowlist passed traffic to `api.anthropic.com`.
A malicious file in the workspace carried hidden instructions and an attacker's
API key. With arbitrary outbound traffic blocked, the instructions got Claude to call
the Files API with that key; the egress proxy saw an allowed domain and waved it
through to the attacker's account. "The sandbox worked perfectly, and yet the
data was exfiltrated."

Both are the egress leg of Simon Willison's
[lethal trifecta](https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/):
private data, untrusted content, and a way to talk to the outside world. With
all three present, the only reliable lever is cutting the third. As Anthropic
puts it, "the deterministic boundary is what gets hit when everything
probabilistic misses."

## An allowlist is a capability grant, not a filter

This is the key point. After the Cowork incident Anthropic reframed the problem:
"the allowlist is not a destination filter, it is a capability grant." Allowing
`api.anthropic.com` allowed every function reachable there, including file
uploads to arbitrary accounts.

That reframing changes what you build. "This agent may reach this server" is a
blunt grant; the control you want is per request and identity-aware: this agent,
acting for this user, may call this specific tool, right now.

Anthropic asks the same question from the other side: should an agent have its
own identity or inherit the user's? They say "the answer may be a blend of the
two." A boundary that knows who is calling, and on whose behalf, is where you
enforce that blend.

<CalloutTip variant="mistake">
  Treat a domain allowlist as a destination filter and you have quietly granted
  every capability reachable at that domain. Scope by capability and identity,
  not by hostname.
</CalloutTip>
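
A small sketch of the difference, under my own assumptions: the `ToolCall` and
`Grant` shapes, the agent and user names, and the tool names are all
hypothetical, and `api.example.com` stands in for any allowlisted domain.

```typescript
// Hypothetical shape of a request as seen at the boundary.
interface ToolCall {
  agentId: string;
  onBehalfOf: string; // the user the agent is acting for
  host: string;
  tool: string;
}

// Destination filter: only asks "is this host on the list?"
const allowedHosts = new Set<string>(["api.example.com"]);

function hostAllowed(call: ToolCall): boolean {
  return allowedHosts.has(call.host);
}

// Capability grant: names the agent, the user, the host, and the tools.
interface Grant {
  agentId: string;
  user: string;
  host: string;
  tools: ReadonlySet<string>;
}

const grants: Grant[] = [
  {
    agentId: "support-bot",
    user: "alice",
    host: "api.example.com",
    tools: new Set<string>(["tickets.read"]),
  },
];

function capabilityAllowed(call: ToolCall): boolean {
  return grants.some(
    (g) =>
      g.agentId === call.agentId &&
      g.user === call.onBehalfOf &&
      g.host === call.host &&
      g.tools.has(call.tool),
  );
}

// Same host, different capability: an upload nobody granted.
const upload: ToolCall = {
  agentId: "support-bot",
  onBehalfOf: "alice",
  host: "api.example.com",
  tool: "files.upload",
};
```

Here `hostAllowed(upload)` passes because the hostname matches, while
`capabilityAllowed(upload)` refuses because no grant names `files.upload`. That
is the Cowork failure in miniature.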

For MCP the boundary has two concrete jobs, both deterministic. The first
curates which tools a server exposes: a gateway can publish a read-only view of
an upstream, say the Stripe server with the destructive tools filtered out. The
second binds every token to the server it was minted for. With
[RFC 8707](https://datatracker.ietf.org/doc/html/rfc8707) resource indicators, a
token issued for one virtual server is rejected at another, so a compromised
server cannot replay a user's token elsewhere. That closes the confused-deputy
attack the MCP spec calls out.
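
Both jobs reduce to short deterministic checks. The sketch below is
illustrative only: the tool names are made up rather than taken from any real
Stripe MCP server, the virtual-server URLs are placeholders, and it assumes the
authorization server writes the RFC 8707 resource into the token's `aud` claim
and that the signature has already been verified elsewhere.

```typescript
interface Tool {
  name: string;
  description: string;
}

// Job 1: publish only the curated subset of an upstream's tools.
function curateTools(upstream: Tool[], exposed: ReadonlySet<string>): Tool[] {
  return upstream.filter((t) => exposed.has(t.name));
}

// Hiding a tool from the listing is not enough; calls are checked too,
// since a client can still ask for a tool by name.
function callAllowed(toolName: string, exposed: ReadonlySet<string>): boolean {
  return exposed.has(toolName);
}

// Claims from an access token whose signature was already verified.
interface TokenClaims {
  sub: string;
  aud: string | string[]; // a JWT audience may be one value or a list
  exp: number; // expiry, in seconds since the epoch
}

// Job 2: accept a token only at the server it was issued for.
function tokenAcceptedAt(
  claims: TokenClaims,
  thisServer: string,
  nowSeconds: number,
): boolean {
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  const audienceMatches = audiences.some((a) => a === thisServer);
  return claims.exp > nowSeconds && audienceMatches;
}

// Hypothetical upstream tools and virtual-server URLs.
const upstreamTools: Tool[] = [
  { name: "list_charges", description: "Read charges" },
  { name: "create_refund", description: "Issue a refund" },
];
const readOnly = new Set<string>(["list_charges"]);
const billingServer = "https://gateway.example.com/mcp/billing";
const docsServer = "https://gateway.example.com/mcp/docs";
```

With these inputs, `curateTools(upstreamTools, readOnly)` returns only
`list_charges`, and a token whose `aud` is `billingServer` fails
`tokenAcceptedAt` when presented at `docsServer`.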

## Don't hand-roll the proxy

The most uncomfortable lesson is for anyone tempted to build their own boundary.
The battle-tested isolation primitives held: hypervisors, syscall filters, and
container runtimes like gVisor "survived more adversarial attention than
anything you'll build." The custom glue around them is what failed. Anthropic
says it twice, "the weakest layer is the one you built yourself" and "be wary of
custom components."

For the network and MCP plane, the custom glue is the OAuth server, token
validation, resource binding, and credential brokering. Getting MCP
authorization right means stacking overlapping specs (OAuth 2.1, PKCE, dynamic
client registration, protected-resource metadata, and resource indicators) on
top of a rule that is easy to violate: an MCP server must not accept a token
that was not issued for it. Every re-implementation is a fresh chance to get it
wrong. Implement it once, not once per server.

One honest caveat. Anthropic's fix for the approved-domain exfil was a proxy
inside the VM, "because only the VM knows provenance." An external gateway lacks
that view, so it closes the gap from the other end: for any call routing through
it, the gateway brokers the credential, so what reaches the upstream is its own
scoped token, not a key a poisoned file slipped in.
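
Credential brokering is simple to state in code. This is a sketch under my own
assumptions, not Zuplo's or Anthropic's implementation: `brokerHeaders` and the
header list are hypothetical, and it only covers headers, so a key smuggled into
a request body or query string needs its own handling.

```typescript
type HeaderMap = Record<string, string>;

// Headers that could carry a caller-supplied credential (illustrative list).
const CREDENTIAL_HEADERS: string[] = ["authorization", "x-api-key", "cookie"];

// Drop whatever credential arrived with the request and attach the
// gateway's own scoped token for the upstream instead.
function brokerHeaders(inbound: HeaderMap, gatewayToken: string): HeaderMap {
  const outbound: HeaderMap = {};
  for (const name of Object.keys(inbound)) {
    const isCredential = CREDENTIAL_HEADERS.indexOf(name.toLowerCase()) !== -1;
    if (!isCredential) {
      outbound[name] = inbound[name];
    }
  }
  outbound["authorization"] = `Bearer ${gatewayToken}`;
  return outbound;
}

// A request carrying a key that a poisoned file slipped in.
const poisoned: HeaderMap = {
  "X-Api-Key": "attacker-supplied-key",
  "Content-Type": "application/json",
};
const forwarded = brokerHeaders(poisoned, "gateway-scoped-token");
```

After brokering, `forwarded` keeps `Content-Type`, carries the gateway's bearer
token, and no longer contains the attacker's `X-Api-Key`.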

## Where a gateway ends and the sandbox begins

Be explicit about scope so none of this reads as overreach. A gateway owns the
network, tool, MCP, and API plane: who is calling, what they may invoke, which
token is valid, what credential reaches the upstream, and a per-call audit of
it. It does not own the host plane. Process sandboxing, VM isolation, filesystem
boundaries, and keeping `~/.aws` out of reach are the sandbox's job. Anthropic
built gVisor containers and local VMs because they ship agents to millions of
people. You probably will not, and you probably will not grant an agent that
much local access either.

![The two planes of agent containment: a sandbox or VM contains the host plane (filesystem, processes, local credentials), while an MCP gateway contains the network, tool, and MCP plane (OAuth and audience-bound tokens, tool curation, no token passthrough, per-call audit) between the agent and upstream MCP servers and APIs.](/blog-images/2026-06-04-anthropic-made-the-case-for-mcp-gateways/diagram-1.png)

A second limit is worth naming: a boundary only contains traffic that routes
through it. A gateway cannot stop a developer's editor from connecting straight
to an MCP server and bypassing it, so making the gateway the only path is its
own enforcement decision, made through network policy or by blocking direct MCP
egress. Make that decision first, not last.

Every team can adopt the slice where Anthropic's hardest incidents happened: the
boundary in front of tools, APIs, and MCP servers. You do not need to be
Anthropic to need it. You need it precisely because their report shows what gets
through without it.

## What Zuplo puts at the MCP boundary

We built this boundary for MCP, and we use it ourselves: our own team reaches
the third-party MCP servers it depends on (Linear, Stripe, Notion, ClickHouse)
through this gateway, the read-only Stripe view above included. The
[Zuplo MCP Gateway](/blog/introducing-zuplo-mcp-gateway), in public beta today,
maps onto Anthropic's argument point for point:

- **One OAuth-protected URL** in front of every MCP server your agents touch,
  yours and third-party, instead of a long-lived token pasted into each editor.
- **Tool curation per route.** Publish a read-only or hand-picked subset of an
  upstream's tools, the capability grant made explicit rather than inherited.
- **A full OAuth authorization server to spec**, with dynamic client
  registration, PKCE, and protected-resource metadata, plus RFC 8707 tokens
  bound to one virtual server so they cannot be replayed at another.
- **No token passthrough.** The gateway holds the upstream credentials,
  encrypts them at rest, and attaches them server-side, so the agent never sees
  them.
- **Programmable policies that inspect a tool response** before it re-enters the
  model's context, so a poisoned payload can be redacted or blocked on the way
  back.

That last one mirrors Anthropic: in Claude Code and Cowork, "tool calls route
through proxies that enforce network and file policy and can inspect return
values before they enter the model's context," and the classifier doing that
inspection "can be a small, fast model." It needs the same honesty, though: the
deterministic parts (authentication, token scope, resource binding, and tool
curation) are the guarantee, while content inspection for prompt injection is
defense in depth, and a detection rate is not a guarantee.

That is also why
[injection in MCP flows backwards](/blog/protect-mcp-against-prompt-injection):
the poisoned payload arrives in a tool response, so the boundary scans what
comes back, not just what goes in.
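
As a rough illustration of scanning the return path, here is a toy response
policy. It is not Zuplo's policy API: `inspectResult`, the simplified
`ToolResult` shape, and the two patterns are assumptions of mine, and pattern
matching like this is exactly the defense-in-depth layer described above, not a
guarantee.

```typescript
// Simplified tool result: a list of text parts.
interface TextPart {
  type: "text";
  text: string;
}
interface ToolResult {
  content: TextPart[];
}

// Illustrative patterns only; a real deployment might use a classifier model.
const SUSPICIOUS: RegExp[] = [
  /ignore (all )?(previous|prior) instructions/i,
  /sk-[A-Za-z0-9_-]{20,}/, // something shaped like an API key
];

// Redact any part that matches before it re-enters the model's context.
function inspectResult(result: ToolResult): {
  result: ToolResult;
  flagged: boolean;
} {
  let flagged = false;
  const content = result.content.map((part): TextPart => {
    if (SUSPICIOUS.some((re) => re.test(part.text))) {
      flagged = true;
      return { type: "text", text: "[redacted by gateway policy]" };
    }
    return part;
  });
  return { result: { content }, flagged };
}
```

A part containing "ignore previous instructions" comes back redacted with
`flagged` set to `true`; ordinary text passes through unchanged. A rephrased
injection would slip past these patterns, which is why the deterministic
controls carry the guarantee.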

![The Zuplo MCP Gateway virtual-server wizard at the Tools step: Curate is selected over Passthrough, and each of the upstream's tools, prompts, and resources sits behind a checkbox, so the operator exposes only the safe subset and switches the destructive tools off.](/blog-images/introducing-zuplo-mcp-gateway/wizard-tool-curator.png)

<CalloutDoc
  title="MCP Gateway Quickstart"
  description="Build a virtual MCP server in the browser: pick an upstream, wire up OAuth, curate the tools, and point an agent at it."
  href="https://zuplo.com/docs/mcp-gateway/quickstart"
  icon="book"
/>

Anthropic's report is the rare security writeup that argues for an entire
category without selling anything. Take it at face value: contain at a
deterministic boundary, treat every allowlist as a capability grant, and don't
hand-roll the proxy.

For the MCP plane that boundary already exists off the shelf, so your
engineering can go to the host-plane problems that are genuinely yours.

The Zuplo MCP Gateway is in public beta and available now on every plan,
including the free one.
[Spin up a free project](https://portal.zuplo.com/signup?utm_source=zuplo-blog&utm_medium=web&utm_campaign=mcp-gateway)
and point your first agent at a virtual MCP server today.