Every major AI application — Claude, ChatGPT, Cursor, VS Code Copilot — now supports the Model Context Protocol. If you are building software that AI agents need to interact with, understanding MCP servers is no longer optional. They are the mechanism through which your APIs, databases, and internal tools become accessible to AI.
This guide explains what an MCP server is, how it fits into the broader MCP architecture, what capabilities it exposes, and how to get one running in production. Whether you are evaluating MCP for the first time or planning a production deployment, this is the foundational reference you need.
What is the Model Context Protocol (MCP)?
The Model Context Protocol (MCP) is an open standard for connecting AI applications to external data sources, tools, and workflows. Introduced by Anthropic in November 2024, MCP standardizes how AI systems discover and invoke external capabilities — much like how USB-C standardizes how devices connect to peripherals.
Before MCP, every integration between an AI application and an external system required custom code. If you wanted Claude to query your database and read your file system and call your internal API, you needed three separate integrations with three different interfaces. MCP replaces that fragmentation with a single, standardized protocol that any AI application can use to connect to any compatible server.
MCP is now supported across a wide range of clients, including Claude, ChatGPT, VS Code, Cursor, and many more. The protocol is open source and governed by a public specification, which means it is not tied to any single vendor.
MCP server vs. MCP protocol
This is a distinction that trips up many developers: MCP and MCP server are not synonyms.
MCP is the protocol specification. It defines the message format (JSON-RPC 2.0), the lifecycle (initialization, capability negotiation, termination), and the types of capabilities that can be exchanged (tools, resources, prompts). MCP is a set of rules — it does not run anywhere or do anything on its own.
An MCP server is a program that implements the protocol. It is the running process that exposes tools, resources, and prompts to AI applications. When an AI assistant like Claude invokes a tool to query your database or call your API, it is communicating with an MCP server that has been configured to expose those capabilities.
Think of it this way: HTTP is a protocol, and Nginx is a server that implements it. MCP is a protocol, and an MCP server is the implementation that makes your data and tools accessible to AI clients.
The four MCP architecture components
MCP follows a client-server architecture with four distinct components. Each plays a specific role in connecting AI applications to external capabilities.
Host
The host is the AI application that the user interacts with directly. Claude Desktop, ChatGPT, Cursor, and VS Code are all examples of MCP hosts. The host is responsible for managing one or more MCP clients and coordinating their interactions with the user and the underlying language model.
When you configure an MCP server in Claude Desktop’s settings, Claude Desktop is acting as the host. It creates a dedicated client for each server connection and routes tool calls between the language model and the appropriate server.
Client
The MCP client is a component within the host that maintains a dedicated connection to a single MCP server. Each client handles the protocol-level details: initialization, capability negotiation, sending requests, and processing responses.
A host creates one client per server connection. If Claude Desktop is connected to three MCP servers (say, a filesystem server, a database server, and an API server), it maintains three separate MCP clients — one for each. This 1:1 pairing keeps each server connection isolated and its security boundary clear.
Server
The MCP server is the program that exposes capabilities. It receives JSON-RPC requests from clients, executes the requested operations, and returns results. An MCP server can expose tools (functions the AI can call), resources (data the application can read), and prompts (reusable interaction templates).
MCP servers can run locally on the same machine as the host or remotely over HTTP. A local server might provide access to your filesystem. A remote server might expose your company’s REST API as a set of MCP tools that any AI application can discover and invoke.
Transport
The transport layer defines how clients and servers communicate. MCP supports two transport mechanisms:
- stdio (Standard I/O): The client launches the server as a local process and communicates through standard input and output streams. This is the simplest transport — no networking, no configuration, no latency. It is ideal for local tools like file system access or local database queries.
- Streamable HTTP: The client sends JSON-RPC messages to the server over HTTP POST requests. The server can optionally use Server-Sent Events (SSE) for streaming responses. This transport enables remote server communication and supports standard HTTP authentication methods like OAuth, API keys, and bearer tokens.
The transport layer is abstracted from the protocol layer. Whether a client connects via stdio or Streamable HTTP, the JSON-RPC messages are identical. This means you can develop a server locally using stdio and deploy it remotely over HTTP without changing your tool definitions or business logic.
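As a concrete illustration, a stdio server is configured in the host by telling it what command to launch. In Claude Desktop that configuration lives in claude_desktop_config.json and looks roughly like this (the server package and path are illustrative):

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/projects"]
    }
  }
}
```

The host launches the process and exchanges JSON-RPC messages over its stdin and stdout. Moving the same server to Streamable HTTP changes only this configuration entry, not the tool definitions.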
Note: The MCP specification originally defined an HTTP+SSE transport, but this was deprecated in favor of Streamable HTTP in the March 2025 specification revision (2025-03-26) and fully removed in the June 2025 revision (2025-06-18). Streamable HTTP is more flexible, supports standard HTTP infrastructure (load balancers, CDNs, proxies), and is the recommended transport for remote MCP servers.
The three MCP capabilities
An MCP server can expose three types of capabilities to clients. Each type has a different control model and authorization boundary, which matters when you are designing security policies.
Tools (model-controlled)
Tools are executable functions that the language model can invoke autonomously during a conversation. When an AI assistant decides it needs to look up weather data, query a database, or create a GitHub issue, it calls a tool.
Tools are the most powerful and most security-sensitive capability type. The language model decides when to call a tool and what arguments to pass. The user may not explicitly request the action — the model infers that it is needed to complete the task.
Each tool has a name, a description, and a JSON Schema that defines its input parameters.
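For example, a weather-lookup tool might be defined like this (the tool itself is illustrative; name, description, and inputSchema are the fields the protocol defines):

```json
{
  "name": "get_weather",
  "description": "Get the current weather for a city",
  "inputSchema": {
    "type": "object",
    "properties": {
      "city": { "type": "string", "description": "City name, e.g. Berlin" }
    },
    "required": ["city"]
  }
}
```

The description and parameter schema are what the language model reads when deciding whether and how to call the tool, so they should be written as carefully as user-facing documentation.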
Because tools can modify state (create records, trigger workflows, execute code), they require the strictest authorization controls. You should scope tool access carefully: different users or teams should have access to different sets of tools, and sensitive tools should require explicit user confirmation before execution.
Resources (application-controlled)
Resources are read-only data sources that the application can pull into the conversation as context. Unlike tools, resources are not invoked autonomously by the language model — the application (or the user) decides which resources to include.
Examples of resources include file contents, database schemas, API documentation, and configuration files. A resource provides context that helps the language model give better answers without giving it the ability to take action.
Resources carry lower security risk than tools because they are read-only and application-controlled. However, they still need governance — a resource that exposes customer data or internal credentials should not be accessible to unauthorized users.
Prompts (user-controlled)
Prompts are reusable interaction templates that users explicitly select. They provide pre-structured instructions for the language model, often including few-shot examples or specific formatting requirements.
For example, an MCP server for a database might expose a prompt called sql-query-helper that includes examples of well-formed SQL queries and instructions for the model to follow. The user selects this prompt to get better-quality database interactions.
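In a prompts/list response, such a prompt definition might look like this (the argument is illustrative; name, description, and arguments are the fields the protocol defines):

```json
{
  "name": "sql-query-helper",
  "description": "Guides the model toward well-formed, safe SQL queries",
  "arguments": [
    { "name": "table", "description": "Table to focus the examples on", "required": false }
  ]
}
```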
Prompts have the simplest control model: the user explicitly chooses them. There is no autonomous invocation and no risk of unintended actions.
Why three capability types matter
The separation into tools, resources, and prompts is not just organizational — it maps directly to authorization boundaries:
- Tools require the most restrictive policies (rate limiting, per-tool permissions, audit logging) because they are model-controlled and can modify state.
- Resources need access control but carry less risk because they are read-only.
- Prompts are the safest because they are user-initiated and do not directly interact with external systems.
When you design your MCP server’s security model, this three-tier structure tells you where to focus your effort. Most security incidents with MCP will involve tool invocations, not prompt selections.
How JSON-RPC requests flow through an MCP server
Every MCP interaction follows the same fundamental pattern, regardless of which transport is being used. Understanding this flow is essential for debugging, monitoring, and securing your MCP deployments.
1. Initialization and capability negotiation. When a client first connects to a server, it sends an initialize request. The server responds with its supported capabilities (which tools, resources, and prompts it exposes). The client then sends a notifications/initialized notification to confirm the connection is ready. This handshake ensures both sides know what the other supports before any work begins.
2. Discovery. The client sends tools/list, resources/list, or prompts/list requests to discover what the server offers. These listings can be dynamic — a server might expose different tools based on the authenticated user's permissions.
3. Execution. When the language model decides to use a tool, the host sends a tools/call request to the appropriate client, which forwards it to the server. The request carries the tool name and arguments in a standard JSON-RPC 2.0 message.
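For a hypothetical get_weather tool, such a request might look like this:

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": { "city": "Berlin" }
  }
}
```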
4. Response. The server executes the tool and returns a structured result containing the tool's output.
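A successful response carries the tool output in a content array (the text itself is illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "result": {
    "content": [
      { "type": "text", "text": "18°C, partly cloudy" }
    ],
    "isError": false
  }
}
```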
5. Notifications. Servers can send real-time notifications to connected clients — for example, notifications/tools/list_changed when available tools change. This keeps clients synchronized without polling.
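Concretely, the handshake in step 1 begins with an initialize request along these lines (the protocol version and client info are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2025-06-18",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "1.0.0" }
  }
}
```

After the server replies with its own capabilities, the client confirms with a bare notification, which, like all notifications, carries no id and expects no response:

```json
{ "jsonrpc": "2.0", "method": "notifications/initialized" }
```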
A critical architectural point: the server never communicates directly with the language model. The client mediates all communication. The model tells the host it wants to call a tool, the host tells the client, the client sends the JSON-RPC request to the server, and the response travels back through the same chain. This separation keeps the protocol clean and the security boundaries clear.
Authorization and security for MCP servers
MCP was originally designed for local tool use where the AI assistant and the tools shared the same machine and the same trust boundary. That model breaks down the moment you deploy an MCP server remotely. A remote MCP server is, architecturally, an API — it accepts requests over HTTP, processes them, and returns results. That means it faces the same security challenges as any API:
- Authentication: Who is making this request? Is it a legitimate user or agent?
- Authorization: Is this caller allowed to invoke this specific tool with these arguments?
- Rate limiting: How many requests can this caller make? AI agents can generate hundreds of tool calls per minute.
- Audit logging: What was called, by whom, with what parameters, and what was the result?
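The rate-limiting concern above can be sketched as a per-caller token bucket. Nothing here is MCP-specific — the assumption is simply that each tools/call request arrives with an authenticated caller ID you can key on:

```python
import time

class TokenBucket:
    """Per-caller rate limiter: each caller may make `rate` tool calls
    per second, with bursts of up to `capacity` calls."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}  # caller -> (tokens remaining, last refill timestamp)

    def allow(self, caller, now=None):
        """Return True if this caller's request is within its budget."""
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(caller, (float(self.capacity), now))
        # Refill tokens for the time elapsed since the last request.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self.buckets[caller] = (tokens - 1.0, now)
            return True
        self.buckets[caller] = (tokens, now)
        return False
```

A gateway would check allow() before dispatching each tools/call and return a JSON-RPC error when the budget is exhausted; agents that burst hundreds of calls per minute are throttled without affecting other callers.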
Each capability type maps to different security requirements. Tools need the strictest controls because they are model-invoked and can modify state. Resources need access control to prevent data exposure. Prompts are the safest but should still be scoped to appropriate users.
Beyond these standard API security concerns, MCP servers face a unique threat: prompt injection. An attacker can embed malicious instructions in data that the AI model processes, tricking it into calling tools with unintended parameters. For example, a carefully crafted customer support ticket could instruct the model to delete records or exfiltrate data. Prompt injection defense — scanning tool inputs for suspicious patterns, sanitizing outputs, enforcing parameter constraints — is critical for any production MCP deployment.
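A first line of defense can be sketched as a screen over incoming tool arguments. The patterns below are illustrative only — a production deployment would pair a deny-list like this with a dedicated classifier and output sanitization:

```python
import re

# Illustrative deny-list of phrasings commonly used to redirect a model.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous |prior )?(instructions|rules)", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"disregard (the )?system prompt", re.I),
]

def flag_injection(text: str) -> bool:
    """Return True if the text matches a known injection phrasing."""
    return any(p.search(text) for p in SUSPICIOUS_PATTERNS)

def screen_tool_arguments(arguments: dict) -> list:
    """Scan string-valued tool arguments and return the keys that look
    suspicious, so the caller can reject the request or escalate."""
    return [key for key, value in arguments.items()
            if isinstance(value, str) and flag_injection(value)]
```

The same screen applies in the other direction: data returned by a tool (a support ticket, an email body) should be scanned before it is placed into the model's context, since that is where injected instructions usually live.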
For a deeper dive into MCP security patterns, see Securing MCP Servers: Authentication, Authorization, and Access Control.
Common use cases for MCP servers
MCP servers are being deployed across a wide range of industries and use cases. Here are the patterns that are emerging most quickly.
Code assistants and developer tools
AI-powered IDEs like Cursor and VS Code use MCP to connect to development infrastructure: version control (GitHub, GitLab), CI/CD systems, error tracking (Sentry), and project management tools. An MCP server for GitHub might expose tools like create_pull_request, list_issues, and review_code — allowing the AI to take actions directly in your development workflow.
Database and analytics queries
MCP servers that wrap database connections let AI assistants run queries, explore schemas, and generate reports without requiring users to write SQL. The server exposes the database as a set of tools (for queries) and resources (for schema information), with the MCP layer handling authentication and access control.
Internal tooling and enterprise workflows
Companies are using MCP servers to expose internal systems — CRMs, ticketing systems, ERPs, billing platforms — to AI assistants. Instead of building custom integrations for each AI tool, a single MCP server makes the internal system accessible to any MCP-compatible client. A customer support agent using Claude can look up account details, create tickets, and process refunds through MCP tools connected to backend systems.
Agentic workflows and multi-step automation
As AI agents become more autonomous, MCP provides the standard interface for agent-to-tool communication. An agent planning a multi-step task — “analyze last quarter’s sales, identify underperforming regions, draft a report, and email it to the team” — can discover and invoke the right tools at each step through MCP, without needing hardcoded integrations.
How to get started with MCP servers
There are three main paths to deploying MCP servers, depending on your starting point and requirements.
Build from scratch with the MCP SDK
The MCP specification provides SDKs for TypeScript, Python, and other languages. You can build an MCP server from scratch by defining your tools, resources, and prompts and wiring them up to your backend logic. This gives you full control but requires you to handle transport, authentication, rate limiting, observability, and all the other production concerns yourself.
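To make the protocol concrete, here is a deliberately minimal, SDK-free sketch of a stdio server that dispatches the three core methods. Everything in it (the echo tool, the server name) is illustrative, and it omits much of what the real SDKs handle for you — capability negotiation details, schema validation, and error handling among them:

```python
import json
import sys

# One illustrative tool; a real server would register many.
TOOLS = [{
    "name": "echo",
    "description": "Echo a message back",
    "inputSchema": {
        "type": "object",
        "properties": {"message": {"type": "string"}},
        "required": ["message"],
    },
}]

def handle(req: dict) -> dict:
    """Dispatch a single JSON-RPC request to the matching handler."""
    method = req.get("method")
    if method == "initialize":
        result = {"protocolVersion": "2025-06-18",
                  "capabilities": {"tools": {}},
                  "serverInfo": {"name": "demo-server", "version": "0.1.0"}}
    elif method == "tools/list":
        result = {"tools": TOOLS}
    elif method == "tools/call" and req["params"]["name"] == "echo":
        message = req["params"]["arguments"]["message"]
        result = {"content": [{"type": "text", "text": message}],
                  "isError": False}
    else:
        return {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32601, "message": f"unknown method: {method}"}}
    return {"jsonrpc": "2.0", "id": req.get("id"), "result": result}

def serve_stdio():
    """stdio transport: newline-delimited JSON-RPC on stdin/stdout."""
    for line in sys.stdin:
        req = json.loads(line)
        if "id" in req:  # notifications carry no id and get no response
            sys.stdout.write(json.dumps(handle(req)) + "\n")
            sys.stdout.flush()
```

Calling serve_stdio() turns the process into a server a host can launch directly. What this sketch makes visible is how little surface the protocol itself has: the production work is in everything around the dispatch loop.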
This approach makes sense for simple, local-only servers (a filesystem server for your dev environment, for example) or for teams with strong infrastructure capabilities that want maximum flexibility.
Convert your OpenAPI spec to an MCP server
If you already have a REST API with an OpenAPI specification, you can automatically generate an MCP server that exposes your API operations as MCP tools. This is the fastest path from existing API to working MCP server — your tool names, descriptions, and parameter schemas are derived directly from your OpenAPI spec, so there is no schema drift.
Zuplo supports this workflow natively: you mark operations in your OpenAPI spec with MCP metadata, and the MCP Server Handler automatically exposes them as MCP tools. For a step-by-step walkthrough, see Create an MCP Server from Your OpenAPI Spec.
Use a managed MCP gateway
For production deployments with multiple MCP servers, multiple teams, and enterprise requirements, a managed MCP gateway handles the infrastructure so you can focus on defining tools. Zuplo’s MCP Gateway provides a centralized control plane for all your MCP servers — both first-party servers you build and third-party servers from vendors like GitHub, Stripe, or Sentry.
A managed gateway solves the production challenges that compound as you scale:
- Centralized authentication: Unified OAuth and API key management across all MCP servers, with credential brokering so users authenticate once.
- Role-based access control: Define which teams can access which tools. Give your engineering team infrastructure tools and your finance team reporting tools, all from the same governance layer.
- Observability and audit trails: Full logging of every tool invocation across every MCP server, with per-tool analytics and usage metrics.
- Security policies: Prompt injection detection, PII redaction, and rate limiting applied consistently across all MCP traffic.
- Edge deployment: MCP servers running on 300+ global edge locations for low-latency tool invocations, using Streamable HTTP transport.
For a decision framework on whether to build or buy your MCP infrastructure, see Build vs Buy MCP Server Infrastructure.
Where to go from here
This article covers the foundational concepts. Zuplo has a deep library of implementation guides that pick up where this explainer leaves off:
- Connecting MCP Servers to an API Gateway — how to route MCP traffic through a gateway for security and observability
- Best Practices for Mapping REST APIs to MCP Tools — design patterns for converting API endpoints into well-structured MCP tools
- Managing MCP Server Access at Scale — governance patterns for multi-team, multi-server deployments
- Securing MCP Servers: Authentication, Authorization, and Access Control — practical security patterns for production MCP servers
If you want to get hands-on immediately, create an MCP server from your OpenAPI spec in five minutes using Zuplo — no infrastructure to manage, no boilerplate to write.