MCP Architecture & Plugins
How MCP works under the hood, its limitations, and how plugins standardize team tooling.
The Integration Problem MCP Solves
Without MCP, every integration between an AI agent and an external service must be built from scratch for each host environment. An agent built for Cursor that connects to Slack, Gmail, and a database cannot reuse those integrations inside Claude Code — each integration is tightly coupled to the host. MCP introduces an abstraction layer: build the integration once as an MCP server, and any MCP-compatible client can connect to it.
Core Architecture: Hosts, Clients, and Servers
MCP has three components. An MCP host is any AI application that supports the protocol — Claude Desktop, Claude Code, Cursor, or custom agents. An MCP server exposes functionality (tools, resources, prompts) to hosts through a standardized interface. An MCP client lives inside the host and manages communication with a single server. Each client connects to exactly one server; a host needing multiple servers instantiates multiple clients.
| Component | Role | Examples |
|---|---|---|
| MCP Host | AI application that supports MCP | Claude Code, Claude Desktop, Cursor, custom agents |
| MCP Server | Exposes tools, resources, and prompts via the protocol | Weather API server, database server, GitHub server |
| MCP Client | Lives inside the host, manages communication with one server | One client per connected server |
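The one-client-per-server relationship can be made concrete with a small model. This is a minimal sketch, not the MCP SDK: the `Server`, `Client`, and `Host` classes, the `server.tool` naming, and the two example servers are all hypothetical and exist only to show how a host ends up holding one client for each server it connects to.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Stand-in for an MCP server: a name plus the tools it exposes."""
    name: str
    tools: dict = field(default_factory=dict)  # tool name -> callable


@dataclass
class Client:
    """Lives inside the host and talks to exactly one server."""
    server: Server

    def list_tools(self):
        return sorted(self.server.tools)

    def call_tool(self, tool, **kwargs):
        return self.server.tools[tool](**kwargs)


class Host:
    """Stand-in for an MCP host: holds one client per connected server."""

    def __init__(self):
        self.clients = {}  # server name -> Client

    def connect(self, server):
        # Each new server gets its own dedicated client.
        self.clients[server.name] = Client(server)

    def all_tools(self):
        # "server.tool" naming is an assumption made for this illustration.
        return [f"{name}.{tool}"
                for name, client in self.clients.items()
                for tool in client.list_tools()]


# Hypothetical servers with toy tools.
weather = Server("weather", {"forecast": lambda city: f"Sunny in {city}"})
database = Server("database", {"query": lambda sql: [("row", 1)]})

host = Host()
host.connect(weather)
host.connect(database)

print(len(host.clients))  # 2
print(host.all_tools())   # ['weather.forecast', 'database.query']
print(host.clients["weather"].call_tool("forecast", city="Oslo"))  # Sunny in Oslo
```

In a real host the client would speak the protocol over stdio or HTTP rather than calling Python functions directly; only the ownership structure carries over.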
Connecting and Configuring MCP Servers
MCP servers are added via the CLI using claude mcp add. They can run locally (via npx or a Python process) or remotely (via HTTP transport). Each server can be scoped to a project, a user, or a single session.
# Add a remote MCP server (HTTP transport)
claude mcp add --transport http context7 https://mcp.context7.com/mcp
# Add a local MCP server (stdio transport)
claude mcp add context7 -- npx -y @upstash/context7-mcp
# Scope options: --scope project | user | local
MCP Scope Cheat Sheet
| Scope | Config Location | Persists? | Shared with Team? |
|---|---|---|---|
| Project | .mcp.json in project root | Yes | Yes (version-controlled) |
| User | ~/.claude.json | Yes | No |
| Session | --mcp-config flag or added mid-session | No | No |
Use project scope for tools the whole team needs (database MCP, deployment scripts). Use user scope for personal tools (Slack, personal GitHub PAT). Use session scope for temporary experiments or untrusted servers you are evaluating.
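To make the project-scope row concrete, here is a sketch of what a shared `.mcp.json` might contain, built in Python and printed as JSON. The `mcpServers` key and the `type`/`url` and `command`/`args` fields reflect the commonly documented shape of this file, but treat them as assumptions and check them against the current Claude Code documentation. The entry names `context7-remote` and `context7-local` are hypothetical.

```python
import json

# Sketch of a project-scoped config; field names are assumptions to verify.
project_config = {
    "mcpServers": {
        # Remote server over HTTP, using the URL from the CLI example above.
        "context7-remote": {
            "type": "http",
            "url": "https://mcp.context7.com/mcp",
        },
        # Local server launched as a child process and reached over stdio.
        "context7-local": {
            "command": "npx",
            "args": ["-y", "@upstash/context7-mcp"],
        },
    }
}

print(json.dumps(project_config, indent=2))
```

Saving the printed JSON as `.mcp.json` in the project root and committing it is what makes the configuration shared with the team.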
Context Pollution: The Hidden Cost of MCP
Every connected MCP server loads its tool definitions into the context window before any prompt is processed. With multiple servers, tool descriptions alone can consume 20% or more of available context. This is context pollution — tokens wasted on tool definitions the agent never uses for the current task.
- A configuration with 58 tools across GitHub, Slack, and database MCPs can consume 55,000+ tokens before any conversation starts
- An agent tasked with fixing CSS still carries context for database queries, email sending, and PDF parsing
- As context fills with irrelevant definitions, model performance degrades: more hallucinations, worse tool selection, missed instructions
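The arithmetic behind the first bullet is easy to check. The sketch below takes the 58-tool, 55,000-token figure from the list and assumes a 200,000-token context window; that window size is an assumption chosen for illustration, so substitute the limit of the model you actually use.

```python
TOOL_TOKENS = 55_000       # tool-definition tokens, from the example above
TOOL_COUNT = 58            # number of tools, from the example above
CONTEXT_WINDOW = 200_000   # ASSUMPTION: illustrative context window size


def context_share(tool_tokens: int, context_window: int) -> float:
    """Fraction of the context window occupied by tool definitions."""
    return tool_tokens / context_window


per_tool = TOOL_TOKENS / TOOL_COUNT
share = context_share(TOOL_TOKENS, CONTEXT_WINDOW)

print(f"{per_tool:.0f} tokens per tool on average")  # 948 tokens per tool on average
print(f"{share:.1%} of the window used before the first prompt")  # 27.5% ...
```

Under that assumption the tool definitions alone take more than a quarter of the window, which is consistent with the "20% or more" figure.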
Mitigating Context Pollution
The solution is scoped MCP configuration. Instead of one .mcp.json that loads everything, create task-specific config files and launch Claude Code with the --mcp-config flag. Combined with --strict-mcp-config to ignore the default MCP hierarchy, this can reduce MCP-related context from ~20% to ~2.5%.
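One way to produce task-specific configs is to keep a single map of all servers and derive each file from a list of server names. This is a sketch under assumptions: the `tavily` and `postgres` entries use placeholder package names (`example-tavily-mcp`, `example-postgres-mcp`) that do not refer to real packages, and `ALL_SERVERS`, `TASK_PROFILES`, and `build_config` are hypothetical helpers, not part of Claude Code.

```python
import json

# Full server map. The tavily and postgres commands are PLACEHOLDERS.
ALL_SERVERS = {
    "context7": {"type": "http", "url": "https://mcp.context7.com/mcp"},
    "tavily": {"command": "npx", "args": ["-y", "example-tavily-mcp"]},
    "postgres": {"command": "npx", "args": ["-y", "example-postgres-mcp"]},
}

# Which servers each task actually needs.
TASK_PROFILES = {
    "research": ["context7", "tavily"],
    "database": ["postgres"],
}


def build_config(task: str) -> dict:
    """Return a config dict containing only the servers for one task."""
    names = TASK_PROFILES[task]
    return {"mcpServers": {name: ALL_SERVERS[name] for name in names}}


for task in TASK_PROFILES:
    print(f"# .mcp.json.{task}")
    print(json.dumps(build_config(task), indent=2))
```

Each printed document can be saved under the file name shown in its header comment and passed to `--mcp-config`, so a research session never loads the Postgres tool definitions.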
# Create task-specific MCP configs
# .mcp.json.research — only Context7 + Tavily
# .mcp.json.database — only Postgres MCP
# Launch with scoped config
claude --mcp-config .mcp.json.research --strict-mcp-config
# Or manage dynamically within a session via /mcp
Other MCP Limitations
| Limitation | Impact |
|---|---|
| Inefficient execution loops | Each tool call requires a full LLM round-trip. In multi-step tasks, cost and latency grow faster than the step count, because every round-trip reprocesses the accumulated context. |
| JSON vs code mismatch | MCP uses structured tool-call tokens that LLMs were not primarily trained on. Code and text are more natural for models than synthetic tool invocation tokens. |
| Schema ambiguity | JSON schemas describe structure but not usage patterns — when to use a tool, how to combine tools, or what combinations to avoid. |
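The cost of the execution loop in the first row can be approximated with a simple model. The sketch assumes every round-trip resends the entire context so far and that each tool call adds a fixed number of tokens; the 10,000 and 1,000 figures are illustrative assumptions, and the model ignores prompt caching, which reduces the real cost.

```python
def cumulative_input_tokens(base_context: int, tokens_per_step: int, steps: int) -> int:
    """Total input tokens processed when each step resends everything so far."""
    total = 0
    context = base_context
    for _ in range(steps):
        total += context            # the model rereads the whole context
        context += tokens_per_step  # the call and its result are appended
    return total


# ASSUMPTION: 10,000 tokens of starting context, 1,000 tokens added per call.
for steps in (1, 5, 10, 20):
    print(steps, cumulative_input_tokens(10_000, 1_000, steps))
# 1 10000
# 5 60000
# 10 145000
# 20 390000
```

Going from 10 to 20 steps doubles the number of calls but multiplies the tokens processed by about 2.7, so in this model the total grows faster than linearly with task length.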
CodeMode: An Alternative Execution Model
Cloudflare's CodeMode addresses MCP's round-trip problem by converting MCP tool definitions into a TypeScript API. Instead of the LLM making individual tool calls, it generates code that calls the API. The code executes in a sandboxed environment in a single pass, reducing round trips and leveraging LLMs' natural strength with code generation.
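The difference between the two execution models can be shown with a toy comparison. This is not Cloudflare's implementation: it uses Python rather than TypeScript, the `get_user` and `send_email` tools are hypothetical, the "generated" script is hard-coded, and `exec` with an emptied `__builtins__` is not a real sandbox. It only illustrates why one generated program needs fewer model round trips than a sequence of individual tool calls.

```python
# Hypothetical tools standing in for MCP tool definitions.
def get_user(user_id):
    return {"id": user_id, "email": f"user{user_id}@example.com"}


def send_email(to, body):
    return f"sent to {to}"


TOOLS = {"get_user": get_user, "send_email": send_email}

# Tool-call mode: the model emits one call, waits for the result, then
# decides on the next call, so every call costs one round trip.
planned_calls = [
    ("get_user", {"user_id": 7}),
    ("send_email", {"to": "user7@example.com", "body": "hi"}),
]
tool_call_round_trips = 0
for name, args in planned_calls:
    TOOLS[name](**args)
    tool_call_round_trips += 1

# Code mode: the model writes one script against the same functions,
# and the script runs to completion without consulting the model again.
generated_code = """
user = get_user(7)
result = send_email(user["email"], "hi")
"""
namespace = {"__builtins__": {}, **TOOLS}  # illustration only, NOT a sandbox
exec(generated_code, namespace)
code_mode_round_trips = 1

print(tool_call_round_trips, code_mode_round_trips)  # 2 1
print(namespace["result"])  # sent to user7@example.com
```

The gap widens with longer chains: intermediate values such as `user["email"]` flow between calls inside the script instead of passing back through the model's context.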
Claude Code Plugins
Plugins bundle skills (slash commands), sub-agents, MCP servers, and hooks into a single installable package. Before plugins, sharing configuration across a team required manual copying of files between repositories. Plugins provide a marketplace-based distribution model — install from a repository URL, and all components are set up together.
- Plugins are managed via the /plugin command
- Marketplaces are JSON files listing available plugins with metadata and source paths
- Individual components (a single MCP server or slash command) can be installed without the full plugin
- Organizations can create private marketplaces for internal tooling
MCP servers run code with your user permissions and can access files, environment variables, and network resources. Plugins carry the same risk. Always review the source of MCP servers and plugins before installing — treat them like any third-party dependency.