MCP Security Risks: Complete Guide for 2026
MCP Security Risks: Complete Guide for 2026

MCP Security Risks: What Every Enterprise Needs to Know
Every MCP server you connect to your AI agents is a door. Some lead to productivity. Others lead to data exfiltration, credential theft, and unauthorized access to production systems.
The Model Context Protocol has become the standard integration layer for connecting AI agents to enterprise tools and data. Major platforms including Claude, GitHub Copilot, Cursor, and enterprise agent frameworks support MCP natively. That adoption is accelerating fast. But the security risks of MCP are growing just as quickly, and most organizations are deploying MCP servers without fully understanding the attack surface they're creating.
This guide breaks down the specific MCP security risks that matter for enterprise teams, explains where the protocol is vulnerable, and provides actionable mitigation strategies for each. For a broader look at MCP security controls and best practices, see our complete enterprise guide. If you're a security leader, platform engineer, or architect evaluating MCP for production use, this is the risk landscape you need to understand before your agents go live.
What Is MCP and Why Does Security Matter?
The Model Context Protocol (MCP) is an open standard created by Anthropic that defines how AI agents connect to external tools and data sources. Instead of building custom integrations for every service an agent needs to access, MCP provides a universal interface. One protocol, any tool.
An MCP server is a lightweight program that exposes specific capabilities to AI agents. Those capabilities could be querying a database, searching Jira tickets, sending Slack messages, or modifying records in a CRM. When an AI agent needs to perform an action, it calls a tool through MCP, and the server executes it.
The security question is straightforward: when AI agents can execute real actions on production systems through MCP, what happens when something goes wrong?
The answer depends on the risk. A misconfigured MCP server can expose sensitive data. A malicious one can exfiltrate credentials. A compromised one can give attackers access to every downstream system it connects to. And because MCP servers are executable code that hold authentication tokens, the blast radius of a single compromise can extend across your entire tool chain.
Understanding these risks is not optional for enterprise teams. It's the prerequisite for deploying MCP safely.
How MCP Works (And Where It Breaks)
MCP Client-Server Architecture
MCP uses a client-server architecture with four key components:
MCP Hosts are the applications where the AI model runs, such as Claude Desktop, an IDE assistant, or a custom agent framework.
MCP Clients live inside the host and manage connections to one or more MCP servers. Each client maintains a 1:1 session with a specific server.
MCP Servers expose tools, resources, and prompts through the standardized protocol. They bridge the gap between the AI agent and external systems.
Tools are the actual capabilities exposed by servers, such as "query a Postgres database" or "create a GitHub issue."
The interaction flow works like this: a user sends a prompt to the AI host. The host's MCP client packages the prompt along with descriptions of available tools and sends everything to the LLM. The model decides which tool to call and with what parameters. The client routes the call to the correct server, which executes the action and returns results. The LLM processes those results and responds to the user.
Local vs. Remote MCP Servers
Local MCP servers run on the user's machine, typically started via a command like npx or uvx and connected through standard I/O (stdio). They have direct access to the local filesystem, environment variables, and running processes. A compromised local server can execute arbitrary commands with the user's own privileges.
Remote MCP servers run as hosted services accessed over HTTP with Streamable HTTP transport. They don't have direct local access, but they hold authentication tokens for the services they connect to. A compromised remote server becomes a proxy for every system those tokens authorize.
Where Security Gaps Emerge
The trust chain in MCP runs from the user through the LLM, client, server, and finally to the tools and data. Security can break at any point:
At the LLM layer: The model can be manipulated through prompt injection to call tools it shouldn't or pass malicious parameters.
At the client layer: Clients may trust servers without verifying integrity, follow redirects to malicious endpoints, or leak session IDs.
At the server layer: Servers may request excessive permissions, hold long-lived credentials, or execute unvalidated commands.
At the tool layer: Tool descriptions can contain hidden instructions. Tool behavior can change after approval. Tool parameters can be exploited for injection attacks.
Every link in this chain is a potential attack surface. And because MCP interactions are non-deterministic (the LLM decides what to call based on natural language input), the attack surface shifts with every prompt.
The Top 10 MCP Security Risks
Security researchers, the OWASP GenAI Security Project, the MCP specification authors, and enterprise security teams have identified a growing catalog of MCP-specific vulnerabilities. Here are the ten most critical risks, ordered by frequency and severity across the research.
1. Prompt Injection
Prompt injection is the most widely documented MCP security risk, identified in nearly every major analysis of the protocol's attack surface.
The core vulnerability: LLMs cannot reliably distinguish between legitimate instructions from the user and malicious instructions embedded in consumed content. When an AI agent reads data through an MCP server (an email, a document, a database field, a web page), that content enters the model's context alongside the user's actual instructions. If the content contains hidden commands, the model may follow them.
In an MCP environment, prompt injection becomes especially dangerous because the agent has tools at its disposal. A traditional prompt injection might trick an LLM into generating misleading text. An MCP prompt injection can trick the agent into executing real actions: forwarding sensitive documents, modifying database records, or exfiltrating data through a connected tool.
There are two variants. Direct prompt injection occurs when an attacker provides malicious input directly to the agent. Indirect prompt injection occurs when malicious instructions are embedded in external data the agent consumes through MCP tools, such as a poisoned Jira ticket, a manipulated email, or a crafted document.
Mitigation: Sanitize and validate all inputs to and outputs from MCP tool calls. Apply content moderation guardrails between the LLM and MCP servers. Implement strict tool scoping so agents can only take actions appropriate to their role, regardless of what instructions they receive. Use PII redaction on returned data to limit what sensitive information enters the model's context.
2. Tool Poisoning
Tool poisoning exploits the trust that LLMs place in MCP tool descriptions. Every MCP server provides metadata about its tools, including what they do, what parameters they accept, and how they should be used. The LLM relies on these descriptions to decide which tool to call.
A malicious MCP server can embed harmful instructions within tool descriptions that are invisible during routine inspection but influence the model's behavior. The description might instruct the LLM to prioritize calling the poisoned tool over legitimate alternatives, to include sensitive data in tool parameters, or to suppress user-facing output about certain actions.
Because tool descriptions are treated as trusted context by the LLM, the model has no built-in mechanism to distinguish between legitimate operational instructions and embedded attack payloads. The attack works precisely because the protocol trusts server-provided metadata.
Mitigation: Validate all MCP tool metadata before exposing it to agents. Use a gateway that inspects tool descriptions for hidden or anomalous instructions. Maintain a verified registry of trusted MCP tools and flag any unauthorized or suspicious entries.
3. Privilege Abuse and Over-Permissioned Access
MCP servers frequently request more permissions than they actually need. A server that only requires read access to customer records might request full read-write access to the entire CRM. A server connecting to a cloud provider might request administrator-level scopes.
This violates the principle of least privilege, and it dramatically increases the blast radius of any compromise. When an overly permissioned MCP server is compromised through prompt injection, tool poisoning, or a supply chain attack, the attacker inherits every permission that server holds.
The problem is compounded by the confused deputy vulnerability. The MCP server acts on behalf of the user but often with the server's own (broader) permissions. If the server is tricked into performing an unauthorized action, those elevated permissions become the attacker's permissions.
Mitigation: Audit all MCP server permission requests against actual usage. Enforce the principle of least privilege at the tool level, not just the server level. Implement progressive scope elevation where servers start with minimal permissions and request additional scopes only when specific operations require them. Use per-client consent flows so users explicitly approve each access grant. For a deeper look at how access policies apply to MCP environments, see our guide on MCP access control.
4. Tool Shadowing and Shadow MCP
Tool shadowing occurs when malicious actors create rogue MCP tools that closely mimic trusted services. Without robust validation, employees and AI agents may unintentionally route requests to these malicious tools instead of the legitimate ones they intended to use.
Shadow MCP is the related problem of unauthorized MCP servers running in your environment without security team visibility. Individual developers can install MCP servers on their workstations with no central approval process. Each unvetted server expands your attack surface.
Naming collisions make this worse. If a malicious server registers a tool with the same name or a similar name as a trusted tool, the LLM may select the malicious version. The model makes tool selection decisions based on descriptions and names, and slight variations can redirect actions to attacker-controlled servers.
Mitigation: Maintain a centralized registry of approved MCP servers and tools. Continuously scan for unauthorized MCP server installations across your environment. Implement tool allowlisting so agents can only discover and call tools from approved servers. Alert when new, unrecognized tools appear in any agent's available tool set.
5. Rug Pull Attacks (Tool Redefinition)
A rug pull attack exploits the trust that builds over time. An MCP server operates legitimately for weeks or months, passing security reviews and building a track record of safe behavior. Then, in a quiet update, the server's tool descriptions or underlying behavior change.
A tool originally described as "search customer records" begins silently exfiltrating data to an external endpoint. A tool that previously required specific parameters starts accepting broader inputs that enable injection attacks. Because the server was previously trusted and approved, many organizations won't catch the change until damage is done.
This risk is especially acute in the MCP ecosystem because many servers auto-update from package registries. Without version pinning and change monitoring, a malicious update propagates instantly across every installation.
Mitigation: Pin all MCP server versions by content digest, not mutable tags. Monitor for any changes in tool descriptions, parameters, or behavior after initial approval. Implement automated alerts that trigger security re-review when a pinned server's metadata changes. Use cryptographic signing to verify server integrity on every startup.
6. Supply Chain Risks
MCP servers are executable code, and the ecosystem is growing rapidly with community-built servers for nearly every major SaaS product, database, and developer tool. Many are installed directly from GitHub repositories or package registries, often by individual developers without security team review.
The supply chain risks include:
Typosquatting and poisoned packages. Malicious actors publish MCP servers with names similar to popular legitimate servers.
Dependency compromise. A vulnerability or backdoor in a transitive dependency propagates to every installation.
Unsigned builds. Without cryptographic signing, there's no way to verify that the server binary matches the published source code.
Lack of enterprise-grade development practices. Many community MCP servers lack the testing, code review, and security hardening that enterprise software requires.
Security researchers analyzing the MCP ecosystem have found command injection flaws in a significant percentage of analyzed servers, underscoring the immaturity of security practices in the current landscape.
Mitigation: Source MCP servers only from verified providers. Pin versions by content digest. Require cryptographic signature verification. Include MCP server dependencies in your existing SAST and SCA pipelines. Apply the same software supply chain security standards you use for any other production dependency. The OWASP Practical Guide for Securely Using Third-Party MCP Servers provides a detailed framework for vetting and managing third-party servers.
7. Sensitive Data Exposure and Token Theft
MCP servers store OAuth tokens, API keys, and other credentials for the services they connect to. A single server might hold tokens for GitHub, Salesforce, Slack, and your internal databases. This concentration of credentials makes MCP servers high-value targets.
Improperly configured MCP environments can leak credentials through multiple channels: verbose error messages, unredacted tool outputs, debug logs, or misconfigured environment variables. Tokens passed through MCP interactions without proper validation (the "token passthrough" anti-pattern that the MCP specification explicitly forbids) create additional exposure.
If an attacker compromises one MCP server, they don't just gain access to that server. They gain access to every downstream system those tokens authorize. Calendar data combined with email content and file storage access enables cross-service data aggregation attacks that individual service compromises wouldn't allow.
Mitigation: Use short-lived, scoped credentials rather than long-lived tokens. Implement just-in-time credential injection from a centralized secrets manager instead of storing tokens in server configurations. Never log credentials in plaintext. Redact sensitive data from tool outputs before they enter the LLM's context. Auto-rotate credentials on a defined schedule.
8. Command Injection and Malicious Code Execution
When MCP servers pass unvalidated user or external inputs to underlying databases or system commands, classic injection vulnerabilities emerge. An attacker can craft tool parameters that escape their intended context and execute arbitrary commands on the server's host system.
Local MCP servers are particularly vulnerable because they frequently execute OS commands to perform their functions. If a tool parameter is passed directly into a shell command without sanitization, command injection becomes trivial. SQL injection is also possible when tool parameters are concatenated directly into database queries without parameterization.
The Red Hat security team documented specific examples of MCP servers with command injection vulnerabilities in their tool implementations, where user-supplied input was passed directly to subprocess.call without sanitization.
Mitigation: Enforce strict input validation on all tool parameters. Use parameterized queries for database operations. Sanitize all data before using it as arguments for command execution. Run local MCP servers in sandboxed environments that restrict what commands can be executed. Apply the same secure coding practices you would for any web application handling untrusted input.
9. The Confused Deputy Problem
The confused deputy problem is a well-known security vulnerability that manifests at the MCP layer when proxy servers connect to third-party APIs. The official MCP security documentation describes this attack in detail.
The attack works like this: an MCP proxy server uses a static client ID to authenticate with a third-party authorization server. A legitimate user authenticates normally, and the third-party server sets a consent cookie. An attacker later sends the user a crafted link containing a malicious authorization request. The user's browser still has the consent cookie from the previous legitimate session, so the consent screen is skipped. The authorization code is redirected to the attacker's server, and they gain access to the third-party API as the compromised user.
This attack exploits the combination of static client IDs, dynamic client registration, and consent cookies, all of which are common patterns in MCP proxy server implementations.
Mitigation: MCP proxy servers must implement per-client consent before forwarding to third-party authorization. Store consent decisions server-side, bound to specific client IDs. Validate redirect URIs with exact string matching. Generate cryptographically secure state parameters for each authorization request, and only set consent cookies after the user has explicitly approved the consent screen.
10. Session Hijacking and SSRF
MCP's Streamable HTTP transport uses session IDs to maintain state between clients and servers. If an attacker obtains a valid session ID through network interception, log exposure, or brute force guessing, they can impersonate the legitimate client and execute unauthorized actions.
Server-Side Request Forgery (SSRF) is a related vector specific to MCP's OAuth discovery flow. During authentication, MCP clients fetch URLs from sources that could be controlled by a malicious server: resource metadata URLs, authorization server endpoints, and token endpoints. A malicious server can populate these fields with URLs pointing to internal resources like cloud metadata endpoints (169.254.169.254), internal network services, or localhost databases. The MCP client then makes requests to these internal targets on the attacker's behalf, effectively bypassing network perimeter controls.
Mitigation: Use cryptographically secure, non-deterministic session IDs. Bind session IDs to user-specific information so a valid session ID alone is insufficient for access. Rotate and expire session IDs regularly. For SSRF protection, enforce HTTPS for all OAuth-related URLs, block requests to private and reserved IP ranges, validate redirect targets, and use egress proxies for server-side MCP client deployments. Be aware of DNS rebinding attacks where domains resolve to safe IPs during validation but internal IPs during actual requests.
Why Traditional Security Fails for MCP
Enterprise security teams have decades of experience securing APIs, managing access controls, and monitoring network traffic. But MCP introduces properties that existing tools were not designed to handle.
The caller is non-deterministic. Traditional API security assumes a predictable caller. With MCP, an LLM decides which tools to invoke based on natural language input. The same prompt can produce different tool calls depending on context, model state, and available tools. Firewalls and WAFs that rely on predictable request patterns cannot effectively protect against this.
Instructions and data are blurred. In traditional systems, the instruction (API call) and data (payload) are clearly separated. In MCP, the LLM treats tool descriptions, user prompts, and returned data as a single context stream. Malicious content in any of these can influence which tools are called and with what parameters. This fundamentally breaks the input validation models that traditional security controls depend on.
The attack surface is dynamic. Every new MCP server added to your environment expands what agents can do. Unlike traditional APIs deployed through change management, MCP servers can be installed by individual developers with no central approval. There's no static perimeter to defend.
Audit is structurally harder. When an agent chains multiple tool calls across multiple servers to complete a task, reconstructing the causal chain from user intent to system action requires purpose-built observability. Standard API logging captures individual requests, but it doesn't capture the agent-level reasoning that connected them.
The "opt-in" security ecosystem. Security in MCP is largely opt-in. The protocol defines capabilities and best practices, but enforcement depends entirely on implementation. Not every MCP server implements proper authentication. Not every client validates server integrity. This inconsistency means that your security posture is only as strong as the weakest server in your agent's tool set. As the governance gap widens with every new agent connection, the need for centralized enforcement becomes unavoidable.
MCP Security Best Practices
Mitigating these risks requires controls at every layer of the MCP stack:
Enforce least-privilege access for every MCP server. Each server should have only the permissions required for its specific function. Audit permission grants regularly and revoke unused scopes.
Require human-in-the-loop for sensitive actions. Destructive operations, permission changes, external communications, and financial transactions should require explicit human approval before execution.
Sandbox MCP server execution environments. Run local servers in containers or application sandboxes with restricted filesystem and network access. Run remote servers with CPU/memory limits and egress controls.
Pin and verify MCP server versions. Pin to immutable content digests, not mutable tags. Verify cryptographic signatures before deployment. Alert on any changes to tool metadata or server behavior.
Monitor and log all MCP interactions. Log every tool call with full context: who initiated it, which agent performed it, what server was called, what parameters were passed, and what data was returned. Export logs to your SIEM for compliance and incident investigation.
Use centralized gateway controls for MCP traffic. Route all agent-to-server communication through a central enforcement point that applies consistent policies, validates tool metadata, and manages credentials.
Validate tool descriptions and parameters at runtime. Inspect tool metadata for hidden instructions before exposing tools to agents. Validate tool call parameters against expected schemas before execution. Redact sensitive data from tool outputs before they reach the LLM.
How an MCP Gateway Solves These Risks
The common thread across all ten risks is the lack of a centralized enforcement point. When each MCP client connects directly to each server, security controls are scattered across configurations, server code, and network policies. There's no single place to enforce policy, no unified audit trail, and no consistent way to detect anomalies.
An MCP gateway solves this by sitting between agents and servers, mediating every tool call through a single policy layer:
Centralized authentication and authorization. Every tool call is authenticated and authorized against defined policies before reaching the server. No more relying on individual server implementations to get auth right.
Policy enforcement at the gateway layer. Tool allowlisting, scope restrictions, parameter validation, and data redaction happen at a single enforcement point, consistently applied across all agent-server interactions.
Real-time monitoring and threat detection. Every request and response is logged with full context. Behavioral baselines detect anomalies. Alerts fire when agents deviate from established patterns.
Supply chain protection. The gateway validates server integrity through signature verification and change monitoring. Unapproved servers are invisible to agents, not just blocked.
Scope minimization and access control. The gateway enforces least privilege by controlling what tools each agent can discover and invoke, independent of what the underlying server exposes.
Agen provides exactly this architectural layer for enterprise teams. It sits between AI agents and the applications they access, extending your existing identity and access management into agent interactions. Whether you're securing AI agents for workforce productivity or exposing your SaaS product to customer-built agents, Agen provides identity-aware access control, data governance at the tool-call level, full observability, and anomaly detection purpose-built for agent behavior patterns. For local MCP servers running on developer workstations, Agen Shield enforces intent restrictions, skill quarantine, and egress controls at the OS level before data ever leaves the endpoint. Agen runs in your VPC, on-premises, or as a managed cloud service, keeping your tokens, keys, and audit logs within your security boundary.
MCP Security Checklist for Enterprise Teams
Use this checklist to assess your organization's readiness for production MCP deployment:
Pre-Deployment Assessment
[ ] All MCP servers have been sourced from verified providers and passed security review
[ ] Server versions are pinned by content digest with cryptographic signature verification
[ ] Permission scopes have been audited against actual tool requirements (least privilege)
[ ] Sandboxing and execution isolation are configured for all server environments
[ ] Network egress is restricted to approved endpoints only
Authentication and Access Control
[ ] OAuth 2.1 authentication is required for all remote MCP servers
[ ] Per-client consent flows are implemented and enforced
[ ] Per-tool authorization policies are defined (not just per-server)
[ ] Short-lived, auto-rotating credentials are used instead of long-lived API keys
[ ] Credentials are injected at runtime from a centralized secrets manager
Runtime Monitoring
[ ] Every tool call is logged with user identity, agent identity, server, tool, parameters, and results
[ ] Logs export to your SIEM in real time
[ ] Behavioral baselines are established for normal agent activity
[ ] Anomaly detection alerts fire on deviations from baseline patterns
[ ] Correlation IDs enable end-to-end tracing of multi-tool call chains
Ongoing Governance
[ ] A formal MCP server approval workflow exists (submit, scan, review, stage, deploy)
[ ] Tool description and behavior changes trigger automated alerts and re-review
[ ] Human-in-the-loop approval gates are configured for destructive or high-sensitivity actions
[ ] Governance controls map to your compliance frameworks (SOC 2, ISO 27001, GDPR, HIPAA)
[ ] Executive reporting on AI agent risk posture is in place
Incident Response
[ ] Incident response playbooks include MCP-specific scenarios (tool poisoning, credential theft, prompt injection)
[ ] Token revocation procedures are defined for compromised MCP servers
[ ] Communication templates exist for notifying affected users and downstream services
FAQ: MCP Security Risks
What is the biggest MCP security risk?
Prompt injection is the most frequently cited and broadly applicable MCP security risk. Because LLMs cannot reliably distinguish between legitimate user instructions and malicious instructions embedded in consumed content, any data an agent reads through MCP tools can potentially influence its actions. This risk is amplified by MCP because the agent has real tools at its disposal, turning what would normally be a text manipulation attack into an action execution attack.
Can MCP servers be trusted?
Not by default. MCP servers are executable code that hold credentials and perform actions on production systems. Trust must be established through verification: sourcing from trusted providers, pinning versions by digest, validating cryptographic signatures, sandboxing execution, monitoring behavior, and routing all interactions through a gateway with policy enforcement. The question is not whether to trust MCP servers, but how to verify and constrain that trust continuously.
How do I secure MCP in production?
Start with the foundational controls: enforce OAuth 2.1 authentication, implement least-privilege permissions, sandbox execution environments, and pin server versions. Layer on runtime protection: log every tool call, establish behavioral baselines, and configure anomaly detection. Add governance: define approval workflows for new servers, require security review, and map controls to your compliance frameworks. The most effective architectural pattern is a centralized MCP gateway that enforces all of these policies at a single enforcement point.
What is tool poisoning in MCP?
Tool poisoning is an attack where malicious instructions are embedded within MCP tool metadata (descriptions, parameters, and operational instructions). Because LLMs trust tool descriptions to decide which tool to call and how to call it, poisoned metadata can influence the model to perform unauthorized actions, exfiltrate data, or bypass security controls. The attack is difficult to detect through routine inspection because the malicious instructions are embedded in metadata fields that appear legitimate on the surface.
Is MCP safe for enterprise use?
MCP is safe for enterprise use when deployed with appropriate security controls. The protocol itself is well-designed and actively maintained with security guidance from both the specification authors and organizations like OWASP. The risks come from how MCP is implemented and deployed: unvetted servers, excessive permissions, missing authentication, lack of monitoring, and absent governance. Enterprises that treat MCP deployment with the same security rigor they apply to any other production infrastructure can use the protocol safely and confidently.
MCP Security Risks: What Every Enterprise Needs to Know
Every MCP server you connect to your AI agents is a door. Some lead to productivity. Others lead to data exfiltration, credential theft, and unauthorized access to production systems.
The Model Context Protocol has become the standard integration layer for connecting AI agents to enterprise tools and data. Major platforms including Claude, GitHub Copilot, Cursor, and enterprise agent frameworks support MCP natively. That adoption is accelerating fast. But the security risks of MCP are growing just as quickly, and most organizations are deploying MCP servers without fully understanding the attack surface they're creating.
This guide breaks down the specific MCP security risks that matter for enterprise teams, explains where the protocol is vulnerable, and provides actionable mitigation strategies for each. For a broader look at MCP security controls and best practices, see our complete enterprise guide. If you're a security leader, platform engineer, or architect evaluating MCP for production use, this is the risk landscape you need to understand before your agents go live.
What Is MCP and Why Does Security Matter?
The Model Context Protocol (MCP) is an open standard created by Anthropic that defines how AI agents connect to external tools and data sources. Instead of building custom integrations for every service an agent needs to access, MCP provides a universal interface. One protocol, any tool.
An MCP server is a lightweight program that exposes specific capabilities to AI agents. Those capabilities could be querying a database, searching Jira tickets, sending Slack messages, or modifying records in a CRM. When an AI agent needs to perform an action, it calls a tool through MCP, and the server executes it.
The security question is straightforward: when AI agents can execute real actions on production systems through MCP, what happens when something goes wrong?
The answer depends on the risk. A misconfigured MCP server can expose sensitive data. A malicious one can exfiltrate credentials. A compromised one can give attackers access to every downstream system it connects to. And because MCP servers are executable code that hold authentication tokens, the blast radius of a single compromise can extend across your entire tool chain.
Understanding these risks is not optional for enterprise teams. It's the prerequisite for deploying MCP safely.
How MCP Works (And Where It Breaks)
MCP Client-Server Architecture
MCP uses a client-server architecture with four key components:
MCP Hosts are the applications where the AI model runs, such as Claude Desktop, an IDE assistant, or a custom agent framework.
MCP Clients live inside the host and manage connections to one or more MCP servers. Each client maintains a 1:1 session with a specific server.
MCP Servers expose tools, resources, and prompts through the standardized protocol. They bridge the gap between the AI agent and external systems.
Tools are the actual capabilities exposed by servers, such as "query a Postgres database" or "create a GitHub issue."
The interaction flow works like this: a user sends a prompt to the AI host. The host's MCP client packages the prompt along with descriptions of available tools and sends everything to the LLM. The model decides which tool to call and with what parameters. The client routes the call to the correct server, which executes the action and returns results. The LLM processes those results and responds to the user.
Local vs. Remote MCP Servers
Local MCP servers run on the user's machine, typically started via a command like npx or uvx and connected through standard I/O (stdio). They have direct access to the local filesystem, environment variables, and running processes. A compromised local server can execute arbitrary commands with the user's own privileges.
Remote MCP servers run as hosted services accessed over HTTP with Streamable HTTP transport. They don't have direct local access, but they hold authentication tokens for the services they connect to. A compromised remote server becomes a proxy for every system those tokens authorize.
Where Security Gaps Emerge
The trust chain in MCP runs from the user through the LLM, client, server, and finally to the tools and data. Security can break at any point:
At the LLM layer: The model can be manipulated through prompt injection to call tools it shouldn't or pass malicious parameters.
At the client layer: Clients may trust servers without verifying integrity, follow redirects to malicious endpoints, or leak session IDs.
At the server layer: Servers may request excessive permissions, hold long-lived credentials, or execute unvalidated commands.
At the tool layer: Tool descriptions can contain hidden instructions. Tool behavior can change after approval. Tool parameters can be exploited for injection attacks.
Every link in this chain is a potential attack surface. And because MCP interactions are non-deterministic (the LLM decides what to call based on natural language input), the attack surface shifts with every prompt.
The Top 10 MCP Security Risks
Security researchers, the OWASP GenAI Security Project, the MCP specification authors, and enterprise security teams have identified a growing catalog of MCP-specific vulnerabilities. Here are the ten most critical risks, ordered by frequency and severity across the research.
1. Prompt Injection
Prompt injection is the most widely documented MCP security risk, identified in nearly every major analysis of the protocol's attack surface.
The core vulnerability: LLMs cannot reliably distinguish between legitimate instructions from the user and malicious instructions embedded in consumed content. When an AI agent reads data through an MCP server (an email, a document, a database field, a web page), that content enters the model's context alongside the user's actual instructions. If the content contains hidden commands, the model may follow them.
In an MCP environment, prompt injection becomes especially dangerous because the agent has tools at its disposal. A traditional prompt injection might trick an LLM into generating misleading text. An MCP prompt injection can trick the agent into executing real actions: forwarding sensitive documents, modifying database records, or exfiltrating data through a connected tool.
There are two variants. Direct prompt injection occurs when an attacker provides malicious input directly to the agent. Indirect prompt injection occurs when malicious instructions are embedded in external data the agent consumes through MCP tools, such as a poisoned Jira ticket, a manipulated email, or a crafted document.
Mitigation: Sanitize and validate all inputs to and outputs from MCP tool calls. Apply content moderation guardrails between the LLM and MCP servers. Implement strict tool scoping so agents can only take actions appropriate to their role, regardless of what instructions they receive. Use PII redaction on returned data to limit what sensitive information enters the model's context.
2. Tool Poisoning
Tool poisoning exploits the trust that LLMs place in MCP tool descriptions. Every MCP server provides metadata about its tools, including what they do, what parameters they accept, and how they should be used. The LLM relies on these descriptions to decide which tool to call.
A malicious MCP server can embed harmful instructions within tool descriptions that are invisible during routine inspection but influence the model's behavior. The description might instruct the LLM to prioritize calling the poisoned tool over legitimate alternatives, to include sensitive data in tool parameters, or to suppress user-facing output about certain actions.
Because tool descriptions are treated as trusted context by the LLM, the model has no built-in mechanism to distinguish between legitimate operational instructions and embedded attack payloads. The attack works precisely because the protocol trusts server-provided metadata.
Mitigation: Validate all MCP tool metadata before exposing it to agents. Use a gateway that inspects tool descriptions for hidden or anomalous instructions. Maintain a verified registry of trusted MCP tools and flag any unauthorized or suspicious entries.
3. Privilege Abuse and Over-Permissioned Access
MCP servers frequently request more permissions than they actually need. A server that only requires read access to customer records might request full read-write access to the entire CRM. A server connecting to a cloud provider might request administrator-level scopes.
This violates the principle of least privilege, and it dramatically increases the blast radius of any compromise. When an overly permissioned MCP server is compromised through prompt injection, tool poisoning, or a supply chain attack, the attacker inherits every permission that server holds.
The problem is compounded by the confused deputy vulnerability. The MCP server acts on behalf of the user but often with the server's own (broader) permissions. If the server is tricked into performing an unauthorized action, those elevated permissions become the attacker's permissions.
Mitigation: Audit all MCP server permission requests against actual usage. Enforce the principle of least privilege at the tool level, not just the server level. Implement progressive scope elevation where servers start with minimal permissions and request additional scopes only when specific operations require them. Use per-client consent flows so users explicitly approve each access grant. For a deeper look at how access policies apply to MCP environments, see our guide on MCP access control.
4. Tool Shadowing and Shadow MCP
Tool shadowing occurs when malicious actors create rogue MCP tools that closely mimic trusted services. Without robust validation, employees and AI agents may unintentionally route requests to these malicious tools instead of the legitimate ones they intended to use.
Shadow MCP is the related problem of unauthorized MCP servers running in your environment without security team visibility. Individual developers can install MCP servers on their workstations with no central approval process. Each unvetted server expands your attack surface.
Naming collisions make this worse. If a malicious server registers a tool with the same name or a similar name as a trusted tool, the LLM may select the malicious version. The model makes tool selection decisions based on descriptions and names, and slight variations can redirect actions to attacker-controlled servers.
Mitigation: Maintain a centralized registry of approved MCP servers and tools. Continuously scan for unauthorized MCP server installations across your environment. Implement tool allowlisting so agents can only discover and call tools from approved servers. Alert when new, unrecognized tools appear in any agent's available tool set.
5. Rug Pull Attacks (Tool Redefinition)
A rug pull attack exploits the trust that builds over time. An MCP server operates legitimately for weeks or months, passing security reviews and building a track record of safe behavior. Then, in a quiet update, the server's tool descriptions or underlying behavior change.
A tool originally described as "search customer records" begins silently exfiltrating data to an external endpoint. A tool that previously required specific parameters starts accepting broader inputs that enable injection attacks. Because the server was previously trusted and approved, many organizations won't catch the change until damage is done.
This risk is especially acute in the MCP ecosystem because many servers auto-update from package registries. Without version pinning and change monitoring, a malicious update propagates instantly across every installation.
Mitigation: Pin all MCP server versions by content digest, not mutable tags. Monitor for any changes in tool descriptions, parameters, or behavior after initial approval. Implement automated alerts that trigger security re-review when a pinned server's metadata changes. Use cryptographic signing to verify server integrity on every startup.
6. Supply Chain Risks
MCP servers are executable code, and the ecosystem is growing rapidly with community-built servers for nearly every major SaaS product, database, and developer tool. Many are installed directly from GitHub repositories or package registries, often by individual developers without security team review.
The supply chain risks include:
Typosquatting and poisoned packages. Malicious actors publish MCP servers with names similar to popular legitimate servers.
Dependency compromise. A vulnerability or backdoor in a transitive dependency propagates to every installation.
Unsigned builds. Without cryptographic signing, there's no way to verify that the server binary matches the published source code.
Lack of enterprise-grade development practices. Many community MCP servers lack the testing, code review, and security hardening that enterprise software requires.
Security researchers analyzing the MCP ecosystem have found command injection flaws in a significant percentage of analyzed servers, underscoring the immaturity of security practices in the current landscape.
Mitigation: Source MCP servers only from verified providers. Pin versions by content digest. Require cryptographic signature verification. Include MCP server dependencies in your existing SAST and SCA pipelines. Apply the same software supply chain security standards you use for any other production dependency. The OWASP Practical Guide for Securely Using Third-Party MCP Servers provides a detailed framework for vetting and managing third-party servers.
7. Sensitive Data Exposure and Token Theft
MCP servers store OAuth tokens, API keys, and other credentials for the services they connect to. A single server might hold tokens for GitHub, Salesforce, Slack, and your internal databases. This concentration of credentials makes MCP servers high-value targets.
Improperly configured MCP environments can leak credentials through multiple channels: verbose error messages, unredacted tool outputs, debug logs, or misconfigured environment variables. Tokens passed through MCP interactions without proper validation (the "token passthrough" anti-pattern that the MCP specification explicitly forbids) create additional exposure.
If an attacker compromises one MCP server, they don't just gain access to that server. They gain access to every downstream system those tokens authorize. Calendar data combined with email content and file storage access enables cross-service data aggregation attacks that individual service compromises wouldn't allow.
Mitigation: Use short-lived, scoped credentials rather than long-lived tokens. Implement just-in-time credential injection from a centralized secrets manager instead of storing tokens in server configurations. Never log credentials in plaintext. Redact sensitive data from tool outputs before they enter the LLM's context. Auto-rotate credentials on a defined schedule.
8. Command Injection and Malicious Code Execution
When MCP servers pass unvalidated user or external inputs to underlying databases or system commands, classic injection vulnerabilities emerge. An attacker can craft tool parameters that escape their intended context and execute arbitrary commands on the server's host system.
Local MCP servers are particularly vulnerable because they frequently execute OS commands to perform their functions. If a tool parameter is passed directly into a shell command without sanitization, command injection becomes trivial. SQL injection is also possible when tool parameters are concatenated directly into database queries without parameterization.
The Red Hat security team documented specific examples of MCP servers with command injection vulnerabilities in their tool implementations, where user-supplied input was passed directly to subprocess.call without sanitization.
Mitigation: Enforce strict input validation on all tool parameters. Use parameterized queries for database operations. Sanitize all data before using it as arguments for command execution. Run local MCP servers in sandboxed environments that restrict what commands can be executed. Apply the same secure coding practices you would for any web application handling untrusted input.
9. The Confused Deputy Problem
The confused deputy problem is a well-known security vulnerability that manifests at the MCP layer when proxy servers connect to third-party APIs. The official MCP security documentation describes this attack in detail.
The attack works like this: an MCP proxy server uses a static client ID to authenticate with a third-party authorization server. A legitimate user authenticates normally, and the third-party server sets a consent cookie. An attacker later sends the user a crafted link containing a malicious authorization request. The user's browser still has the consent cookie from the previous legitimate session, so the consent screen is skipped. The authorization code is redirected to the attacker's server, and they gain access to the third-party API as the compromised user.
This attack exploits the combination of static client IDs, dynamic client registration, and consent cookies, all of which are common patterns in MCP proxy server implementations.
Mitigation: MCP proxy servers must implement per-client consent before forwarding to third-party authorization. Store consent decisions server-side, bound to specific client IDs. Validate redirect URIs with exact string matching. Generate cryptographically secure state parameters for each authorization request, and only set consent cookies after the user has explicitly approved the consent screen.
10. Session Hijacking and SSRF
MCP's Streamable HTTP transport uses session IDs to maintain state between clients and servers. If an attacker obtains a valid session ID through network interception, log exposure, or brute force guessing, they can impersonate the legitimate client and execute unauthorized actions.
Server-Side Request Forgery (SSRF) is a related vector specific to MCP's OAuth discovery flow. During authentication, MCP clients fetch URLs from sources that could be controlled by a malicious server: resource metadata URLs, authorization server endpoints, and token endpoints. A malicious server can populate these fields with URLs pointing to internal resources like cloud metadata endpoints (169.254.169.254), internal network services, or localhost databases. The MCP client then makes requests to these internal targets on the attacker's behalf, effectively bypassing network perimeter controls.
Mitigation: Use cryptographically secure, non-deterministic session IDs. Bind session IDs to user-specific information so a valid session ID alone is insufficient for access. Rotate and expire session IDs regularly. For SSRF protection, enforce HTTPS for all OAuth-related URLs, block requests to private and reserved IP ranges, validate redirect targets, and use egress proxies for server-side MCP client deployments. Be aware of DNS rebinding attacks where domains resolve to safe IPs during validation but internal IPs during actual requests.
Why Traditional Security Fails for MCP
Enterprise security teams have decades of experience securing APIs, managing access controls, and monitoring network traffic. But MCP introduces properties that existing tools were not designed to handle.
The caller is non-deterministic. Traditional API security assumes a predictable caller. With MCP, an LLM decides which tools to invoke based on natural language input. The same prompt can produce different tool calls depending on context, model state, and available tools. Firewalls and WAFs that rely on predictable request patterns cannot effectively protect against this.
Instructions and data are blurred. In traditional systems, the instruction (API call) and data (payload) are clearly separated. In MCP, the LLM treats tool descriptions, user prompts, and returned data as a single context stream. Malicious content in any of these can influence which tools are called and with what parameters. This fundamentally breaks the input validation models that traditional security controls depend on.
The attack surface is dynamic. Every new MCP server added to your environment expands what agents can do. Unlike traditional APIs deployed through change management, MCP servers can be installed by individual developers with no central approval. There's no static perimeter to defend.
Audit is structurally harder. When an agent chains multiple tool calls across multiple servers to complete a task, reconstructing the causal chain from user intent to system action requires purpose-built observability. Standard API logging captures individual requests, but it doesn't capture the agent-level reasoning that connected them.
The "opt-in" security ecosystem. Security in MCP is largely opt-in. The protocol defines capabilities and best practices, but enforcement depends entirely on implementation. Not every MCP server implements proper authentication. Not every client validates server integrity. This inconsistency means that your security posture is only as strong as the weakest server in your agent's tool set. As the governance gap widens with every new agent connection, the need for centralized enforcement becomes unavoidable.
MCP Security Best Practices
Mitigating these risks requires controls at every layer of the MCP stack:
Enforce least-privilege access for every MCP server. Each server should have only the permissions required for its specific function. Audit permission grants regularly and revoke unused scopes.
Require human-in-the-loop for sensitive actions. Destructive operations, permission changes, external communications, and financial transactions should require explicit human approval before execution.
Sandbox MCP server execution environments. Run local servers in containers or application sandboxes with restricted filesystem and network access. Run remote servers with CPU/memory limits and egress controls.
Pin and verify MCP server versions. Pin to immutable content digests, not mutable tags. Verify cryptographic signatures before deployment. Alert on any changes to tool metadata or server behavior.
Monitor and log all MCP interactions. Log every tool call with full context: who initiated it, which agent performed it, what server was called, what parameters were passed, and what data was returned. Export logs to your SIEM for compliance and incident investigation.
Use centralized gateway controls for MCP traffic. Route all agent-to-server communication through a central enforcement point that applies consistent policies, validates tool metadata, and manages credentials.
Validate tool descriptions and parameters at runtime. Inspect tool metadata for hidden instructions before exposing tools to agents. Validate tool call parameters against expected schemas before execution. Redact sensitive data from tool outputs before they reach the LLM.
How an MCP Gateway Solves These Risks
The common thread across all ten risks is the lack of a centralized enforcement point. When each MCP client connects directly to each server, security controls are scattered across configurations, server code, and network policies. There's no single place to enforce policy, no unified audit trail, and no consistent way to detect anomalies.
An MCP gateway solves this by sitting between agents and servers, mediating every tool call through a single policy layer:
Centralized authentication and authorization. Every tool call is authenticated and authorized against defined policies before reaching the server. No more relying on individual server implementations to get auth right.
Policy enforcement at the gateway layer. Tool allowlisting, scope restrictions, parameter validation, and data redaction happen at a single enforcement point, consistently applied across all agent-server interactions.
Real-time monitoring and threat detection. Every request and response is logged with full context. Behavioral baselines detect anomalies. Alerts fire when agents deviate from established patterns.
Supply chain protection. The gateway validates server integrity through signature verification and change monitoring. Unapproved servers are invisible to agents, not just blocked.
Scope minimization and access control. The gateway enforces least privilege by controlling what tools each agent can discover and invoke, independent of what the underlying server exposes.
Agen provides exactly this architectural layer for enterprise teams. It sits between AI agents and the applications they access, extending your existing identity and access management into agent interactions. Whether you're securing AI agents for workforce productivity or exposing your SaaS product to customer-built agents, Agen provides identity-aware access control, data governance at the tool-call level, full observability, and anomaly detection purpose-built for agent behavior patterns. For local MCP servers running on developer workstations, Agen Shield enforces intent restrictions, skill quarantine, and egress controls at the OS level before data ever leaves the endpoint. Agen runs in your VPC, on-premises, or as a managed cloud service, keeping your tokens, keys, and audit logs within your security boundary.
MCP Security Checklist for Enterprise Teams
Use this checklist to assess your organization's readiness for production MCP deployment:
Pre-Deployment Assessment
[ ] All MCP servers have been sourced from verified providers and passed security review
[ ] Server versions are pinned by content digest with cryptographic signature verification
[ ] Permission scopes have been audited against actual tool requirements (least privilege)
[ ] Sandboxing and execution isolation are configured for all server environments
[ ] Network egress is restricted to approved endpoints only
Authentication and Access Control
[ ] OAuth 2.1 authentication is required for all remote MCP servers
[ ] Per-client consent flows are implemented and enforced
[ ] Per-tool authorization policies are defined (not just per-server)
[ ] Short-lived, auto-rotating credentials are used instead of long-lived API keys
[ ] Credentials are injected at runtime from a centralized secrets manager
Runtime Monitoring
[ ] Every tool call is logged with user identity, agent identity, server, tool, parameters, and results
[ ] Logs export to your SIEM in real time
[ ] Behavioral baselines are established for normal agent activity
[ ] Anomaly detection alerts fire on deviations from baseline patterns
[ ] Correlation IDs enable end-to-end tracing of multi-tool call chains
Ongoing Governance
[ ] A formal MCP server approval workflow exists (submit, scan, review, stage, deploy)
[ ] Tool description and behavior changes trigger automated alerts and re-review
[ ] Human-in-the-loop approval gates are configured for destructive or high-sensitivity actions
[ ] Governance controls map to your compliance frameworks (SOC 2, ISO 27001, GDPR, HIPAA)
[ ] Executive reporting on AI agent risk posture is in place
Incident Response
[ ] Incident response playbooks include MCP-specific scenarios (tool poisoning, credential theft, prompt injection)
[ ] Token revocation procedures are defined for compromised MCP servers
[ ] Communication templates exist for notifying affected users and downstream services
FAQ: MCP Security Risks
What is the biggest MCP security risk?
Prompt injection is the most frequently cited and broadly applicable MCP security risk. Because LLMs cannot reliably distinguish between legitimate user instructions and malicious instructions embedded in consumed content, any data an agent reads through MCP tools can potentially influence its actions. This risk is amplified by MCP because the agent has real tools at its disposal, turning what would normally be a text manipulation attack into an action execution attack.
Can MCP servers be trusted?
Not by default. MCP servers are executable code that hold credentials and perform actions on production systems. Trust must be established through verification: sourcing from trusted providers, pinning versions by digest, validating cryptographic signatures, sandboxing execution, monitoring behavior, and routing all interactions through a gateway with policy enforcement. The question is not whether to trust MCP servers, but how to verify and constrain that trust continuously.
How do I secure MCP in production?
Start with the foundational controls: enforce OAuth 2.1 authentication, implement least-privilege permissions, sandbox execution environments, and pin server versions. Layer on runtime protection: log every tool call, establish behavioral baselines, and configure anomaly detection. Add governance: define approval workflows for new servers, require security review, and map controls to your compliance frameworks. The most effective architectural pattern is a centralized MCP gateway that enforces all of these policies at a single enforcement point.
What is tool poisoning in MCP?
Tool poisoning is an attack where malicious instructions are embedded within MCP tool metadata (descriptions, parameters, and operational instructions). Because LLMs trust tool descriptions to decide which tool to call and how to call it, poisoned metadata can influence the model to perform unauthorized actions, exfiltrate data, or bypass security controls. The attack is difficult to detect through routine inspection because the malicious instructions are embedded in metadata fields that appear legitimate on the surface.
Is MCP safe for enterprise use?
MCP is safe for enterprise use when deployed with appropriate security controls. The protocol itself is well-designed and actively maintained with security guidance from both the specification authors and organizations like OWASP. The risks come from how MCP is implemented and deployed: unvetted servers, excessive permissions, missing authentication, lack of monitoring, and absent governance. Enterprises that treat MCP deployment with the same security rigor they apply to any other production infrastructure can use the protocol safely and confidently.
Read More Articles
Empower your workforce with secure agents

© 2026 Agen™ | All rights reserved.
Deploy anywhere
Empower your workforce with secure agents

© 2026 Agen™ | All rights reserved.
Deploy anywhere


