Model context protocol (MCP) is quickly growing in popularity as a means for enabling AI assistants to connect and communicate with a range of data sources, tools, and services that can better inform their actions, recommendations, and decisions. The protocol standardizes this communication, thereby laying a stronger foundation for agentic AI.
Acting similar to APIs, MCP servers typically sit in front of a data store or service, making it easier for agents to pull the information they need, when they need it, without customized integration overhead. Companies can use MCP servers to expose their own data to their own AI processes, or to external users, and they can also use pre-built MCP servers to connect to popular services such as PayPal, Zapier, and Shopify.
But enterprises planning to use MCP servers as part of their AI strategies should be aware of the risks they may bring. And there are a lot of risks and potential vulnerabilities to watch out for. Here are the 10 of the most common issues organizations can encounter when employing MCP.
Cross-tenant data exposure
Similar to cross-site scripting attacks, cross-tenant data exposure allows one set of users to access data belonging to another set of users – internal teams, business partners, or customers.
The fact that this vulnerability has already been discovered in the wild, in an MCP server implementation from a tech-savvy company, is a warning sign to any enterprise setting up their MCP servers.
According to UpGuard researchers who discovered this problem, the solution is to ensure that MCP servers enforce strict tenant isolation and least-privilege access.
Living off AI attacks
A threat actor posing as an employee, business partner, or customer sends a request to a human support agent. But the request contains a hidden prompt injection with instructions that only an AI can read. When the human employee passes the request on to their AI assistant it then, by virtue of its link to an MCP server, has access to a tool that connects it to sensitive data and business processes. That access can now be leveraged for malicious purposes.
This is not a theoretical threat but a real one, and one that can affect even a tech-savvy company.
One way to help prevent that is for enterprises to project themselves by enforcing least privilege on AI actions, analyst prompts for suspicious content in real time and maintain audit logs of MCP activity.
Tool poisoning
Setting up an MCP server for the first time can be tricky. Luckily, there’s a ton of ready-to-use ones. MCP servers currently lists more than 15,000 in its directory.
But if you do a Google search and download the first MCP server you find, there’s no guarantee that this server will do what it’s supposed to.
In April, Invariant Labs demonstrated how a malicious MCP server could extract information from other systems, sidestepping encryption and security measures, by adding malicious instructions to the MCP server’s description field.
But it’s not just the description field that can hold malicious instructions, the attack surface extends to all the information generated by MCP servers, which includes items like function names, parameters, parameter defaults, required fields and types. MCP servers also generate other messages, such as error messages or follow-up prompts. These, too, can contain malicious instructions for AI agents to follow.
How do you know if your MCP server download is malicious? First, check the source. Does it come from a trusted organization? Second, look at the permissions it asks for. If its purpose is to provide funny pictures of cats, it doesn’t need access to your file system.
Finally, if you can, check its source code. That can be tricky, but there are already vendors out there that are trying to get a handle on this. BackSlash Security, for example, has already gone through seven thousand publicly available MCP servers and analyzed them for security risks and found instances of both suspicious and outright malicious behaviors.
And it’s not enough to just vet an MCP server once, when it is installed. There’s a well-known attack vector in the software supply chain, where packages are downloaded, used, become trusted and are then updated by bad actors with malicious code.
According to Invariant Labs, which calls this a “rug pull” attack, this can also happen with MCP servers. An MCP server is updated with malicious functionality, then, after it does its evil acts, it’s updated again and nobody is the wiser. “Such breaches could go unnoticed by the victim, with only the attacker aware of the compromise,” CyberArk researcher Nil Ashkenazi stated.
Toxic agent flows via trusted platforms
An AI agent can be manipulated into leaking data or executing malicious code via a MCP server trusted system by adding a prompt injection to a public platform.
Researchers demonstrated how this can work with a Github MCP server. In this attack, the threat actor creates a new issue on the public repository, containing a prompt injection. A company might have a public repository to collect bug reports and its AI agent might then carry out a routine instruction such as checking for open issues on the public repo by using GitHub’s MCP server. Then the AI agent reads the prompt injection — such as an instruction to collect private data in another, private, GitHub repository that it also has access to via that same GitHub MCP server. The GitHub server isn’t directly compromised but t’s used as a conduit to carry out the attacks.
Invariant researchers used Anthropic’s Claude Desktop to demonstrate this attack vector, which, by default, requires users to confirm individual tool calls. “However, many users already opt for an ‘always allow’ confirmation policy when using agents,” the researchers wrote.
Token theft and account takeover
If an attacker is able to obtain the OAuth token stored by an MCP server they can create their own MCP server instance using this stolen token, according to a report from Pillar Security. The OAuth tokens can be stolen if they are stored unencrypted in the MCP server’s config or code files, and the attacker gets access to it via a backdoor, social engineering or other methods.
In Gmail, for example, the attacker would then be able to access the victim’s entire Gmail history, send out new emails that look like they came from the victim, delete emails, search for sensitive information, and set up forwarding rules to monitor future communications.
“Unlike traditional account compromises that might trigger suspicious login notifications, using a stolen token through MCP may appear as legitimate API access, making detection more difficult,” the researchers wrote.
Composability chaining
An unchecked MCP server may have hidden depths. If you download and use a third-party MCP server, and don’t verify where its data comes from, it could be sending requests to a second remote MCP server.
CyberArk calls this MCP server attack vector “composability chaining.” That second MCP server could return valid output plus hidden malicious instructions, the first server merges this with its own responses and sends everything to the AI agent, which then executes the malicious instructions. If you have sensitive data stored in environment variables, it could be exfiltrated by the attackers using this method, without you ever connecting to the malicious MCP server directly.
User consent fatigue
One security guardrail that enterprises frequently implement is to require human approval for actions taken by AI agents. But this can be a double-edged sword. According to Palo Alto Networks, a malicious MCP server might inundate an AI agent and its human user with multiple innocuous requests, such as multiple read permissions.
After a while, the users just start approving them without reading each in detail. At that point, the MCP server slips in a malicious request. “The core idea of this attack is similar to multi-factor authentication fatigue attacks, where users, overwhelmed by continuous authentication prompts, may inadvertently grant access to unauthorized entities,” researchers stated.
One variant of the user consent fatigue attack is the sampling attack, which in the LLM context just means generating text. According to CyberArk, sampling is an advanced MCP feature that allows the MCP server to send a message requesting a response from an LLM. A human is supposed to review the message before it’s passed on to the LLM, but the malicious instructions can be buried deep inside the message where they’re easy to miss.
For example, the malicious MCP server could tell the LLM to grab all the environmental variables it can and send them over. Even if a human-in-the-loop is reviewing the sampling message before it goes to the LLM, the malicious instructions could be buried deep inside a long wall of harmless text. And, on the return trip, if the LLM’s response is also long and complicated then, again, the person might not notice the sensitive information hidden inside.
Admin bypass
In this attack vector, an MCP server is set up so that it doesn’t require identity verification as in the case of a company setting up an MCP server for its directory so that AI agents can easily look up information on behalf of users.
If the user is only allowed low-level access to this information, but the MCP server doesn’t check the identity of the person making the initial request, then the AI agent can grab more than the individual is allowed to know. The request could come from a disgruntled insider, a curious employee looking to see what they can use their AI agent to find out, or an external attacker who’s found some other way into the employee’s environment.
And if this MCP server is also exposed to external users, such as business partners, customers, or even the public, this privilege escalation could result in great damage.
Command injection
If an MCP server passes user input directly to other systems without proper validation, users can inject their own commands in a way similar to how SQL injections work. Attackers could test for command injection vulnerabilities across all tools exposed by an MCP server.
As with other types of injection attacks, MCP servers should never pass user input directly to shell commands, use proper input validation, and parameterized commands.
Tool shadowing
If an AI agent has access to multiple MCP servers one of those servers can trick the agent into using a different server inappropriately. One example is one server that provides general information about medical symptoms and another that has access to the patient billing system.
“The shadowing attack can cause the agent to redirect all patient billing information to the attacker’s email address,” said Christian Posta, global field CTO at security firm Solo.io, in their research.
The MCP server for the billing system is safe, secure, and working as intended. The malicious MCP server doesn’t appear to be doing anything wrong, and the bad behavior might not leave obvious traces in audit logs. But the AI agent suddenly starts emailing patient billing information or sending it out through other seemingly legitimate operations.