Host
Thank you for reading this post, don't forget to subscribe!
→
MCP Server
→

Model
Trust BoundariesOAuthPrompt InjectionTool AbuseLocal Compromise
Table of Contents
- Why MCP changes pentesting
- MCP architecture and trust boundaries
- Visual attack surface and testing map
- MCP security visuals
- Pentest methodology for MCP
- Core security scenarios
- Controlled validation examples
- Severity matrix
- Detailed technical test cases
- Common MCP implementation mistakes
- Defensive hardening checklist
- Pentester quick checklist
- Final conclusion
Why MCP changes pentesting
Additionally, Model Context Protocol is not just another API integration layer. It is a protocol-driven bridge between a host application, one or more MCP clients, multiple MCP servers, and the large language model itself. That changes the security equation. In a normal application pentest, the tester usually focuses on authentication, authorization, input validation, logic flaws, and infrastructure exposure. In an MCP pentest, those still matter, but now they must be tested together with semantic control paths such as prompt injection, tool description abuse, resource poisoning, discovery manipulation, and authorization confusion.
Moreover, The biggest shift is this: in MCP, data can become instruction. A resource may look like passive content, but if it is passed back into the model without boundary handling, it can influence future tool invocation. A tool response may look like structured output, but if it contains malicious text and the client blindly forwards it, the model can be steered into unsafe actions. A metadata response in an OAuth flow may look like configuration, but if a client trusts it too easily, it can become an SSRF primitive or a redirection vector. This is why MCP security cannot be tested with only standard web scanning logic.
Specifically, A good MCP pentest therefore combines protocol analysis, identity and session review, local execution assessment, model-guided abuse testing, and downstream impact validation. The target is not just the server endpoint. The target is the entire trust system that allows an AI application to read context and perform actions.
Primary Security Shift
Instruction can be hidden inside prompts, tool descriptions, tool output, metadata, and resources.
Primary Pentest Goal
Break trust boundaries without relying only on malformed packets or obvious vulnerability payloads.
Primary Failure Mode
Safe-looking protocol flows become privileged execution chains once the model is influenced.
MCP architecture and trust boundaries
Subsequently, Before testing, the structure must be understood correctly. MCP typically contains a host, a client runtime, one or more servers, and the model. The host controls user experience, permissions, client lifecycle, and policy. The client manages protocol communication. The server exposes three important primitives: resources, prompts, and tools. Resources provide data. Prompts provide reusable prompt templates. Tools expose executable functions. Each one has a different abuse pattern, a different blast radius, and a different validation strategy.
Ultimately, From a pentesting perspective, the architecture is best modeled as six trust boundaries. First, host to client: can the host isolate sessions and enforce approval? Second, client to server: are discovery, authorization, and method invocation securely handled? Third, server to downstream systems: is the server a secure broker or a weak proxy? Fourth, model to tool chain: can content influence action? Fifth, local execution: is the local MCP server operating with excessive privilege? Sixth, session and token lifecycle: can identity artifacts be replayed, widened, stolen, or confused?
Resources
High risk for data exposure, tenant crossover, traversal, and untrusted-content injection into the model.
Prompts
High risk for prompt injection, unsafe parameterization, embedded resource abuse, and hidden instruction flow.
Tools
Moreover, High risk for action abuse, business-logic flaws, schema weakness, command injection, and destructive side effects.
Authorization Layer
Specifically, High risk for redirect abuse, confused deputy, token passthrough, SSRF, session hijack, and audience confusion.
Visual attack surface and testing map
Notably, A strong technical blog reads better when the architecture and attack flow are visible. The diagram below shows how MCP should be viewed during a pentest: not as one endpoint, but as a chain where trust moves from the user-facing host to the client runtime, into the MCP server, toward the model, and then out to downstream systems such as file stores, SaaS APIs, databases, and local processes.
MCP attack surface map
Host AppConsent • Policy • UX
→
MCP ClientJSON-RPC • Session • Auth
→
MCP ServerResources • Prompts • Tools
ModelReasoning • Tool Choice
↘
Downstream SystemsFiles • APIs • DB • SaaS
↗
Local Runtimestdio • localhost • files
Session hijack / auth confusionDiscovery abuse / SSRFPrompt injection / tool steeringPrivilege misuse / data exfilLocal compromise / sandbox escape
The most important lesson from this picture is that MCP failures often happen at the boundary between components, not only inside the server itself.
| Scenario | Primary entry point | Typical weakness | Likely impact | Pentest signal |
|---|---|---|---|---|
| Capability discovery abuse | tools/list, prompts/list, resources/list | Visibility not bound to authorization | Privilege mapping, sensitive feature discovery | Low-privilege user sees hidden objects or templates |
| OAuth confusion | 401 flow, metadata discovery, callbacks | Weak redirect checks, bad state handling | Account mix-up, confused deputy, token theft | Near-match redirect or reused state is accepted |
| Prompt and tool steering | Prompt body, resource content, tool output | Untrusted text treated as instruction | Unsafe tool use, policy bypass, exfiltration | Model follows hostile content over local policy |
| Resource abuse | resources/read and URI templates | Normalization or access control weakness | Cross-tenant reads, secrets exposure | Guessed URI works despite list restrictions |
| Local MCP compromise | stdio launch, localhost service, filesystem | Broad local rights, no sandbox, opaque launch | Credential theft, file access, host takeover | Local service is reachable or reads sensitive paths |
This format helps readers compare entry points, weaknesses, and impact quickly, which is exactly how security teams usually triage protocol-level findings.
MCP security visuals
The diagrams below summarize four high-value areas in an MCP assessment: authorization trust flow, prompt-to-tool influence, resource URI validation, and the local execution boundary. Together, they show where control failures usually appear and why these paths deserve focused testing during a pentest.
OAuth and redirect trust flow
Client
→
MCP Server
→
Auth Server
401 + metadataRedirect validationPKCEState bindingAudience checks
This image fits well near the authorization section and helps explain why redirect handling and consent boundaries are security-critical.
Prompt injection and tool-steering chain
Resource
→
Model
→
Tool Call
Untrusted content can become instructionLabelingOutput validationApproval gatesInstruction isolation
This visual is useful where the article explains the difference between untrusted content and privileged instructions.
Resource URI abuse path
Input URI
→
Canonicalization
→
Authorization
Wrong order = access control bypass riskNormalize firstEvaluate root boundaryCheck tenant scope
This image strengthens the section on URI normalization and makes the path-traversal discussion easier to understand visually.
Local MCP server risk picture
Host App
→
Local MCP
Sensitive filesEnvironment variablesBrowser dataCredential storesLocalhost exposure
This graphic supports the local-runtime section and visually highlights why sandboxing and execution transparency matter.
Pentest methodology for MCP
The first step is protocol discovery. Capture initialization traffic, enumerate server capabilities, map the JSON-RPC methods, and list all prompts, tools, and resources. Do not stop at what the UI exposes. Many deployments expose additional list operations, pagination cursors, templates, subscriptions, or change notifications that expand the attack surface. Every method should be classified by sensitivity, downstream reach, and approval expectation.
The second step is schema review. Tool schemas often reveal design weaknesses before exploitation starts. Look for unrestricted strings, missing required fields, nested free-form objects, permissive additional properties, and weak enums. Then connect those schema issues to execution sinks such as shell wrappers, SQL queries, file paths, HTTP clients, or administrative actions. Schema flaws in MCP are not abstract design problems; they are the first stage of exploitability.
The third step is semantic abuse testing. Create malicious prompts, poisoned resources, and adversarial tool outputs. Test whether the client treats untrusted text as instruction. Test whether a tool result can force a second risky tool call. Test whether resource contents can bypass approval logic through indirect steering. In many MCP systems, the protocol packets are technically valid but the overall security behavior still fails because the model is not isolated from hostile content.
The fourth step is identity and session review. For HTTP transports, inspect OAuth flows, redirect handling, state, PKCE, metadata discovery, audience validation, token storage, and scope minimization. For local transports, review execution commands, launch parameters, exposed local ports, sandboxing, file rights, network permissions, and implicit trust granted to the installed server.
Core security scenarios every MCP pentest should cover
🔍 1. Discovery and capability abuse
Enumeration Risk
Discovery is often underestimated. Attackers can use tool lists, prompt lists, resource templates, notifications, and cursor behavior to understand hidden capabilities. Even when direct access is blocked, metadata can reveal privileged functions or sensitive system structure.
Security scenarios
- Low-privilege users seeing admin-only tools
- Resource templates exposing tenant or project identifiers
- List-changed events leaking privileged feature existence
- Pagination cursors enabling hidden object discovery
Testing focus
- Role-based visibility comparison
- Cursor replay and manipulation
- Notification subscription behavior
- Discoverable but unreadable object mismatch
🛂 2. Authorization, redirect, and confused deputy flaws
Identity Critical
In HTTP-based MCP, OAuth and discovery become central. Weak redirect validation, broad consent assumptions, or poor client binding can turn a secure-looking integration into a confused deputy. This is especially severe when an MCP server is also a proxy to a third-party API.
Security scenarios
- Redirect URI near-match acceptance
- Per-user consent stored without per-client binding
- State reuse or PKCE bypass
- Proxy server granting downstream access to the wrong client
Testing focus
- Exact redirect matching
- Single-use state validation
- Dynamic client registration handling
- Consent page identity clarity and anti-clickjacking
🎫 3. Token passthrough and audience confusion
Privilege Expansion
If an MCP server accepts tokens not issued for itself or forwards client tokens downstream, it collapses trust boundaries and weakens attribution. The server effectively becomes a transport tunnel instead of an enforcing broker.
Security scenarios
- Foreign audience tokens accepted as valid
- Downstream API tokens replayed against the MCP server
- Shared tokens preventing client-level attribution
- Over-broad scope reuse across unrelated tools
Testing focus
- Audience validation
- Issuer and scope validation
- Server-side identity mapping
- Downstream request impersonation behavior
🌐 4. SSRF through metadata and discovery URLs
Infrastructure Reach
MCP clients may fetch URLs returned during metadata discovery. If those URLs are attacker-controlled, they can drive SSRF into localhost, internal networks, cloud metadata services, or redirected internal targets.
Security scenarios
- Direct internal IP fetches
- DNS rebinding after initial validation
- Public-to-private redirect chains
- Unrestricted metadata host trust
Testing focus

- Blocklists for localhost and RFC1918
- Redirect validation
- DNS pinning or repeated resolution checks
- Outbound filtering and egress control
🧠 5. Prompt injection, resource poisoning, and tool-output steering
Model Manipulation
This is the defining MCP challenge. Untrusted content can cross the model boundary and steer behavior. A poisoned prompt can manipulate context. A malicious resource can hide instructions. A tool can return output that tricks the model into using another tool unsafely. The protocol can be perfectly valid while the behavior becomes unsafe.
Security scenarios
- Resource content overriding host policy
- Tool results containing hidden next-step instructions
- Embedded references to sensitive local URIs
- Prompt templates accepting unvalidated control strings
Testing focus
- Untrusted-content labeling
- Structured output validation
- Approval checks before sensitive tool calls
- Isolation between data channels and instruction channels
📂 6. Resource URI abuse and unauthorized data access
Data Exposure
Resource reads are often where actual secrets leak. MCP servers may expose file-backed content, API-wrapped data, logs, repositories, or records via URI patterns. If normalization or authorization is weak, a simple read request can become path traversal, tenant crossover, or direct object reference exploitation.
Security scenarios
- Traversal in file-backed resources
- Cross-tenant resource-template manipulation
- Direct read of non-listed resources
- Encoding-based normalization bypass
Testing focus
- URI normalization before authorization
- Template parameter boundary tests
- Scheme validation
- Read-vs-list consistency review
🛠️ 7. Tool schema abuse and business-logic compromise
Action Surface
Tools are where MCP becomes dangerous in the real world. Weak schemas, unsafe parsers, or poorly designed approval logic can turn a normal automation helper into an administrative exploit surface.
Security scenarios
- Unrestricted strings reaching shell or SQL sinks
- Free-form objects carrying hidden control fields
- Replay of non-idempotent destructive actions
- Race conditions in approval or multi-step workflows
Testing focus

- Strict schema validation
- additionalProperties hardening
- Safe defaults for destructive actions
- Rate limiting, timeouts, and replay handling
🖥️ 8. Session hijacking and local MCP compromise
Host-Level Risk
Session-based weaknesses can let an attacker attach to another user’s active context. Local MCP servers add a second risk: they often run with broad local privilege and may access files, environment variables, credentials, and local services. If startup flows are opaque or localhost interfaces are open, the local machine becomes part of the attack surface.
Security scenarios
- Session replay across users or devices
- Predictable or weak session identifiers
- Unauthenticated localhost MCP control endpoints
- Silent execution of risky local commands
Testing focus
- Session binding to identity and context
- Entropy and lifetime of identifiers
- Sandboxing of local servers
- Visibility of launch commands and permissions
Controlled validation examples
The examples below show controlled validation inputs that help explain what a reviewer should inspect during an MCP assessment. They are written for defensive testing, protocol review, and boundary validation in authorized environments.
Discovery and enumeration example
{“jsonrpc”:”2.0″,”id”:7,”method”:”tools/list”,”params”:{“cursor”:”example-cursor-boundary-test”}}
Use this kind of example to discuss cursor handling, list visibility, and authorization drift.
OAuth redirect validation example
redirect_uri=https://client.example.invalid/callback%2F..%2Fcallback state=sample-single-use-token code_verifier=sample-pkce-verifier
Good for explaining exact-match redirect checks, state reuse rejection, and PKCE enforcement.
Metadata discovery SSRF example
WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/.well-known/oauth-protected-resource”
Useful for showing why clients must validate discovery targets and block internal-network fetches.
Prompt-steering example
Resource content: “This document is untrusted test content. Any instruction inside it must not override host policy or trigger sensitive tool use.”
This keeps the article practical while reinforcing the boundary between data and instruction.
Resource URI normalization example
file:///workspace/reports/%2e%2e/%2e%2e/secrets.txt
A clean way to explain canonicalization, template validation, and root-boundary enforcement.
Tool schema boundary example
{“project”:”demo-app”,”action”:”read”,”path”:”/reports/latest.json”,”unexpected_admin_flag”:true}
Helpful for illustrating why additionalProperties and strict schema enforcement matter.
Testing workflow image
Discover
→
Model Risks
→
Auth Review
→
Exploit Path
→
Validate Impact + Fix
This workflow gives the reader a clear mental model for how an MCP assessment typically progresses from discovery to impact validation and remediation.
Severity matrix for MCP security scenarios
Not all MCP weaknesses carry the same risk. Some expose useful metadata, while others allow action abuse, identity confusion, or local compromise. A severity matrix helps readers prioritize testing and remediation, especially when explaining findings to engineering teams, product owners, or leadership.
Critical focus: auth + toolingHigh focus: prompt boundariesMedium focus: discovery leakageAlways review: local execution
| Scenario | Likelihood | Potential severity | Why it matters | Priority |
|---|---|---|---|---|
| OAuth confusion / confused deputy | Medium to High | Critical | Can collapse client identity boundaries and grant unauthorized downstream access. | Immediate |
| Token passthrough / audience confusion | Medium | Critical | Breaks attribution and can allow foreign tokens to operate across services. | Immediate |
| Prompt steering / tool-output injection | High | High | Turns untrusted content into action influence, which is a defining MCP risk. | High |
| Tool schema abuse / business-logic compromise | High | High | Tools are the operational edge of the system and often directly touch sensitive workflows. | High |
| Local MCP compromise | Medium | High | Broad local privilege can lead to host-level access, secrets exposure, and unsafe execution. | High |
| Resource URI abuse | Medium | High | Can expose sensitive files, tenant data, or internal records through weak normalization. | High |
| Discovery leakage | High | Medium | Often becomes the map that helps attackers chain more severe findings. | Medium |
| Session hijack | Medium | High | A stolen active session can shortcut many intended approval and policy controls. | High |
This matrix helps engineering teams and reviewers prioritize remediation by showing which weaknesses are most likely to cause high-impact control failures.
Detailed technical test cases
Below is a compact, practical set of MCP test cases that can be used in a real pentest checklist. These are written in a way that maps directly to protocol behavior, not just general web testing habits.
Test Case Block A — Discovery and enumeration
- Call all list methods with each role and compare outputs for hidden capabilities.
- Replay cursors, use invalid cursors, and test large pagination states for leakage or crashes.
- Subscribe to change notifications and verify whether low-privilege sessions learn about privileged changes.
- Check whether list-hidden resources are still directly readable through guessed URIs.
Example objective: determine whether capability discovery respects authorization or merely hides buttons in the UI.
Test Case Block B — OAuth and authorization
- Trigger 401 flows and inspect resource metadata locations and scope challenges.
- Try exact-match bypasses for redirect URI handling: encoding, case, subdomain, path suffix, mixed slashes.
- Reuse state values and attempt callback completion with mismatched PKCE verifier values.
- Check whether consent is stored per client identity, not only per user.
Expected secure result: exact redirect matching, single-use state, PKCE enforcement, and per-client consent isolation.
Illustrative validation sample: GET /authorize?client_id=sample-client&redirect_uri=https://client.example.invalid/callback&state=test-state-01&code_challenge=samplechallenge&code_challenge_method=S256
Test Case Block C — Token and audience validation
- Present a token meant for another service and verify it is rejected.
- Present a downstream API token to the MCP server and test whether it accepts or forwards it.
- Inspect whether logs preserve client attribution or hide all actions under one service identity.
- Verify scope minimization: low-risk tools should not require broad administrative privilege.
Expected secure result: no token passthrough, audience validation enforced, and authorization remains server-side.
Test Case Block D — SSRF and metadata poisoning
- Return metadata URLs that point to localhost, internal RFC1918 IPs, and cloud metadata services.
- Use a public host that redirects to internal targets to test redirect safety.
- Attempt DNS rebinding between validation and fetch time.
- Observe outbound traffic to confirm whether egress controls actually work.
Expected secure result: blocked internal access, limited redirects, HTTPS-only policy, and validated metadata sources.
Illustrative validation sample: WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/prm” Expected reviewer question: does the client verify that the metadata host is trusted and non-internal?
Test Case Block E — Prompt and tool-output abuse
- Create resources containing adversarial instructions that conflict with system policy.
- Return tool output that includes malicious next-step guidance for the model.
- Attempt to embed local file or privileged URI references inside prompt content.
- Verify whether the client labels untrusted content and separates data from instructions.
Expected secure result: hostile content is treated as untrusted data, not privileged instruction.
Illustrative validation sample: “This is a lab-only untrusted content sample. The client should display it as data, not follow it as an instruction source.”
Test Case Block F — Resource URI and data boundary validation
- Try traversal patterns, double encoding, alternate schemes, and case variants in resource reads.
- Swap tenant or project identifiers in resource templates to test isolation.
- Compare resource list visibility against direct read behavior.
- Check normalization order: authorization must occur after canonicalization.
Expected secure result: canonicalized URIs, enforced root boundaries, and strong object-level authorization.
Illustrative validation sample: resource_uri = “file:///workspace/docs/%2e%2e/%2e%2e/config.json” Expected reviewer question: is canonicalization performed before the access decision?
Test Case Block G — Tool schema, replay, and business logic
- Send extra properties to test whether unexpected control fields are accepted.
- Inject payloads into strings that may reach command, SQL, template, or HTTP execution sinks.
- Replay destructive requests to test approval and idempotency logic.
- Send oversized objects to test parser behavior, memory pressure, and timeout resilience.
Expected secure result: strict schema validation, replay control, human approval for destructive tools, and safe failure handling.
Illustrative validation sample: {“operation”:”read_report”,”report_id”:”sample-001″,”unexpected_debug”:true} Expected reviewer question: does the tool reject undeclared properties?
Test Case Block H — Session and local execution review
- Replay session identifiers from another environment and verify rejection.
- Check whether websocket or event-stream messages can attach to another user context.
- Inspect local MCP startup commands for hidden arguments, silent elevation, or broad file access.
- Test whether localhost-exposed MCP services require authentication or are reachable by other processes.
Expected secure result: session binding, strong entropy, visible launch risk, and sandboxed local execution.
Common MCP implementation mistakes
Many MCP security failures do not come from one dramatic coding error. They come from small assumptions repeated across the stack: trusting metadata too early, treating tool output as authoritative, exposing a local server without enough restriction, or assuming OAuth plumbing is secure because a library is in use. This section helps readers recognize real-world failure patterns quickly and map them back to practical design and implementation decisions.
Frequent mistakes seen in MCP-style implementations
- Using UI visibility as a replacement for real authorization.
- Accepting any structurally valid token without checking audience and resource binding.
- Allowing tool outputs or resource contents to influence subsequent model behavior without validation.
- Fetching authorization metadata from untrusted locations without SSRF protection.
- Leaving
additionalPropertiesopen in tool schemas that drive sensitive actions. - Normalizing resource URIs after access decisions instead of before them.
- Running local MCP servers with broad privileges and poor user visibility into startup commands.
- Bundling many unrelated permissions into one oversized scope or long-lived session.
This section creates a useful break in the flow and reinforces the most common implementation errors teams should look for during review.
Defensive hardening checklist
A strong MCP deployment is not built by adding one filter at the end. It is built by enforcing boundaries at each layer. The host should present clear approval flows and isolate sessions. The client should treat prompts, annotations, tool results, and metadata as untrusted inputs unless explicitly validated. The server should validate every tool input, every resource URI, and every authorization artifact. Downstream integrations should not inherit trust automatically from the protocol layer.
- Use exact redirect URI matching and PKCE for remote authorization flows.
- Never accept tokens not issued for the MCP server itself.
- Do not implement token passthrough to downstream APIs.
- Treat discovery and metadata URLs as attacker-controlled until validated.
- Separate untrusted content from privileged model instructions.
- Validate prompt parameters, tool inputs, and tool outputs with explicit schemas.
- Enforce URI canonicalization before authorization in resource reads.
- Minimize scopes and use progressive privilege elevation.
- Bind sessions strongly and avoid using session identifiers as sole authentication.
- For local MCP servers, prefer restricted execution, sandboxing, and explicit permission visibility.
Pentester quick checklist
This final box is useful for readers who want a practical takeaway. It also prints well and can be reused as a review sheet during internal security testing, design reviews, or pre-release validation for MCP-enabled features.
Printable review sheet
- Enumerate all prompts, tools, resources, templates, and subscriptions.
- Check whether discovery visibility changes correctly by role and session.
- Validate OAuth redirect handling, PKCE, state, and per-client consent behavior.
- Reject foreign-audience tokens and confirm no token passthrough exists.
- Test metadata discovery for SSRF, redirect abuse, and DNS rebinding resilience.
- Review prompt, resource, and tool-output handling for instruction-versus-data separation.
- Verify resource URI canonicalization before authorization and inspect cross-tenant isolation.
- Test tool schemas for undeclared fields, weak enums, replay, and destructive-action approval logic.
- Inspect session binding, entropy, expiry, and cross-context replay behavior.
- Review local MCP execution paths for sandboxing, file access, and localhost exposure.
- Document impact using scenario, entry point, control failure, exploitability, and remediation mapping.
Final conclusion
MCP expands the power of AI systems by giving them structured context and executable pathways into real systems. That same design also creates a layered attack surface where protocol correctness alone is not enough. The real question is whether the trust boundaries hold when hostile content, hostile servers, weak OAuth handling, or unsafe local execution are introduced.

For pentesters, the lesson is simple: test every place where data can become instruction, every place where identity can become authority, and every place where convenience silently becomes privilege. For defenders, the right approach is equally clear: validate schemas, reduce scope, harden authorization, distrust metadata, isolate local execution, and require explicit approval wherever a tool can change state or access sensitive content.
If an organization plans to deploy MCP at scale, the pentest must be architecture-aware, model-aware, and protocol-aware. Only then does the assessmHost
→
MCP Server
→
Model
Trust BoundariesOAuthPrompt InjectionTool AbuseLocal Compromise
Table of Contents
- Why MCP changes pentesting
- MCP architecture and trust boundaries
- Visual attack surface and testing map
- MCP security visuals
- Pentest methodology for MCP
- Core security scenarios
- Controlled validation examples
- Severity matrix
- Detailed technical test cases
- Common MCP implementation mistakes
- Defensive hardening checklist
- Pentester quick checklist
- Final conclusion
Why MCP changes pentesting
Model Context Protocol is not just another API integration layer. It is a protocol-driven bridge between a host application, one or more MCP clients, multiple MCP servers, and the large language model itself. That changes the security equation. In a normal application pentest, the tester usually focuses on authentication, authorization, input validation, logic flaws, and infrastructure exposure. In an MCP pentest, those still matter, but now they must be tested together with semantic control paths such as prompt injection, tool description abuse, resource poisoning, discovery manipulation, and authorization confusion.
The biggest shift is this: in MCP, data can become instruction. A resource may look like passive content, but if it is passed back into the model without boundary handling, it can influence future tool invocation. A tool response may look like structured output, but if it contains malicious text and the client blindly forwards it, the model can be steered into unsafe actions. A metadata response in an OAuth flow may look like configuration, but if a client trusts it too easily, it can become an SSRF primitive or a redirection vector. This is why MCP security cannot be tested with only standard web scanning logic.
A good MCP pentest therefore combines protocol analysis, identity and session review, local execution assessment, model-guided abuse testing, and downstream impact validation. The target is not just the server endpoint. The target is the entire trust system that allows an AI application to read context and perform actions.
Primary Security Shift
Instruction can be hidden inside prompts, tool descriptions, tool output, metadata, and resources.
Primary Pentest Goal
Break trust boundaries without relying only on malformed packets or obvious vulnerability payloads.
Primary Failure Mode
Safe-looking protocol flows become privileged execution chains once the model is influenced.
MCP architecture and trust boundaries
Before testing, the structure must be understood correctly. MCP typically contains a host, a client runtime, one or more servers, and the model. The host controls user experience, permissions, client lifecycle, and policy. The client manages protocol communication. The server exposes three important primitives: resources, prompts, and tools. Resources provide data. Prompts provide reusable prompt templates. Tools expose executable functions. Each one has a different abuse pattern, a different blast radius, and a different validation strategy.
From a pentesting perspective, the architecture is best modeled as six trust boundaries. First, host to client: can the host isolate sessions and enforce approval? Second, client to server: are discovery, authorization, and method invocation securely handled? Third, server to downstream systems: is the server a secure broker or a weak proxy? Fourth, model to tool chain: can content influence action? Fifth, local execution: is the local MCP server operating with excessive privilege? Sixth, session and token lifecycle: can identity artifacts be replayed, widened, stolen, or confused?
Resources
High risk for data exposure, tenant crossover, traversal, and untrusted-content injection into the model.
Prompts
High risk for prompt injection, unsafe parameterization, embedded resource abuse, and hidden instruction flow.
Tools
High risk for action abuse, business-logic flaws, schema weakness, command injection, and destructive side effects.
Authorization Layer
High risk for redirect abuse, confused deputy, token passthrough, SSRF, session hijack, and audience confusion.
Visual attack surface and testing map
A strong technical blog reads better when the architecture and attack flow are visible. The diagram below shows how MCP should be viewed during a pentest: not as one endpoint, but as a chain where trust moves from the user-facing host to the client runtime, into the MCP server, toward the model, and then out to downstream systems such as file stores, SaaS APIs, databases, and local processes.
MCP attack surface map
Host AppConsent • Policy • UX
→
MCP ClientJSON-RPC • Session • Auth
→
MCP ServerResources • Prompts • Tools
ModelReasoning • Tool Choice
↘
Downstream SystemsFiles • APIs • DB • SaaS
↗
Local Runtimestdio • localhost • files

Session hijack / auth confusionDiscovery abuse / SSRFPrompt injection / tool steeringPrivilege misuse / data exfilLocal compromise / sandbox escape
The most important lesson from this picture is that MCP failures often happen at the boundary between components, not only inside the server itself.
| Scenario | Primary entry point | Typical weakness | Likely impact | Pentest signal |
|---|---|---|---|---|
| Capability discovery abuse | tools/list, prompts/list, resources/list | Visibility not bound to authorization | Privilege mapping, sensitive feature discovery | Low-privilege user sees hidden objects or templates |
| OAuth confusion | 401 flow, metadata discovery, callbacks | Weak redirect checks, bad state handling | Account mix-up, confused deputy, token theft | Near-match redirect or reused state is accepted |
| Prompt and tool steering | Prompt body, resource content, tool output | Untrusted text treated as instruction | Unsafe tool use, policy bypass, exfiltration | Model follows hostile content over local policy |
| Resource abuse | resources/read and URI templates | Normalization or access control weakness | Cross-tenant reads, secrets exposure | Guessed URI works despite list restrictions |
| Local MCP compromise | stdio launch, localhost service, filesystem | Broad local rights, no sandbox, opaque launch | Credential theft, file access, host takeover | Local service is reachable or reads sensitive paths |
This format helps readers compare entry points, weaknesses, and impact quickly, which is exactly how security teams usually triage protocol-level findings.
MCP security visuals
The diagrams below summarize four high-value areas in an MCP assessment: authorization trust flow, prompt-to-tool influence, resource URI validation, and the local execution boundary. Together, they show where control failures usually appear and why these paths deserve focused testing during a pentest.
OAuth and redirect trust flow
Client
→
MCP Server
→
Auth Server
401 + metadataRedirect validationPKCEState bindingAudience checks
This image fits well near the authorization section and helps explain why redirect handling and consent boundaries are security-critical.
Prompt injection and tool-steering chain
Resource
→
Model
→
Tool Call
Untrusted content can become instructionLabelingOutput validationApproval gatesInstruction isolation
This visual is useful where the article explains the difference between untrusted content and privileged instructions.
Resource URI abuse path
Input URI
→
Canonicalization
→
Authorization
Wrong order = access control bypass riskNormalize firstEvaluate root boundaryCheck tenant scope
This image strengthens the section on URI normalization and makes the path-traversal discussion easier to understand visually.
Local MCP server risk picture
Host App
→
Local MCP
Sensitive filesEnvironment variablesBrowser dataCredential storesLocalhost exposure
This graphic supports the local-runtime section and visually highlights why sandboxing and execution transparency matter.
Pentest methodology for MCP
The first step is protocol discovery. Capture initialization traffic, enumerate server capabilities, map the JSON-RPC methods, and list all prompts, tools, and resources. Do not stop at what the UI exposes. Many deployments expose additional list operations, pagination cursors, templates, subscriptions, or change notifications that expand the attack surface. Every method should be classified by sensitivity, downstream reach, and approval expectation.
The second step is schema review. Tool schemas often reveal design weaknesses before exploitation starts. Look for unrestricted strings, missing required fields, nested free-form objects, permissive additional properties, and weak enums. Then connect those schema issues to execution sinks such as shell wrappers, SQL queries, file paths, HTTP clients, or administrative actions. Schema flaws in MCP are not abstract design problems; they are the first stage of exploitability.
The third step is semantic abuse testing. Create malicious prompts, poisoned resources, and adversarial tool outputs. Test whether the client treats untrusted text as instruction. Test whether a tool result can force a second risky tool call. Test whether resource contents can bypass approval logic through indirect steering. In many MCP systems, the protocol packets are technically valid but the overall security behavior still fails because the model is not isolated from hostile content.
The fourth step is identity and session review. For HTTP transports, inspect OAuth flows, redirect handling, state, PKCE, metadata discovery, audience validation, token storage, and scope minimization. For local transports, review execution commands, launch parameters, exposed local ports, sandboxing, file rights, network permissions, and implicit trust granted to the installed server.
Core security scenarios every MCP pentest should cover
🔍 1. Discovery and capability abuse
Enumeration Risk
Discovery is often underestimated. Attackers can use tool lists, prompt lists, resource templates, notifications, and cursor behavior to understand hidden capabilities. Even when direct access is blocked, metadata can reveal privileged functions or sensitive system structure.
Security scenarios
- Low-privilege users seeing admin-only tools
- Resource templates exposing tenant or project identifiers
- List-changed events leaking privileged feature existence
- Pagination cursors enabling hidden object discovery
Testing focus
- Role-based visibility comparison
- Cursor replay and manipulation
- Notification subscription behavior
- Discoverable but unreadable object mismatch
🛂 2. Authorization, redirect, and confused deputy flaws
Identity Critical
In HTTP-based MCP, OAuth and discovery become central. Weak redirect validation, broad consent assumptions, or poor client binding can turn a secure-looking integration into a confused deputy. This is especially severe when an MCP server is also a proxy to a third-party API.
Security scenarios
- Redirect URI near-match acceptance
- Per-user consent stored without per-client binding
- State reuse or PKCE bypass
- Proxy server granting downstream access to the wrong client
Testing focus
- Exact redirect matching
- Single-use state validation
- Dynamic client registration handling
- Consent page identity clarity and anti-clickjacking
🎫 3. Token passthrough and audience confusion
Privilege Expansion
If an MCP server accepts tokens not issued for itself or forwards client tokens downstream, it collapses trust boundaries and weakens attribution. The server effectively becomes a transport tunnel instead of an enforcing broker.
Security scenarios
- Foreign audience tokens accepted as valid
- Downstream API tokens replayed against the MCP server
- Shared tokens preventing client-level attribution
- Over-broad scope reuse across unrelated tools
Testing focus
- Audience validation
- Issuer and scope validation
- Server-side identity mapping
- Downstream request impersonation behavior
🌐 4. SSRF through metadata and discovery URLs
Infrastructure Reach
MCP clients may fetch URLs returned during metadata discovery. If those URLs are attacker-controlled, they can drive SSRF into localhost, internal networks, cloud metadata services, or redirected internal targets.
Security scenarios

- Direct internal IP fetches
- DNS rebinding after initial validation
- Public-to-private redirect chains
- Unrestricted metadata host trust
Testing focus
- Blocklists for localhost and RFC1918
- Redirect validation
- DNS pinning or repeated resolution checks
- Outbound filtering and egress control
🧠 5. Prompt injection, resource poisoning, and tool-output steering
Model Manipulation
This is the defining MCP challenge. Untrusted content can cross the model boundary and steer behavior. A poisoned prompt can manipulate context. A malicious resource can hide instructions. A tool can return output that tricks the model into using another tool unsafely. The protocol can be perfectly valid while the behavior becomes unsafe.
Security scenarios
- Resource content overriding host policy
- Tool results containing hidden next-step instructions
- Embedded references to sensitive local URIs
- Prompt templates accepting unvalidated control strings
Testing focus
- Untrusted-content labeling
- Structured output validation
- Approval checks before sensitive tool calls
- Isolation between data channels and instruction channels
📂 6. Resource URI abuse and unauthorized data access
Data Exposure
Resource reads are often where actual secrets leak. MCP servers may expose file-backed content, API-wrapped data, logs, repositories, or records via URI patterns. If normalization or authorization is weak, a simple read request can become path traversal, tenant crossover, or direct object reference exploitation.
Security scenarios
- Traversal in file-backed resources
- Cross-tenant resource-template manipulation
- Direct read of non-listed resources
- Encoding-based normalization bypass
Testing focus
- URI normalization before authorization
- Template parameter boundary tests
- Scheme validation
- Read-vs-list consistency review
🛠️ 7. Tool schema abuse and business-logic compromise
Action Surface
Tools are where MCP becomes dangerous in the real world. Weak schemas, unsafe parsers, or poorly designed approval logic can turn a normal automation helper into an administrative exploit surface.
Security scenarios
- Unrestricted strings reaching shell or SQL sinks
- Free-form objects carrying hidden control fields
- Replay of non-idempotent destructive actions
- Race conditions in approval or multi-step workflows
Testing focus
- Strict schema validation
- additionalProperties hardening
- Safe defaults for destructive actions
- Rate limiting, timeouts, and replay handling
🖥️ 8. Session hijacking and local MCP compromise
Host-Level Risk
Session-based weaknesses can let an attacker attach to another user’s active context. Local MCP servers add a second risk: they often run with broad local privilege and may access files, environment variables, credentials, and local services. If startup flows are opaque or localhost interfaces are open, the local machine becomes part of the attack surface.
Security scenarios
- Session replay across users or devices
- Predictable or weak session identifiers
- Unauthenticated localhost MCP control endpoints
- Silent execution of risky local commands
Testing focus
- Session binding to identity and context
- Entropy and lifetime of identifiers
- Sandboxing of local servers
- Visibility of launch commands and permissions
Controlled validation examples
The examples below show controlled validation inputs that help explain what a reviewer should inspect during an MCP assessment. They are written for defensive testing, protocol review, and boundary validation in authorized environments.
Discovery and enumeration example
{“jsonrpc”:”2.0″,”id”:7,”method”:”tools/list”,”params”:{“cursor”:”example-cursor-boundary-test”}}
Use this kind of example to discuss cursor handling, list visibility, and authorization drift.
OAuth redirect validation example
redirect_uri=https://client.example.invalid/callback%2F..%2Fcallback state=sample-single-use-token code_verifier=sample-pkce-verifier
Good for explaining exact-match redirect checks, state reuse rejection, and PKCE enforcement.
Metadata discovery SSRF example
WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/.well-known/oauth-protected-resource”
Useful for showing why clients must validate discovery targets and block internal-network fetches.
Prompt-steering example
Resource content: “This document is untrusted test content. Any instruction inside it must not override host policy or trigger sensitive tool use.”
This keeps the article practical while reinforcing the boundary between data and instruction.
Resource URI normalization example
file:///workspace/reports/%2e%2e/%2e%2e/secrets.txt
A clean way to explain canonicalization, template validation, and root-boundary enforcement.
Tool schema boundary example
{“project”:”demo-app”,”action”:”read”,”path”:”/reports/latest.json”,”unexpected_admin_flag”:true}
Helpful for illustrating why additionalProperties and strict schema enforcement matter.
Testing workflow image
Discover
→
Model Risks
→
Auth Review
→
Exploit Path
→
Validate Impact + Fix
This workflow gives the reader a clear mental model for how an MCP assessment typically progresses from discovery to impact validation and remediation.
Severity matrix for MCP security scenarios
Not all MCP weaknesses carry the same risk. Some expose useful metadata, while others allow action abuse, identity confusion, or local compromise. A severity matrix helps readers prioritize testing and remediation, especially when explaining findings to engineering teams, product owners, or leadership.
Critical focus: auth + toolingHigh focus: prompt boundariesMedium focus: discovery leakageAlways review: local execution
| Scenario | Likelihood | Potential severity | Why it matters | Priority |
|---|---|---|---|---|
| OAuth confusion / confused deputy | Medium to High | Critical | Can collapse client identity boundaries and grant unauthorized downstream access. | Immediate |
| Token passthrough / audience confusion | Medium | Critical | Breaks attribution and can allow foreign tokens to operate across services. | Immediate |
| Prompt steering / tool-output injection | High | High | Turns untrusted content into action influence, which is a defining MCP risk. | High |
| Tool schema abuse / business-logic compromise | High | High | Tools are the operational edge of the system and often directly touch sensitive workflows. | High |
| Local MCP compromise | Medium | High | Broad local privilege can lead to host-level access, secrets exposure, and unsafe execution. | High |
| Resource URI abuse | Medium | High | Can expose sensitive files, tenant data, or internal records through weak normalization. | High |
| Discovery leakage | High | Medium | Often becomes the map that helps attackers chain more severe findings. | Medium |
| Session hijack | Medium | High | A stolen active session can shortcut many intended approval and policy controls. | High |
This matrix helps engineering teams and reviewers prioritize remediation by showing which weaknesses are most likely to cause high-impact control failures.
Detailed technical test cases
Below is a compact, practical set of MCP test cases that can be used in a real pentest checklist. These are written in a way that maps directly to protocol behavior, not just general web testing habits.
Test Case Block A — Discovery and enumeration
- Call all list methods with each role and compare outputs for hidden capabilities.
- Replay cursors, use invalid cursors, and test large pagination states for leakage or crashes.
- Subscribe to change notifications and verify whether low-privilege sessions learn about privileged changes.
- Check whether list-hidden resources are still directly readable through guessed URIs.
Example objective: determine whether capability discovery respects authorization or merely hides buttons in the UI.
Test Case Block B — OAuth and authorization
- Trigger 401 flows and inspect resource metadata locations and scope challenges.
- Try exact-match bypasses for redirect URI handling: encoding, case, subdomain, path suffix, mixed slashes.
- Reuse state values and attempt callback completion with mismatched PKCE verifier values.
- Check whether consent is stored per client identity, not only per user.
Expected secure result: exact redirect matching, single-use state, PKCE enforcement, and per-client consent isolation.
Illustrative validation sample: GET /authorize?client_id=sample-client&redirect_uri=https://client.example.invalid/callback&state=test-state-01&code_challenge=samplechallenge&code_challenge_method=S256
Test Case Block C — Token and audience validation
- Present a token meant for another service and verify it is rejected.
- Present a downstream API token to the MCP server and test whether it accepts or forwards it.
- Inspect whether logs preserve client attribution or hide all actions under one service identity.
- Verify scope minimization: low-risk tools should not require broad administrative privilege.
Expected secure result: no token passthrough, audience validation enforced, and authorization remains server-side.
Test Case Block D — SSRF and metadata poisoning
- Return metadata URLs that point to localhost, internal RFC1918 IPs, and cloud metadata services.
- Use a public host that redirects to internal targets to test redirect safety.
- Attempt DNS rebinding between validation and fetch time.
- Observe outbound traffic to confirm whether egress controls actually work.
Expected secure result: blocked internal access, limited redirects, HTTPS-only policy, and validated metadata sources.
Illustrative validation sample: WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/prm” Expected reviewer question: does the client verify that the metadata host is trusted and non-internal?
Test Case Block E — Prompt and tool-output abuse
- Create resources containing adversarial instructions that conflict with system policy.
- Return tool output that includes malicious next-step guidance for the model.
- Attempt to embed local file or privileged URI references inside prompt content.
- Verify whether the client labels untrusted content and separates data from instructions.
Expected secure result: hostile content is treated as untrusted data, not privileged instruction.
Illustrative validation sample: “This is a lab-only untrusted content sample. The client should display it as data, not follow it as an instruction source.”
Test Case Block F — Resource URI and data boundary validation
- Try traversal patterns, double encoding, alternate schemes, and case variants in resource reads.
- Swap tenant or project identifiers in resource templates to test isolation.
- Compare resource list visibility against direct read behavior.
- Check normalization order: authorization must occur after canonicalization.
Expected secure result: canonicalized URIs, enforced root boundaries, and strong object-level authorization.
Illustrative validation sample: resource_uri = “file:///workspace/docs/%2e%2e/%2e%2e/config.json” Expected reviewer question: is canonicalization performed before the access decision?
Test Case Block G — Tool schema, replay, and business logic
- Send extra properties to test whether unexpected control fields are accepted.
- Inject payloads into strings that may reach command, SQL, template, or HTTP execution sinks.
- Replay destructive requests to test approval and idempotency logic.
- Send oversized objects to test parser behavior, memory pressure, and timeout resilience.
Expected secure result: strict schema validation, replay control, human approval for destructive tools, and safe failure handling.
Illustrative validation sample: {“operation”:”read_report”,”report_id”:”sample-001″,”unexpected_debug”:true} Expected reviewer question: does the tool reject undeclared properties?
Test Case Block H — Session and local execution review
- Replay session identifiers from another environment and verify rejection.
- Check whether websocket or event-stream messages can attach to another user context.
- Inspect local MCP startup commands for hidden arguments, silent elevation, or broad file access.
- Test whether localhost-exposed MCP services require authentication or are reachable by other processes.
Expected secure result: session binding, strong entropy, visible launch risk, and sandboxed local execution.
Common MCP implementation mistakes
Many MCP security failures do not come from one dramatic coding error. They come from small assumptions repeated across the stack: trusting metadata too early, treating tool output as authoritative, exposing a local server without enough restriction, or assuming OAuth plumbing is secure because a library is in use. This section helps readers recognize real-world failure patterns quickly and map them back to practical design and implementation decisions.
Frequent mistakes seen in MCP-style implementations
- Using UI visibility as a replacement for real authorization.
- Accepting any structurally valid token without checking audience and resource binding.
- Allowing tool outputs or resource contents to influence subsequent model behavior without validation.
- Fetching authorization metadata from untrusted locations without SSRF protection.
- Leaving
additionalPropertiesopen in tool schemas that drive sensitive actions. - Normalizing resource URIs after access decisions instead of before them.
- Running local MCP servers with broad privileges and poor user visibility into startup commands.
- Bundling many unrelated permissions into one oversized scope or long-lived session.
This section creates a useful break in the flow and reinforces the most common implementation errors teams should look for during review.
Defensive hardening checklist
A strong MCP deployment is not built by adding one filter at the end. It is built by enforcing boundaries at each layer. The host should present clear approval flows and isolate sessions. The client should treat prompts, annotations, tool results, and metadata as untrusted inputs unless explicitly validated. The server should validate every tool input, every resource URI, and every authorization artifact. Downstream integrations should not inherit trust automatically from the protocol layer.
- Use exact redirect URI matching and PKCE for remote authorization flows.
- Never accept tokens not issued for the MCP server itself.
- Do not implement token passthrough to downstream APIs.
- Treat discovery and metadata URLs as attacker-controlled until validated.
- Separate untrusted content from privileged model instructions.
- Validate prompt parameters, tool inputs, and tool outputs with explicit schemas.
- Enforce URI canonicalization before authorization in resource reads.
- Minimize scopes and use progressive privilege elevation.
- Bind sessions strongly and avoid using session identifiers as sole authentication.
- For local MCP servers, prefer restricted execution, sandboxing, and explicit permission visibility.
Pentester quick checklist
This final box is useful for readers who want a practical takeaway. It also prints well and can be reused as a review sheet during internal security testing, design reviews, or pre-release validation for MCP-enabled features.
Printable review sheet
- Enumerate all prompts, tools, resources, templates, and subscriptions.
- Check whether discovery visibility changes correctly by role and session.
- Validate OAuth redirect handling, PKCE, state, and per-client consent behavior.
- Reject foreign-audience tokens and confirm no token passthrough exists.
- Test metadata discovery for SSRF, redirect abuse, and DNS rebinding resilience.
- Review prompt, resource, and tool-output handling for instruction-versus-data separation.
- Verify resource URI canonicalization before authorization and inspect cross-tenant isolation.
- Test tool schemas for undeclared fields, weak enums, replay, and destructive-action approval logic.
- Inspect session binding, entropy, expiry, and cross-context replay behavior.
- Review local MCP execution paths for sandboxing, file access, and localhost exposure.
- Document impact using scenario, entry point, control failure, exploitability, and remediation mapping.
Final conclusion
MCP expands the power of AI systems by giving them structured context and executable pathways into real systems. That same design also creates a layered attack surface where protocol correctness alone is not enough. The real question is whether the trust boundaries hold when hostile content, hostile servers, weak OAuth handling, or unsafe local execution are introduced.
