Model Context Protocol (MCP) Security and Pentesting: Threats, Test Cases, and Hardening

Host

Thank you for reading this post, don't forget to subscribe!

MCP Server

MCP Architecture showing client-server communication with security boundaries
Figure 1: MCP Architecture – Client-Server Communication Flow

MCP Security Architecture - Client-Server Communication Flow
Figure 1: MCP Architecture showing secure client-server communication channels

Model

Trust BoundariesOAuthPrompt InjectionTool AbuseLocal Compromise

Table of Contents

Why MCP changes pentesting

Additionally, Model Context Protocol is not just another API integration layer. It is a protocol-driven bridge between a host application, one or more MCP clients, multiple MCP servers, and the large language model itself. That changes the security equation. In a normal application pentest, the tester usually focuses on authentication, authorization, input validation, logic flaws, and infrastructure exposure. In an MCP pentest, those still matter, but now they must be tested together with semantic control paths such as prompt injection, tool description abuse, resource poisoning, discovery manipulation, and authorization confusion.

Moreover, The biggest shift is this: in MCP, data can become instruction. A resource may look like passive content, but if it is passed back into the model without boundary handling, it can influence future tool invocation. A tool response may look like structured output, but if it contains malicious text and the client blindly forwards it, the model can be steered into unsafe actions. A metadata response in an OAuth flow may look like configuration, but if a client trusts it too easily, it can become an SSRF primitive or a redirection vector. This is why MCP security cannot be tested with only standard web scanning logic.

Specifically, A good MCP pentest therefore combines protocol analysis, identity and session review, local execution assessment, model-guided abuse testing, and downstream impact validation. The target is not just the server endpoint. The target is the entire trust system that allows an AI application to read context and perform actions.

Primary Security Shift

Instruction can be hidden inside prompts, tool descriptions, tool output, metadata, and resources.

Primary Pentest Goal

Break trust boundaries without relying only on malformed packets or obvious vulnerability payloads.

Primary Failure Mode

Safe-looking protocol flows become privileged execution chains once the model is influenced.

MCP architecture and trust boundaries

Subsequently, Before testing, the structure must be understood correctly. MCP typically contains a host, a client runtime, one or more servers, and the model. The host controls user experience, permissions, client lifecycle, and policy. The client manages protocol communication. The server exposes three important primitives: resources, prompts, and tools. Resources provide data. Prompts provide reusable prompt templates. Tools expose executable functions. Each one has a different abuse pattern, a different blast radius, and a different validation strategy.

Ultimately, From a pentesting perspective, the architecture is best modeled as six trust boundaries. First, host to client: can the host isolate sessions and enforce approval? Second, client to server: are discovery, authorization, and method invocation securely handled? Third, server to downstream systems: is the server a secure broker or a weak proxy? Fourth, model to tool chain: can content influence action? Fifth, local execution: is the local MCP server operating with excessive privilege? Sixth, session and token lifecycle: can identity artifacts be replayed, widened, stolen, or confused?

Resources

High risk for data exposure, tenant crossover, traversal, and untrusted-content injection into the model.

Prompts

High risk for prompt injection, unsafe parameterization, embedded resource abuse, and hidden instruction flow.

Tools

Moreover, High risk for action abuse, business-logic flaws, schema weakness, command injection, and destructive side effects.

Authorization Layer

Specifically, High risk for redirect abuse, confused deputy, token passthrough, SSRF, session hijack, and audience confusion.

Visual attack surface and testing map

Notably, A strong technical blog reads better when the architecture and attack flow are visible. The diagram below shows how MCP should be viewed during a pentest: not as one endpoint, but as a chain where trust moves from the user-facing host to the client runtime, into the MCP server, toward the model, and then out to downstream systems such as file stores, SaaS APIs, databases, and local processes.

MCP attack surface map

Host AppConsent • Policy • UX

MCP ClientJSON-RPC • Session • Auth

MCP ServerResources • Prompts • Tools

ModelReasoning • Tool Choice

Downstream SystemsFiles • APIs • DB • SaaS

Local Runtimestdio • localhost • files

Session hijack / auth confusionDiscovery abuse / SSRFPrompt injection / tool steeringPrivilege misuse / data exfilLocal compromise / sandbox escape

The most important lesson from this picture is that MCP failures often happen at the boundary between components, not only inside the server itself.

ScenarioPrimary entry pointTypical weaknessLikely impactPentest signal
Capability discovery abusetools/list, prompts/list, resources/listVisibility not bound to authorizationPrivilege mapping, sensitive feature discoveryLow-privilege user sees hidden objects or templates
OAuth confusion401 flow, metadata discovery, callbacksWeak redirect checks, bad state handlingAccount mix-up, confused deputy, token theftNear-match redirect or reused state is accepted
Prompt and tool steeringPrompt body, resource content, tool outputUntrusted text treated as instructionUnsafe tool use, policy bypass, exfiltrationModel follows hostile content over local policy
Resource abuseresources/read and URI templatesNormalization or access control weaknessCross-tenant reads, secrets exposureGuessed URI works despite list restrictions
Local MCP compromisestdio launch, localhost service, filesystemBroad local rights, no sandbox, opaque launchCredential theft, file access, host takeoverLocal service is reachable or reads sensitive paths

This format helps readers compare entry points, weaknesses, and impact quickly, which is exactly how security teams usually triage protocol-level findings.

MCP security visuals

The diagrams below summarize four high-value areas in an MCP assessment: authorization trust flow, prompt-to-tool influence, resource URI validation, and the local execution boundary. Together, they show where control failures usually appear and why these paths deserve focused testing during a pentest.

OAuth and redirect trust flow

Client

MCP Server

Auth Server

401 + metadataRedirect validationPKCEState bindingAudience checks

This image fits well near the authorization section and helps explain why redirect handling and consent boundaries are security-critical.

Prompt injection and tool-steering chain

Resource

Model

Tool Call

Untrusted content can become instructionLabelingOutput validationApproval gatesInstruction isolation

This visual is useful where the article explains the difference between untrusted content and privileged instructions.

Resource URI abuse path

Input URI

Canonicalization

Authorization

Wrong order = access control bypass riskNormalize firstEvaluate root boundaryCheck tenant scope

This image strengthens the section on URI normalization and makes the path-traversal discussion easier to understand visually.

Local MCP server risk picture

Host App

Local MCP

Sensitive filesEnvironment variablesBrowser dataCredential storesLocalhost exposure

This graphic supports the local-runtime section and visually highlights why sandboxing and execution transparency matter.

Pentest methodology for MCP

The first step is protocol discovery. Capture initialization traffic, enumerate server capabilities, map the JSON-RPC methods, and list all prompts, tools, and resources. Do not stop at what the UI exposes. Many deployments expose additional list operations, pagination cursors, templates, subscriptions, or change notifications that expand the attack surface. Every method should be classified by sensitivity, downstream reach, and approval expectation.

The second step is schema review. Tool schemas often reveal design weaknesses before exploitation starts. Look for unrestricted strings, missing required fields, nested free-form objects, permissive additional properties, and weak enums. Then connect those schema issues to execution sinks such as shell wrappers, SQL queries, file paths, HTTP clients, or administrative actions. Schema flaws in MCP are not abstract design problems; they are the first stage of exploitability.

The third step is semantic abuse testing. Create malicious prompts, poisoned resources, and adversarial tool outputs. Test whether the client treats untrusted text as instruction. Test whether a tool result can force a second risky tool call. Test whether resource contents can bypass approval logic through indirect steering. In many MCP systems, the protocol packets are technically valid but the overall security behavior still fails because the model is not isolated from hostile content.

The fourth step is identity and session review. For HTTP transports, inspect OAuth flows, redirect handling, state, PKCE, metadata discovery, audience validation, token storage, and scope minimization. For local transports, review execution commands, launch parameters, exposed local ports, sandboxing, file rights, network permissions, and implicit trust granted to the installed server.

Core security scenarios every MCP pentest should cover

🔍 1. Discovery and capability abuse

Enumeration Risk

Discovery is often underestimated. Attackers can use tool lists, prompt lists, resource templates, notifications, and cursor behavior to understand hidden capabilities. Even when direct access is blocked, metadata can reveal privileged functions or sensitive system structure.

Security scenarios

  • Low-privilege users seeing admin-only tools
  • Resource templates exposing tenant or project identifiers
  • List-changed events leaking privileged feature existence
  • Pagination cursors enabling hidden object discovery

Testing focus

  • Role-based visibility comparison
  • Cursor replay and manipulation
  • Notification subscription behavior
  • Discoverable but unreadable object mismatch

🛂 2. Authorization, redirect, and confused deputy flaws

Identity Critical

In HTTP-based MCP, OAuth and discovery become central. Weak redirect validation, broad consent assumptions, or poor client binding can turn a secure-looking integration into a confused deputy. This is especially severe when an MCP server is also a proxy to a third-party API.

Security scenarios

  • Redirect URI near-match acceptance
  • Per-user consent stored without per-client binding
  • State reuse or PKCE bypass
  • Proxy server granting downstream access to the wrong client

Testing focus

  • Exact redirect matching
  • Single-use state validation
  • Dynamic client registration handling
  • Consent page identity clarity and anti-clickjacking

🎫 3. Token passthrough and audience confusion

Privilege Expansion

If an MCP server accepts tokens not issued for itself or forwards client tokens downstream, it collapses trust boundaries and weakens attribution. The server effectively becomes a transport tunnel instead of an enforcing broker.

Security scenarios

  • Foreign audience tokens accepted as valid
  • Downstream API tokens replayed against the MCP server
  • Shared tokens preventing client-level attribution
  • Over-broad scope reuse across unrelated tools

Testing focus

  • Audience validation
  • Issuer and scope validation
  • Server-side identity mapping
  • Downstream request impersonation behavior

🌐 4. SSRF through metadata and discovery URLs

Infrastructure Reach

MCP clients may fetch URLs returned during metadata discovery. If those URLs are attacker-controlled, they can drive SSRF into localhost, internal networks, cloud metadata services, or redirected internal targets.

Security scenarios

  • Direct internal IP fetches
  • DNS rebinding after initial validation
  • Public-to-private redirect chains
  • Unrestricted metadata host trust

Testing focus

MCP Architecture showing client-server communication with security boundaries
Figure 1: MCP Architecture – Client-Server Communication Flow
  • Blocklists for localhost and RFC1918
  • Redirect validation
  • DNS pinning or repeated resolution checks
  • Outbound filtering and egress control

🧠 5. Prompt injection, resource poisoning, and tool-output steering

Model Manipulation

This is the defining MCP challenge. Untrusted content can cross the model boundary and steer behavior. A poisoned prompt can manipulate context. A malicious resource can hide instructions. A tool can return output that tricks the model into using another tool unsafely. The protocol can be perfectly valid while the behavior becomes unsafe.

Security scenarios

  • Resource content overriding host policy
  • Tool results containing hidden next-step instructions
  • Embedded references to sensitive local URIs
  • Prompt templates accepting unvalidated control strings

Testing focus

  • Untrusted-content labeling
  • Structured output validation
  • Approval checks before sensitive tool calls
  • Isolation between data channels and instruction channels

📂 6. Resource URI abuse and unauthorized data access

Data Exposure

Resource reads are often where actual secrets leak. MCP servers may expose file-backed content, API-wrapped data, logs, repositories, or records via URI patterns. If normalization or authorization is weak, a simple read request can become path traversal, tenant crossover, or direct object reference exploitation.

Security scenarios

  • Traversal in file-backed resources
  • Cross-tenant resource-template manipulation
  • Direct read of non-listed resources
  • Encoding-based normalization bypass

Testing focus

  • URI normalization before authorization
  • Template parameter boundary tests
  • Scheme validation
  • Read-vs-list consistency review

🛠️ 7. Tool schema abuse and business-logic compromise

Action Surface

Tools are where MCP becomes dangerous in the real world. Weak schemas, unsafe parsers, or poorly designed approval logic can turn a normal automation helper into an administrative exploit surface.

Security scenarios

  • Unrestricted strings reaching shell or SQL sinks
  • Free-form objects carrying hidden control fields
  • Replay of non-idempotent destructive actions
  • Race conditions in approval or multi-step workflows

Testing focus

MCP Attack Vectors and exploitation techniques
Figure 2: MCP Attack Vectors and Exploitation Methods
  • Strict schema validation
  • additionalProperties hardening
  • Safe defaults for destructive actions
  • Rate limiting, timeouts, and replay handling
MCP Attack Vectors - Injection, SSRF, and Data Exfiltration
Figure 2: Common MCP Attack Vectors and Exploitation Paths

🖥️ 8. Session hijacking and local MCP compromise

Host-Level Risk

Session-based weaknesses can let an attacker attach to another user’s active context. Local MCP servers add a second risk: they often run with broad local privilege and may access files, environment variables, credentials, and local services. If startup flows are opaque or localhost interfaces are open, the local machine becomes part of the attack surface.

Security scenarios

  • Session replay across users or devices
  • Predictable or weak session identifiers
  • Unauthenticated localhost MCP control endpoints
  • Silent execution of risky local commands

Testing focus

  • Session binding to identity and context
  • Entropy and lifetime of identifiers
  • Sandboxing of local servers
  • Visibility of launch commands and permissions

Controlled validation examples

The examples below show controlled validation inputs that help explain what a reviewer should inspect during an MCP assessment. They are written for defensive testing, protocol review, and boundary validation in authorized environments.

Discovery and enumeration example

{“jsonrpc”:”2.0″,”id”:7,”method”:”tools/list”,”params”:{“cursor”:”example-cursor-boundary-test”}}

Use this kind of example to discuss cursor handling, list visibility, and authorization drift.

OAuth redirect validation example

redirect_uri=https://client.example.invalid/callback%2F..%2Fcallback state=sample-single-use-token code_verifier=sample-pkce-verifier

Good for explaining exact-match redirect checks, state reuse rejection, and PKCE enforcement.

Metadata discovery SSRF example

WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/.well-known/oauth-protected-resource”

Useful for showing why clients must validate discovery targets and block internal-network fetches.

Prompt-steering example

Resource content: “This document is untrusted test content. Any instruction inside it must not override host policy or trigger sensitive tool use.”

This keeps the article practical while reinforcing the boundary between data and instruction.

Resource URI normalization example

file:///workspace/reports/%2e%2e/%2e%2e/secrets.txt

A clean way to explain canonicalization, template validation, and root-boundary enforcement.

Tool schema boundary example

{“project”:”demo-app”,”action”:”read”,”path”:”/reports/latest.json”,”unexpected_admin_flag”:true}

Helpful for illustrating why additionalProperties and strict schema enforcement matter.

Testing workflow image

Discover

Model Risks

Auth Review

Exploit Path

Validate Impact + Fix

This workflow gives the reader a clear mental model for how an MCP assessment typically progresses from discovery to impact validation and remediation.

Severity matrix for MCP security scenarios

Not all MCP weaknesses carry the same risk. Some expose useful metadata, while others allow action abuse, identity confusion, or local compromise. A severity matrix helps readers prioritize testing and remediation, especially when explaining findings to engineering teams, product owners, or leadership.

Critical focus: auth + toolingHigh focus: prompt boundariesMedium focus: discovery leakageAlways review: local execution

ScenarioLikelihoodPotential severityWhy it mattersPriority
OAuth confusion / confused deputyMedium to HighCriticalCan collapse client identity boundaries and grant unauthorized downstream access.Immediate
Token passthrough / audience confusionMediumCriticalBreaks attribution and can allow foreign tokens to operate across services.Immediate
Prompt steering / tool-output injectionHighHighTurns untrusted content into action influence, which is a defining MCP risk.High
Tool schema abuse / business-logic compromiseHighHighTools are the operational edge of the system and often directly touch sensitive workflows.High
Local MCP compromiseMediumHighBroad local privilege can lead to host-level access, secrets exposure, and unsafe execution.High
Resource URI abuseMediumHighCan expose sensitive files, tenant data, or internal records through weak normalization.High
Discovery leakageHighMediumOften becomes the map that helps attackers chain more severe findings.Medium
Session hijackMediumHighA stolen active session can shortcut many intended approval and policy controls.High

This matrix helps engineering teams and reviewers prioritize remediation by showing which weaknesses are most likely to cause high-impact control failures.

Detailed technical test cases

Below is a compact, practical set of MCP test cases that can be used in a real pentest checklist. These are written in a way that maps directly to protocol behavior, not just general web testing habits.

Test Case Block A — Discovery and enumeration

  • Call all list methods with each role and compare outputs for hidden capabilities.
  • Replay cursors, use invalid cursors, and test large pagination states for leakage or crashes.
  • Subscribe to change notifications and verify whether low-privilege sessions learn about privileged changes.
  • Check whether list-hidden resources are still directly readable through guessed URIs.

Example objective: determine whether capability discovery respects authorization or merely hides buttons in the UI.

Test Case Block B — OAuth and authorization

  • Trigger 401 flows and inspect resource metadata locations and scope challenges.
  • Try exact-match bypasses for redirect URI handling: encoding, case, subdomain, path suffix, mixed slashes.
  • Reuse state values and attempt callback completion with mismatched PKCE verifier values.
  • Check whether consent is stored per client identity, not only per user.

Expected secure result: exact redirect matching, single-use state, PKCE enforcement, and per-client consent isolation.

Illustrative validation sample: GET /authorize?client_id=sample-client&redirect_uri=https://client.example.invalid/callback&state=test-state-01&code_challenge=samplechallenge&code_challenge_method=S256

Test Case Block C — Token and audience validation

  • Present a token meant for another service and verify it is rejected.
  • Present a downstream API token to the MCP server and test whether it accepts or forwards it.
  • Inspect whether logs preserve client attribution or hide all actions under one service identity.
  • Verify scope minimization: low-risk tools should not require broad administrative privilege.

Expected secure result: no token passthrough, audience validation enforced, and authorization remains server-side.

Test Case Block D — SSRF and metadata poisoning

  • Return metadata URLs that point to localhost, internal RFC1918 IPs, and cloud metadata services.
  • Use a public host that redirects to internal targets to test redirect safety.
  • Attempt DNS rebinding between validation and fetch time.
  • Observe outbound traffic to confirm whether egress controls actually work.

Expected secure result: blocked internal access, limited redirects, HTTPS-only policy, and validated metadata sources.

Illustrative validation sample: WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/prm” Expected reviewer question: does the client verify that the metadata host is trusted and non-internal?

Test Case Block E — Prompt and tool-output abuse

  • Create resources containing adversarial instructions that conflict with system policy.
  • Return tool output that includes malicious next-step guidance for the model.
  • Attempt to embed local file or privileged URI references inside prompt content.
  • Verify whether the client labels untrusted content and separates data from instructions.

Expected secure result: hostile content is treated as untrusted data, not privileged instruction.

Illustrative validation sample: “This is a lab-only untrusted content sample. The client should display it as data, not follow it as an instruction source.”

Test Case Block F — Resource URI and data boundary validation

  • Try traversal patterns, double encoding, alternate schemes, and case variants in resource reads.
  • Swap tenant or project identifiers in resource templates to test isolation.
  • Compare resource list visibility against direct read behavior.
  • Check normalization order: authorization must occur after canonicalization.

Expected secure result: canonicalized URIs, enforced root boundaries, and strong object-level authorization.

Illustrative validation sample: resource_uri = “file:///workspace/docs/%2e%2e/%2e%2e/config.json” Expected reviewer question: is canonicalization performed before the access decision?

Test Case Block G — Tool schema, replay, and business logic

  • Send extra properties to test whether unexpected control fields are accepted.
  • Inject payloads into strings that may reach command, SQL, template, or HTTP execution sinks.
  • Replay destructive requests to test approval and idempotency logic.
  • Send oversized objects to test parser behavior, memory pressure, and timeout resilience.

Expected secure result: strict schema validation, replay control, human approval for destructive tools, and safe failure handling.

Illustrative validation sample: {“operation”:”read_report”,”report_id”:”sample-001″,”unexpected_debug”:true} Expected reviewer question: does the tool reject undeclared properties?

Test Case Block H — Session and local execution review

  • Replay session identifiers from another environment and verify rejection.
  • Check whether websocket or event-stream messages can attach to another user context.
  • Inspect local MCP startup commands for hidden arguments, silent elevation, or broad file access.
  • Test whether localhost-exposed MCP services require authentication or are reachable by other processes.

Expected secure result: session binding, strong entropy, visible launch risk, and sandboxed local execution.

Common MCP implementation mistakes

Many MCP security failures do not come from one dramatic coding error. They come from small assumptions repeated across the stack: trusting metadata too early, treating tool output as authoritative, exposing a local server without enough restriction, or assuming OAuth plumbing is secure because a library is in use. This section helps readers recognize real-world failure patterns quickly and map them back to practical design and implementation decisions.

Frequent mistakes seen in MCP-style implementations

  • Using UI visibility as a replacement for real authorization.
  • Accepting any structurally valid token without checking audience and resource binding.
  • Allowing tool outputs or resource contents to influence subsequent model behavior without validation.
  • Fetching authorization metadata from untrusted locations without SSRF protection.
  • Leaving additionalProperties open in tool schemas that drive sensitive actions.
  • Normalizing resource URIs after access decisions instead of before them.
  • Running local MCP servers with broad privileges and poor user visibility into startup commands.
  • Bundling many unrelated permissions into one oversized scope or long-lived session.

This section creates a useful break in the flow and reinforces the most common implementation errors teams should look for during review.

Defensive hardening checklist

A strong MCP deployment is not built by adding one filter at the end. It is built by enforcing boundaries at each layer. The host should present clear approval flows and isolate sessions. The client should treat prompts, annotations, tool results, and metadata as untrusted inputs unless explicitly validated. The server should validate every tool input, every resource URI, and every authorization artifact. Downstream integrations should not inherit trust automatically from the protocol layer.

  • Use exact redirect URI matching and PKCE for remote authorization flows.
  • Never accept tokens not issued for the MCP server itself.
  • Do not implement token passthrough to downstream APIs.
  • Treat discovery and metadata URLs as attacker-controlled until validated.
  • Separate untrusted content from privileged model instructions.
  • Validate prompt parameters, tool inputs, and tool outputs with explicit schemas.
  • Enforce URI canonicalization before authorization in resource reads.
  • Minimize scopes and use progressive privilege elevation.
  • Bind sessions strongly and avoid using session identifiers as sole authentication.
  • For local MCP servers, prefer restricted execution, sandboxing, and explicit permission visibility.

Pentester quick checklist

This final box is useful for readers who want a practical takeaway. It also prints well and can be reused as a review sheet during internal security testing, design reviews, or pre-release validation for MCP-enabled features.

Printable review sheet

  • Enumerate all prompts, tools, resources, templates, and subscriptions.
  • Check whether discovery visibility changes correctly by role and session.
  • Validate OAuth redirect handling, PKCE, state, and per-client consent behavior.
  • Reject foreign-audience tokens and confirm no token passthrough exists.
  • Test metadata discovery for SSRF, redirect abuse, and DNS rebinding resilience.
  • Review prompt, resource, and tool-output handling for instruction-versus-data separation.
  • Verify resource URI canonicalization before authorization and inspect cross-tenant isolation.
  • Test tool schemas for undeclared fields, weak enums, replay, and destructive-action approval logic.
  • Inspect session binding, entropy, expiry, and cross-context replay behavior.
  • Review local MCP execution paths for sandboxing, file access, and localhost exposure.
  • Document impact using scenario, entry point, control failure, exploitability, and remediation mapping.

Final conclusion

MCP expands the power of AI systems by giving them structured context and executable pathways into real systems. That same design also creates a layered attack surface where protocol correctness alone is not enough. The real question is whether the trust boundaries hold when hostile content, hostile servers, weak OAuth handling, or unsafe local execution are introduced.

MCP Attack Vectors - Injection, SSRF, and Data Exfiltration
Figure 2: Common MCP Attack Vectors and Exploitation Paths

For pentesters, the lesson is simple: test every place where data can become instruction, every place where identity can become authority, and every place where convenience silently becomes privilege. For defenders, the right approach is equally clear: validate schemas, reduce scope, harden authorization, distrust metadata, isolate local execution, and require explicit approval wherever a tool can change state or access sensitive content.

If an organization plans to deploy MCP at scale, the pentest must be architecture-aware, model-aware, and protocol-aware. Only then does the assessmHost

MCP Server

Model

Trust BoundariesOAuthPrompt InjectionTool AbuseLocal Compromise

Table of Contents

Why MCP changes pentesting

Model Context Protocol is not just another API integration layer. It is a protocol-driven bridge between a host application, one or more MCP clients, multiple MCP servers, and the large language model itself. That changes the security equation. In a normal application pentest, the tester usually focuses on authentication, authorization, input validation, logic flaws, and infrastructure exposure. In an MCP pentest, those still matter, but now they must be tested together with semantic control paths such as prompt injection, tool description abuse, resource poisoning, discovery manipulation, and authorization confusion.

The biggest shift is this: in MCP, data can become instruction. A resource may look like passive content, but if it is passed back into the model without boundary handling, it can influence future tool invocation. A tool response may look like structured output, but if it contains malicious text and the client blindly forwards it, the model can be steered into unsafe actions. A metadata response in an OAuth flow may look like configuration, but if a client trusts it too easily, it can become an SSRF primitive or a redirection vector. This is why MCP security cannot be tested with only standard web scanning logic.

A good MCP pentest therefore combines protocol analysis, identity and session review, local execution assessment, model-guided abuse testing, and downstream impact validation. The target is not just the server endpoint. The target is the entire trust system that allows an AI application to read context and perform actions.

Primary Security Shift

Instruction can be hidden inside prompts, tool descriptions, tool output, metadata, and resources.

Primary Pentest Goal

Break trust boundaries without relying only on malformed packets or obvious vulnerability payloads.

Primary Failure Mode

Safe-looking protocol flows become privileged execution chains once the model is influenced.

MCP architecture and trust boundaries

Before testing, the structure must be understood correctly. MCP typically contains a host, a client runtime, one or more servers, and the model. The host controls user experience, permissions, client lifecycle, and policy. The client manages protocol communication. The server exposes three important primitives: resources, prompts, and tools. Resources provide data. Prompts provide reusable prompt templates. Tools expose executable functions. Each one has a different abuse pattern, a different blast radius, and a different validation strategy.

From a pentesting perspective, the architecture is best modeled as six trust boundaries. First, host to client: can the host isolate sessions and enforce approval? Second, client to server: are discovery, authorization, and method invocation securely handled? Third, server to downstream systems: is the server a secure broker or a weak proxy? Fourth, model to tool chain: can content influence action? Fifth, local execution: is the local MCP server operating with excessive privilege? Sixth, session and token lifecycle: can identity artifacts be replayed, widened, stolen, or confused?

Resources

High risk for data exposure, tenant crossover, traversal, and untrusted-content injection into the model.

Prompts

High risk for prompt injection, unsafe parameterization, embedded resource abuse, and hidden instruction flow.

Tools

High risk for action abuse, business-logic flaws, schema weakness, command injection, and destructive side effects.

Authorization Layer

High risk for redirect abuse, confused deputy, token passthrough, SSRF, session hijack, and audience confusion.

Visual attack surface and testing map

A strong technical blog reads better when the architecture and attack flow are visible. The diagram below shows how MCP should be viewed during a pentest: not as one endpoint, but as a chain where trust moves from the user-facing host to the client runtime, into the MCP server, toward the model, and then out to downstream systems such as file stores, SaaS APIs, databases, and local processes.

MCP attack surface map

Host AppConsent • Policy • UX

MCP ClientJSON-RPC • Session • Auth

MCP ServerResources • Prompts • Tools

ModelReasoning • Tool Choice

Downstream SystemsFiles • APIs • DB • SaaS

Local Runtimestdio • localhost • files

MCP Security Testing Workflow and Methodology
Figure 3: Complete MCP Security Testing Process

Session hijack / auth confusionDiscovery abuse / SSRFPrompt injection / tool steeringPrivilege misuse / data exfilLocal compromise / sandbox escape

The most important lesson from this picture is that MCP failures often happen at the boundary between components, not only inside the server itself.

MCP Security Testing Workflow - From Discovery to Remediation
Figure 3: Complete MCP Security Testing Methodology
ScenarioPrimary entry pointTypical weaknessLikely impactPentest signal
Capability discovery abusetools/list, prompts/list, resources/listVisibility not bound to authorizationPrivilege mapping, sensitive feature discoveryLow-privilege user sees hidden objects or templates
OAuth confusion401 flow, metadata discovery, callbacksWeak redirect checks, bad state handlingAccount mix-up, confused deputy, token theftNear-match redirect or reused state is accepted
Prompt and tool steeringPrompt body, resource content, tool outputUntrusted text treated as instructionUnsafe tool use, policy bypass, exfiltrationModel follows hostile content over local policy
Resource abuseresources/read and URI templatesNormalization or access control weaknessCross-tenant reads, secrets exposureGuessed URI works despite list restrictions
Local MCP compromisestdio launch, localhost service, filesystemBroad local rights, no sandbox, opaque launchCredential theft, file access, host takeoverLocal service is reachable or reads sensitive paths

This format helps readers compare entry points, weaknesses, and impact quickly, which is exactly how security teams usually triage protocol-level findings.

MCP security visuals

The diagrams below summarize four high-value areas in an MCP assessment: authorization trust flow, prompt-to-tool influence, resource URI validation, and the local execution boundary. Together, they show where control failures usually appear and why these paths deserve focused testing during a pentest.

OAuth and redirect trust flow

Client

MCP Server

Auth Server

401 + metadataRedirect validationPKCEState bindingAudience checks

This image fits well near the authorization section and helps explain why redirect handling and consent boundaries are security-critical.

Prompt injection and tool-steering chain

Resource

Model

Tool Call

Untrusted content can become instructionLabelingOutput validationApproval gatesInstruction isolation

This visual is useful where the article explains the difference between untrusted content and privileged instructions.

Resource URI abuse path

Input URI

Canonicalization

Authorization

Wrong order = access control bypass riskNormalize firstEvaluate root boundaryCheck tenant scope

This image strengthens the section on URI normalization and makes the path-traversal discussion easier to understand visually.

Local MCP server risk picture

Host App

Local MCP

Sensitive filesEnvironment variablesBrowser dataCredential storesLocalhost exposure

This graphic supports the local-runtime section and visually highlights why sandboxing and execution transparency matter.

Pentest methodology for MCP

The first step is protocol discovery. Capture initialization traffic, enumerate server capabilities, map the JSON-RPC methods, and list all prompts, tools, and resources. Do not stop at what the UI exposes. Many deployments expose additional list operations, pagination cursors, templates, subscriptions, or change notifications that expand the attack surface. Every method should be classified by sensitivity, downstream reach, and approval expectation.

The second step is schema review. Tool schemas often reveal design weaknesses before exploitation starts. Look for unrestricted strings, missing required fields, nested free-form objects, permissive additional properties, and weak enums. Then connect those schema issues to execution sinks such as shell wrappers, SQL queries, file paths, HTTP clients, or administrative actions. Schema flaws in MCP are not abstract design problems; they are the first stage of exploitability.

The third step is semantic abuse testing. Create malicious prompts, poisoned resources, and adversarial tool outputs. Test whether the client treats untrusted text as instruction. Test whether a tool result can force a second risky tool call. Test whether resource contents can bypass approval logic through indirect steering. In many MCP systems, the protocol packets are technically valid but the overall security behavior still fails because the model is not isolated from hostile content.

The fourth step is identity and session review. For HTTP transports, inspect OAuth flows, redirect handling, state, PKCE, metadata discovery, audience validation, token storage, and scope minimization. For local transports, review execution commands, launch parameters, exposed local ports, sandboxing, file rights, network permissions, and implicit trust granted to the installed server.

Core security scenarios every MCP pentest should cover

🔍 1. Discovery and capability abuse

Enumeration Risk

Discovery is often underestimated. Attackers can use tool lists, prompt lists, resource templates, notifications, and cursor behavior to understand hidden capabilities. Even when direct access is blocked, metadata can reveal privileged functions or sensitive system structure.

Security scenarios

  • Low-privilege users seeing admin-only tools
  • Resource templates exposing tenant or project identifiers
  • List-changed events leaking privileged feature existence
  • Pagination cursors enabling hidden object discovery

Testing focus

  • Role-based visibility comparison
  • Cursor replay and manipulation
  • Notification subscription behavior
  • Discoverable but unreadable object mismatch

🛂 2. Authorization, redirect, and confused deputy flaws

Identity Critical

In HTTP-based MCP, OAuth and discovery become central. Weak redirect validation, broad consent assumptions, or poor client binding can turn a secure-looking integration into a confused deputy. This is especially severe when an MCP server is also a proxy to a third-party API.

Security scenarios

  • Redirect URI near-match acceptance
  • Per-user consent stored without per-client binding
  • State reuse or PKCE bypass
  • Proxy server granting downstream access to the wrong client

Testing focus

  • Exact redirect matching
  • Single-use state validation
  • Dynamic client registration handling
  • Consent page identity clarity and anti-clickjacking

🎫 3. Token passthrough and audience confusion

Privilege Expansion

If an MCP server accepts tokens not issued for itself or forwards client tokens downstream, it collapses trust boundaries and weakens attribution. The server effectively becomes a transport tunnel instead of an enforcing broker.

Security scenarios

  • Foreign audience tokens accepted as valid
  • Downstream API tokens replayed against the MCP server
  • Shared tokens preventing client-level attribution
  • Over-broad scope reuse across unrelated tools

Testing focus

  • Audience validation
  • Issuer and scope validation
  • Server-side identity mapping
  • Downstream request impersonation behavior

🌐 4. SSRF through metadata and discovery URLs

Infrastructure Reach

MCP clients may fetch URLs returned during metadata discovery. If those URLs are attacker-controlled, they can drive SSRF into localhost, internal networks, cloud metadata services, or redirected internal targets.

Security scenarios

MCP Security Testing Workflow - From Discovery to Remediation
Figure 3: Complete MCP Security Testing Methodology
  • Direct internal IP fetches
  • DNS rebinding after initial validation
  • Public-to-private redirect chains
  • Unrestricted metadata host trust

Testing focus

  • Blocklists for localhost and RFC1918
  • Redirect validation
  • DNS pinning or repeated resolution checks
  • Outbound filtering and egress control

🧠 5. Prompt injection, resource poisoning, and tool-output steering

Model Manipulation

This is the defining MCP challenge. Untrusted content can cross the model boundary and steer behavior. A poisoned prompt can manipulate context. A malicious resource can hide instructions. A tool can return output that tricks the model into using another tool unsafely. The protocol can be perfectly valid while the behavior becomes unsafe.

Security scenarios

  • Resource content overriding host policy
  • Tool results containing hidden next-step instructions
  • Embedded references to sensitive local URIs
  • Prompt templates accepting unvalidated control strings

Testing focus

  • Untrusted-content labeling
  • Structured output validation
  • Approval checks before sensitive tool calls
  • Isolation between data channels and instruction channels

📂 6. Resource URI abuse and unauthorized data access

Data Exposure

Resource reads are often where actual secrets leak. MCP servers may expose file-backed content, API-wrapped data, logs, repositories, or records via URI patterns. If normalization or authorization is weak, a simple read request can become path traversal, tenant crossover, or direct object reference exploitation.

Security scenarios

  • Traversal in file-backed resources
  • Cross-tenant resource-template manipulation
  • Direct read of non-listed resources
  • Encoding-based normalization bypass

Testing focus

  • URI normalization before authorization
  • Template parameter boundary tests
  • Scheme validation
  • Read-vs-list consistency review

🛠️ 7. Tool schema abuse and business-logic compromise

Action Surface

Tools are where MCP becomes dangerous in the real world. Weak schemas, unsafe parsers, or poorly designed approval logic can turn a normal automation helper into an administrative exploit surface.

Security scenarios

  • Unrestricted strings reaching shell or SQL sinks
  • Free-form objects carrying hidden control fields
  • Replay of non-idempotent destructive actions
  • Race conditions in approval or multi-step workflows

Testing focus

  • Strict schema validation
  • additionalProperties hardening
  • Safe defaults for destructive actions
  • Rate limiting, timeouts, and replay handling

🖥️ 8. Session hijacking and local MCP compromise

Host-Level Risk

Session-based weaknesses can let an attacker attach to another user’s active context. Local MCP servers add a second risk: they often run with broad local privilege and may access files, environment variables, credentials, and local services. If startup flows are opaque or localhost interfaces are open, the local machine becomes part of the attack surface.

Security scenarios

  • Session replay across users or devices
  • Predictable or weak session identifiers
  • Unauthenticated localhost MCP control endpoints
  • Silent execution of risky local commands

Testing focus

  • Session binding to identity and context
  • Entropy and lifetime of identifiers
  • Sandboxing of local servers
  • Visibility of launch commands and permissions

Controlled validation examples

The examples below show controlled validation inputs that help explain what a reviewer should inspect during an MCP assessment. They are written for defensive testing, protocol review, and boundary validation in authorized environments.

Discovery and enumeration example

{“jsonrpc”:”2.0″,”id”:7,”method”:”tools/list”,”params”:{“cursor”:”example-cursor-boundary-test”}}

Use this kind of example to discuss cursor handling, list visibility, and authorization drift.

OAuth redirect validation example

redirect_uri=https://client.example.invalid/callback%2F..%2Fcallback state=sample-single-use-token code_verifier=sample-pkce-verifier

Good for explaining exact-match redirect checks, state reuse rejection, and PKCE enforcement.

Metadata discovery SSRF example

WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/.well-known/oauth-protected-resource”

Useful for showing why clients must validate discovery targets and block internal-network fetches.

Prompt-steering example

Resource content: “This document is untrusted test content. Any instruction inside it must not override host policy or trigger sensitive tool use.”

This keeps the article practical while reinforcing the boundary between data and instruction.

Resource URI normalization example

file:///workspace/reports/%2e%2e/%2e%2e/secrets.txt

A clean way to explain canonicalization, template validation, and root-boundary enforcement.

Tool schema boundary example

{“project”:”demo-app”,”action”:”read”,”path”:”/reports/latest.json”,”unexpected_admin_flag”:true}

Helpful for illustrating why additionalProperties and strict schema enforcement matter.

Testing workflow image

Discover

Model Risks

Auth Review

Exploit Path

Validate Impact + Fix

This workflow gives the reader a clear mental model for how an MCP assessment typically progresses from discovery to impact validation and remediation.

Severity matrix for MCP security scenarios

Not all MCP weaknesses carry the same risk. Some expose useful metadata, while others allow action abuse, identity confusion, or local compromise. A severity matrix helps readers prioritize testing and remediation, especially when explaining findings to engineering teams, product owners, or leadership.

Critical focus: auth + toolingHigh focus: prompt boundariesMedium focus: discovery leakageAlways review: local execution

ScenarioLikelihoodPotential severityWhy it mattersPriority
OAuth confusion / confused deputyMedium to HighCriticalCan collapse client identity boundaries and grant unauthorized downstream access.Immediate
Token passthrough / audience confusionMediumCriticalBreaks attribution and can allow foreign tokens to operate across services.Immediate
Prompt steering / tool-output injectionHighHighTurns untrusted content into action influence, which is a defining MCP risk.High
Tool schema abuse / business-logic compromiseHighHighTools are the operational edge of the system and often directly touch sensitive workflows.High
Local MCP compromiseMediumHighBroad local privilege can lead to host-level access, secrets exposure, and unsafe execution.High
Resource URI abuseMediumHighCan expose sensitive files, tenant data, or internal records through weak normalization.High
Discovery leakageHighMediumOften becomes the map that helps attackers chain more severe findings.Medium
Session hijackMediumHighA stolen active session can shortcut many intended approval and policy controls.High

This matrix helps engineering teams and reviewers prioritize remediation by showing which weaknesses are most likely to cause high-impact control failures.

Detailed technical test cases

Below is a compact, practical set of MCP test cases that can be used in a real pentest checklist. These are written in a way that maps directly to protocol behavior, not just general web testing habits.

Test Case Block A — Discovery and enumeration

  • Call all list methods with each role and compare outputs for hidden capabilities.
  • Replay cursors, use invalid cursors, and test large pagination states for leakage or crashes.
  • Subscribe to change notifications and verify whether low-privilege sessions learn about privileged changes.
  • Check whether list-hidden resources are still directly readable through guessed URIs.

Example objective: determine whether capability discovery respects authorization or merely hides buttons in the UI.

Test Case Block B — OAuth and authorization

  • Trigger 401 flows and inspect resource metadata locations and scope challenges.
  • Try exact-match bypasses for redirect URI handling: encoding, case, subdomain, path suffix, mixed slashes.
  • Reuse state values and attempt callback completion with mismatched PKCE verifier values.
  • Check whether consent is stored per client identity, not only per user.

Expected secure result: exact redirect matching, single-use state, PKCE enforcement, and per-client consent isolation.

Illustrative validation sample: GET /authorize?client_id=sample-client&redirect_uri=https://client.example.invalid/callback&state=test-state-01&code_challenge=samplechallenge&code_challenge_method=S256

Test Case Block C — Token and audience validation

  • Present a token meant for another service and verify it is rejected.
  • Present a downstream API token to the MCP server and test whether it accepts or forwards it.
  • Inspect whether logs preserve client attribution or hide all actions under one service identity.
  • Verify scope minimization: low-risk tools should not require broad administrative privilege.

Expected secure result: no token passthrough, audience validation enforced, and authorization remains server-side.

Test Case Block D — SSRF and metadata poisoning

  • Return metadata URLs that point to localhost, internal RFC1918 IPs, and cloud metadata services.
  • Use a public host that redirects to internal targets to test redirect safety.
  • Attempt DNS rebinding between validation and fetch time.
  • Observe outbound traffic to confirm whether egress controls actually work.

Expected secure result: blocked internal access, limited redirects, HTTPS-only policy, and validated metadata sources.

Illustrative validation sample: WWW-Authenticate: Bearer resource_metadata=”https://metadata.example.invalid/prm” Expected reviewer question: does the client verify that the metadata host is trusted and non-internal?

Test Case Block E — Prompt and tool-output abuse

  • Create resources containing adversarial instructions that conflict with system policy.
  • Return tool output that includes malicious next-step guidance for the model.
  • Attempt to embed local file or privileged URI references inside prompt content.
  • Verify whether the client labels untrusted content and separates data from instructions.

Expected secure result: hostile content is treated as untrusted data, not privileged instruction.

Illustrative validation sample: “This is a lab-only untrusted content sample. The client should display it as data, not follow it as an instruction source.”

Test Case Block F — Resource URI and data boundary validation

  • Try traversal patterns, double encoding, alternate schemes, and case variants in resource reads.
  • Swap tenant or project identifiers in resource templates to test isolation.
  • Compare resource list visibility against direct read behavior.
  • Check normalization order: authorization must occur after canonicalization.

Expected secure result: canonicalized URIs, enforced root boundaries, and strong object-level authorization.

Illustrative validation sample: resource_uri = “file:///workspace/docs/%2e%2e/%2e%2e/config.json” Expected reviewer question: is canonicalization performed before the access decision?

Test Case Block G — Tool schema, replay, and business logic

  • Send extra properties to test whether unexpected control fields are accepted.
  • Inject payloads into strings that may reach command, SQL, template, or HTTP execution sinks.
  • Replay destructive requests to test approval and idempotency logic.
  • Send oversized objects to test parser behavior, memory pressure, and timeout resilience.

Expected secure result: strict schema validation, replay control, human approval for destructive tools, and safe failure handling.

Illustrative validation sample: {“operation”:”read_report”,”report_id”:”sample-001″,”unexpected_debug”:true} Expected reviewer question: does the tool reject undeclared properties?

Test Case Block H — Session and local execution review

  • Replay session identifiers from another environment and verify rejection.
  • Check whether websocket or event-stream messages can attach to another user context.
  • Inspect local MCP startup commands for hidden arguments, silent elevation, or broad file access.
  • Test whether localhost-exposed MCP services require authentication or are reachable by other processes.

Expected secure result: session binding, strong entropy, visible launch risk, and sandboxed local execution.

Common MCP implementation mistakes

Many MCP security failures do not come from one dramatic coding error. They come from small assumptions repeated across the stack: trusting metadata too early, treating tool output as authoritative, exposing a local server without enough restriction, or assuming OAuth plumbing is secure because a library is in use. This section helps readers recognize real-world failure patterns quickly and map them back to practical design and implementation decisions.

Frequent mistakes seen in MCP-style implementations

  • Using UI visibility as a replacement for real authorization.
  • Accepting any structurally valid token without checking audience and resource binding.
  • Allowing tool outputs or resource contents to influence subsequent model behavior without validation.
  • Fetching authorization metadata from untrusted locations without SSRF protection.
  • Leaving additionalProperties open in tool schemas that drive sensitive actions.
  • Normalizing resource URIs after access decisions instead of before them.
  • Running local MCP servers with broad privileges and poor user visibility into startup commands.
  • Bundling many unrelated permissions into one oversized scope or long-lived session.

This section creates a useful break in the flow and reinforces the most common implementation errors teams should look for during review.

Defensive hardening checklist

A strong MCP deployment is not built by adding one filter at the end. It is built by enforcing boundaries at each layer. The host should present clear approval flows and isolate sessions. The client should treat prompts, annotations, tool results, and metadata as untrusted inputs unless explicitly validated. The server should validate every tool input, every resource URI, and every authorization artifact. Downstream integrations should not inherit trust automatically from the protocol layer.

  • Use exact redirect URI matching and PKCE for remote authorization flows.
  • Never accept tokens not issued for the MCP server itself.
  • Do not implement token passthrough to downstream APIs.
  • Treat discovery and metadata URLs as attacker-controlled until validated.
  • Separate untrusted content from privileged model instructions.
  • Validate prompt parameters, tool inputs, and tool outputs with explicit schemas.
  • Enforce URI canonicalization before authorization in resource reads.
  • Minimize scopes and use progressive privilege elevation.
  • Bind sessions strongly and avoid using session identifiers as sole authentication.
  • For local MCP servers, prefer restricted execution, sandboxing, and explicit permission visibility.

Pentester quick checklist

This final box is useful for readers who want a practical takeaway. It also prints well and can be reused as a review sheet during internal security testing, design reviews, or pre-release validation for MCP-enabled features.

Printable review sheet

  • Enumerate all prompts, tools, resources, templates, and subscriptions.
  • Check whether discovery visibility changes correctly by role and session.
  • Validate OAuth redirect handling, PKCE, state, and per-client consent behavior.
  • Reject foreign-audience tokens and confirm no token passthrough exists.
  • Test metadata discovery for SSRF, redirect abuse, and DNS rebinding resilience.
  • Review prompt, resource, and tool-output handling for instruction-versus-data separation.
  • Verify resource URI canonicalization before authorization and inspect cross-tenant isolation.
  • Test tool schemas for undeclared fields, weak enums, replay, and destructive-action approval logic.
  • Inspect session binding, entropy, expiry, and cross-context replay behavior.
  • Review local MCP execution paths for sandboxing, file access, and localhost exposure.
  • Document impact using scenario, entry point, control failure, exploitability, and remediation mapping.

Final conclusion

MCP expands the power of AI systems by giving them structured context and executable pathways into real systems. That same design also creates a layered attack surface where protocol correctness alone is not enough. The real question is whether the trust boundaries hold when hostile content, hostile servers, weak OAuth handling, or unsafe local execution are introduced.